Skip to content

[Refactor] Decompose chat_service.py - 563 lines, 7 responsibilities #349

@GunaPalanivel

Description

@GunaPalanivel

Problem

chat_service.py is 563 lines and has accumulated 49 commits - nearly 2x the next highest-churn file in the repo. It owns at least 7 distinct responsibilities:

  1. Query classification (_classify_query)
  2. Query decomposition / sub-query splitting (_get_sub_queries)
  3. Tool call routing (_get_agent_tool_calls)
  4. Tool execution dispatch (_execute_search_tools)
  5. Context retrieval and assembly (retrieve_context)
  6. Relevance scoring and reformulation loop (_get_query_context_relevance)
  7. Answer generation (generate_answer, get_chatbot_reply, get_chatbot_reply_stream)

It also re-defines CODE_BLOCK_PLACEHOLDER_PATTERN (line 36) which is already in api/tools/utils.py (line 17) - a leftover from when some logic was extracted but the constant was copy-pasted instead of imported.

Impact

  • Every feature PR touches this file, causing merge conflicts across contributors
  • Testing individual responsibilities requires mocking half the module
  • The file is the single biggest contributor to the bus factor problem - one person has authored most of the 49 commits
  • New contributors avoid touching it because a change anywhere can break everything

Proposed Decomposition

Split into focused modules under api/services/:

New Module Responsibilities Moved
query_classifier.py _classify_query, _get_sub_queries
tool_dispatcher.py _get_agent_tool_calls, _execute_search_tools
context_retriever.py retrieve_context, _get_query_context_relevance, reformulation loop
chat_service.py (slim) get_chatbot_reply, get_chatbot_reply_stream, generate_answer - pure orchestration

Move CODE_BLOCK_PLACEHOLDER_PATTERN to a shared api/constants.py and import everywhere.

Migration Strategy

This can be done incrementally - one module at a time:

  1. Extract query_classifier.py first (smallest, fewest dependencies)
  2. Update imports in chat_service.py to use the extracted module
  3. Run existing tests - they should pass without changes since the public API stays the same
  4. Repeat for tool_dispatcher.py then context_retriever.py

Acceptance Criteria

  • chat_service.py reduced to < 200 lines of orchestration logic
  • Each extracted module has a clear single responsibility
  • CODE_BLOCK_PLACEHOLDER_PATTERN defined in one place, imported everywhere
  • All existing tests in tests/unit/services/test_chat_service.py pass without modification
  • No changes to route-level imports (from api.services.chat_service import get_chatbot_reply)

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions