[MCP T11] ask MCP tool (NL→Cypher via GraphRAG) #659

@DvirDukhan

Description

Phase 1 ticket T11. Depends on #650 (T3 fixture), #658 (T10 prompts), and transitively #657 (T9 GraphRAG init).

Context

The ask tool is the strategic differentiator. None of the 5 competing code-graph MCP servers expose a natural-language query interface — they all top out at structural traversal. ask lets agents ask "what calls processPayment?" in English and get a grounded answer with the executed Cypher visible for transparency.

How it works (two LLM round-trips bracketing one Cypher query against FalkorDB):

  1. LLM #1 (Cypher generation): question + ontology → Cypher
  2. FalkorDB: execute Cypher → rows of nodes
  3. LLM #2 (QA synthesis): question + rows → natural-language answer

The graph itself never goes to the LLM — only the schema and query results — which is why this works on huge codebases.
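The three steps above can be sketched as plain function composition. This is only an illustration of the data flow, not the GraphRAG implementation: ask_pipeline and the three fake_* stubs are hypothetical names, and the stubs stand in for the two LLM calls and the FalkorDB query.

```python
from typing import Callable

Rows = list[dict]


def ask_pipeline(
    question: str,
    ontology: str,
    generate_cypher: Callable[[str, str], str],
    run_query: Callable[[str], Rows],
    synthesize: Callable[[str, Rows], str],
) -> dict:
    """Hypothetical sketch: two LLM calls bracketing one graph query."""
    cypher = generate_cypher(question, ontology)  # LLM #1: sees only the schema
    rows = run_query(cypher)                      # graph database: executes the Cypher
    answer = synthesize(question, rows)           # LLM #2: sees only the result rows
    return {"answer": answer, "cypher_query": cypher, "context_nodes": rows}


# Stubs standing in for the real model and database (assumptions for illustration).
def fake_generate(question: str, ontology: str) -> str:
    return 'MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c'


def fake_run(cypher: str) -> Rows:
    return [{"name": "checkout"}]


def fake_synthesize(question: str, rows: Rows) -> str:
    return "Called by: " + ", ".join(row["name"] for row in rows)


result = ask_pipeline(
    "what calls processPayment?",
    "(:Function)-[:CALLS]->(:Function)",
    fake_generate,
    fake_run,
    fake_synthesize,
)
```

Note that the graph never appears as an argument to either LLM stub; only the ontology string and the returned rows do.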

All the GraphRAG plumbing already exists in api/llm.py (_create_kg_agent at api/llm.py:238-258, _ask_sync at api/llm.py:260-268, ask at api/llm.py:271-273). T9 extracted the construction; T10 set up the prompt seam. This ticket is the thin async MCP wrapper.

Scope

In:

  • New api/mcp/tools/ask.py registering the ask MCP tool:
    import asyncio

    @app.tool()
    async def ask(question: str, project: str | None = None, branch: str | None = None) -> dict:
        kg = get_or_create_kg(project or current_project_name(), branch or "_default")
        # kg.ask is synchronous; run it in the default executor so the event loop stays free
        loop = asyncio.get_running_loop()
        response = await loop.run_in_executor(None, kg.ask, question)
        return {
            "answer": response.answer,
            "cypher_query": response.cypher_query,    # exposed for transparency
            "context_nodes": response.context_nodes,
        }
  • The cypher_query field is required in the response — the design doc explicitly calls this out as a transparency requirement so the agent can inspect, learn, and debug.
  • Tests in tests/mcp/test_ask.py:
    • Unit test with fully mocked KnowledgeGraph: assert response shape {answer, cypher_query, context_nodes}.
    • Integration test with mocked LiteModel: stub the model to return canned content for both LLM round-trips:
      1. First call (cypher gen): returns a known Cypher targeting the T3 fixture (e.g. MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c).
      2. Real Cypher executes against the real fixture graph in FalkorDB, returning real nodes.
      3. Second call (QA synthesis): returns a canned answer string.
      4. Assert the response includes the executed cypher in cypher_query and the real nodes in context_nodes.
    • Protocol round-trip: tool registered and callable via stdio client.
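A minimal sketch of what the unit test with a fully mocked KnowledgeGraph could look like. It assumes the tool body reduces to "run kg.ask in an executor and map three attributes into a dict"; ask_via and make_mock_kg are hypothetical helpers, and the real test would go through the registered tool and get_or_create_kg instead.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import MagicMock


async def ask_via(kg, question: str) -> dict:
    """Hypothetical stand-in for the tool body: call sync kg.ask off the event loop."""
    loop = asyncio.get_running_loop()
    response = await loop.run_in_executor(None, kg.ask, question)
    return {
        "answer": response.answer,
        "cypher_query": response.cypher_query,
        "context_nodes": response.context_nodes,
    }


def make_mock_kg() -> MagicMock:
    """Build a mock whose ask() returns an object with the three expected attributes."""
    kg = MagicMock()
    kg.ask.return_value = SimpleNamespace(
        answer="processPayment is called by checkout.",
        cypher_query='MATCH (n:Function {name:"processPayment"})<-[:CALLS]-(c) RETURN c',
        context_nodes=[{"name": "checkout"}],
    )
    return kg


def test_response_shape() -> None:
    kg = make_mock_kg()
    result = asyncio.run(ask_via(kg, "what calls processPayment?"))
    # The response must carry exactly the three documented keys.
    assert set(result) == {"answer", "cypher_query", "context_nodes"}
    assert isinstance(result["context_nodes"], list)
    kg.ask.assert_called_once_with("what calls processPayment?")
```

Because the mock covers everything below the wrapper, this test checks only the response shape and the executor hand-off, not Cypher execution.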

Out:

  • Real-LLM E2E (Phase 1.5 nightly with API-key secrets).
  • Streaming responses.
  • Multi-turn conversation memory (each ask is independent).
  • Prompt iteration (Phase 1.5).

Files to create / modify

  • new api/mcp/tools/ask.py
  • modified api/mcp/server.py if needed (auto-discovery of new tools)
  • new tests/mcp/test_ask.py

Acceptance criteria

  • Tool registered with FastMCP and discoverable via session.list_tools().
  • Input schema: question: str, project: str | None = None, branch: str | None = None.
  • Response shape: {answer: str, cypher_query: str, context_nodes: list}.
  • Unit test asserts response shape with fully mocked KnowledgeGraph.
  • Integration test:
    • Mocks LiteModel so neither LLM call hits a real provider.
    • The mocked Cypher-gen response contains real Cypher that executes against the T3 fixture in CI's FalkorDB.
    • Asserts that response.context_nodes contains real nodes from the fixture (not from the mock).
    • Asserts that response.cypher_query matches the mocked Cypher.
  • Protocol round-trip test calls the tool via stdio client and asserts a non-error structured response.
  • CI workflow (#649, [MCP T2] CI workflow with FalkorDB service for MCP tests) green.

Dependencies

Out of scope (do NOT do in this PR)

  • Real-LLM smoke test (Phase 1.5).
  • Streaming, multi-turn, or memory.
  • Prompt iteration.
  • Per-question caching (the KnowledgeGraph is cached per (project, branch) in T9; per-question caching is overkill).

Notes for the implementer

  • The async wrapper around the sync kg.ask() follows the same run_in_executor pattern as the existing api/llm.py:271-273.
  • Auto-detect project from CWD when not provided (similar to T4's branch auto-detect — just use Path.cwd().name).
  • Auto-detect branch from CWD when not provided (reuse the helper from T17).
  • The mocked-LLM integration test is the key innovation here: it gives us real coverage of the Cypher execution path without needing API credentials. Pattern: unittest.mock.patch("graphrag_sdk.models.litellm.LiteModel.ask", side_effect=[cypher_response, qa_response]).
  • Reuse the protocol-test helper from T4.
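The side_effect pattern from the notes above can be tried in isolation. The sketch below patches a local StubModel class rather than graphrag_sdk's LiteModel so it runs without the SDK or a database; StubModel, two_round_trips, and the in-memory execute callback are assumptions made for illustration. In the real integration test the patch target would be the one named in the notes, and the Cypher would run against the T3 fixture in FalkorDB.

```python
from unittest.mock import patch

CANNED_CYPHER = 'MATCH (n:Function {name:"service"})<-[:CALLS]-(c) RETURN c'
CANNED_ANSWER = "service is called by handler."


class StubModel:
    """Hypothetical stand-in for the model class being patched."""

    def ask(self, prompt: str) -> str:
        raise RuntimeError("a real provider would be called here")


def two_round_trips(model, question: str, execute):
    """Call the model twice, with a query execution in between."""
    cypher = model.ask(f"generate cypher for: {question}")  # first canned reply
    rows = execute(cypher)                                  # real execution in the actual test
    answer = model.ask(f"answer {question!r} using {rows}")  # second canned reply
    return cypher, rows, answer


def run_with_mocked_model():
    # side_effect given a list returns one item per call, in order.
    with patch.object(StubModel, "ask", side_effect=[CANNED_CYPHER, CANNED_ANSWER]) as mocked:
        cypher, rows, answer = two_round_trips(
            StubModel(),
            "what calls service?",
            lambda query: [{"name": "handler"}],  # placeholder for the fixture graph
        )
        return cypher, rows, answer, mocked.call_count
```

A third call to the patched method would raise StopIteration, which gives a cheap guard against the code under test making more LLM calls than expected.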

Metadata


    Labels

    enhancement (New feature or request)
    mcp (MCP server (model context protocol) work)
