[Bug]: JSONDecodeError in DRIFT Search due to malformed LLM response (Invalid control character)

### Do you need to file an issue?

- [x] I have searched the existing issues and this bug is not already filed.
- [x] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- [x] I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

### Describe the bug

When using DRIFT search with certain LLM providers (specifically `moonshotai/kimi-k2-0905` via OpenRouter), GraphRAG encounters a `JSONDecodeError: Invalid control character`. This occurs because the LLM returns a JSON response containing malformed fragments (specifically trailing characters like `", ":", "}`) outside the valid JSON structure.

### Steps to reproduce

1.  **Configuration:** Set up GraphRAG v2.x to use OpenRouter as the LLM provider.
2.  **Model:** Configure the model to `moonshotai/kimi-k2-0905:moonshotai`.
3.  **Search Mode:** Initiate a DRIFT search.
4.  **Query:** Execute the following query (failure rate is approximately 1 in 8):
    ```json
    {
      "question_type": "data_global",
      "search_mode": "drift",
      "question_text": "How should grass hay be positioned within the Litter Box Strategy to maximize the effectiveness of the litter training program for shelter rabbits?"
    }
    ```
5.  **Observation:** The search fails with a `JSONDecodeError`. Retrying the same query results in the same error at the exact same character position.

### Expected Behavior

GraphRAG should gracefully handle or sanitize LLM responses before attempting to parse them as JSON, or provide a retry mechanism that catches `JSONDecodeError` specifically when the LLM provider returns slightly malformed output.

### GraphRAG Config Used

```yaml
# Paste your config here

```


### Logs and screenshots

**Error Trace:**
```
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 25 (char 24)
```

**Raw Response (Debug Log):**
The LLM provider returns JSON responses with malformed fragments. Note the stray characters after the closing brace:
```
2026-01-02 12:24:26,759 - LiteLLM - DEBUG - RAW RESPONSE:
{"id": "gen-1767353063-Ho0rIH8LK3FLrGBfQT64", "choices": [{"finish_reason": null, "index": 0, 
"logprobs": null, "message": {"content": "{\"intermediate_answer\":\"# How the Spring Pines Micro 
Farm community relates to dry-bath instructions\\n\\n...\", \":\", \"}", "refusal": null, 
"role": "assistant"...
```

### Additional Information

### Environment Information
*   **GraphRAG Version:** v2.x (async LiteLLM streaming)
*   **Python Version:** 3.12.2
*   **LLM Provider:** OpenRouter (openrouter.ai)
*   **LLM Model:** `moonshotai/kimi-k2-0905:moonshotai`
*   **OS:** Mac OS

### Additional context
*   **Root Cause Analysis:** The error is strictly caused by the LLM appending invalid JSON control characters (`", ":", "}`) to the end of the `intermediate_answer` content string.
*   **Scope:** This does not affect local or global search modes using the same model; it appears specific to the prompt structure or parsing logic of DRIFT search.
*   **Source Data:** The source question JSON files are valid; the corruption is introduced during generation.

### Suggested Fix
I have identified a potential fix. Adding a sanitization step before parsing the JSON response resolves the issue.

```python
import re

def sanitize_json_response(text: str) -> str:
    """Remove invalid JSON control characters from LLM response."""
    if not text:
        return text
    
    # Remove control characters (ASCII < 32) except valid JSON whitespace
    sanitized = ''.join(
        char for char in text 
        if ord(char) >= 32 or char in '\n\r\t'
    )
    
    # Remove common malformed patterns like trailing ", ":", "}"
    sanitized = re.sub(r',\s*":\s*",\s*"}\s*$', '}', sanitized)
    
    return sanitized.strip()
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: JSONDecodeError in DRIFT Search due to malformed LLM response (Invalid control character) #2163

Do you need to file an issue?

Describe the bug

Steps to reproduce

Expected Behavior

GraphRAG Config Used

Logs and screenshots

Additional Information

Environment Information

Additional context

Suggested Fix

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Bug]: JSONDecodeError in DRIFT Search due to malformed LLM response (Invalid control character) #2163

Description

Do you need to file an issue?

Describe the bug

Steps to reproduce

Expected Behavior

GraphRAG Config Used

Logs and screenshots

Additional Information

Environment Information

Additional context

Suggested Fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions