feat: add MockChatGenerator #11708
Add `MockChatGenerator`, a Chat Generator that returns predefined responses without calling any API. It is a deterministic, zero-cost drop-in replacement for real Chat Generators in tests, smoke tests, and quick prototypes, inspired by model-layer fakes in other frameworks (LangChain `FakeListChatModel`, LlamaIndex `MockLLM`, PydanticAI `FunctionModel`).

It supports:
- a fixed response (string or ChatMessage),
- a list of responses cycled across calls (to drive Agent-like loops),
- a `response_fn` callable for input-dependent replies,
- an echo mode (the default) that returns the last user message.

It implements the full Chat Generator interface: `run`, `run_async`, streaming callbacks, and serialization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
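The four response modes listed in this commit message can be illustrated with a small stand-alone sketch. This is not the component added in this PR: `Message`, `SketchMockGenerator`, and the rule that `response_fn` takes precedence over `responses` are all hypothetical, chosen only to show how fixed, cycled, callable, and echo replies could be selected.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage: just a role and some text."""

    role: str
    text: str


Reply = Union[str, Message]


class SketchMockGenerator:
    """Illustrative only; not the MockChatGenerator from this PR."""

    def __init__(
        self,
        responses: Union[Reply, list[Reply], None] = None,
        response_fn: Optional[Callable[[list[Message]], Reply]] = None,
    ) -> None:
        # Normalize a single fixed response to a one-item list so that the
        # fixed and cycled modes share one code path.
        if responses is not None and not isinstance(responses, list):
            responses = [responses]
        self._responses = responses
        self._response_fn = response_fn
        self._calls = 0

    def run(self, messages: list[Message]) -> dict[str, list[Message]]:
        if self._response_fn is not None:
            # Assumption: the callable wins if both arguments are given.
            reply = self._response_fn(messages)
        elif self._responses:
            # Cycle through the list, wrapping around after the last item.
            reply = self._responses[self._calls % len(self._responses)]
        else:
            # Echo mode: return the text of the last message.
            reply = messages[-1].text
        self._calls += 1
        if isinstance(reply, str):
            reply = Message(role="assistant", text=reply)
        return {"replies": [reply]}
```

With `responses=["a", "b"]`, three consecutive calls in this sketch yield `a`, `b`, `a`, which is the wrap-around behavior that makes it possible to script a tool call followed by a final answer.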
Reduce the number of test functions via parametrization (init validation, echo modes, response_fn return types, serialization roundtrips) without losing coverage, and cover the previously-missed defensive branches. mock.py is now at 100% statement coverage. Also fix the docstring usage example to import ToolCall. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review feedback: avoid RST-style double backticks in code comments. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
return replace(base, _meta=self._build_meta(messages, base))

def _make_chunks(self, reply: ChatMessage) -> list[StreamingChunk]:
A small thing for a potential future PR would be to also support streaming reasoning content.
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
Address review feedback: echo the last message that has text content instead of preferring the last user message and then falling back. This is behaviorally identical for the typical case (the last message is the user turn) and removes the now-unused ChatRole import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
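The echo rule described in this commit message (pick the last message that has text content, whatever its role) could be sketched as follows. `Message` and `last_text` are hypothetical names, and treating "has text" as "text is not None" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage; text may be absent."""

    role: str
    text: Optional[str] = None


def last_text(messages: list[Message]) -> Optional[str]:
    """Return the text of the last message that carries any, else None."""
    # Walk backwards so the most recent text-bearing message wins,
    # regardless of whether it is a user, system, or assistant turn.
    for message in reversed(messages):
        if message.text is not None:
            return message.text
    return None
```

In the typical case the last message is the user turn, so this gives the same result as looking for the last user message first.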
mpangrazzi left a comment
This already looks good, and I would definitely use it for, e.g., load/stress testing without actually calling LLMs. I left a few comments!
if text is None:
    return None
base = ChatMessage.from_assistant(text)
I was wondering about adding a check here that ensures the last message is from the assistant when building a reply (since `ChatMessage` instances are appended unchanged here, one could add only user messages).
Sebastian raised a similar, though not identical, point in our conversation. I am addressing this now in 015a703.
ChatGenerator replies are always assistant messages now. If they aren't, we raise an error.
chunks.append(
    StreamingChunk(content="", component_info=component_info, index=0, meta={"model": self.model})
)
What about adding reasoning streaming (if present)?
Great minds think alike: #11708 (comment)
We'll leave that for a later PR.
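As context for this thread, here is a rough sketch of how text chunks might be reconstructed from a predefined reply and fed to a callback. The splitting rule (one chunk per word, keeping adjacent whitespace) and the names `split_into_chunks` and `stream_reply` are assumptions for illustration; the PR's `_make_chunks` may work differently, and reasoning content is not covered here.

```python
import re
from typing import Callable


def split_into_chunks(text: str) -> list[str]:
    """Split text into word-sized pieces whose concatenation equals text."""
    # Each piece is one run of non-whitespace plus the whitespace around it.
    chunks = re.findall(r"\s*\S+\s*", text)
    if not chunks and text:
        # Whitespace-only text: emit it as a single chunk.
        chunks = [text]
    return chunks


def stream_reply(text: str, callback: Callable[[str], None]) -> None:
    """Invoke the callback once per reconstructed chunk, in order."""
    for chunk in split_into_chunks(text):
        callback(chunk)
```

The property worth keeping in any variant is that joining the chunks reproduces the reply exactly, so streamed and non-streamed runs agree.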
…enerator

Reorder run()/run_async() to (messages, streaming_callback, generation_kwargs, *, tools, tools_strict), mirroring OpenAIChatGenerator so the mock is a true positional drop-in. Previously the order followed FallbackChatGenerator, which puts streaming_callback last and tools positionally. Add a regression test pinning the parameter order and verifying a callback passed as the 2nd positional arg is treated as streaming_callback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
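A regression test of the kind described here can be written with `inspect.signature`. The sketch below uses a hypothetical `SketchGenerator` whose `run` has the parameter order named in the commit message; it is not the test from this PR.

```python
import inspect


class SketchGenerator:
    """Hypothetical class with the parameter order from the commit message."""

    def run(self, messages, streaming_callback=None, generation_kwargs=None, *, tools=None, tools_strict=None):
        # Return what was received so the test can check argument binding.
        return {"streaming_callback": streaming_callback, "tools": tools}


def test_run_parameter_order() -> None:
    params = inspect.signature(SketchGenerator.run).parameters
    # Pin the exact order so an accidental reordering fails loudly.
    assert list(params) == [
        "self",
        "messages",
        "streaming_callback",
        "generation_kwargs",
        "tools",
        "tools_strict",
    ]
    # tools and tools_strict must only be passable by keyword.
    assert params["tools"].kind is inspect.Parameter.KEYWORD_ONLY
    assert params["tools_strict"].kind is inspect.Parameter.KEYWORD_ONLY

    def callback(chunk):
        return None

    # A callable passed as the 2nd positional argument binds to streaming_callback.
    result = SketchGenerator().run([], callback)
    assert result["streaming_callback"] is callback
    assert result["tools"] is None
```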
A Chat Generator's replies are always assistant messages, so reject a non-assistant ChatMessage supplied via `responses` (at construction) or returned from `response_fn` (at run time) with a clear error, instead of emitting a user/system/tool message as a reply. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
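The validation described in this commit message might look roughly like the sketch below. `Message` and `to_assistant_message` are hypothetical names, and the exact exception type and wording used in the PR are not shown in this conversation, so `ValueError` with this message is an assumption.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage."""

    role: str
    text: str


def to_assistant_message(reply: Union[str, Message]) -> Message:
    """Wrap strings as assistant messages; reject messages with other roles."""
    if isinstance(reply, str):
        return Message(role="assistant", text=reply)
    if reply.role != "assistant":
        # Validate rather than coerce: a user/system/tool message is an error,
        # not something to silently relabel.
        raise ValueError(f"Replies must be assistant messages, got role '{reply.role}'.")
    return reply
```

Running the same check at construction time (for `responses`) and at run time (for values returned from `response_fn`) keeps both entry points consistent.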
Should we add this PR to this list for docs updates? https://github.com/deepset-ai/haystack-private/issues/381
Yes, I will do so. Have it under "Notes for the reviewer" in the admittedly long PR description. 😉
…nses

Note in the class/__init__ docstrings (and the ValueError list) that any ChatMessage passed via `responses` or returned from `response_fn` must have the assistant role, and reword `_coerce_to_message`'s docstring to reflect that it validates rather than coerces the role.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
sjrl left a comment
Looks good! Just a few last minor comments.
Related Issues
Proposed Changes:
Adds `MockChatGenerator`, a Chat Generator that returns predefined responses without calling any API. It is a deterministic, zero-cost drop-in replacement for real Chat Generators (e.g. `OpenAIChatGenerator`) in unit tests, smoke tests of customer pipelines, and quick prototypes.

Response selection modes:
- Fixed response: `str` or `ChatMessage`; the same reply is returned every call.
- List of `str`/`ChatMessage`: each call returns the next item, wrapping around. Useful to drive Agent-like loops (e.g. first call returns a tool call, second returns the final answer).
- `response_fn=callable(messages) -> str | ChatMessage`.

Passing `ChatMessage` objects lets you return tool calls or reasoning content for exercising tool-calling pipelines without a real model.

It implements the full Chat Generator interface so it slots in anywhere a real generator goes:
- `run` and `run_async`
- `streaming_callback` (chunks reconstructed from the predefined reply, at init or run time)
- `to_dict`/`from_dict` (including serialization of `response_fn` and `streaming_callback` named callables)
- `warm_up` (no-op)
- `meta` (`model`, `finish_reason`, approximate `usage`)

Files:
- `haystack/components/generators/chat/mock.py`: the component
- `test/components/generators/chat/test_mock.py`: 26 unit tests
- `haystack/components/generators/chat/__init__.py` (lazy import) and `pydoc/generators_api.yml`

How did you test it?
Added unit tests covering all response modes, echo, cycling, tool-call replies, meta merging precedence, non-mutation of stored responses, sync/async runs, sync/async streaming, serialization roundtrips (including `response_fn` and echo mode), and a `Pipeline` integration + serialization roundtrip.

Notes for the reviewer
- `usage` token counts are a deliberate approximation (whitespace word count, not real tokenization), documented as such; they exist to give downstream code realistic-looking metadata.

Checklist
🤖 Generated with Claude Code
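The whitespace word count mentioned under "Notes for the reviewer" can be sketched in a few lines. The function name `approximate_usage` and the key names `prompt_tokens`, `completion_tokens`, and `total_tokens` are assumptions for illustration; the PR description only says that `usage` is approximated by whitespace word count.

```python
def approximate_usage(prompt_texts: list[str], reply_text: str) -> dict[str, int]:
    """Approximate token usage by counting whitespace-separated words."""
    # str.split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so empty strings count as 0.
    prompt_tokens = sum(len(text.split()) for text in prompt_texts)
    completion_tokens = len(reply_text.split())
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```

Real tokenizers split text differently, so these counts are only suitable for giving downstream code plausible-looking metadata, not for cost estimates.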