diff --git a/content/en/docs/eino/Cookbook.md b/content/en/docs/eino/Cookbook.md index 0485c508545..65d0db60987 100644 --- a/content/en/docs/eino/Cookbook.md +++ b/content/en/docs/eino/Cookbook.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: Cookbook @@ -63,6 +63,27 @@ This document serves as an example index for the eino-examples project, helping adk/multiagent/integration-excel-agentExcel Agent (ADK Integration)ADK integrated Excel Agent, including Planner, Executor, Replanner, Reporter +### Agent + + + + +
| Directory | Name | Description |
| --- | --- | --- |
| adk/agent/ralph-loop | Ralph Loop | Autonomous iteration mode: an outer `for` loop with `Runner.Run` implements single-turn iteration. The Agent perceives prior work through the file system; a verification gate checks for BUG markers before accepting completion claims |
+ +### Cancel + + + + +
| Directory | Name | Description |
| --- | --- | --- |
| adk/cancel/graceful-exit | Graceful Exit | Demonstrates Agent Cancel + Resume: captures terminal signals, then cancels nested Agents using `CancelAfterChatModel` + `WithRecursive` mode, waits for a safe point to save Checkpoint, then resumes execution |
+ +### Middlewares + + + + +
| Directory | Name | Description |
| --- | --- | --- |
| adk/middlewares/skill | Skill Middleware | Loads Agent skills from the file system (e.g., log_analyzer), demonstrating skill middleware usage |
+ ### GraphTool @@ -209,6 +230,7 @@ This document serves as an example index for the eino-examples project, helping +
| quickstart/chat | Chat QuickStart | The most basic LLM conversation example, including template, generation, streaming output |
| quickstart/eino_assistant | Eino Assistant | Complete RAG application example, including knowledge indexing, Agent service, Web interface |
| quickstart/todoagent | Todo Agent | Simple Todo management Agent example |
| quickstart/chatwitheino | Chat with Eino (Tutorial) | 9-chapter progressive tutorial, from ChatModel → Runner → Session → Tool → Middleware → Callback → Interrupt → GraphTool → Skill, building a complete Agent step by step |
--- diff --git a/content/en/docs/eino/FAQ.md b/content/en/docs/eino/FAQ.md index d75225d43f8..a5c16078527 100644 --- a/content/en/docs/eino/FAQ.md +++ b/content/en/docs/eino/FAQ.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-19" lastmod: "" tags: [] title: FAQ -weight: 11 +weight: 10 --- # Q: cannot use openapi3.TypeObject (untyped string constant "object") as *openapi3.Types value in struct literal, cannot use types (variable of type string) as *openapi3.Types value in struct literal @@ -13,11 +13,7 @@ Check that the github.com/getkin/kin-openapi dependency version does not exceed # Q: Agent streaming calls do not enter the ToolsNode node. Or streaming effect is lost, behaving as non-streaming. -- First update Eino to the latest version - -Different models may output tool calls differently in streaming mode: some models (like OpenAI) output tool calls directly; some models (like Claude) output text first, then output tool calls. Therefore, different methods need to be used for judgment. This field is used to specify the function that determines whether the model's streaming output contains tool calls. - -The Config of ReAct Agent has a StreamToolCallChecker field. If not filled, the Agent will use "non-empty packet" to determine whether it contains tool calls: +- First update Eino to the latest version. Different models may output tool calls differently in streaming mode: some models (like OpenAI) output tool calls directly; some models (like Claude) output text first, then output tool calls. Therefore, different methods need to be used for judgment. This field is used to specify the function that determines whether the model's streaming output contains tool calls. The Config of ReAct Agent has a StreamToolCallChecker field. 
If not filled, the Agent will use "non-empty packet" to determine whether it contains tool calls: ```go func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -45,9 +41,7 @@ func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[ } ``` -The above default implementation is suitable for: Tool Call Messages output by the model only contain Tool Calls. - -Cases where the default implementation is not applicable: there are non-empty content chunks before outputting Tool Calls. In this case, a custom tool call checker is needed: +The above default implementation is suitable for: Tool Call Messages output by the model only contain Tool Calls. Cases where the default implementation is not applicable: there are non-empty content chunks before outputting Tool Calls. In this case, a custom tool call checker is needed: ```go toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -74,9 +68,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes The above custom StreamToolCallChecker needs to check **all packets** for ToolCall when the model normally outputs an answer, which causes the "streaming judgment" effect to be lost. To preserve the "streaming judgment" effect as much as possible, the suggestion is: > 💡 -> Try adding prompts to constrain the model not to output extra text when calling tools, for example: "If you need to call a tool, output the tool directly without outputting text." -> -> Different models may be affected differently by prompts. In actual use, you need to adjust the prompts yourself and verify the effect. +> Try adding prompts to constrain the model not to output extra text when calling tools, for example: "If you need to call a tool, output the tool directly without outputting text." Different models may be affected differently by prompts. 
In actual use, you need to adjust the prompts yourself and verify the effect. # Q: [github.com/bytedance/sonic/loader](http://github.com/bytedance/sonic/loader): invalid reference to runtime.lastmoduledatap @@ -91,19 +83,15 @@ Currently, models generally do not produce illegal JSON output. First confirm wh Eino currently does not support batch processing. There are two optional methods: 1. Dynamically build the graph on demand for each request, with low additional cost. Note that Chain Parallel requires more than one parallel node. -2. Custom batch processing node, where the node handles batch processing tasks internally - -Code example: [https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) +2. Custom batch processing node, where the node handles batch processing tasks internally. Code example: [https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) # Q: Does Eino support structured model output? Two steps. First, require the model to output structured data, with three methods: 1. Some models support direct configuration (like OpenAI's response format). Check if there's such configuration in the model settings. -2. Obtain through tool call functionality -3. Write prompts requiring the model to output structured data - -After getting structured output from the model, you can use schema.NewMessageJSONParser to convert the message to the struct you need. +2. Obtain through tool call functionality. +3. Write prompts requiring the model to output structured data. After getting structured output from the model, you can use schema.NewMessageJSONParser to convert the message to the struct you need. # Q: How to get the Reasoning Content/reasoning/deep thinking content output by the model (chat model): @@ -115,14 +103,8 @@ Discussion by case: 1. 
context.canceled: When executing a graph or agent, the user passed in a cancelable context and initiated a cancellation. Check the context cancel operation in the application layer code. This error is unrelated to the Eino framework. 2. Context deadline exceeded: Could be two situations: - 1. When executing a graph or agent, the user passed in a context with timeout, triggering a timeout. - 2. Timeout or httpclient with timeout was configured for ChatModel or other external resources, triggering a timeout. - -Check `node path: [node name x]` in the thrown error. If the node name is not a node with external calls like ChatModel, it's most likely situation 2-a; otherwise, it's most likely situation 2-b. - -If you suspect it's situation 2-a, check which link in the upstream chain set the timeout on context. Common possibilities include FaaS platforms, etc. - -If you suspect it's situation 2-b, check whether the node has its own timeout configuration, such as Ark ChatModel configured with Timeout, or OpenAI ChatModel configured with HttpClient (with internal Timeout configuration). If neither is configured but still timing out, it may be the model SDK's default timeout. Known default timeouts: Ark SDK 10 minutes, Deepseek SDK 5 minutes. +3. When executing a graph or agent, the user passed in a context with timeout, triggering a timeout. +4. Timeout or httpclient with timeout was configured for ChatModel or other external resources, triggering a timeout. Check `node path: [node name x]` in the thrown error. If the node name is not a node with external calls like ChatModel, it's most likely situation 2-a; otherwise, it's most likely situation 2-b. If you suspect it's situation 2-a, check which link in the upstream chain set the timeout on context. Common possibilities include FaaS platforms, etc. 
If you suspect it's situation 2-b, check whether the node has its own timeout configuration, such as Ark ChatModel configured with Timeout, or OpenAI ChatModel configured with HttpClient (with internal Timeout configuration). If neither is configured but still timing out, it may be the model SDK's default timeout. Known default timeouts: Ark SDK 10 minutes, Deepseek SDK 5 minutes. # Q: How to get the parent graph's State in a subgraph @@ -138,37 +120,268 @@ The latest version of Eino introduces UserInputMultiContent and AssistantGenMult # Q: After upgrading to version 0.6.x, there are incompatibility issues -According to the previous community announcement plan [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397), Eino V0.6.1 has been released. Important update content includes removing the getkin/kin-openapi dependency and all OpenAPI 3.0 related code. +According to the previous community announcement plan [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397), Eino V0.6.1 has been released. Important update content includes removing the getkin/kin-openapi dependency and all OpenAPI 3.0 related code. For errors like undefined: schema.NewParamsOneOfByOpenAPIV3 in some eino-ext modules, upgrade the error-reporting eino-ext module to the latest version. If schema transformation is complex, you can use the [JSONSchema conversion methods](https://bytedance.larkoffice.com/wiki/ZMaawoQC4iIjNykzahwc6YOknXf) document's helper tool methods to assist with conversion. -For errors like undefined: schema.NewParamsOneOfByOpenAPIV3 in some eino-ext modules, upgrade the error-reporting eino-ext module to the latest version. - -If schema transformation is complex, you can use existing OpenAPI 3.0 → JSONSchema conversion tools to assist with conversion. 
+> 💡 -# Q: Which models provided by Eino-ext ChatModel support Response API format calls? +# Q: After creating a model, attempting model calls results in error: 400 Bad Request, message: code: missing_required_parameter; message: Missing required parameter: 'input. -- Currently in Eino-Ext, only ARK's Chat Model can create ResponsesAPI ChatModel through **NewResponsesAPIChatModel**. Other models currently do not support ResponsesAPI creation and usage. - - If you encounter this error, confirm whether the base URL you used to create the chat model is the Chat Completions URL or the Responses API URL. In most cases, an incorrect Responses API base URL was passed. +- If you encounter this error, confirm whether the base URL you used to create the chat model is the Chat Completions URL or the Responses API URL. In most cases, an incorrect Responses API base URL was passed. # Q: How to troubleshoot ChatModel call errors? For example, [NodeRunError] failed to create chat completion: error, status code: 400, status: 400 Bad Request. -This type of error is an error from the model API (such as GPT, Ark, Gemini, etc.). The general approach is to check whether the actual HTTP Request calling the model API has missing fields, incorrect field values, wrong BaseURL, etc. It's recommended to print out the actual HTTP Request through logs and verify/modify the HTTP Request through direct HTTP request methods (such as sending Curl from command line or using Postman for direct requests). After locating the problem, modify the corresponding issues in the Eino code accordingly. - -For how to print out the actual HTTP Request of the model API through logs, refer to this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) +This type of error is an error from the model API (such as GPT, Ark, Gemini, etc.). 
The general approach is to check whether the actual HTTP Request calling the model API has missing fields, incorrect field values, wrong BaseURL, etc. It's recommended to print out the actual HTTP Request through logs and verify/modify the HTTP Request through direct HTTP request methods (such as sending Curl from command line or using Postman for direct requests). After locating the problem, modify the corresponding issues in the Eino code accordingly. For how to print out the actual HTTP Request of the model API through logs, refer to this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) # Q: The gemini chat model created under the eino-ext repository doesn't support using Image URL to pass multimodal data? How to adapt? Currently, the gemini Chat model under the Eino-ext repository has already added support for passing URL types. Use go get github.com/cloudwego/eino-ext/components/model/gemini to update to [components/model/gemini/v0.1.22](https://github.com/cloudwego/eino-ext/releases/tag/components%2Fmodel%2Fgemini%2Fv0.1.22), the current latest version. Test passing Image URL to see if it meets business requirements. -# Q: Before calling tools (including MCP tool), getting JSON Unmarshal failure error, how to solve +# Q: Tool Calls generated by the model have issues (illegal JSON parameters, calling non-existent tools, parameter name changes, etc.), how to handle? + +Tool Calls generated by models (LLMs) may have various issues. Eino provides multi-layered defense mechanisms to address them. Below is an introduction by problem type: + +## 1. 
Tool Call arguments are not valid JSON (Unmarshal failure) + +**Typical error:** `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` **Root cause:** The Argument field in Tool Calls generated by ChatModel is a string. Eino performs JSON Unmarshal before calling tools. If the JSON output by the model is invalid (extra prefix/suffix, special character escaping, missing braces, truncation due to length, etc.), an error will occur. **Solution A: ToolArgumentsHandler (recommended)** Configure `ToolArgumentsHandler` in `ToolsNodeConfig` (or ADK's `ToolsConfig`) to preprocess and fix arguments before tool execution: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: tools, + ToolArgumentsHandler: func(ctx context.Context, name, arguments string) (string, error) { + // Fix common JSON format issues here, such as missing braces, extra prefixes, etc. + return fixJSON(arguments), nil + }, + }, + }, +}) +``` + +A reference implementation for JSON fixing: [eino-examples/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) **Execution order:** `ArgumentsAliases replacement → ToolArgumentsHandler → Tool execution` + +## 2. Model calls a non-existent tool (Tool Name hallucination) + +**Typical error:** `tool xxx not found in toolsNode indexes` **Root cause:** The model may "hallucinate" non-existent tool names. **Solution: UnknownToolsHandler** When configured, instead of throwing an error when the model calls a non-existent tool, the Handler returns a prompt text allowing the model to self-correct: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + UnknownToolsHandler: func(ctx context.Context, name, input string) (string, error) { + return fmt.Sprintf("Tool '%s' does not exist. Available tools: %s. 
Please retry.", name, availableToolNames), nil + }, +} +``` + +## 3. Tool name or parameter names have changed (compatibility issues from schema migration) + +**Scenario:** Tool renamed (e.g., `search` → `web_search`), or parameter field renamed (e.g., `q` → `query`), but the model may still use old names. This is especially common when using LLM Cache or when conversation history records old tool schemas. **Solution: ToolAliases** Configure name aliases and parameter aliases for tools, and the framework automatically resolves them during dispatch: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolAliases: map[string]compose.ToolAliasConfig{ + "web_search": { + NameAliases: []string{"search", "web-search"}, // old tool name → current tool name + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, // old parameter name → current parameter name + }, + }, + }, +} +``` + +> 💡 +> ToolAliases parameter alias replacement occurs before ToolArgumentsHandler. The complete execution order is: Name Alias resolution → Arguments Alias replacement → ToolArgumentsHandler → Tool execution. + +## 4. Let the model self-correct after tool execution failure (instead of interrupting the flow) + +**Scenario:** When a Tool execution errors (e.g., file not found, permission denied, API call failure), it defaults to interrupting the Agent flow. But usually a better approach is to return the error message as a normal Tool Result to the model, letting the model automatically correct and retry. 
**Solution A: ADK Middleware (WrapInvokableToolCall)** In ADK Agent, convert errors to string results through `ChatModelAgentMiddleware`'s `WrapInvokableToolCall` method: + +```go +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err // Don't convert interrupt errors + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +Reference: [quickstart/chatwitheino Ch05 Middleware](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) **Solution B: compose layer ToolCallMiddlewares** Use `ToolCallMiddlewares` directly at the compose layer, suitable for scenarios using Graph/ToolsNode directly: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolCallMiddlewares: []compose.ToolMiddleware{ + { + Invokable: func(next compose.InvokableToolEndpoint) compose.InvokableToolEndpoint { + return func(ctx context.Context, in *compose.ToolInput) (*compose.ToolOutput, error) { + output, err := next(ctx, in) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + return &compose.ToolOutput{Result: fmt.Sprintf("[tool error] %v", err)}, nil + } + return output, nil + } + }, + }, + }, +} +``` + +Reference: [eino-examples/components/tool/middlewares/errorremover](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/errorremover) -The Argument field in Tool Call generated by ChatModel is a string. When the Eino framework calls tools based on this Argument string, it first does JSON Unmarshal. 
At this point, if the Argument string is not valid JSON, JSON Unmarshal will fail, throwing an error like: `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` +> 💡 +> Note: When converting errors, you must first check `compose.IsInterruptRerunError`. InterruptRerun errors are control flow signals used by the framework for Human-in-the-loop and similar scenarios, and should not be swallowed. + +## Summary -The fundamental solution to this problem is to rely on the model to output valid Tool Call Arguments. Engineering-wise, we can try to fix some common JSON format issues, such as extra prefixes/suffixes, special character escaping issues, missing braces, etc., but cannot guarantee 100% correction. A similar fix implementation can be referenced in this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) + + + + + + +
| Problem | Mechanism | Configuration Location |
| --- | --- | --- |
| Invalid argument JSON | `ToolArgumentsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
| Calling non-existent tool | `UnknownToolsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
| Tool name/parameter name changes | `ToolAliases` | `ToolsNodeConfig` / `ToolsConfig` |
| Tool execution error needs auto-correction | Middleware error conversion | ADK `Handlers` or `ToolCallMiddlewares` |
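
The `ToolArgumentsHandler` example in section 1 calls `fixJSON(arguments)` without defining it. The following is a minimal sketch of what such a helper might look like, using only the Go standard library. The names `fixJSON` and `unclosedBraces` and the repair strategy are assumptions made for illustration; this is not the jsonfix reference implementation and not part of Eino. It covers three of the failure modes listed in section 1 (extra prefix, extra suffix, missing closing braces) and returns the input unchanged when it cannot produce valid JSON, so the normal Unmarshal error still surfaces.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// unclosedBraces counts '{' that have no matching '}' outside string literals.
// It returns 0 when the text ends inside a string or has surplus '}'.
func unclosedBraces(s string) int {
	depth, inString, escaped := 0, false, false
	for _, r := range s {
		switch {
		case escaped:
			escaped = false
		case inString && r == '\\':
			escaped = true
		case r == '"':
			inString = !inString
		case !inString && r == '{':
			depth++
		case !inString && r == '}':
			depth--
		}
	}
	if inString || depth < 0 {
		return 0
	}
	return depth
}

// fixJSON is a hypothetical helper (an assumption, not an Eino API). It tries a
// few cheap repairs and falls back to the original input if none yields valid JSON.
func fixJSON(arguments string) string {
	s := strings.TrimSpace(arguments)
	if json.Valid([]byte(s)) {
		return s
	}
	// Drop any text before the first '{', e.g. "Sure! {...}".
	start := strings.Index(s, "{")
	if start < 0 {
		return arguments
	}
	s = s[start:]
	if json.Valid([]byte(s)) {
		return s
	}
	// Drop any text after the last '}', e.g. a trailing code fence.
	if end := strings.LastIndex(s, "}"); end >= 0 {
		if candidate := s[:end+1]; json.Valid([]byte(candidate)) {
			return candidate
		}
	}
	// Append the closing braces that are missing.
	if n := unclosedBraces(s); n > 0 {
		if candidate := s + strings.Repeat("}", n); json.Valid([]byte(candidate)) {
			return candidate
		}
	}
	return arguments
}

func main() {
	inputs := []string{
		`{"q":"go"}`,
		`Sure! {"q":"go"}`,
		`{"a":{"b":1}`,
		`not json`,
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, fixJSON(in))
	}
}
```

A helper like this would be returned from the handler exactly as in the section 1 snippet (`return fixJSON(arguments), nil`). Repairs of this kind are best-effort: truncated strings or damaged escaping are deliberately left untouched rather than guessed at.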
# Q: How to visualize the topology structure of a graph/chain/workflow? Use the `GraphCompileCallback` mechanism to export the topology structure during `graph.Compile`. A code example for exporting as a mermaid diagram: [https://github.com/cloudwego/eino-examples/tree/main/devops/visualize](https://github.com/cloudwego/eino-examples/tree/main/devops/visualize) +# Q: How to get Tool Call Messages and Tool Result from tool calls in Eino's Flow/ReAct Agent scenarios? + - For obtaining intermediate structures in Flow/React Agent scenarios, refer to the document [Eino: ReAct Agent Manual](/docs/eino/core_modules/flow_integration_components/react_agent_manual) +- Additionally, you can replace Flow/React Agent with ADK's ChatModel Agent. For details, refer to [Eino ADK: Overview](/docs/eino/core_modules/eino_adk/agent_preview) + +# Q: When using Eino to develop an Agent, defined a tool (Tool) that doesn't require any parameters. Why do some large models encounter JSON Schema validation failures (such as `unknown msg type` or unsupported format) when calling? How to resolve this properly? + +**A: Root cause:** In the Function Calling / tool calling ecosystem, many large model vendors have strict format validation logic for the JSON Schema sent. If developers incorrectly pass empty parameter maps or empty structs when defining parameterless tools (e.g., causing the framework to generate `{"type": "object", "properties": {}}` which is syntactically valid but semantically meaningless), some model validation engines will reject the request as an unexpected format. **Framework mechanism and code behavior:** + +- In Eino framework's core definition (`eino/schema/tool.go`), the `schema.ToolInfo` struct specifically uses the `ParamsOneOf` field to describe parameters. +- The framework design explicitly allows: for tools that don't need parameters, `ParamsOneOf` should be `nil`. 
+- When `ParamsOneOf` is `nil`, Eino's underlying components will directly omit the tool's `parameters` field when building requests to various model providers, fundamentally avoiding triggering model strict validation rules. **Best practice:** When constructing parameterless tools in Eino, **do not use empty structs or empty Maps to initialize parameter descriptions**. Simply let `ParamsOneOf` remain in its default `nil` state. + +```go +tool := &schema.ToolInfo{ + Name: "fetch_current_time", + Desc: "Get the current system time, no parameters needed", + // Best practice: explicitly set to nil, or simply don't declare the field + ParamsOneOf: nil, +} +``` + +**(Note: If using **utils.InferTool** or similar reflection-based inference tools with an empty struct as input, ensure the Eino extension version being used correctly handles filtering of empty properties, or consider manually overriding the parameter definition as needed.)** + +# Q: How to get Session Values outside the Agent (e.g., deep agent's TODOs)? + +In ADK, `adk.GetSessionValues(ctx)` and `adk.AddSessionValue(ctx, key, value)` depend on the `runSession` injected into the context during Agent execution. This means they can **only be used within the Agent's execution context**—for example, in Middleware, Handler, or Tool callback functions. When the user gets an `AsyncIterator` through Runner's `Run` method and consumes `AgentEvent` externally, they are no longer in the Agent's execution context, so Session Values cannot be obtained through `adk.GetSessionValues`. If you need to get Session Values in real-time during Agent execution (e.g., while consuming streaming events), consider using Middleware/Callback Handler callbacks to pass the needed data through other channels (such as a channel). + +# Q: How to distinguish AgentEvents from multiple concurrent SubAgents with the same name? 
+ +**Scenario:** When using DeepAgent, multiple SubAgents with the same name (e.g., `general-purpose`) may execute concurrently. When consuming `AsyncIterator[*AgentEvent]` through Runner, events from different instances are hard to distinguish. **Solution: Wrap Agent, inject identifier through CustomizedOutput** `AgentOutput` provides a `CustomizedOutput any` field that can carry custom data. By wrapping the Agent's `Run` method, inject a unique identifier on each emitted event: + +```go +type wrappedAgent struct { + adk.Agent + identifier int +} + +func (w *wrappedAgent) Run(ctx context.Context, input *adk.AgentInput, options ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter := w.Agent.Run(ctx, input, options...) + newIter, newGen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer newGen.Close() + for { + event, ok := iter.Next() + if !ok { + break + } + // Note: event.Output may be nil (e.g., error events, action-only events) + if event.Output == nil { + event.Output = &adk.AgentOutput{} + } + event.Output.CustomizedOutput = w.identifier + newGen.Send(event) + } + }() + return newIter +} +``` + +**Usage:** + +```go +agent1 := &wrappedAgent{Agent: generalAgent, identifier: 1} +agent2 := &wrappedAgent{Agent: generalAgent, identifier: 2} +// Pass agent1, agent2 as SubAgents to DeepAgent +``` + +**Consumer side distinction:** + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Output != nil && event.Output.CustomizedOutput != nil { + id := event.Output.CustomizedOutput.(int) + fmt.Printf("Event from agent instance %d\n", id) + } +} +``` + +> 💡 +> Notes: +> +> 1. event.Output may be nil. You must do a nil check before setting CustomizedOutput. +> 2. This wrapper only covers the Run method. If the Agent implements the ResumableAgent interface (like Agents created by DeepAgent), the Resume method is called directly through the embedded Agent, and its events will not have the identifier injected. 
For complete coverage, you need to wrap the Resume method as well. +> 3. This solution is a workaround, suitable for quickly solving the distinction problem. CustomizedOutput will not be persisted to Checkpoint. + +# Q: How to load corresponding ToolInfo only when a Skill is triggered? / How to use Skill to force the model to call a specific tool? + +The root of both questions lies in confusing the concepts of Skill and Tool. **The essence of Skill is Prompt.** When triggered, the Skill middleware inserts a new UserMessage into the conversation, whose content is the Skill's Prompt text. You can write "please call xxx tool with parameters yyy" in the Skill Prompt, but this is still just a prompt—whether the model follows it depends on the quality of Prompt Engineering and the model's own randomness. **The essence of Tool (ToolInfo) is request parameters.** The ToolInfo list is sent as the `tools` parameter of the ChatModel request to the model, telling it "which tools you can call." Unless using ToolSearch for dynamic loading (supported by Claude, GPT 5.4+, etc.), ToolInfo must be passed along with the request. **About "dynamically loading ToolInfo when Skill triggers":** To achieve this effect means that when the Skill Prompt is inserted into the conversation, the corresponding tool definitions needed by that Skill are also appended to the `[]ToolInfo` of the current request. This is entirely user-side custom behavior—you need to: 1) identify whether a Skill was triggered in the current turn; 2) determine which Tools the Skill needs; 3) before constructing the ChatModel request, append the corresponding ToolInfo to `[]ToolInfo`. Note that `[]ToolInfo` is at the front of the Prompt Cache, and dynamically appending new tools will very likely break Prompt Cache, causing cache hit rate drops and increased latency. If you care about cache efficiency, pass all potentially needed tools at initialization. 
**About "using Skill to force model to call a specific tool":** Skill only sends a text prompt to the model. Whether the model strictly follows depends on the Prompt's clarity, the model's instruction-following ability, and context interference. This is essentially a Prompt Engineering problem with inherent uncertainty. If business requirements demand 100% certainty of calling a specific tool, specify ToolChoice in the LLM request to force the model to select that tool, or directly call the tool in application-layer code rather than relying on model decisions. + +> 💡 +> Recommended practices: Want the model to "likely" call a specific tool when Skill triggers → clearly write out the tool name, parameter format, and call instructions in the Skill Prompt; Need to dynamically control available tool set → use ToolSearch or dynamically modify `[]ToolInfo` in ChatModel middleware based on context; Must 100% call a specific tool → call directly in application-layer code, don't rely on model decisions; Worried about Prompt Cache invalidation → pass all potentially needed ToolInfo at initialization, avoid dynamic additions/deletions. + +# Q: Supervisor SubAgent transferring back to main Agent errors / transfer_to_agent forwarding causes SubAgent to receive changed user content + +These issues are all related to ADK's AgentTransfer mechanism. Supervisor is a multi-Agent collaboration mode based on AgentTransfer. The AgentTransfer mechanism has the following known limitations: + +- **Full context sharing**: Supervisor and SubAgents, and among SubAgents, are forced to share the complete context, leading to high token costs and latency. +- **Attention dilution**: Fully shared context is often redundant for SubAgents, diluting their focus on their actual tasks and reducing execution quality. 
+- **Context pollution**: "Successfully transferred to xxx" messages generated during transfer persist in context, potentially misleading subsequent Agent Tool Call decisions (forming incorrect few-shot examples). +- **Forced tool injection**: The mechanism requires injecting Transfer Tool (and possibly Exit Tool), increasing ToolInfo list complexity. + +> 💡 +> For the above reasons, the AgentTransfer / Supervisor mode in ADK is currently marked as "not recommended". + +**Recommended alternative:** Use DeepAgent or ChatModelAgent + AgentTool combination. In this mode: + +- Each AgentTool has independently encapsulated context, no mutual pollution, faster and cheaper, usually with better results. +- No "Successfully transferred to xxx" interference messages are generated, avoiding misleading model decisions. + +# Q: DeepSeek V4 model has issues with reason content return in tool call scenarios, how to solve? + +The DeepSeek V4 model has known issues with reason content return in tool call scenarios, reported by multiple users. + +**Solution:** Upgrade the corresponding eino-ext deepseek module to the latest version to fix this. + +```shell +go get github.com/cloudwego/eino-ext/components/model/deepseek@latest +``` -# Q: Gemini model error missing a `thought_signature` +After upgrading, run again to confirm whether reason content return has recovered to normal. 
diff --git a/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md b/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md index 5975d0e8ce5..b9ffb22039b 100644 --- a/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md +++ b/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Orchestration Design Principles' +title: Orchestration Design Principles weight: 2 --- @@ -321,7 +321,7 @@ func Init() { Eino's Graph type alignment check occurs at `err = graph.AddEdge("node1", "node2")` when checking whether the two nodes' types match. This allows discovering type mismatch errors during `graph building` or `Compile process`, applicable to rules ① ② ③ listed in [Eino: Orchestration Design Principles](/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles). -When the upstream node's output is `interface`, if the downstream node type implements that `interface`, upstream may be convertible to downstream type (type assertion), but can only be determined during `runtime`. At this point, if it's determined that upstream cannot be assigned to downstream, an error will be thrown. +When the upstream node's output is `interface`, if the downstream node type implements that `interface`, upstream may be convertible to downstream type (type assertion), but this can only be determined during `runtime`. The type check for this scenario has been moved to runtime. 
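The runtime check described above can be pictured with plain Go. The following is a minimal sketch, not Eino's implementation: `assignInput` is a hypothetical helper, and it only illustrates why a value travelling through an interface-typed edge (`any` here) can be matched against the downstream input type no earlier than when a concrete value arrives.

```go
package main

import "fmt"

// assignInput is a hypothetical helper (not part of Eino). The upstream value
// is only known as `any`, so whether it can be handed to a downstream node
// expecting type T is decided by a type assertion at the moment the value
// actually arrives.
func assignInput[T any](upstream any) (T, error) {
	v, ok := upstream.(T)
	if !ok {
		var zero T
		return zero, fmt.Errorf("upstream value of type %T cannot be assigned to downstream input type %T", upstream, zero)
	}
	return v, nil
}

func main() {
	// A string flowing through an `any`-typed edge fits a string input.
	s, err := assignInput[string]("hello")
	fmt.Println(s, err)

	// An int does not fit, and the mismatch surfaces only at runtime.
	_, err = assignInput[string](42)
	fmt.Println(err)
}
```

Nothing in the signature of `assignInput` lets the compiler reject the second call, which is why this class of mismatch cannot be reported while the graph is being built.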
The structure is shown in the figure below: diff --git a/content/en/docs/eino/core_modules/components/document_transformer_guide.md b/content/en/docs/eino/core_modules/components/document_transformer_guide.md index 080ba8e7c81..4848ad327da 100644 --- a/content/en/docs/eino/core_modules/components/document_transformer_guide.md +++ b/content/en/docs/eino/core_modules/components/document_transformer_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: 'Eino: Document Transformer User Guide' @@ -160,9 +160,11 @@ for idx, doc := range outDocs { ## **Existing Implementations** -1. Markdown Header Splitter: Document splitting based on Markdown headers [Splitter - markdown](/docs/eino/ecosystem_integration/document/splitter_markdown) -2. Text Splitter: Document splitting based on text length or delimiters [Splitter - semantic](/docs/eino/ecosystem_integration/document/splitter_semantic) -3. Document Filter: Filter document content based on rules [Splitter - recursive](/docs/eino/ecosystem_integration/document/splitter_recursive) + + + + +
- markdown: README_zh.md, README.md
- recursive: README_zh.md, README.md
- semantic: README_zh.md, README.md
## **Implementation Reference** diff --git a/content/en/docs/eino/core_modules/components/embedding_guide.md b/content/en/docs/eino/core_modules/components/embedding_guide.md index b0cf613360d..02c132bbaff 100644 --- a/content/en/docs/eino/core_modules/components/embedding_guide.md +++ b/content/en/docs/eino/core_modules/components/embedding_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: 'Eino: Embedding User Guide' diff --git a/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md b/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md index 87b36f8e210..007c3480b3d 100644 --- a/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md +++ b/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-03" +date: "2026-05-17" lastmod: "" tags: [] title: 'Eino: ToolsNode & Tool Guide' @@ -282,7 +282,83 @@ type ToolInfo struct { The Tool component uses ToolOption to define optional parameters. ToolsNode has no abstracted common options. Each specific implementation can define its own specific Options, wrapped into the unified ToolOption type using the WrapToolImplSpecificOptFn function. -## Usage +## Tool Aliases 🏷️ alpha/09 + +The Tool Alias feature allows configuring **name aliases** and **argument aliases** for tools, enabling automatic resolution to the real tool and canonical parameters when an LLM calls a tool using an alias. 
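To make the idea concrete, here is a small standard-library sketch of what "resolving an alias to the real tool and canonical parameters" can look like. It is an illustration under assumptions, not the ToolsNode implementation: `aliasConfig`, `resolver`, and `resolve` are hypothetical names, and the handling of a call that carries both an alias key and its canonical key is one possible reading of the feature.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aliasConfig is a stand-in for the real configuration type; the field names
// are illustrative only.
type aliasConfig struct {
	nameAliases []string
	argAliases  map[string][]string // canonical key -> aliases
}

// resolver holds lookup tables that are built once from the configuration.
type resolver struct {
	nameIndex map[string]string            // alias or canonical name -> canonical name
	argIndex  map[string]map[string]string // canonical tool -> alias key -> canonical key
}

func newResolver(cfg map[string]aliasConfig) *resolver {
	r := &resolver{
		nameIndex: map[string]string{},
		argIndex:  map[string]map[string]string{},
	}
	for tool, c := range cfg {
		r.nameIndex[tool] = tool
		for _, a := range c.nameAliases {
			r.nameIndex[a] = tool
		}
		keys := map[string]string{}
		for canonical, aliases := range c.argAliases {
			for _, a := range aliases {
				keys[a] = canonical
			}
		}
		r.argIndex[tool] = keys
	}
	return r
}

// resolve maps a possibly aliased call to its canonical tool name and
// canonical argument keys.
func (r *resolver) resolve(name, argsJSON string) (string, string, error) {
	tool, ok := r.nameIndex[name]
	if !ok {
		return "", "", fmt.Errorf("unknown tool %q", name)
	}
	var args map[string]json.RawMessage
	if err := json.Unmarshal([]byte(argsJSON), &args); err != nil {
		return "", "", err
	}
	for alias, canonical := range r.argIndex[tool] {
		v, present := args[alias]
		if !present {
			continue
		}
		if _, clash := args[canonical]; clash {
			continue // canonical key wins; the alias key is left untouched
		}
		args[canonical] = v
		delete(args, alias)
	}
	out, err := json.Marshal(args)
	return tool, string(out), err
}

func main() {
	r := newResolver(map[string]aliasConfig{
		"search": {
			nameAliases: []string{"find"},
			argAliases:  map[string][]string{"query": {"q"}, "limit": {"count"}},
		},
	})
	name, args, err := r.resolve("find", `{"q":"eino","count":3}`)
	fmt.Println(name, args, err)
}
```

Building the indexes once keeps each call to a couple of map lookups, and doing the remapping on raw JSON means the tool itself never needs to know an alias was used.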
+ +### Configuration Structure + +```go +// ToolAliasConfig configures name and argument aliases for a single tool +type ToolAliasConfig struct { + // NameAliases is the list of alternative names for the tool + // If the model returns any of these names, it will be resolved to the canonical tool name + NameAliases []string + + // ArgumentsAliases maps canonical parameter keys to their alias lists + // key=canonical name, value=[]aliases + // e.g.: {"query": ["q", "search_term"], "limit": ["max_results", "count"]} + ArgumentsAliases map[string][]string +} +``` + +Configure via the `ToolAliases` field in `ToolsNodeConfig`: + +```go +config := &compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool, weatherTool}, + ToolAliases: map[string]ToolAliasConfig{ + "search": { + NameAliases: []string{"find", "query", "search_v1"}, + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, + "limit": {"max_results", "count"}, + }, + }, + }, +} +toolsNode, err := compose.NewToolNode(ctx, config) +``` + +### Dynamic Override + +Use the `WithToolAliases()` call option to override global alias configuration at runtime: + +```go +// Override alias configuration (retain original tool list) +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolAliases(map[string]compose.ToolAliasConfig{ + "search": { + NameAliases: []string{"new_alias"}, + }, + }), +) + +// Override both tool list and aliases +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolList(newSearchTool), + compose.WithToolAliases(map[string]compose.ToolAliasConfig{...}), +) +``` + +### Execution Flow + +Processing order during tool calls: + +1. **Name Resolution**: The tool name returned by the LLM (which may be an alias) is resolved to the canonical tool name via index lookup +2. **Argument Remapping**: Alias keys in JSON arguments are automatically replaced with canonical keys +3. 
**ToolArgumentsHandler** (if configured): Receives the canonical tool name and already-remapped arguments +4. **Tool Execution**: The tool is called with canonical name and arguments + +### Notes + +- Name aliases **cannot** conflict with other tools' canonical names or registered aliases +- Argument aliases **cannot** conflict with existing property names in the tool's JSON Schema +- When both an alias key and the canonical key are **present simultaneously** in the argument JSON, the canonical key takes precedence and the alias key is kept as-is +- Configuring aliases for non-existent tool names will be **silently ignored** +- The alias feature supports both **standard tools** and **enhanced tools** + +## **Usage** ### Standard Tool Usage diff --git a/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md b/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md index 65ecbba27e5..6d5579ee3d3 100644 --- a/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md +++ b/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md @@ -1,44 +1,49 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] -title: Eino Dev Visual Debugging Guide +title: Eino Dev Visual Debug Plugin Guide weight: 3 --- -## Overview +## Introduction > 💡 -> Use this plugin to visually debug orchestration artifacts built with Eino (Graph, Chain): +> Use this plugin to visually debug orchestration artifacts (Graph, Chain) built with the Eino framework, including: > -> 1. Visual rendering of orchestration -> 2. Start from any operable node and debug with mock input +> 1. Visual rendering of orchestration artifacts; +> 2. Start debugging from any operable node with mock input. 
## Quick Start ### Download eino-examples -Repo: [https://github.com/cloudwego/eino-examples](https://github.com/cloudwego/eino-examples) +> GitHub repository: _[https://github.com/cloudwego/eino-examples](https://github.com/cloudwego/eino-examples)_ ```bash -git clone https://github.com/cloudwego/eino-examples.git -# or +# HTTPS +git clone https://github.com/cloudwego/eino-examples.git + +# SSH git clone git@github.com:cloudwego/eino-examples.git ``` ### Install Dependencies -In the project directory, run the following in order: +Run the following commands sequentially in the project directory: ```bash +# 1. Pull latest devops repository go get github.com/cloudwego/eino-ext/devops@latest + +# 2. Cleans and updates go.mod and go.sum go mod tidy ``` ### Run the Demo -Open `eino-examples/devops/debug/main.go` and run `main.go`. The plugin launches a local HTTP service to connect to your process; allow network access if prompted. +Navigate to `eino-examples/devops/debug/main.go` and run `main.go`. Since the plugin also starts a local HTTP service to connect to the user's service process, a network access warning will pop up — click Allow. @@ -46,19 +51,19 @@ Open `eino-examples/devops/debug/main.go` and run `main.go`. The plugin launches
-1) Click the debug feature entry on the left or center to open configuration +1. Click the debug feature on the left side or in the center to enter debug configuration -2) Click “Configure Address” +2. Click to configure the debug address
-3) Enter 127.0.0.1:52538 +3. Enter 127.0.0.1:52538 -4) Confirm to enter the debug view, then select the Graph to debug +4. Click confirm to enter the debug interface and select the Graph to debug
@@ -66,43 +71,44 @@ Open `eino-examples/devops/debug/main.go` and run `main.go`. The plugin launches
-1) Click “Test Run” to start from START +1. Click "Test Run" to start execution from the start node -2) Enter "hello eino" and confirm +2. Enter "hello eino" and click confirm
- - - - - -
3) Inspect per-node inputs/outputs
4) Switch Input/Output views
+ +3. The debug area displays the input and output of each node + + +4. Click Input and Output to switch between viewing node information + + ## Feature Overview ### Local or Remote Debugging -Configure `IP:Port` to connect to the target process, whether local or remote. +Whether the target orchestration artifact is running on your local machine or a remote server, you can connect to the target debug server by configuring the IP:Port. -### Orchestration Visualization +### Orchestration Topology Visualization -Supports Graph and Chain topology visualization. +Supports visualization of Graph and Chain orchestration topologies. -### Start from Any Node +### Debug from Any Node -### Inspect Node Results +### View Node Execution Results -Each node’s input, output, and execution time are shown in order. +Each node's execution results are displayed in execution order in the debug area, including: input, output, and execution time. @@ -110,7 +116,7 @@ Each node’s input, output, and execution time are shown in order. ### Orchestrate with Eino -The plugin supports debugging Graph and Chain artifacts. Example registration: +The plugin supports debugging Graph and Chain orchestration artifacts. Assume you already have orchestration code as follows: ```go func RegisterSimpleGraph(ctx context.Context) { @@ -140,24 +146,32 @@ func RegisterSimpleGraph(ctx context.Context) { ### Install Dependencies +Run the following commands sequentially in the project directory: + ```bash +# 1. Pull latest devops repository go get github.com/cloudwego/eino-ext/devops@latest + +# 2. Cleans and updates go.mod and go.sum go mod tidy ``` -### Initialize Debugging +### Call the Debug Initialization Function -Because debugging starts an HTTP service in your main process to interact with the local plugin, you must call `Init()` from `github.com/cloudwego/eino-ext/devops` to start the debug service. 
+Because debugging requires starting an HTTP service in the user's main process for interaction with the local debug plugin, you need to call `Init()` from _github.com/cloudwego/eino-ext/devops_ once to start the debug service. > 💡 > Notes > -> 1. Ensure the target orchestration has run `Compile()` at least once. -> 2. `devops.Init()` must run before calling `Compile()`. -> 3. Make sure the main process stays alive after `devops.Init()`. +> 1. Ensure the target orchestration artifact has executed `Compile()` at least once. +> 2. `devops.Init()` must be executed before calling `Compile()`. +> 3. You must ensure the main process does not exit after `devops.Init()` is executed. +> 4. Starting from v0.1.9, the default listening address for the debug service changed from `0.0.0.0` to `127.0.0.1` (local connections only). For remote debugging, explicitly specify the listening IP via `WithDevServerIP`, e.g.: `devops.Init(ctx, devops.WithDevServerIP("0.0.0.0"))`. + +For example, add debug service startup code in the `main()` function: ```go -// 1. Initialize debug service +// 1. Call the debug service initialization function err := devops.Init(ctx) if err != nil { logs.Errorf("[eino dev] init failed, err=%v", err) @@ -170,9 +184,9 @@ RegisterSimpleGraph(ctx) ### Run Your Process -Run your process locally or remotely, and ensure the main process does not exit. +Run your process on your local machine or in a remote environment, and ensure the main process does not exit. -In `github.com/cloudwego/eino-examples/devops/debug/main.go`, `main()` looks like: +In github.com/cloudwego/eino-examples/devops/debug/main.go, the `main()` code is as follows: ```go func main() { @@ -199,102 +213,132 @@ func main() { } ``` -### Configure Address +### Configure Debug Address + +- **IP**: The IP address of the server where the user process is running. 
+ - If the user process is running on your local machine, enter `127.0.0.1`; + - If the user process is running on a remote server, enter the remote server's IP address, supporting both IPv4 and IPv6. +- **Port**: The port the debug service listens on, default is `52538`, configurable via the `WithDevServerPort` option method. -- IP: `127.0.0.1` for local; remote server IP for remote (IPv4/IPv6). -- Port: default `52538`, configurable via `WithDevServerPort`. +> 💡 +> Notes +> +> - Local debugging: The system may show a network access warning — allow access. +> - Remote server debugging: Ensure the port is accessible. Additionally, starting from v0.1.9, the default listening address is `127.0.0.1` only. For remote debugging, you must specify an IP accessible from the remote end (e.g., `0.0.0.0`) via `WithDevServerIP` when calling `devops.Init()`. -Allow network prompts locally; ensure remote ports are reachable. Once connected, the status indicator turns green. +After configuring the IP and Port, click confirm. The debug plugin will automatically connect to the target debug server. If the connection is successful, the connection status indicator will turn green. -### Select an Artifact +### Select Target Orchestration Artifact to Debug -Ensure your target orchestration has been compiled at least once. Multiple `Compile()` runs register multiple artifacts; you’ll see them in the selection list. +Ensure your target orchestration artifact has executed `Compile()` at least once. Since the debug system targets orchestration artifact instances, if `Compile()` is executed multiple times, multiple artifacts will be registered in the debug service, and you'll see multiple debug targets in the selection list. ### Start Debugging -- From START: click “Test Run”, enter mock input (complex types are inferred), and confirm. +Debugging supports starting from any node, including the START node and other intermediate nodes. 
+ +- Debugging from the START node: Click "Test Run" directly, then enter mock input (if the input is a complex structure, the system will automatically infer the input structure), click confirm, and your graph will begin execution. Each node's results will be displayed below. -- From a specific node: click the run button on that node. +- Debugging from any operable node: For example, starting execution from the second node. -## Advanced + + +### View Execution Results + +When debugging from the START node, click Test Run and view the debug results below the plugin. + + + +When debugging from any operable node, view the debug results below the plugin. + + + +## Advanced Features ### Specify Implementation Type for Interface Fields -Interface-typed fields render as `{}` by default. Type a space inside `{}` to select an implementation type. The plugin uses a special JSON structure: +For interface-typed fields, they will be rendered as `{}` by default. Typing a space inside `{}` will bring up a list of interface implementation types. After selecting a type, the system will generate a special struct to express the interface information. The special struct is defined as follows: -```json +```go { - "_value": {}, // JSON value of the concrete type - "_eino_go_type": "*model.MyConcreteType" // Go type name + "_value": {} // JSON value generated according to the concrete type + "_eino_go_type": "*model.MyConcreteType" // Go type name } ``` > 💡 -> Common interface types like `string`, `schema.Message` are built-in. To register custom types, use `devops.AppendType` during `Init()`. +> The system has built-in common interface types such as `string`, `schema.Message`, etc., which can be directly selected. If you need to register custom interface implementation types, use the `AppendType` method provided by `devops`. -1) Suppose you have orchestration code where the graph input is `any`, and `node_1` takes `*NodeInfo`: +1. 
Assume you already have orchestration code as follows, where the graph input is defined as `any` and `node_1`'s input is defined as `*NodeInfo`: -```go -type NodeInfo struct { - Message string -} + ```go + type NodeInfo struct { + Message string + } -func RegisterGraphOfInterfaceType(ctx context.Context) { - // Define a graph that input parameter is any. - g := compose.NewGraph[any, string]() + func RegisterGraphOfInterfaceType(ctx context.Context) { + // Define a graph that input parameter is any. + g := compose.NewGraph[any, string]() - _ = g.AddLambdaNode("node_1", compose.InvokableLambda(func(ctx context.Context, input *NodeInfo) (output string, err error) { - if input == nil { - return "", nil - } - return input.Message + " process by node_1,", nil - })) + _ = g.AddLambdaNode("node_1", compose.InvokableLambda(func(ctx context.Context, input *NodeInfo) (output string, err error) { + if input == nil { + return "", nil + } + return input.Message + " process by node_1,", nil + })) - _ = g.AddLambdaNode("node_2", compose.InvokableLambda(func(ctx context.Context, input string) (output string, err error) { - return input + " process by node_2,", nil - })) + _ = g.AddLambdaNode("node_2", compose.InvokableLambda(func(ctx context.Context, input string) (output string, err error) { + return input + " process by node_2,", nil + })) - _ = g.AddLambdaNode("node_3", compose.InvokableLambda(func(ctx context.Context, input string) (output string, err error) { - return input + " process by node_3,", nil - })) + _ = g.AddLambdaNode("node_3", compose.InvokableLambda(func(ctx context.Context, input string) (output string, err error) { + return input + " process by node_3,", nil + })) - _ = g.AddEdge(compose._START_, "node_1") + _ = g.AddEdge(compose._START_, "node_1") - _ = g.AddEdge("node_1", "node_2") + _ = g.AddEdge("node_1", "node_2") - _ = g.AddEdge("node_2", "node_3") + _ = g.AddEdge("node_2", "node_3") - _ = g.AddEdge("node_3", compose._END_) + _ = g.AddEdge("node_3", 
compose._END_) - r, err := g.Compile(ctx) - if err != nil { - logs.Errorf("compile graph failed, err=%v", err) - return - } -} -``` + r, err := g.Compile(ctx) + if err != nil { + logs.Errorf("compile graph failed, err=%v", err) + return + } + } + ``` +2. Before debugging, register the custom `*NodeInfo` type via the `AppendType` method during `Init()`: -2) Before debugging, register the custom `*NodeInfo` type with `AppendType` at `Init()`: + ```go + err := devops.Init(ctx, devops.AppendType(&graph.NodeInfo{})) + ``` +3. During debugging, in the Test Run JSON input box, interface-typed fields will appear as `{}` by default. You can type a space inside `{}` to view all built-in and custom-registered data types, and select the concrete implementation type for that interface. -```go -err := devops.Init(ctx, devops.AppendType(&graph.NodeInfo{})) -``` + + +1. Fill in the debug node input in the `_value` field. + + -3) During Test Run, interface fields show `{}` by default. Type a space inside `{}` to view all built-in and custom types, select the concrete implementation, then fill `_value`. +1. Click confirm to view the debug results. -### Debugging `map[string]any` + -If a node input is `map[string]any`: +#### Debugging map[string]any + +Here we explain how to debug when the input type is map[string]any. 
If a node's input type is map[string]any, as shown below: ```go func RegisterAnyInputGraph(ctx context.Context) { @@ -341,17 +385,17 @@ func RegisterAnyInputGraph(ctx context.Context) { } ``` -During debugging, in the Test Run JSON input box, use the following format to specify concrete types for values: +During debugging, in the Test Run JSON input box, you need to enter content in the following format: ```json { - "name": { - "_value": "alice", - "_eino_go_type": "string" - }, - "score": { - "_value": "99", - "_eino_go_type": "int" - } + "name": { + "_value": "alice", + "_eino_go_type": "string" + }, + "score": { + "_value": "99", + "_eino_go_type": "int" + } } ``` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md index 7feea7b33e5..ea8ad812459 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md @@ -1,28 +1,24 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: PatchToolCalls -weight: 7 +weight: 8 --- adk/middlewares/patchtoolcalls > 💡 -> The PatchToolCalls middleware is used to fix "dangling tool calls" issues in the message history. This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> The PatchToolCalls middleware is used to fix "dangling tool calls" issues in message history. Introduced in v0.8.0. Supports both `*schema.Message` and `*schema.AgenticMessage` message types. ## Overview -In multi-turn conversation scenarios, there may be cases where an Assistant message contains ToolCalls, but the corresponding Tool message response is missing from the conversation history. 
Such "dangling tool calls" can cause some model APIs to throw errors or produce abnormal behavior.
-
-**Common scenarios:**
+In multi-turn conversation scenarios, an Assistant message may contain ToolCalls while the corresponding Tool response is missing from the conversation history. Such "dangling tool calls" can cause some model APIs to throw errors or produce abnormal behavior.
+
+**Common scenarios:**

 - User sent a new message before tool execution completed, causing the tool call to be interrupted
 - Some tool call results were lost when restoring a session
-- User canceled tool execution in a Human-in-the-loop scenario
-
-The PatchToolCalls middleware scans the message history before each model call and automatically inserts placeholder messages for tool calls that lack responses.
+- User canceled tool execution in a Human-in-the-loop scenario
+
+The PatchToolCalls middleware scans the message history before each model call (in the `BeforeModelRewriteState` hook) and automatically inserts placeholder messages for tool calls that lack responses.

 ## Quick Start

@@ -33,51 +29,67 @@ import (
     "github.com/cloudwego/eino/adk/middlewares/patchtoolcalls"
 )

-// Create middleware with default configuration
+// Use default configuration (cfg can be nil)
 mw, err := patchtoolcalls.New(ctx, nil)
 if err != nil {
     // Handle error
 }

-// Use with ChatModelAgent
 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
     Model:       yourChatModel,
     Middlewares: []adk.ChatModelAgentMiddleware{mw},
 })
 ```

-## Configuration Options
+## API Reference
+
+### Config

 ```go
 type Config struct {
-    // PatchedContentGenerator custom function to generate placeholder message content
-    // Optional, uses default message if not set
     PatchedContentGenerator func(ctx context.Context, toolName, toolCallID string) (string, error)
 }
 ```

-
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `PatchedContentGenerator` | `func(ctx, toolName, toolCallID string) (string, error)` | No | Custom function to generate placeholder message content. Parameters include tool name and call ID, returns the content to fill in |
| `PatchedContentGenerator` | `func(ctx context.Context, toolName, toolCallID string) (string, error)` | No | Custom function to generate placeholder message content. Uses the built-in default message template when not set |
-### Default Placeholder Message
+### New
+
+```go
+func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error)
+```
+
+Creates the PatchToolCalls middleware. `cfg` can be `nil`, in which case the default configuration is used. Internally calls `NewTyped[*schema.Message]`.
+
+### NewTyped
+
+```go
+func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error)
+```
+
+Generic constructor that supports `*schema.Message` and `*schema.AgenticMessage`. `cfg` can be `nil`.
+
+- When `M = *schema.Message`, Tool messages are matched via the `ToolCallID` field
+- When `M = *schema.AgenticMessage`, they are matched via `ContentBlock.FunctionToolResult.CallID`

-If `PatchedContentGenerator` is not set, the middleware uses a default placeholder message:
+### Default Placeholder Message

-**English (default):**
+If `PatchedContentGenerator` is not set, the middleware uses a built-in template (formatted via `fmt.Sprintf`, where the two `%s` verbs correspond to toolName and toolCallID, in that order):
+
+**English (default):**

 ```
-Tool call {toolName} with id {toolCallID} was cancelled - another message came in before it could be completed.
+Tool call %s with id %s was canceled - another message came in before it could be completed.
 ```

 **Chinese:**

 ```
-工具调用 {toolName}(ID 为 {toolCallID})已被取消——在其完成之前收到了另一条消息。
+工具调用 %s(ID 为 %s)已被取消——在其完成之前收到了另一条消息。
 ```

-You can switch languages via `adk.SetLanguage()`.
+Language can be switched via `adk.SetLanguage()`.
## Usage Examples @@ -91,10 +103,24 @@ mw, err := patchtoolcalls.New(ctx, &patchtoolcalls.Config{ }) ``` +### Generic Usage (AgenticMessage) + +```go +mw, err := patchtoolcalls.NewTyped[*schema.AgenticMessage](ctx, nil) +if err != nil { + // Handle error +} + +agent, err := adk.NewTypedChatModelAgent[*schema.AgenticMessage](ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: yourChatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{mw}, +}) +``` + ### Combined with Other Middlewares ```go -// PatchToolCalls should usually be placed at the front of the middleware chain +// PatchToolCalls should typically be placed at the front of the middleware chain // to ensure dangling tool calls are fixed before other middlewares process messages agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, @@ -108,40 +134,33 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## How It Works - - -**Processing Logic:** - -1. Executes in the `BeforeModelRewriteState` hook -2. Iterates through all messages to find Assistant messages containing `ToolCalls` -3. For each ToolCall, checks if a corresponding Tool message exists in subsequent messages (matched by `ToolCallID`) -4. If no corresponding Tool message is found, inserts a placeholder message -5. Returns the repaired message list +> 💡 +> For `*schema.Message`, matching is done via `msg.Role == schema.Tool && msg.ToolCallID`; for `*schema.AgenticMessage`, matching is done via `ContentBlock.FunctionToolResult.CallID`. 
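The matching rule above can be sketched without any Eino types. This is a minimal illustration under assumptions, not the middleware's code: `message`, `toolCall`, and `patch` are hypothetical stand-ins, and for brevity a tool call counts as answered if a tool message with the same ID appears anywhere in the history.

```go
package main

import "fmt"

// defaultTemplate follows the English placeholder text quoted in this document.
const defaultTemplate = "Tool call %s with id %s was canceled - another message came in before it could be completed."

// toolCall and message are simplified stand-ins, not the schema.Message type.
type toolCall struct {
	id   string
	name string
}

type message struct {
	role       string     // "user", "assistant", or "tool"
	content    string
	toolCalls  []toolCall // set on assistant messages
	toolCallID string     // set on tool messages
}

// patch returns a new history in which every assistant tool call without a
// matching tool message gets a placeholder tool message. Placeholders are
// emitted after the tool responses that directly follow the assistant message.
func patch(msgs []message) []message {
	answered := map[string]bool{}
	for _, m := range msgs {
		if m.role == "tool" {
			answered[m.toolCallID] = true
		}
	}

	out := make([]message, 0, len(msgs))
	var pending []message // placeholders waiting for the run of tool messages to end
	for _, m := range msgs {
		if m.role != "tool" {
			out = append(out, pending...)
			pending = nil
		}
		out = append(out, m)
		for _, tc := range m.toolCalls {
			if !answered[tc.id] {
				pending = append(pending, message{
					role:       "tool",
					toolCallID: tc.id,
					content:    fmt.Sprintf(defaultTemplate, tc.name, tc.id),
				})
			}
		}
	}
	return append(out, pending...)
}

func main() {
	history := []message{
		{role: "user", content: "Help me check the weather"},
		{role: "assistant", toolCalls: []toolCall{{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}}},
		{role: "tool", toolCallID: "call_1", content: "Sunny, 25°C"},
		{role: "user", content: "No need to check the location"},
	}
	for _, m := range patch(history) {
		fmt.Printf("[%s] %s %s\n", m.role, m.toolCallID, m.content)
	}
}
```

Running `patch` a second time on its own output adds nothing, because every call ID now has a tool message; that property is what makes it safe to apply before every model call.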
-## Example Scenario +### Example Scenario -### Message History Before Repair +**Before repair:** ``` -[User] "Help me check the weather" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: Sunny, 25°C" -[User] "No need to check the location, just tell me Beijing's weather" <- User interrupts +[User] "Help me check the weather" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: Sunny, 25°C" +[User] "No need to check the location, just tell me Beijing's weather" <- User interrupts ``` -### Message History After Repair +**After repair:** ``` -[User] "Help me check the weather" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: Sunny, 25°C" -[Tool] "call_2: Tool call get_location (ID: call_2) was cancelled..." <- Automatically inserted -[User] "No need to check the location, just tell me Beijing's weather" +[User] "Help me check the weather" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: Sunny, 25°C" +[Tool] "call_2: Tool call get_location (ID: call_2) was canceled..." <- Automatically inserted +[User] "No need to check the location, just tell me Beijing's weather" ``` ## Multi-language Support -Placeholder messages support both Chinese and English, switch via `adk.SetLanguage()`: +Placeholder messages support Chinese and English, switch via `adk.SetLanguage()`: ```go import "github.com/cloudwego/eino/adk" @@ -153,7 +172,8 @@ adk.SetLanguage(adk.LanguageEnglish) // English (default) ## Notes > 💡 -> This middleware only modifies the history messages for the current run in the `BeforeModelRewriteState` hook, and does not affect the actual stored message history. The repair is temporary and only used for the current agent call. 
+> The state returned by `BeforeModelRewriteState` is persisted to the agent's internal state by the framework (see the `ProcessState` call in `wrappers.go`). Therefore, placeholder messages inserted by PatchToolCalls **will be retained in subsequent iterations** and do not need to be re-patched every round. -- It's recommended to place this middleware at the **front** of the middleware chain to ensure other middlewares process a complete message history -- If your scenario requires persisting the repaired messages, implement the corresponding logic in `PatchedContentGenerator` +- It is recommended to place this middleware at the **front** of the middleware chain to ensure other middlewares process a complete message history +- The `cfg` parameter can be `nil`, equivalent to `&Config{}` +- If the message list is empty (`len(state.Messages) == 0`), the middleware returns immediately without any processing diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md index 740b3775891..e045e2a97d7 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md @@ -1,33 +1,28 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: PlanTask -weight: 4 +weight: 6 --- -# PlanTask Middleware - -adk/middlewares/plantask - > 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> This middleware was introduced in v0.8.0. Package path: `github.com/cloudwego/eino/adk/middlewares/plantask` ## Overview -`plantask` is a task management middleware that allows Agents to create and manage task lists. 
The middleware injects four tools through the `BeforeAgent` hook: - -- **TaskCreate**: Create a task -- **TaskGet**: View task details -- **TaskUpdate**: Update a task -- **TaskList**: List all tasks +`plantask` is a task management middleware that injects four tools into the Agent via the `BeforeAgent` hook, giving it structured task planning capabilities: -Main purposes: + + + + + + +
| Tool | Function |
| --- | --- |
| `TaskCreate` | Create a task |
| `TaskGet` | Get details of a single task |
| `TaskUpdate` | Update task status/fields, set dependencies, delete tasks |
| `TaskList` | List summaries of all tasks |
-- Track progress of complex tasks -- Break large tasks into smaller steps -- Manage dependencies between tasks +Core purpose: Break down complex requests into trackable subtasks, manage dependencies between tasks, and show execution progress to users. --- @@ -38,7 +33,7 @@ Main purposes: │ Agent │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ BeforeAgent: Inject task tools │ │ +│ │ BeforeAgent: Inject task tools (with sync.Mutex for concurrency) │ │ │ │ - TaskCreate │ │ │ │ - TaskGet │ │ │ │ - TaskUpdate │ │ @@ -53,7 +48,7 @@ Main purposes: │ │ │ Storage structure: │ │ baseDir/ │ -│ ├── .highwatermark # ID counter │ +│ ├── .highwatermark # Maximum allocated ID (plain numeric text) │ │ ├── 1.json # Task #1 │ │ ├── 2.json # Task #2 │ │ └── ... │ @@ -63,101 +58,122 @@ Main purposes: --- -## Configuration +## API + +### Constructors + +```go +// Generic version, supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) + +// Non-generic version, equivalent to NewTyped[*schema.Message] +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` + +### Config ```go type Config struct { Backend Backend // Storage backend, required - BaseDir string // Task file directory, required + BaseDir string // Task file storage directory, required } ``` -- Note that the Backend implementation should be isolated by session, with different sessions corresponding to different Backends (task lists) +> 💡 +> The Backend should be isolated at the session level — different sessions correspond to different Backend instances (i.e., different task lists). 
---- +### Backend Interface -## Backend Interface +`Backend` is defined in the `plantask` package and is a streamlined subset of `filesystem.Backend`, retaining only the four methods needed for task storage: ```go type Backend interface { LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - Read(ctx context.Context, req *ReadRequest) (string, error) + Read(ctx context.Context, req *ReadRequest) (*filesystem.FileContent, error) Write(ctx context.Context, req *WriteRequest) error Delete(ctx context.Context, req *DeleteRequest) error } ``` +Type alias relationships: + +```go +type FileInfo = filesystem.FileInfo // Path, IsDir, Size, ModifiedAt +type LsInfoRequest = filesystem.LsInfoRequest // Path string +type ReadRequest = filesystem.ReadRequest // FilePath, Offset, Limit +type WriteRequest = filesystem.WriteRequest // FilePath, Content string + +// DeleteRequest is custom to the plantask package (not in the filesystem package) +type DeleteRequest struct { + FilePath string +} +``` + +> 💡 +> Note that `Read` returns `*filesystem.FileContent` (containing the `Content string` field), not a raw string. Import path: `github.com/cloudwego/eino/adk/filesystem`. 
+ --- ## Task Structure ```go type task struct { - ID string `json:"id"` // Task ID - Subject string `json:"subject"` // Title - Description string `json:"description"` // Description - Status string `json:"status"` // Status - Blocks []string `json:"blocks"` // Tasks blocked by this one - BlockedBy []string `json:"blockedBy"` // Tasks blocking this one - ActiveForm string `json:"activeForm"` // Active form text - Owner string `json:"owner"` // Responsible agent - Metadata map[string]any `json:"metadata"` // Custom data + ID string `json:"id"` + Subject string `json:"subject"` + Description string `json:"description"` + Status string `json:"status"` + Blocks []string `json:"blocks"` + BlockedBy []string `json:"blockedBy"` + ActiveForm string `json:"activeForm,omitempty"` + Owner string `json:"owner,omitempty"` + Metadata map[string]any `json:"metadata,omitempty"` } ``` ### Status - - + + - +
StatusDescription
pending
Pending (default)
Status ValueDescription
pending
Pending (default on creation)
in_progress
In progress
completed
Completed
deleted
Deleted (will delete the file)
deleted
Deleted (physically deletes the JSON file and removes from other tasks' dependency lists)
-Status transition: `pending` → `in_progress` → `completed`, any status can be directly `deleted`. +Status transitions: `pending` → `in_progress` → `completed`; any status can be directly set to `deleted`. --- -## Tools +## Tool Parameters ### TaskCreate -Create a task. +Tool name constant: `TaskCreateToolName = "TaskCreate"` - - - - + + + +
ParameterTypeRequiredDescription
subject
stringYesTitle
description
stringYesDescription
activeForm
stringNoActive form text, e.g., "Running tests"
metadata
objectNoCustom data
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `subject` | string | Yes | Task title (imperative form) |
| `description` | string | Yes | Detailed task description, including context and acceptance criteria |
| `activeForm` | string | No | Active form text (e.g., "Running tests"), displayed to users when in_progress |
| `metadata` | object | No | Custom key-value pairs |
-When to use: - -- The task is relatively complex with 3 or more steps -- The user has given a list of things to do -- You need to show progress to the user - -When not to use: - -- It's just a simple task -- Something that can be done quickly +After creation, the task ID auto-increments (based on the `.highwatermark` file), with initial status `pending`. ### TaskGet -View task details. +Tool name constant: `TaskGetToolName = "TaskGet"` - +
ParameterTypeRequiredDescription
taskId
stringYesTask ID
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `taskId` | string | Yes | Task ID (numeric string only) |
-Returns complete information about the task: title, description, status, dependencies, etc. +Returns complete task information: subject, description, status, blocks, blockedBy, owner. ### TaskUpdate -Update a task. +Tool name constant: `TaskUpdateToolName = "TaskUpdate"` @@ -165,24 +181,28 @@ Update a task. - - - - - + + + + +
ParameterTypeRequiredDescription
subject
stringNoNew title
description
stringNoNew description
activeForm
stringNoNew active form text
status
stringNoNew status
addBlocks
[]stringNoAdd blocked tasks
addBlockedBy
[]stringNoAdd tasks blocking this one
owner
stringNoResponsible agent
metadata
objectNoCustom data (set to null to delete)
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `status` | string | No | New status, enum: `pending` / `in_progress` / `completed` / `deleted` |
| `addBlocks` | []string | No | Add task IDs that are blocked by the current task (bidirectional write) |
| `addBlockedBy` | []string | No | Add task IDs that block the current task (bidirectional write) |
| `owner` | string | No | Name of the responsible agent |
| `metadata` | object | No | Merged into existing metadata; setting a key to null deletes that key |
-Notes: +Key behaviors: -- `status: "deleted"` will directly delete the task file -- Circular dependencies are checked when adding dependencies -- Automatic cleanup occurs when all tasks are completed +- `status: "deleted"` physically deletes the task file and removes that ID from all other tasks' blocks/blockedBy +- Adding dependencies performs **cyclic dependency detection**; an error is returned if a cycle would be created +- When **all tasks are completed**, all task files are automatically deleted (cleanup mechanism) ### TaskList -List all tasks, no parameters required. +Tool name constant: `TaskListToolName = "TaskList"` -Returns a summary of each task: ID, status, title, responsible agent, dependencies. +No parameters. Returns a summary list of all tasks (sorted by ID), each in the format: + +``` +#ID [status] subject [owner: xxx] [blocked by #x, #y] +``` --- @@ -191,8 +211,7 @@ Returns a summary of each task: ID, status, title, responsible agent, dependenci ```go ctx := context.Background() -// The plantask middleware should normally be session-scoped -// Different sessions correspond to different task lists +// Backend should be isolated at the session level middleware, err := plantask.New(ctx, &plantask.Config{ Backend: myBackend, BaseDir: "/tasks", @@ -210,76 +229,76 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ### Typical Flow ``` -1. Receive complex task +1. Receive a complex task │ ▼ -2. TaskCreate to create tasks +2. TaskCreate creates multiple subtasks - #1: Analyze requirements - - #2: Write code + - #2: Implement code + - #3: Write tests │ ▼ -3. TaskUpdate to set dependencies - - #2 depends on #1 - - #3 depends on #2 +3. TaskUpdate sets dependencies + - #2 addBlockedBy: ["1"] + - #3 addBlockedBy: ["2"] │ ▼ -4. TaskList to see what tasks exist +4. TaskList to view available tasks │ ▼ -5. TaskUpdate to start working - - Change #1 to in_progress +5. TaskUpdate #1 → in_progress │ ▼ -6. 
When done, TaskUpdate - - Change #1 to completed +6. After completion, TaskUpdate #1 → completed │ ▼ -7. Loop 4-6 until all completed +7. Repeat steps 4-6 until all completed │ ▼ -8. Automatic cleanup +8. All completed → automatically clean up all files ``` --- ## Dependency Management -- **blocks**: These tasks can start after I complete -- **blockedBy**: I can start after these tasks complete +- **blocks**: "Once I'm completed, these tasks can start" +- **blockedBy**: "Once these tasks are completed, I can start" + +Dependency writes are **bidirectional**: executing `addBlocks: ["2"]` on Task A will also write A's ID into Task #2's `blockedBy`. ``` Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) -#2 can only start after #1 completes +#1 must complete before #2 can start ``` -Circular dependencies will throw an error: +Cyclic dependency detection is implemented via DFS reachability: ``` #1 blocks #2 -#2 blocks #1 ← Not allowed, circular +#2 blocks #1 ← Error: would create a cyclic dependency ``` --- -## Automatic Cleanup - -When all tasks are `completed`, all task files will be automatically deleted. - ---- - -## Notes +## Implementation Details -- Task files are stored in JSON format in the `BaseDir` directory, with filenames as `{id}.json` -- The `.highwatermark` file is used to record the maximum assigned task ID, ensuring IDs don't repeat -- All tool operations are protected by mutex locks and are concurrency-safe -- The tool descriptions contain detailed usage guidelines that the Agent will follow + + + + + + + + +
| Mechanism | Description |
| --- | --- |
| ID Allocation | `.highwatermark` file stores the current maximum ID, incremented by 1 on creation |
| Concurrency Safety | All four tools share a single `sync.Mutex`, serializing execution within the same middleware instance |
| File Format | One `{id}.json` file per task, JSON serialized using `sonic` |
| Auto Cleanup | After TaskUpdate marks a task as completed, it checks whether all tasks are completed and, if so, batch-deletes all files |
| ID Validation | Numeric-only regex `^\d+$` |
| Delete Cascade | When deleting a task, iterates over all task files to remove references to that ID |
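The DFS reachability check mentioned under Dependency Management can be sketched as follows. This is a minimal illustration, not the middleware's actual code: the function name `wouldCreateCycle` and the in-memory `blocks` map are hypothetical, whereas the real implementation reads task files through the Backend.

```go
package main

import "fmt"

// wouldCreateCycle reports whether adding the edge "from blocks to" would
// close a cycle, i.e. whether "from" is already reachable from "to" by
// following existing blocks edges.
func wouldCreateCycle(blocks map[string][]string, from, to string) bool {
	if from == to {
		return true // a task cannot block itself
	}
	visited := map[string]bool{}
	var dfs func(id string) bool
	dfs = func(id string) bool {
		if id == from {
			return true
		}
		if visited[id] {
			return false
		}
		visited[id] = true
		for _, next := range blocks[id] {
			if dfs(next) {
				return true
			}
		}
		return false
	}
	return dfs(to)
}

func main() {
	// Existing dependencies: #1 blocks #2, #2 blocks #3.
	blocks := map[string][]string{"1": {"2"}, "2": {"3"}}
	fmt.Println(wouldCreateCycle(blocks, "3", "1")) // true: 1 -> 2 -> 3 -> 1
	fmt.Println(wouldCreateCycle(blocks, "1", "3")) // false: 3 reaches nothing
}
```

Running the check before writing either side of the bidirectional dependency means a rejected update leaves no task file modified.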
--- ## Multi-language Support -Tool descriptions support Chinese and English switching via `adk.SetLanguage()`: +Tool descriptions support bilingual Chinese/English, switchable via global settings: ```go // Use Chinese descriptions @@ -289,4 +308,4 @@ adk.SetLanguage(adk.LanguageChinese) adk.SetLanguage(adk.LanguageEnglish) ``` -This setting is global and affects all ADK built-in prompts and tool descriptions. +This setting affects all ADK built-in prompts and tool descriptions. diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md index f9afbf5c065..c237be700e0 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md @@ -1,40 +1,42 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Skill weight: 3 --- -Skill middleware adds Skill support to Eino ADK agents, enabling agents to dynamically discover and use predefined skills to complete tasks more accurately and efficiently. +The Skill Middleware provides Skill support for Eino ADK Agents, enabling the Agent to dynamically discover and use predefined skills to complete tasks. # What is a Skill -A Skill is a folder that contains instructions, scripts, and resources. Agents can discover and use these skills on demand to extend their capabilities. The core of a Skill is a `SKILL.md` file, which includes metadata (at least `name` and `description`) and guidance for the agent to execute a specific type of task. +A Skill is a folder containing instructions, scripts, and resources that an Agent can discover and use on demand to extend its capabilities. 
The core is the `SKILL.md` file, which contains metadata (at minimum name and description) and instructions to guide the Agent in executing tasks. ``` my-skill/ ├── SKILL.md # Required: instructions + metadata ├── scripts/ # Optional: executable code -├── references/ # Optional: reference docs -└── assets/ # Optional: templates/resources +├── references/ # Optional: reference documentation +└── assets/ # Optional: templates, resources ``` -Skills use **Progressive Disclosure** to manage context efficiently: +Skills use **Progressive Disclosure** to efficiently manage context: -1. **Discovery**: on startup, the agent only loads each skill’s name and description — enough to decide when the skill might be useful -2. **Activation**: when a task matches a skill’s description, the agent loads the full `SKILL.md` content into context -3. **Execution**: the agent follows the instructions and can load other files or execute bundled code as needed. This keeps the agent responsive while still allowing on-demand access to additional context. + + +1. **Discovery**: The Agent loads only the name and description of each available Skill, sufficient to determine when the Skill might be needed +2. **Activation**: When a task matches a Skill, the Agent loads the full `SKILL.md` content into context +3. **Execution**: The Agent follows the instructions to execute the task, loading additional files or running bundled code as needed > 💡 > Ref: [https://agentskills.io/home](https://agentskills.io/home) -# Interfaces +# API Reference ## FrontMatter -Skill metadata used for quick display during discovery, avoiding loading full content: +The metadata structure of a Skill, parsed from the YAML frontmatter of SKILL.md. Used for quick Skill information display during the discovery phase: ```go type FrontMatter struct { @@ -48,11 +50,11 @@ type FrontMatter struct { - - - - - + + + + +
FieldTypeDescription
Name
string
Unique identifier of a skill. The agent invokes the skill by name. Use short, meaningful names (e.g.
pdf-processing
,
web-research
). Corresponds to the
name
field in SKILL.md frontmatter.
Description
string
Description of what the skill does. This is the key basis for the agent to decide whether to use the skill, so it should clearly describe applicable scenarios and capabilities. Corresponds to the
description
field in SKILL.md frontmatter.
Context
ContextMode
Context mode. Supported values:
fork_with_context
(copy history messages to a new agent for execution),
fork
(create a new agent with isolated context for execution). Empty means inline mode (return skill content directly).
Agent
string
Agent name to use. Used with
Context
, resolved via
AgentHub
. Empty means using the default agent.
Model
string
Model name to use. Resolved via
ModelHub
. In context mode, passed to the agent factory; in inline mode, it switches the model used by subsequent ChatModel calls.
| Field | Type | Description |
| --- | --- | --- |
| `Name` | `string` | Unique identifier of the Skill. Use short, meaningful names (e.g., `pdf-processing`, `web-research`) |
| `Description` | `string` | Functional description of the Skill. Key basis for the Agent to decide whether to use this Skill; should clearly describe applicable scenarios and capabilities |
| `Context` | `ContextMode` | Context mode. Possible values: `fork` (isolated context), `fork_with_context` (copy history messages). Empty means inline mode |
| `Agent` | `string` | Name of the Agent to use. Paired with `Context`; the corresponding Agent is obtained via `AgentHub`. Empty uses the default Agent |
| `Model` | `string` | Name of the model to use; the corresponding model instance is obtained via `ModelHub` |
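To make the relationship between `SKILL.md` and `FrontMatter` concrete, here is a deliberately simplified sketch that splits a file into metadata and body. It handles only flat `key: value` lines and only the `name` and `description` keys; the real middleware parses full YAML, and `splitSkillFile` and the two-field `FrontMatter` stand-in are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// FrontMatter is a reduced stand-in holding only the two required fields.
type FrontMatter struct {
	Name        string
	Description string
}

// splitSkillFile separates the frontmatter block (between "---" lines)
// from the instruction body that follows it.
func splitSkillFile(raw string) (FrontMatter, string, error) {
	var fm FrontMatter
	if !strings.HasPrefix(raw, "---\n") {
		return fm, "", errors.New("missing frontmatter opening delimiter")
	}
	rest := strings.TrimPrefix(raw, "---\n")
	header, body, ok := strings.Cut(rest, "\n---\n")
	if !ok {
		return fm, "", errors.New("missing frontmatter closing delimiter")
	}
	for _, line := range strings.Split(header, "\n") {
		key, value, found := strings.Cut(line, ":")
		if !found {
			continue // not a key: value line; ignored in this sketch
		}
		value = strings.TrimSpace(value)
		switch strings.TrimSpace(key) {
		case "name":
			fm.Name = value
		case "description":
			fm.Description = value
		}
	}
	if fm.Name == "" || fm.Description == "" {
		return fm, "", errors.New("frontmatter needs name and description")
	}
	return fm, strings.TrimSpace(body), nil
}

func main() {
	raw := strings.Join([]string{
		"---",
		"name: pdf-processing",
		"description: Extract text and tables from PDF files",
		"---",
		"# Instructions",
		"Run scripts/extract.py on the input file.",
	}, "\n")
	fm, body, err := splitSkillFile(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %s\n%s\n", fm.Name, fm.Description, body)
}
```

Only the parsed metadata is needed during discovery; the body corresponds to what the Agent receives once the Skill is activated.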
### ContextMode @@ -66,14 +68,14 @@ const ( - - - + + +
ModeDescription
Inline (default)Skill content is returned as the tool result and the current agent continues processing
ForkWithContextCreate a new agent, copy current conversation history, execute the skill independently, and return the result
ForkCreate a new agent with isolated context (only skill content), execute independently, and return the result
| Mode | Description |
| --- | --- |
| Inline (default) | Skill content is returned directly as a tool result, processed by the current Agent |
| `fork_with_context` | Creates a new Agent, copies the current conversation history, independently executes the Skill task, then returns the result |
| `fork` | Creates a new Agent with isolated context (only including Skill content), independently executes, then returns the result |
## Skill -Complete skill structure (metadata + instruction content): +The complete Skill structure, containing metadata and instruction content: ```go type Skill struct { @@ -85,18 +87,14 @@ type Skill struct { - - - + + +
FieldTypeDescription
FrontMatter
FrontMatter
Embedded metadata:
Name
,
Description
,
Context
,
Agent
,
Model
Content
string
The body of SKILL.md after frontmatter. Contains detailed instructions, workflows, examples, etc. The agent reads it after skill activation.
BaseDirectory
string
Absolute path of the skill directory. The agent can use this path to access other resources in the skill directory (scripts, templates, references, etc.).
| Field | Type | Description |
| --- | --- | --- |
| `FrontMatter` | `FrontMatter` | Embedded metadata structure |
| `Content` | `string` | The body content after the frontmatter in SKILL.md, including detailed instructions, workflows, examples, etc. |
| `BaseDirectory` | `string` | Absolute path of the Skill directory, which the Agent can use to access other resource files in the directory |
## Backend -Skill backend interface defines how skills are retrieved. It decouples skill storage from usage: - -- **Flexible storage**: store skills in local filesystem, databases, remote services, cloud storage, etc. -- **Extensible**: implement custom backends (e.g. load from Git repos, config centers) -- **Test-friendly**: easy to build mock backends for unit tests +The Skill backend interface, decoupling skill storage from usage: ```go type Backend interface { @@ -107,13 +105,13 @@ type Backend interface { - - + +
MethodDescription
List
List metadata of all available skills. Called when the agent starts to build the skill tool description, so the agent knows what skills exist.
Get
Get full skill content by name. Called when the agent decides to use a skill, returning the full Skill structure including detailed instructions.
| Method | Description |
| --- | --- |
| `List` | Lists metadata of all available skills. Called when the Agent starts, used to build skill tool descriptions |
| `Get` | Gets the complete skill content by name. Called when the Agent decides to use a specific skill |
### NewBackendFromFilesystem -A filesystem-backed backend implementation that reads skills from a directory via `filesystem.Backend`: +A backend implementation based on the `filesystem.Backend` interface that scans first-level subdirectories under the specified directory to read skills: ```go type BackendFromFilesystemConfig struct { @@ -126,314 +124,154 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - - + +
FieldTypeRequiredDescription
Backend
filesystem.Backend
YesFilesystem backend implementation used for file operations
BaseDir
string
YesRoot directory for skills. It scans all first-level subdirectories and treats the ones containing
SKILL.md
as skills.
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `Backend` | `filesystem.Backend` | Yes | Filesystem backend implementation for file operations |
| `BaseDir` | `string` | Yes | Skills root directory path. Scans first-level subdirectories under this directory, looking for directories containing `SKILL.md` files |
How it works: -- scan first-level subdirectories under `BaseDir` -- look for `SKILL.md` in each subdirectory -- parse YAML frontmatter to get metadata -- deeply nested `SKILL.md` files are ignored - -### filesystem.Backend Implementations +- Scans first-level subdirectories under `BaseDir` +- Looks for `SKILL.md` files in each subdirectory +- Parses YAML frontmatter to get metadata +- Deeply nested `SKILL.md` files are ignored -There are two `filesystem.Backend` implementations to choose from. See [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem). +`filesystem.Backend` interface has two available implementations; see the FileSystem Backend documentation for details. ## AgentHub and ModelHub -When Skills use context mode (fork/isolate), you need to configure AgentHub and ModelHub: +When a Skill uses Context mode (fork / fork\_with\_context), AgentHub and ModelHub are needed to provide Agent instances and model instances. + +> 💡 +> The following shows non-generic alias types (i.e., `*schema.Message` specialization). Generic versions `TypedAgentHub[M]`, `TypedModelHub[M]` can be used for `*schema.AgenticMessage` scenarios with identical interface signatures, only differing in message type parameter. ```go -// AgentHubOptions contains options passed to AgentHub.Get when creating an agent for skill execution. -type AgentHubOptions struct { - // Model is the resolved model instance when a skill specifies a "model" field in frontmatter. - // nil means the skill did not specify a model override; implementations should use their default. - Model model.ToolCallingChatModel +// AgentHubOptions passed to AgentHub.Get +type AgentHubOptions = TypedAgentHubOptions[*schema.Message] + +type TypedAgentHubOptions[M adk.MessageType] struct { + // Model is the model instance specified in the skill's frontmatter (resolved via ModelHub). 
+ // nil means the skill did not specify a model override; the implementation should use the default model. + Model model.BaseModel[M] } -// AgentHub provides agent instances for context mode (fork/fork_with_context) execution. -type AgentHub interface { - // Get returns an Agent by name. When name is empty, implementations should return a default agent. - // The opts parameter carries skill-level overrides (e.g., model) resolved by the framework. - Get(ctx context.Context, name string, opts *AgentHubOptions) (adk.Agent, error) +// AgentHub provides Agent instances for Context mode +type AgentHub = TypedAgentHub[*schema.Message] + +type TypedAgentHub[M adk.MessageType] interface { + // Get returns an Agent by name. Should return the default Agent when name is empty. + Get(ctx context.Context, name string, opts *TypedAgentHubOptions[M]) (adk.TypedAgent[M], error) } -// ModelHub provides model instances. -type ModelHub interface { - Get(ctx context.Context, name string) (model.ToolCallingChatModel, error) +// ModelHub resolves model instances by name +type ModelHub = TypedModelHub[*schema.Message] + +type TypedModelHub[M adk.MessageType] interface { + Get(ctx context.Context, name string) (model.BaseModel[M], error) } ``` -## +> 💡 +> Note: The return type of `AgentHubOptions.Model` and `ModelHub.Get` is `model.BaseModel[M]`, not the `model.ToolCallingChatModel` from older documentation. 
-## Initialization +## SubAgentInput and SubAgentOutput -Create the Skill middleware (recommended: `NewMiddleware`): +These two structs are used when customizing fork mode behavior: ```go -func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +type SubAgentInput = TypedSubAgentInput[*schema.Message] + +type TypedSubAgentInput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string // Raw JSON arguments + SkillContent string // Built skill content + History []M // Conversation history (fork_with_context mode only) + ToolCallID string // Tool call ID (fork_with_context mode only) +} + +type SubAgentOutput = TypedSubAgentOutput[*schema.Message] + +type TypedSubAgentOutput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string + Messages []M // All messages produced by the sub-Agent + Results []string // Extracted assistant message text content +} ``` -Config: +# Initialization + +## Config ```go -type Config struct { - // Backend is required - Backend Backend - - // SkillToolName defaults to "skill" - SkillToolName *string - - // AgentHub provides agent factories for context mode - // Required when skill uses "context: fork" or "context: isolate" - AgentHub AgentHub - - // ModelHub provides model instances for skill-specified models - ModelHub ModelHub - - // CustomSystemPrompt customizes system prompt - CustomSystemPrompt SystemPromptFunc - - // CustomToolDescription customizes tool description +type Config = TypedConfig[*schema.Message] + +type TypedConfig[M adk.MessageType] struct { + Backend Backend + SkillToolName *string + AgentHub TypedAgentHub[M] + ModelHub TypedModelHub[M] + + CustomSystemPrompt SystemPromptFunc CustomToolDescription ToolDescriptionFunc + CustomToolParams func(ctx context.Context, defaults map[string]*schema.ParameterInfo) (map[string]*schema.ParameterInfo, error) + BuildContent func(ctx context.Context, skill Skill, rawArgs string) (string, error) 
+ BuildForkMessages func(ctx context.Context, in TypedSubAgentInput[M]) ([]M, error) + FormatForkResult func(ctx context.Context, in TypedSubAgentOutput[M]) (string, error) } ``` - - - - - - + + + + + + + + + +
FieldTypeRequiredDefaultDescription
Backend
Backend
Yes
  • Skill backend implementation responsible for storage and retrieval. You can use the built-in
    LocalBackend
    or provide your own.
    SkillToolName
    *string
    No
    "skill"
    Name of the skill tool. Agents invoke skills via this tool name. If your agent already has a tool with the same name, set this to avoid conflicts.
    AgentHub
    AgentHub
    No
  • Provides agent factories. Required when a skill uses
    context: fork
    or
    context: isolate
    .
    ModelHub
    ModelHub
    No
  • Provides model instances. Used when a skill specifies the
    model
    field.
    CustomSystemPrompt
    SystemPromptFunc
    NoBuilt-in promptCustom system prompt function
    CustomToolDescription
    ToolDescriptionFunc
    NoBuilt-in descriptionCustom tool description function
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `Backend` | `Backend` | Yes | - | Skill backend implementation, responsible for skill storage and retrieval |
| `SkillToolName` | `*string` | No | `"skill"` | Skill tool name. Can be customized to avoid conflicts if a tool with the same name already exists |
| `AgentHub` | `TypedAgentHub[M]` | No | - | Provides Agent instances. Required when using `context: fork` or `fork_with_context` |
| `ModelHub` | `TypedModelHub[M]` | No | - | Provides model instances. Passed to AgentHub in Context mode; in inline mode, switches the model for subsequent ChatModel calls via WrapModel |
| `CustomSystemPrompt` | `SystemPromptFunc` | No | Built-in prompt | Custom system prompt. Signature: `func(ctx, toolName) string` |
| `CustomToolDescription` | `ToolDescriptionFunc` | No | Built-in description | Custom tool description. Signature: `func(ctx, skills []FrontMatter) string` |
| `CustomToolParams` | `func` | No | Only `skill` parameter | Custom tool parameter schema. Receives default parameters, returns custom parameters; always keeps `skill` as required |
| `BuildContent` | `func` | No | Default formatting | Custom Skill content generation; can inject additional context into the content |
| `BuildForkMessages` | `func` | No | See description | Custom initial messages passed to the sub-Agent in fork mode. Default: `fork` uses `[UserMessage(content)]`, `fork_with_context` uses `[history..., ToolMessage(content, callID)]` |
| `FormatForkResult` | `func` | No | Concatenate content | Custom sub-Agent result formatting. By default, concatenates the assistant message content and returns it |
    -# Quick Start - -Example: loading a pdf skill locally. Full code: [https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill](https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill). - -- Create a skills directory under your working directory: +## NewMiddleware ```go -workdir/ -├── skills/ -│ └── pdf/ -│ ├── scripts -│ │ └── analyze.py -│ └── SKILL.md -└── other files +func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) ``` -- Create a local filesystem backend and build the Skill middleware: +Creates the Skill Middleware, returns `adk.ChatModelAgentMiddleware`, to be passed into `ChatModelAgentConfig.Handlers`. -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/skill" - "github.com/cloudwego/eino-ext/adk/backend/local" -) +> 💡 +> The generic version `NewTyped[M](ctx, config)` returns `adk.TypedChatModelAgentMiddleware[M]`, which can be used for `*schema.AgenticMessage` typed Agents. -ctx := context.Background() +## Usage Example -be, err := local.NewBackend(ctx, &local.Config{}) +```go +// 1. Create Backend +backend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ + Backend: fsBackend, + BaseDir: "/path/to/skills", +}) if err != nil { - log.Fatal(err) + return err } -skillBackend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ - Backend: be, - BaseDir: skillsDir, +// 2. 
Create Middleware +handler, err := skill.NewMiddleware(ctx, &skill.Config{ + Backend: backend, + AgentHub: myAgentHub, // Optional, only needed for fork mode + ModelHub: myModelHub, // Optional, only needed when using the model field }) if err != nil { - log.Fatalf("Failed to create skill backend: %v", err) + return err } -sm, err := skill.NewMiddleware(ctx, &skill.Config{ - Backend: skillBackend, -}) -``` - -- Create a local FileSystem middleware so the agent can read other skill files and execute scripts: - -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -fsm, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: be, - StreamingShell: be, -}) -``` - -- Create an agent and configure middlewares: - -```go +// 3. Pass to Agent's Handlers agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LogAnalysisAgent", - Description: "An agent that can analyze logs", - Instruction: "You are a helpful assistant.", - Model: cm, - Handlers: []adk.ChatModelAgentMiddleware{fsm, sm}, + // ... 
other configuration + Handlers: []adk.ChatModelAgentMiddleware{handler}, }) ``` - -- Run the agent and observe output: - -```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: agent, -}) - -input := fmt.Sprintf("Analyze the %s file", filepath.Join(workDir, "test.log")) -log.Println("User: ", input) - -iterator := runner.Query(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - if event.Err != nil { - log.Printf("Error: %v\n", event.Err) - break - } - - prints.Event(event) -} -``` - -Agent output: - -```yaml -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: skill -arguments: {"skill":"log_analyzer"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Launching skill: log_analyzer -Base directory for this skill: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer -# SKILL.md content - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: execute -arguments: {"command": "python3 /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer/scripts/analyze.py /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Analysis Result for /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log: -Total Errors: 2 -Total Warnings: 1 - -Error Details: -Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -Line 5: [2024-05-20 10:03:05] ERROR: Connection timed out. - -Warning Details: -Line 2: [2024-05-20 10:01:23] WARNING: High memory usage detected. - - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -answer: Here's the analysis result of the log file: - -### Summary -- **Total Errors**: 2 -- **Total Warnings**: 1 - -### Detailed Entries -#### Errors: -1. Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -2. 
Line5: [2024-05-2010:03:05] ERROR: Connection timed out. - -#### Warnings: -1. Line2: [2024-05-2010:01:23] WARNING: High memory usage detected. - -The log file contains critical issues related to database connectivity and a warning about memory usage. Let me know if you need further analysis! -``` - -# How It Works - -The Skill middleware adds a system prompt and a skill tool to the agent. The system prompt is below, where `{tool_name}` is the tool name of the skill tool: - -```python -# Skills System - -**How to Use Skills (Progressive Disclosure):** - -Skills follow a **progressive disclosure** pattern - you see their name and description above, but only read full instructions when needed: - -1. **Recognize when a skill applies**: Check if the user's task matches a skill's description -2. **Read the skill's full instructions**: Use the '{tool_name}' tool to load skill -3. **Follow the skill's instructions**: tool result contains step-by-step workflows, best practices, and examples -4. **Access supporting files**: Skills may include helper scripts, configs, or reference docs - use absolute paths - -**When to Use Skills:** -- User's request matches a skill's domain (e.g., "research X" -> web-research skill) -- You need specialized knowledge or structured workflows -- A skill provides proven patterns for complex tasks - -**Executing Skill Scripts:** -Skills may contain Python scripts or other executable files. Always use absolute paths. - -**Example Workflow:** - -User: "Can you research the latest developments in quantum computing?" - -1. Check available skills -> See "web-research" skill -2. Call '{tool_name}' tool to read the full skill instructions -3. Follow the skill's research workflow (search -> organize -> synthesize) -4. Use any helper scripts with absolute paths - -Remember: Skills make you more capable and consistent. When in doubt, check if a skill exists for the task! 
-``` - -The skill tool takes a skill name to load and returns the full content of the corresponding SKILL.md. Its tool description lists all available skills with their names and descriptions: - -```sql -Execute a skill within the main conversation - - -When users ask you to perform tasks, check if any of the available skills below can help complete the task more effectively. Skills provide specialized capabilities and domain knowledge. - -How to invoke: -- Use this tool with the skill name only (no arguments) -- Examples: - - `skill: pdf` - invoke the pdf skill - - `skill: xlsx` - invoke the xlsx skill - - `skill: ms-office-suite:pdf` - invoke using fully qualified name - -Important: -- When a skill is relevant, you must invoke this tool IMMEDIATELY as your first action -- NEVER just announce or mention a skill in your text response without actually calling this tool -- This is a BLOCKING REQUIREMENT: invoke the relevant Skill tool BEFORE generating any other response about the task -- Only use skills listed in below -- Do not invoke a skill that is already running -- Do not use this tool for built-in CLI commands (like /help, /clear, etc.) - - - -{{- range .Matters }} - - -{{ .Name }} - - -{{ .Description }} - - -{{- end }} - -``` - -Example: - - - -> 💡 -> Skill middleware only provides the ability to load SKILL.md as shown above. If a skill requires the agent to read files, execute scripts, etc., users need to configure those capabilities for the agent separately. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md index 8895dc6f705..63fc73c77e3 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md @@ -1,210 +1,343 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Summarization -weight: 3 +weight: 4 --- +> 💡 +> This middleware was introduced in v0.8.0. Package path: `github.com/cloudwego/eino/adk/middlewares/summarization` + ## Overview -The Summarization middleware automatically compresses conversation history when the token count exceeds a configured threshold. This helps maintain context continuity in long conversations while staying within the model's token limits. +The Summarization middleware automatically calls a summarization model to compress conversation history when the conversation token count exceeds a threshold, keeping long conversations coherent within the model's context window. The middleware hooks into `BeforeModelRewriteState`, checking trigger conditions before each model call. When triggered, it executes: count → summary generation (with retry/failover) → post-processing → state replacement. -> 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +## Generic System -## Quick Start +All core types and functions in this package provide both **Typed generic versions** (`M adk.MessageType`) and **non-generic aliases** (fixed to `*schema.Message`). -```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/summarization" -) + + + + + + + + + + + + + + + +
+| Generic Version | Non-generic Alias (= Typed\[*schema.Message\]) |
+| --- | --- |
+| `TypedConfig[M]` | `Config` |
+| `NewTyped[M](ctx, *TypedConfig[M])` | `New(ctx, *Config)` |
+| `TypedTokenCounterFunc[M]` | `TokenCounterFunc` |
+| `TypedGenModelInputFunc[M]` | `GenModelInputFunc` |
+| `TypedGetFailoverModelFunc[M]` | `GetFailoverModelFunc` |
+| `TypedFinalizeFunc[M]` | `FinalizeFunc` |
+| `TypedCallbackFunc[M]` | `CallbackFunc` |
+| `TypedUserMessageFilterFunc[M]` | `UserMessageFilterFunc` |
+| `TypedPreserveUserMessages[M]` | `PreserveUserMessages` |
+| `TypedRetryConfig[M]` | `RetryConfig` |
+| `TypedFailoverConfig[M]` | `FailoverConfig` |
+| `TypedFailoverContext[M]` | `FailoverContext` |
+| `TypedFinalizerBuilder[M]` | `FinalizerBuilder` |
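The pairing in the table above is the ordinary Go pattern of a generic type plus an alias pinned to one type argument. The following is a minimal, self-contained sketch of that pattern only: `Message`, `AgenticMessage`, `MessageType`, and the `Name`/`Counter` fields are hypothetical stand-ins and are not the real `schema` or `adk` definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// Message and AgenticMessage are hypothetical stand-ins for the two
// supported message kinds; they are not the real schema types.
type Message struct{ Content string }
type AgenticMessage struct{ Content string }

// MessageType plays the role of the type constraint: only these two
// pointer types may be used as M.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedConfig is the generic form of the configuration.
type TypedConfig[M MessageType] struct {
	Name    string
	Counter func(msgs []M) int
}

// Config is an alias (note the '='), so *Config and
// *TypedConfig[*Message] are the same type, not merely convertible.
type Config = TypedConfig[*Message]

// NewTyped is the generic constructor.
func NewTyped[M MessageType](cfg *TypedConfig[M]) (*TypedConfig[M], error) {
	if cfg == nil {
		return nil, errors.New("config is required")
	}
	return cfg, nil
}

// New is the non-generic constructor; it only forwards to NewTyped.
func New(cfg *Config) (*Config, error) {
	return NewTyped[*Message](cfg)
}

func main() {
	cfg := &Config{
		Name:    "demo",
		Counter: func(msgs []*Message) int { return len(msgs) },
	}
	got, err := New(cfg)
	fmt.Println(got.Name, got.Counter([]*Message{{Content: "hi"}}), err)
}
```

Because `Config` is an alias rather than a new named type, values built with the non-generic names can be passed to any function that expects the `Typed...[*Message]` form without conversion.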
    -// Create middleware with minimal configuration -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, // Required: model used for generating summaries -}) -if err != nil { - // Handle error -} +Unless otherwise noted, type signatures in this document use the generic form `M`. When using non-generic aliases, `M` = `*schema.Message`. -// Use with ChatModelAgent -agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, - Middlewares: []adk.ChatModelAgentMiddleware{mw}, -}) +### Constructors + +```go +// Generic version — supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) + +// Non-generic version — equivalent to NewTyped[*schema.Message] +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) ``` -## Configuration Options +## TypedConfig[M] Configuration - - - - - - - - - - - + + + + + + + + + + + + +
 | Field | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
-| Model | `model.BaseChatModel` | Yes | - | Chat model used for generating summaries |
-| ModelOptions | `[]model.Option` | No | - | Options passed to the model when generating summaries |
-| TokenCounter | `TokenCounterFunc` | No | ~4 chars/token | Custom token counting function |
-| Trigger | `*TriggerCondition` | No | 190,000 tokens | Condition to trigger summarization |
-| Instruction | `string` | No | Built-in prompt | Custom summarization instruction |
-| TranscriptFilePath | `string` | No | - | Full conversation transcript file path |
-| Prepare | `PrepareFunc` | No | - | Custom preprocessing function before summary generation |
-| Finalize | `FinalizeFunc` | No | - | Custom post-processing function for final messages |
-| Callback | `CallbackFunc` | No | - | Called after Finalize to observe state changes (read-only) |
-| EmitInternalEvents | `bool` | No | false | Whether to emit internal events |
-| PreserveUserMessages | `*PreserveUserMessages` | No | Enabled: true | Whether to preserve original user messages in summary |
+| Model | `model.BaseModel[M]` | Yes | | Model used to generate summaries |
+| ModelOptions | `[]model.Option` | No | | Options passed to the summarization model |
+| TokenCounter | `TypedTokenCounterFunc[M]` | No | Uses the most recent assistant message's `total_tokens` as baseline, with incremental messages estimated at ~4 chars/token | Custom token counting function |
+| Trigger | `*TriggerCondition` | No | ContextTokens=160,000 | Conditions to trigger summarization |
+| UserInstruction | `string` | No | Built-in prompt | Custom user-level summarization instruction, overrides the default instruction |
+| TranscriptFilePath | `string` | No | | Full conversation transcript file path, appended to the summary to remind the model of the original context location. Only takes effect when Finalize is not set |
+| GenModelInput | `TypedGenModelInputFunc[M]` | No | sysInstruction → contextMsgs → userInstruction | Full control over building the summarization model input |
+| Finalize | `TypedFinalizeFunc[M]` | No | Built-in post-processing | Custom summary post-processing. When set, the middleware no longer performs any default post-processing |
+| Callback | `TypedCallbackFunc[M]` | No | | Called after Finalize, with parameters `before, after adk.TypedChatModelAgentState[M]` (value types), read-only |
+| EmitInternalEvents | `bool` | No | false | Whether to emit internal events at key points |
+| PreserveUserMessages | `*TypedPreserveUserMessages[M]` | No | Enabled: true | Preserve original user messages in the summary. Only takes effect when Finalize is not set |
+| Retry | `*TypedRetryConfig[M]` | No | nil (no retry) | Retry strategy for primary model summary generation |
+| Failover | `*TypedFailoverConfig[M]` | No | nil | Failover strategy after primary model failure |
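The default `TokenCounter` behaviour described in the table (baseline from the most recent assistant message's `total_tokens`, plus roughly 4 characters per token for anything newer) can be sketched as follows. This is an illustration under stated assumptions, not the middleware's actual code: the `msg` struct, the `estimateTokens` name, counting characters as runes, and rounding up are all choices made for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// msg is a hypothetical, simplified message; the real middleware works on
// *schema.Message (or another adk.MessageType).
type msg struct {
	Role        string
	Content     string
	TotalTokens int // usage reported by the model; zero when unknown
}

// estimateTokens takes the most recent assistant message that carries a
// usage figure as the baseline, then adds ~1 token per 4 characters for
// every message after it. Rounding up is an assumption of this sketch.
func estimateTokens(msgs []msg) int {
	baseline, start := 0, 0
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" && msgs[i].TotalTokens > 0 {
			baseline, start = msgs[i].TotalTokens, i+1
			break
		}
	}
	chars := 0
	for _, m := range msgs[start:] {
		chars += utf8.RuneCountInString(m.Content)
	}
	return baseline + (chars+3)/4
}

func main() {
	history := []msg{
		{Role: "user", Content: "hello"},
		{Role: "assistant", Content: "hi there", TotalTokens: 100},
		{Role: "user", Content: "12345678"},
	}
	fmt.Println(estimateTokens(history))
}
```

An estimate of this kind is cheap but tokenizer-agnostic, so a custom `TokenCounter` backed by the model's own tokenizer is the more accurate choice when the trigger threshold sits close to the context limit.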
    -### TriggerCondition Structure +> 💡 +> **Finalize Override Semantics**: Once a custom `Finalize` is set, the middleware will **skip all default post-processing** — both `PreserveUserMessages` and `TranscriptFilePath` will no longer take effect. To reuse default post-processing logic in a custom Finalize, use the `DefaultFinalizer` function. + +## Sub-configuration Structs + +### TriggerCondition + +Summarization is triggered when **any** condition is met. ```go type TriggerCondition struct { - // ContextTokens triggers summarization when total token count exceeds this threshold - ContextTokens int + ContextTokens int // Trigger when token count exceeds this threshold + ContextMessages int // Trigger when message count exceeds this threshold } ``` -### PreserveUserMessages Structure +### TypedPreserveUserMessages\[M\] + +When enabled, replaces the `...` section in the summary with the most recent original user messages. ```go -type PreserveUserMessages struct { - // Enabled whether to enable user message preservation - Enabled bool - - // MaxTokens maximum tokens for preserved user messages - // Only preserves the most recent user messages until this limit is reached - // Defaults to 1/3 of TriggerCondition.ContextTokens - MaxTokens int +type TypedPreserveUserMessages[M adk.MessageType] struct { + Enabled bool + MaxTokens int // Maximum tokens for preserved user messages; defaults to TriggerCondition.ContextTokens / 3 + Filter TypedUserMessageFilterFunc[M] // Filter function, return false to exclude a message } ``` -### Configuration Examples +### TypedRetryConfig[M] -**Custom Token Threshold** +```go +type TypedRetryConfig[M adk.MessageType] struct { + MaxRetries *int // Default 3 + ShouldRetry func(ctx context.Context, resp M, err error) bool // Default: retry when err != nil + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // Default: exponential backoff + jitter +} +``` + +### TypedFailoverConfig[M] ```go -mw, err := 
summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Trigger: &summarization.TriggerCondition{ - ContextTokens: 100000, // Trigger at 100k tokens - }, -}) +type TypedFailoverConfig[M adk.MessageType] struct { + MaxRetries *int // Default 3 + ShouldFailover func(ctx context.Context, resp M, err error) bool // Default: failover when err != nil + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration + GetFailoverModel TypedGetFailoverModelFunc[M] // Returns (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error) +} ``` -**Custom Token Counter** +### TypedFailoverContext[M] + +Context passed to the `GetFailoverModel` callback. ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - TokenCounter: func(ctx context.Context, input *summarization.TokenCounterInput) (int, error) { - // Use your tokenizer - return yourTokenizer.Count(input.Messages) - }, -}) +type TypedFailoverContext[M adk.MessageType] struct { + Attempt int // Current failover attempt count, starting from 1 + SystemInstruction M // System instruction (set internally by the middleware, not configurable) + UserInstruction M // User instruction + OriginalMessages []M // Original full conversation + LastModelResponse M // Model response from the last attempt + LastErr error +} ``` -**Set Transcript File Path** +### TypedTokenCounterInput[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, +type TypedTokenCounterInput[M adk.MessageType] struct { + Messages []M + Tools []*schema.ToolInfo +} +``` + +## Function Type Signatures Quick Reference + +```go +type TypedTokenCounterFunc[M] func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error) +type TypedGenModelInputFunc[M] func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error) +type TypedGetFailoverModelFunc[M] func(ctx context.Context, failoverCtx 
*TypedFailoverContext[M]) (model.BaseModel[M], []M, error) +type TypedFinalizeFunc[M] func(ctx context.Context, originalMessages []M, summary M) ([]M, error) +type TypedCallbackFunc[M] func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error +type TypedUserMessageFilterFunc[M] func(ctx context.Context, msg M) (bool, error) +``` + +## DefaultFinalizer + +`DefaultFinalizer` is a standalone factory function that returns a `TypedFinalizeFunc[M]` consistent with the middleware's default post-processing logic. Use it when you need to reuse the default logic (preserve user messages, append transcript path, etc.) in a custom `Finalize`. + +```go +func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error) +``` + +### DefaultFinalizerConfig[M] + +```go +type DefaultFinalizerConfig[M adk.MessageType] struct { + PreserveUserMessages *TypedPreserveUserMessages[M] // Default Enabled=true, MaxTokens=30000 + TranscriptFilePath string +} +``` + +**Example**: Execute default post-processing in a custom Finalize, then add a system message: + +```go +defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{ TranscriptFilePath: "/path/to/transcript.txt", }) +if err != nil { + // handle error +} + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, originalMessages, summary) + if err != nil { + return nil, err + } + // Add a system message before the summary + return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil + }, +} ``` -**Custom Finalize Function** +## FinalizerBuilder + +`TypedFinalizerBuilder[M]` provides a chained API for building `TypedFinalizeFunc[M]`, supporting linking multiple handlers (Handler) and an optional custom finalizer (Custom). 
```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Finalize: func(ctx context.Context, originalMessages []adk.Message, summary adk.Message) ([]adk.Message, error) { - // Custom logic to build final messages - return []adk.Message{ - schema.SystemMessage("Your system prompt"), - summary, - }, nil - }, -}) +func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M] +func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message] + +func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error) ``` -**Using Callback to Observe State Changes/Store** +Execution order: Handlers transform the summary sequentially in registration order → Custom determines the final output message list. If Custom is not set, returns `[]M{summary}`. + +### PreserveSkills + +Preserves skill content that was loaded by the Skill middleware after summary compression, ensuring the agent retains skill knowledge after context window compression. ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Callback: func(ctx context.Context, before, after adk.ChatModelAgentState) error { - log.Printf("Summarization completed: %d messages -> %d messages", - len(before.Messages), len(after.Messages)) - return nil - }, -}) +type PreserveSkillsConfig struct { + SkillToolName string // Skill tool name, must match the Skill middleware. Default "skill" + MaxSkills *int // Maximum skills to preserve. Default 5; 0 means disabled + MaxTokensPerSkill *int // Maximum tokens per skill, truncated if exceeded. Default 5000 + SkillsTokenBudget *int // Total token budget for all skills. 
Default 25000 +} ``` -**Control User Message Preservation** +**Example**: ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - PreserveUserMessages: &summarization.PreserveUserMessages{ - Enabled: true, - MaxTokens: 50000, // Preserve up to 50k tokens of user messages - }, -}) +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{}). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## How It Works +## Summarize Method -```mermaid -flowchart TD - A[BeforeModelRewriteState] --> B{Token count exceeds threshold?} - B -->|No| C[Return original state] - B -->|Yes| D[Emit BeforeSummary event] - D --> E{Has custom Prepare?} - E -->|Yes| F[Call Prepare] - E -->|No| G[Call model to generate summary] - F --> G - G --> H{Has custom Finalize?} - H -->|Yes| I[Call Finalize] - H -->|No| L{Has custom Callback?} - I --> L - L -->|Yes| M[Call Callback] - L -->|No| J[Emit AfterSummary event] - M --> J - J --> K[Return new state] - - style A fill:#e3f2fd - style G fill:#fff3e0 - style D fill:#e8f5e9 - style J fill:#e8f5e9 - style K fill:#c8e6c9 - style C fill:#f5f5f5 - style M fill:#fce4ec - style F fill:#fff3e0 - style I fill:#fff3e0 +`TypedMiddleware[M]` exposes a `Summarize` method for manually executing a summarization outside of the middleware's automatic trigger: + +```go +func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error) ``` +This method executes the complete summarization flow (generation → post-processing → Callback → events), but **does not check trigger conditions**. Returns the replaced message list. 
+ +## How It Works + + + +**Trigger Condition Check**: First checks `ContextMessages` (message count), then calculates token count via `TokenCounter` and compares against `ContextTokens`. Triggered if either condition is met. + +**Default Post-processing** (when Finalize is not set): + +1. Replace `...` in the summary with the most recent original user messages (controlled by `PreserveUserMessages`) +2. Append `TranscriptFilePath` hint +3. Add summary preamble and continuation instructions + ## Internal Events -When EmitInternalEvents is set to true, the middleware emits events at key points: +When `EmitInternalEvents = true`, the middleware emits events via `adk.TypedSendEvent`: - - + + +
 | Event Type | Trigger Timing | Carried Data |
 | --- | --- | --- |
-| ActionTypeBeforeSummary | Before generating summary | Original message list |
-| ActionTypeAfterSummary | After completing summary | Final message list |
+| `ActionTypeBeforeSummarize` | After trigger condition is met, before calling the model | `TypedBeforeSummarizeAction[M]{Messages}`: original message list |
+| `ActionTypeGenerateSummary` | After each model generation attempt (including retries/failover) | `TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()}` |
+| `ActionTypeAfterSummarize` | After summarization completes, after Finalize | `TypedAfterSummarizeAction[M]{Messages}`: final message list |
    -**Usage Example** +Events are wrapped in `TypedCustomizedAction[M]` and placed in the `adk.AgentAction.CustomizedAction` field. `GenerateSummaryPhase` has two values: `GenerateSummaryPhasePrimary` (primary model/retry) and `GenerateSummaryPhaseFailover` (failover). + +## Usage Examples + +### Minimal Configuration ```go mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - EmitInternalEvents: true, + Model: yourChatModel, }) -// Listen for events in your event handler +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: yourChatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) +``` + +### Custom Trigger + Retry + Failover + +```go +mw, err := summarization.New(ctx, &summarization.Config{ + Model: yourChatModel, + Trigger: &summarization.TriggerCondition{ + ContextTokens: 100000, + ContextMessages: 80, + }, + TranscriptFilePath: "/path/to/transcript.txt", + Retry: &summarization.RetryConfig{ + MaxRetries: ptrOf(2), + }, + Failover: &summarization.FailoverConfig{ + MaxRetries: ptrOf(3), + GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) { + return backupModel, nil, nil // Returning nil input will reuse the default input + }, + }, +}) +``` + +### FinalizerBuilder + PreserveSkills + DefaultFinalizer + +```go +defaultFinalize, _ := summarization.DefaultFinalizer[*schema.Message]( + &summarization.DefaultFinalizerConfig[*schema.Message]{ + TranscriptFilePath: "/path/to/transcript.txt", + }, +) + +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{ + MaxSkills: ptrOf(3), + }). 
+ Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, origMsgs, summary) + if err != nil { + return nil, err + } + return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## Best Practices +## Notes -1. **Set TranscriptFilePath**: It's recommended to always provide a conversation transcript file path so the model can reference the original conversation when needed. -2. **Adjust Token Threshold**: Adjust `Trigger.MaxTokens` based on the model's context window size. Generally recommended to set it to 80-90% of the model's limit. -3. **Custom Token Counter**: In production environments, it's recommended to implement a custom `TokenCounter` that matches the model's tokenizer for accurate counting. +1. **Set TranscriptFilePath**: It is strongly recommended to provide a conversation transcript file path so the model can trace back to details from the original transcript after summarization. +2. **Adjust Trigger Thresholds**: `Trigger.ContextTokens` should be set to 80-90% of the model's context window. The default value of 160,000 is suitable for models with a 200k window. +3. **Custom TokenCounter**: For production environments, it is recommended to implement a counter that precisely matches the model's tokenizer. The default estimator uses the most recent assistant message's `ResponseMeta.Usage.TotalTokens` as a baseline, with incremental messages estimated at ~4 chars/token. +4. **Finalize Override**: After setting `Finalize`, `PreserveUserMessages` and `TranscriptFilePath` no longer take effect automatically. To reuse them, use `DefaultFinalizer` or `FinalizerBuilder`. +5. **GetFailoverModel Constraints**: The callback must return a non-nil model and a non-empty input message list. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md index 3fe9bf47048..24c0d54b441 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md @@ -1,25 +1,23 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-17" lastmod: "" tags: [] -title: ToolReduction -weight: 6 +title: Reduction +weight: 5 --- -# ToolReduction Middleware - -adk/middlewares/reduction +`adk/middlewares/reduction` > 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> This middleware was introduced in v0.8.0. ## Overview -The `reduction` middleware is used to control the token count occupied by tool results, providing two strategies: +The `reduction` middleware manages the token count occupied by tool outputs in Agent conversations, divided into two phases: -1. **Truncation**: Immediately truncate overly long outputs when a tool returns, saving the complete content to Backend -2. **Clear**: When total tokens exceed the threshold, store old tool results to the file system +1. **Truncation**: Triggered immediately when a tool call returns. When a single output exceeds `MaxLengthForTrunc`, the full content is stored in the Backend and the message is replaced with a truncated summary. +2. **Clear**: Triggered before model calls (`BeforeModelRewriteState`). When total tokens exceed `MaxTokensForClear`, it iterates through history messages and offloads old tool arguments and results to the Backend. 
--- @@ -30,12 +28,13 @@ Tool call returns result │ ▼ ┌─────────────────────────────────────────────────────────────┐ -│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapEnhancedInvokableToolCall / WrapEnhancedStreamable │ │ │ -│ Truncation strategy (can be skipped) │ +│ Truncation (can be skipped via SkipTruncation) │ │ Result length > MaxLengthForTrunc? │ -│ Yes → Truncate content, save full content to Backend │ -│ No → Return as-is │ +│ Yes → Truncate content, store full content in Backend │ +│ No → Return as-is │ └─────────────────────────────────────────────────────────────┘ │ ▼ @@ -45,11 +44,14 @@ Tool call returns result ┌─────────────────────────────────────────────────────────────┐ │ BeforeModelRewriteState │ │ │ -│ Clear strategy (can be skipped) │ +│ Clear (can be skipped via SkipClear) │ │ Total tokens > MaxTokensForClear? │ -│ Yes → Store old tool results to Backend, replace with │ -│ file paths │ -│ No → Do nothing │ +│ Yes → ClearMessageRewriter preprocessing │ +│ → Old tool results stored in Backend, │ +│ replaced with file paths │ +│ → ClearAtLeastTokens minimum release check │ +│ → ClearPostProcess callback │ +│ No → No action │ └─────────────────────────────────────────────────────────────┘ │ ▼ @@ -58,150 +60,115 @@ Tool call returns result --- -## Configuration +## Generic System -### Config Main Configuration +This middleware uses the ADK standard generic pattern, supporting both `*schema.Message` and `*schema.AgenticMessage`: ```go -type Config struct { - // Backend storage backend for saving truncated/cleared content - // Required when SkipTruncation is false - Backend Backend - - // SkipTruncation skip the truncation phase - SkipTruncation bool - - // SkipClear skip the clear phase - SkipClear bool - - // ReadFileToolName name of the tool for reading files - // After content is offloaded to a file, the agent needs this tool to read it - // Default "read_file" - ReadFileToolName string 
+// Generic configuration, M constrained to adk.MessageType +type TypedConfig[M adk.MessageType] struct { ... } - // RootDir root directory for saving content - // Default "/tmp" - // Truncated content saved to {RootDir}/trunc/{tool_call_id} - // Cleared content saved to {RootDir}/clear/{tool_call_id} - RootDir string - - // MaxLengthForTrunc maximum length to trigger truncation - // Default 50000 - MaxLengthForTrunc int - - // TokenCounter token counter - // Used to determine if clearing needs to be triggered - // Default uses character_count/4 estimation - TokenCounter func(ctx context.Context, msg []adk.Message, tools []*schema.ToolInfo) (int64, error) +// Backward-compatible alias +type Config = TypedConfig[*schema.Message] +``` - // MaxTokensForClear token threshold to trigger clearing - // Default 30000 - MaxTokensForClear int64 +Constructors also provide both generic and non-generic versions: - // ClearRetentionSuffixLimit how many recent conversation rounds to keep without clearing - // Default 1 - ClearRetentionSuffixLimit int +```go +func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` - // ClearPostProcess callback after clearing completes - // Can be used to save or notify current state - ClearPostProcess func(ctx context.Context, state *adk.ChatModelAgentState) context.Context +--- - // ToolConfig configuration for specific tools - // Takes precedence over global configuration - ToolConfig map[string]*ToolReductionConfig -} -``` +## Configuration -### ToolReductionConfig Tool-level Configuration +### TypedConfig[M] Main Configuration + + + + + + + + + + + + + + + + + + + + +
    FieldTypeDescription
    Backend
    Backend
    Storage backend. Required when
    SkipTruncation
    is false; can be nil when only doing Clear without offload.
    SkipTruncation
    bool
    Skip the truncation phase.
    SkipClear
    bool
    Skip the clear phase.
    ReadFileToolName
    string
    Tool name used to read offloaded content. Default
    "read_file"
    .
    RootDir
    string
    Root directory for saving content. Default
    "/tmp"
    . Truncated content is stored at
    {RootDir}/trunc/{tool_call_id}
    , cleared content at
    {RootDir}/clear/{tool_call_id}
    .
    GenTruncOffloadFilePath
    func(ctx, *ToolDetail) (string, error)
    Custom truncation file path generation. When set, RootDir does not apply to truncation. Useful when tool_call_id is not unique.
    GenClearOffloadFilePath
    func(ctx, *ToolDetail) (string, error)
    Custom clear file path generation. When set, RootDir does not apply to clear.
    MaxLengthForTrunc
    int
    Maximum character length to trigger truncation. Default
    50000
    .
    TruncExcludeTools
    []string
    List of tool names excluded from truncation.
    TokenCounter
    func(ctx, []M, []*schema.ToolInfo) (int64, error)
    Token counting function. Default uses character count / 4 estimation. Recommended to replace with tiktoken-go/tokenizer.
    MaxTokensForClear
    int64
    Token threshold to trigger clear. Default
    160000
    .
    ClearRetentionSuffixLimit
    int
    Keep the most recent N rounds of assistant messages from being cleared. Default
    1
    .
    ClearAtLeastTokens
    int64
    Minimum tokens that must be freed by clear. If not met, clear is not executed (to avoid needlessly breaking prompt cache). Default
    0
    .
    ClearExcludeTools
    []string
    List of tool names excluded from clear.
    ClearMessageRewriter
    func(ctx, M, []M) ([]M, error)
    Message rewrite callback before clear. Parameters are toolCallMsg and the corresponding toolResponseMsgs. Can be used to rewrite write_file/edit_file calls into system-reminders. Returning nil removes that group of messages.
    ClearPostProcess
    func(ctx, *adk.TypedChatModelAgentState[M]) context.Context
    Callback after clear completes, can save state or send notifications. Returns a potentially updated context.
    ToolConfig
    map[string]*ToolReductionConfig
    Per-tool configuration, takes priority over global settings.
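
As a rough illustration of two defaults from the table above, the sketch below estimates tokens as character count / 4 and builds the default offload paths `{RootDir}/trunc/{tool_call_id}` and `{RootDir}/clear/{tool_call_id}`. The helper names `estimateTokens` and `offloadPath` are hypothetical, and counting runes rather than bytes is an assumption of this sketch; the library's own implementation may differ.

```go
package main

import (
	"fmt"
	"path"
	"unicode/utf8"
)

// estimateTokens approximates a token count as characters / 4.
// Hypothetical helper: counting runes (not bytes) is an assumption.
func estimateTokens(texts ...string) int64 {
	var chars int64
	for _, t := range texts {
		chars += int64(utf8.RuneCountInString(t))
	}
	return chars / 4
}

// offloadPath joins the root directory, the phase ("trunc" or "clear"),
// and the tool call ID, following the documented default layout.
func offloadPath(rootDir, phase, toolCallID string) string {
	return path.Join(rootDir, phase, toolCallID)
}

func main() {
	fmt.Println(estimateTokens("abcdefgh", "ijkl")) // 12 characters / 4
	fmt.Println(offloadPath("/tmp", "trunc", "call_123"))
	fmt.Println(offloadPath("/tmp", "clear", "call_123"))
}
```

The estimate is deliberately coarse, which is why the table recommends swapping in a real tokenizer through `TokenCounter`.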
+
+### ToolReductionConfig Per-tool Configuration

 ```go
 type ToolReductionConfig struct {
-    // Backend storage backend for this tool
-    Backend Backend
-
-    // SkipTruncation skip truncation for this tool
+    Backend        Backend
     SkipTruncation bool
-
-    // TruncHandler custom truncation handler
-    // Uses default handler if not set
-    TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error)
-
-    // SkipClear skip clearing for this tool
-    SkipClear bool
-
-    // ClearHandler custom clear handler
-    // Uses default handler if not set
-    ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error)
+    TruncHandler   func(ctx context.Context, detail *ToolDetail) (*TruncResult, error)
+    SkipClear      bool
+    ClearHandler   func(ctx context.Context, detail *ToolDetail) (*ClearResult, error)
 }
 ```

-### ToolDetail Tool Details
+- If `TruncHandler` / `ClearHandler` is nil and the phase is not skipped, the global default handler is used.
+- `Backend` is an independent storage backend for that tool, overriding the global Backend.
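
The fallback described in the bullets above can be sketched as a small resolution function. This is only an illustration of the documented precedence, not the library's code: `toolConfig`, `truncHandler`, and `resolveTruncHandler` are hypothetical stand-ins, the handler signature is simplified to strings, and the global `SkipTruncation` flag is not modeled.

```go
package main

import "fmt"

// truncHandler is a simplified stand-in for the documented TruncHandler.
type truncHandler func(toolResult string) string

// toolConfig is a hypothetical, trimmed-down stand-in for ToolReductionConfig.
type toolConfig struct {
	SkipTruncation bool
	TruncHandler   truncHandler
}

// resolveTruncHandler returns the handler to run for a tool and whether
// truncation should run at all.
func resolveTruncHandler(tool string, perTool map[string]*toolConfig, global truncHandler) (truncHandler, bool) {
	cfg, ok := perTool[tool]
	if !ok || cfg == nil {
		return global, true // no per-tool entry: global settings apply
	}
	if cfg.SkipTruncation {
		return nil, false // truncation is skipped for this tool
	}
	if cfg.TruncHandler != nil {
		return cfg.TruncHandler, true // per-tool handler takes priority
	}
	return global, true // nil handler, not skipped: fall back to the default
}

func main() {
	global := func(s string) string { return "default:" + s }
	custom := func(s string) string { return "custom:" + s }
	perTool := map[string]*toolConfig{
		"grep":      {TruncHandler: custom},
		"read_file": {SkipTruncation: true},
		"ls":        {},
	}
	for _, name := range []string{"grep", "read_file", "ls", "other"} {
		h, run := resolveTruncHandler(name, perTool, global)
		if !run {
			fmt.Println(name, "-> skipped")
			continue
		}
		fmt.Println(name, "->", h("x"))
	}
}
```

The same shape would apply to `ClearHandler` and `SkipClear`.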
+
+### ToolDetail

 ```go
 type ToolDetail struct {
-    // ToolContext tool metadata (tool name, call ID)
-    ToolContext *adk.ToolContext
-
-    // ToolArgument input parameters
-    ToolArgument *schema.ToolArgument
-
-    // ToolResult output result
-    ToolResult *schema.ToolResult
+    ToolContext      *adk.ToolContext
+    ToolArgument     *schema.ToolArgument
+    ToolResult       *schema.ToolResult                       // Non-streaming
+    StreamToolResult *schema.StreamReader[*schema.ToolResult] // Streaming
 }
 ```

-### TruncResult Truncation Result
+### TruncResult

 ```go
 type TruncResult struct {
-    // NeedTrunc whether truncation is needed
-    NeedTrunc bool
-
-    // ToolResult tool result after truncation
-    // Required when NeedTrunc is true
-    ToolResult *schema.ToolResult
-
-    // NeedOffload whether offloading to storage is needed
-    NeedOffload bool
-
-    // OffloadFilePath offload file path
-    // Required when NeedOffload is true
-    OffloadFilePath string
-
-    // OffloadContent offload content
-    // Required when NeedOffload is true
-    OffloadContent string
+    NeedTrunc        bool
+    ToolResult       *schema.ToolResult                       // Required when NeedTrunc && non-streaming
+    StreamToolResult *schema.StreamReader[*schema.ToolResult] // Required when NeedTrunc && streaming
+    NeedOffload      bool
+    OffloadFilePath  string                                   // Required when NeedOffload
+    OffloadContent   string                                   // Required when NeedOffload
 }
 ```

-### ClearResult Clear Result
+### ClearResult

 ```go
 type ClearResult struct {
-    // NeedClear whether clearing is needed
-    NeedClear bool
-
-    // ToolArgument tool argument after clearing
-    // Required when NeedClear is true
-    ToolArgument *schema.ToolArgument
-
-    // ToolResult tool result after clearing
-    // Required when NeedClear is true
-    ToolResult *schema.ToolResult
-
-    // NeedOffload whether offloading to storage is needed
-    NeedOffload bool
+    NeedClear       bool
+    ToolArgument    *schema.ToolArgument // Required when NeedClear
+    ToolResult      *schema.ToolResult   // Required when NeedClear
+    NeedOffload     bool
+    OffloadFilePath string               // Required when NeedOffload
+    OffloadContent  string               // Required when NeedOffload
+}
+```

-    // OffloadFilePath offload file path
-    // Required when NeedOffload is true
-    OffloadFilePath string
+### Backend Interface

-    // OffloadContent offload content
-    // Required when NeedOffload is true
-    OffloadContent string
+```go
+// Defined in reduction/internal, exported via type alias
+type Backend interface {
+    Write(context.Context, *filesystem.WriteRequest) error
 }
 ```

+`filesystem.WriteRequest` contains two fields: `FilePath string` and `Content string`.
+
 ---

 ## Creating the Middleware
@@ -209,67 +176,75 @@ type ClearResult struct {
 ### Basic Usage

 ```go
-import (
-    "context"
-    "github.com/cloudwego/eino/adk/middlewares/reduction"
-)
+import "github.com/cloudwego/eino/adk/middlewares/reduction"

-// Use default configuration
 middleware, err := reduction.New(ctx, &reduction.Config{
-    Backend: myBackend, // Required: storage backend
+    Backend: myBackend,
 })

-// Use with ChatModelAgent
 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
-    Model: yourChatModel,
+    Model:       chatModel,
     Middlewares: []adk.ChatModelAgentMiddleware{middleware},
 })
 ```

+### Generic Usage (AgenticMessage)
+
+```go
+middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{
+    Backend:      myBackend,
+    TokenCounter: myAgenticTokenCounter,
+})
+
+agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{
+    Model:       chatModel,
+    Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware},
+})
+```
+
 ### Custom Configuration

 ```go
-config := &reduction.Config{
+middleware, err := reduction.New(ctx, &reduction.Config{
     Backend:                   myBackend,
     RootDir:                   "/data/agent",
     MaxLengthForTrunc:         30000,
     MaxTokensForClear:         100000,
     ClearRetentionSuffixLimit: 2,
-    TokenCounter:              myTokenCounter,
+    ClearAtLeastTokens:        10000,
+    TruncExcludeTools:         []string{"search_tool"},
+    ClearExcludeTools:         []string{"read_file"},
+    ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) {
+        // Rewrite write_file calls into a system-reminder
+        return []*schema.Message{schema.UserMessage("file written")}, nil
+    },
     ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context {
         log.Printf("Clear completed, messages: %d", len(state.Messages))
         return ctx
     },
     ToolConfig: map[string]*reduction.ToolReductionConfig{
-        "grep": {
-            Backend:        grepBackend,
-            SkipTruncation: false,
-        },
-        "read_file": {
-            Backend:   readFileBackend,
-            SkipClear: true, // Read file tool doesn't need clearing
-        },
+        "grep":      {Backend: grepBackend},
+        "read_file": {SkipClear: true},
     },
-}
-
-middleware, err := reduction.New(ctx, config)
+})
 ```

-### Using Truncation Strategy Only
+### Truncation Only

 ```go
 middleware, err := reduction.New(ctx, &reduction.Config{
     Backend:   myBackend,
-    SkipClear: true, // Skip clear phase
+    SkipClear: true,
 })
 ```

-### Using Clear Strategy Only
+### Clear Only

 ```go
 middleware, err := reduction.New(ctx, &reduction.Config{
-    Backend:        myBackend,
-    SkipTruncation: true, // Skip truncation phase
+    SkipTruncation:    true,
+    MaxTokensForClear: 100000,
+    // When Backend is nil, clear still replaces content with placeholders but does not perform offload
 })
 ```

@@ -279,29 +254,37 @@ middleware, err := reduction.New(ctx, &reduction.Config{
 ### Truncation

-Handled in `WrapInvokableToolCall` / `WrapStreamableToolCall`:
+Handled in `WrapInvokableToolCall` / `WrapStreamableToolCall` / `WrapEnhancedInvokableToolCall` / `WrapEnhancedStreamableToolCall`:

 1. Tool returns result
-2. Call TruncHandler to determine if truncation is needed
-3. If truncation needed, save full content to Backend
-4. Return truncated content with hint text telling the agent where to find the full content
+2. Check `TruncExcludeTools`, skip if matched
+3. Look up ToolConfig → global defaultConfig to get TruncHandler
+4. TruncHandler evaluation: read the full output, check if total length of all text parts exceeds `MaxLengthForTrunc`
+5. If exceeded: keep the first and last `MaxLengthForTrunc/(textParts*2)` characters as preview, store full content in Backend
+6. Return truncation notice informing the agent of the full content file path
+
+> 💡
+> For streaming tools, the default TruncHandler waits for the complete stream to be read before deciding whether to truncate. If strict incremental streaming behavior is needed, provide a custom TruncHandler for that tool.

 ### Clear

 Handled in `BeforeModelRewriteState`:

-1. Use TokenCounter to calculate total tokens
-2. Only process if exceeds MaxTokensForClear
-3. Iterate from old messages, skipping already processed ones and the most recent ClearRetentionSuffixLimit rounds
-4. For each tool call in range, call ClearHandler
-5. If clearing needed, write to Backend and replace message result with file path
-6. Call ClearPostProcess callback
+1. Calculate total tokens using `TokenCounter`
+2. Skip if not exceeding `MaxTokensForClear`
+3. Determine clear range: from the first unprocessed assistant message to `len(messages) - ClearRetentionSuffixLimit` rounds
+4. If `ClearMessageRewriter` is configured, execute rewrite preprocessing on messages within range first
+5. Iterate tool call messages within range, skip `ClearExcludeTools`
+6. Call ClearHandler for each tool call, replacing arguments and results
+7. If `ClearAtLeastTokens` is set: operate on a copy first, compare token difference before and after clear; abandon this clear if threshold not met
+8. If threshold met, execute actual offload writes and update state.Messages
+9. Call `ClearPostProcess`

 ---

 ## Multi-language Support

-Truncation and clear hint text supports Chinese and English, switch via `adk.SetLanguage()`:
+Truncation and clear prompt text supports automatic Chinese/English switching:

 ```go
 adk.SetLanguage(adk.LanguageChinese) // Chinese
@@ -312,7 +295,11 @@ adk.SetLanguage(adk.LanguageEnglish) // English (default)

 ## Notes

-- When `SkipTruncation` is false, `Backend` must be set
-- The default TokenCounter uses `character_count / 4` estimation, which is not accurate for Chinese; consider using `github.com/tiktoken-go/tokenizer` as a replacement
-- Already processed messages are marked and won't be processed again
-- Configuration in `ToolConfig` takes precedence over global configuration
+- `Backend` **must** be set when `SkipTruncation` is false
+- The default TokenCounter uses character count / 4 estimation; it is recommended to replace with `github.com/tiktoken-go/tokenizer`
+- Already-processed messages are marked via the Extra field `_reduction_mw_processed` and will not be processed again
+- `ToolConfig` per-tool configuration takes priority over global; if only `SkipTruncation: false` is set in ToolConfig without providing a `TruncHandler`, it falls back to the default handler
+- `GenTruncOffloadFilePath` / `GenClearOffloadFilePath` are useful when tool_call_id is not unique (e.g., retry), preventing file overwrites
+- `ClearMessageRewriter` executes after the clear range is determined but before per-tool clearing, suitable for compressing write/edit calls into brief hints
+- Setting `ClearAtLeastTokens` to 0 means clear executes whenever the threshold is exceeded; values greater than 0 can avoid minimal clearing that would break prompt cache
+- Legacy APIs (`NewClearToolResult`, `NewToolResultMiddleware`) are deprecated; migration to `New` / `NewTyped` is recommended
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md
index f3d60b57c33..4884c44512f 100644
--- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md
+++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md
@@ -1,26 +1,28 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: ToolSearch
-weight: 5
+weight: 7
 ---

-# ToolSearch Middleware
-
-adk/middlewares/dynamictool/toolsearch
-
-> 💡
-> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1).
-
 ## Overview

 The `toolsearch` middleware implements dynamic tool selection. When the tool library is large, passing all tools to the model would overflow the context. This middleware's approach is:

-1. Add a `tool_search` meta-tool that accepts regex patterns to search tool names
+1. Add a `tool_search` meta-tool that accepts keyword queries or direct selection to search for tools
 2. Initially hide all dynamic tools
-3. After the model calls `tool_search`, matched tools become available in subsequent calls
+3. After the model calls `tool_search`, matched tools appear in subsequent calls
+
+Three operating modes are supported (two configuration values, but `UseModelToolSearch=true` has two end-to-end behaviors):
+
+- **Default mode** (`UseModelToolSearch=false`): The middleware manages tool visibility itself. Before each Model call via `BeforeModelRewriteState`, it filters `state.ToolInfos` based on `tool_search` call results, progressively adding selected dynamic tools back to the model's visible list
+- **Model native mode, pure server-side retrieval** (`UseModelToolSearch=true`, model self-retrieves DeferredTools): The middleware moves dynamic tools into `state.DeferredToolInfos`, passing them to the model via `model.WithDeferredTools`. If the model natively supports server-side tool retrieval (e.g., Claude's tool search), the model searches and selects directly from DeferredTools, **without calling the tool_search tool**
+- **Model native mode, client-side proxy retrieval** (`UseModelToolSearch=true`, model discovers tools by calling `tool_search`): Same middleware configuration as above, but the model doesn't have autonomous DeferredTools retrieval capability. Instead, it calls the `tool_search` tool (registered via `model.WithToolSearchTool`), the client-side `modelToolSearchTool` performs the search and returns a structured `ToolSearchResult` (containing full ToolInfo of matched tools), and the model selects tools accordingly
+
+> 💡
+> Package path: github.com/cloudwego/eino/adk/middlewares/dynamictool/toolsearch

 ---

@@ -31,18 +33,35 @@ Agent initialization
           │
           ▼
 ┌───────────────────────────────────────────┐
-│ BeforeAgent │
-│ - Inject tool_search tool │
-│ - Add DynamicTools to Tools list │
+│ BeforeAgent                               │
+│ - Inject tool_search tool                 │
+│ - Add DynamicTools to Tools list          │
+│ - In model native mode, set               │
+│   runCtx.ToolSearchTool                   │
 └───────────────────────────────────────────┘
           │
           ▼
 ┌────────────────────────────────────────────┐
-│ WrapModel │
-│ Before each Model call: │
-│ 1. Scan message history to find all tool_search return results │
-│ 2. Full Tools minus unselected DynamicTools = tools for this Model │
-│ call │
+│ BeforeModelRewriteState                    │
+│ (executed before each Model call)          │
+│                                            │
+│ 1. Insert                                  │
+│    User message listing all searchable     │
+│    tool names                              │
+│                                            │
+│ First call (initialization):               │
+│   Default mode:                            │
+│     Remove DynamicTools from ToolInfos     │
+│   Model native mode:                       │
+│     DynamicTools → DeferredToolInfos       │
+│     Remove DynamicTools from ToolInfos     │
+│     and tool_search                        │
+│                                            │
+│ Subsequent calls (default mode -           │
+│ forward selection):                        │
+│   Scan message history, collect            │
+│   tool_search returned matches, add        │
+│   matched DynamicTools back to ToolInfos   │
 └────────────────────────────────────────────┘
           │
           ▼
@@ -57,33 +76,80 @@ Agent initialization
 type Config struct {
     // Tools that can be dynamically searched and loaded
     DynamicTools []tool.BaseTool
+
+    // Whether to use the model's native tool search capability
+    //
+    // When true, the middleware delegates tool search to the model's native capability.
+    //
+    // When false (default), the middleware manages tool visibility by
+    // filtering the tool list before each Model call based on tool_search results.
+    // Note: This approach may invalidate the model's KV-cache
+    // (since the tool list changes between calls).
+    UseModelToolSearch bool
 }
 ```

 ---

-## tool_search Tool
+## Constructors

-The tool injected by the middleware.
+```go
+// Standard constructor, uses *schema.Message
+func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error)

-**Parameters:**
+// Generic constructor, supports *schema.Message and *schema.AgenticMessage
+func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error)
+```
+
+`New` internally calls `NewTyped[*schema.Message]`. If you use `TypedChatModelAgent` (e.g., Agentic mode), use `NewTyped` directly.
+
+---
+
+## tool_search Tool
+
+The meta-tool injected by the middleware.
+
+**Parameters:**

 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
-| `regex_pattern` | string | Yes | Regex pattern to match tool names |
+| `query` | string | Yes | Query string to find tools. Supports three modes: keyword search, `select:` direct selection, `+keyword` must-match |
+| `max_results` | integer | No | Maximum number of results to return (default: 5). Only applies to keyword search mode; direct selection mode is not limited |

-**Returns:**
+**Query modes:**
+
+| Mode | Syntax | Description |
+| --- | --- | --- |
+| Keyword search | `"weather forecast"` | Matches keywords in tool names and descriptions, sorted by relevance score. Supports camelCase and `_` / `__` (MCP) separator splitting |
+| Direct selection | `"select:tool_a,tool_b"` | Select one or more tools by exact name, comma-separated. Not limited by `max_results` |
+| Must-match | `"+slack send message"` | Keywords prefixed with `+` are must-match items; tools without that keyword are filtered out. Other keywords are used for sorting |
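
To make the three query modes concrete, here is a minimal parsing sketch. It is not the middleware's implementation: `parsedQuery` and `parseQuery` are hypothetical names, and details such as trimming whitespace around selected names and lowercasing keywords are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// parsedQuery is a hypothetical representation of a tool_search query.
type parsedQuery struct {
	Selected []string // exact tool names from "select:a,b"
	Required []string // keywords prefixed with "+"
	Optional []string // remaining keywords, used only for ranking
}

// parseQuery classifies a query as direct selection or keyword search.
func parseQuery(q string) parsedQuery {
	var p parsedQuery
	q = strings.TrimSpace(q)
	if strings.HasPrefix(q, "select:") {
		for _, name := range strings.Split(strings.TrimPrefix(q, "select:"), ",") {
			if name = strings.TrimSpace(name); name != "" {
				p.Selected = append(p.Selected, name)
			}
		}
		return p
	}
	for _, word := range strings.Fields(q) {
		if strings.HasPrefix(word, "+") {
			if kw := strings.TrimPrefix(word, "+"); kw != "" {
				p.Required = append(p.Required, strings.ToLower(kw))
			}
			continue
		}
		p.Optional = append(p.Optional, strings.ToLower(word))
	}
	return p
}

func main() {
	fmt.Printf("%+v\n", parseQuery("weather forecast"))
	fmt.Printf("%+v\n", parseQuery("select:tool_a,tool_b"))
	fmt.Printf("%+v\n", parseQuery("+slack send message"))
}
```

In this sketch a tool would be dropped unless every `Required` keyword matches it, while `Optional` keywords only influence ordering.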
+
+**Return value (default mode):**

 ```json
-{
-    "selectedTools": ["tool_a", "tool_b"]
-}
+{"matches": ["tool_a", "tool_b"]}
 ```

----
+**Return value (model native mode):** Returns a structured `schema.ToolResult` containing full `ToolInfo` of matched tools for native model processing.
+
+## Keyword Search Scoring Mechanism
+
+Keyword search uses a multi-layer scoring system, calculating the highest score for each keyword separately then summing:
+
+| Matching Rule | Score |
+| --- | --- |
+| Tool name split part exactly matches keyword | 10 |
+| Tool name split part contains keyword (substring) | 5 |
+| Full tool name contains keyword | 3 |
+| Tool description contains keyword | 2 |
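
The scoring rules in the table can be sketched as follows, together with the name splitting this section describes (underscores and camelCase boundaries, case-insensitive matching). The function names are hypothetical and the camelCase rule here is a simplification (split where an uppercase letter follows a lowercase one), so treat it as an illustration rather than the middleware's actual code. Ranking and tie-breaking by name are not shown.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitName breaks a tool name on '_' and on lower-to-upper camelCase boundaries.
func splitName(name string) []string {
	var parts []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			parts = append(parts, string(cur))
			cur = cur[:0]
		}
	}
	var prev rune
	for _, r := range name {
		switch {
		case r == '_':
			flush()
		case unicode.IsUpper(r) && unicode.IsLower(prev):
			flush()
			cur = append(cur, r)
		default:
			cur = append(cur, r)
		}
		prev = r
	}
	flush()
	return parts
}

// keywordScore returns the highest applicable rule score for one keyword.
func keywordScore(kw, name, desc string) int {
	kw = strings.ToLower(kw)
	best := 0
	for _, p := range splitName(name) {
		p = strings.ToLower(p)
		if p == kw {
			return 10 // exact part match is the top score
		}
		if strings.Contains(p, kw) && best < 5 {
			best = 5
		}
	}
	if best < 3 && strings.Contains(strings.ToLower(name), kw) {
		best = 3
	}
	if best < 2 && strings.Contains(strings.ToLower(desc), kw) {
		best = 2
	}
	return best
}

// score sums the per-keyword maxima for one tool.
func score(keywords []string, name, desc string) int {
	total := 0
	for _, kw := range keywords {
		total += keywordScore(kw, name, desc)
	}
	return total
}

func main() {
	fmt.Println(splitName("mcp__slack__send_message"), splitName("NotebookEdit"))
	fmt.Println(score([]string{"weather", "forecast"}, "weather_forecast", "Get the forecast"))
	fmt.Println(score([]string{"note"}, "NotebookEdit", "Edit notebook cells"))
}
```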
-## Usage Example
+> 💡
+> Each keyword takes the highest score (intMax) across rules for a single tool, without stacking scores from multiple parts within the same tool. Scores from multiple keywords are summed for the total. Tools with equal scores are sorted alphabetically by name.
+
+Tool names are split by `_` (underscore), `__` (MCP server-tool separator), and camelCase boundaries into multiple parts for matching. For example, `mcp__slack__send_message` splits into `["mcp", "slack", "send", "message"]`, and `NotebookEdit` splits into `["Notebook", "Edit"]`. Matching is case-insensitive.
+
+## Usage Examples
+
+### Default Mode (Middleware Manages Tool Visibility)

 ```go
 middleware, err := toolsearch.New(ctx, &toolsearch.Config{
@@ -104,35 +170,70 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
 })
 ```

+### Model Native Mode
+
+```go
+middleware, err := toolsearch.New(ctx, &toolsearch.Config{
+    DynamicTools: []tool.BaseTool{
+        weatherTool,
+        stockTool,
+        currencyTool,
+    },
+    UseModelToolSearch: true,
+})
+if err != nil {
+    return err
+}
+
+agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+    Model:    myModel, // Requires model to support native tool search
+    Handlers: []adk.ChatModelAgentMiddleware{middleware},
+})
+```
+
+The configuration is identical, but end-to-end behavior depends on the model adapter implementation:
+
+- If the model natively supports server-side retrieval (e.g., Claude): the model searches and selects tools directly from `DeferredToolInfos`, and the `tool_search` tool is not called
+- If the model uses client-side proxy retrieval: the model issues a `tool_search` call → client-side `modelToolSearchTool` performs the search → returns structured `ToolSearchResult` (with full ToolInfo) → model selects tools accordingly
+
 ---

 ## How It Works

 ### BeforeAgent

-1. Get all DynamicTools
-2. Create `tool_search` tool using DynamicTools
-3. Add `tool_search` and all DynamicTools to `runCtx.Tools`, at this point Agent has full Tools
+1. Get ToolInfo for all DynamicTools, validate no duplicate tool names
+2. Create the corresponding type of `tool_search` tool based on `UseModelToolSearch`
+3. Add `tool_search` and all DynamicTools to `runCtx.Tools` (at this point the Agent has the full set of tools)
+4. In model native mode, set `runCtx.ToolSearchTool`, which the framework passes to the model via `model.WithToolSearchTool`
+
+### BeforeModelRewriteState (Before Each Model Call)
+
+**Common logic:**
+
+- Ensure the message list contains an `` reminder (inserted as a User message, listing all searchable tool names)
+
+**First call, initialization (both modes):**

-### WrapModel
+
+- **Default mode**: Removes all DynamicTools from `state.ToolInfos`, so the model initially can only see static tools and `tool_search`
+- **Model native mode**:
+  1. Extract DynamicTools from `state.ToolInfos` into `state.DeferredToolInfos`
+  2. Remove `tool_search` from `state.ToolInfos` (handled natively by the model)
-Before each Model call:
+**Subsequent calls, forward selection (default mode only):**

-1. Iterate through message history to find all `tool_search` return results
+1. Iterate message history, find all `tool_search` return results with JSON `matches` field
 2. Collect selected tool names
-3. Filter out unselected DynamicTools from full tools
-4. Call Model with filtered tool list
+3. Add matched DynamicTools back to `state.ToolInfos` (accumulative, previously added tools are not removed)

-### Tool Selection Flow
+### Tool Selection Flow (Default Mode)

 ```
 Round 1:
-    Model can only see tool_search
-    Model calls tool_search(regex_pattern="weather.*")
-    Returns {"selectedTools": ["weather_forecast", "weather_history"]}
+    Model can only see tool_search + static tools
+    Model calls tool_search(query="weather forecast")
+    Returns {"matches": ["weather_forecast", "weather_history"]}

 Round 2:
-    Model can see tool_search + weather_forecast + weather_history
+    Model can see tool_search + static tools + weather_forecast + weather_history
     Model calls weather_forecast(...)
 ```

@@ -140,7 +241,10 @@ Round 2:

 ## Notes

-- DynamicTools cannot be empty
-- Regex matches tool names, not descriptions
-- Selected tools remain available unless the tool_search call result is deleted or modified
-- tool_search can be called multiple times, results accumulate
+- `DynamicTools` cannot be empty, and tool names cannot be duplicated
+- Keyword search matches tool names and descriptions, case-insensitive
+- In default mode, selected tools remain available (accumulated based on `tool_search` results in message history)
+- `tool_search` can be called multiple times; results accumulate
+- In default mode, the tool list may change before each Model call, which may invalidate the model's KV-cache
+- Model native mode requires the ChatModel to support `model.WithToolSearchTool` and/or `model.WithDeferredTools` options. Which path is taken (pure server-side retrieval vs. client-side proxy retrieval) depends on the model adapter implementation
+- The `` reminder is inserted as a **User message** (not a System message) into the message list, positioned before the first non-System message
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md
index 9a86ea95204..37090076ee8 100644
--- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md
+++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md
@@ -1,298 +1,257 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-19"
 lastmod: ""
 tags: []
-title: 'Eino ADK: ChatModelAgentMiddleware'
+title: ChatModelAgentMiddleware
 weight: 8
 ---

-## Overview
+`ChatModelAgentMiddleware` is the core interface for customizing the behavior of `ChatModelAgent` (and `DeepAgent` built on top of it). Introduced in v0.8.0, it has continued to evolve in subsequent releases.

-## ChatModelAgentMiddleware Interface
+## Type Conventions

-`ChatModelAgentMiddleware` defines the interface for customizing `ChatModelAgent` behavior.
+This document uses the default `M = *schema.Message` aliases. The generic original types are prefixed with `Typed`:

-**Important:** This interface is designed specifically for `ChatModelAgent` and Agents built on top of it (such as `DeepAgent`).
-
-> 💡
-> The ChatModelAgentMiddleware interface was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1)
+```go
+type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message]
+type BaseChatModelAgentMiddleware = TypedBaseChatModelAgentMiddleware[*schema.Message]
+type ChatModelAgentState = TypedChatModelAgentState[*schema.Message]
+type ModelContext = TypedModelContext[*schema.Message]
+```

-### Why Use ChatModelAgentMiddleware Instead of AgentMiddleware?
+When you need to use `*schema.AgenticMessage`, use the `Typed` generic versions directly.

-| Feature | AgentMiddleware (struct) | ChatModelAgentMiddleware (interface) |
-| --- | --- | --- |
-| Extensibility | Closed, users cannot add new methods | Open, users can implement custom handlers |
-| Context Propagation | Callbacks only return error | All methods return `(context.Context, ..., error)` |
-| Configuration Management | Scattered in closures | Centralized in struct fields |
+---

-### Interface Definition
+## Interface Definition

 ```go
 type ChatModelAgentMiddleware interface {
-    // BeforeAgent is called before each agent run, allows modifying instruction and tools configuration
+    // ── Lifecycle Hooks ──
+
+    // BeforeAgent: called once before the agent runs; can modify instruction and tools configuration
     BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

-    // BeforeModelRewriteState is called before each model call
-    // The returned state will be persisted to the agent's internal state and passed to the model
-    // The returned context will be propagated to the model call and subsequent handlers
+    // AfterAgent: called after the agent terminates successfully (final answer or return-directly tool result)
+    // Not called on error termination (max iterations exceeded, context cancelled, model error)
+    AfterAgent(ctx context.Context, state *ChatModelAgentState) (context.Context, error)
+
+    // BeforeModelRewriteState: called before each model invocation
+    // The returned state is persisted; Messages, ToolInfos, and DeferredToolInfos can be modified
     BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

-    // AfterModelRewriteState is called after each model call
+    // AfterModelRewriteState: called after each model invocation
     // The input state contains the model response as the last message
     AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

-    // WrapInvokableToolCall wraps the synchronous execution of a tool with custom behavior
-    // If no wrapping is needed, return the original endpoint and nil error
-    // Only called for tools that implement InvokableTool
-    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)
+    // ── Wrappers ──

-    // WrapStreamableToolCall wraps the streaming execution of a tool with custom behavior
-    // If no wrapping is needed, return the original endpoint and nil error
-    // Only called for tools that implement StreamableTool
+    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)
     WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)
-
-    // WrapEnhancedInvokableToolCall wraps the synchronous execution of an enhanced tool with custom behavior
     WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)
-
-    // WrapEnhancedStreamableToolCall wraps the streaming execution of an enhanced tool with custom behavior
     WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

-    // WrapModel wraps the chat model with custom behavior
-    // If no wrapping is needed, return the original model and nil error
-    // Called at request time, executed before each model call
-    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
-}
-```
-
-### Using BaseChatModelAgentMiddleware
-
-Embed `*BaseChatModelAgentMiddleware` to get default no-op implementations:
-
-```go
-type MyHandler struct {
-    *adk.BaseChatModelAgentMiddleware
-}
-
-func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) {
-    return ctx, state, nil
+    // WrapModel: wraps the ChatModel; parameter type is model.BaseModel[M] (not ToolCallingChatModel)
+    // The framework handles WithTools binding separately and does not pass through user wrappers
+    WrapModel(ctx context.Context, m model.BaseModel[M], mc *ModelContext) (model.BaseModel[M], error)
 }
 ```

----
-
-## Tool Call Endpoint Types
-
-Tool wrapping uses function types instead of interfaces, more clearly expressing the wrapping intent:
+> 💡
+> Embed `*BaseChatModelAgentMiddleware` to get no-op default implementations for all methods; only override the ones you care about.

-```go
-// InvokableToolCallEndpoint is the function signature for synchronous tool calls
-type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error)
+### AgentMiddleware Is Deprecated

-// StreamableToolCallEndpoint is the function signature for streaming tool calls
-type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error)
+> 💡
+> The `AgentMiddleware` struct and the `ChatModelAgentConfig.Middlewares` field have been marked as Deprecated and will be removed in a future release. All new code should use `ChatModelAgentMiddleware` (interface-based Handlers).

-// EnhancedInvokableToolCallEndpoint is the function signature for enhanced synchronous tool calls
-type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error)
+`AgentMiddleware` is a struct with inherent limitations: users cannot extend its methods, and callbacks only return error without context propagation. `ChatModelAgentMiddleware` is an interface:

-// EnhancedStreamableToolCallEndpoint is the function signature for enhanced streaming tool calls
-type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error)
-```
+- Hook methods return `(context.Context, ..., error)`, supporting context propagation
+- Wrapper methods propagate modified context through the endpoint chain
+- Custom handlers can carry arbitrary internal state

-### Why Use Separate Endpoint Types?
+Migration mapping:

-The previous `ToolCall` interface contained both `InvokableRun` and `StreamableRun`, but most tools only implement one of them.
-Separate endpoint types enable:
+
+| AgentMiddleware field | ChatModelAgentMiddleware replacement |
+| --- | --- |
+| `AdditionalInstruction` | Modify `runCtx.Instruction` in `BeforeAgent` |
+| `AdditionalTools` | Modify `runCtx.Tools` in `BeforeAgent` |
+| `BeforeChatModel` | `BeforeModelRewriteState` |
+| `AfterChatModel` | `AfterModelRewriteState` |
+| `WrapToolCall` | `WrapInvokableToolCall` / `WrapStreamableToolCall` etc. |
-- Corresponding wrap methods are only called when the tool implements the respective interface
-- Clearer contract for wrapper authors
-- No ambiguity about which method to implement
+In the current version, both can coexist (Handlers execute after Middlewares), but you should migrate as soon as possible.

 ---

-## ChatModelAgentContext
+## Context Types
+
+### ChatModelAgentContext

-`ChatModelAgentContext` contains runtime information passed to handlers before each `ChatModelAgent` run.
+Input to `BeforeAgent`, called once before each Run:

 ```go
 type ChatModelAgentContext struct {
-    // Instruction is the instruction for the current Agent execution
-    // Includes agent-configured instructions, framework and AgentMiddleware appended extra instructions,
-    // and modifications applied by previous BeforeAgent handlers
+    // Current instruction (includes agent config + framework appended + prior handler modifications)
     Instruction string

-    // Tools are the original tools (without any wrappers or tool middleware) currently configured for Agent execution
-    // Includes tools passed in AgentConfig, tools implicitly added by the framework (like transfer/exit tools),
-    // and other tools added by middleware
+    // Original tool list (includes framework-implicit tools such as transfer/exit)
     Tools []tool.BaseTool

-    // ReturnDirectly is the set of tool names currently configured to make the Agent return directly
+    // Set of tool names configured to "return directly"
     ReturnDirectly map[string]bool
+
+    // ToolInfo for the model's native tool search capability
+    // Once set by a handler, the framework passes it to the model via model.WithToolSearchTool
+    ToolSearchTool *schema.ToolInfo
 }
 ```

----
-
-## ChatModelAgentState
+### ChatModelAgentState

-`ChatModelAgentState` represents the state of the chat model agent during conversation. This is the primary state type for `ChatModelAgentMiddleware` and `AgentMiddleware` callbacks.
+**Persistent state** passed before and after each model call (maintained across iterations):

 ```go
 type ChatModelAgentState struct {
-    // Messages contains all messages in the current conversation session
-    Messages []Message
+    // All messages in the current session
+    Messages []*schema.Message
+
+    // Tool definitions passed to the model (via model.WithTools); can be modified in BeforeModelRewriteState
+    ToolInfos []*schema.ToolInfo
+
+    // Deferred tool definitions (via model.WithDeferredTools), used for the model's native search capability
+    // nil when not in use
+    DeferredToolInfos []*schema.ToolInfo
 }
 ```

----
+> 💡
+> The recommended place to modify `ToolInfos` / `DeferredToolInfos` is `BeforeModelRewriteState`; this is the source of truth for tool configuration. Do not modify the tool list in `WrapModel`.

-## ToolContext
+### ModelContext

-`ToolContext` provides metadata about the tool being wrapped. Created at request time, contains information about the current tool call.
+Context for `WrapModel` and `Before/AfterModelRewriteState`:

 ```go
-type ToolContext struct {
-    // Name is the tool name
-    Name string
+type ModelContext struct {
+    // Deprecated: use ChatModelAgentState.ToolInfos instead
+    Tools []*schema.ToolInfo
+
+    // Model retry configuration
+    ModelRetryConfig *ModelRetryConfig

-    // CallID is the unique identifier for this specific tool call
-    CallID string
+    // Model failover configuration
+    ModelFailoverConfig *ModelFailoverConfig[*schema.Message]
 }
 ```

-### Usage Example: Tool Call Wrapping
+### ToolContext
+
+Metadata for tool wrapping:

 ```go
-func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) {
-    return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) {
-        log.Printf("Tool %s (call %s) starting with args: %s", tCtx.Name, tCtx.CallID, argumentsInJSON)
-
-        result, err := endpoint(ctx, argumentsInJSON, opts...)
-
-        if err != nil {
-            log.Printf("Tool %s failed: %v", tCtx.Name, err)
-            return "", err
-        }
-
-        log.Printf("Tool %s completed with result: %s", tCtx.Name, result)
-        return result, nil
-    }, nil
+type ToolContext struct {
+    Name   string // Tool name
+    CallID string // Unique identifier for this call
 }
 ```

 ---

-## ModelContext
+## Tool Call Endpoint Types

-`ModelContext` contains context information passed to `WrapModel`. Created at request time, contains tool configuration for the current model call.
+Tool wrapping uses function types rather than interfaces. The framework calls the corresponding Wrap method based on which interface the tool implements:

 ```go
-type ModelContext struct {
-    // Tools is the list of tools currently configured for the agent
-    // Populated at request time, contains the tools that will be sent to the model
-    Tools []*schema.ToolInfo
+// Standard tools
+type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error)
+type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error)

-    // ModelRetryConfig contains the retry configuration for the model
-    // Populated at request time from the agent's ModelRetryConfig
-    // Used by EventSenderModelWrapper to appropriately wrap stream errors
-    ModelRetryConfig *ModelRetryConfig
-}
+// Enhanced tools (using ToolArgument/ToolResult)
+type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error)
+type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error)
 ```

-### Usage Example: Model Wrapping
-
-```go
-func (h *MyHandler) WrapModel(ctx context.Context, m model.BaseChatModel, mc *adk.ModelContext) (model.BaseChatModel, error) {
-    return &myModelWrapper{
-
inner: m, - tools: mc.Tools, - }, nil -} - -type myModelWrapper struct { - inner model.BaseChatModel - tools []*schema.ToolInfo -} +> 💡 +> Each Wrap method is **only called when the tool implements the corresponding interface**. For example, if a tool only implements `InvokableTool`, only `WrapInvokableToolCall` is called, not `WrapStreamableToolCall`. -func (w *myModelWrapper) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - log.Printf("Model called with %d tools", len(w.tools)) - return w.inner.Generate(ctx, msgs, opts...) -} +--- -func (w *myModelWrapper) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return w.inner.Stream(ctx, msgs, opts...) -} -``` +## Execution Order ---- +### Model Call Lifecycle (outer to inner) + +1. ~~AgentMiddleware.BeforeChatModel~~ (**Deprecated**, will be removed) +2. **ChatModelAgentMiddleware.BeforeModelRewriteState** +3. `failoverModelWrapper` (internal — model failover, if configured) +4. `retryModelWrapper` (internal — failure retry) +5. `eventSenderModelWrapper` preprocessing (internal — prepares event sending) +6. **ChatModelAgentMiddleware.WrapModel** preprocessing (first registered → executes first) +7. `callbackInjectionModelWrapper` (internal) +8. **Model.Generate / Stream** +9. `callbackInjectionModelWrapper` postprocessing +10. **ChatModelAgentMiddleware.WrapModel** postprocessing (first registered → executes last) +11. `eventSenderModelWrapper` postprocessing +12. `retryModelWrapper` postprocessing +13. **ChatModelAgentMiddleware.AfterModelRewriteState** +14. ~~AgentMiddleware.AfterChatModel~~ (**Deprecated**, will be removed) + +### Tool Call Lifecycle (outer to inner) + +1. `eventSenderToolHandler` (internal — sends tool result event) +2. `ToolsConfig.ToolCallMiddlewares` +3. ~~AgentMiddleware.WrapToolCall~~ (**Deprecated**, will be removed) +4. 
**ChatModelAgentMiddleware.WrapXxxToolCall** (first registered → outermost) +5. `cancelMonitoredToolHandler` (internal — cancel monitoring) +6. **Tool.InvokableRun / StreamableRun** ## Run-Local Storage API -`SetRunLocalValue`, `GetRunLocalValue`, and `DeleteRunLocalValue` provide the ability to store, retrieve, and delete values during the current agent Run() call. +Store and retrieve key-value pairs during the current agent `Run()`. Values are compatible with interrupt/resume — they are serialized and persisted with checkpoints. ```go -// SetRunLocalValue sets a key-value pair that persists during the current agent Run() call -// The value is scoped to this specific execution and is not shared between different Run() calls or agent instances -// -// Values stored here are compatible with interrupt/resume cycles - they are serialized and restored when the agent resumes -// For custom types, they must be registered in init() using schema.RegisterName[T]() to ensure proper serialization -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func SetRunLocalValue(ctx context.Context, key string, value any) error - -// GetRunLocalValue retrieves a value set during the current agent Run() call -// The value is scoped to this specific execution and is not shared between different Run() calls or agent instances -// -// Values stored via SetRunLocalValue are compatible with interrupt/resume cycles - they are serialized and restored when the agent resumes -// For custom types, they must be registered in init() using schema.RegisterName[T]() to ensure proper serialization -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns (value, true, nil) if found, (nil, false, nil) if not found, -// returns error if called outside of agent execution context func GetRunLocalValue(ctx context.Context, key string) 
(any, bool, error) - -// DeleteRunLocalValue deletes a value set during the current agent Run() call -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func DeleteRunLocalValue(ctx context.Context, key string) error ``` -### Usage Example: Sharing Data Across Handler Points +> 💡 +> Custom types must be registered in `init()` via `schema.RegisterName[T]()` to ensure correct gob serialization. These functions can only be called within `ChatModelAgentMiddleware` callbacks. + +### Example: Sharing State Across Callbacks ```go func init() { - schema.RegisterName[*MyCustomData]("my_package.MyCustomData") + schema.RegisterName[*ToolStats]("mypackage.ToolStats") } -type MyCustomData struct { +type ToolStats struct { Count int Name string } -type MyHandler struct { +type MyMiddleware struct { *adk.BaseChatModelAgentMiddleware } -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - result, err := endpoint(ctx, argumentsInJSON, opts...) - - data := &MyCustomData{Count: 1, Name: tCtx.Name} - if err := adk.SetRunLocalValue(ctx, "my_handler.last_tool", data); err != nil { - log.Printf("Failed to set run local value: %v", err) - } - +// Record statistics after tool calls +func (m *MyMiddleware) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ + _ = adk.SetRunLocalValue(ctx, "last_tool", &ToolStats{Count: 1, Name: tCtx.Name}) return result, err }, nil } -func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - if val, found, err := adk.GetRunLocalValue(ctx, "my_handler.last_tool"); err == nil && found { - if data, ok := val.(*MyCustomData); ok { - log.Printf("Last tool was: %s (count: %d)", data.Name, data.Count) +// Read statistics after model calls +func (m *MyMiddleware) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + if val, found, _ := adk.GetRunLocalValue(ctx, "last_tool"); found { + if stats, ok := val.(*ToolStats); ok { + log.Printf("Last tool: %s (count=%d)", stats.Name, stats.Count) } } return ctx, state, nil @@ -303,226 +262,79 @@ func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatM ## SendEvent API -`SendEvent` allows sending custom `AgentEvent` to the event stream during agent execution. +Send custom `AgentEvent` to the event stream during agent execution; callers iterating over the event stream will receive them: ```go -// SendEvent sends a custom AgentEvent to the event stream during agent execution -// Allows ChatModelAgentMiddleware implementations to emit custom events, -// which will be received by callers iterating over the agent's event stream -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func SendEvent(ctx context.Context, event *AgentEvent) error ``` ---- - -## State Type (To Be Deprecated) - -`State` holds agent runtime state, including messages and user-extensible storage. - -**⚠️ Deprecation Warning:** This type will be made unexported in v1.0.0. 
Please use `ChatModelAgentState` in `ChatModelAgentMiddleware` and `AgentMiddleware` callbacks. Direct use of `compose.ProcessState[*State]` is not recommended and will stop working in v1.0.0; please use the handler API instead. - -```go -type State struct { - Messages []Message - extra map[string]any // unexported, access via SetRunLocalValue/GetRunLocalValue - - // The following are internal fields - do not access directly - // Kept exported for backward compatibility with existing checkpoints - ReturnDirectlyToolCallID string - ToolGenActions map[string]*AgentAction - AgentName string - RemainingIterations int - - internals map[string]any -} -``` - ---- - -## Architecture Diagram - -The following diagram shows how `ChatModelAgentMiddleware` works during `ChatModelAgent` execution: - -``` -Agent.Run(input) - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ BeforeAgent(ctx, *ChatModelAgentContext) │ -│ Input: Current Instruction, Tools and other Agent runtime env │ -│ Output: Modified Agent runtime env │ -│ Purpose: Called once at Run start, modifies config for entire Run │ -│ lifecycle │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ ReAct Loop │ -│ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ BeforeModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ Input: Persistent state like message history, plus Model │ │ │ -│ │ │ runtime env │ │ │ -│ │ │ Output: Modified persistent state, returns new ctx │ │ │ -│ │ │ Purpose: Modify persistent state across iterations │ │ │ -│ │ │ (mainly message list) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ WrapModel(ctx, 
BaseChatModel, *ModelContext) │ │ │ -│ │ │ Input: ChatModel being wrapped, plus Model runtime env │ │ │ -│ │ │ Output: Wrapped Model (onion model) │ │ │ -│ │ │ Purpose: Modify input, output and config for single │ │ │ -│ │ │ Model request │ │ │ -│ │ │ │ │ │ │ -│ │ │ ▼ │ │ │ -│ │ │ ┌───────────────┐ │ │ │ -│ │ │ │ Model │ │ │ │ -│ │ │ │ Generate/Stream│ │ │ │ -│ │ │ └───────────────┘ │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ AfterModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ Input: Persistent state like message history (with Model │ │ │ -│ │ │ response), plus Model runtime env │ │ │ -│ │ │ Output: Modified persistent state │ │ │ -│ │ │ Purpose: Modify persistent state across iterations │ │ │ -│ │ │ (mainly message list) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌──────────────────┐ │ │ -│ │ │ Model return? 
│ │ │ -│ │ └──────────────────┘ │ │ -│ │ │ │ │ │ -│ │ Final response│ │ ToolCalls │ │ -│ │ │ ▼ │ │ -│ │ │ ┌─────────────────────────────────────┐ │ │ -│ │ │ │ WrapInvokableToolCall / WrapStream │ │ │ -│ │ │ │ ableToolCall(ctx, endpoint, *TC) │ │ │ -│ │ │ │ Input: Tool being wrapped plus │ │ │ -│ │ │ │ Tool runtime env │ │ │ -│ │ │ │ Output: Wrapped endpoint │ │ │ -│ │ │ │ (onion model) │ │ │ -│ │ │ │ Purpose: Modify input, output │ │ │ -│ │ │ │ and config for single │ │ │ -│ │ │ │ Tool request │ │ │ -│ │ │ │ │ │ │ │ -│ │ │ │ ▼ │ │ │ -│ │ │ │ ┌─────────────┐ │ │ │ -│ │ │ │ │ Tool.Run() │ │ │ │ -│ │ │ │ └─────────────┘ │ │ │ -│ │ │ └─────────────────────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ │ │ (Result added to Messages) │ │ -│ │ │ │ │ │ -│ │ │ ┌─────────┘ │ │ -│ │ │ │ │ │ -│ │ │ └──────────► Continue loop │ │ -│ │ │ │ │ -│ └─────────────────────┼─────────────────────────────────────────────┘ │ -│ │ │ -│ ▼ │ -│ Loop until complete or maxIterations reached │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ - Agent.Run() ends -``` - -### Handler Method Description - - - - - - - - - -
| Method | Input | Output | Scope |
| --- | --- | --- | --- |
| `BeforeAgent` | Agent runtime env (`*ChatModelAgentContext`) | Modified Agent runtime env | Entire Run lifecycle, called only once |
| `BeforeModelRewriteState` | Persistent state + Model runtime env | Modified persistent state | Persistent state across iterations (message list) |
| `WrapModel` | ChatModel being wrapped + Model runtime env | Wrapped Model | Single Model request input, output and config |
| `AfterModelRewriteState` | Persistent state (with response) + Model runtime env | Modified persistent state | Persistent state across iterations (message list) |
| `WrapInvokableToolCall` | Tool being wrapped + Tool runtime env | Wrapped endpoint | Single Tool request input, output and config |
| `WrapStreamableToolCall` | Tool being wrapped + Tool runtime env | Wrapped endpoint | Single Tool request input, output and config |
    +Can only be called within `ChatModelAgentMiddleware` callbacks. --- -## Execution Order +## State Type -### Model Call Lifecycle (wrapper chain from outer to inner) - -1. `AgentMiddleware.BeforeChatModel` (hook, runs before model call) -2. `ChatModelAgentMiddleware.BeforeModelRewriteState` (hook, can modify state before model call) -3. `retryModelWrapper` (internal - retries on failure, if configured) -4. `eventSenderModelWrapper` preprocessing (internal - prepares event sending) -5. `ChatModelAgentMiddleware.WrapModel` preprocessing (wrapper, wrapped at request time, first registered runs first) -6. `callbackInjectionModelWrapper` (internal - injects callbacks if not enabled) -7. `Model.Generate/Stream` -8. `callbackInjectionModelWrapper` postprocessing -9. `ChatModelAgentMiddleware.WrapModel` postprocessing (wrapper, first registered runs last) -10. `eventSenderModelWrapper` postprocessing (internal - sends model response event) -11. `retryModelWrapper` postprocessing (internal - handles retry logic) -12. `ChatModelAgentMiddleware.AfterModelRewriteState` (hook, can modify state after model call) -13. `AgentMiddleware.AfterChatModel` (hook, runs after model call) - -### Tool Call Lifecycle (from outer to inner) - -1. `eventSenderToolHandler` (internal ToolMiddleware - sends tool result event after all processing) -2. `ToolsConfig.ToolCallMiddlewares` (ToolMiddleware) -3. `AgentMiddleware.WrapToolCall` (ToolMiddleware) -4. `ChatModelAgentMiddleware.WrapInvokableToolCall/WrapStreamableToolCall` (wrapped at request time, first registered is outermost) -5. `Tool.InvokableRun/StreamableRun` +> 💡 +> `State` remains exported only for backward compatibility with checkpoints. **Do not use it directly** — use `ChatModelAgentState` in `ChatModelAgentMiddleware` callbacks, and replace the former `State.Extra` with `SetRunLocalValue/GetRunLocalValue`. The `compose.ProcessState[*State]` usage will stop working in v1.0.0. 
--- ## Migration Guide -### Migrating from AgentMiddleware to ChatModelAgentMiddleware +### Migrating from compose.ProcessState[*State] -**Before (AgentMiddleware):** +**Before:** ```go -middleware := adk.AgentMiddleware{ - BeforeChatModel: func(ctx context.Context, state *adk.ChatModelAgentState) error { - return nil - }, -} +compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { + st.Extra["myKey"] = myValue + return nil +}) ``` -**After (ChatModelAgentMiddleware):** +**After:** ```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware +// Write +if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { + return ctx, state, err } -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - newCtx := context.WithValue(ctx, myKey, myValue) - return newCtx, state, nil +// Read +if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { + // use val } ``` -### Migrating from compose.ProcessState[*State] +### Adapting to AfterAgent (new in v0.9) -**Before:** +`AfterAgent` is called after the agent **terminates successfully** (final answer or return-directly tool result) and can be used for post-processing: ```go -compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { - st.Extra["myKey"] = myValue - return nil -}) +func (m *MyMiddleware) AfterAgent(ctx context.Context, state *adk.ChatModelAgentState) (context.Context, error) { + log.Printf("Agent completed, %d messages total", len(state.Messages)) + // Auditing, statistics, cleanup, etc. + return ctx, nil +} ``` -**After (using SetRunLocalValue/GetRunLocalValue):** +> 💡 +> `AfterAgent` is called in registration order (same as `BeforeAgent`). If any handler returns an error, subsequent handlers are not called (fail-fast), and the error is sent to the event stream. 
-```go -if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { - return ctx, state, err -} +### Adapting to ToolInfos / DeferredToolInfos (new in v0.9) -if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { +`ChatModelAgentState` now includes `ToolInfos` and `DeferredToolInfos` fields, replacing `ModelContext.Tools` as the source of truth for tool configuration: + +```go +func (m *MyMiddleware) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + // Dynamically filter tools + filtered := make([]*schema.ToolInfo, 0, len(state.ToolInfos)) + for _, t := range state.ToolInfos { + if shouldInclude(t.Name) { + filtered = append(filtered, t) + } + } + state.ToolInfos = filtered + return ctx, state, nil } ``` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md index 5eb63ed655d..24715bb0863 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md @@ -1,125 +1,162 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem Backend weight: 1 --- -> 💡 -> Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) +> 💡Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) -## Background and Goals +## Background and Purpose -In AI Agent scenarios, the agent often needs to interact with a filesystem: reading file content, searching code, editing configs, executing commands, and so on. 
However, different runtime environments access the filesystem very differently: +AI Agents need to interact with the filesystem (reading, searching, editing, executing commands), but access methods vary significantly across runtime environments: local disk, remote sandbox, in-memory simulation, object storage, etc. If file operation logic is implemented separately for each environment, Middleware/Agent code becomes tightly coupled to the underlying storage. -- **Local development**: operate on the local filesystem directly, works out of the box -- **Cloud sandbox**: operate on an isolated sandbox filesystem via remote APIs, requires authentication and networking -- **Testing**: use an in-memory simulated filesystem without real disk I/O -- **Custom storage**: integrate with OSS, databases, or other non-traditional “filesystems” +The `filesystem.Backend` interface solves this problem — serving as a **unified filesystem operation protocol**: -If each environment implements its own set of file operations, middleware and agent code become tightly coupled to the underlying storage implementation, making reuse and testing difficult. - -To address this, Eino ADK defines the `filesystem.Backend` interface as a **unified filesystem operation protocol**. Its design goals are: - -1. **Decouple storage from business logic**: middleware depends only on the Backend interface and does not care whether the underlying implementation is local disk, a remote sandbox, or an in-memory mock -2. **Pluggable replacement**: by switching Backend implementations, the same agent can run in different environments without changing any business code -3. **Testability**: a built-in `InMemoryBackend` makes it easy to simulate filesystem behavior in unit tests -4. **Extensibility**: all methods use struct parameters, so adding new fields in the future won’t break compatibility for existing implementations +1. 
**Decouple storage from business logic** — Middleware depends only on the interface, regardless of the underlying implementation +2. **Pluggable replacement** — Switching Backend allows running in different environments without modifying business code +3. **Easy to test** — Built-in `InMemoryBackend` requires no real disk I/O +4. **Forward compatible** — All methods use struct parameters; adding new fields won't break existing implementations ## Backend Interface ```go type Backend interface { - // List files and directories under the given path LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - // Read file content, supports line-based pagination (offset + limit) Read(ctx context.Context, req *ReadRequest) (*FileContent, error) - // Search for matches of pattern under the given path and return the match list GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) - // Find matching files by glob pattern and base path GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) - // Write or create a file Write(ctx context.Context, req *WriteRequest) error - // Replace string content in a file Edit(ctx context.Context, req *EditRequest) error } ``` -### Extension Interfaces + + + + + + + + +
| Method | Function | Returns |
| --- | --- | --- |
| `LsInfo` | List files and directories under the specified path | `[]FileInfo` |
| `Read` | Read file content, supports line-based pagination (offset + limit) | `*FileContent` |
| `GrepRaw` | Search for content matching a pattern within files | `[]GrepMatch` |
| `GlobInfo` | Find matching files by glob pattern | `[]FileInfo` |
| `Write` | Write or create a file | `error` |
| `Edit` | Replace string content within a file | `error` |
    -Besides the core file operations, a Backend can optionally implement shell command execution: +## Extension Interfaces + +### Shell / StreamingShell + +A Backend can optionally implement command execution capabilities. When a Backend also implements `Shell` or `StreamingShell`, the Filesystem Middleware will additionally register the `execute` tool. The two are **mutually exclusive** and cannot be configured simultaneously. ```go -// Shell provides synchronous command execution type Shell interface { Execute(ctx context.Context, input *ExecuteRequest) (result *ExecuteResponse, err error) } -// StreamingShell provides streaming command execution for long-running commands type StreamingShell interface { ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (result *schema.StreamReader[*ExecuteResponse], err error) } ``` -When a Backend implements `Shell` or `StreamingShell`, the FileSystem middleware additionally registers the `execute` tool so the agent can run shell commands. +### MultiModalReader + +An optional extension interface that supports multi-modal file reading (images, PDFs, etc.), returning structured `MultiFileContent`. + +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` + +When the Backend implements this interface and the Middleware is configured with `UseMultiModalRead = true`, the `read_file` tool will use multi-modal reading. + +## Core Data Types -### Core Data Types +### Request Types - - - - - - - - - - + + + + + + + + +
| Type | Description |
| --- | --- |
| `FileInfo` | File/directory info: path, isDir, size, modified time |
| `FileContent` | File content with line number information |
| `GrepMatch` | Search match: content, path, line number |
| `ReadRequest` | Read request: path, offset (1-based line), limit (line count) |
| `GrepRequest` | Search request: pattern (regex), path, glob filter, file type filters, etc. |
| `WriteRequest` | Write request: path, content |
| `EditRequest` | Edit request: path, old string, new string, replace all |
| `ExecuteRequest` | Command request: command string, background flag |
| `ExecuteResponse` | Command result: stdout/stderr, exit code, truncated flag |
| Type | Fields | Description |
| --- | --- | --- |
| `LsInfoRequest` | `Path string` | Directory path to list |
| `ReadRequest` | `FilePath string`<br>`Offset int`<br>`Limit int` | File path; starting line number (1-based, <1 treated as 1); maximum number of lines to read (0=all) |
| `MultiModalReadRequest` | Embeds `ReadRequest`<br>`Pages string` | Inherits all ReadRequest fields; Pages specifies PDF page range (e.g. "1-5", "3") |
| `GrepRequest` | `Pattern string`<br>`Path string`<br>`Glob string`<br>`FileType string`<br>`CaseInsensitive bool`<br>`EnableMultiline bool`<br>`AfterLines int`<br>`BeforeLines int` | Regex search pattern (ripgrep syntax); search directory; glob file filter; file type filter (e.g. "go", "py"); case insensitive; enable multiline matching; show N lines after match; show N lines before match |
| `GlobInfoRequest` | `Pattern string`<br>`Path string` | Glob expression (supports `*`, `**`, `?`, `[abc]`); starting directory for search |
| `WriteRequest` | `FilePath string`<br>`Content string` | Target file path; content to write |
| `EditRequest` | `FilePath string`<br>`OldString string`<br>`NewString string`<br>`ReplaceAll bool` | File path; exact string to replace (non-empty); replacement string; when false, requires OldString to appear exactly once in the file |
| `ExecuteRequest` | `Command string`<br>`RunInBackendGround bool` | Command string to execute; whether to run in background |
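The `Offset`/`Limit` and `ReplaceAll` rules in the table above can be sketched in plain Go. This is an illustration of the documented semantics only, not the eino implementation: `paginate` and `replace` are hypothetical helpers, and two behaviors are assumptions rather than documented facts (an offset past the end returns empty content, and a missing `OldString` is an error even when `ReplaceAll` is true).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// paginate returns up to limit lines starting at the 1-based line offset.
// An offset below 1 is treated as 1; a limit of 0 means "all lines".
func paginate(content string, offset, limit int) string {
	if offset < 1 {
		offset = 1
	}
	lines := strings.Split(content, "\n")
	if offset > len(lines) {
		return "" // assumption: reading past the end yields empty content
	}
	lines = lines[offset-1:]
	if limit > 0 && limit < len(lines) {
		lines = lines[:limit]
	}
	return strings.Join(lines, "\n")
}

// replace mimics the EditRequest rules: oldStr must be non-empty, and when
// replaceAll is false it must occur exactly once in content.
func replace(content, oldStr, newStr string, replaceAll bool) (string, error) {
	if oldStr == "" {
		return "", errors.New("old string must be non-empty")
	}
	n := strings.Count(content, oldStr)
	if n == 0 {
		return "", errors.New("old string not found") // assumption
	}
	if !replaceAll && n != 1 {
		return "", fmt.Errorf("old string appears %d times; expected exactly once", n)
	}
	return strings.ReplaceAll(content, oldStr, newStr), nil
}

func main() {
	text := "Hello, World!\nLine 2\nLine 3"
	fmt.Println(paginate(text, 2, 1)) // prints "Line 2"

	// "Line" occurs twice, so a single-occurrence edit is rejected.
	_, err := replace(text, "Line", "Row", false)
	fmt.Println(err != nil) // prints "true"
}
```

Requiring a unique match when `ReplaceAll` is false guards against an edit landing on the wrong occurrence; the caller can add surrounding context to `OldString` until it is unambiguous.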
    +### Response Types + + + + + + + + + +
| Type | Fields | Description |
| --- | --- | --- |
| `FileInfo` | `Path string`<br>`IsDir bool`<br>`Size int64`<br>`ModifiedAt string` | File/directory path; whether it is a directory; file size (bytes); last modified time (ISO 8601 format) |
| `FileContent` | `Content string` | Plain text content of the file |
| `MultiFileContent` | `*FileContent`<br>`Parts []FileContentPart` | Embeds FileContent; multi-modal output parts. Parts and FileContent are mutually exclusive: when Parts is non-empty, FileContent is ignored |
| `FileContentPart` | `Type FileContentPartType`<br>`MIMEType string`<br>`Data []byte` | Content type (`"image"` or `"pdf"`); MIME type (e.g. "image/png"); raw binary data |
| `GrepMatch` | `Content string`<br>`Path string`<br>`Line int` | Matched line content; file path; 1-based line number |
| `ExecuteResponse` | `Output string`<br>`ExitCode *int`<br>`Truncated bool` | Command output; exit code (pointer, may be nil); whether output was truncated |
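Two details in the table above are easy to get wrong on the consumer side: `Parts` takes precedence over the embedded `FileContent`, and `ExitCode` is a pointer that may be nil. The sketch below shows one way a caller might handle both. The struct definitions are local mirrors of the documented fields, and `describeRead` and `describeExit` are hypothetical helpers, not part of the package.

```go
package main

import "fmt"

// Local mirrors of the documented response shapes, declared here only so
// the sketch is self-contained.
type FileContentPartType string

type FileContent struct{ Content string }

type FileContentPart struct {
	Type     FileContentPartType
	MIMEType string
	Data     []byte
}

type MultiFileContent struct {
	*FileContent
	Parts []FileContentPart
}

type ExecuteResponse struct {
	Output    string
	ExitCode  *int
	Truncated bool
}

// describeRead prefers Parts when present and ignores the text content,
// following the mutual-exclusion rule in the table.
func describeRead(m *MultiFileContent) string {
	if len(m.Parts) > 0 {
		return fmt.Sprintf("%d part(s), first MIME type %s", len(m.Parts), m.Parts[0].MIMEType)
	}
	if m.FileContent != nil { // the embedded pointer may be nil
		return m.FileContent.Content
	}
	return ""
}

// describeExit checks the pointer before dereferencing it.
func describeExit(r *ExecuteResponse) string {
	if r.ExitCode == nil {
		return "exit code unknown"
	}
	return fmt.Sprintf("exit code %d", *r.ExitCode)
}

func main() {
	img := &MultiFileContent{
		FileContent: &FileContent{Content: "ignored"},
		Parts:       []FileContentPart{{Type: "image", MIMEType: "image/png"}},
	}
	fmt.Println(describeRead(img))
	fmt.Println(describeExit(&ExecuteResponse{Output: "still running"}))
}
```

Checking the embedded `*FileContent` for nil before reading `Content` avoids a panic when a response carries only `Parts`.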
    + +### Constants + +```go +type FileContentPartType string + +const ( + FileContentPartTypeImage FileContentPartType = "image" + FileContentPartTypePDF FileContentPartType = "pdf" +) +``` + ## Built-in Implementation: InMemoryBackend -`InMemoryBackend` is a built-in Backend implementation that stores files in an in-memory map, mainly used for: +`InMemoryBackend` stores files in an in-memory map, primarily used for: -- **Unit tests**: test agent and middleware file operations without a real filesystem -- **Lightweight scenarios**: temporary file operations without persistence -- **Large tool result offloading**: the FileSystem middleware’s large tool result offloading feature uses InMemoryBackend by default +- **Unit testing** — Test Agent/Middleware file operation logic without a real filesystem +- **Lightweight scenarios** — Temporary file operations without persistence +- **Tool result offloading** — The Filesystem Middleware's large tool result offloading feature uses InMemoryBackend by default + +### Constructor ```go -import "github.com/cloudwego/eino/adk/filesystem" +func NewInMemoryBackend() *InMemoryBackend +``` -ctx := context.Background() +Zero-parameter constructor that returns an empty in-memory filesystem. 
+ +### Usage Example + +```go backend := filesystem.NewInMemoryBackend() +ctx := context.Background() -// Write file -err := backend.Write(ctx, &filesystem.WriteRequest{ +// Write +_ = backend.Write(ctx, &filesystem.WriteRequest{ FilePath: "/example/test.txt", Content: "Hello, World!\nLine 2\nLine 3", }) -// Read file (paginated) -content, err := backend.Read(ctx, &filesystem.ReadRequest{ +// Read (paginated) +content, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/example/test.txt", Offset: 1, Limit: 10, }) // List directory -files, err := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/example", -}) +files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{Path: "/example"}) -// Search content (regex supported) -matches, err := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Pattern: "Hello", - Path: "/example", +// Search (regex) +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Pattern: "Hello", + Path: "/example", + CaseInsensitive: true, }) -// Edit file -err = backend.Edit(ctx, &filesystem.EditRequest{ +// Edit +_ = backend.Edit(ctx, &filesystem.EditRequest{ FilePath: "/example/test.txt", OldString: "Hello", NewString: "Hi", @@ -127,39 +164,42 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ }) ``` -Features: +### Implementation Features -- Thread-safe (based on `sync.RWMutex`) -- GrepRaw supports regex, case-insensitive, context lines, and other advanced options -- GrepRaw uses parallel processing internally (up to 10 workers) +- **Thread-safe** — Based on `sync.RWMutex`; read operations use read locks, write operations use write locks +- **GrepRaw parallel processing** — Launches up to 10 workers for parallel matching when searching multiple files +- **Regex support** — Full regex, case-insensitive (`(?i)` prefix), multiline mode +- **Context lines** — GrepRaw supports BeforeLines/AfterLines to show context around matches +- **Glob matching** — Uses the `doublestar` library to support `**` recursive matching +- 
**FileType mapping** — Built-in mapping table of 70+ file types to extensions (go, py, ts, rust, etc.) +- **No Shell implementation** — InMemoryBackend does not implement the Shell/StreamingShell interface ## External Implementations -The following Backend implementations live in the [eino-ext](https://github.com/cloudwego/eino-ext) repository: +The following Backend implementations are located in the [eino-ext](https://github.com/cloudwego/eino-ext) repository: -- **Local Backend** — a local filesystem implementation that operates on the host disk with zero configuration -- **Ark Agentkit Sandbox Backend** — a Volcengine Agentkit remote sandbox implementation that executes file operations in an isolated cloud environment +- **Local Backend** (`github.com/cloudwego/eino-ext/adk/backend/local`) — Local filesystem implementation that operates directly on the host disk +- **Ark Agentkit Sandbox** (`github.com/cloudwego/eino-ext/adk/backend/agentkit`) — Volcengine Agentkit remote sandbox implementation ### Implementation Comparison - + - - + + +
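<tr><td>Feature</td><td>InMemory</td><td>Local</td><td>Agentkit Sandbox</td></tr>
<tr><td>Execution model</td><td>In-memory</td><td>Local direct</td><td>Remote sandbox</td></tr>
<tr><td>Network dependency</td><td>No</td><td>No</td><td>Yes</td></tr>
<tr><td>Network dependency</td><td>None</td><td>None</td><td>Required</td></tr>
<tr><td>Configuration complexity</td><td>Zero config</td><td>Zero config</td><td>Credentials required</td></tr>
<tr><td>Persistence</td><td>No</td><td>Yes</td><td>Yes</td></tr>
<tr><td>Shell support</td><td>No</td><td>Yes (including streaming)</td><td>Yes</td></tr>
<tr><td>Use cases</td><td>Tests/temporary</td><td>Development/local</td><td>Multi-tenant/production</td></tr>
<tr><td>Shell support</td><td>No</td><td>Shell + StreamingShell</td><td>Shell</td></tr>
<tr><td>MultiModalReader</td><td>No</td><td>Implementation dependent</td><td>Implementation dependent</td></tr>
<tr><td>Use cases</td><td>Testing / temporary storage</td><td>Development / local environment</td><td>Multi-tenant / production</td></tr>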
    -## Custom Implementations +## Custom Implementation -To integrate custom storage (e.g. OSS, databases), you only need to implement the `Backend` interface: +Implement the `Backend` interface to integrate with custom storage. For command execution, additionally implement `Shell` or `StreamingShell`; for multi-modal reading, implement `MultiModalReader`. ```go -type MyBackend struct { - // ... -} +type MyBackend struct { /* ... */ } func (b *MyBackend) LsInfo(ctx context.Context, req *filesystem.LsInfoRequest) ([]filesystem.FileInfo, error) { // Custom implementation @@ -169,7 +209,29 @@ func (b *MyBackend) Read(ctx context.Context, req *filesystem.ReadRequest) (*fil // Custom implementation } -// ... implement the remaining methods -``` +func (b *MyBackend) GrepRaw(ctx context.Context, req *filesystem.GrepRequest) ([]filesystem.GrepMatch, error) { + // Custom implementation +} + +func (b *MyBackend) GlobInfo(ctx context.Context, req *filesystem.GlobInfoRequest) ([]filesystem.FileInfo, error) { + // Custom implementation +} -If you also need command execution, implement `Shell` or `StreamingShell` as well. 
+func (b *MyBackend) Write(ctx context.Context, req *filesystem.WriteRequest) error { + // Custom implementation +} + +func (b *MyBackend) Edit(ctx context.Context, req *filesystem.EditRequest) error { + // Custom implementation +} + +// Optional: implement Shell +func (b *MyBackend) Execute(ctx context.Context, input *filesystem.ExecuteRequest) (*filesystem.ExecuteResponse, error) { + // Custom implementation +} + +// Optional: implement MultiModalReader +func (b *MyBackend) MultiModalRead(ctx context.Context, req *filesystem.MultiModalReadRequest) (*filesystem.MultiFileContent, error) { + // Custom implementation +} +``` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md index 7f10f023cdd..77cd1377039 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Ark Agentkit Sandbox @@ -15,7 +15,7 @@ Note: If your eino version is v0.8.0 or above, you need to use ark agentkit back ### Overview -Agentkit Sandbox Backend is a remote sandbox implementation of EINO ADK FileSystem that executes file system operations in an isolated cloud environment through Volcengine Agentkit service. +Agentkit Sandbox Backend is a remote sandbox implementation of EINO ADK FileSystem that executes filesystem operations in an isolated cloud environment through the Volcengine Agentkit service. 
#### Core Features @@ -119,7 +119,7 @@ middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ // Create Agent agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Name: "SandboxAgent", - Description: "AI Agent with secure file system access capabilities", + Description: "AI Agent with secure filesystem access capabilities", Model: chatModel, Handlers: []adk.ChatModelAgentMiddleware{middleware}, }) @@ -147,7 +147,7 @@ files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ }) // Read file (paginated) -content, _ := backend.Read(ctx, &filesystem.ReadRequest{ +fcontent, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/home/gem/file.txt", Offset: 0, Limit: 100, @@ -184,18 +184,18 @@ result, _ := backend.Execute(ctx, &filesystem.ExecuteRequest{ - - - - - + + + + +
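<tr><td>Feature</td><td>Agentkit</td><td>Local</td></tr>
<tr><td>Execution Model</td><td>Remote Sandbox</td><td>Local Direct</td></tr>
<tr><td>Network Dependency</td><td>Required</td><td>Not Required</td></tr>
<tr><td>Configuration Complexity</td><td>Requires Credentials</td><td>Zero Config</td></tr>
<tr><td>Security Model</td><td>Isolated Sandbox</td><td>OS Permissions</td></tr>
<tr><td>Use Cases</td><td>Multi-tenant/Production</td><td>Development/Local</td></tr>
<tr><td>Execution model</td><td>Remote sandbox</td><td>Local direct</td></tr>
<tr><td>Network dependency</td><td>Required</td><td>Not required</td></tr>
<tr><td>Configuration complexity</td><td>Credentials required</td><td>Zero config</td></tr>
<tr><td>Security model</td><td>Isolated sandbox</td><td>OS permissions</td></tr>
<tr><td>Use cases</td><td>Multi-tenant / production</td><td>Development / local environment</td></tr>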
    ### FAQ **Q: Authentication failed** -Check environment variables, verify AK/SK match, and ensure account has Ark Sandbox permissions. +Check environment variables, verify AK/SK match, and ensure the account has Ark Sandbox permissions. **Q: Request timeout** diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md new file mode 100644 index 00000000000..10a5a76214d --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md @@ -0,0 +1,201 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Local Filesystem +weight: 2 +--- + +## Local Backend + +**Package**: `github.com/cloudwego/eino-ext/adk/backend/local` + +> 💡 +> eino v0.8.0+ requires local backend v0.2.1 or above. + +Local Backend is the local implementation of Eino ADK FileSystem, directly operating on the local file system. It implements both the `filesystem.Backend` (file operations) and `filesystem.StreamingShell` (streaming command execution) interfaces. + +**Core Features**: Zero configuration, native performance, enforced absolute paths, streaming command execution, optional command validation. + +--- + +## Installation + +```bash +go get github.com/cloudwego/eino-ext/adk/backend/local +``` + +## Configuration + +```go +type Config struct { + // Optional: command validation function for ExecuteStreaming security control. + // Rejects execution when returning non-nil error. 
+ ValidateCommand func(string) error +} +``` + +## Quick Start + +```go +backend, err := local.NewBackend(ctx, &local.Config{}) + +// Write file (must be absolute path; overwrites if file exists) +err = backend.Write(ctx, &filesystem.WriteRequest{ + FilePath: "/tmp/hello.txt", + Content: "Hello, Local Backend!", +}) + +// Read file (supports line-level pagination) +fc, err := backend.Read(ctx, &filesystem.ReadRequest{ + FilePath: "/tmp/hello.txt", + Offset: 1, // Starting line number (1-based) + Limit: 50, // Maximum lines, 0 means all +}) +``` + +### Integration with Agent + +```go +import ( + "github.com/cloudwego/eino/adk" + fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" + "github.com/cloudwego/eino-ext/adk/backend/local" +) + +backend, _ := local.NewBackend(ctx, &local.Config{}) + +middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ + Backend: backend, // Required: registers ls/read/write/edit/glob/grep tools + StreamingShell: backend, // Optional: registers streaming execute tool +}) + +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +> 💡 +> `Shell` and `StreamingShell` in the middleware Config are mutually exclusive. Local Backend only implements `StreamingShell` (streaming command execution), not non-streaming `Shell`. + +--- + +## Implemented Interfaces and Methods + +### filesystem.Backend + + + + + + + + + +
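<tr><td>Method</td><td>Signature</td><td>Description</td></tr>
<tr><td><code>LsInfo</code></td><td><code>(ctx, *LsInfoRequest) ([]FileInfo, error)</code></td><td>List directory contents</td></tr>
<tr><td><code>Read</code></td><td><code>(ctx, *ReadRequest) (*FileContent, error)</code></td><td>Read file, supports line-level pagination (Offset 1-based, Limit 0=all)</td></tr>
<tr><td><code>Write</code></td><td><code>(ctx, *WriteRequest) error</code></td><td>Write file; auto-creates parent directories; overwrites if file exists</td></tr>
<tr><td><code>Edit</code></td><td><code>(ctx, *EditRequest) error</code></td><td>String replacement; supports <code>ReplaceAll</code>; errors if <code>OldString</code> is not unique (non-ReplaceAll mode)</td></tr>
<tr><td><code>GrepRaw</code></td><td><code>(ctx, *GrepRequest) ([]GrepMatch, error)</code></td><td>ripgrep-based search, supports full regex syntax; supports case-insensitive, multiline matching, context lines</td></tr>
<tr><td><code>GlobInfo</code></td><td><code>(ctx, *GlobInfoRequest) ([]FileInfo, error)</code></td><td>Glob pattern file matching, supports <code>*</code> / <code>**</code> / <code>?</code> / <code>[abc]</code></td></tr>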
    + +### filesystem.StreamingShell + + + + +
<tr><td>Method</td><td>Signature</td><td>Description</td></tr>
<tr><td><code>ExecuteStreaming</code></td><td><code>(ctx, *ExecuteRequest) (*StreamReader[*ExecuteResponse], error)</code></td><td>Streaming shell command execution with real-time output; supports background execution (<code>RunInBackendGround</code>)</td></tr>
    + +--- + +## Usage Examples + +### Search Content (Regex) + +```go +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Path: "/home/user/project", + Pattern: "TODO|FIXME", // ripgrep regex syntax + Glob: "*.go", + CaseInsensitive: true, +}) +``` + +### Edit File + +```go +backend.Edit(ctx, &filesystem.EditRequest{ + FilePath: "/tmp/file.txt", + OldString: "old text", + NewString: "new text", + ReplaceAll: true, +}) +``` + +### Streaming Command Execution + +```go +reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ + Command: "tail -f /var/log/app.log", +}) +for { + resp, err := reader.Recv() + if err == io.EOF { + break + } + fmt.Print(resp.Output) +} +``` + +### With Command Validation + +```go +backend, _ := local.NewBackend(ctx, &local.Config{ + ValidateCommand: func(cmd string) error { + allowed := map[string]bool{"ls": true, "cat": true, "grep": true} + parts := strings.Fields(cmd) + if len(parts) == 0 || !allowed[parts[0]] { + return fmt.Errorf("command not allowed: %s", parts[0]) + } + return nil + }, +}) +``` + +--- + +## Path Requirements + +All file paths must be absolute paths (starting with `/`). Relative paths can be converted via `filepath.Abs()`. + +--- + +## Comparison with Agentkit Backend + + + + + + + + + + +
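<tr><td>Feature</td><td>Local</td><td>Agentkit</td></tr>
<tr><td>Execution model</td><td>Local direct</td><td>Remote sandbox</td></tr>
<tr><td>Network dependency</td><td>None</td><td>Required</td></tr>
<tr><td>Configuration complexity</td><td>Zero configuration</td><td>Requires credentials</td></tr>
<tr><td>Security model</td><td>OS permissions + ValidateCommand</td><td>Isolated sandbox</td></tr>
<tr><td>Streaming output</td><td>Supported (StreamingShell)</td><td>Not supported</td></tr>
<tr><td>Platform support</td><td>Unix/Linux/macOS</td><td>Any</td></tr>
<tr><td>Use case</td><td>Development/local environments</td><td>Multi-tenant/production environments</td></tr>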
    + +--- + +## FAQ + +**Q: Does GrepRaw support regex?** + +A: Yes. It uses ripgrep (`rg`) underneath and supports full regex syntax. The system must have ripgrep installed, otherwise it reports `ripgrep (rg) is not installed or not in PATH`. See [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation) for installation instructions. + +**Q: Does Write create or overwrite?** + +A: Overwrite. `Write` uses `O_CREATE|O_TRUNC` flags — if the file exists, its content is overwritten; if it doesn't exist, it's created (with automatic parent directory creation). + +**Q: Is Windows supported?** + +A: No. `ExecuteStreaming` depends on `/bin/sh`. File operations themselves can run on any platform, but command execution is Unix-only. + +**Q: Does Local Backend support non-streaming Execute?** + +A: No. Local only implements `StreamingShell` (`ExecuteStreaming`), not `Shell` (`Execute`). `Shell` and `StreamingShell` in the middleware Config are mutually exclusive; choose one. 
diff --git "a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" "b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" deleted file mode 100644 index 9334bf19a63..00000000000 --- "a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" +++ /dev/null @@ -1,231 +0,0 @@ ---- -Description: "" -date: "2026-03-24" -lastmod: "" -tags: [] -title: Local File System -weight: 2 ---- - -## Local Backend - -Package: `github.com/cloudwego/eino-ext/adk/backend/local` - -Note: If your eino version is v0.8.0 or above, you need to use local backend [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1). - -### Overview - -Local Backend is the local file system implementation of EINO ADK FileSystem, directly operating on the local file system, providing native performance and zero-configuration experience. 
- -#### Core Features - -- Zero Configuration - Works out of the box -- Native Performance - Direct file system access, no network overhead -- Path Safety - Enforces absolute paths -- Streaming Execution - Supports real-time command output streaming -- Command Validation - Optional security validation hooks - -### Installation - -```bash -go get github.com/cloudwego/eino-ext/adk/backend/local -``` - -### Configuration - -```go -type Config struct { - // Optional: Command validation function for Execute() security control - ValidateCommand func(string) error -} -``` - -### Quick Start - -#### Basic Usage - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/adk/backend/local" - "github.com/cloudwego/eino/adk/filesystem" -) - -func main() { - ctx := context.Background() - - backend, err := local.NewBackend(ctx, &local.Config{}) - if err != nil { - panic(err) - } - - // Write file (must be absolute path) - err = backend.Write(ctx, &filesystem.WriteRequest{ - FilePath: "/tmp/hello.txt", - Content: "Hello, Local Backend!", - }) - - // Read file - fcontent, err := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/tmp/hello.txt", - }) - fmt.Println(fcontent.Content) -} -``` - -#### With Command Validation - -```go -func validateCommand(cmd string) error { - allowed := map[string]bool{"ls": true, "cat": true, "grep": true} - parts := strings.Fields(cmd) - if len(parts) == 0 || !allowed[parts[0]] { - return fmt.Errorf("command not allowed: %s", parts[0]) - } - return nil -} - -backend, _ := local.NewBackend(ctx, &local.Config{ - ValidateCommand: validateCommand, -}) -``` - -#### Integration with Agent - -```go -import ( - "github.com/cloudwego/eino/adk" - fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -// Create Backend -backend, _ := local.NewBackend(ctx, &local.Config{}) - -// Create Middleware -middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ - Backend: backend, - StreamingShell: backend, -}) - -// Create Agent 
-agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LocalFileAgent", - Description: "AI Agent with local file system access capabilities", - Model: chatModel, - Handlers: []adk.ChatModelAgentMiddleware{middleware}, -}) -``` - -### API Reference - - - - - - - - - - - -
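<tr><td>Method</td><td>Description</td></tr>
<tr><td>LsInfo</td><td>List directory contents</td></tr>
<tr><td>Read</td><td>Read file content (supports pagination, default 200 lines)</td></tr>
<tr><td>Write</td><td>Create new file (error if exists)</td></tr>
<tr><td>Edit</td><td>Replace file content</td></tr>
<tr><td>GrepRaw</td><td>Search file content (literal match)</td></tr>
<tr><td>GlobInfo</td><td>Find files by pattern</td></tr>
<tr><td>Execute</td><td>Execute shell commands</td></tr>
<tr><td>ExecuteStreaming</td><td>Execute commands with streaming output</td></tr>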
    - -#### Examples - -```go -// List directory -files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/home/user", -}) - -// Read file (paginated) -content, _ := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/path/to/file.txt", - Offset: 0, - Limit: 50, -}) - -// Search content (literal match, not regex) -matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Path: "/home/user/project", - Pattern: "TODO", - Glob: "*.go", -}) - -// Find files -files, _ := backend.GlobInfo(ctx, &filesystem.GlobInfoRequest{ - Path: "/home/user", - Pattern: "**/*.go", -}) - -// Edit file -backend.Edit(ctx, &filesystem.EditRequest{ - FilePath: "/tmp/file.txt", - OldString: "old", - NewString: "new", - ReplaceAll: true, -}) - -// Execute command -result, _ := backend.Execute(ctx, &filesystem.ExecuteRequest{ - Command: "ls -la /tmp", -}) - -// Streaming execution -reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ - Command: "tail -f /var/log/app.log", -}) -for { - resp, err := reader.Recv() - if err == io.EOF { - break - } - fmt.Print(resp.Stdout) -} -``` - -### Path Requirements - -All paths must be absolute paths (starting with `/`): - -```go -// Correct -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "/home/user/file.txt"}) - -// Incorrect -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "./file.txt"}) -``` - -Convert relative paths: - -```go -absPath, _ := filepath.Abs("./relative/path") -``` - -### Comparison with Agentkit Backend - - - - - - - - - - -
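<tr><td>Feature</td><td>Local</td><td>Agentkit</td></tr>
<tr><td>Execution Model</td><td>Local Direct</td><td>Remote Sandbox</td></tr>
<tr><td>Network Dependency</td><td>None</td><td>Required</td></tr>
<tr><td>Configuration Complexity</td><td>Zero Config</td><td>Requires Credentials</td></tr>
<tr><td>Security Model</td><td>OS Permissions</td><td>Isolated Sandbox</td></tr>
<tr><td>Streaming Output</td><td>Supported</td><td>Not Supported</td></tr>
<tr><td>Platform Support</td><td>Unix/Linux/macOS</td><td>Any</td></tr>
<tr><td>Use Cases</td><td>Development/Local</td><td>Multi-tenant/Production</td></tr>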
    - -### FAQ - -**Q: Why does running grep fail with `ripgrep (rg) is not installed or not in PATH. Please install it:` [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation)?** - -The local Grep command relies on `ripgrep` by default. If your system does not have `ripgrep` installed, install it following the official guide. - -**Q: Does GrepRaw support regex?** - -Yes. GrepRaw uses `ripgrep` under the hood for grep operations, so regex patterns are supported. - -**Q: Windows support?** - -Not supported, depends on `/bin/sh`. diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md index 93301c65f2f..9672a1a1cef 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: AgentsMD @@ -9,212 +9,187 @@ weight: 9 ## Overview -`agentsmd` is an Eino ADK middleware that **automatically injects the content of Agents.md into the model input messages on every model call**. The injection is ephemeral: it is added dynamically for each model call and is not persisted into the session state, so it **won’t be processed by summarization/compression middlewares**. - -**Core value**: define system-level behavior instructions and context for an agent via an Agents.md file (similar to Claude Code’s CLAUDE.md), without manually composing system prompts. - -**Package**: `github.com/cloudwego/eino/adk/middlewares/agentsmd` - ---- +`agentsmd` is an Eino ADK middleware that **automatically injects Agents.md file content into the message sequence on every model call**. 
The framework persists the injected message into the agent's internal state, and an **idempotency check** (the `Extra["__agentsmd_content__"]` marker) ensures that no duplicate injection occurs. Since injected content is fixed upon first appearance, **it will not change with subsequent summarization/compression**. **Core value**: Define system-level behavioral instructions and context for an Agent via Agents.md files (similar to Claude Code's CLAUDE.md), without manually managing system prompt concatenation. **Package path**: `github.com/cloudwego/eino/adk/middlewares/agentsmd` ## Quick Start -### Minimal Example - ```go -package main - -import ( - "context" - "fmt" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/middlewares/agentsmd" -) - -func main() { - ctx := context.Background() - - // 1. Prepare Backend (file reading backend) - backend := NewLocalFileBackend("/path/to/project") - - // 2. Create agentsmd middleware - mw, err := agentsmd.New(ctx, &agentsmd.Config{ - Backend: backend, - AgentsMDFiles: []string{"/home/user/project/agents.md"}, - }) - if err != nil { - panic(err) - } - - // 3. Attach the middleware to the agent - // agent := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // Middlewares: []adk.ChatModelAgentMiddleware{mw}, - // }) - _ = mw - fmt.Println("agentsmd middleware created successfully") +ctx := context.Background() + +// 1. Create agentsmd middleware +mw, err := agentsmd.New(ctx, &agentsmd.Config{ + Backend: myBackend, // Implements agentsmd.Backend interface + AgentsMDFiles: []string{"/project/agents.md"}, +}) +if err != nil { + panic(err) } + +// 2. Configure with Agent +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) ``` --- -## Configuration +## Configuration Details -### Config +### Config Struct ```go type Config struct { - // Backend provides file access to load Agents.md files. 
- // It can be a local filesystem, remote storage, or any other backend. - // Required. - Backend Backend - - // AgentsMDFiles is an ordered list of Agents.md file paths to load. - // Files are loaded and injected in the given order. - // Files support recursive @import (max depth 5). - AgentsMDFiles []string - - // AllAgentsMDMaxBytes limits the total bytes of all loaded Agents.md content. - // Files are loaded in order; once the cumulative size exceeds this limit, - // the remaining files will be skipped. - // Each individual file is always loaded in full. - // 0 means unlimited. + Backend Backend + AgentsMDFiles []string AllAgentsMDMaxBytes int - - // OnLoadWarning is an optional callback invoked on non-fatal errors during loading - // (e.g. file not found, cyclic @import, depth limit exceeded). - // If nil, warnings are printed via log.Printf. - // - // Note: Backend.Read errors other than os.ErrNotExist (e.g. permission denied, I/O errors) - // are not treated as warnings and will abort the loading process. - OnLoadWarning func(filePath string, err error) + OnLoadWarning func(filePath string, err error) } ``` -### Parameters +### Parameter Description - - - - + + + +
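<tr><td>Parameter</td><td>Type</td><td>Required</td><td>Default</td><td>Description</td></tr>
<tr><td><code>Backend</code></td><td><code>Backend</code></td><td>Yes</td><td>-</td><td>File reading backend that performs the actual I/O</td></tr>
<tr><td><code>AgentsMDFiles</code></td><td><code>[]string</code></td><td>Yes</td><td>-</td><td>List of Agents.md file paths to load (at least one)</td></tr>
<tr><td><code>AllAgentsMDMaxBytes</code></td><td><code>int</code></td><td>No</td><td><code>0</code> (unlimited)</td><td>Total byte limit for all files</td></tr>
<tr><td><code>OnLoadWarning</code></td><td><code>func(string, error)</code></td><td>No</td><td><code>log.Printf</code></td><td>Callback for non-fatal errors</td></tr>
<tr><td><code>Backend</code></td><td><code>Backend</code></td><td>Yes</td><td></td><td>File reading backend responsible for actual file I/O</td></tr>
<tr><td><code>AgentsMDFiles</code></td><td><code>[]string</code></td><td>Yes</td><td></td><td>List of Agents.md file paths to load (at least one), loaded and injected in order</td></tr>
<tr><td><code>AllAgentsMDMaxBytes</code></td><td><code>int</code></td><td>No</td><td><code>0</code> (unlimited)</td><td>Total byte limit for all files; subsequent files are skipped when exceeded, but each file is always loaded in full</td></tr>
<tr><td><code>OnLoadWarning</code></td><td><code>func(string, error)</code></td><td>No</td><td><code>log.Printf</code></td><td>Callback for non-fatal errors (missing files, circular @import, depth exceeded, etc.)</td></tr>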
    +### Validation Rules + +`New` / `NewTyped` validates the Config at creation time: + +- `Config` cannot be nil +- `Backend` cannot be nil +- `AgentsMDFiles` must contain at least one path +- `AllAgentsMDMaxBytes` cannot be negative + --- +## Constructors + +### New — Standard Constructor + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +Returns `ChatModelAgentMiddleware` (i.e., `TypedChatModelAgentMiddleware[*schema.Message]`), suitable for standard `ChatModelAgent`. + +### NewTyped — Generic Constructor + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +Generic version supporting both `*schema.Message` and `*schema.AgenticMessage` message types. `New` internally calls `NewTyped[*schema.Message]`. + ## Backend Interface -### Definition +### Interface Definition ```go type Backend interface { - // Read reads file content. - // If the file does not exist, implementations should return an error that wraps os.ErrNotExist - // (so errors.Is(err, os.ErrNotExist) returns true). - // This lets the loader skip missing files silently and notify via OnLoadWarning. - // Other errors (permission denied, I/O errors) abort the loading process. 
Read(ctx context.Context, req *ReadRequest) (*FileContent, error) } ``` -### Types +### Type Definitions -```go -// ReadRequest defines request parameters for reading a file -type ReadRequest struct { - FilePath string // file path - Offset int // starting line number (1-based) -} +`ReadRequest` and `FileContent` are aliases of the same-named types from the `github.com/cloudwego/eino/adk/filesystem` package: -// FileContent defines the return structure of file content -type FileContent struct { - Content string // file text content -} +```go +type ReadRequest = filesystem.ReadRequest +type FileContent = filesystem.FileContent ``` +> 💡 +> **Backend Implementation Requirements** +> +> - When a file does not exist, implementations **must** return an error wrapping `os.ErrNotExist` (making `errors.Is(err, os.ErrNotExist)` return `true`); the loader uses this to distinguish "missing file" from "real I/O error" +> - Other errors (permission denied, I/O errors) will **abort the entire loading process** and are not treated as warnings +> - The `Read` method should be concurrency-safe + --- ## @import Syntax -Agents.md supports `@import` to recursively include other files. - -### Syntax +Agents.md files support `@path` syntax to recursively include other files. -In Agents.md, use `@path/to/file` to reference another file: +### Syntax Format ```markdown # Project instructions You are a coding assistant. -Please follow these rules: +Please refer to the following conventions: @rules/code-style.md @rules/api-conventions.md ``` -### Rules +### Matching Rules + +The loader scans file content using the regex `@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)` with the following filtering logic: + +- **Paths containing /**: Treated directly as @import (e.g., `@rules/style.md`) +- **Paths without /**: Treated as @import only when the extension is in the allowed list; otherwise ignored. **Allowed extensions**: `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml`. 
This design avoids mistaking `@someone` or `@example.com` for import targets. + +### Resolution Behavior -1. **Path resolution**: relative paths are resolved from the current file’s directory; absolute paths are used as-is -2. **Max recursion depth**: 5 (beyond that the import is skipped and `OnLoadWarning` is triggered) -3. **Cycle detection**: cyclic imports are detected and skipped (`OnLoadWarning` is triggered) -4. **Global de-duplication**: the same file is not loaded twice -5. **Supported extensions** (when the path contains no `/`): `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml` -6. **False-positive filtering**: `@ref` without `/` whose extension is not allowed will be ignored (to avoid treating `@someone` or `@example.com` as an import) + <table> + 
<tr><td>Rule</td><td>Description</td></tr>
<tr><td>Path resolution</td><td>Relative paths are resolved from the current file's directory; absolute paths are used directly</td></tr>
<tr><td>Maximum recursion depth</td><td>5 levels (exceeding triggers <code>OnLoadWarning</code> and skips)</td></tr>
<tr><td>Circular reference detection</td><td>Paths already in the current ancestor chain are skipped (triggers <code>OnLoadWarning</code>)</td></tr>
<tr><td>Global deduplication</td><td>Each file path is only read and injected once during the entire load</td></tr>
<tr><td>Original text preserved</td><td>@import referenced files are appended as separate paragraphs; the <code>@path</code> text in the original is not removed</td></tr>
<tr><td>Byte budget</td><td>After cumulative bytes exceed <code>AllAgentsMDMaxBytes</code>, subsequent imports are skipped</td></tr>
    -### Example Directory Layout +### Directory Structure Example ``` project/ -├── Agents.md # entry file +├── Agents.md # Main entry file ├── rules/ -│ ├── code-style.md # code style rules -│ ├── api-conventions.md # API conventions -│ └── testing.md # testing rules +│ ├── code-style.md # @rules/code-style.md +│ ├── api-conventions.md # @rules/api-conventions.md +│ └── testing.md └── context/ - └── architecture.md # architecture notes + └── architecture.md ``` --- ## How It Works +### Implementation Hook + +The middleware implements the `TypedChatModelAgentMiddleware` interface's `BeforeModelRewriteState` method (**not** WrapModel). This hook is triggered before each model call when rewriting state. + ### Injection Flow +### Message Sequence After Injection + ``` -User message + history - │ - ▼ -┌─────────────────────┐ -│ agentsmd middleware │ -│ (WrapModel) │ -│ │ -│ 1. Load Agents.md │ -│ 2. Cache in RunLocal│ -│ 3. Build injected msg│ -└─────────────────────┘ - │ - ▼ -┌─────────────────────────────────────┐ -│ Injected message sequence │ -│ │ -│ [System] system prompt │ -│ [User] ← Agents.md injection │ ← inserted before the first User message -│ [User] previous user message 1 │ -│ [Assistant] assistant reply 1 │ -│ [User] current user message │ -└─────────────────────────────────────┘ - │ - ▼ -Model call (Generate / Stream) +[System] System prompt +[User] ← Agents.md content (with Extra marker) +[User] User history message 1 +[Assistant] Assistant reply 1 +[User] Current user message ``` -### Key Mechanics +### Key Mechanisms + +**1. Persistent injection + idempotency guarantee** The framework persists the state returned by `BeforeModelRewriteState` into the agent's internal state (`st.Messages = state.Messages`). Injected messages are marked with `Extra["__agentsmd_content__"]`; each time the hook is entered, it first scans for the marker — if it already exists, the original state is returned directly, avoiding duplicate injection. 
Therefore, the effect is: content is injected and persisted on the first model call, and subsequent iterations do not re-insert it. **2. Run-level caching** Within a single `Run()`, content loaded the first time is cached to RunLocal storage via `adk.SetRunLocalValue`. Subsequent model calls (e.g., during multi-turn tool calls) directly reuse the cache via `adk.GetRunLocalValue`. Each new `Run()` reloads the content, so file modifications take effect on the next Run. **3. Insertion position** Content is inserted as a `User` role message **before the first User message** in the sequence. If there are no User messages in the sequence, it is appended to the end. **4. Content formatting** Loaded file content is formatted: -1. **Ephemeral injection**: Agents.md content is inserted only for model calls and not written into `ChatModelAgentState`, so it won’t be summarized/compressed -2. **Run-level caching**: within a single agent `Run()`, the loaded Agents.md content is cached in `RunLocalValue`; subsequent model calls reuse it to avoid repeated reads -3. **Insertion position**: injected as a `User` role message before the first user message; if there is no user message, it is appended to the end -4. 
**I18n**: formatted output adapts to Chinese/English automatically (based on the system language environment) +- Wrapped in `` tags +- Includes an i18n header (instructing the model to follow directives) and footer (noting the context may not be relevant) +- Each file is displayed independently with a `File content: {path} (instructions):` prefix +- Language (Chinese/English) is controlled globally via `adk.SetLanguage` --- @@ -222,15 +197,13 @@ Model call (Generate / Stream) ### Middleware Ordering -**It is recommended to place the `agentsmd` middleware after summarization/compression middlewares.** This ensures Agents.md content: - -- won’t be compressed away by summarization -- is fully available on every model call +> 💡 +> **It is recommended to place the agentsmd middleware after summarization/compression middlewares.** This way Agents.md content won't be summarized or compressed, and each model call receives the complete instructions. ```go -Middlewares: []adk.ChatModelAgentMiddleware{ - summarizationMiddleware, // summarize first - agentsMDMiddleware, // then inject Agents.md +Handlers: []adk.ChatModelAgentMiddleware{ + summarizationMiddleware, // Summarize first + agentsMDMiddleware, // Then inject Agents.md } ``` @@ -238,44 +211,51 @@ Middlewares: []adk.ChatModelAgentMiddleware{ - - - - - - + + + + + +
 | Scenario | Behavior |
 | --- | --- |
-| File not found (`os.ErrNotExist`) | Skip the file and trigger `OnLoadWarning` |
-| Cyclic `@import` | Skip the cyclic file and trigger `OnLoadWarning` |
-| `@import` depth > 5 | Skip and trigger `OnLoadWarning` |
-| Total size exceeds `AllAgentsMDMaxBytes` | Skip remaining files and trigger `OnLoadWarning` (the first file is always loaded fully) |
-| Permission denied / I/O error | Abort loading and return error |
-| All file contents empty | Do not inject; pass through original messages |
+| File not found (`os.ErrNotExist`) | Skip the file, trigger `OnLoadWarning` |
+| Circular @import | Skip the circular file, trigger `OnLoadWarning` |
+| @import depth exceeds 5 levels | Skip, trigger `OnLoadWarning` |
+| Cumulative size exceeds `AllAgentsMDMaxBytes` | Skip subsequent files, trigger `OnLoadWarning` (first file is always loaded in full) |
+| Permission denied / I/O error | Abort loading, return error |
+| All file contents empty | No injection; pass through messages as-is |
    -### Backend Requirements - -- When a file does not exist, implementations **must** return an error that wraps `os.ErrNotExist` (e.g. `fmt.Errorf(\"... : %w\", os.ErrNotExist)`), otherwise the loader cannot distinguish “missing file” vs “real I/O error” -- `Read` should be concurrency-safe - ### Performance Considerations -- Set `AllAgentsMDMaxBytes` reasonably to avoid injecting too much content and consuming the model context window -- Agents.md is loaded once per `Run()` (run-level caching), but **every new `Run()` reloads it**, so file edits take effect on the next run -- Avoid importing too many files; the recursion depth limit is 5 +- Set `AllAgentsMDMaxBytes` reasonably to avoid injecting too much content that consumes the context window +- Agents.md content is loaded only once per `Run()` (run-level caching), but **each new Run() reloads** +- Avoid importing too many files; recursion depth limit is 5 levels -### Writing Agents.md +### Agents.md Writing Tips -- Keep it concise and include only instructions that truly affect model behavior -- Use `@import` to split concerns (code style, API conventions, architecture notes, etc.) -- Avoid large code examples or datasets in Agents.md to prevent wasting context window -- The content is wrapped in `` tags when passed to the model, so the model treats it as system-level instructions +- Keep content concise; only include instructions that truly affect model behavior +- Use @import to split by concern (code conventions, API conventions, architecture notes, etc.) +- Avoid including large code examples or data to prevent wasting the context window +- File content is wrapped in `` tags when passed to the model --- ## FAQ -**Q: Will Agents.md content be saved into the conversation history?** -A: No. The content is injected dynamically during model calls and is not written into `ChatModelAgentState`, so it won’t appear in history. +**Q: Will the Agents.md content be saved to conversation history?** + +A: Yes. 
The state returned by `BeforeModelRewriteState` is persisted by the framework. However, due to the idempotency check (`Extra["__agentsmd_content__"]` marker), content is only injected once on the first model call, and subsequent iterations skip it directly. It is recommended to place agentsmd after summarization to prevent injected content from being summarized. **Q: What happens if an Agents.md file does not exist?** -A: The file is skipped and `OnLoadWarning` is triggered (defaults to `log.Printf`). It does not fail the whole load. -**Q: What is the base directory for @import paths?** -A: The directory of the current file. For example, `@rules/style.md` in `/project/Agents.md` resolves to `/project/rules/style.md`. +A: The file is skipped, triggering the `OnLoadWarning` callback (defaults to `log.Printf`), without affecting the loading of other files. + +**Q: What directory are @import paths relative to?** + +A: Relative to the directory of the current file. For example, `@rules/style.md` in `/project/Agents.md` resolves to `/project/rules/style.md`. + +**Q: Will the same file imported by multiple files be loaded multiple times?** + +A: No. The loader maintains a global deduplication map (`seen`); the same path is only read and injected once. + +**Q: Will the @path references in the original text be replaced?** + +A: No. @import referenced files are appended as separate paragraphs after the original text; the original content remains unchanged. + +**Q: What is the difference between New and NewTyped?** -**Q: If multiple files import the same file, will it be loaded multiple times?** -A: No. The loader maintains a global de-duplication map; the same file path is read and injected only once. +A: `New` returns `ChatModelAgentMiddleware` (i.e., `TypedChatModelAgentMiddleware[*schema.Message]`), suitable for standard Agents. `NewTyped` is the generic version that additionally supports `*schema.AgenticMessage` type for Agentic Model scenarios. 
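
To make the marker-based idempotency and the insertion position concrete, here is a minimal sketch under stated assumptions. The `message` type is a simplified stand-in rather than `schema.Message`, `inject` is a hypothetical function, and the marker value `true` is an assumption. Only the `Extra["__agentsmd_content__"]` key and the "before the first User message, else append" rule come from the text above.

```go
package main

import "fmt"

// message is a simplified stand-in for a chat message (not schema.Message).
type message struct {
	Role    string
	Content string
	Extra   map[string]any
}

// markerKey is the Extra key described above for injected messages.
const markerKey = "__agentsmd_content__"

// inject returns msgs with one marked user message placed before the first
// user message, or appended at the end when there is none. If a marked
// message is already present, msgs is returned unchanged.
func inject(msgs []message, content string) []message {
	for _, m := range msgs {
		if _, ok := m.Extra[markerKey]; ok {
			return msgs // already injected: nothing to do
		}
	}

	pos := len(msgs) // default: append at the end
	for i, m := range msgs {
		if m.Role == "user" {
			pos = i
			break
		}
	}

	injected := message{
		Role:    "user",
		Content: content,
		Extra:   map[string]any{markerKey: true}, // marker value is an assumption
	}
	out := make([]message, 0, len(msgs)+1)
	out = append(out, msgs[:pos]...)
	out = append(out, injected)
	out = append(out, msgs[pos:]...)
	return out
}

func main() {
	msgs := []message{
		{Role: "system", Content: "system prompt"},
		{Role: "user", Content: "current user message"},
	}
	msgs = inject(msgs, "Agents.md content")
	msgs = inject(msgs, "Agents.md content") // second call leaves msgs unchanged
	for _, m := range msgs {
		fmt.Println(m.Role, "-", m.Content)
	}
}
```

Because the returned slice is what gets persisted, the marker scan is what keeps later model calls in the same conversation from inserting a second copy.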
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md index e7fba84f352..34f00b00020 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md @@ -1,187 +1,221 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem weight: 2 --- -> 💡 Package: [github.com/cloudwego/eino/adk/middlewares/filesystem](https://github.com/cloudwego/eino/tree/main/adk/middlewares/filesystem) +The FileSystem middleware injects a set of filesystem operation tools (ls, read\_file, write\_file, edit\_file, glob, grep) and an optional command execution tool (execute) into the Agent, enabling it to interact with local or remote filesystems. -## Overview - -The FileSystem middleware provides filesystem access for agents. It operates the filesystem through the [FileSystem Backend](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/filesystem_backend) interface and automatically injects a set of file operation tools and the corresponding system prompt, enabling the agent to read/write/search/edit files directly. 
- -Core capabilities: - -- **Filesystem tool injection** — automatically registers tools such as ls, read_file, write_file, edit_file, glob, grep -- **Shell command execution** — optionally injects the execute tool, supports both sync and streaming execution -- **Per-tool configuration** — each tool can be configured independently (name/description/custom implementation/disable) -- **Multilingual prompts** — tool descriptions and system prompts support Chinese/English switching +``` +import "github.com/cloudwego/eino/adk/middlewares/filesystem" +``` -## Create the Middleware +--- -It is recommended to use `New` to create the middleware (returns `ChatModelAgentMiddleware`): +## Quick Start ```go -import "github.com/cloudwego/eino/adk/middlewares/filesystem" +import ( + "context" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/adk/middlewares/filesystem" +) +// 1. Create middleware middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - // To enable shell command execution, set Shell or StreamingShell - Shell: myShell, + Backend: myBackend, // Implements filesystem.Backend interface }) -if err != nil { - // handle error -} +// 2. Inject into Agent agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ // ... Middlewares: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +--- + +## Constructors + + + + + +
+| Function signature | Description |
+| --- | --- |
+| `New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error)` | Recommended. Returns `ChatModelAgentMiddleware`, supports dynamically modifying Instruction and Tools via the `BeforeAgent` hook. |
+| `NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error)` | Generic version; type parameter `M` supports `*schema.Message` and `*schema.AgenticMessage`. `New` is equivalent to `NewTyped[*schema.Message]`. |
    + > 💡 -> `New` returns `ChatModelAgentMiddleware` with better context propagation (it can modify the agent’s instruction and tools at runtime via the `BeforeAgent` hook). +> **Deprecated**: `NewMiddleware(ctx, *Config) (AgentMiddleware, error)` is the legacy constructor; new code should use `New`. `NewMiddleware` returns the struct `AgentMiddleware`, lacking the flexibility of the `BeforeAgent` hook; additionally it enables "large result offloading" by default (see below), which has been removed from the `New` path. + +--- ## MiddlewareConfig -```go -type MiddlewareConfig struct { - // Backend provides filesystem operations - // Required - Backend filesystem.Backend - - // Shell provides synchronous shell command execution - // If set, the execute tool will be registered - // Optional, mutually exclusive with StreamingShell - Shell filesystem.Shell - - // StreamingShell provides streaming shell command execution - // If set, the streaming execute tool will be registered (real-time output) - // Optional, mutually exclusive with Shell - StreamingShell filesystem.StreamingShell - - // Per-tool configuration (all optional) - LsToolConfig *ToolConfig // ls tool config - ReadFileToolConfig *ToolConfig // read_file tool config - WriteFileToolConfig *ToolConfig // write_file tool config - EditFileToolConfig *ToolConfig // edit_file tool config - GlobToolConfig *ToolConfig // glob tool config - GrepToolConfig *ToolConfig // grep tool config - - // CustomSystemPrompt overrides the default system prompt - // Optional, defaults to ToolsSystemPrompt - CustomSystemPrompt *string - - // Deprecated fields, use the corresponding *ToolConfig.Desc instead - // CustomLsToolDesc, CustomReadFileToolDesc, CustomGrepToolDesc, - // CustomGlobToolDesc, CustomWriteFileToolDesc, CustomEditToolDesc -} -``` +`MiddlewareConfig` is the configuration struct used by `New` / `NewTyped`. -### ToolConfig +### Core Fields -Each tool can be configured independently via `ToolConfig`: + + + + + + + +
+| Field | Type | Description |
+| --- | --- | --- |
+| `Backend` | `filesystem.Backend` | Required. Provides filesystem operation capabilities, powering the ls, read\_file, write\_file, edit\_file, glob, grep tools (6 total). Interface defined in the `github.com/cloudwego/eino/adk/filesystem` package. |
+| `Shell` | `filesystem.Shell` | Optional. Provides command execution capability; when set, registers the `execute` tool. Mutually exclusive with `StreamingShell`. |
+| `StreamingShell` | `filesystem.StreamingShell` | Optional. Provides streaming command execution capability; when set, registers the streaming `execute` tool. Mutually exclusive with `Shell`. |
+| `UseMultiModalRead` | `bool` | Optional, defaults to `false`. When enabled, the `read_file` tool becomes an `EnhancedInvokableTool` supporting multi-modal content such as images/PDFs. Requires the Backend to also implement the filesystem.MultiModalReader interface. |
+| `CustomSystemPrompt` | `*string` | Optional. Overrides the system prompt appended to the Agent Instruction. If `nil`, no system prompt is appended. |
    + +### Tool Configuration Fields + +Each tool has a corresponding `*ToolConfig` field for customizing tool name, description, replacing the implementation, or disabling it: + + + + + + + + + +
+| Field | Corresponding tool |
+| --- | --- |
+| `LsToolConfig` | ls |
+| `ReadFileToolConfig` | read\_file |
+| `WriteFileToolConfig` | write\_file |
+| `EditFileToolConfig` | edit\_file |
+| `GlobToolConfig` | glob |
+| `GrepToolConfig` | grep |
    + +> The `execute` tool currently does not support customization via `ToolConfig`; its registration is controlled solely by whether `Shell` / `StreamingShell` is set. + +--- + +## ToolConfig ```go type ToolConfig struct { - // Name overrides the tool name - // Optional. Defaults to the built-in name (e.g. "ls", "read_file") - Name string - - // Desc overrides the tool description - // Optional. Defaults to the built-in description - Desc *string - - // CustomTool provides a custom tool implementation - // If set, it replaces the default implementation built on Backend - // Optional - CustomTool tool.BaseTool - - // Disable disables this tool - // When true, the tool will not be registered - // Optional, defaults to false - Disable bool + Name string // Override tool name; empty string uses default + Desc *string // Override tool description; nil uses default + CustomTool tool.BaseTool // Custom tool implementation; when set, replaces the Backend default + Disable bool // Set to true to not register this tool } ``` -Example — rename a tool and disable write: +**Priority**: `Disable=true` > `CustomTool` > Backend default implementation. + +--- + +## Tool Name Constants ```go -middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - ReadFileToolConfig: &filesystem.ToolConfig{ - Name: "cat_file", // custom name - }, - WriteFileToolConfig: &filesystem.ToolConfig{ - Disable: true, // disable write tool - }, -}) +const ( + ToolNameLs = "ls" + ToolNameReadFile = "read_file" + ToolNameWriteFile = "write_file" + ToolNameEditFile = "edit_file" + ToolNameGlob = "glob" + ToolNameGrep = "grep" + ToolNameExecute = "execute" +) ``` +--- + ## Injected Tools - - - - - - - - + + + + + + + +
-| Tool | Default name | Description | Condition |
-| --- | --- | --- | --- |
-| List directory | `ls` | List files and directories under the given path | Injected when Backend is not nil |
-| Read file | `read_file` | Read file content, supports line-based pagination (offset + limit) | Injected when Backend is not nil |
-| Write file | `write_file` | Create or overwrite a file | Injected when Backend is not nil |
-| Edit file | `edit_file` | Replace strings in a file | Injected when Backend is not nil |
-| Glob | `glob` | Find files by glob pattern | Injected when Backend is not nil |
-| Search content | `grep` | Search file content by pattern, supports multiple output modes | Injected when Backend is not nil |
-| Execute command | `execute` | Execute shell commands | Requires Shell or StreamingShell |
+| Tool | Default name | Registration condition | Description |
+| --- | --- | --- | --- |
+| ls | `ls` | Backend ≠ nil | List files and subdirectories under a directory |
+| read\_file | `read_file` | Backend ≠ nil | Read file content, supports offset/limit pagination. When `UseMultiModalRead` is enabled, can read images and PDFs |
+| write\_file | `write_file` | Backend ≠ nil | Create or overwrite a file |
+| edit\_file | `edit_file` | Backend ≠ nil | Exact string replacement editing, supports `replace_all` |
+| glob | `glob` | Backend ≠ nil | Match file paths by glob pattern |
+| grep | `grep` | Backend ≠ nil | Regex search of file content, supports multiple output modes and pagination |
+| execute | `execute` | Shell ≠ nil or StreamingShell ≠ nil | Execute shell commands |
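
The precedence stated for `ToolConfig` (`Disable=true`, then `CustomTool`, then the Backend default) can be sketched as a small decision function. This is an illustration only: `toolConfig` is a simplified stand-in in which `CustomTool` is a string label instead of a `tool.BaseTool`, `resolveImpl` is a hypothetical name, and treating a nil config as "use the Backend default" is an assumption.

```go
package main

import "fmt"

// toolConfig is a simplified stand-in for ToolConfig. CustomTool is reduced
// to a string label; an empty string means "not set".
type toolConfig struct {
	CustomTool string
	Disable    bool
}

// resolveImpl applies the precedence Disable=true > CustomTool > Backend
// default. The second return value is false when the tool is not registered.
func resolveImpl(cfg *toolConfig) (string, bool) {
	switch {
	case cfg == nil:
		return "backend default", true // assumption: nil config means defaults
	case cfg.Disable:
		return "", false
	case cfg.CustomTool != "":
		return cfg.CustomTool, true
	default:
		return "backend default", true
	}
}

func main() {
	cases := []struct {
		tool string
		cfg  *toolConfig
	}{
		{"ls", nil},
		{"read_file", &toolConfig{CustomTool: "my reader"}},
		// Disable wins even though a custom tool is also supplied.
		{"write_file", &toolConfig{CustomTool: "my writer", Disable: true}},
	}
	for _, c := range cases {
		impl, ok := resolveImpl(c.cfg)
		fmt.Printf("%s: registered=%v impl=%q\n", c.tool, ok, impl)
	}
}
```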
    -Each tool can be disabled via its corresponding `*ToolConfig` (`Disable: true`) or replaced with a custom implementation (`CustomTool`). +--- -## Multilingual Support +## Backend Interface -Tool descriptions and built-in prompts default to English. To switch to Chinese, use `adk.SetLanguage()`: +`Backend` is defined in the `github.com/cloudwego/eino/adk/filesystem` package. The middleware package re-exports request/response types via type aliases (such as `ReadRequest`, `FileContent`, etc.), but **the Backend interface itself must be referenced from the adk/filesystem package**. ```go -import "github.com/cloudwego/eino/adk" - -adk.SetLanguage(adk.LanguageChinese) // switch to Chinese -adk.SetLanguage(adk.LanguageEnglish) // switch to English (default) +type Backend interface { + LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) + GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) + Write(ctx context.Context, req *WriteRequest) error + Edit(ctx context.Context, req *EditRequest) error +} ``` -You can also customize each tool’s text via `ToolConfig.Desc` or override the system prompt via `CustomSystemPrompt`. +### Shell and StreamingShell -## [deprecated] Large Tool Result Offloading +```go +type Shell interface { + Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error) +} -> 💡 -> This feature will be deprecated in 0.8.0. Please migrate to Middleware: ToolReduction. +type StreamingShell interface { + ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error) +} +``` -> Note: Large tool result offloading is only available in the legacy `Config` + `NewMiddleware` API. The recommended `MiddlewareConfig` + `New` does not include it. If you need it, use the ToolReduction middleware. 
+The two are mutually exclusive; only one can be set. `StreamingShell` supports streaming output, suitable for long-running commands. -When tool call results are too large (e.g. reading large files, grep matching too many lines), keeping the full result in the conversation context can cause: +--- -- token usage to spike -- agent history context pollution -- worse reasoning efficiency +## MultiModalReader Extension Interface -So the legacy middleware (`NewMiddleware`) provides an automatic offloading mechanism: +When `UseMultiModalRead = true`, the Backend needs to additionally implement the `MultiModalReader` interface: -- when the result exceeds a threshold (default 20,000 tokens), it does not return the full content to the LLM -- the actual result is saved to the filesystem (Backend) -- the context contains only a summary and a file path (the agent can call `read_file` again to fetch on demand) +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` -This feature is enabled by default and can be configured via `Config` (not `MiddlewareConfig`): +**Behavior**: -```go -type Config struct { - // ... Backend, Shell, StreamingShell, ToolConfig fields are the same as MiddlewareConfig +- The `read_file` tool is upgraded from `InvokableTool` to `EnhancedInvokableTool`, returning multi-modal results via `schema.ToolResult.Parts` +- The default implementation supports reading image files (PNG, JPG, etc.) and PDF files (supports the `pages` parameter to specify page ranges, max 20 pages per request) +- The tool description automatically appends a multi-modal capability suffix; if the description is customized via `ReadFileToolConfig.Desc`, no suffix is appended - // Disable automatic offloading - WithoutLargeToolResultOffloading bool +> 💡 +> When using `ChatModelAgentMiddleware`, the `WrapEnhancedInvokableToolCall` method must be implemented for the multi-modal read\_file tool to work. 
+ +```go +// MultiModalReadRequest extends ReadRequest +type MultiModalReadRequest struct { + ReadRequest + Pages string // PDF page range, e.g. "1-5", "3", "10-20" +} - // Custom threshold (default 20000 tokens) - LargeToolResultOffloadingTokenLimit int +// MultiFileContent return result +type MultiFileContent struct { + *FileContent // Plain text result + Parts []FileContentPart // Multi-modal result (mutually exclusive with FileContent; when Parts is non-empty, FileContent is ignored) +} - // Custom offloading path generator - // Default path format: /large_tool_result/{ToolCallID} - LargeToolResultOffloadingPathGen func(ctx context.Context, input *compose.ToolInput) (string, error) +type FileContentPart struct { + Type FileContentPartType // "image" or "pdf" + MIMEType string // e.g. "image/png", "application/pdf" + Data []byte // Raw binary data } ``` + +--- + +## Deprecated: Legacy Config and Large Result Offloading + +> 💡 +> The following content only applies to the `NewMiddleware` + `Config` legacy path. The `New` / `NewTyped` path **does not include** large result offloading functionality. + +The legacy `Config` provides a "Large Tool Result Offloading" mechanism in addition to the `MiddlewareConfig` fields: + + + + + + +
+| Field | Description |
+| --- | --- |
+| `WithoutLargeToolResultOffloading bool` | Set to `true` to disable offloading; defaults to `false` (enabled) |
+| `LargeToolResultOffloadingTokenLimit int` | Token threshold, defaults to `20000` |
+| `LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error)` | Offloading path generator function; defaults to `/large_tool_result/{ToolCallID}` |
    + +**Trigger condition**: Offloading is triggered when the character count of a tool's return result exceeds `tokenLimit × 4`. + +**Offloading behavior**: The full result is written to a file via `Backend.Write`, and the original return is replaced with a summary (first 10 lines + file path hint). The Agent can then read the full result in pages via `read_file`. diff --git a/content/en/docs/eino/core_modules/eino_adk/adk_agent_callback.md b/content/en/docs/eino/core_modules/eino_adk/adk_agent_callback.md index c4bf6d5e2a7..30831c36287 100644 --- a/content/en/docs/eino/core_modules/eino_adk/adk_agent_callback.md +++ b/content/en/docs/eino/core_modules/eino_adk/adk_agent_callback.md @@ -7,12 +7,12 @@ title: Agent Callback weight: 9 --- -This feature adds Callback support to ADK agents, similar to the callback mechanism in the compose package. With callbacks, users can observe the agent execution lifecycle and implement logging, tracing, monitoring, and more. +This feature adds Callback support to ADK Agents, similar to the callback mechanism in the compose package. With callbacks, users can observe the Agent execution lifecycle and implement logging, tracing, monitoring, and more. > 💡 > **Tip**: The cozeloop ADK trace version is available at [https://github.com/cloudwego/eino-ext/releases/tag/callbacks%2Fcozeloop%2Fv0.2.0](https://github.com/cloudwego/eino-ext/releases/tag/callbacks%2Fcozeloop%2Fv0.2.0) > -> Make sure to use a trace callback handler implementation that supports v0.8, otherwise agent tracing won’t work properly. +> Make sure to use a trace callback handler implementation that supports v0.8 for agent tracing to work properly. ## Overview @@ -20,30 +20,30 @@ The ADK Agent Callback mechanism shares the same infrastructure as the callback - Uses the same `callbacks.Handler` interface - Uses the same `callbacks.RunInfo` structure -- Can be combined with callbacks of other components (e.g. 
ChatModel, Tool) +- Can be combined with callbacks from other components (such as ChatModel, Tool, etc.) > 💡 -> With Agent Callback, you can hook into key points of agent execution to implement observability such as tracing, logging, and metrics. This capability was introduced in v0.8.0. +> With Agent Callback, you can hook into key points of Agent execution to implement observability capabilities such as tracing, logging, and metrics. This capability was introduced in v0.8.0. ## Core Types ### ComponentOfAgent -Component type identifier used to recognize agent-related events in callbacks: +Component type identifier used to recognize Agent-related events in callbacks: ```go const ComponentOfAgent components.Component = "Agent" ``` -Used in `callbacks.RunInfo.Component` to filter callback events related to agents only. +Used in `callbacks.RunInfo.Component` to filter callback events related only to Agents. ### AgentCallbackInput -Input type for agent callbacks, passed to `OnStart`: +Input type for Agent callbacks, passed in the `OnStart` callback: ```go type AgentCallbackInput struct { - // Input contains the agent input for a new run. It is nil when resuming. + // Input contains the Agent input for a new run. It is nil when resuming. Input *AgentInput // ResumeInfo contains information for resuming from an interrupt. It is nil for a new run. ResumeInfo *ResumeInfo @@ -51,38 +51,38 @@ type AgentCallbackInput struct { ``` - +
-| Call | Field values |
+| Call method | Field values |
 | --- | --- |
 | `Agent.Run()` | `Input` is set, `ResumeInfo` is nil |
 | `Agent.Resume()` | `ResumeInfo` is set, `Input` is nil |
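
A handler can use the field combination in the table to tell a fresh run from a resume. The sketch below uses local stand-in types so that it is self-contained. `agentInput`, `resumeInfo` and their fields are placeholders (assumptions), and `describe` is a hypothetical helper. Only the `Input` / `ResumeInfo` nil-ness rule is taken from the table.

```go
package main

import "fmt"

// Local stand-ins for the ADK types; the inner fields are placeholders.
type agentInput struct{ Messages []string }
type resumeInfo struct{ Note string }

// agentCallbackInput mirrors the two-field shape of AgentCallbackInput.
type agentCallbackInput struct {
	Input      *agentInput
	ResumeInfo *resumeInfo
}

// describe reports which kind of invocation produced the callback input.
func describe(in *agentCallbackInput) string {
	switch {
	case in == nil:
		return "not an agent callback input"
	case in.ResumeInfo != nil:
		return "resume"
	case in.Input != nil:
		return "new run"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(&agentCallbackInput{Input: &agentInput{Messages: []string{"hi"}}}))
	fmt.Println(describe(&agentCallbackInput{ResumeInfo: &resumeInfo{Note: "after interrupt"}}))
	fmt.Println(describe(nil))
}
```

In a real handler the same branching would apply to the value obtained from the callback input, for example to start a new trace span for a run and to continue an existing one for a resume.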
    ### AgentCallbackOutput -Output type for agent callbacks, passed to `OnEnd`: +Output type for Agent callbacks, passed in the `OnEnd` callback: ```go type AgentCallbackOutput struct { - // Events provides the agent event stream. Each handler receives its own copy. + // Events provides the Agent event stream. Each handler receives an independent copy. Events *AsyncIterator[*AgentEvent] } ``` > 💡 -> **Important**: consume `Events` **asynchronously** to avoid blocking agent execution. Each callback handler gets an independent copy of the event stream, so they do not interfere with each other. +> **Important**: The `Events` iterator should be consumed **asynchronously** to avoid blocking Agent execution. Each callback handler receives an independent copy of the event stream, so they do not interfere with each other. ## API Usage ### WithCallbacks -Run option that adds callback handlers to receive agent lifecycle events: +Run option that adds callback handlers to receive Agent lifecycle events: ```go func WithCallbacks(handlers ...callbacks.Handler) AgentRunOption ``` -### Type Conversion Helpers +### Type Conversion Functions -Convert generic callback types to agent-specific types: +Convert generic callback types to Agent-specific types: ```go // Convert input type @@ -92,11 +92,11 @@ func ConvAgentCallbackInput(input callbacks.CallbackInput) *AgentCallbackInput func ConvAgentCallbackOutput(output callbacks.CallbackOutput) *AgentCallbackOutput ``` -If the type does not match, these functions return nil. +If the type does not match, the functions return nil. -## Examples +## Usage Examples -### Option 1: Use HandlerBuilder +### Method 1: Using HandlerBuilder Build a generic callback handler via `callbacks.NewHandlerBuilder()`: @@ -147,11 +147,11 @@ iter := runner.Run(ctx, input.Messages, adk.WithCallbacks(handler)) ``` > 💡 -> **Important**: this is the correct usage. Callbacks only work when running the agent through Runner. 
If you call `agent.Run()` directly, callbacks will not be triggered. +> **Important**: The example above shows the correct usage. Callbacks only work when running the Agent through Runner. If you call `agent.Run()` directly, callbacks will not be triggered. -### Option 2: Use HandlerHelper (Recommended) +### Method 2: Using HandlerHelper (Recommended) -`template.HandlerHelper` makes type conversion easier: +`template.HandlerHelper` provides more convenient type conversion: ```go import ( @@ -196,17 +196,17 @@ iter := runner.Run(ctx, input.Messages, adk.WithCallbacks(helper)) ``` > 💡 -> **Important**: callbacks only work when running the agent through Runner. If you call `agent.Run()` directly, callbacks will not be triggered. -> +> **Important**: Callbacks only work when running the Agent through Runner. If you call `agent.Run()` directly, callbacks will not be triggered. + > 💡 -> `HandlerHelper` performs type conversion automatically and keeps the code concise. It also supports composing callbacks for multiple components. +> `HandlerHelper` performs type conversion automatically, making code more concise. It also supports composing callback handlers for multiple components. ## Tracing Use Case > 💡 -> **Important**: AgentCallback only works when executed via Runner. If you call Agent.Run() directly, callbacks will not be triggered because the callback mechanism is implemented at the flowAgent layer. Create a Runner via `adk.NewRunner()` and execute the agent via `Runner.Run()` or `Runner.Query()`. +> **Important**: AgentCallback only works when executed via Runner. If you call Agent.Run() directly, callbacks will not be triggered because the callback mechanism is implemented at the flowAgent layer. Create a Runner via `adk.NewRunner()` and execute the Agent via `Runner.Run()` or `Runner.Query()`. -The most common use case is distributed tracing. 
Below is an example using OpenTelemetry: +The most common application of Agent Callback is implementing distributed tracing. Below is an example using OpenTelemetry: ```go import ( @@ -219,7 +219,7 @@ import ( "github.com/cloudwego/eino/callbacks" ) -// Create an Agent (ChatModelAgent as example) +// Create Agent (ChatModelAgent as example) agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Name: "my_agent", Description: "A helpful assistant", @@ -269,7 +269,7 @@ handler := callbacks.NewHandlerBuilder(). }). Build() -// Execute via Runner and pass the callback handler +// Execute Agent via Runner and pass the callback handler iter := runner.Query(ctx, "Hello, agent!", adk.WithCallbacks(handler)) // Consume event stream @@ -287,14 +287,14 @@ for { ``` > 💡 -> **Reminder**: callbacks only work when running the agent via Runner. If you call `agent.Run()` directly, even if you pass `adk.WithCallbacks(handler)`, agent-level callbacks will not be triggered. -> +> **Reminder**: Callbacks only work when running the Agent via Runner. If you call `agent.Run()` directly, even if you pass `adk.WithCallbacks(handler)`, Agent-level callbacks will not be triggered. + > 💡 > **Tip**: The cozeloop ADK trace version is available at [https://github.com/cloudwego/eino-ext/releases/tag/callbacks%2Fcozeloop%2Fv0.2.0](https://github.com/cloudwego/eino-ext/releases/tag/callbacks%2Fcozeloop%2Fv0.2.0) ## Agent Type Identifiers -Built-in agents implement `components.Typer` and return their type identifier, which is filled into `callbacks.RunInfo.Type`: +Built-in Agent implementations implement the `components.Typer` interface, returning their type identifier which populates the `callbacks.RunInfo.Type` field: @@ -305,24 +305,24 @@ Built-in agents implement `components.Typer` and return their type identifier, w
    Agent typeGetType() return value
    DeterministicTransfer Agent
    "DeterministicTransfer"
    -## Callback Semantics +## Callback Behavior ### Callback Timing
-| Run | 1. Initialize callback context 2. Handle input 3. Call `OnStart` 4. Execute agent logic 5. Register `OnEnd` (when iterator is created) |
-| Resume | 1. Build ResumeInfo 2. Initialize callback context 3. Call `OnStart` 4. Resume agent execution 5. Register `OnEnd` (when iterator is created) |
+| Run method | 1. Initialize callback context 2. Handle input 3. Call `OnStart` 4. Execute Agent logic 5. Register `OnEnd` (when event stream is created) |
+| Resume method | 1. Build ResumeInfo 2. Initialize callback context 3. Call `OnStart` 4. Resume Agent execution 5. Register `OnEnd` (when event stream is created) |

 ### OnEnd Timing

-`OnEnd` is registered **when the iterator is created**, not when the generator is closed. This enables handlers to consume events while the stream is being produced.
+The `OnEnd` callback is registered **when the iterator is created**, not when the generator closes. This allows handlers to consume events while the stream is being produced.

 ## Notes

-### 1. Consume Events Asynchronously
+### 1. Consume Event Stream Asynchronously

-In callback handlers, `AgentCallbackOutput.Events` **must** be consumed asynchronously, otherwise it will block agent execution:
+In callback handlers, `AgentCallbackOutput.Events` **must** be consumed asynchronously, otherwise it will block Agent execution:

 ```go
 // ✅ Correct
@@ -339,7 +339,7 @@ OnEnd: func(ctx context.Context, info *callbacks.RunInfo, output *adk.AgentCallb
 	return ctx
 }

-// ❌ Wrong - will deadlock
+// ❌ Wrong - will cause deadlock
 OnEnd: func(ctx context.Context, info *callbacks.RunInfo, output *adk.AgentCallbackOutput) context.Context {
 	for {
 		event, ok := output.Events.Next()
@@ -354,8 +354,8 @@ OnEnd: func(ctx context.Context, info *callbacks.RunInfo, output *adk.AgentCallb

 ### 2. No OnError Callback

-Because `Agent.Run()` and `Agent.Resume()` do not return error, agent callbacks **do not support** `OnError`. Errors are carried via `AgentEvent.Err` in the event stream.
+Since the `Agent.Run()` and `Agent.Resume()` method signatures do not return error, Agent callbacks **do not support** `OnError`. Error information is delivered via the `AgentEvent.Err` field in the event stream.

-### 3. Event Stream Copying
+### 3. Event Stream Copying Mechanism

-When multiple callback handlers are registered, each handler receives an independent copy of the event stream. The last handler receives the original stream to reduce allocations.
+When multiple callback handlers are registered, each handler receives an independent copy of the event stream, so they do not interfere with each other. The last handler receives the original event stream to reduce memory allocations. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md b/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md index 5c012b3651f..a4e29808e81 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md @@ -1,521 +1,116 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Agent Collaboration' +title: Agent Collaboration weight: 4 --- -# Agent Collaboration +# Multi-Agent Collaboration -The overview document has provided basic explanations of Agent collaboration. Below we will introduce the design and implementation of collaboration and composition primitives in combination with code: +Eino ADK provides two primary Agent collaboration methods: -## Collaboration Primitives +## AgentAsTool (Recommended) -### Inter-Agent Collaboration Methods +Wraps a sub-Agent as a Tool, allowing the parent Agent to autonomously decide when to invoke it via ToolCall. The sub-Agent executes independently, and results are returned to the parent Agent's context. - - - - -
    Collaboration MethodDescription
    TransferDirectly transfers the task to another Agent. The current Agent exits after execution completes, without concern for the task execution status of the transferred Agent
    ToolCall (AgentAsTool)Invokes the Agent as a ToolCall, waits for the Agent's response, and can obtain the output result of the called Agent for the next round of processing
    - -### AgentInput Context Strategies - - - - - -
    Context StrategyDescription
    Upstream Agent Full ConversationGets the complete conversation history of the current Agent's upstream Agent
    Fresh Task DescriptionIgnores the complete conversation history of the upstream Agent and provides a completely new task summary as the AgentInput for the sub-Agent
    - -### Decision Autonomy - - - - - -
    Decision AutonomyDescription
    Autonomous DecisionWithin the Agent, based on its available downstream Agents, when assistance is needed, it autonomously selects a downstream Agent for assistance. Generally, the Agent makes decisions based on LLM internally, but even if selection is based on preset logic, it is still considered autonomous decision from the Agent's external perspective
    Preset DecisionThe next Agent after an Agent executes a task is predetermined. The execution order of Agents is predetermined and predictable
    - -### Composition Primitives - - - - - - - - -
    TypeDescriptionRunning ModeCollaboration MethodContext StrategyDecision Autonomy
    SubAgentsCombines the user-provided agent as the parent Agent and the user-provided subAgents list as child Agents to form an Agent capable of autonomous decision-making. The Name and Description serve as the Agent's name identifier and description.
  • Currently, an Agent can only have one parent Agent
  • Use the SetSubAgents function to build a "multi-tree" form of Multi-Agent
  • Within this "multi-tree", AgentName must remain unique
  • TransferUpstream Agent Full ConversationAutonomous Decision
    SequentialCombines the user-provided SubAgents list into a Sequential Agent that executes in order. The Name and Description serve as the Sequential Agent's name identifier and description. When the Sequential Agent executes, it runs the SubAgents list in order until all Agents have been executed.TransferUpstream Agent Full ConversationPreset Decision
    ParallelCombines the user-provided SubAgents list into a Parallel Agent that executes concurrently based on the same context. The Name and Description serve as the Parallel Agent's name identifier and description. When the Parallel Agent executes, it runs the SubAgents list concurrently and finishes when all Agents have completed.TransferUpstream Agent Full ConversationPreset Decision
    LoopExecutes the user-provided SubAgents list in array order sequentially and repeatedly, forming a Loop Agent. The Name and Description serve as the Loop Agent's name identifier and description. When the Loop Agent executes, it runs the SubAgents list in order and finishes when all Agents have completed.TransferUpstream Agent Full ConversationPreset Decision
    AgentAsToolConverts an Agent into a Tool to be used by other Agents as a regular Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided in ADK supports the AgentAsTool functionalityToolCallFresh Task DescriptionAutonomous Decision
    - -## Context Passing - -When building multi-Agent systems, efficient and accurate sharing of information between different Agents is crucial. Eino ADK provides two core context passing mechanisms to meet different collaboration needs: History and SessionValues. - -### History - -#### Concept - -History corresponds to the [Upstream Agent Full Conversation context strategy]. Every AgentEvent produced by each Agent in a multi-Agent system is saved to History. When calling a new Agent (Workflow/Transfer), the AgentEvents in History are converted and concatenated into the AgentInput. - -By default, Assistant or Tool Messages from other Agents are converted to User Messages. This is equivalent to telling the current LLM: "Just now, Agent_A called some_tool and returned some_result. Now it's your turn to make a decision." - -Through this approach, other Agents' behaviors are treated as "external information" or "factual statements" provided to the current Agent, rather than its own behaviors, thus avoiding LLM context confusion. - - - -In Eino ADK, when building AgentInput for an Agent, the History it can see is "all AgentEvents produced before me". - -Worth mentioning is ParallelWorkflowAgent: two parallel sub-Agents (A, B) cannot see each other's AgentEvents during parallel execution because neither A nor B precedes the other. - -#### RunPath - -Each AgentEvent in History is "produced by a specific Agent in a specific execution sequence", meaning AgentEvent has its own RunPath. The purpose of RunPath is to convey this information; it doesn't carry other functions in the eino framework. - -The table below shows the specific RunPath when Agents execute under various orchestration modes: - - - - - - - -
    ExampleRunPath
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent]
  • Agent (after function call): [Agent]
  • Agent1: [SequentialAgent, LoopAgent, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2]
  • Agent1: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1, Agent2]
  • Agent3: [SequentialAgent, LoopAgent, Agent3]
  • Agent4: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent4]
  • Agent5: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent5]
  • Agent6: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent6]
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent, SubAgent, Agent]
  • - -#### Customization - -In some cases, the History content needs to be adjusted before the Agent runs. At this point, you can customize how the Agent generates AgentInput from History using AgentWithOptions: - -```go -// github.com/cloudwego/eino/adk/flow.go - -type HistoryRewriter func(ctx context.Context, entries []*HistoryEntry) ([]Message, error) - -func WithHistoryRewriter(h HistoryRewriter) AgentOption -``` - -### SessionValues - -#### Concept +This is the most flexible and composable collaboration pattern: -SessionValues is a global temporary KV store that persists throughout a single run, used to support cross-Agent state management and data sharing. Any Agent in a single run can read and write SessionValues at any time. - -Eino ADK provides multiple methods for Agents to read and write Session Values in a concurrency-safe manner at runtime: - -```go -// github.com/cloudwego/eino/adk/runctx.go - -// Get all SessionValues -func GetSessionValues(ctx context.Context) map[string]any -// Batch set SessionValues -func AddSessionValues(ctx context.Context, kvs map[string]any) -// Get a value from SessionValues by specified key. Returns false as the second value if key doesn't exist, otherwise true -func GetSessionValue(ctx context.Context, key string) (any, bool) -// Set a single SessionValue -func AddSessionValue(ctx context.Context, key string, value any) -``` - -Note that since the SessionValues mechanism is implemented based on Context, and Runner reinitializes the Context when running, injecting SessionValues via `AddSessionValues` or `AddSessionValue` outside of the Run method will not take effect. - -If you need to inject data into SessionValues before the Agent runs, you need to use a dedicated Option to assist with this. 
Usage is as follows: - -```go -// github.com/cloudwego/eino/adk/call_option.go -// WithSessionValues injects SessionValues before Agent runs -func WithSessionValues(v map[string]any) AgentRunOption - -// Usage: -runner := adk.NewRunner(ctx, adk.RunnerConfig{Agent: agent}) -iterator := runner.Run(ctx, []adk.Message{schema.UserMessage("xxx")}, - adk.WithSessionValues(map[string]any{ - PlanSessionKey: 123, - UserInputSessionKey: []adk.Message{schema.UserMessage("yyy")}, - }), -) -``` - -## Transfer SubAgents - -### Concept - -Transfer corresponds to the [Transfer collaboration method]. When an Agent produces an AgentEvent containing TransferAction during runtime, Eino ADK calls the Agent specified by the Action. The called Agent is called a SubAgent. - -TransferAction can be quickly created using `NewTransferToAgentAction`: - -```go -import "github.com/cloudwego/eino/adk" - -event := adk.NewTransferToAgentAction("dest agent name") -``` - -For Eino ADK to find and run the SubAgent instance upon receiving TransferAction, you need to first call `SetSubAgents` to register possible SubAgents with Eino ADK before running: - -```go -// github.com/cloudwego/eino/adk/flow.go -func SetSubAgents(ctx context.Context, agent Agent, subAgents []Agent) (Agent, error) -``` - -> 💡 -> The meaning of Transfer is to **hand over** the task to the SubAgent, not delegate or assign. Therefore: -> -> 1. Unlike ToolCall, when calling a SubAgent through Transfer, after the SubAgent finishes running, the parent Agent will not be called again to summarize content or perform the next operation. -> 2. When calling a SubAgent, the SubAgent's input is still the original input, and the parent Agent's output serves as context for the SubAgent's reference. - -When triggering SetSubAgents, both parent and child Agents need to process to complete initialization. 
Eino ADK defines the `OnSubAgents` interface to support this functionality: - -```go -// github.com/cloudwego/eino/adk/interface.go -type OnSubAgents interface { - OnSetSubAgents(ctx context.Context, subAgents []Agent) error - OnSetAsSubAgent(ctx context.Context, parent Agent) error - OnDisallowTransferToParent(ctx context.Context) error -} -``` - -If an Agent implements the `OnSubAgents` interface, `SetSubAgents` will call the corresponding methods to register with the Agent. For example, `ChatModelAgent`'s implementation. - -### Example - -Below we demonstrate the Transfer capability with a multi-functional conversation Agent. The goal is to build an Agent that can query weather or chat with users. The Agent structure is as follows: - - - -All three Agents are implemented using ChatModelAgent: +- The parent Agent retains control and can continue reasoning based on sub-Agent results +- The sub-Agent receives an independent task description and does not inherit the parent Agent's full conversation history +- Multiple sub-Agents can be invoked in parallel ```go import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" ) -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -type GetWeatherInput struct { - City string `json:"city"` -} - -func NewWeatherAgent() adk.Agent { - weatherTool, err := utils.InferTool( - "get_weather", - "Gets the current weather for a specific city.", - func(ctx context.Context, input *GetWeatherInput) (string, error) { - return 
fmt.Sprintf(`the temperature in %s is 25°C`, input.City), nil - }, - ) - if err != nil { - log.Fatal(err) - } - - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WeatherAgent", - Description: "This agent can get the current weather for a given city.", - Instruction: "Your sole purpose is to get the current weather for a given city by using the 'get_weather' tool. After calling the tool, report the result directly to the user.", - Model: newChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{weatherTool}, - }, - }, - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewChatAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ChatAgent", - Description: "A general-purpose agent for handling conversational chat.", - Instruction: "You are a friendly conversational assistant. Your role is to handle general chit-chat and answer questions that are not related to any specific tool-based tasks.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewRouterAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "RouterAgent", - Description: "A manual router that transfers tasks to other expert agents.", - Instruction: `You are an intelligent task router. Your responsibility is to analyze the user's request and delegate it to the most appropriate expert agent.If no Agent can handle the task, simply inform the user it cannot be processed.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} -``` - -Then use Eino ADK's Transfer capability to build a Multi-Agent and run it. ChatModelAgent implements the OnSubAgent interface. 
In the adk.SetSubAgents method, this interface is used to register parent/child Agents with ChatModelAgent, without requiring users to handle TransferAction generation: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" -) - -func main() { - weatherAgent := NewWeatherAgent() - chatAgent := NewChatAgent() - routerAgent := NewRouterAgent() - - ctx := context.Background() - a, err := adk.SetSubAgents(ctx, routerAgent, []adk.Agent{chatAgent, weatherAgent}) - if err != nil { - log.Fatal(err) - } - - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - - // query weather - println("\n\n>>>>>>>>>query weather<<<<<<<<<") - iter := runner.Query(ctx, "What's the weather in Beijing?") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } - - // failed to route - println("\n\n>>>>>>>>>failed to route<<<<<<<<<") - iter = runner.Query(ctx, "Book me a flight from New York to London tomorrow.") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } -} -``` - -Running result: - -```yaml ->>>>>>>>>query weather<<<<<<<<< -Agent[RouterAgent]: -assistant: -tool_calls: -{Index: ID:call_SKNsPwKCTdp1oHxSlAFt8sO6 Type:function Function:{Name:transfer_to_agent Arguments:{"agent_name":"WeatherAgent"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{201 17 218} -====== -Agent[RouterAgent]: transfer to WeatherAgent 
-====== -Agent[WeatherAgent]: -assistant: -tool_calls: -{Index: ID:call_QMBdUwKj84hKDAwMMX1gOiES Type:function Function:{Name:get_weather Arguments:{"city":"Beijing"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{255 15 270} -====== -Agent[WeatherAgent]: -tool: the temperature in Beijing is 25°C -tool_call_id: call_QMBdUwKj84hKDAwMMX1gOiES -tool_call_name: get_weather -====== -Agent[WeatherAgent]: -assistant: The current temperature in Beijing is 25°C. -finish_reason: stop -usage: &{286 11 297} -====== - ->>>>>>>>>failed to route<<<<<<<<< -Agent[RouterAgent]: -assistant: I'm unable to assist with booking flights. Please use a relevant travel service or booking platform to make your reservation. -finish_reason: stop -usage: &{206 23 229} -====== -``` - -The other two methods of OnSubAgents are called when an Agent acts as a SubAgent in SetSubAgents: - -- OnSetAsSubAgent is used to register parent Agent information with the Agent -- OnDisallowTransferToParent is called when the Agent sets the WithDisallowTransferToParent option, to inform the Agent not to produce TransferAction to the parent Agent. 
- -```go -adk.SetSubAgents( - ctx, - Agent1, - []adk.Agent{ - adk.AgentWithOptions(ctx, Agent2, adk.WithDisallowTransferToParent()), +// Create sub-Agent +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "Search and summarize relevant information", + Instruction: "You are a research assistant...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool}, + }, }, -) +}) + +// Wrap as Tool +agentTool := adk.NewAgentTool(ctx, subAgent) + +// Parent Agent registers sub-Agent Tool +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "coordinator", + Description: "Main Agent for coordinating tasks", + Instruction: "You are a task coordinator...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) ``` -### Static Transfer Configuration - -AgentWithDeterministicTransferTo is an Agent Wrapper that generates a preset TransferAction after the original Agent executes, enabling static configuration of Agent jumping: - -```go -// github.com/cloudwego/eino/adk/flow.go +### AgentTool Options -type DeterministicTransferConfig struct { - Agent Agent - ToAgentNames []string -} - -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent -``` - -In Supervisor mode, after a SubAgent finishes execution, it always returns to the Supervisor, which generates the next task objective. AgentWithDeterministicTransferTo can be used here: + + + + +
+| Option | Description |
+| --- | --- |
+| `WithFullChatHistoryAsInput()` | Pass the parent Agent's full conversation history as the sub-Agent's input (by default only the model-generated request parameters are passed) |
+| `WithAgentInputSchema(schema)` | Custom input schema for the sub-Agent |
    - +### Event Stream Pass-through -```go -// github.com/cloudwego/eino/adk/prebuilt/supervisor.go +When `ToolsConfig.EmitInternalEvents = true`, sub-Agent events are streamed through in real-time to the parent Agent's event stream, allowing end users to see the sub-Agent's intermediate process. -type SupervisorConfig struct { - Supervisor adk.Agent - SubAgents []adk.Agent -} +> 💡 +> Pass-through events do not affect the parent Agent's state or checkpoint; they are only for user display. The sole exception is InterruptAction, which propagates across boundaries via CompositeInterrupt to support interrupt/resume. -func NewSupervisor(ctx context.Context, conf *SupervisorConfig) (adk.Agent, error) { - subAgents := make([]adk.Agent, 0, len(conf.SubAgents)) - supervisorName := conf.Supervisor.Name(ctx) - for _, subAgent := range conf.SubAgents { - subAgents = append(subAgents, adk.AgentWithDeterministicTransferTo(ctx, &adk.DeterministicTransferConfig{ - Agent: subAgent, - ToAgentNames: []string{supervisorName}, - })) - } +### Pre-built Example: DeepAgents - return adk.SetSubAgents(ctx, conf.Supervisor, subAgents) -} -``` +[DeepAgents](/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) is a best-practice example of the AgentAsTool pattern: the main Agent delegates subtasks to sub-Agents via **TaskTool**, combined with **WriteTodos** for task planning and progress tracking. ## Workflow Agents -WorkflowAgent supports running Agents according to workflows preset in code. Eino ADK provides three basic Workflow Agents: Sequential, Parallel, and Loop. They can be nested within each other to complete more complex tasks. - -By default, the input for each Agent in a Workflow is generated using the method described in the History section. You can customize the AgentInput generation method using WithHistoryRewriter. 
- -When an Agent produces an ExitAction Event, the Workflow Agent will immediately exit, regardless of whether there are other Agents that need to run afterward. - -For detailed explanations and use case references, see: [Eino ADK: Workflow Agents](/docs/eino/core_modules/eino_adk/agent_implementation/workflow) - -### SequentialAgent - -SequentialAgent executes a series of Agents in the order you provide: - - - -```go -type SequentialAgentConfig struct { - Name string - Description string - SubAgents []Agent -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -### LoopAgent - -LoopAgent is implemented based on SequentialAgent. After SequentialAgent completes, it runs from the beginning again: - - - -```go -type LoopAgentConfig struct { - Name string - Description string - SubAgents []Agent - - MaxIterations int // Maximum number of loop iterations -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -### ParallelAgent - -ParallelAgent runs multiple Agents concurrently: +Deterministic orchestration for multi-step tasks with fixed flows: - + + + + + +
+| Type | Description | Constructor |
+| --- | --- | --- |
+| Sequential | Executes sub-Agents sequentially in array order | `adk.NewSequentialAgent` |
+| Parallel | Executes all sub-Agents concurrently; completes when all finish | `adk.NewParallelAgent` |
+| Loop | Loops through the sub-Agent sequence until BreakLoop or MaxIterations is exceeded | `adk.NewLoopAgent` |
    -```go -type ParallelAgentConfig struct { - Name string - Description string - SubAgents []Agent -} +Workflow Agents pass context between each other via Transfer: the output of an upstream Agent is automatically appended to the downstream Agent's input Messages. -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` +# Context Passing -## AgentAsTool +## SessionValues -When running an Agent requires only clear and explicit instructions rather than a complete running context (History), the Agent can be converted to a Tool for invocation: +A global KV store across Agents; any Agent within a single run can read and write concurrency-safely: ```go -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool +// Read/Write API +adk.AddSessionValue(ctx, "key", value) +val, ok := adk.GetSessionValue(ctx, "key") +adk.AddSessionValues(ctx, map[string]any{"k1": v1, "k2": v2}) +all := adk.GetSessionValues(ctx) ``` -After converting to a Tool, the Agent can be called by ChatModels that support function calling, and can also be called by all LLM-driven Agents. The calling method depends on the Agent implementation. - -Message history isolation: An Agent as a Tool does not inherit the message history (History) of the parent Agent. - -SessionValues sharing: However, it shares the SessionValues of the parent Agent, i.e., reads and writes the same KV map. - -Internal event exposure: An Agent as a Tool is still an Agent and produces AgentEvents. By default, these internal AgentEvents are not exposed through the `AsyncIterator` returned by `Runner`. In some business scenarios, if you need to expose the internal AgentTool's AgentEvents to users, you need to add configuration in the parent `ChatModelAgent`'s `ToolsConfig` to enable internal event exposure: +> 💡 +> SessionValues are implemented based on Context, and Runner reinitializes the Context when running. 
If you need to inject data before a run, use the `WithSessionValues` Option: ```go -// from adk/chatmodel.go - -type ToolsConfig struct { - // other configurations... - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} +iter := runner.Run(ctx, messages, + adk.WithSessionValues(map[string]any{ + "user_id": "123", + }), +) ``` - -These internal events will not enter the parent agent's context (except for the last message which would enter anyway), and various AgentActions will not take effect (except InterruptAction). diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_extension.md b/content/en/docs/eino/core_modules/eino_adk/agent_extension.md index 2532cbd00ec..d7511af4e7b 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_extension.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_extension.md @@ -1,118 +1,133 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Agent Runner and Extension' +title: Agent Runner and Extension weight: 6 --- -# Agent Runner +# Runner -## Definition +Runner is the execution entry point for Agents, responsible for managing Agent lifecycle, context initialization, Checkpoint persistence, and interrupt recovery. **Any Agent should be run through Runner.** -Runner is the core engine in Eino ADK responsible for executing Agents. Its main purpose is to manage and control the entire lifecycle of Agents, such as handling multi-Agent collaboration, saving and passing context, etc. Cross-cutting capabilities like interrupt, callback, etc. all rely on Runner for implementation. Any Agent should be run through Runner. +## Basic Usage -## Interrupt & Resume - -Agent Runner provides runtime interrupt and resume functionality. 
This allows a running Agent to proactively interrupt its execution and save the current state, supporting resumption from the interrupt point. This functionality is commonly used in scenarios where the Agent processing flow requires external input, long waits, or pausable operations.
-
-Below we introduce three key points in an interrupt-to-resume process:
+```go
+import (
+	"github.com/cloudwego/eino/adk"
+	"github.com/cloudwego/eino/schema"
+)
+
+// Create Runner
+runner := adk.NewRunner(ctx, adk.RunnerConfig{
+	Agent:           agent,
+	EnableStreaming: true,
+	CheckPointStore: store, // Optional, required for interrupt recovery
+})
+
+// Method 1: Query sends a single user question directly
+iter := runner.Query(ctx, "Help me search today's news")
+
+// Method 2: Run passes in complete Messages (an alternative to Query, not a follow-up call)
+iter = runner.Run(ctx, []*schema.Message{
+	schema.UserMessage("Hello"),
+}, adk.WithSessionValues(map[string]any{"user": "alice"}))
+
+// Consume the event stream
+for {
+	event, ok := iter.Next()
+	if !ok {
+		break
+	}
+	if event.Err != nil {
+		// Errors are delivered through the event stream
+		break
+	}
+	// Process event
+}
+```

-1. Interrupted Action: Thrown by the Agent as an interrupt event, intercepted by Agent Runner
-2. Checkpoint: Agent Runner intercepts the event and saves the current running state
-3. Resume: After running conditions are ready again, Agent Runner resumes running from the checkpoint
+## Generics Support

-### Interrupted Action
+```go
+type TypedRunner[M MessageType] struct { ... }
+type Runner = TypedRunner[*schema.Message]

-During the Agent's execution, you can proactively interrupt the Runner's operation by producing an AgentEvent containing an Interrupted Action.
+func NewTypedRunner[M MessageType](conf TypedRunnerConfig[M]) *TypedRunner[M]
+```

-When the Event's Interrupted is not empty, the Agent Runner considers an interrupt to have occurred:
+The `*schema.AgenticMessage` path uses `NewTypedRunner` for construction.
-```go -// github.com/cloudwego/eino/adk/interface.go -type AgentAction struct { - // other actions - Interrupted *InterruptInfo - // other actions -} +## Interrupt & Resume -// github.com/cloudwego/eino/adk/interrupt.go -type InterruptInfo struct { - Data any -} -``` +An Agent can proactively interrupt during execution. Runner automatically saves state (requires `CheckPointStore` configured), and can later resume from the breakpoint. -When an interrupt occurs, you can attach custom interrupt information through the InterruptInfo structure. This information: +### Interrupt -1. Will be passed to the caller, which can be used to explain the reason for the interrupt, etc. -2. If the Agent run needs to be resumed later, the InterruptInfo will be re-passed to the interrupted Agent upon resumption, and the Agent can use this information to resume running +An Agent produces an event containing `Interrupted` to trigger an interrupt: ```go -// For example, when ChatModelAgent interrupts, it sends the following AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, +gen.Send(&adk.AgentEvent{ + Action: &adk.AgentAction{ + Interrupted: &adk.InterruptInfo{Data: myData}, }, -}}) +}) ``` -### State Persistence (Checkpoint) - -When Runner captures this Event with Interrupted Action, it immediately terminates the current execution flow. If: +### State Persistence -1. 
CheckPointStore is set in Runner +After Runner captures an interrupt, it stores the running state (input, conversation history, InterruptInfo) in the `CheckPointStore` using the CheckPointID as key: ```go -// github.com/cloudwego/eino/adk/runner.go -type RunnerConfig struct { - // other fields - CheckPointStore CheckPointStore -} - -// github.com/cloudwego/eino/adk/interrupt.go type CheckPointStore interface { Set(ctx context.Context, key string, value []byte) error Get(ctx context.Context, key string) ([]byte, bool, error) } ``` -1. CheckPointID is passed via AgentRunOption WithCheckPointID when calling Runner +Pass the CheckPointID via Option when calling: ```go -// github.com/cloudwego/eino/adk/interrupt.go -func WithCheckPointID(id string) AgentRunOption +iter := runner.Run(ctx, messages, adk.WithCheckPointID("cp-123")) ``` -After terminating running, Runner persists the current running state (original input, conversation history, etc.) and the InterruptInfo thrown by the Agent to CheckPointStore using CheckPointID as the key. - > 💡 -> To preserve the original types of data in interfaces, Eino ADK uses gob ([https://pkg.go.dev/encoding/gob](https://pkg.go.dev/encoding/gob)) to serialize running state. Therefore, when using custom types, you need to register the types in advance using gob.Register or gob.RegisterName (the latter is more recommended; the former uses path plus type name as the default name, so both the type's location and name cannot change). Eino automatically registers types built into the framework. +> ADK uses gob to serialize running state. Custom types need to be registered in advance with gob.RegisterName. Framework built-in types are automatically registered. 
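As a rough illustration of the gob note above, the sketch below round-trips a custom type stored in an interface-typed field. `ApprovalRequest`, `checkpoint`, and the registered name are hypothetical; only `encoding/gob` itself is real. Registering under a fixed name means the encoded form does not depend on the type's package path:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// ApprovalRequest is a hypothetical custom type carried as interrupt data.
type ApprovalRequest struct {
	Tool   string
	Reason string
}

// checkpoint is a stand-in for persisted state; Data is interface-typed,
// so gob must know the concrete type before encoding or decoding it.
type checkpoint struct {
	Data any
}

func init() {
	// A stable, explicit name survives moving or renaming the Go type.
	gob.RegisterName("example.ApprovalRequest", ApprovalRequest{})
}

// encode serializes a checkpoint into bytes suitable for a key-value store.
func encode(cp checkpoint) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(cp); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode restores a checkpoint, including the concrete type behind Data.
func decode(b []byte) (checkpoint, error) {
	var cp checkpoint
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&cp)
	return cp, err
}

func main() {
	b, err := encode(checkpoint{Data: ApprovalRequest{Tool: "search", Reason: "needs approval"}})
	if err != nil {
		panic(err)
	}
	cp, err := decode(b)
	if err != nil {
		panic(err)
	}
	req, ok := cp.Data.(ApprovalRequest)
	fmt.Println(ok, req.Tool)
}
```

Without the registration call, encoding the interface value would fail with an error about an unregistered type.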
### Resume -When running is interrupted, calling Runner's Resume interface with the CheckPointID from the interrupt can resume running: - ```go -// github.com/cloudwego/eino/adk/runner.go -func (r *Runner) Resume(ctx context.Context, checkPointID string, opts ...AgentRunOption) (*AsyncIterator[*AgentEvent], error) +// Simple resume: implicitly resumes all interrupt points +iter, err := runner.Resume(ctx, "cp-123") + +// Precise resume: specify targets and data (reassigns iter and err; use one form or the other) +iter, err = runner.ResumeWithParams(ctx, "cp-123", &adk.ResumeParams{ + Targets: map[string]any{ + "agent-address": resumeData, + }, +}) ``` -Resuming Agent running requires the interrupted Agent to implement the ResumableAgent interface. Runner reads the running state from CheckPointerStore and resumes running, where the InterruptInfo and the EnableStreaming configured in the previous run are provided as input to the Agent: +Resumption requires the interrupted Agent to implement the `ResumableAgent` interface: ```go -// github.com/cloudwego/eino/adk/interface.go -type ResumableAgent interface { - Agent - - Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] -} - -// github.com/cloudwego/eino/adk/interrupt.go -type ResumeInfo struct { - EnableStreaming bool - *InterruptInfo +type TypedResumableAgent[M MessageType] interface { + TypedAgent[M] + Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] } ``` -To pass new information to the Agent during Resume, you can define an AgentRunOption and pass it when calling Runner.Resume. 
+# Multi-Turn Runtime: TurnLoop + +For scenarios requiring multi-turn interaction (chat applications, continuous conversations), ADK provides the `TurnLoop` runtime: + +- **Push-based event loop**: Push new messages to trigger Agent execution +- **Preempt**: When the user sends a new message while the Agent is running, the current run can be cancelled +- **Stop**: Stop the event loop +- **Declarative Checkpoint/Resume**: TurnLoop automatically manages input bookkeeping; the application layer only needs to declare the recovery strategy + +See: [Agent Cancel and TurnLoop Quickstart](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +# Agent Cancel + +Runtime cancellation capability added in v0.9, supporting: + +- **CancelMode bitmask combination**: `CancelModelStream | CancelToolCalls` +- **CancelHandle.Wait()**: Wait for cancellation to complete +- **Integration with TurnLoop**: Automatically triggers Cancel on Preempt + +See: [Agent Cancel and TurnLoop Quickstart](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md deleted file mode 100644 index bfd09a76eb8..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md +++ /dev/null @@ -1,897 +0,0 @@ ---- -Description: "" -date: "2026-03-16" -lastmod: "" -tags: [] -title: 'Eino ADK: ChatModelAgent' -weight: 1 ---- - -# ChatModelAgent Overview - -## Import Path - -`import "github.com/cloudwego/eino/adk"` - -## What is ChatModelAgent - -`ChatModelAgent` is a core prebuilt Agent in Eino ADK that encapsulates the complex logic of interacting with Large Language Models (LLMs) and supports using tools to complete tasks. 
- -## ChatModelAgent ReAct Pattern - -`ChatModelAgent` uses the [ReAct](https://react-lm.github.io/) pattern internally, which is designed to solve complex problems by having the ChatModel perform explicit, step-by-step "thinking". After configuring tools for `ChatModelAgent`, its internal execution flow follows the ReAct pattern: - -- Call ChatModel (Reason) -- LLM returns tool call request (Action) -- ChatModelAgent executes tool (Act) -- It returns the tool result to ChatModel (Observation), then starts a new cycle until ChatModel determines no Tool call is needed and ends. - -When no tools are configured, `ChatModelAgent` degrades to a single ChatModel call. - - - -You can configure Tools for ChatModelAgent through ToolsConfig: - -```go -// github.com/cloudwego/eino/adk/chatmodel.go - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. - ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} -``` - -ToolsConfig reuses Eino Graph ToolsNodeConfig, see [Eino: ToolsNode & Tool Usage Guide](/docs/eino/core_modules/components/tools_node_guide) for details. Additionally, it provides the ReturnDirectly configuration. ChatModelAgent will exit directly after calling a Tool configured in ReturnDirectly. - -## ChatModelAgent Configuration Fields - -> 💡 -> Note: GenModelInput by default renders the Instruction in F-String format using adk.GetSessionValues(). To disable this behavior, customize the GenModelInput method. - -```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. 
- // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int - - // ModelRetryConfig configures retry behavior for the ChatModel. - // When set, the agent will automatically retry failed ChatModel calls - // based on the configured policy. - // Optional. If nil, no retry will be performed. - ModelRetryConfig *ModelRetryConfig -} - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. 
- ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} - -type GenModelInput func(ctx context.Context, instruction string, input *AgentInput) ([]Message, error) -``` - -- `Name`: Agent name -- `Description`: Agent description -- `Instruction`: System Prompt when calling ChatModel, supports f-string rendering -- `Model`: ChatModel used for running, must support tool calling -- `ToolsConfig`: Tool configuration - - ToolsConfig reuses Eino Graph ToolsNodeConfig, see [Eino: ToolsNode & Tool Usage Guide](/docs/eino/core_modules/components/tools_node_guide) for details. - - ReturnDirectly: When ChatModelAgent calls a Tool configured in ReturnDirectly, it will immediately exit with the result, without returning to ChatModel per the react pattern. If multiple Tools are hit, only the first Tool is returned. Map key is the Tool name. - - EmitInternalEvents: When using adk.AgentTool() to treat an Agent as a SubAgent through ToolCall, by default, this SubAgent will not send AgentEvents, only returning the final result as ToolResult. -- `GenModelInput`: When the Agent is called, it uses this method to convert `Instruction` and `AgentInput` into Messages for calling ChatModel. The Agent provides a default GenModelInput method: - 1. Add `Instruction` as `System Message` before `AgentInput.Messages` - 2. Render `SessionValues` as variables into the message list from step 1 - -> 💡 -> The default `GenModelInput` uses pyfmt rendering. Text in the message list is treated as a pyfmt template, meaning '{' and '}' in the text are treated as keywords. If you want to input these two characters directly, they need to be escaped as '{{' and '}}'. 
- -- `OutputKey`: When configured, the last Message produced by ChatModelAgent running will be set in `SessionValues` with `OutputKey` as the key -- `MaxIterations`: Maximum number of ChatModel generations in react mode. Agent will exit with error when exceeded. Default value is 20 -- `Exit`: Exit is a special Tool. When the model calls this tool and executes it, ChatModelAgent will exit directly, with an effect similar to `ToolsConfig.ReturnDirectly`. ADK provides a default ExitTool implementation for users: - -```go -type ExitTool struct{} - -func (et ExitTool) Info(_ context.Context) (*schema.ToolInfo, error) { - return ToolInfoExit, nil -} - -func (et ExitTool) InvokableRun(ctx context.Context, argumentsInJSON string, _ ...tool.Option) (string, error) { - type exitParams struct { - FinalResult string `json:"final_result"` - } - - params := &exitParams{} - err := sonic.UnmarshalString(argumentsInJSON, params) - if err != nil { - return "", err - } - - err = SendToolGenAction(ctx, "exit", NewExitAction()) - if err != nil { - return "", err - } - - return params.FinalResult, nil -} -``` - -- `ModelRetryConfig`: When configured, various errors during ChatModel request (including direct errors and errors during streaming response) will be retried according to the configured policy. If an error occurs during streaming response, the streaming response will still be returned through AgentEvent immediately. If the error during streaming response will be retried according to the configured policy, consuming the message stream in AgentEvent will get `WillRetryError`. Users can handle this error for corresponding display processing. 
Example: - -```go -iterator := agent.Run(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - - if event.Err != nil { - handleFinalError(event.Err) - break - } - - // Process streaming output - if event.Output != nil && event.Output.MessageOutput.IsStreaming { - stream := event.Output.MessageOutput.MessageStream - for { - msg, err := stream.Recv() - if err == io.EOF { - break // Stream completed successfully - } - if err != nil { - // Check if this error will be retried (more streams coming) - var willRetry *adk.WillRetryError - if errors.As(err, &willRetry) { - log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt) - break // Wait for next event with new stream - } - // Original error - won't retry, agent will stop and the next AgentEvent probably will be an error - log.Printf("Final error (no retry): %v", err) - break - } - // Display chunk to user - displayChunk(msg) - } - } -} -``` - -## ChatModelAgent Transfer - -`ChatModelAgent` supports converting other Agents' meta information into its own Tools, achieving dynamic Transfer through ChatModel judgment: - -- `ChatModelAgent` implements the `OnSubAgents` interface. After using `SetSubAgents` to set sub Agents for `ChatModelAgent`, `ChatModelAgent` will add a `Transfer Tool` and instruct ChatModel in the prompt to call this Tool when transfer is needed, using the transfer target AgentName as Tool input. 
- -```go -const ( - TransferToAgentInstruction = `Available other agents: %s - -Decision rule: -- If you're best suited for the question according to your description: ANSWER -- If another agent is better according its description: CALL '%s' function with their agent name - -When transferring: OUTPUT ONLY THE FUNCTION CALL` -) - -func genTransferToAgentInstruction(ctx context.Context, agents []Agent) string { - var sb strings.Builder - for _, agent := range agents { - sb.WriteString(fmt.Sprintf("\n- Agent name: %s\n Agent description: %s", - agent.Name(ctx), agent.Description(ctx))) - } - - return fmt.Sprintf(TransferToAgentInstruction, sb.String(), TransferToAgentToolName) -} -``` - -- `Transfer Tool` running sets a Transfer Event, specifying the jump to the target Agent, and ChatModelAgent exits after completion. -- Agent Runner receives the Transfer Event and jumps to the target Agent for execution, completing the Transfer operation - -## ChatModelAgent AgentAsTool - -When the Agent being called doesn't need a complete running context but only clear and explicit input parameters to run correctly, the Agent can be converted to a Tool for `ChatModelAgent` to judge and call: - -- ADK provides utility methods to conveniently convert Eino ADK Agents to Tools for ChatModelAgent to call: - -```go -// github.com/cloudwego/eino/adk/agent_tool.go - -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool -``` - -- Agents converted to Tools can be registered directly in ChatModelAgent through `ToolsConfig` - -```go -bookRecommender := NewBookRecommendAgent() -bookRecommendeTool := NewAgentTool(ctx, bookRecommender) - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... 
- ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{bookRecommendeTool}, - }, - }, -}) -``` - -## ChatModelAgent Middleware - -`ChatModelAgentMiddleware` is an extension mechanism for `ChatModelAgent` that allows developers to inject custom logic at various stages of Agent execution: - - - -`ChatModelAgentMiddleware` is defined as an interface. Developers can implement this interface and configure it in `ChatModelAgentConfig` to make it effective in `ChatModelAgent`: - -```go -type ChatModelAgentMiddleware interface { - // ... -} - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... - Handlers: []adk.ChatModelAgentMiddleware{ - &MyMiddleware{}, - }, -}) -``` - -**Using BaseChatModelAgentMiddleware** - -`BaseChatModelAgentMiddleware` provides default empty implementations for all methods. By embedding it, you can override only the methods you need: - -```go -type MyMiddleware struct { - *adk.BaseChatModelAgentMiddleware - // Custom fields - logger *log.Logger -} - -// Only override the methods you need -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - m.logger.Printf("Messages count: %d", len(state.Messages)) - return ctx, state, nil -} -``` - -### BeforeAgent - -Called before each Agent run, can be used to modify instructions and tool configuration. ChatModelAgentContext defines the content that can be read and written in BeforeAgent: - -```go -type ChatModelAgentContext struct { - // Instruction is the current Agent's instruction - Instruction string - // Tools is the current configured original tool list - Tools []tool.BaseTool - // ReturnDirectly configures tool name sets that return directly after being called - ReturnDirectly map[string]bool -} - -type ChatModelAgentMiddleware interface { - // ... 
- BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // ... -} -``` - -Example: - -```go -func (m *MyMiddleware) BeforeAgent( - ctx context.Context, - runCtx *adk.ChatModelAgentContext, -) (context.Context, *adk.ChatModelAgentContext, error) { - // Copy runCtx to avoid modifying input - nRunCtx := *runCtx - - // Modify instruction - nRunCtx.Instruction += "\n\nPlease always reply in Chinese." - - // Add tool - nRunCtx.Tools = append(runCtx.Tools, myCustomTool) - - // Set tool to return directly - nRunCtx.ReturnDirectly["my_tool"] = true - - return ctx, &nRunCtx, nil -} -``` - -### BeforeModelRewriteState / AfterModelRewriteState - -Called before/after each model call, can be used to inspect and modify message history. ModelContext defines read-only content, ChatModelAgentState defines read-write content: - -```go -type ModelContext struct { - // Tools contains the list of tools currently configured for the Agent - // Populated at request time, contains tool info that will be sent to the model - Tools []*schema.ToolInfo - - // ModelRetryConfig contains the retry configuration for the model - // Populated from Agent's ModelRetryConfig - ModelRetryConfig *ModelRetryConfig -} - -type ChatModelAgentState struct { - // Messages contains all messages in the current session - Messages []Message -} - -type ChatModelAgentMiddleware interface { - BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // Copy state to avoid modifying input - nState := *state - - // Check message 
history - if len(state.Messages) > 50 { - // Truncate old messages - nState.Messages = state.Messages[len(state.Messages)-50:] - } - return ctx, &nState, nil -} - -func (m *MyMiddleware) AfterModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // Model response is the last message - lastMsg := state.Messages[len(state.Messages)-1] - m.logger.Printf("Model response: %s", lastMsg.Content) - return ctx, state, nil -} -``` - -### WrapModel - -Wraps model calls, can be used to intercept and modify model input and output: - -```go -type ChatModelAgentMiddleware interface { - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) WrapModel( - ctx context.Context, - chatModel model.BaseChatModel, - mc *adk.ModelContext, -) (model.BaseChatModel, error) { - return &loggingModel{ - inner: chatModel, - logger: m.logger, - }, nil -} - -type loggingModel struct { - inner model.BaseChatModel - logger *log.Logger -} - -func (m *loggingModel) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - m.logger.Printf("Input messages: %d", len(msgs)) - resp, err := m.inner.Generate(ctx, msgs, opts...) - m.logger.Printf("Output: %v, error: %v", resp != nil, err) - return resp, err -} - -func (m *loggingModel) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return m.inner.Stream(ctx, msgs, opts...) -} -``` - -### WrapInvokableToolCall / WrapStreamableToolCall - -Wraps tool calls, can be used to intercept and modify tool input and output: - -```go -// InvokableToolCallEndpoint is the function signature for tool calls. -// Middleware developers add custom logic around this Endpoint. 
-type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) - -// StreamableToolCallEndpoint is the function signature for streaming tool calls. -// Middleware developers add custom logic around this Endpoint. -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - -type ToolContext struct { - // Name indicates the name of the tool being called - Name string - // CallID indicates the ToolCallID of this tool call - CallID string -} - -type ChatModelAgentMiddleware interface { - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) - WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) WrapInvokableToolCall( - ctx context.Context, - endpoint adk.InvokableToolCallEndpoint, - tCtx *adk.ToolContext, -) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - m.logger.Printf("Calling tool: %s (ID: %s)", tCtx.Name, tCtx.CallID) - start := time.Now() - - result, err := endpoint(ctx, argumentsInJSON, opts...) - - m.logger.Printf("Tool %s completed in %v", tCtx.Name, time.Since(start)) - return result, err - }, nil -} -``` - -# ChatModelAgent Usage Example - -## Scenario Description - -Create a book recommendation Agent that can recommend relevant books based on user input. - -## Code Implementation - -### Step 1: Define Tools - -The book recommendation Agent needs a `book_search` tool that can search for books based on user requirements (genre, rating, etc.). 
- -Using utility methods provided by Eino makes it easy to create (see [How to create a tool?](/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool)): - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" -) - -type BookSearchInput struct { - Genre string `json:"genre" jsonschema:"description=Preferred book genre,enum=fiction,enum=sci-fi,enum=mystery,enum=biography,enum=business"` - MaxPages int `json:"max_pages" jsonschema:"description=Maximum page length (0 for no limit)"` - MinRating int `json:"min_rating" jsonschema:"description=Minimum user rating (0-5 scale)"` -} - -type BookSearchOutput struct { - Books []string -} - -func NewBookRecommender() tool.InvokableTool { - bookSearchTool, err := utils.InferTool("search_book", "Search books based on user preferences", func(ctx context.Context, input *BookSearchInput) (output *BookSearchOutput, err error) { - // search code - // ... - return &BookSearchOutput{Books: []string{"God's blessing on this wonderful world!"}}, nil - }) - if err != nil { - log.Fatalf("failed to create search book tool: %v", err) - } - return bookSearchTool -} -``` - -### Step 2: Create ChatModel - -Eino provides various ChatModel wrappers (such as openai, gemini, doubao, etc., see [Eino: ChatModel Usage Guide](/docs/eino/core_modules/components/chat_model_guide) for details). 
Here we use openai ChatModel as an example: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/components/model" -) - -func NewChatModel() model.ToolCallingChatModel { - ctx := context.Background() - apiKey := os.Getenv("OPENAI_API_KEY") - openaiModel := os.Getenv("OPENAI_MODEL") - - cm, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: apiKey, - Model: openaiModel, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - return cm -} -``` - -### Step 3: Create ChatModelAgent - -In addition to configuring ChatModel and tools, you need to configure Name and Description describing the Agent's function and purpose, as well as the Instruction that instructs the ChatModel. The Instruction will ultimately be passed to ChatModel as a system message. - -```go -import ( - "context" - "fmt" - "log" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/compose" -) - -func NewBookRecommendAgent() adk.Agent { - ctx := context.Background() - - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "BookRecommender", - Description: "An agent that can recommend books", - Instruction: `You are an expert book recommender. Based on the user's request, use the "search_book" tool to find relevant books. 
Finally, present the results to the user.`, - Model: NewChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender()}, - }, - }, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - - return a -} -``` - -### - -### Step 4: Run via Runner - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" - - "github.com/cloudwego/eino-examples/adk/intro/chatmodel/internal" -) - -func main() { - ctx := context.Background() - a := internal.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - iter := runner.Query(ctx, "recommend a fiction book to me") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======", msg) - } -} -``` - -## Running Result - -```yaml -message: -assistant: -tool_calls: -{Index: ID:call_o2It087hoqj8L7atzr70EnfG Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{140 24 164} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_o2It087hoqj8L7atzr70EnfG -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's blessing on this wonderful world!". It's a great choice for readers looking for an exciting story. Enjoy your reading! -finish_reason: stop -usage: &{185 31 216} -====== -``` - -# ChatModelAgent Interrupt and Resume - -## Introduction - -`ChatModelAgent` is implemented using Eino Graph, so it can reuse Eino Graph's Interrupt&Resume capability in the agent. - -- On Interrupt, return a special error in the tool to make the Graph trigger an interrupt and throw custom information. 
On resume, the Graph will re-run this tool: - -```go -// github.com/cloudwego/eino/adk/interrupt.go - -func NewInterruptAndRerunErr(extra any) error -``` - -- On Resume, custom ToolOptions are supported for passing additional information to the Tool during resume: - -```go -import ( - "github.com/cloudwego/eino/components/tool" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} -``` - -## Example - -Below we will build on the code from the [ChatModelAgent Usage Example] section above to add a tool `ask_for_clarification` to `BookRecommendAgent`. When the user provides insufficient information for recommendations, the Agent will call this tool to ask the user for more information. `ask_for_clarification` uses the Interrupt&Resume capability to implement "asking" the user. - -### Step 1: Add Tool Supporting Interrupt - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" - "github.com/cloudwego/eino/compose" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} - -type AskForClarificationInput struct { - Question string `json:"question" jsonschema:"description=The specific question you want to ask the user to get the missing information"` -} - -func NewAskForClarificationTool() tool.InvokableTool { - t, err := utils.InferOptionableTool( - "ask_for_clarification", - "Call this tool when the user's request is ambiguous or lacks the necessary information to proceed. 
Use it to ask a follow-up question to get the details you need, such as the book's genre, before you can use other tools effectively.", - func(ctx context.Context, input *AskForClarificationInput, opts ...tool.Option) (output string, err error) { - o := tool.GetImplSpecificOptions[askForClarificationOptions](nil, opts...) - if o.NewInput == nil { - return "", compose.NewInterruptAndRerunErr(input.Question) - } - return *o.NewInput, nil - }) - if err != nil { - log.Fatal(err) - } - return t -} -``` - -### Step 2: Add Tool to Agent - -```go -func NewBookRecommendAgent() adk.Agent { - // xxx - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // xxx - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender(), NewAskForClarificationTool()}, - }, - // Whether to output AgentEvents from SubAgent when Tool internally calls SubAgent via AgentTool() - EmitInternalEvents: true, - }, - }) - // xxx -} -``` - -### Step 3: Configure CheckPointStore in Agent Runner - -Configure `CheckPointStore` in Runner (the example uses the simplest InMemoryStore), and pass in `CheckPointID` when calling the Agent for use during resume. 
Also, on interrupt, Graph places `InterruptInfo` in `Interrupted.Data`: - -```go -func newInMemoryStore() compose.CheckPointStore { - return &inMemoryStore{ - mem: map[string][]byte{}, - } -} - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - EnableStreaming: true, // you can disable streaming here - Agent: a, - CheckPointStore: newInMemoryStore(), - }) - iter := runner.Query(ctx, "recommend a book to me", adk.WithCheckPointID("1")) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil && event.Action.Interrupted != nil { - fmt.Printf("\ninterrupt happened, info: %+v\n", event.Action.Interrupted.Data.(*adk.ChatModelAgentInterruptInfo).RerunNodesExtra["ToolNode"]) - continue - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======\n\n", msg) - } - - scanner := bufio.NewScanner(os.Stdin) - fmt.Print("\nyour input here: ") - scanner.Scan() - fmt.Println() - nInput := scanner.Text() - - iter, err := runner.Resume(ctx, "1", adk.WithToolOptions([]tool.Option{subagents.WithNewInput(nInput)})) - if err != nil { - log.Fatal(err) - } - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - prints.Event(event) - } -} -``` - -### Running Result - -An interrupt will occur after running - -``` -message: -assistant: -tool_calls: -{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{219 37 256} -====== - - -interrupt happened, info: &{ToolCalls:[{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification 
Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]}] ExecutedTools:map[] RerunTools:[call_3HAobzkJvW3JsTmSHSBRftaG] RerunExtraMap:map[call_3HAobzkJvW3JsTmSHSBRftaG:Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?]} -your input here: -``` - -After stdin input, retrieve the previous interrupt state from CheckPointStore and continue running with the completed input - -``` -new input is: -recommend me a fiction book - -message: -tool: recommend me a fiction book -tool_call_id: call_3HAobzkJvW3JsTmSHSBRftaG -tool_call_name: ask_for_clarification -====== - - -message: -assistant: -tool_calls: -{Index: ID:call_3fC5OqPZLls11epXMv7sZGAF Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{272 24 296} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_3fC5OqPZLls11epXMv7sZGAF -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's Blessing on This Wonderful World!" Enjoy your reading! -finish_reason: stop -usage: &{317 20 337} -====== -``` - -# Summary - -`ChatModelAgent` is the core Agent implementation in ADK, serving as the "thinking" part of applications. It leverages the powerful capabilities of LLMs for reasoning, understanding natural language, making decisions, generating responses, and interacting with tools. - -`ChatModelAgent`'s behavior is non-deterministic, dynamically deciding which tools to use or transferring control to other Agents through LLM. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md new file mode 100644 index 00000000000..735e6b87e2f --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md @@ -0,0 +1,306 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: ChatModelAgent +weight: 1 +--- + +# ChatModelAgent Overview + +`import "github.com/cloudwego/eino/adk"` + +## What is ChatModelAgent + +`ChatModelAgent` is the core Agent implementation of Eino ADK — it uses a ChatModel as the decision maker, Tools as the action space, and autonomously drives problem-solving through a ReAct Loop. + +For a complete introduction to ChatModelAgent concepts, ReAct Loop, and Middleware system, see: [ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) + +## ReAct Loop + +When Tools are configured, ChatModelAgent executes in a ReAct loop: + +1. **Reason**: Calls the ChatModel, which decides the next action +2. **Action**: The model returns a ToolCall request +3. **Act**: Executes the corresponding Tool +4. **Observation**: Injects the Tool result into the context and starts a new loop iteration + +The loop continues until the model determines no further Tool calls are needed. Without Tools configured, it degrades to a single ChatModel call. + +# Configuration + +## TypedChatModelAgentConfig + +```go +type TypedChatModelAgentConfig[M MessageType] struct { + Name string + Description string + Instruction string + + Model model.BaseModel[M] // Required. 
Must support model.WithTools when using Tools + + ToolsConfig ToolsConfig + GenModelInput TypedGenModelInput[M] + + Exit tool.BaseTool // NOT RECOMMENDED + OutputKey string // NOT RECOMMENDED + MaxIterations int // Default 20 + + Handlers []TypedChatModelAgentMiddleware[M] + Middlewares []AgentMiddleware // Legacy compatibility + + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +// Default alias +type ChatModelAgentConfig = TypedChatModelAgentConfig[*schema.Message] +``` + +### Field Descriptions + + + + + + + + + + + + + + +
+| Field | Description |
+| --- | --- |
+| `Name` | Agent name. Required when used as an AgentTool |
+| `Description` | Agent capability description. Required when used as an AgentTool |
+| `Instruction` | System Prompt. Supports `{Key}` placeholders; default `GenModelInput` renders with SessionValues |
+| `Model` | Required. `model.BaseModel[M]` type; must support `model.WithTools` when using Tools |
+| `ToolsConfig` | Tool configuration, see below for details |
+| `GenModelInput` | Custom input transformation. Default uses Instruction as System Message + f-string rendering |
+| `MaxIterations` | Maximum ReAct loop iterations; exits with error when exceeded. Default 20 |
+| `Handlers` | Interface-based Middleware (`TypedChatModelAgentMiddleware[M]`), recommended |
+| `Middlewares` | Struct-based Middleware (`AgentMiddleware`), legacy compatibility |
+| `ModelRetryConfig` | Retry strategy for failed model calls |
+| `ModelFailoverConfig` | Switches to a backup model on model call failure. Requires configuring `GetFailoverModel` and `ShouldFailover` |
    + +> 💡 +> The default GenModelInput uses pyfmt rendering. `{` and `}` in Messages are treated as placeholders. To output these characters literally, escape them with `{{` and `}}`. + +### ToolsConfig + +```go +type ToolsConfig struct { + compose.ToolsNodeConfig + + ReturnDirectly map[string]bool // Tool names that return directly after execution + EmitInternalEvents bool // Forward AgentTool internal events +} +``` + +- **ReturnDirectly**: When a matching Tool is executed, the Agent exits immediately without calling the model again. If multiple match, the first is used +- **EmitInternalEvents**: When a sub-Agent is called via AgentTool, its events are forwarded in real-time to the parent Agent's event stream + +### Constructor Functions + +```go +func NewChatModelAgent(ctx context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) +func NewTypedChatModelAgent[M MessageType](ctx context.Context, config *TypedChatModelAgentConfig[M]) (*TypedChatModelAgent[M], error) +``` + +# Middleware (ChatModelAgentMiddleware) + +## Interface Definition + +```go +type TypedChatModelAgentMiddleware[M MessageType] interface { + BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + AfterAgent(ctx context.Context, state *TypedChatModelAgentState[M]) (context.Context, error) + + BeforeModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + AfterModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + + WrapModel(ctx context.Context, m model.BaseModel[M], mc *TypedModelContext[M]) (model.BaseModel[M], error) + + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + WrapStreamableToolCall(ctx context.Context, endpoint 
StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) +} + +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +``` + +Use `*BaseChatModelAgentMiddleware` embedding to only override the methods you need: + +```go +type MyMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *MyMiddleware) BeforeModelRewriteState( + ctx context.Context, + state *adk.ChatModelAgentState, + mc *adk.ModelContext, +) (context.Context, *adk.ChatModelAgentState, error) { + // Custom logic + return ctx, state, nil +} +``` + +## Hook Points + + + + + + + + + +
+| Hook | Timing | Modifiable Content |
+| --- | --- | --- |
+| `BeforeAgent` | Before Agent runs (once only) | Instruction, Tools, ReturnDirectly, ToolSearchTool |
+| `AfterAgent` | After Agent succeeds | Read final state (no modification) |
+| `BeforeModelRewriteState` | Before each model call | Messages, ToolInfos, DeferredToolInfos (persisted to state) |
+| `AfterModelRewriteState` | After each model call | Messages (including model response), ToolInfos (persisted to state) |
+| `WrapModel` | Wraps model call | Retry, failover, event sending (do not modify Messages) |
+| `WrapToolCall` | Wraps tool call | Permission checks, logging, output rewriting |
    + +> 💡 +> The state returned by `BeforeModelRewriteState` is persisted to the agent's internal state by the framework. Therefore, modifications in this hook (e.g., compressing Messages, filtering ToolInfos) affect all subsequent iterations. + +## Core Types + +### ChatModelAgentContext (BeforeAgent parameter) + +```go +type ChatModelAgentContext struct { + Instruction string + Tools []tool.BaseTool + ReturnDirectly map[string]bool + ToolSearchTool *schema.ToolInfo // Model's native ToolSearch capability +} +``` + +### ChatModelAgentState (BeforeModel/AfterModel parameter) + +```go +type TypedChatModelAgentState[M MessageType] struct { + Messages []M + ToolInfos []*schema.ToolInfo // Tool list passed to the model + DeferredToolInfos []*schema.ToolInfo // Server-side deferred tool list +} + +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +``` + +### ModelContext (WrapModel parameter) + +```go +type TypedModelContext[M MessageType] struct { + Tools []*schema.ToolInfo // Deprecated: use state.ToolInfos + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +type ModelContext = TypedModelContext[*schema.Message] +``` + +## Execution Order + +**Model call chain** (outer to inner): + +1. `AgentMiddleware.BeforeChatModel` +2. **BeforeModelRewriteState** +3. failover wrapper (built-in) +4. retry wrapper (built-in) +5. event sender wrapper (built-in) +6. **WrapModel** (first registered = outermost) +7. callback injection (built-in) +8. Actual model call +9. **AfterModelRewriteState** +10. `AgentMiddleware.AfterChatModel` + +**Tool call chain** (outer to inner): + +1. event sender (built-in) +2. `ToolsConfig.ToolCallMiddlewares` +3. `AgentMiddleware.WrapToolCall` +4. **WrapToolCall** (first registered = outermost) +5. callback injection (built-in) +6. 
Actual tool call + +# AgentAsTool + +Wraps a sub-Agent as a Tool so the parent Agent can invoke it autonomously via ToolCall: + +```go +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "Searches and summarizes information", + Model: chatModel, + // ... +}) + +agentTool := adk.NewAgentTool(ctx, subAgent) + +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) +``` + +Generic version: `adk.NewTypedAgentTool[M](ctx, agent, options...)` + +Options: `WithFullChatHistoryAsInput()` (passes full conversation history), `WithAgentInputSchema(schema)` (custom input schema) + +# ModelRetry + +When configured, automatically retries failed ChatModel calls. When an error occurs during streaming, the current stream is still returned via AgentEvent, and consuming the MessageStream yields a `WillRetryError`: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... 
+	ModelRetryConfig: &adk.ModelRetryConfig{
+		// Retry strategy configuration
+	},
+})
+
+// Handle WillRetryError when consuming the event stream
+stream := event.Output.MessageOutput.MessageStream
+for {
+	msg, err := stream.Recv()
+	if err == io.EOF {
+		break
+	}
+	if err != nil {
+		var willRetry *adk.WillRetryError
+		if errors.As(err, &willRetry) {
+			log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt)
+			break // Wait for the next event
+		}
+		break
+	}
+	displayChunk(msg)
+}
+```
+
+# ModelFailover
+
+When configured, switches to a backup model on model call failure:
+
+```go
+agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+	Model: primaryModel,
+	ModelFailoverConfig: &adk.ModelFailoverConfig{
+		MaxRetries: 1, // 0 means no failover
+		ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool {
+			return true // Decide whether to failover based on error type
+		},
+		GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) (
+			model.BaseChatModel, []*schema.Message, error,
+		) {
+			return backupModel, nil, nil // nil messages: reuse the original input
+		},
+	},
+})
+```
+
+# Cancel
+
+Runtime cancellation capability introduced in v0.9. See [Agent Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) for details.
+ +```go +cancelOpt, cancelFn := adk.WithCancel() +iter := runner.Run(ctx, messages, cancelOpt) + +// Cancel later (CancelMode supports bitmask combinations) +handle := cancelFn(adk.CancelAfterChatModel | adk.CancelAfterToolCalls) +handle.Wait() // Wait for cancellation to complete +``` diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md new file mode 100644 index 00000000000..29125a76c21 --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md @@ -0,0 +1,176 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: ChatModel Failover Documentation +weight: 1 +--- + +> 💡 +> This feature is currently in alpha/09 gradual rollout + +## Overview + +`ChatModelAgent` has built-in model failover capability: when the primary model call fails, it automatically switches to a backup model, supporting both Generate (synchronous) and Stream (streaming). Configured via `ModelFailoverConfig[M]`, it is orthogonally composable with `TypedModelRetryConfig[M]` (same-model retry). + +> This document uses the default `*schema.Message` type as an example. For generic usage, replace the APIs with their `Typed` prefix versions and parameterize the message type as `M MessageType`. + +## Core Data Structures + +### ModelFailoverConfig[M] + +```go +type ModelFailoverConfig[M MessageType] struct { + // Maximum number of failover attempts. 0 means no failover; + // 1 means GetFailoverModel is called at most once. + // When lastSuccessModel exists, it is tried first before calling GetFailoverModel. + MaxRetries uint + + // Determines whether to trigger failover. Always stops when ctx.Err() != nil regardless of return value. 
+ // When combined with ModelRetryConfig, outputErr is *RetryExhaustedError; + // the original error is available via RetryExhaustedError.LastErr. + // In streaming scenarios, outputMessage may carry partially received messages. + // Required when configuring ModelFailoverConfig. + ShouldFailover func(ctx context.Context, outputMessage M, outputErr error) bool + + // Selects the next model and optionally transforms input messages. + // failoverCtx.FailoverAttempt starts from 1. + // Returning nil failoverModelInputMessages means using the original input. + // Returning non-nil failoverErr terminates failover immediately. + // Required when configuring ModelFailoverConfig. + GetFailoverModel func(ctx context.Context, failoverCtx *FailoverContext[M]) ( + failoverModel model.BaseModel[M], + failoverModelInputMessages []M, + failoverErr error, + ) +} +``` + +### FailoverContext[M] + +```go +type FailoverContext[M MessageType] struct { + FailoverAttempt uint // Current attempt number, starting from 1 + InputMessages []M // Original input before transformation + LastOutputMessage M // Output from last failure (partial message in streaming) + // When combined with ModelRetryConfig, this is *RetryExhaustedError + LastErr error // Error from last failure +} +``` + +## Quick Start + +### Basic Usage: Dual-Model Failover + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-agent", + Instruction: "You are a helpful assistant.", + Model: primaryModel, // model.BaseModel[*schema.Message], required + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, // At most 1 failover (2 total calls) + + ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool { + return !errors.Is(err, context.Canceled) && + !errors.Is(err, context.DeadlineExceeded) + }, + + GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil // 
nil messages → use original input + }, + }, +}) +``` + +> 💡 +> `model.BaseChatModel` is a type alias for `model.BaseModel[*schema.Message]`; the two can be used interchangeably. + +### Transforming Input During Failover + +When the backup model doesn't support certain features (e.g., image input): + +```go +ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, _ error) bool { + return true + }, + GetFailoverModel: func(_ context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + // Filter out image content, degrade to text-only model + return textModel, filterTextOnly(fc.InputMessages), nil + }, +}, +``` + +### Combining with Retry + +Failover and Retry compose orthogonally. Semantics: **each model retries according to the Retry strategy first; after retries are exhausted, Failover switches models**. + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: primaryModel, + // ... + + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 2, + IsRetryAble: func(_ context.Context, err error) bool { + return isTransientError(err) + }, + }, + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, err error) bool { + // err is *RetryExhaustedError at this point + return true + }, + GetFailoverModel: func(_ context.Context, _ *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil + }, + }, +}) +``` + +## Streaming Failover Behavior + + + + + + +
+| Scenario | Behavior |
+| --- | --- |
+| `Stream()` initialization failure | Same as Generate, directly triggers failover evaluation |
+| Mid-stream error | Received chunks are concatenated into `LastOutputMessage` and passed to `ShouldFailover`; after deciding to failover, closes the current stream and restarts with the new model |
+| Client impact | Events already sent during the failed attempt are not retracted. Clients should reset partial results or deduplicate by metadata when receiving a new stream round |
    + +> 💡 +> `ErrStreamCanceled` (caller actively abandoning the stream) does not trigger failover; it returns directly. + +## Model Call Chain Execution Order + +Failover's position in the wrapper chain (outer to inner): + +``` +1. AgentMiddleware.BeforeChatModel + 2. ChatModelAgentMiddleware.BeforeModelRewriteState + 3. failoverModelWrapper ← failover at this layer + 4. retryModelWrapper ← internal retry within each failover model + 5. eventSenderModelWrapper + 6. ChatModelAgentMiddleware.WrapModel (first registered = outermost) + 7. callbackInjectionModelWrapper (handled internally by failoverProxyModel when failover is enabled) + 8. failoverProxyModel / Model.Generate|Stream + 9. ChatModelAgentMiddleware.AfterModelRewriteState +10. AgentMiddleware.AfterChatModel +``` + +## Important Notes + +- **Required validation**: Both `ShouldFailover` and `GetFailoverModel` are required when configuring `ModelFailoverConfig`; missing either returns an error from `NewChatModelAgent`. The `Model` field is always required. +- **Attempt numbering**: `FailoverAttempt` starts from 1. A single Model call executes at most `1 + MaxRetries` times (1 initial + up to MaxRetries failovers). +- **Input messages**: When `GetFailoverModel` returns `nil` messages, the original input is used; when it returns non-`nil`, it replaces the original input. +- **Error type when combined with Retry**: `ShouldFailover` and `FailoverContext.LastErr` receive `*RetryExhaustedError`; the original error is available via `RetryExhaustedError.LastErr`. 
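The retry-then-failover ordering and the `1 + MaxRetries` attempt count described in the notes above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: `callFn`, `retryExhausted`, `failoverConfig`, `withRetry`, and `runWithFailover` are hypothetical stand-ins written for this example, not eino types or the framework's implementation, and the sketch ignores context cancellation, streaming, and the "last successful model is tried first" behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// callFn stands in for a single model call (hypothetical, not an eino type).
type callFn func(input []string) (string, error)

// retryExhausted mimics the documented shape of *RetryExhaustedError:
// it wraps the last underlying error in LastErr.
type retryExhausted struct{ LastErr error }

func (e *retryExhausted) Error() string { return "retries exhausted: " + e.LastErr.Error() }
func (e *retryExhausted) Unwrap() error { return e.LastErr }

// withRetry calls fn once, then up to maxRetries more times on failure.
func withRetry(fn callFn, maxRetries int, input []string) (string, error) {
	var last error
	for i := 0; i <= maxRetries; i++ {
		out, err := fn(input)
		if err == nil {
			return out, nil
		}
		last = err
	}
	return "", &retryExhausted{LastErr: last}
}

// failoverConfig is a simplified stand-in for the failover settings.
type failoverConfig struct {
	MaxRetries       uint
	ShouldFailover   func(err error) bool
	GetFailoverModel func(attempt uint, input []string) (callFn, []string, error)
}

// runWithFailover retries the primary first; only after retries are
// exhausted does it switch models, at most cfg.MaxRetries times.
func runWithFailover(primary callFn, retries int, cfg failoverConfig, input []string) (string, error) {
	out, err := withRetry(primary, retries, input)
	for attempt := uint(1); err != nil && attempt <= cfg.MaxRetries; attempt++ {
		if !cfg.ShouldFailover(err) {
			break
		}
		next, msgs, ferr := cfg.GetFailoverModel(attempt, input)
		if ferr != nil {
			return "", ferr // a non-nil failover error ends failover immediately
		}
		if msgs == nil {
			msgs = input // nil messages: reuse the original input
		}
		out, err = withRetry(next, retries, msgs)
	}
	return out, err
}

// demo runs a primary that always fails and a backup that succeeds.
func demo() (out string, primaryCalls, backupCalls int, err error) {
	errDown := errors.New("primary down")
	primary := func([]string) (string, error) {
		primaryCalls++
		return "", errDown
	}
	backup := func(in []string) (string, error) {
		backupCalls++
		return "backup: " + in[0], nil
	}
	cfg := failoverConfig{
		MaxRetries: 1,
		ShouldFailover: func(e error) bool {
			// Unwrap the retry wrapper to inspect the original error.
			var re *retryExhausted
			return errors.As(e, &re) && errors.Is(re.LastErr, errDown)
		},
		GetFailoverModel: func(uint, []string) (callFn, []string, error) {
			return backup, nil, nil
		},
	}
	out, err = runWithFailover(primary, 2, cfg, []string{"hi"})
	return out, primaryCalls, backupCalls, err
}

func main() {
	out, p, b, err := demo()
	fmt.Println(out, p, b, err)
}
```

With a retry budget of 2 and a failover budget of 1, the primary is called three times (one initial call plus two retries) before the backup is called once. The `ShouldFailover` stand-in shows why the wrapped error matters: the decision is made on the original error reached through `LastErr`, not on the wrapper itself.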
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md index 9f7f876db3d..aea728afd7a 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md @@ -1,192 +1,208 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: DeepAgents' -weight: 5 +title: DeepAgents +weight: 3 --- -## DeepAgents Overview +> 💡 +> This feature requires eino >= v0.5.14. -DeepAgents is an out-of-the-box agent solution built on top of ChatModelAgent (see: [Eino ADK: ChatModelAgent](/docs/eino/core_modules/eino_adk/agent_implementation/chat_model)). You don't need to assemble prompts, tools, or context management yourself - you can immediately get a runnable agent while still using ChatModelAgent's extension capabilities to add business features, such as custom tools and middleware. +## Overview -**Included Features:** +DeepAgents is an out-of-the-box solution based on ChatModelAgent. Without manually assembling prompts, tools, or context management, you get an Agent with planning, file system, shell execution, and sub-Agent delegation capabilities, while retaining all the extensibility of ChatModelAgent (custom tools, middleware, handlers). 
-- **Planning Capability** — Task decomposition and progress tracking through `write_todos` -- **File System** — Provides `read_file`, `write_file`, `edit_file`, `ls`, `glob`, `grep` for reading and writing context -- **Shell Access** — Use `execute` to run commands -- **Sub-Agents** — Delegate work to sub-agents with independent context windows via `task` -- **Smart Default Configuration** — Built-in prompts that teach the model how to efficiently use these tools -- **Context Management** — Automatic summarization for long conversation history, automatic file saving for large outputs - - SummarizationMiddleware, ReductionMiddleware are under development +**Built-in Capabilities**: -### ImportPath +- **Planning** — `write_todos` tool for task decomposition and progress tracking +- **File System** — `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep` +- **Shell** — `execute` (supports streaming) +- **Sub-Agent** — `task` tool delegates tasks to context-isolated sub-agents +- **Smart Defaults** — Built-in prompts teach the model to use tools efficiently +- **Context Management** — Large outputs are automatically saved to files -Eino version must be >= v0.5.14 +### Import ```go -import github.com/cloudwego/eino/adk/prebuilt/deep +import "github.com/cloudwego/eino/adk/prebuilt/deep" -agent, err := deep.New(ctx, &deep.Config{}) +agent, err := deep.New(ctx, &deep.Config{ + ChatModel: myModel, +}) ``` -### DeepAgents Structure - -The core concept of DeepAgents is to use a main agent (MainAgent) to coordinate, plan, delegate, or autonomously execute tasks. The main agent uses its built-in ChatModel and a series of tools to interact with the external world or decompose complex tasks to specialized sub-agents (SubAgents). 
- - +--- -The diagram above shows the core components of DeepAgents and their relationships: +## Complete Config Definition -- Main Agent: The entry point and commander of the system, receives initial tasks, calls tools in ReAct mode to complete tasks and is responsible for presenting the final results. -- ChatModel (ToolCallingChatModel): Usually a large language model with tool-calling capabilities, responsible for understanding tasks, reasoning, selecting and calling tools. -- Tools: A collection of capabilities available to MainAgent, including: - - WriteTodos: Built-in planning tool for decomposing complex tasks into structured todo lists. - - TaskTool: A special tool that serves as the unified entry point for calling sub-agents. - - BuiltinTools, CustomTools: General tools built into DeepAgents and various tools customized by users according to business needs. -- SubAgents: Responsible for executing specific, independent subtasks, with context isolated from MainAgent. - - GeneralPurpose: A general-purpose sub-agent with the same tools as MainAgent (except TaskTool), used to execute subtasks in a "clean" context. - - CustomSubAgents: Various sub-agents customized by users according to business needs. 
+```go +type Config = TypedConfig[*schema.Message] -### Built-in Capabilities +type TypedConfig[M adk.MessageType] struct { + Name string // Agent identifier name + Description string // Purpose description + ChatModel model.BaseModel[M] // Required; must support model.WithTools + Instruction string // System prompt; uses built-in default prompt when empty -#### Filesystem + // Sub-Agents (bound to TaskTool) + SubAgents []adk.TypedAgent[M] -> 💡 -> Currently in alpha state + // Custom tools + ToolsConfig adk.ToolsConfig + MaxIteration int // Maximum reasoning iteration count -When creating DeepAgents, configure the relevant Backend, and DeepAgents will automatically load the corresponding tools: + // File system (choose one or combine) + Backend filesystem.Backend // Registers ls/read_file/write_file/edit_file/glob/grep + Shell filesystem.Shell // Registers execute (mutually exclusive with StreamingShell) + StreamingShell filesystem.StreamingShell // Registers execute (streaming, mutually exclusive with Shell) -``` -type Config struct { - // ... - Backend filesystem.Backend - Shell filesystem.Shell - StreamingShell filesystem.StreamingShell - // ... -} -``` + // Built-in feature toggles + WithoutWriteTodos bool // true disables write_todos tool + WithoutGeneralSubAgent bool // true disables the default general-purpose sub-Agent - - - - - -
-| Configuration | Function | Added Tools |
-| --- | --- | --- |
-| Backend | Provides file system access capability, optional | read_file, write_file, edit_file, glob, grep |
-| Shell | Provides Shell capability, optional, mutually exclusive with StreamShell | execute |
-| StreamingShell | Provides Shell capability with streaming results, optional, mutually exclusive with Shell | execute(streaming) |
    + // TaskTool description generator (customize task tool's description) + TaskToolDescriptionGenerator func(ctx context.Context, agents []adk.TypedAgent[M]) (string, error) -DeepAgents implements built-in filesystem by referencing filesystem middleware. For more detailed capability description of this middleware, see: [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) + // Extensions + Middlewares []adk.AgentMiddleware // struct-based middleware + Handlers []adk.TypedChatModelAgentMiddleware[M] // interface-based handlers -### Task Decomposition and Planning + // Model fault tolerance + ModelRetryConfig *adk.TypedModelRetryConfig[M] + ModelFailoverConfig *adk.ModelFailoverConfig[M] -The Description of WriteTodos describes the principles of task decomposition and planning. The main agent adds a subtask list to the context by calling the WriteTodos tool to inspire subsequent reasoning and execution processes: + // Output storage (written to session via AddSessionValue) + OutputKey string +} +``` - +### Constructors -1. The model receives user input. -2. The model calls the WriteTodos tool with a task list generated according to the WriteTodos Description. This tool call is added to the context for future reference. -3. The model calls TaskTool according to the todos in the context to complete the first todo. -4. Calls WriteTodos again to update the Todos execution progress. +```go +// Standard version (M = *schema.Message) +func New(ctx context.Context, cfg *Config) (adk.ResumableAgent, error) -> 💡 -> For simple tasks, calling WriteTodos every time may have a negative effect. The WriteTodos Description includes some common positive and negative examples to avoid not calling or over-calling WriteTodos. When using DeepAgents, you can add more prompts according to actual business scenarios to make WriteTodos called at appropriate times. 
+// Generic version (supports *schema.AgenticMessage) +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedResumableAgent[M], error) +``` > 💡 -> WriteTodos will be added to the Agent by default. Configure `WithoutWriteTodos=true` to disable WriteTodos. +> Returns ResumableAgent (includes Resume method), which can be used with Runner's checkpoint/resume mechanism. -### Task Delegation and SubAgents Invocation - -**TaskTool** +--- -All sub-agents are bound to TaskTool. When the main agent assigns subtasks to sub-agents for processing, it calls TaskTool and specifies which sub-agent is needed and the task to execute. TaskTool then routes the task to the specified sub-agent and returns the result to the main agent after execution. The default Description of TaskTool explains the general rules for calling sub-agents and concatenates the Description of each sub-agent. Developers can customize the Description of TaskTool by configuring `TaskToolDescriptionGenerator`. +## Architecture -> When users configure Config.SubAgents, these Agents will be bound to TaskTool based on ChatModelAgent's AgentAsTool capability + -**Context Isolation** +- **Main Agent**: System entry point, completes tasks by calling tools in a ReAct manner +- **ChatModel** (`model.BaseModel[M]`): Responsible for reasoning and tool selection +- **Tools**: + - `write_todos`: Built-in planning tool that decomposes tasks into structured TODO lists + - `task`: Sub-Agent invocation entry (routing parameters: `subagent_type`, `description`) + - Built-in tools (file system/shell) + user-defined tools (`ToolsConfig`) +- **SubAgents**: Context-isolated, independently execute sub-tasks + - `general-purpose`: Default sub-Agent with the same tools (except task) and configuration as the main Agent + - Custom sub-Agents (`Config.SubAgents`) -Context isolation between Agents: +--- -- Information Transfer: The main agent and sub-agents do not share context. 
Sub-agents only receive the subtask goals assigned by the main agent, not the entire task processing; the main agent only receives the processing results from sub-agents, not the processing of sub-agents. -- Avoid Pollution: This isolation ensures that the execution process of sub-agents (such as numerous tool calls and intermediate steps) does not "pollute" the main agent's context. The main agent only receives concise, clear final answers. +## Built-in File System -**general-purpose** + + + + + +
+| Config Field | Registered Tools | Description |
+| --- | --- | --- |
+| `Backend` | ls, read_file, write_file, edit_file, glob, grep | File system operations |
+| `Shell` | execute | Non-streaming command execution, mutually exclusive with StreamingShell |
+| `StreamingShell` | execute (streaming) | Streaming command execution, mutually exclusive with Shell |
    -DeepAgents adds a sub-agent by default: general-purpose. general-purpose has the same system prompt and tools as the main agent (except TaskTool). When there is no specialized sub-agent to handle a task, the main agent can call general-purpose to isolate context. Developers can remove this agent by configuring `WithoutGeneralSubAgent=true`. +Internally implemented using FileSystem Middleware. -### Comparison with Other Agents +--- -- Compared to ReAct Agent +## Task Planning: write_todos - - Advantages: DeepAgents strengthens task decomposition and planning through built-in WriteTodos; it also isolates multi-agent contexts, usually performing better in large-scale, multi-step tasks. - - Disadvantages: Making plans and calling sub-agents bring additional model requests, increasing latency and token costs; if task decomposition is unreasonable, it may have a negative effect. -- Compared to Plan-and-Execute + - - Advantages: DeepAgents provides Plan/RePlan as tools for the main agent to freely call, allowing unnecessary planning to be skipped during tasks, overall reducing model calls and lowering latency and costs. - - Disadvantages: Task planning and delegation are completed in one model call, requiring higher model capabilities, and prompt tuning is relatively more difficult. +The `write_todos` tool writes a structured TODO list to the session (key: `deep_agent_session_key_todos`) for reference in subsequent reasoning. -## DeepAgents Usage Example +**TODO Structure**: -### Scenario Description +```go +type TODO struct { + Content string `json:"content"` + ActiveForm string `json:"activeForm"` + Status string `json:"status"` // "pending" | "in_progress" | "completed" +} +``` -Excel Agent is an "intelligent assistant that understands Excel". It first breaks down the problem into steps, then executes and verifies results step by step. 
It can understand user questions and uploaded file content, propose feasible solutions, and select appropriate tools (system commands, generate and run Python code, web queries, etc.) to complete tasks. +**Workflow**: -In real business, you can think of Excel Agent as an "Excel expert + automation engineer". When you provide a raw spreadsheet and target description, it will propose a solution and complete the execution: +1. Model receives user input +2. Calls `write_todos` to decompose tasks and write to context +3. Executes TODOs one by one (calling task or tools directly) +4. Calls `write_todos` again to update progress -- **Data Cleaning and Formatting**: Complete deduplication, null value handling, and date format standardization from an Excel file containing large amounts of data. -- **Data Analysis and Report Generation**: Extract monthly sales totals from sales data, aggregate statistics, pivot, and finally generate and export chart reports. -- **Automated Budget Calculation**: Automatically calculate total budget based on budget applications from different departments and generate department budget allocation tables. -- **Data Matching and Merging**: Match and merge customer information tables from multiple different sources to generate a complete customer information database. +> 💡 +> For simple tasks, calling write_todos every time may be counterproductive. The built-in prompt already includes positive and negative examples guiding when to use it. You can further tune this through a custom Instruction. Setting WithoutWriteTodos=true completely disables it. -The structure of Excel Agent built with DeepAgents is as follows: +--- - +## Sub-Agent Delegation: task Tool -1. Add ReadFile tool to the main agent, allowing the main agent to view file content and assist in subtask formulation -2. Add Code and WebSearch sub-agents: Code can write python code to operate excel spreadsheets; WebSearch can search for information and summarize. 
+**TaskTool** is the unified invocation entry for all sub-Agents: -### Code Implementation +- Parameters: `subagent_type` (target sub-Agent name), `description` (task description) +- Internally wraps each sub-Agent as a tool via `adk.NewTypedAgentTool` +- Default description includes names and descriptions of all available sub-Agents; customizable via `TaskToolDescriptionGenerator` -[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) +**Context Isolation**: -### Execution Results +- Sub-Agents only receive the task description assigned by the main Agent, not shared conversation history +- The main Agent only receives the sub-Agent's final result; intermediate steps are not passed back +- Prevents large volumes of tool calls and intermediate reasoning from "polluting" the main Agent context -```yaml -name: ExcelAgent -path: [{ExcelAgent}] -tool name: task -arguments: {"subagent_type":"CodeAgent","description":"Please help me with a CSV file operation task. Specific requirements: 1. Read the questions.csv file in the current directory; 2. Extract the first column data from the CSV file; 3. Save the extracted first column data to a new CSV file named first_column.csv. 
Please use the pandas library to complete these operations, ensuring the code can execute correctly and handle possible file path issues."} +**general-purpose Sub-Agent**: -name: ExcelAgent -path: [{ExcelAgent}] -tool name: python_runner -arguments: {"code":"```python\nimport pandas as pd\nimport os\n\n# Get current working directory and build file path\nbase_dir = os.getcwd()\ninput_path = os.path.join(base_dir, 'questions.csv')\noutput_path = os.path.join(base_dir, 'first_column.csv')\n\ntry:\n # Read CSV file\n df = pd.read_csv(input_path)\n # Extract first column data\n first_column = df.iloc[:, 0]\n # Save as new CSV file (without index)\n first_column.to_csv(output_path, index=False)\n print(f\"Successfully saved first column data to: {output_path}\")\nexcept FileNotFoundError:\n print(f\"Error: File not found {input_path}\")\nexcept Exception as e:\n print(f\"Error during processing: {str(e)}\")\n```"} +- Created by default, has the same tools (except task), Instruction, and ModelFailoverConfig as the main Agent +- Used to execute general tasks in an isolated context when no specialized sub-Agent exists +- Setting `WithoutGeneralSubAgent=true` disables it -name: ExcelAgent -path: [{ExcelAgent}] -tool response: Successfully saved first column data to: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv +--- +## Comparison with Other Approaches -name: ExcelAgent -path: [{ExcelAgent}] -answer: Task completed. Successfully read the `questions.csv` file in the current directory, extracted the first column data, and saved the result to `first_column.csv`. The specific output path is: + + + + +
| Dimension | DeepAgents vs ReAct | DeepAgents vs Plan-and-Execute |
| --- | --- | --- |
| Advantages | Built-in planning + sub-Agent context isolation, better for multi-step tasks | Plan/RePlan called as tools on demand, reducing unnecessary planning overhead |
| Disadvantages | Planning + sub-Agent calls increase model requests, latency, and token costs | Planning and delegation completed in a single call, higher requirements on model capability |
    -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +--- -The code handles path concatenation and exception catching (such as file not found or format errors) to ensure execution stability. +## Usage Example -name: ExcelAgent -path: [{ExcelAgent}] -tool response: Task completed. Successfully read the `questions.csv` file in the current directory, extracted the first column data, and saved the result to `first_column.csv`. The specific output path is: +### Excel Agent Scenario -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` + -The code handles path concatenation and exception catching (such as file not found or format errors) to ensure execution stability. +- Main Agent configured with ReadFile tool to assist task formulation +- Added Code (Python for Excel operations) and WebSearch as two sub-Agents -name: ExcelAgent -path: [{ExcelAgent}] -answer: Successfully extracted the first column data from the `questions.csv` spreadsheet to a new file `first_column.csv`, saved at: +### Code -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +Complete example: [https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) -The process handled path concatenation and exception catching (such as file not found, format errors, etc.) to ensure data extraction completeness and file generation stability. If you need to adjust the file path or have further requirements for data format, please let me know. 
+```go
+agent, err := deep.New(ctx, &deep.Config{
+    Name:      "ExcelAgent",
+    ChatModel: myModel,
+    Backend:   localBackend,
+    SubAgents: []adk.Agent{codeAgent, webSearchAgent},
+    ToolsConfig: adk.ToolsConfig{
+        InvokableTools: []tool.InvokableTool{readFileTool},
+    },
+})
 ```
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md
index 76988341bc5..f04d755f941 100644
--- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md
+++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md
@@ -1,17 +1,17 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-17"
 lastmod: ""
 tags: []
-title: 'Eino ADK: Plan-Execute Agent'
-weight: 4
+title: Plan-Execute Agent
+weight: 2
 ---
 
 ## Plan-Execute Agent Overview
 
 ### Import Path
 
-`import github.com/cloudwego/eino/adk/prebuilt/planexecute`
+`import "github.com/cloudwego/eino/adk/prebuilt/planexecute"`
 
 ### What is Plan-Execute Agent?
@@ -275,7 +275,7 @@ func newPlanExecuteAgent(ctx context.Context) adk.Agent { replanner := newReplanner(ctx, model) // Combine into PlanExecuteAgent (fixed execute-replan max iterations 10) - planExecuteAgent, err := planexecute.NewPlanExecuteAgent(ctx, &planexecute.PlanExecuteConfig{ + planExecuteAgent, err := planexecute.New(ctx, &planexecute.PlanExecuteConfig{ Planner: planner, Executor: executor, Replanner: replanner, @@ -346,7 +346,7 @@ func main() { {"steps":["Identify the most recent and credible sources for AI developments in healthcare in 2024, such as scientific journals, industry reports, news articles, and expert analyses.","Extract and compile the key technologies emerging or advancing in AI for healthcare in 2024, including machine learning models, diagnostic tools, robotic surgery, personalized medicine, and data management solutions.","Analyze the main applications of AI in healthcare during 2024, focusing on areas such as diagnostics, patient care, drug discovery, medical imaging, and healthcare administration.","Investigate current industry trends related to AI in healthcare for 2024, including adoption rates, regulatory changes, ethical considerations, funding landscape, and market forecasts.","Synthesize the gathered information into a comprehensive summary covering the latest developments in AI for healthcare in 2024, highlighting key technologies, applications, and industry trends with examples and implications."]} 2025/09/08 11:47:47 === Agent:Executor Output === -{"message":"Found 10 results successfully.","results":[{"title":"Artificial Intelligence in Healthcare: 2024 Year in Review","url":"https://www.researchgate.net/publication/389402322_Artificial_Intelligence_in_Healthcare_2024_Year_in_Review","summary":"The adoption of LLMs and text data types amongst various healthcare specialties, especially for education and administrative tasks, is unlocking new potential for AI applications in..."},{"title":"AI in Healthcare - 
Nature","url":"https://www.nature.com/collections/hacjaaeafj","summary":"\"AI in Healthcare\" encompasses the use of AI technologies to enhance various aspects of healthcare delivery, from diagnostics to treatment personalization, ultimately aiming to improve..."},...]} +{"message":"Found 10 results successfully.","results":[{"title":"Artificial Intelligence in Healthcare: 2024 Year in Review","url":"https://www.researchgate.net/publication/389402322_Artificial_Intelligence_in_Healthcare_2024_Year_in_Review","summary":"The adoption of LLMs and text data types amongst various healthcare specialties, especially for education and administrative tasks, is unlocking new potential for AI applications in..."},{"title":"AI in Healthcare - Nature","url":"https://www.nature.com/collections/hacjaaeafj","summary":"\"AI in Healthcare\" encompasses the use of AI technologies to enhance various aspects of healthcare delivery, from diagnostics to treatment personalization, ultimately aiming to improve..."},{"title":"Evolution of artificial intelligence in healthcare: a 30-year ...","url":"https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2024.1505692/full","summary":"Conclusion: This study reveals a sustained explosive growth trend in AI technologies within the healthcare sector in recent years, with increasingly profound applications in medicine. 
Additionally, medical artificial intelligence research is dynamically evolving with the advent of new technologies."},{"title":"The Impact of Artificial Intelligence on Healthcare: A Comprehensive ...","url":"https://onlinelibrary.wiley.com/doi/full/10.1002/hsr2.70312","summary":"This review analyzes the impact of AI on healthcare using data from the Web of Science (2014-2024), focusing on keywords like AI, ML, and healthcare applications."},{"title":"Artificial intelligence in healthcare (Review) - PubMed","url":"https://pubmed.ncbi.nlm.nih.gov/39583770/","summary":"Furthermore, the barriers and constraints that may impede the use of AI in healthcare are outlined, and the potential future directions of AI-augmented healthcare systems are discussed."},{"title":"Full article: Towards new frontiers of healthcare systems research ...","url":"https://www.tandfonline.com/doi/full/10.1080/2047..."},...]} ... ``` diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md deleted file mode 100644 index bca088eb1cf..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md +++ /dev/null @@ -1,359 +0,0 @@ ---- -Description: "" -date: "2026-03-02" -lastmod: "" -tags: [] -title: 'Eino ADK: Supervisor Agent' -weight: 3 ---- - -## Supervisor Agent Overview - -### Import Path - -`import github.com/cloudwego/eino/adk/prebuilt/supervisor` - -### What is Supervisor Agent? - -Supervisor Agent is a centralized multi-agent collaboration pattern consisting of one Supervisor Agent and multiple SubAgents. The Supervisor is responsible for task allocation, monitoring the execution process of sub-agents, and summarizing results and making decisions after sub-agents complete; sub-agents focus on executing specific tasks and automatically transfer task control back to the Supervisor via WithDeterministicTransferTo after completion. 
- - - -This pattern is suitable for scenarios that require dynamic coordination of multiple specialized agents to complete complex tasks, such as: - -- Research project management (Supervisor assigns research, experiment, report writing tasks to different sub-agents). -- Customer service processes (Supervisor assigns tasks to technical support, after-sales, sales sub-agents based on user question types). - -### Supervisor Agent Structure - -The core structure of Supervisor pattern is as follows: - -- **Supervisor Agent**: As the collaboration core, has task allocation logic (such as rule-based or LLM decision), can include sub-agents under management via `SetSubAgents`. -- **SubAgents**: Each sub-agent is enhanced with WithDeterministicTransferTo, with `ToAgentNames` preset to the Supervisor name, ensuring automatic transfer back to Supervisor after task completion. - -### Supervisor Agent Features - -1. **Deterministic Callback**: After sub-agent execution completes (not interrupted), WithDeterministicTransferTo automatically triggers Transfer event, transferring task control back to Supervisor, avoiding collaboration flow interruption. -2. **Centralized Control**: Supervisor uniformly manages sub-agents, can dynamically adjust task allocation based on sub-agent execution results (such as assigning to other sub-agents or directly generating final results). -3. **Loosely Coupled Extension**: Sub-agents can be independently developed, tested, and replaced; just ensure they implement the Agent interface and bind to Supervisor to join the collaboration flow. -4. **Support for Interrupt and Resume**: If sub-agent or Supervisor supports `ResumableAgent` interface, collaboration flow can resume after interruption, maintaining task context continuity. - -### Supervisor Agent Execution Flow - -The typical collaboration flow of Supervisor pattern is as follows: - -1. 
**Task Start**: Runner triggers Supervisor to run, inputs initial task (e.g., "Complete a report on LLM development history"). -2. **Task Allocation**: Supervisor transfers task to designated sub-agent (e.g., "Research Agent") via Transfer event based on task requirements. -3. **Sub-Agent Execution**: Sub-agent executes specific task (e.g., researches LLM key milestones) and generates execution result events. -4. **Automatic Callback**: After sub-agent completes, WithDeterministicTransferTo triggers Transfer event, transferring task back to Supervisor. -5. **Result Processing**: Supervisor receives sub-agent results, decides next step (e.g., assign to "Report Writing Agent" to continue processing, or directly output final result). - -## Supervisor Agent Usage Example - -### Scenario Description - -Create a research report generation system: - -- **Supervisor**: Based on user input research topic, assigns tasks to "Research Agent" and "Writer Agent", and summarizes final report. -- **Research Agent**: Responsible for generating research plan (e.g., key stages of LLM development). -- **Writer Agent**: Responsible for writing complete report based on research plan. - -### Code Implementation - -#### Step 1: Implement Sub-Agents - -First create two sub-agents, responsible for research and writing tasks respectively: - -```go -// Research Agent: Generates research plan -func NewResearchAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ResearchAgent", - Description: "Generates a detailed research plan for a given topic.", - Instruction: ` -You are a research planner. Given a topic, output a step-by-step research plan with key stages and milestones. 
-Output ONLY the plan, no extra text.`, - Model: model, - }) - return agent -} - -// Writer Agent: Writes report based on research plan -func NewWriterAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WriterAgent", - Description: "Writes a report based on a research plan.", - Instruction: ` -You are an academic writer. Given a research plan, expand it into a structured report with details and analysis. -Output ONLY the report, no extra text.`, - Model: model, - }) - return agent -} -``` - -#### Step 2: Implement Supervisor Agent - -Create Supervisor Agent, define task allocation logic (simplified here as rule-based: first assign to Research Agent, then assign to Writer Agent): - -```go -// Supervisor Agent: Coordinates research and writing tasks -func NewReportSupervisor(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportSupervisor", - Description: "Coordinates research and writing to generate a report.", - Instruction: ` -You are a project supervisor. Your task is to coordinate two sub-agents: -- ResearchAgent: generates a research plan. -- WriterAgent: writes a report based on the plan. - -Workflow: -1. When receiving a topic, first transfer the task to ResearchAgent. -2. After ResearchAgent finishes, transfer the task to WriterAgent with the plan as input. -3. 
After WriterAgent finishes, output the final report.`, - Model: model, - }) - return agent -} -``` - -#### Step 3: Combine Supervisor and Sub-Agents - -Use `NewSupervisor` to combine Supervisor and sub-agents: - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/prebuilt/supervisor" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func main() { - ctx := context.Background() - - // 1. Create LLM model (e.g., GPT-4o) - model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: "YOUR_API_KEY", - Model: "gpt-4o", - }) - - // 2. Create sub-agents and Supervisor - researchAgent := NewResearchAgent(model) - writerAgent := NewWriterAgent(model) - reportSupervisor := NewReportSupervisor(model) - - // 3. Combine Supervisor and sub-agents - supervisorAgent, _ := supervisor.New(ctx, &supervisor.Config{ - Supervisor: reportSupervisor, - SubAgents: []adk.Agent{researchAgent, writerAgent}, - }) - - // 4. Run Supervisor pattern - iter := supervisorAgent.Run(ctx, &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("Write a report on the history of Large Language Models."), - }, - EnableStreaming: true, - }) - - // 5. Consume event stream (print results) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Output != nil && event.Output.MessageOutput != nil { - msg, _ := event.Output.MessageOutput.GetMessage() - println("Agent[" + event.AgentName + "]:\n" + msg.Content + "\n===========") - } - } -} -``` - -### Execution Results - -```markdown -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [ResearchAgent] -=========== -Agent[ResearchAgent]: -1. 
**Scope Definition & Background Research** - - Task: Define "Large Language Model" (LLM) for the report (e.g., size thresholds, key characteristics: transformer-based, large-scale pretraining, general-purpose). - - Task: Identify foundational NLP/AI concepts pre-LLMs (statistical models, early neural networks, word embeddings) to contextualize origins. - - Milestone: 3-day literature review of academic definitions, industry reports, and AI historiographies to finalize scope. - -2. **Chronological Periodization** - - Task: Divide LLM history into distinct eras (e.g., Pre-2017: Pre-transformer foundations; 2017-2020: Transformer revolution & early LLMs; 2020-Present: Scaling & mainstream adoption). - ... - -Agent[ResearchAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [WriterAgent] -=========== -Agent[WriterAgent]: -# The History of Large Language Models: From Foundations to Mainstream Revolution - -## Abstract -Large Language Models (LLMs) represent one of the most transformative technological innovations of the 21st century... - -## 1. Introduction: Defining Large Language Models -A **Large Language Model (LLM)** is a type of machine learning model designed to process and generate human language... - -... - -## 7. Conclusion: A Revolution in Five Years -The history of LLMs is a story of exponential progress: from the transformer's 2017 invention to ChatGPT's 2022 viral explosion... - -## References -- Devlin, J., et al. (2018). *BERT: Pre-training of deep bidirectional transformers for language understanding*. NAACL. -... -=========== -Agent[WriterAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -``` - -## WithDeterministicTransferTo - -### What is WithDeterministicTransferTo? - -`WithDeterministicTransferTo` is an Agent enhancement tool provided by Eino ADK, used to inject task transfer capability into Agents. 
It allows developers to preset fixed task transfer paths for target Agents. When the Agent completes its task (not interrupted), it automatically generates a Transfer event to transfer the task flow to the preset target Agent. - -This capability is the foundation for building the Supervisor Agent collaboration pattern, ensuring sub-agents can reliably transfer task control back to the Supervisor after execution, forming an "allocate-execute-feedback" closed-loop collaboration flow. - -### WithDeterministicTransferTo Core Implementation - -#### Configuration Structure - -Define core task transfer parameters through `DeterministicTransferConfig`: - -```go -// Wrapper method -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent - -// Configuration details -type DeterministicTransferConfig struct { - Agent Agent // Target Agent to be enhanced - ToAgentNames []string // List of target Agent names to transfer to after task completion -} -``` - -- `Agent`: The original Agent that needs transfer capability added. -- `ToAgentNames`: The names of the Agents to transfer the task to, in order, once `Agent` completes its task without being interrupted. - -#### Agent Wrapping - -WithDeterministicTransferTo wraps the original Agent. If the original Agent implements the `ResumableAgent` interface (supports interrupt and resume), it returns a `resumableAgentWithDeterministicTransferTo` instance; otherwise it returns an `agentWithDeterministicTransferTo` instance. This keeps the enhanced capability compatible with the Agent's original functions (such as the `Resume` method). 
- -The wrapped Agent overrides the `Run` method (for `ResumableAgent`, also overrides `Resume` method), appending Transfer events to the original Agent's event stream: - -```go -// Wrapper for regular Agent -type agentWithDeterministicTransferTo struct { - agent Agent // Original Agent - toAgentNames []string // Target Agent name list -} - -// Run method: Executes original Agent task, appends Transfer event after task completion -func (a *agentWithDeterministicTransferTo) Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Run(ctx, input, options...) - - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - - // Asynchronously process original event stream and append Transfer event - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - - return iterator -} -``` - -For `ResumableAgent`, additionally implements `Resume` method, ensuring deterministic transfer still triggers after resume execution: - -```go -type resumableAgentWithDeterministicTransferTo struct { - agent ResumableAgent // Original Agent supporting resume - toAgentNames []string // Target Agent name list -} - -// Resume method: Resumes execution of original Agent task, appends Transfer event after completion -func (a *resumableAgentWithDeterministicTransferTo) Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Resume(ctx, info, opts...) - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - return iterator -} -``` - -#### Event Stream Append Transfer Event - -`appendTransferAction` is the core logic implementing deterministic transfer. 
It consumes the original Agent's event stream and automatically generates and sends Transfer events to target Agents after the Agent task ends normally (not interrupted): - -```go -func appendTransferAction(ctx context.Context, aIter *AsyncIterator[*AgentEvent], generator *AsyncGenerator[*AgentEvent], toAgentNames []string) { - defer func() { - // Exception handling: Capture panic and pass error via event - if panicErr := recover(); panicErr != nil { - generator.Send(&AgentEvent{Err: safe.NewPanicErr(panicErr, debug.Stack())}) - } - generator.Close() // Event stream ends, close generator - }() - - interrupted := false - - // 1. Forward all events from original Agent - for { - event, ok := aIter.Next() - if !ok { // Original event stream ended - break - } - generator.Send(event) // Forward event to caller - - // Check if interruption occurred (e.g., InterruptAction) - if event.Action != nil && event.Action.Interrupted != nil { - interrupted = true - } else { - interrupted = false - } - } - - // 2. If not interrupted and target Agent exists, generate Transfer event - if !interrupted && len(toAgentNames) > 0 { - for _, toAgentName := range toAgentNames { - // Generate transfer message (system prompt + Transfer action) - aMsg, tMsg := GenTransferMessages(ctx, toAgentName) - // Send system prompt event (notify user of task transfer) - aEvent := EventFromMessage(aMsg, nil, schema.Assistant, "") - generator.Send(aEvent) - // Send Transfer action event (trigger task transfer) - tEvent := EventFromMessage(tMsg, nil, schema.Tool, tMsg.ToolName) - tEvent.Action = &AgentAction{ - TransferToAgent: &TransferToAgentAction{ - DestAgentName: toAgentName, // Target Agent name - }, - } - generator.Send(tEvent) - } - } -} -``` - -**Key Logic**: - -- **Event Forwarding**: All events generated by the original Agent (such as thinking, tool calls, output results) are fully forwarded, ensuring business logic is unaffected. 
-- **Interruption Check**: If Agent is interrupted during execution (e.g., `InterruptAction`), Transfer is not triggered (interruption is considered task not completed normally). -- **Transfer Event Generation**: After task ends normally, two events are generated for each `ToAgentNames`: - 1. System prompt event (`schema.Assistant` role): Notifies user that task will be transferred to target Agent. - 2. Transfer action event (`schema.Tool` role): Carries `TransferToAgentAction`, triggers ADK runtime to transfer task to the Agent corresponding to `DestAgentName`. - -## Summary - -WithDeterministicTransferTo provides reliable task transfer capability for Agents, which is the core foundation for building Supervisor pattern; Supervisor pattern achieves efficient collaboration between multiple Agents through centralized coordination and deterministic callbacks, significantly reducing development and maintenance costs for complex tasks. By combining both, developers can quickly build flexible, scalable multi-Agent systems. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md deleted file mode 100644 index 8581e2d329c..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md +++ /dev/null @@ -1,1265 +0,0 @@ ---- -Description: "" -date: "2026-01-20" -lastmod: "" -tags: [] -title: 'Eino ADK: Workflow Agents' -weight: 2 ---- - -# Workflow Agents Overview - -## Import Path - -`import "github.com/cloudwego/eino/adk"` - -## What Are Workflow Agents - -Workflow Agents are a special type of Agent in Eino ADK that allows developers to organize and execute multiple sub-agents in a predefined flow. 
- -Unlike the Transfer pattern based on LLM autonomous decision-making, Workflow Agents use **predefined decisions**, running sub-agents according to the execution flow defined in code, providing a more predictable and controllable multi-agent collaboration approach. - -Eino ADK provides three basic Workflow Agent types: - -- **SequentialAgent**: Executes sub-agents sequentially in order -- **LoopAgent**: Loops through a sequence of sub-agents -- **ParallelAgent**: Executes multiple sub-agents concurrently - -These Workflow Agents can be nested with each other to build more complex execution flows, meeting various business scenario requirements. - -# SequentialAgent - -## Features - -SequentialAgent is the most basic Workflow Agent. It executes a series of sub-agents sequentially according to the order provided in the configuration. After each sub-agent completes execution, its output is passed to the next sub-agent through the History mechanism, forming a linear execution chain. - - - -```go -type SequentialAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents, arranged in execution order -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -SequentialAgent execution follows these rules: - -1. **Linear execution**: Strictly follows the order of the SubAgents array -2. **History passing**: Each agent's execution result is added to History, allowing subsequent agents to access the execution history of previous agents -3. 
**Early exit**: If any sub-agent produces an ExitAction / Interrupt, the entire Sequential flow terminates immediately - -SequentialAgent is suitable for the following scenarios: - -- **Multi-step processing flows**: Such as data preprocessing -> analysis -> report generation -- **Pipeline processing**: Each step's output serves as the next step's input -- **Task sequences with dependencies**: Subsequent tasks depend on results from previous tasks - -## Example - -This example demonstrates how to use SequentialAgent to create a three-step document processing pipeline: - -1. **DocumentAnalyzer**: Analyzes document content -2. **ContentSummarizer**: Summarizes analysis results -3. **ReportGenerator**: Generates the final report - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -// Create ChatModel instance -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Document analysis Agent -func NewDocumentAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "DocumentAnalyzer", - Description: "Analyzes document content and extracts key information", - Instruction: "You are a document analysis expert. 
Please carefully analyze the document content provided by the user, extracting key information, main points, and important data.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Content summarization Agent -func NewContentSummarizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ContentSummarizer", - Description: "Summarizes analysis results", - Instruction: "Based on the previous document analysis results, generate a concise and clear summary highlighting the most important findings and conclusions.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Report generation Agent -func NewReportGeneratorAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportGenerator", - Description: "Generates the final analysis report", - Instruction: "Based on the previous analysis and summary, generate a structured analysis report including an executive summary, detailed analysis, and recommendations.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create three processing step Agents - analyzer := NewDocumentAnalyzerAgent() - summarizer := NewContentSummarizerAgent() - generator := NewReportGeneratorAgent() - - // Create SequentialAgent - sequentialAgent, err := adk.NewSequentialAgent(ctx, &adk.SequentialAgentConfig{ - Name: "DocumentProcessingPipeline", - Description: "Document processing pipeline: Analysis → Summary → Report Generation", - SubAgents: []adk.Agent{analyzer, summarizer, generator}, - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: sequentialAgent, - }) - - // Execute document processing flow - input := "Please analyze the following market report: In Q3 2024, company revenue grew 15%, mainly due to the 
successful launch of new product lines. However, operating costs also increased by 8%, requiring efficiency optimization." - - fmt.Println("Starting document processing pipeline...") - iter := runner.Query(ctx, input) - - stepCount := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== Step %d: %s ===\n", stepCount, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - stepCount++ - } - } - - fmt.Println("\nDocument processing pipeline completed!") -} -``` - -Run result: - -```markdown -Starting document processing pipeline... - -=== Step 1: DocumentAnalyzer === -Market Report Key Information Analysis: - -1. Revenue Growth: - - In Q3 2024, company revenue grew 15% year-over-year. - - The main driver of revenue growth was the successful launch of new product lines. - -2. Cost Situation: - - Operating costs increased by 8%. - - The cost increase reminds the company of the need for efficiency optimization. - -Key Points Summary: -- The new product line launch significantly drove revenue growth, showing good results in product innovation. -- Although revenue increased, the rise in operating costs somewhat affected profitability, highlighting the importance of improving operational efficiency. - -Important Data: -- Revenue growth rate: 15% -- Operating cost growth rate: 8% - -=== Step 2: ContentSummarizer === -Summary: In Q3 2024, the company achieved 15% revenue growth, mainly attributed to the successful launch of new product lines, demonstrating significant improvement in product innovation capability. However, operating costs also increased by 8%, putting some pressure on profitability and emphasizing the urgent need for operational efficiency optimization. Overall, the company needs to seek a better balance between growth and cost control to ensure sustainable healthy development. 
- -=== Step 3: ReportGenerator === -Analysis Report - -I. Executive Summary -In Q3 2024, the company achieved 15% year-over-year revenue growth, mainly driven by the successful launch of new product lines, demonstrating strong product innovation capability. However, operating costs also increased 8% year-over-year, putting some pressure on profit margins. To ensure continued profitable growth, focus should be on optimizing operational efficiency and promoting balanced development of cost control and revenue growth. - -II. Detailed Analysis -1. Revenue Growth Analysis -- The company's 15% revenue growth reflects good market acceptance of new product lines, effectively expanding revenue sources. -- The launch of new product lines demonstrates improved R&D and market responsiveness, laying a foundation for future sustained growth. - -2. Operating Cost Situation -- The 8% increase in operating costs may come from various aspects including raw material price increases, decreased production efficiency, or increased sales and promotion expenses. -- This cost increase somewhat offsets the profit gains from revenue growth, affecting overall profitability. - -3. Profitability and Efficiency Considerations -- The mismatch between revenue and cost growth indicates room for improvement in current operational efficiency. -- Optimizing supply chain management, improving production automation, and strengthening cost control will become key measures. - -III. Recommendations -1. Strengthen follow-up support for new product lines, including marketing and customer feedback mechanisms, to continue driving revenue growth. -2. Conduct in-depth analysis of operating cost composition, identify main cost drivers, and develop targeted cost reduction strategies. -3. Promote internal process optimization and technology upgrades to improve production and operational efficiency and alleviate cost pressure. -4. 
Establish a dynamic financial monitoring system to achieve real-time tracking and adjustment of revenue and costs, ensuring company financial health. - -IV. Conclusion -The company demonstrated good growth momentum in Q3 2024 but also faces challenges from rising costs. Through continuous product innovation combined with effective cost management, there is potential to achieve dual improvement in profitability and market competitiveness, driving steady company development. - -Document processing pipeline completed! -``` - -# LoopAgent - -## Features - -LoopAgent is built on SequentialAgent. It repeatedly executes the configured sub-agent sequence until the maximum iteration count is reached or a sub-agent produces an ExitAction. LoopAgent is particularly suitable for scenarios requiring iterative optimization, repeated processing, or continuous monitoring. - - - -```go -type LoopAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents - MaxIterations int // Maximum iteration count, 0 means infinite loop -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -LoopAgent execution follows these rules: - -1. **Loop execution**: Repeatedly executes the SubAgents sequence, with each loop being a complete Sequential execution process -2. **History accumulation**: Results from each iteration accumulate in History, allowing subsequent iterations to access all historical information -3. 
**Conditional exit**: The loop terminates when a sub-agent produces an ExitAction or when the maximum iteration count is reached; setting `MaxIterations=0` means the loop runs indefinitely - -LoopAgent is suitable for the following scenarios: - -- **Iterative optimization**: Tasks requiring repeated improvement, such as code optimization or parameter tuning -- **Continuous monitoring**: Periodically checking status and executing corresponding operations -- **Repeated processing**: Tasks that need multiple rounds of processing to achieve satisfactory results -- **Self-improvement**: The Agent continuously improves its output based on previous execution results - -## Example - -This example demonstrates how to use LoopAgent to create a code optimization loop: - -1. **CodeAnalyzer**: Analyzes code issues -2. **CodeOptimizer**: Optimizes code based on analysis results -3. **ExitController**: Determines whether to exit the loop - -The loop continues until code quality meets standards or the maximum iteration count is reached. - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Code analysis Agent -func NewCodeAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeAnalyzer", - Description: "Analyzes code quality and performance issues", - Instruction: `You are a code analysis expert. Please analyze the provided code and identify the following issues: -1. Performance bottlenecks -2. Code duplication -3. Readability issues -4. Potential bugs -5. 
Non-compliance with best practices - -If the code is already excellent, output "EXIT: Code quality has met standards" to end the optimization process.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Code optimization Agent -func NewCodeOptimizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeOptimizer", - Description: "Optimizes code based on analysis results", - Instruction: `Based on the previous code analysis results, optimize and improve the code: -1. Fix identified performance issues -2. Eliminate code duplication -3. Improve code readability -4. Fix potential bugs -5. Apply best practices - -Please provide the complete optimized code.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Create a special Agent to handle exit logic -func NewExitControllerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ExitController", - Description: "Controls the exit of the optimization loop", - Instruction: `Check the previous analysis results. If the code analyst believes the code quality has met standards (contains "EXIT" keyword), -output "TERMINATE" and generate an exit action to end the loop. 
Otherwise continue to the next optimization round.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create optimization flow Agents - analyzer := NewCodeAnalyzerAgent() - optimizer := NewCodeOptimizerAgent() - controller := NewExitControllerAgent() - - // Create LoopAgent, execute up to 5 optimization rounds - loopAgent, err := adk.NewLoopAgent(ctx, &adk.LoopAgentConfig{ - Name: "CodeOptimizationLoop", - Description: "Code optimization loop: Analysis → Optimization → Check exit condition", - SubAgents: []adk.Agent{analyzer, optimizer, controller}, - MaxIterations: 5, // Maximum 5 optimization rounds - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: loopAgent, - }) - - // Code example to optimize - codeToOptimize := ` -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -` - - fmt.Println("Starting code optimization loop...") - iter := runner.Query(ctx, "Please optimize the following Go code:\n"+codeToOptimize) - - iteration := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== Round %d - %s ===\n", iteration, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - - // Check if exit is needed - if event.AgentName == "ExitController" { - if event.Action != nil && event.Action.Exit { - fmt.Println("\nOptimization loop ended early!") - break - } - iteration++ - } - } - } - - fmt.Println("\nCode optimization loop completed!") -} -``` - -Run result: - -```markdown -Starting code optimization loop... 
- -=== Round 1 - CodeAnalyzer === -Analyzing the provided code: - -```go -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -``` - -### 1. Performance Bottlenecks - -- Double nested loop with O(n²) time complexity, poor performance for large datasets. -- Inner loop breaks immediately when condition is met, reducing some unnecessary comparisons, but overall still quadratic complexity. - -### 2. Code Duplication - -- No obvious code duplication, but logic can be simplified. - -### 3. Readability Issues - -- The code's intent is not very clear. Based on the code logic, the function filters out elements that have at least one element smaller than them in the array. In other words, it removes all elements that are equal to or are the minimum. -- Readability can be improved through comments or renaming. - -### 4. Potential Bugs - -- No obvious bugs currently, but returning an empty array for empty input is expected behavior. - -### 5. Non-compliance with Best Practices - -- The process is relatively inefficient, not using Go language features to optimize performance. -- Missing comments or documentation. - ---- - -## Optimization Suggestions - -If the goal is indeed to return all elements greater than some element in the array (excluding the minimum element), it can be changed to: - -- Find the minimum element minVal in the array. -- Iterate through the array, adding elements greater than minVal to the result. -- This reduces time complexity to O(n). 
- -Optimized code: - -```go -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -### Optimization Points Explained - -- Use one traversal to determine minimum value, O(n). -- Another traversal to construct result, O(n). -- Properly use slice capacity initialization to reduce memory allocation. -- Code intent is clear and performance improvement is significant. - ---- - -If you expect different logic, please let me know for further adjustments. - -=== Round 1 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // Find the minimum value in the slice - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // Collect all elements greater than the minimum value - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 1 - ExitController === -TERMINATE - -=== Round 2 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 2 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. 
-func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // Find the minimum value in the slice - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // Collect all elements greater than the minimum value - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 2 - ExitController === -TERMINATE - -=== Round 3 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 3 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 3 - ExitController === -TERMINATE - -=== Round 4 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 4 - CodeOptimizer === - -```go -// processData returns all elements in the slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 4 - ExitController === -TERMINATE - -=== Round 5 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 5 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. 
-func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 5 - ExitController === -TERMINATE - -Code optimization loop completed! - -``` - - - - -## BreakLoop - - -In a Loop Agent, when an Agent needs to interrupt the loop execution, you can use the corresponding Break Action provided by ADK. - -```go -// BreakLoopAction is a programmatic-only agent action used to prematurely -// terminate the execution of a loop workflow agent. -// When a loop workflow agent receives this action from a sub-agent, it will stop its -// current iteration and will not proceed to the next one. -// It will mark the BreakLoopAction as Done, signalling to any 'upper level' loop agent -// that this action has been processed and should be ignored further up. -// This action is not intended to be used by LLMs. -type BreakLoopAction struct { - // From records the name of the agent that initiated the break loop action. - From string - // Done is a state flag that can be used by the framework to mark when the - // action has been handled. - Done bool - // CurrentIterations is populated by the framework to record at which - // iteration the loop was broken. - CurrentIterations int -} - -// NewBreakLoopAction creates a new BreakLoopAction, signaling a request -// to terminate the current loop. -func NewBreakLoopAction(agentName string) *AgentAction { - return &AgentAction{BreakLoop: &BreakLoopAction{ - From: agentName, - }} -} -``` - -Break Action achieves the interruption purpose without affecting other Agents outside the Loop Agent, while Exit Action immediately interrupts all subsequent Agent execution. 
- -Using the following diagram as an example: - - - -- When Agent1 issues a BreakAction, the Loop Agent will be interrupted, and Sequential continues to run Agent3 -- When Agent1 issues an ExitAction, the Sequential execution flow terminates entirely, and neither Agent2 nor Agent3 will run - -# ParallelAgent - -## Features - -ParallelAgent allows multiple sub-agents to execute concurrently based on the same input context. All sub-agents start execution simultaneously and wait for all to complete before ending. This pattern is particularly suitable for tasks that can be processed independently in parallel, significantly improving execution efficiency. - - - -```go -type ParallelAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents to execute concurrently -} - -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` - -ParallelAgent execution follows these rules: - -1. **Concurrent execution**: All sub-agents start simultaneously, executing in parallel in independent goroutines -2. **Shared input**: All sub-agents receive the same initial input and context -3. 
**Wait and result aggregation**: Internally uses sync.WaitGroup to wait for all sub-agents to complete, collecting all sub-agent execution results and outputting them in the order received - -Additionally, ParallelAgent includes built-in exception handling by default: - -- **Panic recovery**: Each goroutine has its own panic recovery mechanism -- **Error isolation**: Errors from a single sub-agent do not affect the execution of other sub-agents -- **Interrupt handling**: Supports sub-agent interrupt and resume mechanisms - -ParallelAgent is suitable for the following scenarios: - -- **Independent task parallel processing**: Multiple unrelated tasks can execute simultaneously -- **Multi-angle analysis**: Analyzing the same problem from different angles simultaneously -- **Performance optimization**: Reducing overall execution time through parallel execution -- **Multi-expert consultation**: Consulting multiple specialized domain Agents simultaneously - -## Example - -This example demonstrates how to use ParallelAgent to analyze a product proposal from four different angles simultaneously: - -1. **TechnicalAnalyst**: Technical feasibility analysis -2. **BusinessAnalyst**: Business value analysis -3. **UXAnalyst**: User experience analysis -4. 
**SecurityAnalyst**: Security risk analysis - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "sync" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Technical analysis Agent -func NewTechnicalAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "TechnicalAnalyst", - Description: "Analyzes content from a technical perspective", - Instruction: `You are a technical expert. Please analyze the provided content from technical implementation, architecture design, and performance optimization perspectives. -Focus on: -1. Technical feasibility -2. Architecture rationality -3. Performance considerations -4. Technical risks -5. Implementation complexity`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Business analysis Agent -func NewBusinessAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "BusinessAnalyst", - Description: "Analyzes content from a business perspective", - Instruction: `You are a business analysis expert. Please analyze the provided content from business value, market prospects, and cost-effectiveness perspectives. -Focus on: -1. Business value -2. Market demand -3. Competitive advantages -4. Cost analysis -5. 
Revenue model`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// User experience analysis Agent -func NewUXAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "UXAnalyst", - Description: "Analyzes content from a user experience perspective", - Instruction: `You are a user experience expert. Please analyze the provided content from user experience, usability, and user satisfaction perspectives. -Focus on: -1. User friendliness -2. Operational convenience -3. Learning cost -4. User satisfaction -5. Accessibility`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Security analysis Agent -func NewSecurityAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "SecurityAnalyst", - Description: "Analyzes content from a security perspective", - Instruction: `You are a security expert. Please analyze the provided content from information security, data protection, and privacy compliance perspectives. -Focus on: -1. Data security -2. Privacy protection -3. Access control -4. Security vulnerabilities -5. 
Compliance requirements`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create four analysis Agents from different angles - techAnalyst := NewTechnicalAnalystAgent() - bizAnalyst := NewBusinessAnalystAgent() - uxAnalyst := NewUXAnalystAgent() - secAnalyst := NewSecurityAnalystAgent() - - // Create ParallelAgent for simultaneous multi-angle analysis - parallelAgent, err := adk.NewParallelAgent(ctx, &adk.ParallelAgentConfig{ - Name: "MultiPerspectiveAnalyzer", - Description: "Multi-angle parallel analysis: Technical + Business + User Experience + Security", - SubAgents: []adk.Agent{techAnalyst, bizAnalyst, uxAnalyst, secAnalyst}, - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: parallelAgent, - }) - - // Product proposal to analyze - productProposal := ` -Product Proposal: Intelligent Customer Service System - -Overview: Develop an intelligent customer service system based on large language models that can automatically answer user questions, handle common business inquiries, and transfer to human agents when necessary. - -Main Features: -1. Natural language understanding and response -2. Multi-turn conversation management -3. Knowledge base integration -4. Sentiment analysis -5. Human agent transfer -6. Conversation history recording -7. 
Multi-channel access (Web, WeChat, App) - -Technical Architecture: -- Frontend: React + TypeScript -- Backend: Go + Gin framework -- Database: PostgreSQL + Redis -- AI Model: GPT-4 API -- Deployment: Docker + Kubernetes -` - - fmt.Println("Starting multi-angle parallel analysis...") - iter := runner.Query(ctx, "Please analyze the following product proposal:\n"+productProposal) - - // Use map to collect results from different analysts - results := make(map[string]string) - var mu sync.Mutex - - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Printf("Error during analysis: %v", event.Err) - continue - } - - if event.Output != nil && event.Output.MessageOutput != nil { - mu.Lock() - results[event.AgentName] = event.Output.MessageOutput.Message.Content - mu.Unlock() - - fmt.Printf("\n=== %s analysis completed ===\n", event.AgentName) - } - } - - // Output all analysis results - fmt.Println("\n" + "============================================================") - fmt.Println("Multi-angle Analysis Results Summary") - fmt.Println("============================================================") - - analysisOrder := []string{"TechnicalAnalyst", "BusinessAnalyst", "UXAnalyst", "SecurityAnalyst"} - analysisNames := map[string]string{ - "TechnicalAnalyst": "Technical Analysis", - "BusinessAnalyst": "Business Analysis", - "UXAnalyst": "User Experience Analysis", - "SecurityAnalyst": "Security Analysis", - } - - for _, agentName := range analysisOrder { - if result, exists := results[agentName]; exists { - fmt.Printf("\n【%s】\n", analysisNames[agentName]) - fmt.Printf("%s\n", result) - fmt.Println("----------------------------------------") - } - } - - fmt.Println("\nMulti-angle parallel analysis completed!") - fmt.Printf("Received %d analysis results\n", len(results)) -} -``` - -Run result: - -```markdown -Starting multi-angle parallel analysis... 
- -=== BusinessAnalyst analysis completed === - -=== UXAnalyst analysis completed === - -=== SecurityAnalyst analysis completed === - -=== TechnicalAnalyst analysis completed === - -============================================================ -Multi-angle Analysis Results Summary -============================================================ - -【Technical Analysis】 -For this intelligent customer service system proposal, here is a detailed analysis from technical implementation, architecture design, and performance optimization perspectives: - ---- - -### I. Technical Feasibility - -1. **Natural Language Understanding and Response** - - Using GPT-4 API for natural language understanding and automatic response is a mature and feasible solution. GPT-4 has strong language understanding and generation capabilities, suitable for handling complex and diverse questions. - -2. **Multi-turn Conversation Management** - - Relies on backend to maintain context state, combined with GPT-4 model can handle multi-turn interactions well. Need to design reasonable context management mechanism (such as conversation history maintenance, key slot extraction, etc.) to ensure context information integrity. - -3. **Knowledge Base Integration** - - Can add specific knowledge base retrieval results to GPT-4 API (retrieval-augmented generation), or integrate knowledge base through local retrieval interface. Technically feasible, but has high requirements for real-time and accuracy. - -4. **Sentiment Analysis** - - Sentiment analysis function can be implemented with independent lightweight models (such as fine-tuned BERT), or try using GPT-4 output, but cost is higher. Sentiment analysis capability helps intelligent customer service better understand user emotions and improve user experience. - -5. **Human Agent Transfer** - - Technically achievable through establishing event trigger rules (such as turn count, emotion threshold, keyword detection) to implement automatic transfer to human. 
System needs to support ticket or session transfer mechanism and ensure seamless session switching. - -6. **Multi-channel Access** - - Multi-channel access including web, WeChat, App can all be achieved through unified API gateway, technology is mature, while needing to handle channel differences (message format, authentication, push mechanism, etc.). - ---- - -### II. Architecture Rationality - -- **Frontend React + TypeScript** - Very suitable for building responsive customer service interface, mature ecosystem, convenient for multi-channel component sharing. - -- **Backend Go + Gin** - Go language has excellent performance, Gin framework is lightweight and high-performance, suitable for high-concurrency scenarios. Backend handles GPT-4 API integration, state management, multi-channel message forwarding and other responsibilities, reasonable choice. - -- **Database PostgreSQL + Redis** - - PostgreSQL handles structured data storage, such as user information, conversation history, knowledge base metadata. - - Redis handles session state caching, hot knowledge base, rate limiting, etc., improving access performance. - Architecture design follows common large internet product patterns, with clear component division. - -- **AI Model GPT-4 API** - Using mature API reduces development difficulty and model maintenance cost; disadvantage is high dependency on network and API calls. - -- **Deployment Docker + Kubernetes** - Containerization and K8s orchestration ensure system elastic scaling, high availability and canary deployment, suitable for production environment, follows modern microservices architecture trends. - ---- - -### III. Performance Considerations - -1. **Response Time** - - GPT-4 API calls have inherent latency (usually hundreds of milliseconds to 1 second), significantly affecting response time. Need to handle interface asynchronously and design frontend experience well (such as loading animations, partial progressive response). - -2. 
**Concurrent Processing Capability** - - Backend Go has high concurrent processing advantages, combined with Redis caching hot data, can greatly improve overall throughput. - - But GPT-4 API calls are limited by OpenAI service QPS limits and call costs, need to reasonably design call frequency and degradation strategies. - -3. **Caching Strategy** - - Cache user conversation context and common question answers to reduce repeated API calls. - - Match key questions locally first, call GPT-4 only on failure, improving efficiency. - -4. **Multi-channel Load Balancing** - - Need to design unified message bus and reliable async queue to prevent traffic spikes from one channel affecting overall system stability. - ---- - -### IV. Technical Risks - -1. **GPT-4 API Dependency** - - High dependency on third-party API, risks include service interruption, interface changes and cost fluctuations. - - Recommend designing local cache and limited alternative response logic to handle API exceptions. - -2. **Multi-turn Conversation Context Management Difficulty** - - Context too long or complex will reduce answer quality, need to design context length limits and selective important information retention mechanism. - -3. **Knowledge Base Integration Complexity** - - How to achieve knowledge base and... ----------------------------------------- - -【Business Analysis】 -Here is the business perspective analysis of the intelligent customer service system product proposal: - -1. Business Value -- Improve customer service efficiency: Automatically answer user questions and common inquiries, reduce human agent pressure, lower labor costs. -- Improve user experience: Multi-turn conversation and sentiment analysis make interactions more natural, enhance customer satisfaction and stickiness. -- Data-driven decision support: Conversation history and knowledge base integration provide valuable user feedback and behavior data for enterprises, optimizing products and services. 
-- Support business expansion: Multi-channel access (web, WeChat, App) meets different customer access habits, improving coverage. - -2. Market Demand -- Market demand for intelligent customer service continues to grow, especially in e-commerce, finance, healthcare, education and other industries, customer service automation is an important direction for enterprise digital transformation. -- With the maturity of AI technology, enterprises expect to use large language models to improve customer service intelligence level. -- Users' demand for instant response and 24/7 service is increasing, driving widespread adoption of intelligent customer service systems. - -3. Competitive Advantages -- Using advanced GPT-4 large language model, has strong natural language understanding and generation capabilities, improving Q&A accuracy and conversation naturalness. -- Sentiment analysis function helps accurately identify user emotions, dynamically adjust response strategies, improve customer satisfaction. -- Multi-channel access design meets enterprise diversified customer reach needs, enhancing product applicability. -- Technical architecture uses microservices, containerized deployment, convenient for elastic scaling and maintenance, improving system stability and scalability. - -4. Cost Analysis -- AI model call cost is high, depends on GPT-4 API, need to adjust budget based on call volume and response speed. -- Technical R&D investment is large, involving frontend and backend, multi-channel integration, AI and knowledge base management. -- Operation and server costs need to consider multi-channel concurrent access. -- In the long term, human agent count can be significantly reduced, saving labor costs. -- Can reduce initial hardware investment through cloud services, but cloud resource usage needs careful management to control costs. - -5. 
Revenue Model -- SaaS subscription service: Charge monthly/yearly service fees to enterprise customers, tiered pricing based on access channels, concurrency, and feature levels. -- Charge by call count or conversation count, suitable for customers with large business fluctuations. -- Value-added services: Data analysis report customization, industry knowledge base integration, human agent collaboration tools, etc. -- For medium and large customers, can provide custom development and technical support, charging project fees. -- Through continuous model and service optimization, increase customer retention and renewal rates. - -In summary, this intelligent customer service system based on mature technology and AI advantages has good business value and market potential. Its multi-channel access and sentiment analysis features enhance competitiveness, but need to reasonably control AI call costs and operating expenses. Recommend focusing on SaaS subscription and value-added services, combined with marketing, quickly capture customer resources and improve profitability. ----------------------------------------- - -【User Experience Analysis】 -For this intelligent customer service system proposal, I will analyze from user experience, usability, user satisfaction and accessibility perspectives: - -1. User Friendliness -- Natural language understanding and response capability improves user communication experience with the system, allowing users to express needs in natural language, reducing communication barriers. -- Multi-turn conversation management allows the system to understand context, reducing repeated explanations, enhancing conversation coherence, further improving user experience. -- Sentiment analysis function helps the system identify user emotions, making more thoughtful responses, improving interaction personalization and humanization. 
-- Multi-channel access covers users' commonly used access paths, convenient for users to get service anytime anywhere, improving friendliness. - -2. Operational Convenience -- Automatically answering common business inquiries can reduce user waiting time and operational burden, improving response speed. -- Human agent transfer mechanism ensures complex issues can be handled timely, ensuring service continuity and seamless operation handoff. -- Conversation history recording convenient for users to review consultation content, avoiding repeated queries, improving operational convenience. -- Using modern tech stack (React, TypeScript) provides good frontend interaction performance and response speed, indirectly enhancing operational smoothness. - -3. Learning Cost -- Based on natural language processing, users don't need to learn special commands, lowering usage threshold. -- Multi-turn conversation natural connection makes it easier for users to understand system response logic, reducing confusion and frustration. -- Consistent interface across different channels (such as keeping similar experience on web and WeChat) helps users get started quickly. -- More precise feedback provided through sentiment analysis reduces time cost of users frequently trying due to misunderstanding. - -4. User Satisfaction -- Fast and accurate automatic replies and multi-turn conversation reduce user waiting and repeated input, improving satisfaction. -- Sentiment analysis makes the system better understand user emotions, bringing warmer interaction experience, increasing user stickiness. -- Human agent intervention ensures complex issues are properly handled, improving service quality perception. -- Multi-channel coverage meets different users' usage scenarios, enhancing overall satisfaction. - -5. Accessibility -- Multi-channel access covers web, WeChat, App, adapting to different users' devices and environments, improving accessibility. 
-- The proposal doesn't explicitly mention accessibility design (such as screen reader compatibility, high contrast mode, etc.), which may be an area to supplement in the future. -- Frontend using React and TypeScript is conducive to implementing responsive design and accessibility features, but need to ensure development standards are implemented. -- Backend architecture and deployment solution ensure system stability and scalability, indirectly improving user continuous accessibility. - -Summary: -This intelligent customer service system proposal is fairly comprehensive in user experience and usability considerations, using large language models to achieve natural multi-turn conversation, sentiment analysis and knowledge base integration, meeting users' diverse needs. Meanwhile, multi-channel access enhances system coverage. Recommend strengthening accessibility design in specific implementation to achieve more comprehensive accessibility assurance, while continuing to optimize conversation strategies to improve user satisfaction. ----------------------------------------- - -【Security Analysis】 -For this intelligent customer service system proposal, here is the analysis from information security, data protection and privacy compliance perspectives: - -I. Data Security - -1. Data Transmission Security -- Recommend all client-server communications use TLS/SSL encryption to ensure data confidentiality and integrity during transmission. -- Since multi-channel access is supported (web, WeChat, App), need to ensure each entry point strictly implements encrypted transmission. - -2. Data Storage Security -- PostgreSQL stores sensitive information like conversation history and user data, need to enable database encryption (such as transparent data encryption TDE or field-level encryption) to prevent data leakage. -- Redis as cache may store temporary session data, also need to enable access authentication and encrypted transmission. 
-- Implement minimum storage principle for user sensitive data, avoid storing unrelated data beyond scope. -- Data backup process needs encrypted storage, and backup access should also be controlled. - -3. API Call Security -- GPT-4 API calls generate large amounts of user data interaction, should evaluate its data processing and storage policies to ensure compliance with data security requirements. -- Add call permission management, limit API key access scope and permissions to prevent abuse. - -4. Log Security -- System logs should avoid storing plaintext sensitive information, especially personal identity information and conversation content. Log access needs strict control. - -II. Privacy Protection - -1. Personal Data Processing -- Collection and storage of user personal data (name, contact information, account information, etc.) must clearly inform users and obtain user consent. -- Implement data anonymization/de-identification technology, especially for identity information processing in conversation history. - -2. User Privacy Rights -- Meet users' rights to access, correct, and delete data in relevant laws and regulations (such as Personal Information Protection Law, GDPR). -- Provide privacy policy clearly disclosing data collection, use and sharing situations. - -3. Interaction Privacy -- Multi-turn conversation and sentiment analysis features should consider avoiding excessive invasion of user privacy, such as transparent notification and restriction of sensitive emotion data usage. - -4. Third-party Compliance -- GPT-4 API is provided by third party, need to ensure its service complies with relevant privacy compliance requirements and data protection standards. - -III. Access Control - -1. User Identity Verification -- When system involves user identity information query and management, need to establish reliable identity authentication mechanism. -- Support multi-factor authentication to enhance security. - -2. 
Permission Management -- Backend management interface and human agent transfer module need to use role-based access control (RBAC) to ensure minimum operation permissions. -- Operations accessing sensitive data need detailed audit and monitoring. - -3. Session Management -- Need effective session management mechanism for multi-channel sessions to prevent session hijacking. -- Conversation history access permissions should be limited to only relevant users or authorized personnel. - -IV. Security Vulnerabilities - -1. Application Security -- Frontend React+TypeScript should prevent XSS, CSRF attacks, reasonably use Content Security Policy (CSP). -- Backend Go application needs to prevent SQL injection, request forgery and permission deficiency. Gin framework provides middleware support, recommend fully utilizing security modules. - -2. AI Model Risks -- GPT-4 API input/output may have sensitive information leakage or model misuse risks, need to limit input content and filter sensitive information. -- Prevent generating malicious answers or information leakage, establish content review mechanism. - -3. Container and Deployment Security -- Docker containers must use secure images and patch timely. Kubernetes cluster network policies and access control need to be complete. -- Container runtime permissions minimized to avoid container escape risks. - -V. Compliance Requirements - -1. Data Protection Regulations -- Based on operating region, need to comply with Personal Information Protection Law (PIPL), EU General Data Protection Regulation (GDPR) or other relevant legal requirements. -- Clearly define user data collection, processing, transmission and storage processes comply with regulations. - -2. User Privacy Notice and Consent -- Should provide clear privacy policy and terms of use, explaining data purposes and processing methods. -- Implement user consent management mechanism. - -3. 
Cross-border Data Transfer Compliance
-- If system involves cross-border data flow, need to assess compliance risks and take corresponding technical...
-----------------------------------------
-
-Multi-angle parallel analysis completed!
-Received 4 analysis results
-```
-
-# Summary
-
-Workflow Agents provide powerful multi-agent collaboration capabilities for Eino ADK. By reasonably selecting and combining these Workflow Agents, developers can build efficient and reliable multi-agent collaboration systems to meet various complex business requirements.
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_interface.md b/content/en/docs/eino/core_modules/eino_adk/agent_interface.md
index 0c57ea1982d..e22d4cd2e7a 100644
--- a/content/en/docs/eino/core_modules/eino_adk/agent_interface.md
+++ b/content/en/docs/eino/core_modules/eino_adk/agent_interface.md
@@ -1,390 +1,198 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-17"
 lastmod: ""
 tags: []
-title: 'Eino ADK: Agent Abstraction'
+title: Agent Abstraction
 weight: 3
 ---

-# Agent Definition
+# Agent Interface

-Eino defines a basic interface for Agents. Any struct implementing this interface can be considered an Agent:
+All ADK functionality revolves around the `Agent` interface:

 ```go
-// github.com/cloudwego/eino/adk/interface.go
+// github.com/cloudwego/eino/adk

-type Agent interface {
+type TypedAgent[M MessageType] interface {
     Name(ctx context.Context) string
     Description(ctx context.Context) string
-    Run(ctx context.Context, input *AgentInput, opts ...AgentRunOption) *AsyncIterator[*AgentEvent]
+    Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]]
 }
+
+// Default type alias (uses *schema.Message)
+type Agent = TypedAgent[*schema.Message]
 ```

 | Method | Description |
 |---|---|
-| Name | The name of the Agent, serving as the Agent's identifier |
-| Description | Description of the Agent's capabilities, mainly used for other Agents to understand and determine this Agent's responsibilities or functions |
-| Run | The core execution method of the Agent, returns an iterator through which callers can continuously receive events produced by the Agent |
+| `Name` | Agent name identifier |
+| `Description` | Capability description, for other Agents or the framework to understand abilities |
+| `Run` | Core execution method, asynchronously returns event stream (Future pattern) |
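
The `Run` row above describes an asynchronous event stream. The following is a minimal, self-contained sketch of that pattern, not the ADK implementation: `event`, `iterator`, `generator`, `newPair`, `echoAgent`, and `collect` are hypothetical stand-ins built on a plain unbuffered channel, and the method signatures are simplified (no `context.Context`, no options).

```go
package main

import "fmt"

// event is a hypothetical stand-in for an Agent event.
type event struct {
	AgentName string
	Text      string
}

// iterator and generator are hypothetical stand-ins for an
// iterator/generator pair; here they share one unbuffered channel.
type iterator struct{ ch <-chan event }
type generator struct{ ch chan<- event }

// newPair returns the consuming and producing ends of the same stream.
func newPair() (*iterator, *generator) {
	ch := make(chan event)
	return &iterator{ch: ch}, &generator{ch: ch}
}

// Next blocks until an event arrives or the generator is closed.
func (it *iterator) Next() (event, bool) {
	e, ok := <-it.ch
	return e, ok
}

func (g *generator) Send(e event) { g.ch <- e }
func (g *generator) Close()       { close(g.ch) }

// echoAgent emits one event per input string.
type echoAgent struct{ name string }

func (a *echoAgent) Name() string        { return a.name }
func (a *echoAgent) Description() string { return "echoes each input message" }

// Run starts producing in a goroutine and returns the iterator immediately.
func (a *echoAgent) Run(inputs []string) *iterator {
	it, gen := newPair()
	go func() {
		defer gen.Close()
		for _, in := range inputs {
			gen.Send(event{AgentName: a.name, Text: in})
		}
	}()
	return it
}

// collect drains the iterator until the producer closes it.
func collect(it *iterator) []string {
	var out []string
	for {
		e, ok := it.Next()
		if !ok {
			break
		}
		out = append(out, e.AgentName+": "+e.Text)
	}
	return out
}

func main() {
	agent := &echoAgent{name: "echo"}
	for _, line := range collect(agent.Run([]string{"hello", "world"})) {
		fmt.Println(line)
	}
}
```

The producer closes the stream in a `defer`, so the consumer's loop always terminates even if the producing goroutine returns early.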
-## AgentInput
-
-The Run method receives AgentInput as the Agent's input:
-
-```go
-type AgentInput struct {
-    Messages        []Message
-    EnableStreaming bool
-}
-
-type Message = *schema.Message
-```
-
-Agents typically have ChatModel as their core, so the Agent's input is defined as `Messages`, which is the same type as when calling Eino ChatModel. `Messages` can include user instructions, conversation history, background knowledge, sample data, or any other data you want to pass to the Agent. For example:
+## MessageType Constraint

 ```go
-import (
-    "github.com/cloudwego/eino/adk"
-    "github.com/cloudwego/eino/schema"
-)
-
-input := &adk.AgentInput{
-    Messages: []adk.Message{
-        schema.UserMessage("What's the capital of France?"),
-        schema.AssistantMessage("The capital of France is Paris.", nil),
-        schema.UserMessage("How far is it from London? "),
-    },
+type MessageType interface {
+    *schema.Message | *schema.AgenticMessage
 }
 ```

-`EnableStreaming` is used to **suggest** the output mode to the Agent, but it is not a mandatory constraint. Its core idea is to control the behavior of components that support both streaming and non-streaming output, such as ChatModel, while `EnableStreaming` does not affect components that only support one output method. Additionally, the `AgentOutput.IsStreaming` field indicates the actual output type. The runtime behavior is:
+All ADK generic types are parameterized with `[M MessageType]`. `*schema.Message` supports full ADK features; `*schema.AgenticMessage` is for the structured content block pattern introduced in v0.9.

-- When `EnableStreaming=false`, components that can output both streaming and non-streaming will use the non-streaming mode that returns complete results at once.
-- When `EnableStreaming=true`, components within the Agent that can output streaming (such as ChatModel calls) should return results progressively as a stream. If a component naturally doesn't support streaming, it can still work in its original non-streaming way.
+## Type Alias Quick Reference

-As shown in the diagram below, ChatModel can output both streaming and non-streaming, while Tool can only output non-streaming:
-
-- When `EnableStream=false`, both output non-streaming
-- When `EnableStream=true`, ChatModel outputs streaming, Tool still outputs non-streaming because it doesn't have streaming capability.
-
-
-
-## AgentRunOption
-
-`AgentRunOption` is defined by the Agent implementation and can modify Agent configuration or control Agent behavior at the request level.
-
-Eino ADK provides some commonly defined Options for users:
-
-- `WithSessionValues`: Set cross-Agent read/write data
-- `WithSkipTransferMessages`: When configured, if the Event is a Transfer to SubAgent, the messages in the Event will not be appended to History
-
-Eino ADK provides `WrapImplSpecificOptFn` and `GetImplSpecificOptions` methods for Agents to wrap and read custom `AgentRunOptions`.
-
-When using the `GetImplSpecificOptions` method to read `AgentRunOptions`, AgentRunOptions that don't match the required type (like options in the example) will be ignored.
+| Generic Type | Default Alias |
+|---|---|
+| `TypedAgent[*schema.Message]` | `Agent` |
+| `TypedAgentInput[*schema.Message]` | `AgentInput` |
+| `TypedAgentEvent[*schema.Message]` | `AgentEvent` |
+| `TypedAgentOutput[*schema.Message]` | `AgentOutput` |
+| `TypedMessageVariant[*schema.Message]` | `MessageVariant` |
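
The alias table above pairs each generic type with a default instantiation. The sketch below shows how that generic-plus-alias pattern behaves in plain Go, under stated assumptions: `Message` and `AgenticMessage` are hypothetical local stand-ins for the `schema` types, and `countMessages` is an illustrative helper that is not part of ADK.

```go
package main

import "fmt"

// Message and AgenticMessage are hypothetical stand-ins for the
// two message types named in the MessageType constraint.
type Message struct{ Content string }
type AgenticMessage struct{ Blocks []string }

// MessageType mirrors the union constraint: exactly two pointer types.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedAgentInput is parameterized over the message type.
type TypedAgentInput[M MessageType] struct {
	Messages        []M
	EnableStreaming bool
}

// AgentInput is an alias, so it is the same type as
// TypedAgentInput[*Message], not a new distinct type.
type AgentInput = TypedAgentInput[*Message]

// countMessages (illustrative helper) accepts any instantiation.
func countMessages[M MessageType](in *TypedAgentInput[M]) int {
	return len(in.Messages)
}

func main() {
	plain := &AgentInput{Messages: []*Message{{Content: "hi"}}}

	// Assignable without conversion because the alias is identical
	// to the instantiated generic type.
	var same *TypedAgentInput[*Message] = plain

	agentic := &TypedAgentInput[*AgenticMessage]{
		Messages: []*AgenticMessage{{Blocks: []string{"a"}}, {Blocks: []string{"b"}}},
	}

	fmt.Println(countMessages(same), countMessages(agentic))
}
```

Because the defaults are aliases rather than new named types, code written against the short names keeps compiling alongside code that spells out the generic form.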
-For example, you can define `WithModelName` to require the Agent to change the model being called at the request level:
+# AgentInput

 ```go
-// github.com/cloudwego/eino/adk/call_option.go
-// func WrapImplSpecificOptFn[T any](optFn func(*T)) AgentRunOption
-// func GetImplSpecificOptions[T any](base *T, opts ...AgentRunOption) *T
-
-import "github.com/cloudwego/eino/adk"
-
-type options struct {
-    modelName string
-}
-
-func WithModelName(name string) adk.AgentRunOption {
-    return adk.WrapImplSpecificOptFn(func(t *options) {
-        t.modelName = name
-    })
-}
-
-func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] {
-    o := &options{}
-    o = adk.GetImplSpecificOptions(o, opts...)
-    // run code...
+type TypedAgentInput[M MessageType] struct {
+    Messages        []M
+    EnableStreaming bool
 }
 ```

-Additionally, AgentRunOption has a `DesignateAgent` method. Calling this method allows you to specify which Agent the Option takes effect on when calling a multi-Agent system:
-
-```go
-func genOpt() {
-    // Specify that the option only takes effect for agent_1 and agent_2
-    opt := adk.WithSessionValues(map[string]any{}).DesignateAgent("agent_1", "agent_2")
-}
-```
+- **Messages**: User instructions, conversation history, background knowledge, etc., consistent with ChatModel input format
+- **EnableStreaming**: Suggests the Agent use streaming output. Components that support streaming (such as ChatModel) will return results progressively; components that don't support it are unaffected

-## AsyncIterator
+# AgentEvent

-`Agent.Run` returns an iterator `AsyncIterator[*AgentEvent]`:
+Events produced during Agent execution:

 ```go
-// github.com/cloudwego/eino/adk/utils.go
-
-type AsyncIterator[T any] struct {
-    ...
-}
-
-func (ai *AsyncIterator[T]) Next() (T, bool) {
-    ...
+type TypedAgentEvent[M MessageType] struct {
+    AgentName string
+    RunPath   []RunStep
+    Output    *TypedAgentOutput[M]
+    Action    *AgentAction
+    Err       error
 }
 ```

-It represents an asynchronous iterator (asynchronous means there is no synchronization control between production and consumption), allowing callers to consume a series of events produced by the Agent in an ordered, blocking manner.
-
-- `AsyncIterator` is a generic struct that can be used to iterate over any type of data. Currently in the Agent interface, the iterator type returned by the Run method is fixed as `AsyncIterator[*AgentEvent]`. This means that every element you get from this iterator will be a pointer to an `AgentEvent` object. `AgentEvent` will be explained in detail in the following sections.
-- The main way to interact with the iterator is by calling its `Next()` method. This method is **blocking** - each call to `Next()` will pause execution until one of the following two situations occurs:
-  - Agent produces a new `AgentEvent`: The `Next()` method returns this event, and the caller can immediately process it.
-  - Agent actively closes the iterator: When the Agent will no longer produce any new events (usually when the Agent finishes running), it closes the iterator. At this point, the `Next()` call will end blocking and return false in the second return value, telling the caller that iteration has ended.
-
-Typically, you need to use a for loop to process `AsyncIterator`:
+## AgentOutput

 ```go
-iter := myAgent.Run(xxx) // get AsyncIterator from Agent.Run
-
-for {
-    event, ok := iter.Next()
-    if !ok {
-        break
-    }
-    // handle event
+type TypedAgentOutput[M MessageType] struct {
+    MessageOutput    *TypedMessageVariant[M]
+    CustomizedOutput any
 }
 ```

-`AsyncIterator` can be created by `NewAsyncIteratorPair`, which returns another parameter `AsyncGenerator` for producing data:
+`MessageVariant` provides unified handling of streaming and non-streaming messages:

 ```go
-// github.com/cloudwego/eino/adk/utils.go
-
-func NewAsyncIteratorPair[T any]() (*AsyncIterator[T], *AsyncGenerator[T])
-```
-
-Agent.Run returns AsyncIterator to let callers receive a series of AgentEvents produced by the Agent in real-time. Therefore, Agent.Run typically runs the Agent in a Goroutine to immediately return the AsyncIterator for the caller to listen to:
-
-```go
-import "github.com/cloudwego/eino/adk"
-
-func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] {
-    // handle input
-    iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]()
-    go func() {
-        defer func() {
-            // recover code
-            gen.Close()
-        }()
-        // agent run code
-        // gen.Send(event)
-    }()
-    return iter
+type TypedMessageVariant[M MessageType] struct {
+    IsStreaming   bool
+    Message       M
+    MessageStream *schema.StreamReader[M]
+    Role          schema.RoleType        // *schema.Message path
+    AgenticRole   schema.AgenticRoleType // *schema.AgenticMessage path
+    ToolName      string
 }
 ```

-## AgentWithOptions
-
-Using the `AgentWithOptions` method allows you to make some general configurations in Eino ADK Agents.
-
-Unlike `AgentRunOption`, `AgentWithOptions` takes effect before running and does not support custom options.
+- `IsStreaming=true` → Read frame by frame from `MessageStream`
+- `IsStreaming=false` → Get all at once from `Message`
+- `Role`/`ToolName`: Only valid for the `*schema.Message` path (Assistant or Tool)
+- `AgenticRole`: Only valid for the `*schema.AgenticMessage` path

-```go
-// github.com/cloudwego/eino/adk/flow.go
-func AgentWithOptions(ctx context.Context, agent Agent, opts ...AgentOption) Agent
-```
-
-Currently built-in supported configurations in Eino ADK include:
-
-- `WithDisallowTransferToParent`: Configures that this SubAgent is not allowed to Transfer to ParentAgent, which will trigger the SubAgent's `OnDisallowTransferToParent` callback method
-- `WithHistoryRewriter`: When configured, the Agent will rewrite the received context information through this method before execution

-# AgentEvent
+## AgentAction

-AgentEvent is the core event data structure produced by the Agent during its run. It contains the Agent's metadata, output, actions, and errors:
+Behavior signals for controlling multi-Agent collaboration:

 ```go
-// github.com/cloudwego/eino/adk/interface.go
-
-type AgentEvent struct {
-    AgentName string
-
-    RunPath []RunStep
-
-    Output *AgentOutput
-
-    Action *AgentAction
-
-    Err error
+type AgentAction struct {
+    Exit             bool
+    Interrupted      *InterruptInfo
+    TransferToAgent  *TransferToAgentAction // NOT RECOMMENDED
+    BreakLoop        *BreakLoopAction
+    CustomizedAction any
 }
-
-// EventFromMessage constructs a regular event
-func EventFromMessage(msg Message, msgStream MessageStream, role schema.RoleType, toolName string) *AgentEvent
 ```

-## AgentName & RunPath
-
-The `AgentName` and `RunPath` fields are automatically filled by the framework. They provide important context information about the event source, which is crucial in complex systems composed of multiple Agents.
+- **Interrupted**: Interrupts Runner execution, carries custom data, supports subsequent Resume
+- **BreakLoop**: Terminates the LoopAgent's loop
+- **Exit**: Immediately exits the multi-Agent system
+- **TransferToAgent**: (Not recommended) Task transfer, AgentAsTool is recommended instead

-```go
-type RunStep struct {
-    agentName string
-}
-```
+# AgentRunOption

-- `AgentName` indicates which Agent instance produced the current AgentEvent.
-- `RunPath` records the complete call chain to reach the current Agent. `RunPath` is a slice of `RunStep` that sequentially records all `AgentNames` from the initial entry Agent to the current Agent producing the event.
-
-## AgentOutput
+Request-level Agent configuration. ADK built-ins:

-`AgentOutput` encapsulates the output produced by the Agent.
+- `WithSessionValues(map[string]any)`: Injects KV data shared across Agents
+- `WithCallbacks(...callbacks.Handler)`: Adds callback handlers
+- `WithCancel()`: Enables Agent Cancel capability (see [Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart))

-Message output is set in the MessageOutput field, while other types of custom output are set in the CustomizedOutput field:
+Custom Options:

 ```go
-// github.com/cloudwego/eino/adk/interface.go
-
-type AgentOutput struct {
-    MessageOutput *MessageVariant
-
-    CustomizedOutput any
+type myOptions struct {
+    modelName string
 }

-type MessageVariant struct {
-    IsStreaming bool
+func WithModelName(name string) adk.AgentRunOption {
+    return adk.WrapImplSpecificOptFn(func(t *myOptions) {
+        t.modelName = name
+    })
+}

-    Message       Message
-    MessageStream MessageStream
-    // message role: Assistant or Tool
-    Role schema.RoleType
-    // only used when Role is Tool
-    ToolName string
+// Read in Run
+func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] {
+    o := adk.GetImplSpecificOptions(&myOptions{}, opts...)
+    // Use o.modelName ...
 }
 ```

-The `MessageVariant` type of the `MessageOutput` field is a core data structure with main functions:
-
-1. Unified handling of streaming and non-streaming messages: `IsStreaming` is a flag. When true, it indicates the current `MessageVariant` contains a streaming message (read from MessageStream). When false, it indicates a non-streaming message (read from Message):
-
-   - Streaming: Returns a series of message fragments progressively over time, eventually forming a complete message (MessageStream).
-   - Non-streaming: Returns a complete message at once (Message).
-2. Convenient metadata access: The Message struct contains some important metadata, such as the message's Role (Assistant or Tool). To quickly identify message type and source, MessageVariant elevates these commonly used metadata to the top level:
-
-   - `Role`: The role of the message, Assistant / Tool
-   - `ToolName`: If the message role is Tool, this field directly provides the tool's name.
-
-The benefit of this is that when code needs to route or make decisions based on message type, it doesn't need to deeply parse the specific content of the Message object - it can directly get the needed information from MessageVariant's top-level fields, simplifying logic and improving code readability and efficiency.
-
-## AgentAction
-
-When an Agent produces an Event containing AgentAction, it can control multi-Agent collaboration, such as immediate exit, interrupt, jump, etc.:
+`DesignateAgent` can restrict an Option to a specific Agent:

 ```go
-// github.com/cloudwego/eino/adk/interface.go
-
-type AgentAction struct {
-    Exit bool
-
-    Interrupted *InterruptInfo
-
-    TransferToAgent *TransferToAgentAction
-
-    BreakLoop *BreakLoopAction
-
-    CustomizedAction any
-}
-
-type InterruptInfo struct {
-    Data any
-}
-
-type TransferToAgentAction struct {
-    DestAgentName string
-}
+opt := adk.WithSessionValues(map[string]any{"key": "val"}).DesignateAgent("agent_1")
 ```

-Eino ADK currently has four preset Actions:
+# AsyncIterator

-1. Exit: When an Agent produces an Exit Action, the Multi-Agent will exit immediately
+The asynchronous event iterator returned by `Run`:

 ```go
-func NewExitAction() *AgentAction {
-    return &AgentAction{Exit: true}
+iter := agent.Run(ctx, input)
+for {
+    event, ok := iter.Next()
+    if !ok {
+        break
+    }
+    // Handle event
 }
 ```

-2. Transfer: When an Agent produces a Transfer Action, it will jump to the target Agent to run
+`Next()` blocks until a new event is available or iteration ends. Agent implementations typically write to a Generator in a goroutine and return the Iterator immediately:

 ```go
-func NewTransferToAgentAction(destAgentName string) *AgentAction {
-    return &AgentAction{TransferToAgent: &TransferToAgentAction{DestAgentName: destAgentName}}
+func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] {
+    iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]()
+    go func() {
+        defer gen.Close()
+        // Execution logic, produce events via gen.Send(event)
+    }()
+    return iter
 }
 ```

-3. Interrupt: When an Agent produces an Interrupt Action, it will interrupt the Runner's execution. Since interrupts can occur at any position and need to pass unique information when interrupting, the Action provides an `Interrupted` field for Agents to set custom data. When the Runner receives an Action with non-empty Interrupted, it considers an interrupt has occurred. The internal mechanism of Interrupt & Resume is relatively complex and will be elaborated in the [Eino ADK: Agent Runner] - [Eino ADK: Interrupt & Resume] section.
-
-```go
-// For example, when ChatModelAgent interrupts, it sends the following AgentEvent:
-h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{
-    Interrupted: &InterruptInfo{
-        Data: &ChatModelAgentInterruptInfo{Data: data, Info: info},
-    },
-}})
-```
-
-4. Break Loop: When a child Agent of LoopAgent emits a BreakLoopAction, the corresponding LoopAgent will stop looping and exit normally.
-
 # Language Settings

-ADK provides a `SetLanguage` function to set the language for built-in prompts. This affects the language of prompts generated by all ADK built-in components and middleware. This capability was introduced in [alpha/08](https://github.com/cloudwego/eino/releases/tag/v0.8.0-alpha.13) version.
- -## API - ```go -// Language represents the language setting for ADK built-in prompts -type Language uint8 - -const ( - // LanguageEnglish represents English (default) - LanguageEnglish Language = iota - // LanguageChinese represents Chinese - LanguageChinese -) - -// SetLanguage sets the language for ADK built-in prompts -// The default language is English (if not explicitly set) -func SetLanguage(lang Language) error +adk.SetLanguage(adk.LanguageChinese) // Or adk.LanguageEnglish (default) ``` -## Usage Example - -```go -import "github.com/cloudwego/eino/adk" - -// Set to Chinese -err := adk.SetLanguage(adk.LanguageChinese) -if err != nil { - // Handle error -} - -// Set to English (default) -err = adk.SetLanguage(adk.LanguageEnglish) -``` - -## Scope of Effect - -Language settings affect the built-in prompts of the following components: - - - - - - - -
-| Component/Middleware | Affected Prompts |
-| --- | --- |
-| FileSystem Middleware | File system tool descriptions, system prompts, execution tool prompts |
-| Reduction Middleware | Tool result truncation/cleanup prompt text |
-| Skill Middleware | Skill system prompts, skill tool descriptions |
-| ChatModelAgent | Built-in system prompts |
    - -> 💡 -> It is recommended to set the language during program initialization because the language setting takes effect globally. Changing the language at runtime may result in mixed-language prompts within the same session. +Affects ADK built-in prompts (FileSystem, Reduction, Skill, ChatModelAgent, and other components). It is recommended to set this during program initialization. > 💡 -> The language setting only affects ADK built-in prompts. Your custom prompts (such as Agent's Instruction) need to handle internationalization on your own. +> The language setting only affects ADK built-in prompts. Custom Instructions need to handle internationalization on their own. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_preview.md b/content/en/docs/eino/core_modules/eino_adk/agent_preview.md index 4dcd230afb4..8b8804d166b 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_preview.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_preview.md @@ -1,162 +1,80 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Overview' +title: Overview weight: 2 --- # What is Eino ADK? -Eino ADK, inspired by [Google-ADK](https://google.github.io/adk-docs/agents/), provides a flexible composition framework for Agent development in Go, i.e., an Agent and Multi-Agent development framework. Eino ADK has accumulated common capabilities for multi-Agent interaction, including context passing, event stream distribution and conversion, task control transfer, interrupt and resume, and common aspects. It is widely applicable, model-agnostic, and deployment-agnostic, making Agent and Multi-Agent development simpler and more convenient while providing comprehensive production-grade application governance capabilities. +Eino ADK is a Go language Agent development framework that provides: -Eino ADK aims to help developers develop and manage Agent applications. 
It provides a flexible and robust development environment to help developers build various Agent applications such as conversational agents, non-conversational agents, complex tasks, workflows, and more. +- **ChatModelAgent**: A ReAct Agent using LLM as the decision-maker, supporting tool calls, autonomous reasoning, and runtime enhancements (Middleware) +- **Workflow Agents**: Deterministic orchestration primitives (Sequential / Loop / Parallel) +- **Runner / TurnLoop**: Agent execution entry point, supporting event streams, checkpoint/resume, and multi-turn preemption +- **Multi-Agent Collaboration**: AgentAsTool (recommended), Workflow composition -# ADK Framework +Broadly applicable, model-agnostic, and deployment-agnostic. -The overall module structure of Eino ADK is shown in the diagram below: - - +# ADK Architecture ## Agent Interface -The core of Eino ADK is the Agent abstraction (Agent Interface). All ADK functionality is designed around the Agent abstraction. For details, see [Eino ADK: Agent Interface](/docs/eino/core_modules/eino_adk/agent_interface) +All ADK functionality revolves around the `Agent` interface: ```go type Agent interface { Name(ctx context.Context) string Description(ctx context.Context) string - - // Run runs the agent. - // The returned AgentEvent within the AsyncIterator must be safe to modify. - // If the returned AgentEvent within the AsyncIterator contains MessageStream, - // the MessageStream MUST be exclusive and safe to be received directly. - // NOTE: it's recommended to use SetAutomaticClose() on the MessageStream of AgentEvents emitted by AsyncIterator, - // so that even the events are not processed, the MessageStream can still be closed. Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] } ``` -The definition of `Agent.Run` is: - -1. Get task details and related data from the input AgentInput, AgentRunOption, and optional Context Session -2. 
Execute the task and write the execution process and results to the AgentEvent Iterator - -`Agent.Run` requires the Agent implementation to execute asynchronously in a Future pattern. The core is divided into three steps. For specifics, refer to the implementation of the Run method in ChatModelAgent: +Semantics of `Run`: -1. Create a pair of Iterator and Generator -2. Start the Agent's asynchronous task and pass in the Generator to process AgentInput. The Agent executes core logic in this asynchronous task (e.g., ChatModelAgent calls LLM) and writes new events to the Generator for the Agent caller to consume from the Iterator -3. Return the Iterator immediately after starting the task in step 2 +1. Obtain task information from `AgentInput` and Context +2. Execute the task asynchronously, writing produced events into the `AsyncIterator` +3. Return the Iterator immediately after starting the async task (Future pattern) -## Multi-Agent Collaboration +## ChatModelAgent -Around the Agent abstraction, Eino ADK provides various simple, easy-to-use composition primitives for rich scenarios, supporting the development of diverse Multi-Agent collaboration strategies such as Supervisor, Plan-Execute, Group-Chat, and other Multi-Agent scenarios. This enables different Agent division of labor and cooperation patterns to handle more complex tasks. For details, see [Eino ADK: Agent Collaboration](/docs/eino/core_modules/eino_adk/agent_collaboration) +The core ADK implementation. Uses ChatModel as the decision-maker and autonomously drives problem-solving through a ReAct Loop. -The collaboration primitives defined by Eino ADK during Agent collaboration are as follows: +**ChatModelAgent = ChatModel + Tools + ReAct Loop + Middleware** -- Collaboration methods between Agents +For details, see: [Eino ADK: ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) - - - - -
-| Collaboration Method | Description |
-| --- | --- |
-| Transfer | Directly transfer the task to another Agent. The current Agent exits after execution and does not care about the task execution status of the transferred Agent |
-| ToolCall(AgentAsTool) | Call an Agent as a ToolCall, wait for the Agent's response, and obtain the output result of the called Agent for the next round of processing |
    +## Multi-Agent Collaboration -- Context strategies for AgentInput +> 💡 +> Recommended approach: **AgentAsTool** — Convert a sub-Agent into a Tool, and the parent Agent invokes it via ToolCall to obtain results. This is the most flexible and composable collaboration pattern. - - - + + +
-| Context Strategy | Description |
-| --- | --- |
-| Upstream Agent Full Dialogue | Get the complete dialogue record of this Agent's upstream Agent |
-| New Task Description | Ignore the complete dialogue record of the upstream Agent and provide a new task summary as the sub-Agent's AgentInput |
+| Collaboration Method | Mechanism | Use Cases |
+| --- | --- | --- |
+| AgentAsTool (recommended) | Sub-Agent wrapped as a Tool, parent Agent autonomously decides whether to invoke | Delegating subtasks, capability composition |
+| Workflow | Sequential / Loop / Parallel deterministic orchestration | Multi-step tasks with fixed processes |
    -- Decision Autonomy +For details, see: [Agent Collaboration](/docs/eino/core_modules/eino_adk/agent_collaboration) - - - - -
-| Decision Autonomy | Description |
-| --- | --- |
-| Autonomous Decision | Inside the Agent, based on its available downstream Agents, when assistance is needed, autonomously select downstream Agents for assistance. Generally, the Agent makes decisions based on LLM internally, but even if selection is based on preset logic, it is still considered autonomous decision from outside the Agent |
-| Preset Decision | Pre-set the next Agent after an Agent executes a task. The execution order of Agents is predetermined and predictable |
    +## Runner -Around the collaboration primitives, Eino ADK provides the following Agent composition primitives: +Runner is the execution entry point for Agents. Only when executing through Runner can you use: - - - - - - - -
-| Type | Description | Run Mode | Collaboration Method | Context Strategy | Decision Autonomy |
-| --- | --- | --- | --- | --- | --- |
-| SubAgents | Use the user-provided agent as the Parent Agent and the user-provided subAgents list as Child Agents to form an autonomously deciding Agent, where Name and Description serve as the Agent's name identifier and description.<br>• Currently limited to one Agent having only one Parent Agent<br>• Use the SetSubAgents function to build a "multi-branch tree" form of Multi-Agent<br>• In this "multi-branch tree", AgentName must remain unique | | Transfer | Upstream Agent Full Dialogue | Autonomous Decision |
-| Sequential | Combine the user-provided SubAgents list into a Sequential Agent that executes in order, where Name and Description serve as the Sequential Agent's name identifier and description. When the Sequential Agent executes, it runs the SubAgents list in order until all Agents have been executed. | | Transfer | Upstream Agent Full Dialogue | Preset Decision |
-| Parallel | Combine the user-provided SubAgents list into a Parallel Agent that executes concurrently based on the same context, where Name and Description serve as the Parallel Agent's name identifier and description. When the Parallel Agent executes, it runs the SubAgents list concurrently and ends after all Agents complete execution. | | Transfer | Upstream Agent Full Dialogue | Preset Decision |
-| Loop | Execute the user-provided SubAgents list in array order, cycling repeatedly, to form a Loop Agent, where Name and Description serve as the Loop Agent's name identifier and description. When the Loop Agent executes, it runs the SubAgents list in sequence and ends after all Agents complete execution. | | Transfer | Upstream Agent Full Dialogue | Preset Decision |
-| AgentAsTool | Convert an Agent into a Tool to be used by other Agents as a regular Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided in Eino ADK supports the AgentAsTool functionality | | ToolCall | New Task Description | Autonomous Decision |
    - -## ChatModelAgent - -`ChatModelAgent` is Eino ADK's key implementation of Agent. It encapsulates the interaction logic with large language models, implements a ReAct paradigm Agent, orchestrates the ReAct Agent control flow based on Graph in Eino, and exports events generated during ReAct Agent execution through callbacks.Handler, converting them to AgentEvent for return. - -To learn more about ChatModelAgent, see: [Eino ADK: ChatModelAgent](/docs/eino/core_modules/eino_adk/agent_implementation/chat_model) +- **Event Stream Output**: Query/Run → AsyncIterator[AgentEvent] +- **Checkpoint / Resume**: Persist running state, support interrupt recovery +- **TurnLoop**: Multi-turn runtime, Push/Preempt/Stop ```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. 
When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int -} +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // Optional +}) -func NewChatModelAgent(_ context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) { - // omit code -} +iter := runner.Query(ctx, "Your question") ``` -# AgentRunner - -AgentRunner is the executor for Agents, providing support for extended functionality required by Agent execution. For details, see: [Eino ADK: Agent Extension](/docs/eino/core_modules/eino_adk/agent_extension) - -Only when executing agents through Runner can you use the following ADK features: - -- Interrupt & Resume -- Aspect mechanism (supported in 1226 test version, API compatibility not guaranteed before official release) -- Context environment preprocessing - - ```go - type RunnerConfig struct { - Agent Agent - EnableStreaming bool - - CheckPointStore compose.CheckPointStore - } - - func NewRunner(_ context.Context, conf RunnerConfig) *Runner { - // omit code - } - ``` +For details, see: [Agent Runner and Extension](/docs/eino/core_modules/eino_adk/agent_extension) | [Agent Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md b/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md index 5560ebacc43..f2bf0953bf6 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md @@ -1,93 +1,55 @@ --- Description: "" -date: "2026-01-30" +date: "2026-05-19" lastmod: "" tags: [] -title: 'Eino ADK: Quickstart' +title: Quickstart weight: 1 --- # Installation 
-Eino provides ADK from `v0.5.0`. Upgrade your project: +Eino ADK is available since v0.5.0, with v0.9.0 being the current recommended version: ```go -// stable >= eino@v0.5.0 go get github.com/cloudwego/eino@latest ``` -# Agent +# Core Concepts -### What is Eino ADK +**Eino ADK** is a Go language Agent development framework. The core primitive is **ChatModelAgent** — an intelligent agent that uses ChatModel as the decision-maker, Tools as the action space, and autonomously drives problem-solving through a ReAct Loop. -Eino ADK, inspired by [Google‑ADK](https://google.github.io/adk-docs/agents/), is a Go framework for building Agent and Multi‑Agent applications. It standardizes context passing, event streaming, task transfer, interrupts/resume, and cross‑cutting features. +> 💡 +> If you only read one document, read: [Eino ADK: ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) -### What is an Agent - -An Agent is the core of Eino ADK, representing an independent, executable intelligent task unit. You can think of it as an "intelligent entity" that can understand instructions, execute tasks, and provide responses. Each Agent has a clear name and description, making it discoverable and callable by other Agents. - -Any scenario requiring interaction with a Large Language Model (LLM) can be abstracted as an Agent. 
For example: - -- An Agent for querying weather information -- An Agent for booking meetings -- An Agent capable of answering domain‑specific questions - -### Agent in ADK - -All features in Eino ADK are designed around the Agent abstraction: - -```go -type Agent interface { - Name(ctx context.Context) string - Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput) *AsyncIterator[*AgentEvent] -} -``` - -Based on the Agent abstraction, ADK provides three base extension categories: - -- `ChatModel Agent`: The "thinking" part of the application, using LLM as its core to understand natural language, perform reasoning, planning, generate responses, and dynamically decide how to execute or which tools to use. -- `Workflow Agents`: The coordination and management part of the application, controlling sub-Agent execution flow based on predefined logic according to their type (sequential/parallel/loop). Workflow Agents produce deterministic, predictable execution patterns, unlike the dynamic random decisions generated by ChatModel Agent. - - Sequential (Sequential Agent): Execute sub-Agents in order - - Loop (Loop Agent): Repeatedly execute sub-Agents until a specific termination condition is met - - Parallel (Parallel Agent): Execute multiple sub-Agents concurrently -- `Custom Agent`: Implement your own Agent through the interface, allowing highly customized complex Agents - -Based on these base extensions, you can combine these basic Agents according to your needs to build the Multi-Agent system you require. Additionally, Eino provides several out-of-the-box Multi-Agent best practice paradigms based on daily practical experience: - -- Supervisor: Supervisor mode, where the Supervisor Agent controls all communication flows and task delegation, deciding which Agent to call based on current context and task requirements. 
-- Plan-Execute: Plan-Execute mode, where the Plan Agent generates a plan with multiple steps, and the Execute Agent completes tasks based on user query and the plan. After execution, Plan is called again to decide whether to complete the task or replan. - -The table and diagram below provide the characteristics, differences, and relationships of these base extensions and encapsulations. Subsequent chapters will detail the principles and specifics of each type: +## Component Map - - - - + + + + + +
-| Category | ChatModel Agent | Workflow Agents | Custom Logic | EinoBuiltInAgent (supervisor, plan‑execute) |
-| --- | --- | --- | --- | --- |
-| Function | Thinking, generation, tool calls | Control execution flow among agents | Run custom logic | Out‑of‑the‑box multi‑agent pattern encapsulation |
-| Core | LLM | Predetermined execution flows (sequential/parallel/loop) | Custom code | High‑level encapsulation based on Eino practical experience |
-| Purpose | Generation, dynamic decisions | Structured processing, orchestration | Customization needs | Turnkey solutions for specific scenarios |
+| Component | Responsibility | Documentation |
+| --- | --- | --- |
+| ChatModelAgent | ReAct Loop: Reasoning → Action → Feedback, autonomous decision-making | ChatModelAgent Introduction |
+| Middleware | Inject behaviors at lifecycle points of the ReAct Loop (compression, search, retry, etc.) | ChatModelAgentMiddleware |
+| Runner | Single Agent run entry point: Query / Run → Event Stream | Agent Runner and Extension |
+| TurnLoop | Multi-turn runtime: Push / Preempt / Stop + declarative checkpoint/resume | Agent Cancel and TurnLoop |
+| DeepAgents | Pre-built Agents: Task planning (PlanTask) + subtask delegation (TaskTool) | DeepAgents |
    - +## Other Agent Types -# ADK Examples +In addition to ChatModelAgent, ADK also provides deterministic orchestration primitives: -The [Eino‑examples](https://github.com/cloudwego/eino-examples/tree/main/adk) project provides various ADK implementation examples. You can refer to the example code and descriptions to build an initial understanding of ADK capabilities: +- **Workflow Agents**: Sequential / Loop / Parallel Agents, for structured orchestration of predefined processes. +- **Custom Agent**: Implement the `Agent` interface to integrate with the framework. - - - - - - - - - -
    Project PathIntroductionDiagram
    Sequential workflow exampleThis example code demonstrates a sequential multi-agent workflow built using Eino ADK's Workflow paradigm.
  • Sequential workflow construction: Create a sequential execution agent named ResearchAgent via adk.NewSequentialAgent, containing two sub-agents (SubAgents) PlanAgent and WriterAgent, responsible for research plan formulation and report writing respectively.
  • Clear sub-agent responsibilities: PlanAgent receives research topics and generates detailed, logically clear research plans; WriterAgent writes structurally complete academic reports based on the research plan.
  • Chained input/output: PlanAgent's output research plan serves as WriterAgent's input, forming a clear upstream-downstream data flow, reflecting the sequential dependency of business steps.
  • Loop workflow exampleThis example code builds a reflection-iteration agent framework based on Eino ADK's Workflow paradigm using LoopAgent.
  • Iterative reflection framework: Create ReflectionAgent via adk.NewLoopAgent, containing two sub-agents MainAgent and CritiqueAgent, supporting up to 5 iterations, forming a closed loop of main task solving and critical feedback.
  • MainAgent: Responsible for generating initial solutions based on user tasks, pursuing accurate and complete answer output.
  • CritiqueAgent: Performs quality review on MainAgent's output, provides improvement feedback, terminates the loop if results are satisfactory, and provides final summary.
  • Loop mechanism: Utilizes LoopAgent's iteration capability to continuously optimize solutions through multiple rounds of reflection, improving output quality and accuracy.
  • Parallel workflow exampleThis example code builds a concurrent information collection framework based on Eino ADK's Workflow paradigm using ParallelAgent:
  • Concurrent execution framework: Create DataCollectionAgent via adk.NewParallelAgent, containing multiple information collection sub-agents.
  • Sub-agent responsibility allocation: Each sub-agent is responsible for information collection and analysis from one channel, with no interaction needed between them, clear functional boundaries.
  • Concurrent execution: Parallel Agent can simultaneously start information collection tasks from multiple data sources, significantly improving processing efficiency compared to serial approaches.
  • supervisorThis use case employs a single-layer Supervisor managing two relatively comprehensive sub-Agents: Research Agent handles retrieval tasks, Math Agent handles various mathematical operations (add, multiply, divide), but all math operations are uniformly processed within the same Math Agent rather than being split into multiple sub-Agents. This design simplifies the agent hierarchy, suitable for scenarios where tasks are relatively concentrated and don't require excessive decomposition, facilitating rapid deployment and maintenance.
    layered‑supervisorThis use case implements a multi-tier intelligent agent supervision system, where the top-level Supervisor manages Research Agent and Math Agent, and Math Agent is further subdivided into three sub-Agents: Subtract, Multiply, and Divide. The top-level Supervisor is responsible for assigning research tasks and math tasks to lower-level Agents, while Math Agent as a mid-tier supervisor further dispatches specific math operation tasks to its sub-Agents.
  • Multi-tier agent structure: Implements a top-level Supervisor Agent managing two sub-agents — Research Agent (responsible for information retrieval) and Math Agent (responsible for mathematical operations).
  • Math Agent internally subdivides into three sub-agents: Subtract Agent, Multiply Agent, and Divide Agent, handling subtraction, multiplication, and division operations respectively, reflecting multi-level supervision and task delegation.
  • This hierarchical management structure reflects fine-grained decomposition of complex tasks and multi-level task delegation, suitable for scenarios with clear task classification and computational complexity.
    plan‑execute exampleThis example implements a multi-Agent travel planning system using the plan-execute-replan pattern based on Eino ADK. The core function is to process complex user travel requests (such as "3-day Beijing trip, need flights from New York, hotel recommendations, must-see attractions") through a "plan-execute-replan" loop to complete tasks: 1. Plan:
    Planner Agent
    generates a step-by-step execution plan based on the large model (e.g., "Step 1: check Beijing weather, Step 2: search New York to Beijing flights"); 2. Execute:
    Executor Agent
    calls mock tools **weather (get_weather), flights (search_flights), hotels (search_hotels), attractions (search_attractions)** to execute each step. If user input information is missing (e.g., budget not specified), it calls
    ask_for_clarification
    tool to ask follow-up questions; 3. Replan:
    Replanner Agent
    evaluates whether the plan needs adjustment based on tool execution results (e.g., if no flight tickets available, reselect dates). Execute and Replan continuously loop until all steps in the plan are completed; 4. Supports session trajectory tracking (CozeLoop callback) and state management, ultimately outputting a complete travel plan. Structurally, plan-execute-replan has two layers:
  • Layer 2 is a loop agent composed of execute + replan agent, meaning after replan, re-execution may be needed (after replanning, need to query travel information / request user to continue clarifying questions)
  • Layer 1 is a sequential agent composed of plan agent + Layer 2 loop agent, meaning plan executes only once, then hands over to the loop agent for execution
  • book recommendation agent (interrupt and resume)This code demonstrates a book recommendation chat agent implementation built on the Eino ADK framework, showcasing Agent interrupt and resume functionality.
  • Agent construction: Create a chat agent named BookRecommender via adk.NewChatModelAgent for recommending books based on user requests.
  • Tool integration: Integrates two tools — BookSearch tool for searching books and AskForClarification tool for asking clarifying information, supporting multi-turn interaction and information supplementation.
  • State management: Implements simple in-memory CheckPoint storage, supporting session breakpoint continuation to ensure context continuity.
  • Event-driven: Obtains event streams by iterating runner.Query and runner.Resume, handling various events and errors during execution.
  • Custom input: Supports dynamic user input reception, using tool options to pass new query requests, flexibly driving task flow.
  • +> 💡 +> Graph (deterministic orchestration) and Agent (autonomous decision-making) are two different forms of AI applications. When the core problem involves "autonomous decision-making + runtime enhancement", ChatModelAgent is recommended. See "Why not continue using flow/react" in the ChatModelAgent Introduction for details. -# What's Next +# Examples -After this Quickstart overview, you should have a basic understanding of Eino ADK and Agents. +[eino-examples/adk](https://github.com/cloudwego/eino-examples/tree/main/adk) provides complete ADK example code: -The following articles will dive deep into ADK core concepts to help you understand how Eino ADK works and use it more effectively: +- **ChatModelAgent Getting Started**: [chatmodel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/chatmodel) — Book recommendation Agent, with interrupt and resume +- **DeepAgents**: [deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) — Task planning + subtask delegation +- **Workflow**: [sequential](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/sequential) / [loop](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/loop) / [parallel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/parallel) +- **Multi-Agent**: [supervisor](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/supervisor) / [plan-execute](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/plan-execute-replan) - +# What's Next diff --git a/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md b/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md new file mode 100644 index 00000000000..508aede7472 --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md @@ -0,0 +1,540 @@ +--- +Description: "" +date: "2026-05-17" 
+lastmod: "" +tags: [] +title: Agent Cancel and TurnLoop Quick Start +weight: 10 +--- + +A quick start guide for the two core features in Eino ADK: **Agent Cancel** and **TurnLoop**. Introduced since [v0.9.0-alpha.9](https://github.com/cloudwego/eino/releases/tag/v0.9.0-alpha.9). + +## Type Conventions + +All examples in this document use the following generic instantiations: + +- `T = string` (the business item type pushed to TurnLoop) +- `M = *schema.Message` (Agent message type, i.e., standard `Message`) + +Related type aliases in ADK: + +```go +type Agent = TypedAgent[*schema.Message] +type AgentInput = TypedAgentInput[*schema.Message] +type AgentEvent = TypedAgentEvent[*schema.Message] +``` + +When using `*schema.AgenticMessage`, simply replace `M` with the corresponding type — all API signatures are fully symmetric. + +--- + +## Part 1: Agent Cancel + +### Scenario + +After a user sends a request to an agent, they want to cancel the current execution due to waiting too long or a change in requirements. + +### Core API + +```go +// Create cancel option and cancel function +cancelOpt, cancelFunc := adk.WithCancel() + +// Start the agent, passing the cancel option +iter := runner.Run(ctx, []*schema.Message{schema.UserMessage("hello")}, cancelOpt) + +// Initiate cancellation (can be called from any goroutine) +handle, contributed := cancelFunc(adk.WithAgentCancelMode(adk.CancelImmediate)) +// contributed == true: this call affected the execution result +// contributed == false: agent already ended or cancel already completed, this call has no effect + +err := handle.Wait() +``` + +Three possible return values from `CancelHandle.Wait()`: + +```go +switch { +case err == nil: + // Cancel succeeded +case errors.Is(err, adk.ErrCancelTimeout): + // Safe point timeout, automatically escalated to immediate cancel +case errors.Is(err, adk.ErrExecutionEnded): + // Agent ended naturally before cancel took effect +} +``` + +### Three Cancel Modes + + + + + + +
+| Mode | Behavior | Use Case |
+| --- | --- | --- |
+| `CancelImmediate` | Interrupts immediately without waiting for a safe point | Emergency stop, timeout fallback |
+| `CancelAfterChatModel` | Waits for the current ChatModel call to complete before canceling | Need complete model response |
+| `CancelAfterToolCalls` | Waits for all current ToolCalls to complete before canceling | Ensure tool side effects are complete |
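
The safe-point modes listed above behave like flags, so more than one can be requested at once. The following is a minimal, self-contained sketch of that idea. It does not import the ADK: the `cancelMode` type, its constants, their numeric values, and the `stopsAt` helper are all assumptions made up for illustration, and the real definitions may differ.

```go
package main

import "fmt"

// cancelMode is a hypothetical flag type used only in this sketch.
// The names echo the modes in the table, but the numeric values are
// assumptions and are not taken from the ADK source.
type cancelMode uint8

const (
	afterChatModel cancelMode = 1 << iota // stop once the current model call returns
	afterToolCalls                        // stop once the current tool calls return
	immediate      cancelMode = 0         // no safe point requested: stop right away
)

// stopsAt reports whether a cancel request carrying mode takes effect
// when execution reaches the given safe point.
func stopsAt(mode, point cancelMode) bool {
	if mode == immediate {
		return true
	}
	return mode&point != 0
}

func main() {
	// Combining two flags means "stop at whichever safe point comes first".
	either := afterChatModel | afterToolCalls

	fmt.Println(stopsAt(either, afterChatModel))         // true
	fmt.Println(stopsAt(either, afterToolCalls))         // true
	fmt.Println(stopsAt(afterChatModel, afterToolCalls)) // false
}
```

In this toy model a single-flag mode ignores the other safe point, while the combined mode accepts either one.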
    + +> 💡 +> `CancelMode` is a bitmask and can be combined: `CancelAfterChatModel | CancelAfterToolCalls` is equivalent to "cancel at whichever safe point is reached first." + +### Safe Point Cancellation + +```go +// Cancel after ChatModel completes, with 5-second timeout protection +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithAgentCancelTimeout(5*time.Second), +) +``` + +> 💡 +> Always pair safe point mode with `WithAgentCancelTimeout`. If the agent never reaches a safe point, timeout automatically escalates to immediate cancel. + +### Recursive Cancellation + +By default, cancellation only affects the root agent. Use `WithRecursive()` to propagate cancellation to sub-agents nested within AgentTools: + +```go +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithRecursive(), +) +``` + +### Consumer-Side Cancel Detection + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Err != nil { + var cancelErr *adk.CancelError + if errors.As(event.Err, &cancelErr) { + log.Printf("Agent cancelled (mode=%v, escalated=%v)", + cancelErr.Info.Mode, cancelErr.Info.Escalated) + } + break + } + // Process normal events... +} +``` + +--- + +## Part 2: TurnLoop + +### Scenario + +Build a continuously running agent service: users send messages at any time, the agent processes them by turns; urgent messages can preempt current execution. 
+ +### Turn Lifecycle + + + +### Basic Usage + +```go +loop := adk.NewTurnLoop(adk.TurnLoopConfig[string, *schema.Message]{ + // GenInput: receives all items in the buffer, decides which to consume this turn + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(strings.Join(items, "\n"))}}, + Consumed: items, + }, nil + }, + + // PrepareAgent: builds the Agent based on consumed items for this turn + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // OnAgentEvents: handles the agent event stream (optional) + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + if event.Err != nil { + return event.Err + } + log.Printf("Received event: agent=%s", event.AgentName) + } + return nil + }, +}) + +loop.Push("message 1") +loop.Push("message 2") +loop.Run(ctx) // Non-blocking, starts background processing +loop.Push("message 3") // Can still push while running +loop.Stop() +result := loop.Wait() // Blocks until exit +``` + +### Core Callbacks + + + + + + + +
| Callback | Required | Responsibility |
| --- | --- | --- |
| `GenInput` | Yes | Receives all buffered items and returns `Consumed` (processed this turn) and `Remaining` (kept for subsequent turns). Items in neither are discarded. |
| `PrepareAgent` | Yes | Builds the Agent based on the Consumed items (set prompt, tools, middleware, etc.) |
| `OnAgentEvents` | No | Handles the agent event stream. When not set, events are drained by default and the first error is returned |
| `GenResume` | No (needed only for checkpoint recovery) | Called when resuming from a checkpoint; decides how to merge interrupted/unhandled/new items |
    + +> 💡 +> **Do not propagate CancelError** in `OnAgentEvents` — the framework handles it automatically. Stop-triggered `CancelError` is propagated as `ExitReason`; Preempt-triggered `CancelError` is absorbed by the framework, and the loop continues to the next turn. The callback should only return non-nil error when it encounters a fatal error itself. + +### Preemption + +```go +// Push urgent message, cancel current agent at a safe point +accepted, ack := loop.Push("Urgent message!", adk.WithPreempt[string, *schema.Message](adk.AnySafePoint)) + +if accepted { + <-ack // Wait for preemption signal to be submitted (current turn is guaranteed to be cancelled) +} +``` + +Preemption is an atomic operation — "push new message" and "cancel current agent" execute as a unit: + +1. Urgent message enters the buffer +2. Current agent is cancelled at the safe point +3. TurnLoop automatically starts a new turn +4. `GenInput` receives all buffered items (including the urgent message) and re-decides + +> 💡 +> `WithPreempt` always uses safe point cancellation and **does not automatically set WithRecursive**. However, `WithPreemptTimeout` automatically enables `WithRecursive` — when timeout escalates to immediate cancel, nested sub-agents are also terminated. 
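The four-step preemption sequence above can be modeled without any eino types. The following is a minimal sketch, assuming a hypothetical `toyLoop` with a mutex-guarded buffer (none of these names come from the ADK). It only illustrates why doing "push" and "request cancel" in one critical section guarantees that the next turn sees the urgent item; it is not a description of how TurnLoop is implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// toyLoop is a hypothetical stand-in for illustration; it is not the eino TurnLoop.
type toyLoop struct {
	mu            sync.Mutex
	buffer        []string
	cancelPending bool
}

// pushPreempt buffers the item and records the cancel request in one critical
// section, so no turn can start between the two steps.
func (l *toyLoop) pushPreempt(item string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.buffer = append(l.buffer, item)
	l.cancelPending = true
}

// startTurn hands every buffered item to the next turn and reports whether
// that turn follows a preemption. It also resets both fields.
func (l *toyLoop) startTurn() (items []string, preempted bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	items, l.buffer = l.buffer, nil
	preempted, l.cancelPending = l.cancelPending, false
	return items, preempted
}

func main() {
	l := &toyLoop{buffer: []string{"routine report"}}
	l.pushPreempt("urgent fix")

	// The next turn receives both the old item and the urgent one.
	items, preempted := l.startTurn()
	fmt.Println(items, preempted)
}
```

Because both fields change under the same lock, a turn that starts after `pushPreempt` returns can never observe the cancel request without the urgent item, or the reverse.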
+ +### Preemption with Timeout / Delay + +```go +// Safe point wait, escalates to immediate cancel after 5 seconds (automatically recursive) +loop.Push("urgent", adk.WithPreemptTimeout[string, *schema.Message](adk.AnySafePoint, 5*time.Second)) + +// 2-second grace period before initiating preemption +loop.Push("new message", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + adk.WithPreemptDelay[string, *schema.Message](2*time.Second), +) +``` + +### Conditional Preemption: WithPushStrategy + +When preemption decisions depend on the current turn state, use `WithPushStrategy` to avoid TOCTOU races: + +```go +loop.Push(urgentItem, adk.WithPushStrategy( + func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message]) []adk.PushOption[string, *schema.Message] { + if tc == nil { + return nil // No active turn, no need to preempt + } + if isLowPriority(tc.Consumed) { + return []adk.PushOption[string, *schema.Message]{ + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + } + } + return nil // Current task is high priority, don't preempt + }, +)) +``` + +### Detecting Preemption and Stop in OnAgentEvents + +`TurnContext` provides `Preempted` and `Stopped` signal channels: + +```go +OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + + select { + case <-tc.Preempted: + log.Println("Current turn preempted, wrapping up...") + case <-tc.Stopped: + log.Printf("Loop is stopping, cause: %s", tc.StopCause()) + default: + } + + if event.Err != nil { + return event.Err + } + // Process events... + } + return nil +}, +``` + +> 💡 +> `Preempted` / `Stopped` are closed only when the corresponding cancel call actually "contributes" to the current turn's `CancelError`. If the cancel has already been finalized by another signal, the channel remains open. 
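The `select` with a `default` branch used above is the standard Go idiom for polling a signal channel without blocking. A minimal, self-contained sketch of that idiom follows; the helper name `signalled` is an assumption made for illustration and is not part of the ADK.

```go
package main

import "fmt"

// signalled reports whether a signal channel has been closed, without blocking.
// A receive on a closed channel succeeds immediately; on an open, empty channel
// the default branch runs instead.
func signalled(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	preempted := make(chan struct{})
	fmt.Println(signalled(preempted)) // false: channel still open

	close(preempted)
	fmt.Println(signalled(preempted)) // true: channel closed
	fmt.Println(signalled(preempted)) // true: a closed channel stays readable
}
```

Since a closed channel remains readable forever, the check can be repeated on every event without missing or consuming the signal.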
+ +### Stopping TurnLoop + +```go +// Wait for current turn to complete before exiting (ExitReason is nil) +loop.Stop() + +// Immediately abort current agent (recursively propagated to nested agents) +loop.Stop(adk.WithImmediate()) + +// Safe point stop (recursively propagated, no timeout) +loop.Stop(adk.WithGraceful()) + +// Safe point stop with timeout (escalates to immediate cancel on timeout) +loop.Stop(adk.WithGracefulTimeout(10 * time.Second)) + +// Auto-stop after idle (stops after 30 seconds of continuous idle) +loop.Stop(adk.UntilIdleFor(30 * time.Second)) +``` + +> 💡 +> You can call `Stop()` multiple times to escalate the cancellation strategy. Typical pattern: first `WithGraceful()`, then `WithImmediate()` after timeout. + +### Attaching Stop Cause + +```go +loop.Stop( + adk.WithGraceful(), + adk.WithStopCause("quota exceeded"), +) +result := loop.Wait() +log.Printf("Stop cause: %s", result.StopCause) +``` + +--- + +## Part 3: Declarative Checkpoint Recovery + +### Scenario + +After an Agent is cancelled or interrupted, it automatically resumes from the breakpoint on next start rather than starting over. TurnLoop automatically manages input bookkeeping; the application layer only needs to declare how interrupted/unhandled/new items re-enter subsequent turns. 
+ +### Configuring Checkpoint + +Enable by setting both `Store` and `CheckpointID` in `TurnLoopConfig`: + +```go +store := NewMyCheckpointStore() // Implements CheckPointStore interface + +cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(items[0])}}, + Consumed: items[:1], + Remaining: items[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // GenResume: called when resuming from checkpoint + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) + return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + Store: store, + CheckpointID: "session-123", +} +``` + +### Recovery Flow + +`Run()` automatically queries the Store on startup: + + + + + + +
| Checkpoint State | Behavior |
| --- | --- |
| Mid-turn checkpoint exists (agent interrupted during execution) | Calls `GenResume`, passing interrupted/unhandled/new items so the application layer can decide how to resume |
| Between-turns checkpoint exists (stopped between turns) | Restores the stored unhandled items to the buffer, then processes normally via `GenInput` |
| No checkpoint exists | Starts from scratch |
    + +```go +// First run +loop := adk.NewTurnLoop(cfg) +loop.Push("message 1") +loop.Run(ctx) +loop.Stop(adk.WithGraceful()) +exit := loop.Wait() +log.Printf("checkpoint attempted: %v, err: %v", exit.CheckpointAttempted, exit.CheckpointErr) + +// Second run (same cfg with same CheckpointID) +loop2 := adk.NewTurnLoop(cfg) +loop2.Push("new message") // Passed as newItems to GenResume +loop2.Run(ctx) // Automatically detects checkpoint and resumes +result := loop2.Wait() +``` + +### Skipping Checkpoint + +```go +loop.Stop(adk.WithSkipCheckpoint()) // Don't save checkpoint on this exit +``` + +### Implementing CheckPointStore + +```go +type CheckPointStore interface { + Get(ctx context.Context, checkPointID string) ([]byte, bool, error) + Set(ctx context.Context, checkPointID string, checkPoint []byte) error +} +``` + +Optionally implement `CheckPointDeleter` to support explicit deletion of expired checkpoints: + +```go +type CheckPointDeleter interface { + Delete(ctx context.Context, checkPointID string) error +} +``` + +On normal exit (no new checkpoint saved), TurnLoop attempts to delete the previously loaded checkpoint to prevent stale recovery. **Only Stores implementing CheckPointDeleter will perform the deletion**; otherwise the Store manages the lifecycle itself. + +> 💡 +> When using `Store`, the generic parameter `T` must support `encoding/gob` encoding/decoding — TurnLoop persists runner checkpoint and item bookkeeping information via gob. 
+ +--- + +## Part 4: Complete Example + +Simulates a chat service with priority scheduling, preemption, and checkpoint recovery: + +```go +package main + +import ( + "context" + "errors" + "log" + "time" + + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/schema" +) + +// sortByPriority, buildAgent, and extractText are application-defined helpers (omitted here). + +func main() { + ctx := context.Background() + store := adk.NewInMemoryStore() + + cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + // Sort by priority, consume only the first, keep the rest for subsequent turns + sorted := sortByPriority(items) + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(sorted[0])}}, + Consumed: sorted[:1], + Remaining: sorted[1:], // Items not in either will be discarded + }, nil + }, + + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) 
+ return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return buildAgent(consumed), nil + }, + + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + // Detect preemption/stop signals and perform cleanup + select { + case <-tc.Preempted: + log.Println("Preempted by higher priority message") + case <-tc.Stopped: + log.Printf("Service shutting down: %s", tc.StopCause()) + default: + } + if event.Err != nil { + var cancelErr *adk.CancelError + if errors.As(event.Err, &cancelErr) { + // Don't propagate CancelError; the framework handles it automatically + return nil + } + // Any other error is fatal for this callback + return event.Err + } + log.Printf("[%s] %s", event.AgentName, extractText(event)) + } + return nil + }, + + Store: store, + CheckpointID: "chat-session-001", + } + + loop := adk.NewTurnLoop(cfg) + loop.Push("Hello, help me check the weather") + loop.Run(ctx) + + // Send urgent message to preempt after 1 second + time.AfterFunc(1*time.Second, func() { + loop.Push("Stop! Help me handle this urgent issue first", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + ) + }) + + // Graceful shutdown after 5 seconds + time.AfterFunc(5*time.Second, func() { + loop.Stop( + adk.WithGracefulTimeout(3*time.Second), + adk.WithStopCause("service shutdown"), + ) + }) + + result := loop.Wait() + log.Printf("Exit reason: %v", result.ExitReason) + log.Printf("Unhandled messages: %v", result.UnhandledItems) + log.Printf("Stop cause: %s", result.StopCause) + log.Printf("checkpoint: attempted=%v, err=%v", result.CheckpointAttempted, result.CheckpointErr) + + // Next start with the same cfg will automatically resume from checkpoint +} +``` + +--- + +## FAQ + +### Q: Can safe point cancellation end up waiting forever? + +Yes. 
If the agent is stuck in a long-running tool or model call, the safe point may take a long time to reach. **Always use WithAgentCancelTimeout**; after timeout it automatically escalates to `CancelImmediate`. + +### Q: When is `WithRecursive` needed? + +By default, cancellation only affects the root agent. It's needed only when the agent hierarchy contains sub-agents nested within AgentTools and you want those sub-agents to also respond to cancellation at safe points. When in doubt, don't add it. + +### Q: What are the requirements for generic parameter T? + +When `Store` is configured, `T` must be encodable/decodable by `encoding/gob`. Basic types (`string`, `int`, etc.) and structs with all exported fields are supported by default. If `T` contains interface fields, you need to register concrete types via `gob.Register`. + +### Q: What happens to `Push` after the loop is stopped? + +`Push` returns `(false, closedCh)`. These "late items" won't enter the checkpoint and can be recovered via `result.TakeLateItems()` after `Wait()` returns. Once `TakeLateItems()` is called, subsequent `Push` calls will panic to prevent silent data loss. + +### Q: What happens with multiple `Stop()` calls? + +It's safe — each call can escalate the cancellation strategy. Typical pattern: + +```go +loop.Stop(adk.WithGraceful()) // Try graceful stop first +time.AfterFunc(3*time.Second, func() { + loop.Stop(adk.WithImmediate()) // Escalate to immediate cancel after 3 seconds +}) +``` + +### Q: What happens to items returned by `GenInput` that are neither in Consumed nor Remaining? + +They are discarded. This is by design — it allows `GenInput` to filter out unwanted items during decision-making. 
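The consume/keep/discard decision described in the last answer can be sketched as a plain function. This is a minimal illustration, assuming a hypothetical `partition` helper that mimics only the slicing logic a `GenInput` callback might perform; it does not use or reproduce any ADK type. Here blank items are the ones filtered out.

```go
package main

import (
	"fmt"
	"strings"
)

// partition is a hypothetical helper: it drops blank items, consumes the first
// remaining item, and keeps the rest for later turns. Items that end up in
// neither returned slice correspond to the "discarded" case.
func partition(items []string) (consumed, remaining []string) {
	var kept []string
	for _, it := range items {
		if strings.TrimSpace(it) == "" {
			continue // in neither result slice, so effectively discarded
		}
		kept = append(kept, it)
	}
	if len(kept) == 0 {
		return nil, nil
	}
	return kept[:1], kept[1:]
}

func main() {
	consumed, remaining := partition([]string{"", "check weather", "   ", "book flight"})
	fmt.Printf("consumed=%q remaining=%q\n", consumed, remaining)
}
```

Guarding the empty case matters in a real callback too: indexing `items[0]` on an empty slice would panic.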
diff --git a/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md b/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md new file mode 100644 index 00000000000..6a970ff6663 --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md @@ -0,0 +1,1302 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: Agent Cancel and TurnLoop API Documentation +weight: 1 +--- + +# Agent Cancel and TurnLoop API Documentation + +## Overview + +This document describes the core advanced features in Eino ADK (Agent Development Kit): + +1. **Agent Cancel**: Mechanisms for gracefully or immediately canceling a running agent +2. **TurnLoop**: A push-based event loop for managing agent execution cycles (depends on Agent Cancel) + +--- + +## Agent Cancel API + +### Overview + +Agent Cancel provides fine-grained control over running agents. It supports both immediate cancellation and safe point cancellation (waiting for specific execution points, such as after a chat model call or after tool calls). By default, cancel modes only affect the root agent; sub-agents nested within AgentTools do not receive cancel notifications. Use `WithRecursive()` to propagate cancellation throughout the entire agent hierarchy (including nested sub-agents within AgentTools), triggering cancellation when a safe point is reached at any level in the hierarchy. + +**Checkpoint Guarantee**: Regardless of which `CancelMode` is used, cancellation saves a checkpoint at the Runner level, allowing execution to be resumed via `Runner.Resume` or `Runner.ResumeWithParams` after cancellation. 
When using `WithRecursive`, sub-agents also attempt to trigger cancellation and cascade their interrupt information upward, ultimately generating a complete checkpoint at the root agent level that includes sub-agent checkpoints, supporting resumption from the sub-agent interrupt point. + +### Core Types + +#### `CancelMode` + +Specifies when an agent should be cancelled. Modes can be combined using bitwise OR. + +```go +type CancelMode int + +const ( + // CancelImmediate cancels the agent immediately without waiting for ChatModel or ToolCalls safe points. + // By default only interrupts the root agent; sub-agents within AgentTools are cleaned up + // as a side effect via context cancellation. + // Use WithRecursive to propagate an explicit immediate-cancel signal to sub-agents + // for clean teardown (with grace period). + CancelImmediate CancelMode = 0 + + // CancelAfterChatModel cancels after the root agent's next chat model call completes. + // By default only the root agent checks this safe point; nested sub-agents within AgentTools are unaware. + // Use WithRecursive to propagate to all sub-agents—whichever ChatModel finishes first triggers cancellation. + // Note: this safe point is only checked when the model returns tool calls—because tool calls + // imply further execution (call tools → call model again → ...), making cancellation meaningful. + // If the model produces a final answer directly (no tool calls), execution flows toward completion + // and doesn't pass through this checkpoint. + CancelAfterChatModel CancelMode = 1 << iota + + // CancelAfterToolCalls cancels after the root agent's next round of tool calls all complete. + // By default only the root agent checks this safe point. Use WithRecursive to propagate to all sub-agents. + CancelAfterToolCalls +) +``` + +#### `CancelHandle` + +A handle used to wait for cancellation to complete. 
+ +```go +type CancelHandle struct{ /* unexported fields */ } + +func (h *CancelHandle) Wait() error +``` + +**Wait return values:** + +- `error`: + - `nil`: Cancel succeeded (see CancelError's Interrupt absorption mechanism) + - `ErrCancelTimeout`: Safe point cancel timed out, automatically escalated to immediate cancel (cancellation itself still succeeds) + - `ErrExecutionEnded`: Agent ended before cancel took effect (completed normally or errored), no execution to cancel + +#### `AgentCancelFunc` + +Function type for canceling a running agent. + +```go +type AgentCancelFunc func(...AgentCancelOption) (*CancelHandle, bool) +``` + +**Return values:** + +- `CancelHandle`: + - When returned, indicates the cancel request has been submitted + - Use `Wait()` to wait for cancellation to complete and get the result +- `bool`: + - Indicates whether this call "contributed" to the current execution's `CancelError` + - `true`: This call's cancel options were incorporated before the `CancelError` was finalized + - `false`: Cancel was already finalized (e.g., already handled or execution ended), this call won't affect the `CancelError` + - TurnLoop uses this return value to provide strict semantics for `TurnContext.Preempted` / `TurnContext.Stopped` + +#### `AgentCancelOption` + +Opaque option type for configuring an agent cancel request. Users typically don't implement this type themselves, but use `WithAgentCancelMode`, `WithAgentCancelTimeout`, and `WithRecursive` to create options. + +```go +type AgentCancelOption func(*agentCancelConfig) +``` + +#### `AgentCancelInfo` + +Information about the cancel operation. + +```go +type AgentCancelInfo struct { + Mode CancelMode // Cancel mode used + Escalated bool // Whether escalated to immediate cancel + Timeout bool // Whether timed out +} +``` + +#### `CancelError` + +Error sent via `AgentEvent.Err` when an agent is cancelled. Extract using `errors.As`. 
+ +**Interrupt Absorption Mechanism**: When a cancel is active, **any** interrupt—whether produced by the cancel safe point node or by business logic (e.g., `tool.Interrupt` in a tool)—is converted to `CancelError`. Cancel "absorbs" business interrupts. This is intentional: + +- In concurrent execution (parallel workflows, concurrent tool calls), cancel-induced interrupts and business interrupts may arrive as a single composite signal that cannot be split. +- Even in sequential execution, treating business interrupts as CancelError during an active cancel provides consistent semantics—callers only need to handle `CancelError` as a single signal, without distinguishing "cancel-induced interrupts" from "business interrupts that happen to fire during cancel." +- Business interrupts **are not lost**—the checkpoint preserves the complete interrupt hierarchy. On resume (`Runner.Resume`), the agent re-executes the interrupted code path and business interrupts naturally re-fire. + +```go +type CancelError struct { + Info *AgentCancelInfo + + InterruptContexts []*InterruptCtx // Contexts for targeted resume (can be used with ResumeWithParams) +} +``` + +### Functions + +#### `WithCancel` + +Creates an `AgentRunOption` that enables cancellation. Returns the option and a cancel function. 
+ +```go +func WithCancel() (AgentRunOption, AgentCancelFunc) +``` + +**Return values:** + +- `AgentRunOption`: Option to pass to `Run()` or `Resume()` +- `AgentCancelFunc`: Function for performing cancellation + +**Example:** + +```go +cancelOpt, cancelFunc := WithCancel() +iter := runner.Run(ctx, messages, cancelOpt) + +// Later, cancel the agent +handle, contributed := cancelFunc(WithAgentCancelMode(CancelAfterChatModel)) +if contributed { + // This call's cancel options took effect + switch err := handle.Wait(); { + case err == nil: + // Cancel succeeded + case errors.Is(err, ErrExecutionEnded): + // Agent ended before cancel took effect + case errors.Is(err, ErrCancelTimeout): + // Safe point cancel timed out, automatically escalated to immediate cancel + } +} +``` + +### Options + +#### `WithAgentCancelMode` + +Sets the cancel mode for an agent cancel operation. + +```go +func WithAgentCancelMode(mode CancelMode) AgentCancelOption +``` + +**Parameters:** + +- `mode CancelMode`: The cancel mode to use + +**Example:** + +```go +handle, _ := cancelFunc(WithAgentCancelMode(CancelAfterToolCalls)) +_ = handle.Wait() +``` + +#### `WithAgentCancelTimeout` + +Sets the timeout for a cancel operation. Only applies to safe point modes. + +```go +func WithAgentCancelTimeout(timeout time.Duration) AgentCancelOption +``` + +**Parameters:** + +- `timeout time.Duration`: Timeout duration + +**Behavior:** + +- Only effective for `CancelAfterChatModel` / `CancelAfterToolCalls`; if the safe point is not reached within the timeout, automatically escalates to `CancelImmediate`. 
The escalated cancel still saves a checkpoint and can be resumed via `Runner.Resume` +- `timeout <= 0` does not set an effective deadline and therefore won't trigger timeout escalation +- When timeout escalation occurs, `CancelHandle.Wait()` returns `ErrCancelTimeout`, and `CancelError.Info.Timeout=true` and `CancelError.Info.Escalated=true` + +**Example:** + +```go +handle, _ := cancelFunc( + WithAgentCancelMode(CancelAfterChatModel), + WithAgentCancelTimeout(5*time.Second), +) +_ = handle.Wait() +``` + +#### `WithRecursive` + +Enables recursive cancel propagation. By default, cancel modes only affect the root agent; sub-agents within AgentTools don't receive cancel notifications. `WithRecursive` causes cancel to propagate to all sub-agents: + +- **CancelAfterChatModel / CancelAfterToolCalls**: Sub-agents check their respective safe points; whichever is reached first triggers cancellation. +- **CancelImmediate**: Sub-agents receive an explicit immediate-cancel signal for clean teardown; the root agent uses a grace period to collect sub-agent interrupts. + +With `WithRecursive` enabled, not only does the root agent save a checkpoint, but sub-agents executing within AgentTools also save their own checkpoints. On resume, execution can continue from the sub-agent interrupt point without re-executing from the root agent. + +Once any cancel call includes `WithRecursive`, the flag remains effective for the entire cancel lifecycle (monotonic upgrade). 
+ +```go +func WithRecursive() AgentCancelOption +``` + +**Example:** + +```go +// Propagate cancel to nested sub-agents +handle, _ := cancelFunc( + WithAgentCancelMode(CancelAfterChatModel), + WithRecursive(), +) +_ = handle.Wait() + +// Escalation: first non-recursive cancel, subsequent call adds recursive +handle1, _ := cancelFunc(WithAgentCancelMode(CancelAfterChatModel)) +handle2, _ := cancelFunc(WithRecursive()) // Upgrades to recursive, all sub-agents now check safe points +``` + +### Sentinel Errors + +#### `ErrCancelTimeout` + +Returned by `CancelHandle.Wait` when the cancel operation times out. + +```go +var ErrCancelTimeout = errors.New("cancel timed out") +``` + +#### `ErrExecutionEnded` + +Returned by `CancelHandle.Wait` when the agent ended before cancel took effect (completed normally or errored). + +Note: Business interrupts that occur during an active cancel are absorbed into `CancelError` (see CancelError documentation), so they result in `nil` (cancel succeeded), **not** `ErrExecutionEnded`. This error is only returned when execution has completely ended with no interrupts occurring. + +```go +var ErrExecutionEnded = errors.New("execution already ended") +``` + +#### `ErrStreamCanceled` + +When `CancelImmediate` fires during ongoing streaming output, the framework immediately aborts the underlying stream and returns `ErrStreamCanceled` in the `.Recv()` of `AgentEvent.Output.MessageOutput.MessageStream`. This applies equally to ChatModel streaming responses and StreamableTool streaming output—both streams are exposed to users via `AgentEvent.Output.MessageOutput.MessageStream`, and the cancel monitoring mechanism is fully symmetric. + +**When it appears**: Only during `CancelImmediate` (including automatic escalation from safe point cancel timeout) when ChatModel or StreamableTool is actively streaming. 
Safe point cancels (`CancelAfterChatModel` / `CancelAfterToolCalls`) do not produce this error because they wait until the safe point before interrupting. + +**Where it appears**: `ErrStreamCanceled` appears in `AgentEvent.Output.MessageOutput.MessageStream.Recv()`, not in `AgentEvent.Err`. Subsequently, the Runner emits a separate event where `AgentEvent.Err` is `*CancelError`, indicating cancellation is complete. Note that this event does not include `AgentEvent.Action.Interrupted`: `Action.Interrupted` is only for business interrupts, while cancellation always communicates via `CancelError`. + +```go +var ErrStreamCanceled error = &StreamCanceledError{} +``` + +#### `StreamCanceledError` + +The concrete error type for `ErrStreamCanceled`. This type is exported so that the stream cancel error can be serialized via gob during checkpoint saving; business code typically uses `errors.Is(err, ErrStreamCanceled)` for detection. + +```go +type StreamCanceledError struct{} + +func (e *StreamCanceledError) Error() string +``` + +```go +// Handling ErrStreamCanceled when processing streaming events +for { + event, ok := events.Next() + if !ok { + break + } + + if event.Output != nil && event.Output.MessageOutput != nil && event.Output.MessageOutput.IsStreaming { + stream := event.Output.MessageOutput.MessageStream + for { + chunk, err := stream.Recv() + if errors.Is(err, io.EOF) { + // Stream finished normally + break + } + if errors.Is(err, ErrStreamCanceled) { + // Stream aborted by immediate cancel (ChatModel or StreamableTool), CancelError will follow in subsequent event + break + } + if err != nil { + // Any other stream error: stop reading instead of looping on a failed stream + break + } + // Process chunk... + _ = chunk + } + } + + if event.Err != nil { + var cancelErr *CancelError + if errors.As(event.Err, &cancelErr) { + // Cancellation complete, CancelError contains cancel mode and interrupt context info + } + // Stop consuming on any error event, cancel-related or not + break + } +} +``` + +## TurnLoop API + +### Overview + +`TurnLoop` is a push-based event loop that manages agent execution in units of turns. 
Users push data items into TurnLoop's buffer, and TurnLoop processes them through a configured agent. This design enables flexible, event-driven agent workflows. + +**Note**: Some TurnLoop features (such as preemption and stop) depend on the Agent Cancel feature. + +### Core Types + +#### `TurnLoop[T, M]` + +The main event loop instance. Created via `NewTurnLoop()`, then started with `Run()`. + +```go +type TurnLoop[T any, M MessageType] struct { ... } +``` + +#### `MessageType` + +Constrains message types usable with ADK. Currently only supports `*schema.Message` and `*schema.AgenticMessage`; external packages cannot extend this union type. + +```go +type MessageType interface { + *schema.Message | *schema.AgenticMessage +} +``` + +#### `TypedAgent[M]` + +The agent interface that TurnLoop actually runs each turn. + +```go +type TypedAgent[M MessageType] interface { + Name(ctx context.Context) string + Description(ctx context.Context) string + + Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] +} + +type Agent = TypedAgent[*schema.Message] +``` + +#### `TypedAgentInput[M]` + +Input passed to the agent. + +```go +type TypedAgentInput[M MessageType] struct { + Messages []M + EnableStreaming bool +} + +type AgentInput = TypedAgentInput[*schema.Message] +``` + +#### `TypedAgentEvent[M]` + +Events emitted during agent execution. Consumed by TurnLoop's `OnAgentEvents` callback. + +```go +type TypedAgentEvent[M MessageType] struct { + AgentName string + RunPath []RunStep + Output *TypedAgentOutput[M] + Action *AgentAction + Err error +} + +type AgentEvent = TypedAgentEvent[*schema.Message] +``` + +#### `TurnLoopConfig[T, M]` + +Configuration structure for creating a TurnLoop. + +```go +type TurnLoopConfig[T any, M MessageType] struct { + // GenInput receives the TurnLoop instance and all buffered items, deciding what to process. + // Returns which items to consume now and which to keep for subsequent turns. 
+ // The loop parameter allows calling Push() or Stop() directly within the callback. + // Required. + GenInput func(ctx context.Context, loop *TurnLoop[T, M], items []T) (*GenInputResult[T, M], error) + + // GenResume is called at most once during Run(). When CheckpointID is configured, + // Run() queries the Store for a checkpoint: + // - If the checkpoint contains runner state (i.e., agent was interrupted mid-turn), + // Run() calls GenResume to plan the resume turn. + // - Otherwise (no checkpoint or between-turns checkpoint), GenResume is not called, + // and TurnLoop processes normally via GenInput. + // Parameter meanings: + // - inFlightItems: items being processed when the previous run was cancelled or business-interrupted + // - unhandledItems: items buffered but unprocessed when the previous run exited + // - newItems: new items buffered via Push() before Run() is called + // + // Returns GenResumeResult describing how to resume the interrupted agent turn + // (optional ResumeParams) and how to manipulate the buffer (Consumed/Remaining). + // Optional; only required when recovery is needed. + GenResume func(ctx context.Context, loop *TurnLoop[T, M], inFlightItems, unhandledItems, newItems []T) (*GenResumeResult[T, M], error) + + // PrepareAgent returns a configured TypedAgent to process the consumed items. + // Called once per turn, receiving the items GenInput decided to consume. + // The loop parameter allows calling Push() or Stop() directly within the callback. + // Required. + PrepareAgent func(ctx context.Context, loop *TurnLoop[T, M], consumed []T) (TypedAgent[M], error) + + // OnAgentEvents handles events emitted by the agent. 
+ // TurnContext provides per-turn information and control: + // - tc.Consumed: the consumed items that triggered this agent execution + // - tc.Loop: allows calling Push() or Stop() directly within the callback + // - tc.Preempted / tc.Stopped: signals available while processing events + // + // Error handling: the returned error is only for when the callback itself wants to abort TurnLoop. + // The callback never needs to propagate CancelError—the framework handles it automatically: + // - On Stop: the framework automatically propagates CancelError as ExitReason, terminating TurnLoop. + // - On Preempt: the framework does not propagate CancelError; if the callback also returns nil, TurnLoop enters the next turn. + // In practice, only return non-nil error when the callback encounters an internal fault requiring TurnLoop termination. + // + // Optional. If not provided, events will be consumed and the first error (including Stop-triggered CancelError) will be returned as ExitReason. + OnAgentEvents func(ctx context.Context, tc *TurnContext[T, M], events *AsyncIterator[*TypedAgentEvent[M]]) error + + // Store is the checkpoint store for persistence and recovery. Optional. + // When used with CheckpointID, enables automatic checkpoint recovery. + // TurnLoop always persists runner checkpoint bytes and item bookkeeping information + // (InFlightItems, UnhandledItems) via gob encoding, so T must be gob-encodable when using Store. + Store CheckPointStore + + // CheckpointID works with Store to enable declarative automatic checkpoint recovery. + // On Run(), TurnLoop uses this ID to query the Store: + // - If a checkpoint with runner state exists (mid-turn interrupt), calls GenResume to plan the resume turn. + // - If a checkpoint without runner state exists (between-turns), buffers stored unhandled items, + // then processes normally via GenInput. + // - If no checkpoint exists, TurnLoop starts from scratch. 
+ // + // On exit, if TurnLoop saved a new checkpoint, it uses the same CheckpointID. + // When no new checkpoint is saved, TurnLoop attempts to delete the old checkpoint under the same CheckpointID + // to prevent stale recovery (requires Store to implement CheckPointDeleter). + // Use WithSkipCheckpoint() to explicitly skip checkpoint saving. + CheckpointID string +} +``` + +#### `TurnContext[T, M]` + +Per-turn context information available to the `OnAgentEvents` callback. + +```go +type TurnContext[T any, M MessageType] struct { + // Loop is the TurnLoop instance, allowing Push()/Stop() calls within the callback. + Loop *TurnLoop[T, M] + + // Consumed are the items that triggered this agent execution. + Consumed []T + + // Preempted is closed when at least one preemptive Push actually contributes to the current turn's + // CancelError (via Push + WithPreempt). + // "Contribute" means the cancel call's options were incorporated before the CancelError was finalized. + // If no contributing preemption occurred for this turn (e.g., cancel was already finalized), the channel stays open. + // + // Preempted and Stopped may both close within the same turn—when both signals arrive while the agent + // is still in the cancel process. Signals arriving after cancel is fully processed do not contribute. + Preempted <-chan struct{} + + // Stopped is closed when Stop()'s cancel call actually contributes to the current turn's CancelError. + // If Stop did not contribute (e.g., cancel was already finalized), the channel stays open. + // + // For the relationship between Preempted and Stopped, see the Preempted documentation. + Stopped <-chan struct{} + + // StopCause returns the business-side stop reason passed via WithStopCause. + // This value is only meaningful after the Stopped channel is closed. Before that, it returns an empty string. + StopCause func() string +} +``` + +#### `GenInputResult[T, M]` + +Return result of the `GenInput` callback. 
+ +```go +type GenInputResult[T any, M MessageType] struct { + // RunCtx is the execution context for this turn (optional). + // If set, used for PrepareAgent, agent's Run/Resume, and OnAgentEvents. + // Must be derived from the ctx passed to GenInput to preserve TurnLoop's cancel semantics and inherited values. + // Example: + // runCtx := context.WithValue(ctx, traceKey{}, extractTraceID(items)) + // return &GenInputResult[T, M]{RunCtx: runCtx, ...}, nil + // If nil, TurnLoop's context is used. + RunCtx context.Context + + // Input is the agent input to execute + Input *TypedAgentInput[M] + + // RunOpts are options for this agent run. + // Note: no need to pass WithCheckPointID here; TurnLoop automatically injects the checkpointID into Runner. + RunOpts []AgentRunOption + + // Consumed are the items selected for processing this turn: + // These items are removed from the buffer and passed as PrepareAgent's input parameter. + Consumed []T + + // Remaining are items kept in the buffer for future turns: + // TurnLoop pushes Remaining back to the buffer before this turn starts executing the agent. + // + // Note: items in the input that are neither in Consumed nor Remaining will be discarded. + Remaining []T +} +``` + +#### `GenResumeResult[T, M]` + +Return result of the `GenResume` callback. + +```go +type GenResumeResult[T any, M MessageType] struct { + // RunCtx is the execution context for this resume turn (optional). + RunCtx context.Context + + // RunOpts are options for this agent resume run. + // Note: no need to pass WithCheckPointID here; TurnLoop automatically injects the checkpointID into Runner. + RunOpts []AgentRunOption + + // ResumeParams contains parameters for resuming the interrupted agent (optional). + ResumeParams *ResumeParams + + // Consumed are the items selected for processing this resume turn: + // These items are removed from the buffer and passed as PrepareAgent's input parameter. 
+ Consumed []T + + // Remaining are items kept in the buffer for future turns: + // TurnLoop pushes Remaining back to the buffer before this turn resumes the agent. + // + // Note: items from (inFlightItems, unhandledItems, newItems) that are neither in Consumed nor Remaining + // will be discarded. + Remaining []T +} +``` + +#### `InterruptError` + +When an agent produces a business interrupt (`AgentAction.Interrupted`) that causes TurnLoop to exit, the `ExitReason` is `*InterruptError`. It indicates the agent paused at a business-defined interrupt point, rather than being cancelled. + +```go +type InterruptError struct { + // InterruptContexts provides interrupt contexts needed for targeted resume. + // Each context represents an interrupted position in the agent hierarchy. + InterruptContexts []*InterruptCtx +} + +func (e *InterruptError) Error() string +``` + +**Behavior:** + +- `*InterruptError` triggers TurnLoop checkpoint saving; on resume, the items being processed by the original turn are available via GenResume's `inFlightItems` parameter +- `InterruptContexts` can be used to construct `ResumeParams.Targets`, passed to `Runner.ResumeWithParams` via `GenResumeResult.ResumeParams` +- Unlike `CancelError`, `InterruptError` represents a business-side intentional pause; interrupts occurring during an active cancel are still absorbed into `CancelError` + +#### `TurnLoopExitState[T, M]` + +State returned when TurnLoop exits, containing the exit reason and unhandled items. + +```go +type TurnLoopExitState[T any, M MessageType] struct { + // ExitReason indicates why TurnLoop exited. + // nil means normal exit (Stop() was called and TurnLoop completed normally). + // Non-nil may be context error, callback error, *CancelError, etc. + // When Stop() cancelled a running agent, ExitReason is *CancelError. + // This field does not include checkpoint errors—see CheckpointErr. + ExitReason error + + // UnhandledItems contains items that were buffered but never processed. 
+ // Items where Push returned true but were never consumed by any turn. + // Always valid regardless of ExitReason value. + UnhandledItems []T + + // InFlightItems contains items being processed by the interrupted turn. + // Cancellation (Stop + WithImmediate, WithGraceful, or WithGracefulTimeout) and business interrupts + // both populate this field; if the agent completed normally before cancel took effect, this is empty. + // On resume, passed via GenResume's inFlightItems parameter. + InFlightItems []T + + // StopCause is the business-side stop reason passed via WithStopCause. + // Empty string if Stop was never called or no cause was provided. + StopCause string + + // CheckpointAttempted indicates whether TurnLoop attempted to save a checkpoint on exit. + // True only when Store is configured, CheckpointID is set, TurnLoop is not idle on exit, + // WithSkipCheckpoint was not used, and exit was triggered by Stop() or business interrupt. + CheckpointAttempted bool + + // CheckpointErr is the error from checkpoint saving (if any). + // nil when CheckpointAttempted is false (no attempt) or saving succeeded. + CheckpointErr error + + // TakeLateItems returns items pushed after TurnLoop stopped + // (i.e., items where Push returned false). These items are not included in checkpoints. + // + // This function is idempotent: the first call computes and caches the result, + // subsequent calls return the same slice. + // + // After calling TakeLateItems, subsequent Push() calls will panic + // to prevent items from being silently lost. + // + // Safe to call from any goroutine after Wait() returns. + // If TakeLateItems is never called, late items will be garbage collected normally. + TakeLateItems func() []T +} +``` + +### Functions + +#### `NewTurnLoop` + +Creates a new TurnLoop without starting it. The returned TurnLoop immediately accepts `Push` and `Stop` calls; pushed items are buffered until `Run` is called. + +Panics if `GenInput` or `PrepareAgent` is nil. 
+ +```go +func NewTurnLoop[T any, M MessageType](cfg TurnLoopConfig[T, M]) *TurnLoop[T, M] +``` + +**Parameters:** + +- `cfg TurnLoopConfig[T, M]`: TurnLoop configuration + +**Return values:** + +- `*TurnLoop[T, M]`: An unstarted TurnLoop instance + +**Example:** + +```go +loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) { + return &GenInputResult[string, *schema.Message]{ + Input: &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(items[0])}}, + Consumed: items, + }, nil + }, + PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) (TypedAgent[*schema.Message], error) { + return myAgent, nil + }, +}) + +// Can push items or pass references before starting +_, _ = loop.Push("initial_item") +loop.Run(ctx) +``` + +### Methods + +All methods are safe to call when TurnLoop is not started (lenient API): + +- `Push`: Items are buffered, processing begins after `Run` is called. +- `Stop`: Sets the stop flag; subsequent `Run` will exit immediately. +- `Wait`: Blocks until `Run` is called and TurnLoop exits. If `Run` is never called, `Wait` blocks forever. + +> Note: Items pushed before starting will be processed once Run starts. + +#### `Run` + +Starts TurnLoop's processing goroutine. This method is non-blocking: TurnLoop runs in the background, use `Wait` to get results. + +If `CheckpointID` is configured in `TurnLoopConfig` and a matching checkpoint exists in the `Store`, TurnLoop will automatically attempt to resume from that checkpoint; otherwise it starts processing already-`Push()`ed items from scratch. Multiple calls to `Run` are idempotent no-ops: only the first call starts TurnLoop. 
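
The start-once, non-blocking behaviour described above can be sketched in isolation. This is a minimal illustration, not TurnLoop's actual implementation: `miniLoop`, `newMiniLoop`, and the `starts` counter are hypothetical names invented for this example, and the sketch simply assumes that a `sync.Once` guard plus a done channel is one reasonable way to get "first `Run` starts the work, later calls are no-ops, `Wait` blocks until exit".

```go
package main

import (
	"fmt"
	"sync"
)

// miniLoop is a hypothetical stand-in used only to illustrate the
// start-once pattern; it is not part of the TurnLoop API.
type miniLoop struct {
	once   sync.Once
	done   chan struct{}
	starts int // how many times the background goroutine was launched
}

func newMiniLoop() *miniLoop {
	return &miniLoop{done: make(chan struct{})}
}

// Run launches work in the background on the first call and returns
// immediately. Every later call is a no-op.
func (l *miniLoop) Run(work func()) {
	l.once.Do(func() {
		l.starts++
		go func() {
			defer close(l.done) // unblocks every Wait caller
			work()
		}()
	})
}

// Wait blocks until the background goroutine has finished.
// If Run is never called, Wait blocks forever.
func (l *miniLoop) Wait() {
	<-l.done
}

func main() {
	l := newMiniLoop()
	l.Run(func() { fmt.Println("working") })
	l.Run(func() { fmt.Println("ignored: loop already started") })
	l.Wait()
	fmt.Println("starts:", l.starts)
}
```

Closing the done channel, rather than sending on it, is what lets any number of goroutines call `Wait` and all observe the same exit.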
+ +```go +func (l *TurnLoop[T, M]) Run(ctx context.Context) +``` + +**Parameters:** + +- `ctx context.Context`: Context for TurnLoop's lifecycle + +**Example:** + +```go +loop := NewTurnLoop(cfg) +loop.Run(context.Background()) +``` + +#### `Push` + +Adds an item to TurnLoop's buffer for processing. This method is non-blocking and thread-safe. + +```go +func (l *TurnLoop[T, M]) Push(item T, opts ...PushOption[T, M]) (bool, <-chan struct{}) +``` + +**Parameters:** + +- `item T`: Item to add to the buffer +- `opts ...PushOption[T, M]`: Optional push options + +**Return values:** + +- `bool`: Returns `false` if TurnLoop has stopped (the item is still retained and can be recovered via `TurnLoopExitState.TakeLateItems()`), otherwise returns `true` (including when `Run` hasn't been called yet—items will be buffered) +- `<-chan struct{}`: Returns non-nil only when using `WithPreempt` / `WithPreemptTimeout`. Callers can wait for this channel to close to confirm the preemption signal has been received by TurnLoop and a cancel request submitted—i.e., the current turn is guaranteed to be preempted. Specific timing: + - If there's a running agent: channel closes after TurnLoop calls cancel + - If there's no running agent (TurnLoop idle or not yet started): channel closes immediately (no need to cancel) + - If you don't need to wait for confirmation, you can ignore this return value + +**Example:** + +```go +// Normal push +ok, _ := loop.Push("message1") +if !ok { + // Loop has stopped, item can be recovered via TakeLateItems() +} + +// Preemptive push: push new item and request current turn cancellation +ok, ack := loop.Push("urgent_message", WithPreempt[string, *schema.Message](AnySafePoint)) +if !ok { + // Loop has stopped +} else { + <-ack // Wait for confirmation: preemption signal received, current turn guaranteed to be cancelled +} +``` + +##### SafePoint Type + +`SafePoint` describes at which boundary an agent can be cancelled. 
Values can be combined with bitwise OR to accept multiple safe points. + +`SafePoint` is only used in preemption APIs (`WithPreempt`/`WithPreemptTimeout`). A key design constraint: **preemption always targets safe points**—the user's intent is to cancel at a well-defined boundary, not to abort immediately. Immediate cancel is only reachable via timeout escalation (through `WithPreemptTimeout`), not as a direct user choice. This is why `SafePoint` has no "immediate" value, and `WithPreempt` requires non-zero `SafePoint` (panics otherwise). + +`SafePoint` maps internally to `CancelMode` but hides that detail from TurnLoop users. + +```go +type SafePoint int + +const ( + AfterChatModel SafePoint = 1 << iota // Allow agent to be cancelled after completing the current chat-model call + AfterToolCalls // Allow agent to be cancelled after completing the current tool call round + AnySafePoint = AfterChatModel | AfterToolCalls // Shorthand for AfterChatModel | AfterToolCalls +) +``` + +##### `PushOption[T, M]` + +Opaque option type for configuring a `Push` call. Users typically don't implement this type themselves, but use `WithPreempt`, `WithPreemptTimeout`, `WithPreemptDelay`, or `WithPushStrategy` to create options. + +```go +type PushOption[T any, M MessageType] func(*pushConfig[T, M]) +``` + +##### `WithPreempt` + +Pushes a new item while requesting cancellation of the current agent turn at the specified `SafePoint`. After cancellation completes, TurnLoop starts a new turn, and `GenInput` will see all buffered items (including the just-pushed one). Use `WithPreemptTimeout` to add a timeout for escalation to immediate abort. + +Since safe points trigger at turn-level boundaries (after chat model returns or after all tool calls complete), **no nested agents are running when cancellation occurs**—nested agents within AgentTools either haven't started (AfterChatModel) or have already completed (AfterToolCalls). 
Therefore, `WithPreempt`'s cancel doesn't involve nested agents. However, `WithPreemptTimeout` terminates nested agents running within AgentTools when timeout escalates to immediate cancel. + +`WithPreempt` and `WithPreemptTimeout` are mutually exclusive; if both are passed to the same `Push` call, the latter takes effect. + +`safePoint` cannot be zero; passing `SafePoint(0)` panics. + +```go +func WithPreempt[T any, M MessageType](safePoint SafePoint) PushOption[T, M] +``` + +**Parameters:** + +- `safePoint SafePoint`: Specifies at which safe point the agent yields + +**Example:** + +```go +_, _ = loop.Push("urgent_item", WithPreempt[string, *schema.Message](AnySafePoint)) +_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AfterToolCalls)) +``` + +##### `WithPreemptTimeout` + +Similar to `WithPreempt`, but adds a timeout. If the agent doesn't reach the safe point within the timeout, preemption escalates to immediate cancel. On timeout escalation, nested agents within AgentTools also receive cancel signals and are terminated. + +`timeout <= 0` does not set an effective deadline and therefore won't trigger timeout escalation. + +`safePoint` cannot be zero; passing `SafePoint(0)` panics. + +```go +func WithPreemptTimeout[T any, M MessageType](safePoint SafePoint, timeout time.Duration) PushOption[T, M] +``` + +**Parameters:** + +- `safePoint SafePoint`: Specifies at which safe point the agent yields +- `timeout time.Duration`: Escalates to immediate cancel after timeout + +**Example:** + +```go +_, _ = loop.Push("urgent_item", WithPreemptTimeout[string, *schema.Message](AnySafePoint, 5*time.Second)) +``` + +##### `WithPreemptDelay` + +Sets a delay before preemption takes effect. Must be used together with `WithPreempt` or `WithPreemptTimeout`. + +`delay <= 0` is equivalent to no delay. 
+ +```go +func WithPreemptDelay[T any, M MessageType](delay time.Duration) PushOption[T, M] +``` + +**Parameters:** + +- `delay time.Duration`: Delay duration before preemption + +**Example:** + +```go +_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AnySafePoint), WithPreemptDelay[string, *schema.Message](2*time.Second)) +``` + +##### `WithPushStrategy` + +Provides dynamic push option resolution based on current turn state. The callback receives the current turn's context and `TurnContext` (nil if no active turn), and returns the list of `PushOption`s to actually apply. + +When using `WithPushStrategy`, all other `PushOption`s passed in the same `Push` call are ignored. The returned options must not contain another `WithPushStrategy`; nested strategies are silently stripped. + +TurnLoop first holds the current run loop under an internal lock to obtain a current turn snapshot, then calls the callback on that stable snapshot; the turn state read by the callback and the final push decision don't cross into the next turn. + +```go +func WithPushStrategy[T any, M MessageType](fn func(ctx context.Context, tc *TurnContext[T, M]) []PushOption[T, M]) PushOption[T, M] +``` + +**Parameters:** + +- `fn func(ctx context.Context, tc *TurnContext[T, M]) []PushOption[T, M]`: Strategy callback function + - `ctx`: Current turn's context (`context.Background()` when no active turn) + - `tc`: Current turn's `TurnContext` (`nil` when no active turn) + +**Example:** + +```go +_, _ = loop.Push(urgentItem, WithPushStrategy(func(ctx context.Context, tc *TurnContext[MyItem, *schema.Message]) []PushOption[MyItem, *schema.Message] { + if tc == nil { + return nil // Between turns, normal push + } + if isLowPriority(tc.Consumed) { + return []PushOption[MyItem, *schema.Message]{WithPreempt[MyItem, *schema.Message](AnySafePoint)} + } + return nil // Don't preempt high-priority tasks +})) +``` + +#### `Stop` + +Sends a stop signal to TurnLoop and returns immediately (non-blocking). 
+ +Without options, the current agent turn runs to completion, and TurnLoop exits at the turn boundary without starting a new turn. In this case, `ExitReason` is `nil`. + +Use `WithImmediate()` to immediately abort the running agent turn. Use `WithGraceful()` to cancel at the nearest safe point with recursive propagation to nested agents. Use `WithGracefulTimeout()` for safe point cancel with an escalation deadline. Use `UntilIdleFor()` to delay stop until TurnLoop has been continuously idle for a duration. + +Can be called before `Run`; subsequent `Run` will exit immediately. + +Multiple calls are allowed; subsequent calls update cancel options. A `Stop()` call without `UntilIdleFor` immediately shuts down TurnLoop even if a previous `UntilIdleFor` is still waiting. Note that `WithSkipCheckpoint` and `WithStopCause` have sticky semantics—see their respective documentation. + +If the running agent doesn't support `WithCancel`'s `AgentRunOption`, all cancel-related options (`WithImmediate`, `WithGraceful`, `WithGracefulTimeout`) degrade to "exit TurnLoop upon entering the next iteration"—the current agent turn runs to completion before TurnLoop exits. + +Call `Wait()` to block until TurnLoop fully exits and get results. + +```go +func (l *TurnLoop[T, M]) Stop(opts ...StopOption) +``` + +**Parameters:** + +- `opts ...StopOption`: Optional stop options + +**Example:** + +```go +// Without options: turn boundary exit (current turn completes, then stops; ExitReason is nil) +loop.Stop() + +// Immediately abort current agent turn +loop.Stop(WithImmediate()) + +// Safe point stop (graceful shutdown, recursively propagated to nested agents) +loop.Stop(WithGraceful()) + +// Safe point stop with timeout +loop.Stop(WithGracefulTimeout(10 * time.Second)) + +// Auto-stop after idle (stops after 30 seconds of continuous idle) +loop.Stop(UntilIdleFor(30 * time.Second)) +``` + +##### `StopOption` + +Opaque option type for configuring a `Stop` call. 
Users typically don't implement this type themselves, but use `WithImmediate`, `WithGraceful`, `WithGracefulTimeout`, `UntilIdleFor`, `WithSkipCheckpoint`, or `WithStopCause` to create options. + +```go +type StopOption func(*stopConfig) +``` + +##### `WithImmediate` + +Immediately aborts the running agent turn without waiting for any safe point. Nested agents within AgentTools also receive cancel signals and are terminated. + +This is the most aggressive stop mode, suitable for scenarios prioritizing fast shutdown; if you're also certain future recovery isn't needed, additionally use `WithSkipCheckpoint()`. + +```go +func WithImmediate() StopOption +``` + +**Example:** + +```go +loop.Stop(WithImmediate()) +``` + +##### `WithGraceful` + +Requests graceful stop: waits at the nearest safe point (after tool calls or after chat-model call) and recursively propagates to nested agents. No time limit; use `WithGracefulTimeout` to add a timeout that escalates to immediate cancel. + +`WithGraceful` and `WithGracefulTimeout` are mutually exclusive; if both are passed to the same `Stop` call, the latter takes effect. + +```go +func WithGraceful() StopOption +``` + +**Example:** + +```go +loop.Stop(WithGraceful()) +``` + +##### `WithGracefulTimeout` + +Similar to `WithGraceful`, but adds a timeout deadline. If the agent doesn't reach a safe point within `gracePeriod`, stop escalates to immediate cancel. + +`gracePeriod` must be positive; passing zero or negative panics. + +```go +func WithGracefulTimeout(gracePeriod time.Duration) StopOption +``` + +**Parameters:** + +- `gracePeriod time.Duration`: Escalates to immediate cancel after timeout + +**Example:** + +```go +loop.Stop(WithGracefulTimeout(10 * time.Second)) +``` + +##### `UntilIdleFor` + +Defers stop until TurnLoop has been continuously idle (blocking between turns with no pending items) for the specified duration. The timer resets from zero whenever new items arrive. 
+ +Useful when business code externally monitors agent activity and wants to shut down TurnLoop after a period of no work, without racing with concurrent `Push` calls. + +`UntilIdleFor` doesn't affect running agents; it only takes effect when TurnLoop is idle. Cancel options in the same call (`WithImmediate`, `WithGraceful`, `WithGracefulTimeout`) are silently ignored. `UntilIdleFor` can be combined with non-cancel options (`WithSkipCheckpoint`, `WithStopCause`). + +To escalate to immediate shutdown during idle waiting, make a new `Stop` call without `UntilIdleFor` to override the idle wait: + +```go +loop.Stop(UntilIdleFor(30 * time.Second)) // Wait for idle +// ... later, if immediate abort is needed: +loop.Stop(WithImmediate()) // Overrides idle wait, immediate shutdown +``` + +Only the first `UntilIdleFor`'s duration takes effect; subsequent calls with different durations are ignored. + +`duration` must be positive; passing zero or negative panics. + +```go +func UntilIdleFor(duration time.Duration) StopOption +``` + +**Parameters:** + +- `duration time.Duration`: Idle wait duration + +**Example:** + +```go +// Auto-stop after 30 seconds of continuous idle +loop.Stop(UntilIdleFor(30 * time.Second)) + +// Idle stop without saving checkpoint +loop.Stop(UntilIdleFor(30*time.Second), WithSkipCheckpoint()) +``` + +##### `WithSkipCheckpoint` + +Tells TurnLoop not to persist a checkpoint on this Stop. Suitable when the caller is certain recovery won't be needed in the future. + +The flag is sticky: once any `Stop()` call sets this option, subsequent escalation calls cannot undo it. + +```go +func WithSkipCheckpoint() StopOption +``` + +**Example:** + +```go +// Permanent stop, no checkpoint +loop.Stop(WithSkipCheckpoint()) + +// Combined with cancel options: immediate abort without checkpoint +loop.Stop(WithImmediate(), WithSkipCheckpoint()) +``` + +##### `WithStopCause` + +Attaches a business-side stop reason string to the Stop call. 
+ +The reason is exposed in two places: + +- `TurnLoopExitState.StopCause`: Available after `Wait()` returns +- `TurnContext.StopCause()`: In `OnAgentEvents`, available after `<-tc.Stopped` closes + +If multiple `Stop()` calls provide a cause, the first non-empty value takes precedence. + +```go +func WithStopCause(cause string) StopOption +``` + +**Parameters:** + +- `cause string`: Business-side stop reason + +**Example:** + +```go +loop.Stop(WithStopCause("user session timeout")) + +// Combined usage +loop.Stop( + WithGraceful(), + WithStopCause("quota exceeded"), +) +``` + +#### `Wait` + +Blocks until TurnLoop exits and returns the exit state. Safe to call from multiple goroutines; all callers receive the same result. Blocks until `Run` is called and TurnLoop exits; if `Run` is never called, blocks forever. + +```go +func (l *TurnLoop[T, M]) Wait() *TurnLoopExitState[T, M] +``` + +**Return values:** + +- `*TurnLoopExitState[T, M]`: Exit state containing exit reason, unhandled items, checkpoint status, and business stop reason + +**Example:** + +```go +loop.Stop() +result := loop.Wait() +if result.ExitReason != nil { + log.Printf("Loop exited with error: %v", result.ExitReason) +} +``` + +### Extension Interfaces + +#### `CheckPointStore` + +Storage interface for saving and reading checkpoints. Used by `TurnLoopConfig.Store`; when configured together with `CheckpointID`, TurnLoop enables automatic recovery and persistence. + +```go +type CheckPointStore interface { + Get(ctx context.Context, checkPointID string) ([]byte, bool, error) + Set(ctx context.Context, checkPointID string, checkPoint []byte) error +} +``` + +#### `CheckPointDeleter` + +Optional extension interface for `CheckPointStore`. Stores implementing this interface support explicit checkpoint deletion. + +TurnLoop attempts to delete previously loaded checkpoints when no new checkpoint is saved, to prevent stale recovery. 
**Only Stores implementing CheckPointDeleter perform this deletion**; otherwise the lifecycle of expired checkpoints is managed by the Store itself. + +```go +type CheckPointDeleter interface { + Delete(ctx context.Context, checkPointID string) error +} +``` + +--- + +## Usage Examples + +### Basic Agent Cancel Usage + +```go +ctx := context.Background() +runner := NewRunner(ctx, RunnerConfig{ + Agent: myAgent, +}) + +// Enable cancel +cancelOpt, cancelFunc := WithCancel() +iter := runner.Run(ctx, messages, cancelOpt) + +// In another goroutine, cancel after chat model completes +go func() { + time.Sleep(2 * time.Second) + handle, _ := cancelFunc( + WithAgentCancelMode(CancelAfterChatModel), + WithAgentCancelTimeout(5*time.Second), + ) + err := handle.Wait() + if err != nil { + log.Printf("Cancel failed: %v", err) + } +}() + +// Process events +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Err != nil { + var cancelErr *CancelError + if errors.As(event.Err, &cancelErr) { + log.Printf("Agent cancelled: mode=%v, escalated=%v", + cancelErr.Info.Mode, cancelErr.Info.Escalated) + } + break + } + // Process events +} +``` + +### Basic TurnLoop Usage + +```go +ctx := context.Background() + +loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) { + // Process all items and bind trace context for this turn + runCtx := context.WithValue(ctx, traceKey{}, extractTrace(items[0])) + return &GenInputResult[string, *schema.Message]{ + RunCtx: runCtx, + Input: &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(strings.Join(items, "\n"))}}, + Consumed: items, + }, nil + }, + PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) (Agent, error) { + return myAgent, nil + }, + OnAgentEvents: func(ctx context.Context, tc *TurnContext[string, 
*schema.Message], events *AsyncIterator[*TypedAgentEvent[*schema.Message]]) error { + for { + event, ok := events.Next() + if !ok { + break + } + if event.Err != nil { + var cancelErr *CancelError + if errors.As(event.Err, &cancelErr) { + // Cancel is captured by TurnLoop and converted to exit state; callback doesn't need to return it. + continue + } + return event.Err + } + // Process events + } + return nil + }, +}) + +// Can push items before starting +_, _ = loop.Push("user message 1") +_, _ = loop.Push("user message 2") + +// Start the loop +loop.Run(ctx) + +// Stop and wait (turn boundary exit, ExitReason is nil) +loop.Stop() +result := loop.Wait() +``` + +### TurnLoop with Preemption + +```go +loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{...}) +loop.Run(ctx) + +// Push urgent item and preempt current agent +_, ack := loop.Push("urgent_message", WithPreempt[string, *schema.Message](AnySafePoint)) +if ack != nil { + <-ack +} + +// Or with delay +_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AnySafePoint), WithPreemptDelay[string, *schema.Message](1*time.Second)) +``` + +### TurnLoop Declarative Checkpoint Recovery + +```go +ctx := context.Background() + +// First run—configure Store and CheckpointID to enable automatic checkpoint +cfg := TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) { + return &GenInputResult[string, *schema.Message]{ + Input: &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(items[0])}}, + Consumed: items, + }, nil + }, + GenResume: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], inFlightItems, unhandledItems, newItems []string) (*GenResumeResult[string, *schema.Message], error) { + all := append(append(inFlightItems, unhandledItems...), newItems...) 
+ return &GenResumeResult[string, *schema.Message]{ + Consumed: all, + }, nil + }, + PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) (Agent, error) { + return myAgent, nil + }, + Store: myStore, + CheckpointID: "my-session-id", +} + +loop := NewTurnLoop(cfg) +_, _ = loop.Push("message1") +loop.Run(ctx) + +// Stop the run +loop.Stop(WithGraceful()) +exit := loop.Wait() + +// Resume from checkpoint—using the same cfg (with same CheckpointID), +// Run() automatically detects and resumes from checkpoint +loop2 := NewTurnLoop(cfg) +_, _ = loop2.Push("new_message") // New items are passed as newItems to GenResume +loop2.Run(ctx) +result2 := loop2.Wait() +``` + +--- + +## Best Practices + +### Agent Cancel + +1. **Choose the right mode**: Use safe point modes (`CancelAfterChatModel`, `CancelAfterToolCalls`) for graceful cancellation, `CancelImmediate` for emergencies +2. **Set timeouts**: Always set timeouts for safe point modes to prevent infinite waiting +3. **Handle CancelError**: Check for `CancelError` in event errors to distinguish cancellation from failure +4. **Understand Interrupt absorption**: Business interrupts during active cancellation are absorbed into `CancelError`, but checkpoints retain complete data; on resume, business interrupts naturally re-fire +5. **Recovery capability**: Use `InterruptContexts` from `CancelError` for targeted recovery +6. **Recursive propagation**: By default, cancel only affects the root agent. When the agent hierarchy contains sub-agents nested within AgentTools, use `WithRecursive()` to propagate to all sub-agents. When in doubt, don't add `WithRecursive()` — only enable when you explicitly need sub-agents to respond to cancel safe points + +### TurnLoop + +1. **Process all events**: If `OnAgentEvents` is provided, fully consume the event iterator; when not provided, the framework automatically drains events +2. 
**Detect preemption/stop**: Use `TurnContext.Preempted` / `TurnContext.Stopped` (`select`) in `OnAgentEvents` to detect preemption/stop; note they only close when the corresponding cancel call actually contributes to this turn's `CancelError` +3. **Declarative Checkpoint**: Configure both `Store` and `CheckpointID` in `TurnLoopConfig` to enable automatic checkpoint recovery; `Run()` automatically detects and resumes from existing checkpoints +4. **Resume runs**: Create a new TurnLoop with the same `CheckpointID` and call `Run()`; the framework automatically detects the checkpoint and calls `GenResume`; new items are buffered via `Push()` before `Run()` +5. **Expired Checkpoint cleanup**: When no new checkpoint is saved, the framework automatically deletes previously loaded checkpoints to prevent stale recovery; **only Stores implementing the CheckPointDeleter interface perform this deletion** +6. **Distinguish CancelError from business interrupt**: `*CancelError` represents the cancel path, `*InterruptError` represents business-side intentional interrupt; both may produce checkpoints and pass in-flight items back via GenResume's `inFlightItems` +7. **Skip Checkpoint**: When recovery is not needed, use `WithSkipCheckpoint()` to avoid unnecessary checkpoint writes; the flag remains sticky across escalation calls +8. **Business stop reason**: Use `WithStopCause()` to attach business-layer stop reasons, separate from technical `ExitReason`; in `OnAgentEvents`, read `tc.StopCause()` after `<-tc.Stopped` closes +9. **T's gob compatibility**: When using `Store`, `T` must be gob-encodable since the framework persists runner bytes and item bookkeeping information via gob +10. **Stop escalation**: Call `Stop()` multiple times—subsequent calls update cancel options (e.g., escalate from `WithGraceful()` to `WithImmediate()`) +11. **Idle shutdown**: Use `UntilIdleFor()` to auto-stop when there's no work, avoiding races with concurrent `Push` +12. 
**Context derivation**: For per-turn traces, derive `RunCtx` from `ctx` in `GenInput`/`GenResume` +13. **Late Items recovery**: When `Push()` returns `false`, items aren't lost—recover via `TurnLoopExitState.TakeLateItems()`; note that after calling `TakeLateItems`, you can no longer `Push()` +14. **Distinguish exit result from Checkpoint result**: `ExitReason` reflects the loop's own exit reason, `CheckpointAttempted` + `CheckpointErr` reflects checkpoint persistence result—judge them independently + +### Integration + +1. **Preempt vs Stop**: Use `WithPreempt()` for urgent items during execution, `Stop()` for final shutdown +2. **Conditional preemption**: When preemption decisions depend on current turn state, use `WithPushStrategy` instead of reading state then calling `Push`—it executes under an atomic snapshot, avoiding TOCTOU races +3. **Context cancellation**: Canceling the `ctx` passed to `Run(ctx)` can abort the current turn and exit the loop (`ExitReason` is typically `context.Canceled`/`context.DeadlineExceeded`); `Stop()` is better for orderly shutdown and allows controlling cancel strategy via `WithGraceful`/`WithGracefulTimeout` diff --git a/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md b/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md index b91b261e0ab..05137dd1b25 100644 --- a/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md +++ b/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md @@ -1,30 +1,30 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: ReAct Agent Manual' +title: ReAct Agent User Manual weight: 1 --- # Introduction -Eino ReAct Agent is an agent framework that implements the [ReAct logic](https://react-lm.github.io/), allowing users to quickly and flexibly build and invoke ReAct Agents. 
+Eino ReAct Agent is an agent framework that implements [ReAct logic](https://react-lm.github.io/), allowing users to quickly and flexibly build and invoke ReAct Agents. > 💡 -> See the code implementation at: [Implementation Directory](https://github.com/cloudwego/eino/tree/main/flow/agent/react) +> See the implementation code at: [Implementation directory](https://github.com/cloudwego/eino/tree/main/flow/agent/react) ## Node Topology & Data Flow Diagram -ReAct Agent uses `compose.Graph` as its orchestration scheme under the hood. Generally, there are 2 nodes: ChatModel and Tools. All historical messages during the intermediate running process are stored in state. Before passing all historical messages to ChatModel, the messages are copied and processed by MessageModifier, and the processed result is then passed to ChatModel. The process continues until ChatModel returns a message without any tool call, at which point the final message is returned. +The ReAct Agent uses `compose.Graph` as its orchestration mechanism under the hood. Generally, it has 2 nodes: ChatModel and Tools. All history messages generated during execution are stored in state. Before passing all history messages to ChatModel, they are copied and handed to MessageModifier for processing, and the processed result is then passed to ChatModel. This continues until the ChatModel's response no longer contains tool calls, at which point the final message is returned. -When at least one Tool in the Tools list is configured with ReturnDirectly, the ReAct Agent structure becomes more complex: a Branch is added after ToolsNode to determine whether a ReturnDirectly Tool was called. If so, it goes directly to END; otherwise, it proceeds to ChatModel as usual. +When at least one Tool in the Tools list is configured with ReturnDirectly, the ReAct Agent structure becomes more complex: after ToolsNode, a Branch is added to determine whether a ReturnDirectly Tool was called. 
If so, it goes directly to END; otherwise, it proceeds to ChatModel as usual. ## Initialization -The ReactAgent initialization function is provided. Required parameters are Model and ToolsConfig. Optional parameters are MessageModifier, MaxStep, ToolReturnDirectly, and StreamToolCallChecker. +An initialization function for ReactAgent is provided. Required parameters are Model and ToolsConfig; optional parameters include MessageModifier, MaxStep, ToolReturnDirectly, and StreamToolCallChecker. ```bash go get github.com/cloudwego/eino-ext/components/model/openai@latest @@ -43,16 +43,16 @@ import ( ) func main() { - // first initialize the required chatModel + // First initialize the required chatModel toolableChatModel, err := openai.NewChatModel(...) - // initialize the required tools + // Initialize the required tools tools := compose.ToolsNodeConfig{ InvokableTools: []tool.InvokableTool{mytool}, StreamableTools: []tool.StreamableTool{myStreamTool}, } - // create agent + // Create the agent agent, err := react.NewAgent(ctx, &react.AgentConfig{ ToolCallingModel: toolableChatModel, ToolsConfig: tools, @@ -63,9 +63,9 @@ func main() { ### Model -Since ReAct Agent needs to make tool calls, the Model needs to have ToolCall capability, so you need to configure a ToolCallingChatModel. +Since the ReAct Agent needs to make tool calls, the Model must have ToolCall capability, so a ToolCallingChatModel must be configured. -Inside the Agent, the WithTools interface is called to register the Agent's tool list with the model. The definition is: +Internally, the Agent calls the WithTools interface to register the Agent's tool list with the model. The definition is: ```go // BaseChatModel defines the basic interface for chat models. @@ -91,11 +91,13 @@ type ToolCallingChatModel interface { } ``` -Currently, eino provides implementations such as openai and ark, as long as the underlying model supports tool call. 
+Currently, Eino provides implementations such as openai and ark—any underlying model that supports tool call will work. + ```bash go get github.com/cloudwego/eino-ext/components/model/openai@latest go get github.com/cloudwego/eino-ext/components/model/ark@latest ``` + ```go import ( "github.com/cloudwego/eino-ext/components/model/openai" @@ -131,7 +133,8 @@ func arkExample() { ### ToolsConfig -toolsConfig type is `compose.ToolsNodeConfig`. In eino, to build a Tool node, you need to provide the Tool's information and the function to call the Tool. The tool interface definition is as follows: +ToolsConfig is of type `compose.ToolsNodeConfig`. In Eino, to build a Tool node, you need to provide the Tool's information and the function to invoke the Tool. The tool interface is defined as: + ```go type InvokableRun func(ctx context.Context, arguments string, opts ...Option) (content string, err error) type StreamableRun func(ctx context.Context, arguments string, opts ...Option) (content *schema.StreamReader[string], err error) @@ -153,20 +156,21 @@ type StreamableTool interface { } ``` -Users can implement the required tools according to the tool interface definition. The framework also provides a more convenient method to build tools: +Users can implement the required tools according to the tool interface definition. 
The framework also provides a simpler way to build tools: + ```go userInfoTool := utils.NewTool( &schema.ToolInfo{ Name: "user_info", - Desc: "根据用户的姓名和邮箱,查询用户的公司、职位、薪酬信息", + Desc: "Query a user's company, position, and salary information based on their name and email", ParamsOneOf: schema.NewParamsOneOfByParams(map[string]*schema.ParameterInfo{ "name": { Type: "string", - Desc: "用户的姓名", + Desc: "The user's name", }, "email": { Type: "string", - Desc: "用户的邮箱", + Desc: "The user's email", }, }), }, @@ -187,14 +191,14 @@ toolConfig := &compose.ToolsNodeConfig{ ### MessageModifier -MessageModifier is executed before each time all historical messages are passed to ChatModel. The definition is: +MessageModifier is executed each time before passing all history messages to ChatModel. It is defined as: ```go // modify the input messages before the model is called. type MessageModifier func(ctx context.Context, input []*schema.Message) []*schema.Message ``` -Configuring MessageModifier in the Agent can modify the messages passed to the model, commonly used to add a preceding system message: +Configuring MessageModifier in the Agent allows you to modify the messages passed to the model, commonly used to prepend a system message: ```go import ( @@ -210,24 +214,24 @@ func main() { MessageModifier: func(ctx context.Context, input []*schema.Message) []*schema.Message { res := make([]*schema.Message, 0, len(input)+1) - res = append(res, schema.SystemMessage("你是一个 golang 开发专家.")) + res = append(res, schema.SystemMessage("You are a golang development expert.")) res = append(res, input...) 
return res }, }) - agent.Generate(ctx, []*schema.Message{schema.UserMessage("Write a hello world code")}) - // The actual input to the model is: + agent.Generate(ctx, []*schema.Message{schema.UserMessage("Write a hello world program")}) + // The actual input received by the model: // []*schema.Message{ // {Role: schema.System, Content:"You are a golang development expert."}, - // {Role: schema.Human, Content: "Write a hello world code"} + // {Role: schema.User, Content: "Write a hello world program"} //} } ``` ### MessageRewriter -MessageRewriter is executed before each ChatModel call and modifies and updates the historical messages saved in the global state: +MessageRewriter is executed before each ChatModel call and modifies and updates the history messages stored in the global state: ```go // MessageRewriter modifies message in the state, before the ChatModel is called. @@ -238,17 +242,17 @@ MessageRewriter is executed before each ChatModel call and modifies and updates MessageRewriter MessageModifier ``` -Commonly used for context compression, which is a message change that needs to take effect continuously across multiple ReAct loops. +Commonly used for context compression and other message changes that need to persist across multiple ReAct loops. -Compared to MessageModifier (which only changes without persisting, thus suitable for system prompts), MessageRewriter's changes are visible in subsequent ReAct loops. +Compared to MessageModifier (changes are not persisted, thus suitable for system prompts), MessageRewriter's changes are visible in subsequent ReAct loops. ### MaxStep -Specify the Agent's maximum running step length. Each transition from one node to the next node counts as one step. The default value is node count + 2. +Specifies the maximum number of steps the Agent can run. Each transition from one node to the next counts as one step. The default value is 12 (the number of nodes + 10). 
-Since one loop in the Agent is ChatModel + Tools, which equals 2 steps, the default value of 12 allows up to 6 loops. However, since the last step must be a ChatModel return (because ChatModel must determine that no tool needs to run before returning the final result), at most 5 tools can be run. +Since one loop in the Agent consists of ChatModel + Tools (2 steps), the default value of 12 allows up to 6 loops. However, since the last step must be a ChatModel return (because the Agent only returns the final result when ChatModel determines no more tools need to be run), it can run at most 5 tool invocations. -Similarly, if you want to run at most 10 loops (10 ChatModel + 9 Tools), you need to set MaxStep to 20. If you want to run at most 20 loops, MaxStep needs to be 40. +Similarly, if you want to allow up to 10 loops (10 ChatModel + 9 Tools), set MaxStep to 20. For up to 20 loops, MaxStep should be 40. ```go func main() { @@ -262,7 +266,7 @@ func main() { ### ToolReturnDirectly -If you want the Agent to directly return the Tool's Response ToolMessage after ChatModel selects a specific Tool and executes it, you can configure this Tool in ToolReturnDirectly. +If you want the Agent to directly return the Tool's Response ToolMessage after ChatModel selects and executes a specific Tool, you can configure that Tool in ToolReturnDirectly. ```go a, err = NewAgent(ctx, &AgentConfig{ @@ -278,9 +282,9 @@ a, err = NewAgent(ctx, &AgentConfig{ ### StreamToolCallChecker -Different models may output tool calls differently in streaming mode: some models (like OpenAI) output tool calls directly; some models (like Claude) output text first, then output tool calls. Therefore, different methods are needed to determine this. This field is used to specify the function that determines whether the model's streaming output contains tool calls. 
+Different models may output tool calls differently in streaming mode: some models (e.g., OpenAI) output tool calls directly; others (e.g., Claude) output text first, then tool calls. Therefore, different methods are needed to determine this. This field specifies the function for checking whether the model's streaming output contains tool calls. -Optional. If not set, the default checks whether the first "non-empty chunk" contains a tool call: +This is optional. When not specified, a "non-empty chunk" check is used by default: ```go func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -308,9 +312,9 @@ func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[ } ``` -The default implementation is suitable for: models whose Tool Call Message contains only Tool Calls. +The default implementation above is suitable for: models where Tool Call Messages contain only Tool Calls. -The default implementation is NOT suitable for: cases where there are non-empty content chunks before the Tool Call output. In such cases, you need to define a custom tool call checker: +Cases where the default implementation is not suitable: when there are non-empty content chunks before the Tool Call output. In such cases, a custom tool call checker is needed: ```go toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -334,12 +338,12 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes } ``` -The custom StreamToolCallChecker above may need to check **all chunks** for ToolCalls in extreme cases, which can cause the "streaming decision" effect to be lost. To preserve the "streaming decision" effect as much as possible, the recommendation is: +The custom StreamToolCallChecker above may, in extreme cases, need to check **all chunks** for ToolCalls, potentially losing the "streaming judgment" effect. 
If you want to preserve the "streaming judgment" effect as much as possible, the recommendation is: > 💡 -> Try adding a prompt to constrain the model not to output additional text when calling tools, for example: "If you need to call a tool, output the tool directly, do not output text." +> Try adding a prompt to constrain the model from outputting extra text during tool calls, e.g.: "If you need to call a tool, output the tool directly without any text." > -> Different models may be affected differently by prompts, so you need to adjust the prompt and verify the effect in actual use. +> Different models may respond differently to prompts. In practice, you need to adjust and verify the prompt's effect for your specific model. ## Invocation @@ -350,7 +354,7 @@ agent, _ := react.NewAgent(...) var outMessage *schema.Message outMessage, err = agent.Generate(ctx, []*schema.Message{ - schema.UserMessage("写一个 golang 的 hello world 程序"), + schema.UserMessage("Write a golang hello world program"), }) ``` @@ -361,7 +365,7 @@ agent, _ := react.NewAgent(...) var msgReader *schema.StreamReader[*schema.Message] msgReader, err = agent.Stream(ctx, []*schema.Message{ - schema.UserMessage("写一个 golang 的 hello world 程序"), + schema.UserMessage("Write a golang hello world program"), }) for { @@ -383,7 +387,7 @@ for { ### WithCallbacks -Callback is a callback executed at specific timings during Agent runtime. Since the Agent Graph only has ChatModel and ToolsNode, the Agent's Callback is the Callback for ChatModel and Tool. The react package provides a helper function to help users quickly build Callback Handlers for these two component types. +Callbacks are executed at specific moments during Agent runtime. Since the Agent Graph only contains ChatModel and ToolsNode, Agent Callbacks are essentially ChatModel and Tool callbacks. The react package provides a helper function to quickly build Callback Handlers for these two component types. 
```go import ( @@ -402,9 +406,9 @@ func BuildAgentCallback(modelHandler *template.ModelCallbackHandler, toolHandler ### Options -React agent supports dynamic modification through runtime Options. +React agent supports dynamic modification via runtime Options. -Scenario 1: Modify the Model configuration in the Agent at runtime: +Scenario 1: Modify the Agent's Model configuration at runtime: ```go // WithChatModelOptions returns an agent option that specifies model.Option for the chat model in agent. @@ -424,7 +428,7 @@ func WithToolList(tools ...tool.BaseTool) agent.AgentOption { Additionally, you also need to modify the tools bound in ChatModel: `WithChatModelOptions(model.WithTools(...))` -Scenario 3: Modify the options for a specific Tool at runtime: +Scenario 3: Modify a specific Tool's options at runtime: ```go // WithToolOptions returns an agent option that specifies tool.Option for the tools in agent. @@ -435,11 +439,11 @@ func WithToolOptions(opts ...tool.Option) agent.AgentOption { ### Prompt -Modifying the prompt at runtime is essentially passing different Message lists when calling Generate or Stream. +Modifying the prompt at runtime simply means passing different Message lists when calling Generate or Stream. -### Get Intermediate Results +### Getting Intermediate Results -If you want to get the `*schema.Message` generated during the ReAct Agent execution process in real-time, you can first obtain a runtime Option and a MessageFuture through WithMessageFuture: +If you want to receive `*schema.Message` produced during React Agent execution in real time, you can first obtain a runtime Option and a MessageFuture via WithMessageFuture: ```go // WithMessageFuture returns an agent option and a MessageFuture interface instance. @@ -470,14 +474,14 @@ func WithMessageFuture() (agent.AgentOption, MessageFuture) { } ``` -This runtime Option is passed normally to the Generate or Stream method. 
The MessageFuture can use GetMessages or GetMessageStreams to get the Messages of various intermediate states. +Pass this runtime Option to the Generate or Stream method as normal. Use the MessageFuture's GetMessages or GetMessageStreams to retrieve intermediate Messages. > 💡 -> After passing the MessageFuture Option, the Agent will still run in a blocking manner. Receiving intermediate results through MessageFuture needs to be asynchronous with the Agent running (read MessageFuture in a goroutine or run the Agent in a goroutine). +> After passing the MessageFuture Option, the Agent still runs in a blocking manner. Receiving intermediate results via MessageFuture must be asynchronous with Agent execution (read MessageFuture in a goroutine, or run the Agent in a goroutine). ## Agent In Graph/Chain -Agent can be embedded into other Graphs as a Lambda: +An Agent can be embedded into other Graphs as a Lambda: ```go agent, _ := NewAgent(ctx, &AgentConfig{ @@ -506,9 +510,9 @@ res, _ := r.Invoke(ctx, []*schema.Message{{Role: schema.User, Content: "hello"}} ## Demo -### Basic Info +### Basic Information -Description: This is a `Food Recommender` with two tools (query_restaurants and query_dishes). +Description: This is a `Food Recommendation Expert` with two tools (query_restaurants and query_dishes) Repository: [eino-examples/flow/agent/react](https://github.com/cloudwego/eino-examples/tree/main/flow/agent/react) @@ -518,14 +522,14 @@ Usage: 2. Provide an `OPENAI_API_KEY`: `export OPENAI_API_KEY=xxxxxxx` 3. 
Run the demo: `go run flow/agent/react/react.go` -### Running Process +### Execution Process -### Running Process Explanation +### Execution Process Explanation -- Simulating user input: `I'm in Haidian District, recommend some dishes for me, need some spicy dishes, recommend at least 2 restaurants` -- The agent runs the first node `ChatModel`, the LLM determines that a ToolCall needs to be made to query restaurants, with the following parameters: +- The simulated user input: `I'm in Haidian District, recommend some dishes, need some spicy options, recommend at least 2 restaurants` +- The agent runs the first node `ChatModel`. The LLM determines it needs to make a ToolCall to query restaurants, with the following parameters: ```json "function": { @@ -534,13 +538,13 @@ Usage: } ``` -- Entering the `Tools` node, calling the query_restaurants tool and getting the result. The result returns information about 2 restaurants in Haidian District: +- Enters the `Tools` node, calls the restaurant query tool, and gets results returning 2 restaurants in Haidian District: ```json -[{"id":"1001","name":"Old Place Restaurant","place":"Beijing Old Hutong 5F, turn left to enter","desc":"","score":3},{"id":"1002","name":"Human Taste Restaurant","place":"Beijing Big World Mall -1F","desc":"","score":5}] +[{"id":"1001","name":"Old Place Restaurant","place":"Beijing Old Hutong 5F, turn left","desc":"","score":3},{"id":"1002","name":"Human Flavor Restaurant","place":"Beijing Grand World Mall -1F","desc":"","score":5}] ``` -- After getting the tool result, the conversation history now contains the tool result. Running `ChatModel` again, the LLM determines that another ToolCall needs to be made to query what dishes the restaurants have. Note that since there are two restaurants, the LLM returns 2 ToolCalls as follows: +- After getting the tool results, the conversation history now contains the tool results. 
Running `ChatModel` again, the LLM determines it needs to call another ToolCall to query what dishes the restaurants have. Since there are two restaurants, the LLM returns 2 ToolCalls: ```json "Message": { @@ -567,21 +571,21 @@ Usage: } ``` -- Entering the `Tools` node again. Since there are 2 tool calls, the Tools node executes these two calls concurrently internally, and both are added to the conversation history. From the callback debug logs, you can see the results as follows: +- Enters the `Tools` node again. Since there are 2 tool calls, the Tools node executes both concurrently, adding both results to the conversation history. From the callback debug logs, the results are: ```json =========[OnToolStart]========= {"restaurant_id": "1001", "topn": 5} =========[OnToolEnd]========= -[{"name":"Braised Pork","desc":"A piece of braised pork","price":20,"score":8},{"name":"Spring Beef","desc":"Lots of boiled beef","price":50,"score":8},{"name":"Stir-fried Pumpkin","desc":"Mushy stir-fried pumpkin","price":5,"score":5},{"name":"Korean Spicy Cabbage","desc":"This is blessed spicy cabbage, very delicious","price":20,"score":9},{"name":"Hot and Sour Potato Shreds","desc":"Sour and spicy potato shreds","price":10,"score":9}] +[{"name":"Braised Pork","desc":"A piece of braised pork","price":20,"score":8},{"name":"Spring Beef","desc":"Lots of boiled beef","price":50,"score":8},{"name":"Stir-fried Baby Pumpkin","desc":"Overcooked pumpkin","price":5,"score":5},{"name":"Korean Spicy Kimchi","desc":"This blessed kimchi is really delicious","price":20,"score":9},{"name":"Hot and Sour Shredded Potatoes","desc":"Sour and spicy shredded potatoes","price":10,"score":9}] =========[OnToolStart]========= {"restaurant_id": "1002", "topn": 5} =========[OnToolEnd]========= -[{"name":"Braised Spare Ribs","desc":"Piece by piece spare ribs","price":43,"score":7},{"name":"Big Knife Twice-cooked Pork","desc":"Classic twice-cooked pork, big pieces of meat","price":40,"score":8},{"name":"Fiery 
Kiss","desc":"Cold pig snout, spicy but not greasy","price":60,"score":9},{"name":"Chili Mixed with Preserved Egg","desc":"Pounded chili preserved egg, a rice killer","price":15,"score":8}] +[{"name":"Braised Spare Ribs","desc":"Piece by piece spare ribs","price":43,"score":7},{"name":"Big Knife Twice-cooked Pork","desc":"Classic twice-cooked pork, big pieces","price":40,"score":8},{"name":"Fiery Kiss","desc":"Cold dressed pig snout, spicy but not greasy","price":60,"score":9},{"name":"Chili Pepper with Century Egg","desc":"Smashed pepper century egg, perfect with rice","price":15,"score":8}] ``` -- After getting all the tool call results, entering the `ChatModel` node again. This time the LLM finds that it has all the information needed to answer the user's question, so it integrates the information and outputs the conclusion. Since the `Stream` method was used for the call, the LLM result is returned in a streaming manner. +- After getting all tool call results, enters the `ChatModel` node again. This time the LLM finds it has all the information needed to answer the user's question. It synthesizes the information and outputs a conclusion. Since the `Stream` method was used, the LLM's result is returned in streaming fashion. 
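The concurrent tool execution step in the walkthrough above can be illustrated with a small standalone sketch. This is not Eino's ToolsNode implementation; `ToolCall`, `ToolResult`, and `runToolCalls` are hypothetical names invented for this example, and the tool itself is a stub. It only shows the general pattern: one goroutine per tool call, with results collected in call order so they can be appended to the conversation history deterministically.

```go
package main

import (
	"fmt"
	"sync"
)

// ToolCall is a hypothetical, simplified tool request emitted by the model.
type ToolCall struct {
	ID   string
	Name string
	Args string
}

// ToolResult pairs a call ID with the tool output so it can be matched to its call.
type ToolResult struct {
	CallID  string
	Content string
}

// runToolCalls runs every call in its own goroutine and returns results in call order.
func runToolCalls(tools map[string]func(args string) string, calls []ToolCall) []ToolResult {
	results := make([]ToolResult, len(calls))
	var wg sync.WaitGroup
	for i, call := range calls {
		wg.Add(1)
		go func(i int, call ToolCall) {
			defer wg.Done()
			run, ok := tools[call.Name]
			if !ok {
				results[i] = ToolResult{CallID: call.ID, Content: "unknown tool: " + call.Name}
				return
			}
			// Each goroutine writes only to its own slot, so no mutex is needed.
			results[i] = ToolResult{CallID: call.ID, Content: run(call.Args)}
		}(i, call)
	}
	wg.Wait()
	return results
}

func main() {
	// Stub tool standing in for the demo's query_dishes tool.
	tools := map[string]func(args string) string{
		"query_dishes": func(args string) string { return "dishes for " + args },
	}
	calls := []ToolCall{
		{ID: "call_1", Name: "query_dishes", Args: `{"restaurant_id": "1001", "topn": 5}`},
		{ID: "call_2", Name: "query_dishes", Args: `{"restaurant_id": "1002", "topn": 5}`},
	}
	for _, r := range runToolCalls(tools, calls) {
		fmt.Println(r.CallID, "=>", r.Content)
	}
}
```

Writing each result into a pre-sized slice slot keeps the output order stable regardless of which goroutine finishes first, which matters when the results are fed back to the model as history.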
## Related Reading -- [Eino Tutorial: Host Multi-Agent ](/docs/eino/core_modules/flow_integration_components/multi_agent_hosting) +- [Eino Tutorial: Host Multi-Agent](/docs/eino/core_modules/flow_integration_components/multi_agent_hosting) diff --git a/content/en/docs/eino/ecosystem_integration/_index.md b/content/en/docs/eino/ecosystem_integration/_index.md index 7c831883a7e..39e602db695 100644 --- a/content/en/docs/eino/ecosystem_integration/_index.md +++ b/content/en/docs/eino/ecosystem_integration/_index.md @@ -1,67 +1,8 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Component Integration' -weight: 6 +title: Component Integration +weight: 5 --- - -## Component Integration - -### ChatModel - -- openai: [OpenAI](/docs/eino/ecosystem_integration/chat_model/agentic_model_openai) -- ark: [ARK](/docs/eino/ecosystem_integration/chat_model/agentic_model_ark) -- More components: [ChatModel component list](/docs/eino/ecosystem_integration/chat_model) - -### Document - -#### Loader - -- file: [Loader - local file](/docs/eino/ecosystem_integration/document/loader_local_file) -- s3: [Loader - amazon s3](/docs/eino/ecosystem_integration/document/loader_amazon_s3) -- web url: [Loader - web url](/docs/eino/ecosystem_integration/document/loader_web_url) - -#### Parser - -- html: [Parser - html](/docs/eino/ecosystem_integration/document/parser_html) -- pdf: [Parser - pdf](/docs/eino/ecosystem_integration/document/parser_pdf) - -#### Transformer - -- markdown splitter: [Splitter - markdown](/docs/eino/ecosystem_integration/document/splitter_markdown) -- recursive splitter: [Splitter - recursive](/docs/eino/ecosystem_integration/document/splitter_recursive) -- semantic splitter: [Splitter - semantic](/docs/eino/ecosystem_integration/document/splitter_semantic) - -### Embedding - -- ark: [Embedding - ARK](/docs/eino/ecosystem_integration/embedding/embedding_ark) -- openai: [Embedding - 
OpenAI](/docs/eino/ecosystem_integration/embedding/embedding_openai) - -### Indexer - -- volc vikingdb: [Indexer - volc VikingDB](/docs/eino/ecosystem_integration/indexer/indexer_volc_vikingdb) -- Milvus 2.5+: [Indexer - Milvus 2 (v2.5+)](/docs/eino/ecosystem_integration/indexer/indexer_milvusv2) -- Milvus 2.4: [Indexer - Milvus](/docs/eino/ecosystem_integration/indexer/indexer_milvus) -- OpenSearch 3: [Indexer - OpenSearch 3](/docs/eino/ecosystem_integration/indexer/indexer_opensearch3) -- OpenSearch 2: [Indexer - OpenSearch 2](/docs/eino/ecosystem_integration/indexer/indexer_opensearch2) -- ElasticSearch 9: [Indexer - Elasticsearch 9](/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch9) -- Elasticsearch 8: [Indexer - ES8](/docs/eino/ecosystem_integration/indexer/indexer_es8) -- ElasticSearch 7: [Indexer - Elasticsearch 7 ](/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch7) - -### Retriever - -- volc vikingdb: [Retriever - volc VikingDB](/docs/eino/ecosystem_integration/retriever/retriever_volc_vikingdb) -- Milvus 2.5+: [Retriever - Milvus 2 (v2.5+) ](/docs/eino/ecosystem_integration/retriever/retriever_milvusv2) -- Milvus 2.4: [Retriever - Milvus](/docs/eino/ecosystem_integration/retriever/retriever_milvus) -- OpenSearch 3: [Retriever - OpenSearch 3](/docs/eino/ecosystem_integration/retriever/retriever_opensearch3) -- OpenSearch 2: [Retriever - OpenSearch 2](/docs/eino/ecosystem_integration/retriever/retriever_opensearch2) -- ElasticSearch 9: [Retriever - Elasticsearch 9](/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch9) -- ElasticSearch 8: [Retriever - ES8](/docs/eino/ecosystem_integration/retriever/retriever_es8) -- ElasticSearch 7: [Retriever - ES 7](/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch7) - -### Tools - -- googlesearch: [Tool - Googlesearch](/docs/eino/ecosystem_integration/tool/tool_googlesearch) -- duckduckgo search: [Tool - 
DuckDuckGoSearch](/docs/eino/ecosystem_integration/tool/tool_duckduckgo_search) diff --git a/content/en/docs/eino/overview/eino_adk_quickstart.md b/content/en/docs/eino/overview/eino_adk_quickstart.md new file mode 100644 index 00000000000..c4134ebb781 --- /dev/null +++ b/content/en/docs/eino/overview/eino_adk_quickstart.md @@ -0,0 +1,255 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Get Started with Eino ADK in 5 Minutes +weight: 9 +--- + +This guide is for developers already familiar with Eino, focusing on the most important autonomous decision-making primitive in ADK: **ChatModelAgent** and its runtime enhancement mechanism **ChatModelAgentMiddleware**. + +## First, Meet ChatModelAgent + +When we talk about "Agent," we almost always mean: an entity powered by a large model at its core, equipped with tools, capable of autonomous decision-making and solving complex real-world problems. `ChatModelAgent` is Eino ADK's direct implementation of this concept. + +**ChatModelAgent = A ReAct Agent that uses ChatModel as its decision maker, Tools as its action space, and tool feedback plus history as the context for the next decision.** + +Four key components: + +1. **ChatModel**: The large model, responsible for reasoning and decision-making. +2. **Tools**: The tool collection, defining the range of actions the Agent can take. +3. **Feedback**: Tool execution results flow back to the model context, serving as the basis for the next decision. +4. **History**: Complete preservation of the reasoning traces, tool calls, and tool results during problem-solving. + +Therefore, `ChatModelAgent` is not a single model call, but a sustained problem-solving process. + +## ChatModelAgent's Execution Structure: ReAct Loop + +The core capability of `ChatModelAgent` is **autonomous decision-making** — in a single `Run`, the model can repeatedly reason, act, and receive feedback until the problem is solved. 
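To make the reason, act, feedback cycle concrete, here is a minimal standalone sketch of a ReAct-style loop. It does not use the Eino ADK API; `Message`, `Model`, `Tool`, `RunReAct`, and the scripted model are all hypothetical names made up for illustration, and a real `ChatModelAgent` handles far more (streaming, middleware, interrupts).

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role     string // "user", "assistant", or "tool"
	Content  string
	ToolCall string // tool the model wants to run; empty means "final answer"
	ToolArg  string
}

// Model is the decision maker: it sees the full history and returns the next message.
type Model func(history []Message) Message

// Tool is one entry in the action space.
type Tool func(arg string) string

// RunReAct repeats reason -> act -> feedback until the model stops requesting tools
// or maxIterations is reached. It returns the final answer and the accumulated history.
func RunReAct(model Model, tools map[string]Tool, query string, maxIterations int) (string, []Message, error) {
	history := []Message{{Role: "user", Content: query}}
	for i := 0; i < maxIterations; i++ {
		reply := model(history) // reason over the accumulated context
		history = append(history, reply)
		if reply.ToolCall == "" {
			return reply.Content, history, nil // no tool requested: final answer
		}
		tool, ok := tools[reply.ToolCall]
		if !ok {
			return "", history, fmt.Errorf("unknown tool %q", reply.ToolCall)
		}
		// Act, then feed the result back into the context for the next decision.
		history = append(history, Message{Role: "tool", Content: tool(reply.ToolArg)})
	}
	return "", history, fmt.Errorf("no final answer after %d iterations", maxIterations)
}

// scriptedModel is a toy decision maker: it requests the "upper" tool once,
// then answers using the tool feedback found at the end of the history.
func scriptedModel(history []Message) Message {
	last := history[len(history)-1]
	if last.Role == "tool" {
		return Message{Role: "assistant", Content: "result: " + last.Content}
	}
	return Message{Role: "assistant", ToolCall: "upper", ToolArg: last.Content}
}

// demoTools returns a one-tool action space for the example.
func demoTools() map[string]Tool {
	return map[string]Tool{"upper": strings.ToUpper}
}

func main() {
	answer, history, err := RunReAct(scriptedModel, demoTools(), "hello", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(answer, "| messages in history:", len(history))
}
```

The point of the sketch is that the model is called with the whole `history` on every iteration, so each decision builds on all prior reasoning and tool feedback rather than on the original query alone.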
The execution structure supporting this capability is the ReAct Loop. + +Autonomous decision-making requires four elements to coexist: + +1. **Decision Maker (ChatModel)**: Each iteration judges what to do next based on the current context. +2. **Action Space (Tools)**: Defines the concrete actions the Agent can take. +3. **Feedback Signal (Tool Feedback)**: Action results are injected into the context, serving as the basis for subsequent decisions — this allows the Agent to correct course based on actual execution results rather than guessing everything in one shot. +4. **Accumulated Context (History)**: Complete preservation of reasoning traces, tool calls, and results. Each iteration, the model sees not an isolated query, but the complete problem-solving process from start to current. + +All four are indispensable: without a decision maker there's no reasoning, without an action space there's no execution, without feedback there's no correction, without accumulated context there's no better judgment based on history. + + + +Key characteristic: **Progressive decision-making driven by accumulated context**. Each loop iteration doesn't start from scratch, but continues on top of the complete trace of all prior reasoning and actions. Every model decision is made based on a continuously growing problem-solving context, enabling the Agent to handle complex tasks requiring multi-step reasoning, trial-and-error, and correction. + +## What Makes Your ChatModelAgent Different + +The structure of the ReAct Loop is fixed. So what makes **your** ChatModelAgent different from others, tailored to your specific problem? + +Four dimensions: + +1. **ChatModel** — Which model to use for decision-making. +2. **Instruction** — System instruction: role definition, behavioral constraints, few-shot examples. +3. **Tools** — Tool collection: determines what the Agent can do. +4. 
**Middleware (ChatModelAgentMiddleware)** — Injects behavior at specific lifecycle points in the ReAct Loop: intercepting, modifying, and enhancing inputs and outputs within the loop. + +The first three define what the Agent "is" — decision capability, role constraints, action scope. + +Middleware defines "how the Agent runs" — it doesn't change the Loop structure (reason → act → feedback remains unchanged), but controls runtime behavior within the loop. For example: compressing context before model calls, dynamically injecting tools before execution, performing permission checks on tool calls, retrying or switching to backup models on failure. These are all runtime enhancements injected at specific points in the Loop. + +## Middleware: Injecting Behavior into the ReAct Loop + +When building a ChatModelAgent, you'll encounter these typical problems: + +- **Agent needs to read/write files, execute commands?** → Need to inject a set of general-purpose tools before execution. +- **Agent needs to reuse predefined instructions and knowledge?** → Need to package reusable capabilities as Skills, loaded on demand. +- **Context keeps growing, exceeding model window?** → Need to automatically compress history before each model call. +- **Too many tools, putting them all in the prompt dilutes attention?** → Need to search and load tools on demand. +- **Model occasionally fails or returns garbage?** → Need automatic retry or switch to backup model. + +The common thread: they don't need to change the ReAct Loop structure, only intercept and enhance at specific points in the loop. That's what Middleware does. + +Corresponding built-in Middlewares: + + + + + + + + +
| Scenario | Middleware | What It Does |
| --- | --- | --- |
| Need filesystem capabilities | FileSystem | Injects ls/read/write/edit/grep/execute tools before execution |
| Reuse predefined capabilities | Skill | Packages instructions, knowledge, tools as skill units loadable on demand |
| Context exceeds window | Reduction / Summarization | Compresses messages and tool results before model call |
| Too many tools | ToolSearch | Searches and loads Tools on demand rather than exposing all at once |
| Unstable model calls | ModelRetry / ModelFailover | Retry / failover at the individual model call level |
    + +Each Middleware implementation injects at a specific hook point in the ReAct Loop. The diagram below shows where each `ChatModelAgentMiddleware` hook sits in the loop: + + + +Hook point summary: + + + + + + + + + +
| Hook Point | Timing | Typical Use |
| --- | --- | --- |
| `BeforeAgent` | Before Agent runs (once only) | Enhance Instruction, inject Tools |
| `BeforeModelRewriteState` | Before each model call | Modify Messages / ToolInfos |
| `AfterModelRewriteState` | After each model call | Modify model response or patch state |
| `WrapModel` | Individual model call level | Retry, failover, rewrite model output |
| `WrapToolCall` | Individual tool call level | Permissions, safety, output rewriting |
| `AfterAgent` | After Agent succeeds | Post-processing, state cleanup |
    + +See the appendix at the end for a complete Middleware quick reference. + +## Quick Start: Create and Run a ChatModelAgent + +`Runner` is the entry point for executing an Agent. It converts a user request into an Agent run, handling single-run configuration, event stream output, streaming toggle, and runtime capabilities like checkpoint/resume. The minimal usage is: put a `ChatModelAgent` into `RunnerConfig`, then call `Query` or `Run`. + +The following example shows how to create a minimal ChatModelAgent and execute it via Runner: + +```go +package main + +import ( + "context" + "fmt" + "log" + + "github.com/cloudwego/eino-ext/components/model/ark" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" +) + +func main() { + ctx := context.Background() + + // 1. Create ChatModel + chatModel, err := ark.NewChatModel(ctx, &ark.ChatModelConfig{ + Model: "doubao-seed-1-8-251228", + APIKey: "your_api_key", // Replace with your API Key + }) + if err != nil { + log.Fatal(err) + } + + // 2. Create ChatModelAgent + agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-assistant", + Description: "An assistant that can use tools to answer questions.", + Instruction: "You are a helpful assistant. Please answer user questions using available tools.", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{ + // Register your tools, e.g. webSearchTool + }, + }, + }, + // Handlers: []adk.ChatModelAgentMiddleware{...}, // Register Middleware + }) + if err != nil { + log.Fatal(err) + } + + // 3. Execute Agent via Runner + runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + }) + + // 4. 
Send user request and consume event stream + iter := runner.Query(ctx, "Help me search for today's news") + for { + event, ok := iter.Next() + if !ok { + break + } + fmt.Println(event) + } +} +``` + +Core flow: `NewChatModelAgent` → `NewRunner` → `Runner.Query/Run` → consume `AsyncIterator` event stream. + +For more basic examples, see: [Eino: Quick Start](/docs/eino/quick_start). + +## Further Reading: DeepAgents + +DeepAgents is a pre-built ChatModelAgent whose core value lies in two preset Middlewares: + +- **WriteTodos (PlanTask)**: Lets the main Agent explicitly plan a task list before execution and continuously track progress during execution. Complex problems are no longer solved by the model "thinking everything through at once," but by decomposing first, then advancing step by step. +- **TaskTool**: Lets the main Agent delegate subtasks to sub-Agents for independent execution; sub-Agents complete their work and report results back to the main loop. This allows a single Agent's capability boundaries to be extended through composition. + +Additionally, DeepAgents comes with preset system prompts and an optional FileSystem Middleware, ready to handle scenarios requiring task planning and multi-Agent collaboration out of the box. + +``` +DeepAgents = ChatModelAgent + + WriteTodos (task planning and tracking) + + TaskTool (subtask delegation) + + Optional FileSystem + + Preset system prompts +``` + +Further reading: + +- Eino ADK Deep Agents Complete Guide: [Eino ADK: DeepAgents](/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) +- DeepAgents Examples: [eino-examples/adk/multiagent/deep at main · cloudwego/eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) + +## Further Reading: Why Not Continue Using flow/react? + +Back to first principles: Graph and Agent are two fundamentally different AI application paradigms. 
+ +- **Graph's** core is **determinism**: developers predefine topology, and flow relationships between nodes are determined at compile time. Input is structured, output is predictable. +- **Agent's** core is **autonomy**: the LLM dynamically decides the next action at runtime, execution paths are unpredictable, and output is a full-process event stream. + +`flow/react` essentially uses Graph to "simulate" an Agent — unfolding the ReAct reasoning loop into static nodes and edges. This works, but is fundamentally a mismatch: using deterministic orchestration to carry dynamic decision-making. As Agent complexity grows, this mismatch creates systemic problems: + +1. **Deliverable mismatch**: Graph targets "final results," while an Agent's deliverable is the entire process (reasoning traces, intermediate tool calls, state changes). Using Graph for Agents requires extracting intermediate processes through sidechannels like Callbacks — feasible but a workaround. +2. **Execution model mismatch**: Graph is a synchronous execution model, while Agents are naturally async long-running processes. Event stream output, checkpoint/resume, interrupt recovery, and other runtime capabilities need unified framework management at the Agent level, not scattered across Graph node callbacks. +3. **Extension point mismatch**: Agent runtime enhancements (context compression, dynamic tool loading, model retry, safety controls) are fundamentally interceptions and injections into the decision loop. In Graph, these capabilities have no unified mount point and can only be scattered across nodes or edges; in ChatModelAgent, they have clear lifecycle hooks (Middleware). + +Therefore, flow/react isn't deprecated — it returns to its best-fit position: **deterministic process orchestration**. When the core problem is "autonomous decision-making + runtime enhancement," the correct abstraction is `ChatModelAgent + ChatModelAgentMiddleware`. + +Further reading: + +- Agent or Graph? 
AI Application Route Analysis: [Agent or Graph? AI Application Route Analysis](/docs/eino/overview/graph_or_agent) + + + +## Appendix: Middleware Quick Reference + +### Instance Overview + + + + + + + + + + + + + + + + + +
| Middleware | Description |
| --- | --- |
| Reduction | Truncates overly long tool outputs / writes to filesystem to prevent token overflow |
| Summarization | Compresses history messages via summarization |
| Skill | Reusable instructions/knowledge exposed as Tools, loaded on demand by Agent |
| FileSystem | ls/read/write/edit/glob/grep/execute file operation toolset |
| ToolSearch | `tool_search` meta-tool for on-demand tool discovery (reduces resident tool list size) |
| PatchToolCall | Patches dangling tool calls in message history (missing tool results) |
| SafeTool | WrapToolCall-level interception of tool execution errors, converted to readable text for the model so Agent can self-correct instead of crashing |
| ModelRetry | Retries failed model calls with configurable strategy [built-in config] |
| ModelFailover | Switches to backup model on failure [built-in config] |
| AgentsMD | Injects Agents.md knowledge files into model context to improve context quality |
| PlanTask | Persistent task management toolset (create/get/update/list) with dependency tracking |
| WriteTodos | Lightweight TODO list tool; Agent can create and track structured to-do items [DeepAgent built-in] |
| TaskTool | Sub-Agent delegation tool; main Agent delegates subtasks to sub-Agents for independent execution [DeepAgent built-in] |
| Permission | Tool call permission control [WIP] |
    + +> Note: ModelRetry / ModelFailover are built-in fields of `ChatModelAgentConfig` (`ModelRetryConfig` / `ModelFailoverConfig`) in code, conceptually corresponding to the `WrapModel` hook. SafeTool is an example pattern (see ChatWithEino ch05), implemented as user-defined Middleware. WriteTodos / TaskTool are DeepAgent built-ins, not exported separately. Permission is a planned capability. + +### Categories + + + + + + + + +
| Category | Problem Solved | Includes |
| --- | --- | --- |
| Extend General Tools | Give Agent more capabilities | FileSystem, Skill, ToolSearch, PlanTask, WriteTodos, TaskTool |
| Handle Errors During ReAct | Improve reliability | ModelRetry, ModelFailover, SafeTool, PatchToolCall |
| Keep Context Within Window Limit | Prevent token overflow | Reduction, Summarization, ToolSearch |
| Safety and Permissions | Constrain Agent behavior | Permission |
| Improve Context Content Quality | Give model better context | Skill, AgentsMD |
    + +ToolSearch spans two categories: it's both "Extend Tools" (providing on-demand tool discovery) and "Keep Context Within Window" (avoiding loading too many tool descriptions at once). + +Further reading: + +- ChatModelAgent Middleware Deep Dive: [Eino ADK: ChatModelAgentMiddleware](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) diff --git a/content/en/docs/eino/overview/graph_or_agent.md b/content/en/docs/eino/overview/graph_or_agent.md index 000dcaef6cb..00ceecceb9f 100644 --- a/content/en/docs/eino/overview/graph_or_agent.md +++ b/content/en/docs/eino/overview/graph_or_agent.md @@ -1,22 +1,22 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: Agent or Graph? AI Application Path Analysis +title: "Agent or Graph? Analyzing AI Application Approaches" weight: 8 --- ## Introduction: Two Coexisting AI Interaction Paradigms -Many application interfaces have integrated different forms of AI capabilities, as shown below: +Many application interfaces integrate different forms of AI capabilities, as shown below: This seemingly simple screenshot represents two forms of "AI applications": -- The "Agent" represented by the "chat box". **Agents use LLM (Large Language Model) as the decision center, autonomously plan and can conduct multi-turn interactions**, naturally suited for handling open-ended, continuous tasks, manifesting as a "dialogue" form. -- The "Graph" represented by "buttons" or "APIs". For example, the "Recording Summary" button above - the Graph behind it is roughly "Recording" → "LLM understands and summarizes" → "Save recording" - this kind of fixed process. **The core of Graph lies in the determinism of its process and the closure of tasks**, completing specific goals through predefined nodes and edges, manifesting as a "function" form. For example, video generation is an "API" form AI application: +- The "Agent" represented by the "chat box." 
**An Agent uses an LLM (Large Language Model) as its decision center, autonomously plans, and can conduct multi-turn interactions**, naturally suited for open-ended, ongoing tasks, manifesting as a "conversation" form. +- The "Graph" represented by "buttons" or "APIs." For example, a "Meeting Summary" button above, whose underlying Graph roughly follows a fixed flow of "recording" → "LLM understanding and summarization" → "save recording." **The core of a Graph lies in the determinism of its flow and the bounded nature of its tasks**, completing specific goals through predefined nodes and edges, manifesting as a "feature" form. For example, video generation is an "API-form" AI application: @@ -39,8 +39,8 @@ flowchart TD G2("Deterministic Output") S --> D - D -->|"Open or Uncertain"| A - D -->|"Closed and Deterministic"| G + D -->|"Open-ended or uncertain"| A + D -->|"Bounded and deterministic"| G A --> A1 A --> A2 G --> G1 @@ -51,29 +51,29 @@ flowchart TD class A,G,A1,A2,G1,G2 process_style ``` -This article explores in detail the differences and connections between Agent and Graph, two forms of AI applications, proposes that "the best integration point is to encapsulate Graph as Agent's Tool", and provides recommended usage patterns for [Eino](https://github.com/cloudwego/eino) developers. +This article explores in detail the differences and connections between the Agent and Graph forms of AI applications, proposes that "the best integration point for both is to wrap a Graph as an Agent's Tool," and provides recommended usage patterns for [Eino](https://github.com/cloudwego/eino) developers. ## Core Concept Analysis ### Basic Definitions -- **Graph**: A flowchart **predefined** by developers with a clear topology. Its nodes can be code functions, API calls, or LLMs, and inputs and outputs are typically structured. **The core characteristic is "determinism"** - given the same input, the execution path and final output are predictable. 
-- **Agent**: An entity centered on LLM that can **autonomously plan, decide, and execute** tasks. It completes goals through **dynamic interaction** with the environment (Tools, users, other Agents), and its behavior is uncertain. **The core characteristic is "autonomy"**. -- **Tool**: Any external capability that an Agent can call, typically a **function or API that encapsulates specific functionality**. Tools themselves can be synchronous or asynchronous, stateful or stateless. They are only responsible for execution and do not have autonomous decision-making capabilities. -- **Orchestration**: The process of **organizing and coordinating multiple compute units (nodes, Agents) to work together**. In this article, it specifically refers to predefining static processes through Graphs. +- **Graph**: A flowchart **predefined** by developers with a clear topological structure. Its nodes can be code functions, API calls, or LLMs, and inputs/outputs are typically structured. **The core characteristic is "determinism"**—given the same input, the execution path and final output are predictable. +- **Agent**: An entity with an LLM at its core that can **autonomously plan, make decisions, and execute** tasks. It accomplishes goals through **dynamic interaction** with the environment (Tools, users, other Agents), and its behavior is non-deterministic. **The core characteristic is "autonomy."** +- **Tool**: Any external capability that an Agent can call, typically a **function or API that encapsulates specific functionality**. A Tool can be synchronous or asynchronous, stateful or stateless. It is only responsible for execution and does not possess autonomous decision-making capability. +- **Orchestration**: The process of **organizing and coordinating multiple computational units (nodes, Agents) to work together**. In this article, it specifically refers to predefining static flows through Graph. -### Deep Comparison +### In-depth Comparison - - + + - - - + + +
-| Feature Dimension | Agent | Graph |
-| Core Driver | LLM Autonomous Decision | Developer Preset Process |
+| Dimension | Agent | Graph |
+| Core Driver | LLM autonomous decision-making | Developer-preset flow |
 | Input | Unstructured natural language, images, etc. | Structured data |
-| Deliverable | Process and result equally important | Focused on final result |
-| State Management | Long-term, cross-execution | Single execution, stateless |
-| Runtime Mode | Tends toward asynchronous | Tends toward synchronous |
+| Deliverable | Process and result equally important | Focus on final result |
+| State Management | Long-duration, cross-execution | Single execution, stateless |
+| Execution Mode | Tends toward asynchronous | Tends toward synchronous |
    -Summary: Agent can be considered autonomous, driven overall by LLM, using external capabilities in the form of Tool Calls. Graph is deterministic, connecting external capabilities with a clear topology, while locally utilizing LLM for decision-making/generation. +Summary: An Agent can be considered autonomous, driven overall by an LLM, using external capabilities in the form of Tool Calls. A Graph is deterministic, linking external capabilities with a clear topological structure while locally leveraging LLMs for decision-making/generation. ```mermaid flowchart TD @@ -82,18 +82,18 @@ flowchart TD Graph["Graph (Determinism)"] end - subgraph CoreDrive["Source of Intelligence"] + subgraph CoreDrive["Intelligence Source"] LLM["LLM (Decision/Generation)"] end - subgraph ExternalCap["External Capability"] - Tool["External Capability
    (Function/API)"] + subgraph ExternalCap["External Capabilities"] + Tool["External Capability
    (Function/API)"] end - Agent -- "Source of drive" --> LLM - Graph -- "Contains node" --> LLM + Agent -- "Driving force" --> LLM + Graph -- "Contains as node" --> LLM Agent -- "Tool call" --> Tool - Graph -- "Contains node" --> Tool + Graph -- "Contains as node" --> Tool classDef agent fill:#EAE2FE,stroke:#000000 classDef graphClass fill:#F0F4FC,stroke:#000000 @@ -108,7 +108,7 @@ flowchart TD ## Historical Perspective: From Determinism to Autonomy -When the Langchain framework was first released in 2022, the LLM world's API paradigm was still OpenAI's [Completions API](https://platform.openai.com/docs/guides/completions), a simple "text in, text out" API. At launch, Langchain's slogan was "[connect LLMs to external sources of computation and data](https://blog.langchain.com/langchain-second-birthday/)". A typical "Chain" might look like this: +When the Langchain framework was first released in 2022, the LLM world's API paradigm was still OpenAI's [Completions API](https://platform.openai.com/docs/guides/completions)—a simple "text in, text out" API. At launch, Langchain's motto was "[connect LLMs to external sources of computation and data](https://blog.langchain.com/langchain-second-birthday/)." A typical "Chain" might look like: ```mermaid flowchart LR @@ -119,51 +119,51 @@ flowchart LR S-->L-->P ``` -Subsequently, the [ReAct](https://react-lm.github.io/) (Reasoning and Acting) paradigm was proposed, systematically demonstrating for the first time how LLMs can not only generate text but also interact with the external world through "think-act-observe" loops to solve complex problems. This breakthrough laid the theoretical foundation for Agent's autonomous planning capabilities. Almost simultaneously, OpenAI launched the [ChatCompletions API](https://platform.openai.com/docs/api-reference/chat), driving the transformation of LLM interaction capabilities from "single text input/output" to "multi-turn dialogue". 
Then [Function Calling](https://platform.openai.com/docs/guides/function-calling) capability emerged, giving LLMs standard capabilities to interact with external functions and APIs. At this point, we could already build "multi-turn dialogue with autonomous external interaction" LLM application scenarios, i.e., Agents. In this context, AI application frameworks saw two important developments: +Subsequently, the [ReAct](https://react-lm.github.io/) (Reasoning and Acting) paradigm was proposed, systematically demonstrating for the first time how LLMs could not only generate text but also interact with the external world through "think-act-observe" loops to solve complex problems. This breakthrough laid the theoretical foundation for Agent autonomous planning capabilities. Almost simultaneously, OpenAI launched the [ChatCompletions API](https://platform.openai.com/docs/api-reference/chat), driving LLM interaction capabilities from "single text input/output" toward "multi-turn conversations." Then [Function Calling](https://platform.openai.com/docs/guides/function-calling) appeared, giving LLMs a standardized ability to interact with external functions and APIs. At this point, we could build "multi-turn conversation with autonomous external interaction" LLM application scenarios—i.e., Agents. Against this backdrop, AI application frameworks saw two important developments: -- Langchain launched Langgraph: Static orchestration evolved from simple input/output Chains to complex topologies. This type of orchestration framework fits well with "Graph" type AI application forms: "arbitrary" structured inputs, with "final result" as the core deliverable, decoupling message history and other state management mechanisms from core orchestration logic, supporting flexible orchestration of various topologies, and various nodes/components represented by LLMs and knowledge bases. -- Agent and Multi-Agent frameworks emerged in large numbers: such as AutoGen, CrewAI, Google ADK, etc. 
The common thread among these Agent frameworks is attempting to solve problems like "LLM-driven processes", "context passing", "memory management", and "Multi-Agent common patterns", which are different from the "connecting LLMs with external systems in complex processes" problem that orchestration frameworks try to solve. +- Langchain launched Langgraph: static orchestration evolved from simple input/output Chains to complex topological structures. Such orchestration frameworks are highly suited to "Graph" type AI application forms: "arbitrary" structured input, with "final results" as the core deliverable, decoupling message history and other state management mechanisms from core orchestration logic, supporting flexible orchestration of various topological structures, and various nodes/components represented by LLMs and knowledge bases. +- Agent and Multi-Agent frameworks emerged in large numbers: such as AutoGen, CrewAI, Google ADK, etc. The common thread among these Agent frameworks is their attempt to solve problems like "LLM-driven flow," "context propagation," "memory management," and "Multi-Agent common patterns"—different from the orchestration framework's problem of "connecting LLMs with external systems in complex flows." -Even with different positioning, orchestration frameworks can implement ReAct Agents or other Multi-Agent patterns, because "Agent" is a special form of "LLM interacting with external systems", and "LLM-driven processes" can be implemented through "static branch enumeration" and other methods. However, this implementation is essentially a "simulation", like writing code in Word - possible, but not a good fit. Orchestration frameworks were originally designed to manage deterministic Graphs, while the core of Agents is responding to dynamically changing "chains of thought". Forcing the latter to adapt to the former will inevitably produce "mismatches" in deliverables, runtime modes, etc. 
For example, in actual use, you might encounter some pain points: +Even with different positioning, orchestration frameworks can implement ReAct Agents or other Multi-Agent patterns because "Agent" is a special form of "LLM interacting with external systems," and "LLM-driven flow" can be achieved through "static branch enumeration." However, this implementation is essentially a "simulation"—like writing code in Word: possible, but not a good fit. Orchestration frameworks are designed to manage deterministic Graphs, while the Agent's core is responding to dynamic "chains of thought." Forcing the latter to fit the former inevitably creates "mismatches" in deliverables, execution modes, etc. For example, in practice, some pain points may arise: -- Deliverable mismatch: The output of an orchestrated ReAct Agent is the "final result", while actual applications often focus on various intermediate processes. Callbacks and other solutions can solve this - complete enough, but still a "patch". +- Deliverable mismatch: The output of an orchestrated ReAct Agent is the "final result," but real applications often focus on various intermediate processes. Callback-based solutions can address this—comprehensive, but still "patches." ```mermaid flowchart LR A[ReAct Agent] - P@{ shape: processes, label: "Full Process Data" } + P@{ shape: processes, label: "Full process data" } A--o|Focus on|P G[Graph] - F[Final Result] - G-->|Main flow output,
    but covered by side-channel output|F + F[Final result] + G-->|Main flow output,
    but covered by side output|F - G-.->|Side-channel extraction|P + G-.->|Side extraction|P ``` -- Runtime mode mismatch: Due to synchronous execution, "to display LLM replies to users as quickly as possible", nodes within ReAct Agent orchestration need to be as "fast" as possible. This mainly means that in the branch judgment logic of "whether LLM output contains ToolCall", decisions should be made based on the first packet or first few packets as much as possible. This branch judgment logic can be customized, such as "read streaming output until Content is seen, then determine no ToolCall", but sometimes it cannot completely solve the problem, and callbacks are used as a "side-channel" to manually switch from "synchronous" to "asynchronous". +- Execution mode mismatch: Due to synchronous execution, "displaying the LLM's response to the user as quickly as possible" requires that each node within the ReAct Agent orchestration be as "fast" as possible. This mainly means that in the branch logic determining "whether the LLM's output contains a ToolCall," the judgment should be made based on the first packet or first few packets. This branch logic can be customized, for example "read the streaming output until Content is seen, then determine no ToolCall," but this doesn't always fully solve the problem—manual switching from "synchronous" to "asynchronous" via "side-channel" Callbacks is sometimes needed. ```mermaid flowchart LR L[LLM Node] - S@{ shape: processes, label: "Streaming Content"} - L-->|Generate|S + S@{ shape: processes, label: "Streaming content"} + L-->|Generates|S - B{Contains
    Tool Call?} - D@{ shape: processes, label: "Streaming Content"} + B{Contains
    tool call?} + D@{ shape: processes, label: "Streaming content"} - B-->|No, display on screen|D + B-->|No, display to user|D S-->|Frame-by-frame check|B ``` -These pain points stem from the essential differences between the two. A framework designed for deterministic processes (Graph) has difficulty natively supporting an autonomous system (Agent) centered on dynamic "chains of thought". +These pain points stem from the fundamental difference between the two. A framework designed for deterministic flows (Graph) cannot natively support an autonomous system (Agent) with dynamic "chains of thought" at its core. -## Exploring Integration Paths: The Relationship Between Agent and Graph +## Exploring Integration: The Relationship Between Agent and Graph -The goal of the Eino framework is to support both Graph and Agent scenarios. Our evolution path started with Graph and orchestration framework (eino-compose), and introduced relatively independent Agent capabilities (eino-adk) outside the orchestration framework. This may seem like an unnecessary split, as if "Eino as an orchestration framework" and "Eino as an Agent framework" are independent of each other, with development experience not being shareable. The current situation is indeed so, and in the long term the "relatively independent" state will continue, but there will also be deep integration in some areas. +The Eino framework aims to support both Graph and Agent scenarios. Our evolution path started with Graph and orchestration framework (eino-compose), and introduced relatively independent Agent capabilities (eino-adk) outside the orchestration framework. This may seem unnecessarily fragmented—as if "Eino as an orchestration framework" and "Eino as an Agent framework" are independent, with no shared development experience. The current situation is indeed so; in the long term, the "relatively independent" state will persist, but there will also be localized deep integration. 
-Below we analyze the specific relationship between "Agent" and "Graph" in the Eino framework from three perspectives: +Below we analyze the specific relationship between "Agent" and "Graph" in the Eino framework from three angles: - Multi-Agent orchestration - Agent as a node @@ -171,31 +171,31 @@ Below we analyze the specific relationship between "Agent" and "Graph" in the Ei ### Multi-Agent and Orchestration -Although "Agent" and "Graph" have essential differences, are there scenarios that belong to the "intersection" of the two forms, where you can't make a black-or-white choice? A typical scenario is Multi-Agent, where multiple Agents interact in "some way", presenting to users as a complete Agent. Can this "interaction method" be understood as "Graph orchestration"? +Although "Agent" and "Graph" are fundamentally different, are there scenarios that represent a "crossover fusion" where a binary choice cannot be made? A typical scenario is Multi-Agent, where multiple Agents interact in "some manner," presenting to the user as a complete Agent. Can this "manner of interaction" be understood as "Graph orchestration"? -Let's observe several mainstream collaboration patterns: +Let's examine several mainstream collaboration patterns: -- Hierarchical invocation (Agent as Tool): This is the most common pattern (see Google ADK's [definition](https://google.github.io/adk-docs/agents/multi-agents/#c-explicit-invocation-agenttool) and [examples](https://google.github.io/adk-docs/agents/multi-agents/#hierarchical-task-decomposition)). A top-level Agent delegates specific subtasks to specialized "Tool Agents". For example, a main Agent is responsible for interacting with users, and when code execution is needed, it calls a "code execution Agent". In this pattern, sub-Agents are usually stateless, don't share memory with the main Agent, and their interaction is a simple Function Call. 
There is only one relationship between the top-level Agent and sub-Agents: caller and callee. Therefore, we can conclude that the Agent as Tool Multi-Agent pattern is not the "node flow" relationship in "Graph orchestration". +- Hierarchical invocation (Agent as Tool): This is the most common pattern (see Google ADK's [definition](https://google.github.io/adk-docs/agents/multi-agents/#c-explicit-invocation-agenttool) and [examples](https://google.github.io/adk-docs/agents/multi-agents/#hierarchical-task-decomposition)). An upper-level Agent delegates specific subtasks to specialized "Tool Agents." For example, a main Agent handles user interaction, and when code execution is needed, it calls a "code execution Agent." In this pattern, sub-Agents are typically stateless, not sharing memory with the main Agent, and their interaction is a simple Function Call. The upper-level Agent and sub-Agents have only one relationship: caller and callee. Therefore, we can conclude that the Agent as Tool Multi-Agent pattern is not the "node transition" relationship in "Graph orchestration." ```mermaid flowchart LR subgraph Main Agent L[Main Agent's LLM] - T1[Sub Agent 1] - T2[Sub Agent 2] + T1[Sub-Agent 1] + T2[Sub-Agent 2] - L-->|Tool Call|T1 - L-->|Tool Call|T2 + L-->|Tool call|T1 + L-->|Tool call|T2 end ``` -- Preset flows: For some mature collaboration patterns, such as "Plan-Execute-Replan" (see Langchain's [example](https://langchain-ai.github.io/langgraph/tutorials/plan-and-execute/plan-and-execute/)), the interaction order and roles between Agents are fixed. Frameworks (like Eino adk) can encapsulate these patterns as "prebuilt Multi-Agent patterns", which developers can use directly without caring about internal details or manually setting up or adjusting the process relationships between sub-Agents. 
Therefore, we can conclude that for mature collaboration patterns, "Graph orchestration" is an implementation detail encapsulated inside the prebuilt pattern, which developers don't perceive. +- Preset flows: For mature collaboration patterns, such as "Plan-Execute-Replan" (see Langchain's [example](https://langchain-ai.github.io/langgraph/tutorials/plan-and-execute/plan-and-execute/)), the interaction order and roles between Agents are fixed. Frameworks (such as Eino ADK) can encapsulate these patterns as "prefabricated Multi-Agent patterns" that developers can use directly without worrying about internal details or manually setting up the flow relationships between sub-Agents. Therefore, we can conclude that for mature collaboration patterns, "Graph orchestration" is an implementation detail encapsulated within the prefabricated pattern—developers don't need to be aware of it. ```mermaid flowchart LR subgraph Plan-Execute-Replan - P[Planner] - E[Executor] + P[planner] + E[executor] R[Replanner] P-->E E-->R @@ -205,7 +205,7 @@ flowchart LR user -->|Use as a whole| Plan-Execute-Replan ``` -- Dynamic collaboration: In more complex scenarios, the collaboration method between Agents is dynamic (see Google ADK's [definition](https://google.github.io/adk-docs/agents/multi-agents/#b-llm-driven-delegation-agent-transfer) and [examples](https://google.github.io/adk-docs/agents/multi-agents/#coordinatordispatcher-pattern)), possibly involving bidding, voting, or runtime decisions by a "coordinator Agent". In this pattern, the relationship between Agents is "Agent transfer", similar to "node flow" in "Graph orchestration" - both are complete handoffs of "control" from A to B. However, this "Agent transfer" can be completely dynamic, with its dynamic nature reflected not only in "which Agents can be transferred to", but also in "how the decision of which Agent to transfer to is made" - neither is preset by developers, but is the LLM's real-time dynamic behavior. 
This forms a sharp contrast with the static determinism of "Graph orchestration". Therefore, we can conclude that the dynamic collaboration Multi-Agent pattern is fundamentally different from "Graph orchestration" and is better suited for independent solutions at the Agent framework level. +- Dynamic collaboration: In more complex scenarios, Agent collaboration is dynamic (see Google ADK's [definition](https://google.github.io/adk-docs/agents/multi-agents/#b-llm-driven-delegation-agent-transfer) and [examples](https://google.github.io/adk-docs/agents/multi-agents/#coordinatordispatcher-pattern)), potentially involving bidding, voting, or a "coordinator Agent" making decisions at runtime. In this pattern, the relationship between Agents is "Agent transfer," similar to the "node transition" in "Graph orchestration"—both represent complete transfer of "control" from A to B. However, this "Agent transfer" can be entirely dynamic, with its dynamism reflected not only in "which Agents can be transferred to" but also in "how the decision to transfer to a specific Agent is made"—neither is preset by developers, but rather the LLM's real-time dynamic behavior. This contrasts sharply with the static determinism of "Graph orchestration." Therefore, we can conclude that the dynamic collaboration Multi-Agent pattern is fundamentally different from "Graph orchestration" and is better addressed with independent solutions at the Agent framework level. ```mermaid flowchart LR @@ -213,71 +213,71 @@ flowchart LR B[Agent 2] C[Agent 3] - A-.->|Dynamic handoff|B-.->|Dynamic handoff|C + A-.->|Dynamic transfer|B-.->|Dynamic transfer|C ``` -In summary, Multi-Agent collaboration problems can either be solved by reducing dimensions through the "Agent as Tool" pattern, or by frameworks providing fixed patterns, or are essentially completely dynamic collaborations. Their need for "orchestration" is fundamentally different from Graph's static, deterministic process orchestration. 
+In summary, Multi-Agent collaboration either reduces to the "Agent as Tool" pattern, is covered by fixed patterns that the framework provides, or is fundamentally dynamic. In each case, its need for "orchestration" differs in kind from Graph's static, deterministic flow orchestration. ### Agent as a Graph Node -After exploring "the relationship between Multi-Agent and Graph orchestration", we can ask from another angle: Is there a need to use Agents in Graph orchestration? In other words, can an Agent be a "node" in a Graph? +After exploring "the relationship between Multi-Agent and Graph orchestration," we can pose the question from another angle: is an Agent needed within Graph orchestration? In other words, can an Agent serve as a "node" within a Graph? -Let's first recall the characteristics of Agent and Graph: +Let's recall the respective characteristics of Agent and Graph: -- Agent's input sources are more diverse. Besides receiving structured data from upstream nodes, it heavily depends on its own conversation history (Memory). This forms a sharp contrast with Graph nodes that strictly depend on upstream outputs as the only input. -- Agent's output is asynchronous full-process data. This means other nodes have difficulty using the output of an "Agent node". +- An Agent's input sources are more diverse: besides receiving structured data from upstream nodes, it heavily depends on its own conversation history (Memory). This contrasts sharply with Graph nodes, which depend strictly on upstream outputs as their sole input. +- An Agent's output is asynchronous full-process data. This means other nodes have difficulty using the output of an "Agent node." ```mermaid flowchart LR - U[Upstream Node] + U[Preceding Node] A[Agent Node] - D[Downstream Node] + D[Following Node] M[Memory] - U-->|Not all inputs
    |A + U-->|Not the full input
    |A M-.->|External state injection|A - A-->|Full process data
    for users or LLM
    |D + A-->|Full process data
    Aimed at users or LLMs
    |D ``` -Therefore, adding an Agent node to a Graph means forcing an Agent that requires multi-turn interaction, long-term memory, and asynchronous output into a deterministic, synchronously executing Graph node, which is usually inelegant. An Agent's startup can be orchestrated by a Graph, but its internal complex interactions should not block the main flow. +Therefore, adding an Agent node to a Graph means forcibly embedding an Agent that requires multi-turn interaction, long-term memory, and asynchronous output into a deterministic, synchronously-executing Graph node—this is typically inelegant. An Agent's startup can be orchestrated by a Graph, but its internal complex interactions should not block the main flow. -In fact, what we need in a Graph is not a complete Agent node, but a more functionally pure **"LLM node"**. This node is responsible for receiving specific inputs in deterministic processes, completing intent recognition or content generation, and producing structured outputs, thereby injecting intelligence into the process. +In practice, what we need within a Graph is not a complete Agent node, but a more functionally pure **"LLM node."** This node is responsible for receiving specific inputs within the deterministic flow, performing intent recognition or content generation, and producing structured output—injecting intelligence into the flow. -At the same time, if a simple "LLM" node really doesn't meet the requirements and an "Agent" is indeed needed, a more appropriate approach might not be to stuff the Agent into a statically predefined Graph, but to add various "plugins" like pre-processing and post-processing to the "Agent", embedding specific business logic inside the Agent. 
+Additionally, if a simple "LLM" node truly doesn't meet requirements and an "Agent" is indeed needed, a more appropriate approach may not be to stuff the Agent into a statically predefined Graph, but rather to add pre-processing, post-processing, and other "plugins" to the "Agent," embedding specific business logic within the Agent itself. -In summary: Treating an Agent simply as a Graph node is **inefficient**; a better approach is to use LLM nodes, or inject business logic as plugins into Agents. +In summary: treating an Agent simply as a Graph node is **inefficient**; a better approach is to use an LLM node or inject business logic as plugins into the Agent. -### The Integration Path: Encapsulating Graph as Agent's Tool +### The Integration Path: Wrapping Graph as an Agent's Tool -Since direct integration of Agent and Graph at the micro level (nodes) faces difficulties, is there a more elegant way to combine them at the macro level? The answer is yes, and this bridge is "Tool". If we observe the meanings of Graph and Tool, we can find many similarities: +Since direct integration of Agent and Graph at the micro level (nodes) presents difficulties, is there a more elegant way to combine them at the macro level? The answer is yes: the bridge is "Tool." If we observe the characteristics of Graph and Tool, many similarities emerge:
 <table>
-<tr><td>Feature Dimension</td><td>Graph</td><td>Tool</td></tr>
+<tr><td>Dimension</td><td>Graph</td><td>Tool</td></tr>
 <tr><td>Input</td><td>Structured data</td><td>Structured data</td></tr>
-<tr><td>Deliverable</td><td>Focused on final result</td><td>Focused on final result</td></tr>
+<tr><td>Deliverable</td><td>Focus on final result</td><td>Focus on final result</td></tr>
 <tr><td>State Management</td><td>Single execution, stateless</td><td>Single execution, stateless</td></tr>
-<tr><td>Runtime Mode</td><td>Synchronous as a whole</td><td>Tool is synchronous from LLM's perspective</td></tr>
+<tr><td>Execution Mode</td><td>Synchronous overall</td><td>From LLM's perspective, Tool is synchronous</td></tr>
 </table>
-These similarities mean that "Graph's presentation form matches Tool's requirements very well, so encapsulating Graph as a Tool is intuitive and simple". Therefore, most Graphs are suitable for joining Agents through the Tool mechanism, becoming part of Agent's capabilities. This way, Agents can clearly use most of Graph's capabilities, including efficient orchestration of "arbitrary" business topologies, ecosystem integration of a large number of related components, and supporting framework and governance capabilities (stream processing, callbacks, interrupt/resume, etc.). +These similarities mean that a Graph's external form closely matches what a Tool requires, so wrapping a Graph as a Tool is intuitive and simple. Most Graphs can therefore join an Agent through the Tool mechanism and become part of the Agent's capabilities. The Agent can then use most of Graph's capabilities, including efficient orchestration of "arbitrary" business topologies, ecosystem integration with numerous related components, and the supporting framework and governance capabilities (stream processing, callbacks, interrupt/resume, etc.). -The "route debate" between "Agent" and "Graph" achieves dialectical unity. +The "route debate" between "Agent" and "Graph" thus resolves into a dialectical unity.
```mermaid flowchart TD subgraph Agent ["Agent"] - A["LLM Decision"] --> B{"Call Tool?"} + A["LLM Decision"] --> B{"Call tool?"} B -- "Yes" --> C["Tool: my_graph_tool"] end subgraph Tool ["Tool"] - C -- "Encapsulate" --> D["Graph: my_graph"] + C -- "Wraps" --> D["Graph: my_graph"] end subgraph Graph ["Graph"] - D -- "Execute" --> E["Node 1"] + D -- "Executes" --> E["Node 1"] E --> F["Node 2"] - F --> G["Return Result"] + F --> G["Return result"] end G -- "Output" --> C @@ -291,24 +291,24 @@ flowchart TD class D,E,F,G graphGroup ``` -Graph-Tool-Agent Relationship Diagram +Graph-Tool-Agent relationship diagram ## Conclusion -Agent and Graph are not a route debate, but two complementary AI application paradigms. +Agent and Graph are not competing approaches, but rather two complementary AI application paradigms. -- Graph is the cornerstone for building reliable, deterministic AI functionality. It excels at orchestrating complex business logic, data processing pipelines, and API calls into predictable, maintainable workflows. When you need a "feature button" or a stable backend service, Graph is the best choice. -- Agent is the future for achieving general intelligence and autonomous exploration. It centers on LLM, solving open-ended problems through dynamic planning and Tools. When you need an "intelligent assistant" that can converse with people and autonomously complete complex tasks, Agent is the core direction. +- Graph is the cornerstone for building reliable, deterministic AI features. It excels at orchestrating complex business logic, data processing pipelines, and API calls into predictable, maintainable workflows. When you need a "feature button" or a stable backend service, Graph is the go-to choice. +- Agent is the future for achieving general intelligence and autonomous exploration. With an LLM at its core, it solves open-ended problems through dynamic planning and Tools. 
When you need an "intelligent assistant" that can converse with humans and autonomously complete complex tasks, Agent is the core direction. -The best integration point is to encapsulate Graph as Agent's Tool. +The best integration point for both is to wrap a Graph as an Agent's Tool. -Through this approach, we can fully leverage Graph's powerful capabilities in process orchestration and ecosystem integration to expand Agent's Tool list. A complex Graph application (such as a complete RAG pipeline, a data analysis pipeline) can be simplified into one of Agent's atomic capabilities, dynamically called at the right time. +Through this approach, we can fully leverage Graph's powerful capabilities in flow orchestration and ecosystem integration to expand the Agent's Tool list. A complex Graph application (such as a complete RAG pipeline or data analysis pipeline) can be simplified into one of the Agent's atomic capabilities, dynamically invoked at the appropriate time. For Eino developers, this means: -- Use eino-compose to write your Graphs, encapsulating deterministic business logic into "functional modules". -- Use eino-adk to build your Agents, giving them the ability to think, plan, and interact with users. -- Use the former as Tools for the latter, ultimately achieving a "1+1 > 2" effect. +- Use eino-compose to write your Graph, encapsulating deterministic business logic into "functional modules." +- Use eino-adk to build your Agent, giving it the ability to think, plan, and interact with users. +- Use the former as the latter's Tools, ultimately achieving a "1+1 > 2" effect. 
Code example: diff --git a/content/en/docs/eino/quick_start/_index.md b/content/en/docs/eino/quick_start/_index.md index d50018aa4ff..f6d49faa3f2 100644 --- a/content/en/docs/eino/quick_start/_index.md +++ b/content/en/docs/eino/quick_start/_index.md @@ -1,21 +1,21 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] -title: 'Quick Start' +title: Quick Start weight: 2 --- -This page is the unified entrypoint for the ChatWithEino Quickstart series: it provides a clear path to get you running and explains what you will build by the end (an extensible end-to-end Agent application skeleton). +This document serves as the unified entrypoint for the ChatWithEino Quickstart series: it provides a clear path to get you running and explains what this series will ultimately deliver (an extensible end-to-end Agent application skeleton). ## What is this ChatWithEino is a learning-oriented Agent built with Eino: it can read source code/docs/examples and help developers understand Eino and write Eino code through conversation. -This Quickstart series follows a “progressive build-up” approach: +This Quickstart series follows a "progressive build-up" approach: -- Start with a Console app, then progressively introduce ChatModel, Agent/Runner, Memory, Tools, Middleware, Callback, Interrupt/Resume, Graph Tool, and Skill +- Start with a Console app, then progressively introduce ChatModel, Agent/Runner, Memory, Tool, Middleware, Callback, Interrupt/Resume, Graph Tool, and Skill - Deliver the same Agent as a Web app in the end, and use the A2UI protocol to render the event stream into an incrementally updating UI ## The shortest path: run it first @@ -39,7 +39,7 @@ export OPENAI_MODEL="gpt-4.1-mini" Run: ```bash -go run ./cmd/ch01 -- "Explain in one sentence what problem Eino’s Component design solves." +go run ./cmd/ch01 -- "Explain in one sentence what problem Eino's Component design solves."
``` ### 2) Final Web (A2UI) @@ -52,7 +52,7 @@ After it starts, open the address printed in the output (default `http://localho ### 3) (Optional) Enable skills (Chapter 9 capability reuse) -Skills inject a stable set of “knowledge/instruction packs” (`SKILL.md` + `reference/*.md`) into the Agent, so the model can load and call them on demand when needed. +Skills inject a stable set of "knowledge/instruction packs" (`SKILL.md` + `reference/*.md`) into the Agent, so the model can load and call them on demand when needed. ```bash go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/eino-ext -clean @@ -69,33 +69,34 @@ Notes:
 <table>
 <tr><td>Chapter</td><td>Topic</td><td>Entry</td></tr>
-<tr><td>Chapter 1</td><td>ChatModel and Message (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md</td></tr>
+<tr><td>Chapter 1</td><td>ChatModel and AgenticMessage (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md</td></tr>
 <tr><td>Chapter 2</td><td>Agent and Runner (Console multi-turn)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch02_chatmodel_agent_runner_console.md</td></tr>
 <tr><td>Chapter 3</td><td>Memory and Session (persistent conversation)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch03_memory_session_jsonl.md</td></tr>
-<tr><td>Chapter 4</td><td>Tools and file system access</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md</td></tr>
-<tr><td>Chapter 5</td><td>Middleware pattern</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch05_middleware.md</td></tr>
+<tr><td>Chapter 4</td><td>Tool and file system access</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md</td></tr>
+<tr><td>Chapter 5</td><td>Middleware (middleware pattern)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch05_middleware.md</td></tr>
 <tr><td>Chapter 6</td><td>Callback and Trace (observability)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch06_callback.md</td></tr>
 <tr><td>Chapter 7</td><td>Interrupt/Resume</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md</td></tr>
 <tr><td>Chapter 8</td><td>Graph Tool (complex workflows)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md</td></tr>
 <tr><td>Chapter 9</td><td>Skill (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch09_skill.md</td></tr>
-<tr><td>Final</td><td>A2UI (Web)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md</td></tr>
+<tr><td>Chapter 10</td><td>A2UI (Web)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md</td></tr>
+<tr><td>Chapter 11</td><td>TurnLoop</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch11_turnloop.md</td></tr>
 </table>
    ## Final deliverable: an extensible end-to-end Agent application skeleton -You can think of the final output of this Quickstart as a “pluggable application skeleton” that connects Eino’s key capabilities into a full loop: +You can think of the final output of this Quickstart as a "pluggable application skeleton" that connects Eino's key capabilities into a full loop: - Runtime: Runner drives execution, supporting streaming output and the event model -- Tools: integrate file system/retrieval/workflows via Tool +- Tool layer: integrate file system/retrieval/workflows and more via Tool - Middleware: carry cross-cutting concerns like retries, approvals, and error handling via handler/middleware -- Human-in-the-loop: interrupt/resume + checkpoint enable interactive flows like approval, missing-arg filling, and branch selection +- Human-in-the-loop: interrupt/resume + checkpoint enable interactive flows like approval, parameter filling, and branch selection - Deterministic orchestration: compose (graph/chain/workflow) organizes complex business flows into maintainable, reusable execution graphs - UI delivery: map the Agent event stream to an incrementally renderable UI component tree with A2UI (SSE push) -The boundary of A2UI is important: it is not part of the Eino framework itself; it is a business-layer UI protocol/rendering solution. This Quickstart uses it to demonstrate “how Agent capabilities can be delivered as a product”, and the details are defined by the final chapter. +The boundary of A2UI is important: it is not part of the Eino framework itself; it is a business-layer UI protocol/rendering solution. This Quickstart uses it to demonstrate "how Agent capabilities can be delivered as a product to users", and the specific implementation and protocol details are defined by the final chapter. 
## Next explorations (from Quickstart to real business) -- To systematically understand Eino’s component abstractions and usage: start from Chapter 1 and then fill in Tools/Graph/Callback/Interrupt step by step -- To reuse larger-scale knowledge and instruction packs: integrate `eino-ext` skills and load them on demand via Skill middleware -- To build an Agent into a business product: follow the final chapter (A2UI/Web) to connect event stream, state, and interaction, then replace it with your own UI form/protocol +- To systematically understand Eino's component abstractions and usage: start from Chapter 1's Component introduction, then fill in Tool/Graph/Callback/Interrupt and other capabilities step by step +- To reuse larger-scale knowledge and instructions: integrate `eino-ext` skills and load them on demand via Skill middleware +- To build an Agent into a business product: follow the final chapter (A2UI/Web) to connect event stream, state, and interaction, then replace it with your own UI form and protocol diff --git a/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md b/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md index bc08710a458..f7456c6a492 100644 --- a/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md +++ b/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md @@ -1,117 +1,133 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: "Chapter 1: ChatModel and Message (Console)" weight: 1 --- -## Introduction to the Eino framework +## Introduction to the Eino Framework **What is Eino?** -Eino is an AI application development framework in Go (Agent Development Kit) designed to help developers quickly build scalable and maintainable AI applications. +Eino is an AI application development framework (Agent Development Kit) implemented in Go, designed to help developers quickly build extensible, maintainable AI applications. 
**What problems does Eino solve?** -1. **Model abstraction**: unify interfaces across different LLM providers (OpenAI, Ark, Claude, etc.), so switching models does not require changing business code -2. **Capability composition**: provide replaceable, composable capability units through the Component interfaces (chat, tools, retrieval, etc.) -3. **Orchestration framework**: offer orchestration abstractions such as Agent, Graph, and Chain to support complex multi-step AI workflows -4. **Runtime support**: built-in streaming output, interrupt/resume, state management, and Callback-based observability +1. **Model abstraction**: Unifies interfaces across different LLM providers (OpenAI, Ark, Claude, etc.), allowing model switching without modifying business code +2. **Capability composition**: Implements replaceable, composable capability units (conversation, tools, retrieval, etc.) through Component interfaces +3. **Orchestration framework**: Provides orchestration abstractions like Agent, Graph, and Chain to support complex multi-step AI workflows +4. **Runtime support**: Built-in streaming output, interrupt and resume, state management, Callback observability, and more -**Main repositories of Eino:** +**Eino's main repositories:** -- **eino** (this repo): the core library, defining interfaces, orchestration abstractions, and ADK -- **eino-ext**: the extension library, providing concrete implementations of Components (OpenAI, Ark, Milvus, etc.) -- **eino-examples**: the examples repo, including this Quickstart series +- **eino** (this repository): Core library, defines interfaces, orchestration abstractions, and ADK +- **eino-ext**: Extension library, provides concrete implementations of various Components (OpenAI, Ark, Milvus, etc.) 
+- **eino-examples**: Example code repository, includes this quickstart series --- -## ChatWithEino: an assistant that talks with Eino docs +## ChatWithEino: An Intelligent Assistant for Conversing with Eino Documentation **What is ChatWithEino?** -ChatWithEino is an intelligent assistant built with Eino. It helps developers learn Eino and write Eino code by accessing the Eino repository’s source code, comments, and examples, so it can provide accurate and up-to-date technical help. +ChatWithEino is an intelligent assistant built on the Eino framework that helps developers learn the Eino framework and write Eino code. It accesses source code, comments, and examples from the Eino repository to provide users with the most accurate and up-to-date technical support. **Core capabilities:** -- **Conversational interaction**: understand questions about Eino and respond clearly -- **Code access**: read Eino source code/comments/examples and answer based on real implementations -- **Persistent sessions**: support multi-turn conversations, remember context, and restore sessions across processes -- **Tool calling**: perform operations such as file reading and code search +- **Conversational interaction**: Understands user questions about Eino and provides clear answers +- **Code access**: Directly reads Eino source code, comments, and examples to answer questions based on real implementations +- **Persistent sessions**: Supports multi-turn conversations, remembers context, and can resume sessions across processes +- **Tool invocation**: Can perform operations like file reading and code searching -**Architecture overview:** +**Technical architecture:** -- **ChatModel**: communicate with LLM providers (OpenAI, Ark, Claude, etc.) 
-- **Tool**: extend capabilities such as file system access and code search -- **Memory**: persist conversation history -- **Agent**: a unified execution framework that coordinates components +- **ChatModel**: Communicates with large language models (OpenAI, Ark, Claude, etc.) +- **Tool**: Capability extensions such as file system access and code search +- **Memory**: Persistent storage for conversation history +- **Agent**: Unified execution framework that coordinates all components to work together -## Quickstart series: build ChatWithEino from scratch +## Quickstart Document Series: Building ChatWithEino from Scratch -This series walks you step by step: starting from the most basic ChatModel call, and progressively building a fully functional ChatWithEino Agent. +This document series takes a progressive approach, guiding you from the most basic ChatModel invocation to building a fully-featured ChatWithEino Agent step by step. **Learning path:**
 <table>
-<tr><td>Chapter</td><td>Topic</td><td>Core content</td><td>Capability gain</td></tr>
-<tr><td>Chapter 1</td><td>ChatModel and Message</td><td>Understand the Component abstraction and implement a single-turn chat</td><td>Basic conversation</td></tr>
-<tr><td>Chapter 2</td><td>Agent and Runner</td><td>Introduce execution abstractions and implement multi-turn chat</td><td>Session management</td></tr>
-<tr><td>Chapter 3</td><td>Memory and Session</td><td>Persist chat history and support session recovery</td><td>Persistence</td></tr>
-<tr><td>Chapter 4</td><td>Tools and file system</td><td>Add file access to read source code</td><td>Tool calling</td></tr>
-<tr><td>Chapter 5</td><td>Middleware</td><td>Middleware mechanism and unified cross-cutting concerns</td><td>Extensibility</td></tr>
-<tr><td>Chapter 6</td><td>Callback</td><td>Callbacks to observe the Agent execution process</td><td>Observability</td></tr>
-<tr><td>Chapter 7</td><td>Interrupt and Resume</td><td>Interrupt and resume to support long-running tasks</td><td>Reliability</td></tr>
-<tr><td>Chapter 8</td><td>Graph and Tool</td><td>Use Graph to orchestrate complex workflows</td><td>Complex orchestration</td></tr>
-<tr><td>Chapter 9</td><td>A2UI</td><td>Integration from Agent to UI</td><td>Production-grade delivery</td></tr>
+<tr><td>Chapter</td><td>Topic</td><td>Core Content</td><td>Capability Gained</td></tr>
+<tr><td>Chapter 1</td><td>ChatModel and AgenticMessage</td><td>Understand Component abstraction, implement a single conversation</td><td>Basic conversation</td></tr>
+<tr><td>Chapter 2</td><td>Agent and Runner</td><td>Introduce execution abstraction, implement multi-turn conversation</td><td>Session management</td></tr>
+<tr><td>Chapter 3</td><td>Memory and Session</td><td>Persist conversation history, support session resumption</td><td>Persistence</td></tr>
+<tr><td>Chapter 4</td><td>Tool and file system</td><td>Add file access capability, read source code</td><td>Tool invocation</td></tr>
+<tr><td>Chapter 5</td><td>Middleware</td><td>Middleware mechanism for handling cross-cutting concerns uniformly</td><td>Enhanced extensibility</td></tr>
+<tr><td>Chapter 6</td><td>Callback</td><td>Callback mechanism for monitoring Agent execution</td><td>Observability</td></tr>
+<tr><td>Chapter 7</td><td>Interrupt and Resume</td><td>Interrupt and resume support for long-running tasks</td><td>Enhanced reliability</td></tr>
+<tr><td>Chapter 8</td><td>Graph and Tool</td><td>Orchestrate complex workflows using Graph</td><td>Complex orchestration</td></tr>
+<tr><td>Chapter 9</td><td>Skill</td><td>Load and reuse skill documents via Skill middleware</td><td>Knowledge reuse</td></tr>
+<tr><td>Final chapter</td><td>A2UI</td><td>Agent-to-UI integration solution</td><td>Production-grade application</td></tr>
 </table>
    -**Why design it this way?** +**Why is it designed this way?** -Each chapter adds one core capability on top of the previous chapter, so you can: +Each chapter adds one core capability on top of the previous one, so you can: -1. **Understand the role of each component**: features are introduced progressively instead of all at once -2. **See the architecture evolve**: from simple to complex, and why each abstraction exists -3. **Build practical skills**: every chapter comes with runnable code you can try hands-on +1. **Understand the role of each component**: Instead of showing all features at once, they are introduced progressively +2. **See the architecture evolution**: From simple to complex, understand why each abstraction is needed +3. **Acquire practical development skills**: Every chapter has runnable code for hands-on practice --- -Goal of this chapter: understand Eino’s Component abstraction, call a ChatModel once with minimal code (with streaming output), and learn the basics of `schema.Message`. +This chapter's goal: Understand Eino's Component abstraction, invoke a ChatModel with minimal code (supporting streaming output), and learn how to organize model input and streaming output using `schema.AgenticMessage`. -## Code location +## Code Location - Entry code: [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go) -## Why we need the Component interfaces +## Why Component Interfaces Are Needed -Eino defines a set of Component interfaces (`ChatModel`, `Tool`, `Retriever`, `Loader`, etc.). 
Each interface describes one replaceable capability category: +Eino defines a set of Component interfaces (`ChatModel`, `Tool`, `Retriever`, `Loader`, etc.), each describing a replaceable capability: ```go -type BaseChatModel interface { - Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error) - Stream(ctx context.Context, input []*schema.Message, opts ...Option) ( - *schema.StreamReader[*schema.Message], error) +type BaseModel[M any] interface { + Generate(ctx context.Context, input []M, opts ...Option) (M, error) + Stream(ctx context.Context, input []M, opts ...Option) (*schema.StreamReader[M], error) } + +type AgenticModel = BaseModel[*schema.AgenticMessage] ``` **Benefits of interfaces:** -1. **Replaceable implementations**: `eino-ext` provides implementations for OpenAI, Ark, Claude, Ollama, and more. Business code depends only on the interface, so switching models only changes construction logic. -2. **Composable orchestration**: orchestration layers such as Agent, Graph, and Chain depend only on Component interfaces, not concrete implementations. You can swap OpenAI for Ark without changing orchestration code. -3. **Mockable in tests**: interfaces make mocking natural; unit tests do not need real model calls. +1. **Replaceable implementations**: `eino-ext` provides multiple implementations including OpenAI, Ark, Claude, Ollama, etc. Business code only depends on the interface, and switching models only requires changing the construction logic. +2. **Composable orchestration**: Orchestration layers like Agent, Graph, and Chain only depend on Component interfaces, not specific implementations. You can swap OpenAI for Ark without changing orchestration code. +3. **Mockable for testing**: Interfaces naturally support mocking, so unit tests don't need real model calls. + +This chapter only involves `ChatModel`; subsequent chapters will progressively introduce `Tool`, `Retriever`, and other Components. 
-This chapter focuses on `ChatModel`. Later chapters will introduce Components such as `Tool` and `Retriever`. +The example code uses `model.AgenticModel` by default, which is `model.BaseModel[*schema.AgenticMessage]`. This allows subsequent chapters to express text, reasoning, tool calls, tool results, and more within the same message structure. -## schema.Message: the basic unit of conversation +## schema.AgenticMessage: The Basic Unit of Conversation -`Message` is the basic structure for conversation data in Eino: +`AgenticMessage` is the conversation data structure used in this Quickstart: + +In a single model invocation, the model may return multiple ordered events—for example, first outputting `reasoning`, then calling a server tool, continuing with `reasoning`, and then calling a function tool. `AgenticMessage` preserves these structured events in order using `ContentBlock`. ```go -type Message struct { - Role RoleType // system / user / assistant / tool - Content string // text content - ToolCalls []ToolCall // only assistant messages may have this +type AgenticMessage struct { + Role AgenticRoleType + ContentBlocks []*ContentBlock + ResponseMeta *AgenticResponseMeta + Extra map[string]any +} + +type ContentBlock struct { + Type ContentBlockType + Reasoning *Reasoning + UserInputText *UserInputText + AssistantGenText *AssistantGenText + FunctionToolCall *FunctionToolCall + FunctionToolResult *FunctionToolResult // ... 
} ``` @@ -119,22 +135,27 @@ type Message struct { Common constructors: ```go -schema.SystemMessage("You are a helpful assistant.") -schema.UserMessage("What is the weather today?") -schema.AssistantMessage("I don't know.", nil) // second arg is ToolCalls -schema.ToolMessage("tool result", "call_id") +schema.SystemAgenticMessage("You are a helpful assistant.") +schema.UserAgenticMessage("What is the weather today?") + +&schema.AgenticMessage{ + Role: schema.AgenticRoleTypeAssistant, + ContentBlocks: []*schema.ContentBlock{ + schema.NewContentBlock(&schema.AssistantGenText{Text: "I don't know."}), + }, +} ``` **Role semantics:** -- `system`: system instructions, typically placed at the beginning of messages -- `user`: user input -- `assistant`: model response -- `tool`: tool call result (covered in later chapters) +- `system`: System instructions, typically placed at the beginning of the message list +- `user`: User input +- `assistant`: Model reply +- Tool calls and tool results are expressed through `function_tool_call` / `function_tool_result` content blocks (covered in later chapters) ## Prerequisites -### Get the code +### Get the Code ```bash git clone https://github.com/cloudwego/eino-examples.git @@ -142,13 +163,13 @@ cd eino-examples/quickstart/chatwitheino ``` - Go version: Go 1.21+ (see `go.mod`) -- A callable ChatModel (OpenAI by default; Ark is also supported) +- A callable ChatModel (defaults to OpenAI; Ark is also supported) ### Option A: OpenAI (default) ```bash export OPENAI_API_KEY="..." -export OPENAI_MODEL="gpt-4.1-mini" # OpenAI 2025 new model; gpt-4o / gpt-4o-mini also work +export OPENAI_MODEL="gpt-4.1-mini" # OpenAI 2025 new model, can also use gpt-4o, gpt-4o-mini, etc. # Optional: # OPENAI_BASE_URL (proxy or compatible service) # OPENAI_BY_AZURE=true (use Azure OpenAI) @@ -163,60 +184,65 @@ export ARK_MODEL="..." 
# Optional: ARK_BASE_URL ``` -## Run +## Running -In `eino-examples/quickstart/chatwitheino`, run: +In the `eino-examples/quickstart/chatwitheino` directory: ```bash -go run ./cmd/ch01 -- "Explain in one sentence what problem Eino’s Component design solves." +go run ./cmd/ch01 -- "Explain in one sentence what problem Eino's Component design solves." ``` -Example output (printed incrementally as the stream arrives): +Example output (streamed incrementally): ``` -[assistant] Eino’s Component design defines unified interfaces... +[assistant] Eino's Component design solves the problem of... ``` -## What the entry code does +## What the Entry Code Does In execution order: -1. **Create a ChatModel**: choose OpenAI or Ark based on the `MODEL_TYPE` environment variable -2. **Build input messages**: `SystemMessage(instruction)` + `UserMessage(query)` -3. **Call Stream**: all ChatModel implementations must support `Stream()`, returning a `StreamReader[*Message]` -4. **Print the result**: iterate `StreamReader` and print the assistant reply chunk by chunk +1. **Create ChatModel**: Select the OpenAI or Ark agentic model based on the `MODEL_TYPE` environment variable +2. **Construct input messages**: Create `AgenticMessage` values via `msgops.NewSystem[M]` / `msgops.NewUser[M]` +3. **Call Stream**: Use `model.BaseModel[M].Stream()`, which returns a `StreamReader[M]` +4. **Print results**: Iterate over the `StreamReader` to print the assistant reply frame by frame -Key code snippet (**note: simplified and not directly runnable; for the full code see** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): +Key code snippet (**Note: This is a simplified code snippet and cannot be run directly. 
For the complete code, refer to** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): ```go -// Build input -messages := []*schema.Message{ - schema.SystemMessage(instruction), - schema.UserMessage(query), -} - -// Call Stream (all ChatModels must implement this) -stream, err := cm.Stream(ctx, messages) -if err != nil { - log.Fatal(err) -} -defer stream.Close() +func runTyped[M adk.MessageType](ctx context.Context, instruction, query string) { + cm, err := chatmodel.NewModel[M](ctx) + if err != nil { + log.Fatal(err) + } -for { - chunk, err := stream.Recv() - if errors.Is(err, io.EOF) { - break + messages := []M{ + msgops.NewSystem[M](instruction), + msgops.NewUser[M](query), } + + stream, err := cm.Stream(ctx, messages) if err != nil { log.Fatal(err) } - fmt.Print(chunk.Content) + defer stream.Close() + + for { + frame, err := stream.Recv() + if errors.Is(err, io.EOF) { + break + } + if err != nil { + log.Fatal(err) + } + fmt.Print(msgops.AssistantDeltaText(frame)) + } } ``` -## Summary +## Chapter Summary -- **Component interfaces**: define boundaries for replaceable, composable, and testable capabilities -- **Message**: the basic unit of conversation data, with semantics defined by roles -- **ChatModel**: the most fundamental Component, providing `Generate` and `Stream` -- **Implementation choice**: switch between OpenAI/Ark implementations via env/config without changing business code +- **Component interface**: Defines replaceable, composable, testable capability boundaries +- **AgenticMessage**: The basic unit of conversation data, distinguishing semantics through roles and content blocks +- **ChatModel**: The most fundamental Component, providing two core methods: `Generate` and `Stream` +- **Implementation selection**: Switch between different implementations like OpenAI/Ark via environment variables or configuration without modifying business code diff --git 
a/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md b/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md index ad2bd195ee6..216246c70d9 100644 --- a/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md +++ b/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md @@ -1,26 +1,320 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 2: ChatModelAgent, Runner, AgentEvent (Console multi-turn)" +title: "Chapter 2: ChatModelAgent, Runner, AgentEvent (Console Multi-turn)" weight: 2 --- -Goal of this chapter: introduce ADK execution abstractions (Agent + Runner) and implement a multi-turn conversation in a Console program. +This chapter's goal: Introduce the ADK's execution abstractions (Agent + Runner) and implement a multi-turn conversation in a Console program. -## Code location +## Code Location - Entry code: [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go) -## Full tutorial +## Prerequisites -This page is a website-friendly overview. For the full runnable walkthrough, see: +Same as Chapter 1: a usable ChatModel (OpenAI or Ark) must be configured. -- [ch02_chatmodel_agent_runner_console.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch02_chatmodel_agent_runner_console.md) +## Running -## What you learn +In the `eino-examples/quickstart/chatwitheino` directory: -- Why “Agent” is a higher-level abstraction than “ChatModel”: it owns the interaction loop and tool routing. -- What “Runner” does: it provides the runtime (streaming, events, interrupt/resume plumbing) for running an Agent. -- How “AgentEvent” models the execution stream: user input, model output, tool calls, tool results, and lifecycle signals. 
+```bash +go run ./cmd/ch02 +``` + +After seeing the prompt, enter your question (empty line to exit): + +``` +you> Hi, explain what an Agent is in Eino? +... +you> Summarize that in one sentence +... +``` + +## Key Concepts + +### From Component to Agent + +In Chapter 1 we learned about **Components**, which are replaceable, composable capability units in Eino: + +- `ChatModel`: Invokes large language models +- `Tool`: Executes specific tasks +- `Retriever`: Retrieves information +- `Loader`: Loads data + +**Relationship between Component and Agent:** + +- **Components alone don't constitute a complete AI application**: They are just capability units that need to be organized, orchestrated, and executed +- **An Agent is a complete AI application**: It encapsulates complete business logic and can run directly +- **Agents use Components internally**: The most critical are `ChatModel` (conversation capability) and `Tool` (execution capability) + +**Why do we need Agents?** + +With only Components, you would need to manually: + +- Manage conversation history +- Orchestrate the call flow (when to invoke the model, when to invoke tools) +- Handle streaming output +- Implement interrupt and resume +- ... + +**What does an Agent provide?** + +- **A complete runtime framework**: Unified execution management through `Runner` +- **Standardized event stream output**: `Run() -> AsyncIterator[*AgentEvent]`, supporting streaming, interrupt, and resume +- **Extensible capabilities**: Tools, middleware, interrupt, and more can be added +- **Ready to use out of the box**: Create an Agent and run it directly without worrying about internal details + +**This chapter's example:** + +`ChatModelAgent` is the simplest Agent—it only uses a `ChatModel` internally, but already possesses the complete Agent capability framework. Subsequent chapters will demonstrate how to add `Tool` and other capabilities. 
+ +### Agent Interface + +`Agent` is the core interface in ADK that defines the basic behavior of an intelligent agent: + +```go +type Agent interface { + Name(ctx context.Context) string + Description(ctx context.Context) string + + // Run executes the Agent and returns an event stream + Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] +} +``` + +**Interface responsibilities:** + +- `Name()` / `Description()`: Identify the Agent's name and description +- `Run()`: The core method to execute the Agent—receives input messages and returns an event stream + +**Design philosophy:** + +- **Unified abstraction**: All Agents (ChatModelAgent, WorkflowAgent, SupervisorAgent, etc.) implement this interface +- **Event-driven**: Outputs the execution process through an event stream (`AsyncIterator[*AgentEvent]`), supporting streaming responses +- **Extensibility**: The interface remains unchanged when adding tools, middleware, interrupt, and other capabilities later + +### ChatModelAgent + +`ChatModelAgent` is an implementation of the Agent interface, built on ChatModel: + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "Ch02ChatModelAgent", + Description: "A minimal ChatModelAgent with in-memory multi-turn history.", + Instruction: instruction, + Model: cm, +}) +``` + +**ChatModel vs ChatModelAgent: The Essential Difference** + + + + + + + + +
    | Dimension | ChatModel | ChatModelAgent |
    | --- | --- | --- |
    | Positioning | Component | Agent (intelligent agent) |
    | Core interface | `Generate()` / `Stream()` | `Run() -> AsyncIterator[*AgentEvent]` |
    | Output form | Directly returns message content | Returns an event stream (containing messages, control actions, etc.) |
    | Core capabilities | Pure large language model invocation | Supports extending with tools, middleware, interrupt, and more |
    | Use case | Simple conversational interaction scenarios | Complex intelligent agent application development |
    + +**Why do we need ChatModelAgent?** + +1. **Unified abstraction**: ChatModel is just one type of Component, while Agent is a higher-level abstraction that can compose multiple Components +2. **Event-driven**: Agent outputs an event stream, supporting streaming responses, interrupt/resume, state transitions, and other complex scenarios +3. **Extensibility**: ChatModelAgent can add tools, middleware, interrupt, and other capabilities, while ChatModel can only invoke the model +4. **Orchestration-friendly**: Agents can be uniformly managed by Runner, supporting checkpoint, resume, and other runtime capabilities + +**In simple terms:** + +- **ChatModel** = "The component responsible for communicating with large language models, abstracting away differences between model providers (OpenAI, Ark, Claude, etc.)" +- **ChatModelAgent** = "An intelligent agent built on top of a model—it can invoke the model, but can also do much more" + +**Analogy:** + +- **ChatModel** is like a "database driver": responsible for communicating with the database, abstracting away MySQL/PostgreSQL differences +- **ChatModelAgent** is like the "business logic layer": built on top of the database driver, but also includes business rules, transaction management, etc. + +**Features:** + +- Encapsulates the ChatModel invocation logic +- Provides a unified `Run() -> AgentEvent` output form +- Can add tools, middleware, and other capabilities later + +### Runner + +`Runner` is the entry point for executing an Agent, responsible for managing the Agent's lifecycle: + +```go +type Runner struct { + a Agent // The Agent to execute + enableStreaming bool + store CheckPointStore // State storage for interrupt/resume +} +``` + +**Why do we need Runner?** + +Although Agent provides a `Run()` method, calling it directly lacks many runtime capabilities: + +1. **Lifecycle management**: Runner manages the Agent's startup, resume, interrupt, and other states +2. 
**Checkpoint support**: Works with `CheckPointStore` to implement interrupt/resume (covered in later chapters) +3. **Unified entry point**: Provides convenient methods like `Run()` and `Query()` +4. **Event stream wrapping**: Converts the Agent's event stream into a consumable `AsyncIterator[*TypedAgentEvent[M]]` + +**Usage:** + +```go +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, +}) + +// Option 1: Pass a message list +events := runner.Run(ctx, history) + +// Option 2: Convenience method, pass a single query string +events := runner.Query(ctx, "Hello") +``` + +### AgentEvent + +`AgentEvent` is the event unit returned by Runner: + +```go +type AgentEvent struct { + AgentName string + RunPath []RunStep + + Output *AgentOutput // Output content + Action *AgentAction // Control action + Err error // Execution error +} +``` + +**Key fields:** + +- `event.Err`: Execution error +- `event.Output.MessageOutput`: Message or message stream (streaming) +- `event.Action`: Control actions like interrupt/transfer/exit (used in later chapters) + +### AsyncIterator: How to Consume the Event Stream + +`Runner.Run()` returns an `*AsyncIterator[*AgentEvent]` right away, without waiting for the Agent to finish. Events are then pulled from it one at a time with `Next()`, which blocks until the next event is available or the stream ends. + +**Why use AsyncIterator instead of returning a result directly?** + +Because Agent execution is **streaming**: the model generates replies token by token, with Tool calls interspersed. If `Run()` returned only after everything had completed, the user would see nothing until the whole reply was ready. `AsyncIterator` lets you consume each event as soon as it is produced. 
+ +**Consumption pattern:** + +```go +// events is *AsyncIterator[*AgentEvent], returned by runner.Run() +events := runner.Run(ctx, history) + +for { + event, ok := events.Next() // Get next event, blocks until an event arrives or iteration ends + if !ok { + break // Iterator closed, all events consumed + } + if event.Err != nil { + // Handle error + } + if event.Output != nil && event.Output.MessageOutput != nil { + // Handle message output (may be streaming) + } +} +``` + +**Note:** Each `runner.Run()` creates a new iterator; it cannot be reused after being consumed once. + +## Implementing Multi-turn Conversation + +This chapter implements simple multi-turn conversation: user input → model reply → user continues input → ... + +**Implementation approach:** + +Without tools, `ChatModelAgent` only completes one model invocation per `Run()` call. Multi-turn conversation is achieved by maintaining history on the caller side: + +1. Use `history []M` to accumulate the conversation; in this example, `M` defaults to `*schema.AgenticMessage` +2. On each user input: append to history via `msgops.NewUser[M]` +3. Call `runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history))` to get the event stream and consume it to obtain the assistant text +4. Append the current turn's assistant text back to history via `msgops.NewAssistant[M]`, then continue to the next turn + +Key code snippet (**Note: This is a simplified code snippet and cannot be run directly. 
For the complete code, refer to** [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go)): + +```go +func runTyped[M adk.MessageType](ctx context.Context, instruction string) { + agent, err := adk.NewTypedChatModelAgent[M](ctx, &adk.TypedChatModelAgentConfig[M]{ + Name: "Ch02Agent", + Instruction: instruction, + Model: cm, + }) + if err != nil { + log.Fatal(err) + } + + runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + }) + + history := make([]M, 0, 16) + + for { + line := readUserInput() + if line == "" { + break + } + + history = append(history, msgops.NewUser[M](line)) + events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) + result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) + if err != nil { + log.Fatal(err) + } + history = append(history, msgops.NewAssistant[M](result.AssistantText, nil)) + } +} +``` + +**Flow diagram:** + +``` +┌─────────────────────────────────────────┐ +│ Initialize history = [] │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ User input UserMessage │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Append to history │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.Run(history) │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Consume event stream │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Append AssistantMessage│ + └──────────────────────┘ + ↓ + (loop continues) +``` + +## Chapter Summary + +- **Agent interface**: Defines the basic behavior of an intelligent agent; the core is `Run() -> AsyncIterator[*AgentEvent]` +- **ChatModelAgent**: An Agent implementation based on ChatModel, providing a unified execution abstraction +- **Runner**: The execution entry point for Agents, managing lifecycle, checkpoint, event stream, and other runtime capabilities +- **AgentEvent**: An event-driven 
output unit supporting streaming responses and control actions +- **Multi-turn conversation**: Achieved by maintaining history on the caller side; each `Run()` completes one turn of conversation diff --git a/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md b/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md index e90a0a8e990..c21bb27114e 100644 --- a/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md +++ b/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md @@ -1,28 +1,332 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 3: Memory and Session (persistent conversations)" +title: "Chapter 3: Memory and Session (Persistent Conversation)" weight: 3 --- -Goal of this chapter: persist conversation history and support session recovery across processes. +This chapter's goal: Implement persistent storage for conversation history, supporting session resumption across processes. -> ⚠️ Important note: **Memory, Session, and Store here are business-layer concepts**, not core Eino framework components. +> **⚠️ Important Note: Business Layer Concepts vs Framework Concepts** > -> Eino focuses on “how to process messages”; “how to store messages” is entirely up to your application (DB/Redis/object storage/etc.). The implementation in this chapter is a simple reference you can replace. +> The **Memory, Session, and Store concepts introduced in this chapter are business-layer concepts**, **not core Eino framework components**. +> +> - **Eino framework layer**: Provides foundational abstractions like `adk.Runner`, `adk.NewTypedRunner[M]`, `schema.AgenticMessage`, etc. 
The framework does not concern itself with how conversation history is stored +> - **Business layer**: Memory/Session/Store are business logic designed by this example project to implement persistent conversations, interacting with the Eino framework by assembling input for `adk.Runner` +> +> In other words, the Eino framework is only responsible for "how to process messages," while "how to store messages" is entirely up to the business layer. The implementation provided in this chapter is just a simple reference example; you can choose a completely different storage solution (database, Redis, cloud storage, etc.) based on your business needs. -## Code location +## Code Location - Entry code: [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go) +- Memory implementation: [mem/store.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/mem/store.go) + +## Prerequisites + +Same as Chapter 1: a usable ChatModel (OpenAI or Ark) must be configured. + +## Running + +In the `eino-examples/quickstart/chatwitheino` directory: + +```bash +# Create a new session +go run ./cmd/ch03 + +# Resume an existing session +go run ./cmd/ch03 --session <session-id> +``` + +Example output: + +``` +Created new session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +Session title: New Session +Enter your message (empty line to exit): +you> Hi, I'm Zhang San +[assistant] Hi Zhang San! Nice to meet you... +you> What's my name? +[assistant] Your name is Zhang San... + +Session saved: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +Resume with: go run ./cmd/ch03 --session 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +``` + +## From In-Memory to Persistent: Why Memory Is Needed + +In Chapter 2 we implemented multi-turn conversation, but there's a problem: **conversation history only exists in memory**. 
+ +**Limitations of in-memory storage:** + +- Conversation history is lost when the process exits +- Cannot resume sessions across devices or processes +- Cannot implement session management (listing, deleting, searching, etc.) + +**Memory's role:** + +- **Memory is persistent storage for conversation history**: Saves conversations to disk or database +- **Memory supports Session management**: Each Session represents one complete conversation +- **Memory is decoupled from Agent**: The Agent doesn't care about storage details, only about the message list + +**Simple analogy:** + +- **In-memory storage** = "scratch paper" (gone when the process exits) +- **Memory** = "notebook" (permanently saved, accessible anytime) + +## Key Concepts + +> **Reiteration**: The following Session, Store, and other concepts are all **business-layer implementations** for managing conversation history storage. The Eino framework itself does not provide these components—the business layer is responsible for managing the message list and then passing messages to `adk.Runner` for processing. + +### Session (Business Layer Concept) + +`Session` represents one complete conversation: + +```go +type Session struct { + ID string + CreatedAt time.Time + + messages []M // Conversation history; in this example, M defaults to *schema.AgenticMessage + // ... 
+} +``` + +**Core methods:** + +- `Append(msg)`: Appends a message to the session and persists it +- `GetMessages()`: Retrieves all messages +- `Title()`: Generates a session title from the first user message + +### Store (Business Layer Concept) + +`Store` manages persistent storage for multiple Sessions: + +```go +type Store struct { + dir string // Storage directory + cache map[string]*Session // In-memory cache +} +``` + +**Core methods:** + +- `GetOrCreate(id)`: Gets or creates a Session +- `List()`: Lists all Sessions +- `Delete(id)`: Deletes a Session + +### JSONL File Format + +Each Session is stored as a `.jsonl` file: + +``` +{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z","message_kind":"agentic"} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"Hello, who am I?"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"Hello! I don't know who you are yet..."}}]} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"My name is Zhang San"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"Got it, Zhang San, nice to meet you!"}}]} +``` + +Sessions are saved by default in `./data/sessions_agentic`; to use a different directory, set `SESSION_DIR_AGENTIC`. + +**Why JSONL?** + +- **Simple**: One JSON object per line, easy to read and write +- **Extensible**: New messages can be appended without rewriting the entire file +- **Readable**: Can be viewed directly with a text editor +- **Fault-tolerant**: A corrupted line doesn't affect other lines + +## Memory Implementation (Business Layer Example) + +Below is a simple business-layer implementation example using JSONL files to store conversation history. This is just one of many possible implementations—you can choose databases, Redis, or other storage solutions based on your actual needs. + +### 1. 
Create a Store + +```go +sessionDir := "./data/sessions_agentic" +store, err := mem.NewStore(sessionDir) +if err != nil { + log.Fatal(err) +} +``` + +### 2. Get or Create a Session + +```go +sessionID := "083d16da-6b13-4fe6-afb0-c45d8f490ce1" +session, err := store.GetOrCreate(sessionID) +if err != nil { + log.Fatal(err) +} +``` + +### 3. Append a User Message + +```go +userMsg := msgops.NewUser[M]("Hello") +if err := session.Append(userMsg); err != nil { + log.Fatal(err) +} +``` + +### 4. Get History and Invoke the Agent + +```go +history := session.GetMessages() +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} +``` + +### 5. Append the Assistant Message + +```go +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) +if err := session.Append(assistantMsg); err != nil { + log.Fatal(err) +} +``` + +Key code snippet (**Note: This is a simplified code snippet and cannot be run directly. 
For the complete code, refer to** [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go)): + +```go +store, err := mem.NewStore[M](msgops.DefaultSessionDir(msgops.KindOf[M]())) +if err != nil { + log.Fatal(err) +} + +// Create or resume a Session +session, err := store.GetOrCreate(sessionID) +if err != nil { + log.Fatal(err) +} + +// User input +userMsg := msgops.NewUser[M](line) +if err := session.Append(userMsg); err != nil { + log.Fatal(err) +} + +// Invoke the Agent +history := session.GetMessages() +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} + +// Save the assistant reply +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) +if err := session.Append(assistantMsg); err != nil { + log.Fatal(err) +} +``` + +## Relationship Between Session and Agent: Business Layer and Framework Layer Collaboration + +**Key understanding:** + +- **Session is a business-layer concept**: Implemented and managed by business code, responsible for storing and loading conversation history +- **Agent (Runner) is a framework-layer concept**: Provided by the Eino framework, responsible for processing messages and generating replies +- **Their interaction point**: The business layer obtains the message list via `session.GetMessages()`, generates model input via `msgops.NormalizeMessagesForModelInput(history)`, and passes it to `runner.Run(ctx, messages)` for processing + +**Architecture layers:** + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Business Layer (your code) │ +│ ┌─────────────┐ ┌──────────────┐ ┌───────────────┐ │ +│ │ Session │───→│ GetMessages() │───→│ runner.Run() │ │ +│ │ (storage) │ │ (message list)│ │(framework call)│ │ +│ └─────────────┘ └──────────────┘ └───────────────┘ │ +│ ↑ │ │ +│ │ ↓ │ +│ ┌─────────────┐ ┌───────────────┐ │ +│ 
│ Append() │←─────────────────────│ Assistant reply│ │ +│ │(save message)│ └───────────────┘ │ +│ └─────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + ↓ +┌─────────────────────────────────────────────────────────────┐ +│ Framework Layer (Eino framework) │ +│ ┌───────────────────────────────────────────────────────┐ │ +│ │ adk.Runner: receives message list, invokes ChatModel, │ │ +│ │ returns reply │ │ +│ └───────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**Flow diagram:** + +``` +┌─────────────────────────────────────────┐ +│ User input │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.Append() │ + │ Save user message │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.GetMessages()│ + │ Get complete history │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.Run(history) │ + │ Agent processes msgs │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Collect assistant reply│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.Append() │ + │ Save assistant msg │ + └──────────────────────┘ +``` + +## Chapter Summary + +**Framework layer vs Business layer:** + +- **Eino framework layer**: Provides foundational abstractions like `adk.Runner`, typed runner, `schema.AgenticMessage`, etc., without concerning itself with how messages are stored +- **Business layer (this chapter's implementation)**: Memory/Session/Store are business-layer concepts for managing conversation history storage + +**Business-layer concepts:** + +- **Memory**: Persistent storage for conversation history, supporting cross-process resumption +- **Session**: One complete conversation, containing ID, creation time, and message list +- **Store**: Manages storage for multiple Sessions, supporting create, get, list, and delete +- **JSONL format**: A simple file format, easy to 
read/write and extend + +**Business layer and framework layer interaction:** + +- The business layer is responsible for storing messages, obtaining the message list via `session.GetMessages()` +- After normalizing the message list for model input, it passes them to the framework layer's `runner.Run(ctx, messages)` for processing +- It collects the reply returned by the framework layer, then saves it to storage via the business layer + +> **💡 Tip**: This chapter's implementation is just one simple example among many storage approaches. In real projects, you can choose databases, Redis, cloud storage, or other solutions based on business needs, and even implement more advanced features like session expiration cleanup, search, sharing, etc. + +## Extended Thinking: Choosing a Business Layer Storage Solution + +The JSONL file storage approach provided in this chapter is suitable for simple single-machine applications. In real business scenarios, you may need to consider other storage solutions: -## Full tutorial +**Other storage implementations:** -- [ch03_memory_session_jsonl.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch03_memory_session_jsonl.md) +- Database storage (MySQL, PostgreSQL, MongoDB) +- Redis storage (supports distributed deployment) +- Cloud storage (S3, OSS) -## What you learn +**Advanced features:** -- How to model “Session” as a stable ID and resume a conversation by reloading stored messages. -- A simple storage format (JSONL) as a baseline for implementing your own persistence layer. -- How to integrate persistence with the Agent/Runner loop without coupling it into Eino itself. 
+- Session expiration cleanup +- Session search +- Session export/import +- Session sharing diff --git a/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md b/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md index aff152a9b87..67ee159046e 100644 --- a/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md +++ b/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md @@ -1,33 +1,329 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 4: Tools and file system access" +title: "Chapter 4: Tool and File System Access" weight: 4 --- -Goal of this chapter: add Tool capabilities so the Agent can access the file system. +This chapter's goal: Add Tool capabilities to the Agent, enabling it to access the file system. -## Why Tools +## Why Tools Are Needed -In Chapters 1–3, the Agent can only chat; it cannot perform real actions. +In the first three chapters, our Agent could only converse—it couldn't perform actual operations. -Typical limitations without tools: +**Agent's limitations:** -- Only generates text responses -- Cannot access external resources (files/APIs/databases) -- Cannot execute real tasks (compute/query/modify) +- Can only generate text replies +- Cannot access external resources (files, APIs, databases, etc.) +- Cannot execute actual tasks (calculations, queries, modifications, etc.) 
-## Code location +**Tool's role:** + +- **Tool is a capability extension for the Agent**: Enables the Agent to perform concrete operations +- **Tool encapsulates specific implementations**: The Agent doesn't care about how a Tool works internally, only about its input and output +- **Tools are composable**: An Agent can have multiple Tools and choose which to invoke based on need + +**Simple analogy:** + +- **Agent** = "intelligent assistant" (can understand instructions, but needs tools to execute) +- **Tool** = "toolbox" (file operations, network requests, database queries, etc.) + +## Why File System Access Is Needed + +This example is ChatWithDoc (conversing with documentation), with the goal of helping users learn the Eino framework and write Eino code. So what is the best documentation? + +**The answer is: the Eino repository's code itself.** + +- **Code**: Source code shows the framework's real implementation +- **Comments**: Code comments explain the design rationale and give usage instructions +- **Examples**: Example code demonstrates best practices + +Through file system access, the Agent can directly read Eino source code, comments, and examples, providing users with the most accurate and up-to-date technical support. 
+ +## Key Concepts + +### Tool Interface + +`Tool` is the interface in Eino that defines executable capabilities: + +```go +// BaseTool provides tool metadata that ChatModel uses to decide whether and how to invoke the tool +type BaseTool interface { + Info(ctx context.Context) (*schema.ToolInfo, error) +} + +// InvokableTool is a tool that can be executed by ToolsNode +type InvokableTool interface { + BaseTool + // InvokableRun executes the tool; arguments are a JSON-encoded string, returns a string result + InvokableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (string, error) +} + +// StreamableTool is the streaming variant of InvokableTool +type StreamableTool interface { + BaseTool + // StreamableRun executes the tool in streaming mode, returns a StreamReader + StreamableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (*schema.StreamReader[string], error) +} +``` + +**Interface hierarchy:** + +- `BaseTool`: Base interface, provides metadata only +- `InvokableTool`: Executable tool (extends BaseTool) +- `StreamableTool`: Streaming tool (extends BaseTool) + +### Backend Interface + +`Backend` is the abstract interface in Eino for file system operations: + +```go +type Backend interface { + // List file information in a directory + LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) + + // Read file content, supports line offset and limit + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + + // Search for matching content in files + GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) + + // Match files by glob pattern + GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) + + // Write file content + Write(ctx context.Context, req *WriteRequest) error + + // Edit file content (string replacement) + Edit(ctx context.Context, req *EditRequest) error +} +``` + +### LocalBackend + +`LocalBackend` is the local file system implementation of Backend, directly 
accessing the operating system's file system: + +```go +import localbk "github.com/cloudwego/eino-ext/adk/backend/local" + +backend, err := localbk.NewBackend(ctx, &localbk.Config{}) +``` + +**Features:** + +- Directly accesses the local file system using Go standard library +- Supports all Backend interface methods +- Supports executing shell commands (ExecuteStreaming) +- Path safety: requires absolute paths to prevent directory traversal attacks +- Zero configuration: works out of the box with no additional setup + +## Implementation: Using DeepAgent + +This chapter uses the DeepAgent prebuilt Agent, which provides first-class configuration for Backend and StreamingShell, making it easy to register file system-related tools. + +### From ChatModelAgent to DeepAgent: When to Switch? + +Previous chapters used `ChatModelAgent`, which can already handle multi-turn conversations. But to access the file system, we need to switch to `DeepAgent`. + +**ChatModelAgent vs DeepAgent comparison:** + + + + + + + + + +
| Capability | ChatModelAgent | DeepAgent |
| --- | --- | --- |
| Multi-turn conversation | ✅ | ✅ |
| Add custom Tools | ✅ Manually register each Tool | ✅ Manual or automatic registration |
| File system access (Backend) | ❌ Must manually create and register all file tools | ✅ First-class config, auto-registered |
| Command execution (StreamingShell) | ❌ Must manually create | ✅ First-class config, auto-registered |
| Built-in task management | ❌ | ✅ `write_todos` tool |
| Sub-Agent support | ❌ | ✅ |
    + +**Selection guide:** + +- Pure conversation scenarios (no external access) → Use `ChatModelAgent` +- Need file system access or command execution → Use `DeepAgent` + +### Why Use DeepAgent? + +Compared to using ChatModelAgent directly, DeepAgent's advantages: + +1. **First-class configuration**: Backend and StreamingShell are first-class config options—just pass them in +2. **Automatic tool registration**: Configuring Backend automatically registers file system tools, no manual creation needed +3. **Built-in task management**: Provides a `write_todos` tool supporting task planning and tracking +4. **Sub-Agent support**: Can configure specialized sub-Agents to handle specific tasks +5. **More powerful**: Integrates file system, command execution, and other capabilities + +### Code Implementation + +```go +import ( + localbk "github.com/cloudwego/eino-ext/adk/backend/local" + "github.com/cloudwego/eino/adk/prebuilt/deep" +) + +// Create LocalBackend +backend, err := localbk.NewBackend(ctx, &localbk.Config{}) + +// Create DeepAgent with automatic file system tool registration +agent, err := deep.New(ctx, &deep.Config{ + Name: "Ch04ToolAgent", + Description: "ChatWithDoc agent with filesystem access via LocalBackend.", + ChatModel: cm, + Instruction: agentInstruction, + Backend: backend, // Provides file system operation capabilities + StreamingShell: backend, // Provides command execution capabilities + MaxIteration: 50, +}) +``` + +### Tools Auto-registered by DeepAgent + +When Backend and StreamingShell are configured, DeepAgent automatically registers the following tools: + +- `read_file`: Read file content +- `write_file`: Write file content +- `edit_file`: Edit file content +- `glob`: Find files by glob pattern +- `grep`: Search content in files +- `execute`: Execute shell commands + +## Code Location - Entry code: [cmd/ch04/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch04/main.go) -## Full tutorial +## 
Prerequisites + +Same as Chapter 1: a usable ChatModel (OpenAI or Ark) must be configured. + +This chapter can also use `PROJECT_ROOT`, which is optional (see the running instructions below). + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Optional: Set the root directory path of the Eino core library +# When not set, the Agent defaults to using the current working directory (the chatwitheino directory) as root +# To let the Agent search the complete Eino codebase, point this to the eino core library root +export PROJECT_ROOT=/path/to/eino + +# Verify the path is correct (you should see directories like adk, components, compose, etc.) +ls $PROJECT_ROOT + +go run ./cmd/ch04 +``` + +**PROJECT_ROOT notes:** + +- **When not set**: `PROJECT_ROOT` defaults to the current working directory (where `chatwitheino` is located), and the Agent can only access files in this example project. This is sufficient for quick experimentation. +- **When set**: `PROJECT_ROOT` points to the Eino core library root directory, and the Agent can search the complete Eino framework codebase (core library, extension library, example library). This is the full ChatWithEino usage scenario. + +**Recommended three-repository directory structure (for the full experience):** + +``` +eino/ # PROJECT_ROOT (Eino core library) +├── adk/ +├── components/ +├── compose/ +├── ext/ # eino-ext (extension components like OpenAI, Ark implementations) +├── examples/ # eino-examples (this repository, where this example resides) +│ └── quickstart/ +│ └── chatwitheino/ +└── ... +``` + +You can use the `dev_setup.sh` script to automatically set up the above directory structure: + +```bash +# Run in the eino root directory to automatically clone extension and example repos to the correct locations +bash scripts/dev_setup.sh +``` + +Example output: + +``` +you> List the files in the current directory +[assistant] Let me list the files in the current directory... 
+[tool call] glob(pattern: "*") +[tool result] Found 5 files: +- main.go +- go.mod +- go.sum +- README.md +- cmd/ + +you> Read the contents of main.go +[assistant] Let me read the main.go file... +[tool call] read_file(file_path: "main.go") +[tool result] File contents: +... +``` + +**Note:** If you encounter a Tool error that interrupts the Agent during execution, don't panic—this is normal behavior. Tool errors are common, such as incorrect arguments, files not found, etc. How to gracefully handle Tool errors will be covered in detail in the next chapter. + +## Tool Invocation Flow + +When the Agent needs to invoke a Tool: + +``` +┌─────────────────────────────────────────┐ +│ User: List the files in current dir │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call glob │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Generate Tool Call │ + │ {"pattern": "*"} │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Execute Tool │ + │ glob("*") │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return Tool Result │ + │ {"files": [...]} │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent generates reply│ + │ "Found 5 files..." 
│ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Tool**: A capability extension for the Agent, enabling it to perform concrete operations +- **Backend**: An abstract interface for file system operations, providing unified file operation capabilities +- **LocalBackend**: The local file system implementation of Backend, directly accessing the OS file system +- **DeepAgent**: A prebuilt advanced Agent providing first-class configuration for Backend and StreamingShell +- **Automatic tool registration**: Configuring Backend automatically registers file system tools +- **Tool invocation flow**: Agent analyzes intent → Generates Tool Call → Executes Tool → Returns result → Generates reply + +## Extended Thinking + +**Other Tool types:** + +- HTTP Tool: Call external APIs +- Database Tool: Query databases +- Calculator Tool: Perform calculations +- Code Executor Tool: Run code + +**Other Backend implementations:** + +- Other storage backends can be implemented based on the Backend interface +- For example: cloud storage, database storage, etc. +- LocalBackend already provides comprehensive file system operation capabilities -- [ch04_tool_backend_filesystem.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md) +**Custom Tool creation:** -## What you learn +If you need to create custom Tools, you can use `utils.InferTool` to auto-infer from functions. See: -- How to expose file reads as tools and let the model call them through the Agent. -- How to keep tool boundaries explicit (inputs/outputs) so they are testable and observable. 
+- [Tool interface documentation](https://github.com/cloudwego/eino/tree/main/components/tool) +- [Tool creation examples](https://github.com/cloudwego/eino-examples/tree/main/components/tool) diff --git a/content/en/docs/eino/quick_start/chapter_05_middleware.md b/content/en/docs/eino/quick_start/chapter_05_middleware.md index d15b7f96640..2a6a4ec00b1 100644 --- a/content/en/docs/eino/quick_start/chapter_05_middleware.md +++ b/content/en/docs/eino/quick_start/chapter_05_middleware.md @@ -1,33 +1,455 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 5: Middleware (cross-cutting concerns)" +title: "Chapter 5: Middleware" weight: 5 --- -Goal of this chapter: understand the middleware pattern and implement Tool error handling and ChatModel retry. +This chapter's goal: Understand the Middleware pattern and implement Tool error handling and ChatModel retry mechanisms. -## Why Middleware +## Why Middleware Is Needed -Once you add tools (Chapter 4), failures become normal in real-world systems: +In Chapter 4 we added Tool capabilities to the Agent, enabling file system access. But in real-world application scenarios, **Tool errors or ChatModel errors are common**, for example: -- Tool failures: file not found, invalid args, missing permissions, etc. -- ChatModel failures: rate limits (429), network timeouts, temporary outages, etc. +- **Tool errors**: File not found, invalid arguments, insufficient permissions, etc. +- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc. -Middleware provides a single place to handle these cross-cutting concerns without scattering logic throughout your business code. 
+### Problem 1: Tool Errors Interrupt the Entire Flow -## Code location +When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to terminate: + +``` +[tool call] read_file(file_path: "nonexistent.txt") +Error: open nonexistent.txt: no such file or directory +// Conversation interrupted, user needs to start over +``` + +### Problem 2: Model Calls May Fail Due to Rate Limiting + +When the model API returns a 429 (Too Many Requests) error, the entire conversation also terminates: + +``` +Error: rate limit exceeded (429) +// Conversation interrupted +``` + +### Desired Behavior + +These error messages should **not directly terminate the Agent flow**. Instead, the error information should be passed to the model, allowing it to self-correct and proceed to the next round. For example: + +``` +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory... 
+[tool call] glob(pattern: "*") +``` + +### Middleware's Role + +The **Middleware pattern** can extend the behavior of Tools and ChatModel, making it ideal for solving this problem: + +- **Middleware is an interceptor for the Agent**: Inserts custom logic before and after calls +- **Middleware can handle errors**: Converts errors into a format the model can understand +- **Middleware can implement retries**: Automatically retries failed operations +- **Middleware is composable**: Multiple Middlewares can be chained together + +**Simple analogy:** + +- **Agent** = "business logic" +- **Middleware** = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns) + +## Code Location - Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) -## Full tutorial +## Prerequisites + +Same as Chapter 1: a usable ChatModel (OpenAI or Ark) must be configured. Additionally, `PROJECT_ROOT` needs to be set as in Chapter 4: + +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root directory +``` + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set the project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch05 +``` + +Example output: + +``` +you> List the files in the current directory +[assistant] Let me list the files for you... +[tool call] list_files(directory: ".") + +you> Read a file that doesn't exist +[assistant] Trying to read the file... +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist... +``` + +## Key Concepts + +### Middleware Interface + +`ChatModelAgentMiddleware` is the middleware interface for Agents: + +```go +type ChatModelAgentMiddleware interface { + // BeforeAgent is called before each agent run, allowing modification of + // the agent's instruction and tools configuration. 
+ BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + + // BeforeModelRewriteState is called before each model invocation. + // The returned state is persisted to the agent's internal state and passed to the model. + BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) + + // AfterModelRewriteState is called after each model invocation. + // The input state includes the model's response as the last message. + AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) + + // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior. + // This method is only called for tools that implement InvokableTool. + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + + // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior. + // This method is only called for tools that implement StreamableTool. + WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + + // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution. + // This method is only called for tools that implement EnhancedInvokableTool. + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + + // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution. + // This method is only called for tools that implement EnhancedStreamableTool. + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) + + // WrapModel wraps a chat model with custom behavior. 
+ // This method is called at request time when the model is about to be invoked. + WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) +} +``` + +**Design philosophy:** + +- **Decorator pattern**: Each Middleware wraps the original call and can modify input, output, or errors +- **Onion model**: Requests pass through Middlewares from outside in, responses return from inside out +- **Composable**: Multiple Middlewares execute in sequence + +### Middleware Execution Order + +`Handlers` (i.e., Middlewares) are wrapped in **array order**, forming an onion model: + +```go +Handlers: []adk.ChatModelAgentMiddleware{ + &middlewareA{}, // Outermost layer: wraps first, intercepts requests first, but WrapModel takes effect last + &middlewareB{}, // Middle layer + &middlewareC{}, // Innermost layer: wraps last +} +``` + +**Execution order for Tool calls:** + +``` +Request → A.Wrap → B.Wrap → C.Wrap → Actual Tool execution → C returns → B returns → A returns → Response +``` + +**Practical advice:** Place `safeToolMiddleware` (error capture) at the innermost layer (end of array) to ensure that interrupt errors thrown by other Middlewares can correctly propagate outward. + +### SafeToolMiddleware + +`SafeToolMiddleware` converts Tool errors into strings so the model can understand and handle them: + +```go +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + // Convert the error to a string instead of returning an error + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +**Effect:** + +``` +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist, please check the file path... +// Conversation continues, model can adjust strategy based on error information +``` + +### ModelRetryConfig + +`ModelRetryConfig` configures automatic retry for ChatModel: + +```go +type ModelRetryConfig struct { + MaxRetries int // Maximum retry count + IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable +} +``` + +**Usage (with DeepAgent as an example):** + +```go +agent, err := deep.New(ctx, &deep.Config{ + // ... + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + // 429 rate limiting errors are retryable + return strings.Contains(err.Error(), "429") || + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") + }, + }, +}) +``` + +**Retry strategy:** + +- Exponential backoff: Retry intervals increase each time +- Configurable conditions: Use `IsRetryAble` to determine which errors are retryable +- Automatic recovery: No user intervention needed + +## Middleware Implementation + +### 1. Implement SafeToolMiddleware + +```go +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + // Don't convert interrupt errors—they need to propagate + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + // Convert other errors to strings + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +### 2. Implement Streaming Tool Error Handling + +```go +func (m *safeToolMiddleware) WrapStreamableToolCall( + _ context.Context, + endpoint adk.StreamableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.StreamableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) { + sr, err := endpoint(ctx, args, opts...) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + // Return a single-frame stream containing the error message + return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil + } + // Wrap the stream to capture errors within the stream + return safeWrapReader(sr), nil + }, nil +} +``` + +### 3. 
Configure Agent to Use Middleware + +This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field: + +```go +agent, err := deep.New(ctx, &deep.Config{ + Name: "Ch05MiddlewareAgent", + Description: "ChatWithDoc agent with safe tool middleware and retry.", + ChatModel: cm, + Instruction: agentInstruction, + Backend: backend, + StreamingShell: backend, + MaxIteration: 50, + Handlers: []adk.ChatModelAgentMiddleware{ + &safeToolMiddleware{}, // Convert Tool errors to strings + }, + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + return strings.Contains(err.Error(), "429") || + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") + }, + }, +}) +``` + +**Note**: The `Handlers` field (in configuration) and "Middleware" (the concept discussed in documentation) are the same thing: `Handlers` is the config field name, while `ChatModelAgentMiddleware` is the interface name. + +**Key code snippet** (note: this is a simplified code snippet and cannot be run directly; for the complete code, refer to [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)): + +```go +// SafeToolMiddleware captures Tool errors and converts them to strings +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} + +// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added) +agent, _ := deep.New(ctx, &deep.Config{ + ChatModel: cm, + Backend: backend, + StreamingShell: backend, + MaxIteration: 50, + Handlers: []adk.ChatModelAgentMiddleware{ + &safeToolMiddleware{}, + }, + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + return strings.Contains(err.Error(), "429") + }, + }, +}) +``` + +## Middleware Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ User: Read a non-existent file │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call │ + │ read_file │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ SafeToolMiddleware │ + │ Intercepts Tool call │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Execute read_file │ + │ Returns error │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ SafeToolMiddleware │ + │ Converts error to │ + │ string │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return Tool Result │ + │ "[tool error] ..." │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent generates reply│ + │ "Sorry, file not │ + │ found..." 
│ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Middleware**: An interceptor for the Agent, inserting custom logic before and after calls +- **SafeToolMiddleware**: Converts Tool errors to strings so the model can understand and handle them +- **ModelRetryConfig**: Configures automatic retry for ChatModel, handling temporary errors like rate limiting +- **Decorator pattern**: Middleware wraps the original call and can modify input, output, or errors +- **Onion model**: Requests pass through Middlewares from outside in, responses return from inside out + +## Extended Thinking + +**Eino built-in Middlewares:** + + + + + + +
| Middleware | Description |
| --- | --- |
| `reduction` | Tool output reduction: automatically truncates overly long tool output and offloads it to the file system, preventing context overflow |
| `summarization` | Automatic conversation history summarization: generates summaries to compress history when the token count exceeds a threshold |
| `skill` | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |
    + +**Middleware chain example:** + +```go +import ( + "github.com/cloudwego/eino/adk/middlewares/reduction" + "github.com/cloudwego/eino/adk/middlewares/summarization" + "github.com/cloudwego/eino/adk/middlewares/skill" +) -- [ch05_middleware.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch05_middleware.md) +// Create reduction middleware: manages tool output length +reductionMW, _ := reduction.New(ctx, &reduction.Config{ + Backend: filesystemBackend, // Storage backend + MaxLengthForTrunc: 50000, // Max length for single tool output + MaxTokensForClear: 30000, // Token threshold to trigger cleanup +}) -## What you learn +// Create summarization middleware: automatically compresses conversation history +summarizationMW, _ := summarization.New(ctx, &summarization.Config{ + Model: chatModel, // Model used for generating summaries + Trigger: &summarization.TriggerCondition{ + ContextTokens: 190000, // Token threshold to trigger summarization + }, +}) -- How to wrap tool execution with consistent error handling. -- How to add retry policies around ChatModel calls in a composable way. -- How middleware keeps the Agent core clean and extensible. 
+// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New) +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Handlers: []adk.ChatModelAgentMiddleware{ // Note: config field name is Handlers, conceptually equivalent to Middlewares + summarizationMW, // Outermost layer: conversation history summarization + reductionMW, // Middle layer: tool output reduction + }, +}) +``` diff --git a/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md b/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md index 397af7dae31..34ca2e3c4e0 100644 --- a/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md +++ b/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md @@ -1,23 +1,347 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 6: Callback and Trace (observability)" +title: "Chapter 6: Callback and Trace (Observability)" weight: 6 --- -Goal of this chapter: understand the Callback mechanism and integrate tracing/observability for the Agent execution. +This chapter's goal: Understand the Callback mechanism and integrate CozeLoop to achieve tracing and observability. -## Code location +## Code Location - Entry code: [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go) -## Full tutorial +## Prerequisites -- [ch06_callback.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch06_callback.md) +Same as Chapter 1: a usable ChatModel (OpenAI or Ark) must be configured. Additionally, `PROJECT_ROOT` needs to be set as in Chapter 4: -## What you learn +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root directory (defaults to current directory if not set) +``` -- How callbacks expose lifecycle hooks for key execution points (model calls, tool calls, streaming chunks). 
-- How to build logging/metrics/tracing without coupling instrumentation into core logic. +Optional: Configure CozeLoop to enable tracing: + +```bash +export COZELOOP_WORKSPACE_ID=your_workspace_id +export COZELOOP_API_TOKEN=your_token +``` + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set the project root directory +export PROJECT_ROOT=/path/to/your/project + +# Optional: Configure CozeLoop +export COZELOOP_WORKSPACE_ID=your_workspace_id +export COZELOOP_API_TOKEN=your_token + +go run ./cmd/ch06 +``` + +Example output: + +``` +[trace] starting session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +you> Hello +[trace] chat_model_generate: model=gpt-4.1-mini tokens=150 +[trace] tool_call: name=list_files duration=23ms +[assistant] Hello! How can I help you? +``` + +## From Black Box to White Box: Why Callback Is Needed + +In previous chapters, our Agent was a "black box": input a question, get an answer, but we don't know what happened in between. + +**Problems with the black box:** + +- Don't know how many times the model was called +- Don't know how long Tool execution took +- Don't know how many tokens were consumed +- Difficult to diagnose issues when problems occur + +**Callback's role:** + +- **Callback is Eino's sidecar mechanism**: Consistent from components to compose (discussed below) to ADK +- **Callbacks trigger at fixed points**: 5 key moments in the component lifecycle +- **Callbacks extract real-time information**: Input, output, errors, streaming data, etc. +- **Callbacks have broad applications**: Observation, logging, metrics, tracing, debugging, auditing, etc. 
+ +**Simple analogy:** + +- **Agent** = "business logic" (main path) +- **Callback** = "sidecar hooks" (extract information at fixed points) + +## Key Concepts + +### Handler Interface + +`Handler` is the core interface in Eino for defining callback handlers: + +```go +type Handler interface { + // Non-streaming input (before component starts processing) + OnStart(ctx context.Context, info *RunInfo, input CallbackInput) context.Context + + // Non-streaming output (after component returns successfully) + OnEnd(ctx context.Context, info *RunInfo, output CallbackOutput) context.Context + + // Error (when component returns an error) + OnError(ctx context.Context, info *RunInfo, err error) context.Context + + // Streaming input (when component receives streaming input) + OnStartWithStreamInput(ctx context.Context, info *RunInfo, + input *schema.StreamReader[CallbackInput]) context.Context + + // Streaming output (when component returns streaming output) + OnEndWithStreamOutput(ctx context.Context, info *RunInfo, + output *schema.StreamReader[CallbackOutput]) context.Context +} +``` + +**Design philosophy:** + +- **Sidecar mechanism**: Does not interfere with the main flow; extracts information at fixed points +- **Full coverage**: From components to compose to ADK, all components support callbacks +- **State passing**: OnStart→OnEnd of the same Handler can pass state via context +- **Performance optimization**: Implementing the `TimingChecker` interface can skip unnecessary timings + +**RunInfo structure:** + +```go +type RunInfo struct { + Name string // Business name (node name or user-specified) + Type string // Implementation type (e.g., "OpenAI") + Component string // Component type (e.g., "ChatModel") +} +``` + +**Important notes:** + +- Streaming callbacks must close the StreamReader, otherwise it will cause goroutine leaks +- Do not modify Input/Output—they are shared by all downstream consumers +- RunInfo may be nil; check before using + +### CozeLoop + 
+CozeLoop is ByteDance's open-source AI application observability platform, providing: + +- **Tracing**: Complete call chain visualization +- **Metrics monitoring**: Latency, token consumption, error rate, etc. +- **Log aggregation**: Centralized log management +- **Debug support**: Online viewing and debugging + +**Integration:** + +```go +import ( + clc "github.com/cloudwego/eino-ext/callbacks/cozeloop" + "github.com/cloudwego/eino/callbacks" + "github.com/coze-dev/cozeloop-go" +) + +// Create CozeLoop client +client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(apiToken), + cozeloop.WithWorkspaceID(workspaceID), +) + +// Register as global Callback +callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) +``` + +### Callback Trigger Timings + +Callbacks trigger at 5 key moments in the component lifecycle. The `Timing*` names in the table below are Eino internal constants (used for the `TimingChecker` interface), with the corresponding Handler interface methods shown on the right: + + + + + + + + +
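<table>
<tr><td>Timing Constant</td><td>Handler Method</td><td>Trigger Point</td><td>Input / Output</td></tr>
<tr><td><code>TimingOnStart</code></td><td><code>OnStart</code></td><td>Before component starts processing</td><td><code>CallbackInput</code></td></tr>
<tr><td><code>TimingOnEnd</code></td><td><code>OnEnd</code></td><td>After component returns successfully</td><td><code>CallbackOutput</code></td></tr>
<tr><td><code>TimingOnError</code></td><td><code>OnError</code></td><td>When component returns an error</td><td><code>error</code></td></tr>
<tr><td><code>TimingOnStartWithStreamInput</code></td><td><code>OnStartWithStreamInput</code></td><td>When component receives streaming input</td><td><code>StreamReader[CallbackInput]</code></td></tr>
<tr><td><code>TimingOnEndWithStreamOutput</code></td><td><code>OnEndWithStreamOutput</code></td><td>When component returns streaming output</td><td><code>StreamReader[CallbackOutput]</code></td></tr>
</table>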
    + +**Example: ChatModel invocation flow** + +``` +┌─────────────────────────────────────────┐ +│ ChatModel.Generate(ctx, messages) │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnStart │ ← Input: CallbackInput (messages) + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Model processing │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnEnd │ ← Output: CallbackOutput (response) + └──────────────────────┘ +``` + +**Example: Streaming output flow** + +``` +┌─────────────────────────────────────────┐ +│ ChatModel.Stream(ctx, messages) │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnStart │ ← Input: CallbackInput (messages) + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Model processing │ + │ (streaming) │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnEndWithStreamOutput │ ← Output: StreamReader[CallbackOutput] + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return chunks one │ + │ by one │ + └──────────────────────┘ +``` + +**Notes:** + +- Streaming errors (errors mid-stream) do not trigger OnError—they are returned within the StreamReader +- OnStart→OnEnd of the same Handler can pass state via context +- There is no guaranteed execution order between different Handlers + +## Callback Implementation + +### 1. Implement a Custom Callback Handler + +Fully implementing the `Handler` interface requires implementing all 5 methods, which is verbose. Eino provides the `callbacks.HandlerHelper` utility class to simplify the implementation: + +```go +import "github.com/cloudwego/eino/callbacks" + +// Use NewHandlerHelper to register callbacks for the timings you care about +handler := callbacks.NewHandlerHelper(). + OnStart(func(ctx context.Context, info *callbacks.RunInfo, input callbacks.CallbackInput) context.Context { + log.Printf("[trace] %s/%s start", info.Component, info.Name) + return ctx + }). 
+ OnEnd(func(ctx context.Context, info *callbacks.RunInfo, output callbacks.CallbackOutput) context.Context { + log.Printf("[trace] %s/%s end", info.Component, info.Name) + return ctx + }). + OnError(func(ctx context.Context, info *callbacks.RunInfo, err error) context.Context { + log.Printf("[trace] %s/%s error: %v", info.Component, info.Name, err) + return ctx + }). + Handler() + +// Register as global Callback +callbacks.AppendGlobalHandlers(handler) +``` + +**Note**: `RunInfo` may be `nil` (e.g., top-level calls without RunInfo); check before using. + +### 2. Integrate CozeLoop + +```go +// Setup CozeLoop tracing (optional) +// Set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { + client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), + ) + if err != nil { + log.Fatalf("cozeloop.NewClient failed: %v", err) + } + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() + callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) + log.Println("CozeLoop tracing enabled") +} else { + log.Println("CozeLoop tracing disabled (set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable)") +} +``` + +**Key code snippet** (note: this is a simplified snippet and cannot be run directly; for the complete code, refer to [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go)):
+ +```go +// Setup CozeLoop tracing +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { + client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), + ) + if err != nil { + log.Fatalf("cozeloop.NewClient failed: %v", err) + } + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() + callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) +} +``` + +## The Value of Observability + +### 1. Performance Analysis + +With data collected through Callbacks, you can analyze: + +- Model invocation latency distribution +- Tool execution time rankings +- Token consumption trends + +### 2. Error Tracing + +When the Agent encounters problems: + +- View the complete call chain +- Pinpoint which step failed +- Analyze the root cause + +### 3. 
Cost Optimization + +Through token consumption data: + +- Identify high-consumption conversations +- Optimize prompts to reduce tokens +- Choose more cost-effective models + +## Chapter Summary + +- **Callback**: Eino's observation hooks, triggering callbacks at key points +- **CozeLoop**: ByteDance's AI application observability platform +- **Global registration**: Register global Callbacks via `callbacks.AppendGlobalHandlers` +- **Non-intrusive**: Business code doesn't need to be modified; Callbacks trigger automatically +- **Observability value**: Performance analysis, error tracing, cost optimization + +## Extended Thinking + +**Other Callback implementations:** + +- OpenTelemetry Callback: Integrates with standard observability protocols +- Custom logging Callback: Records to local files +- Metrics Callback: Integrates with monitoring systems like Prometheus + +**Advanced usage:** + +- Implement sampling in Callbacks (record only a portion of requests) +- Implement rate limiting in Callbacks (based on token consumption) +- Implement alerting in Callbacks (notify when error rate is too high) diff --git a/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md b/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md index fe5fdfa2924..710aa134949 100644 --- a/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md +++ b/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md @@ -1,23 +1,366 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 7: Interrupt/Resume (human-in-the-loop)" +title: "Chapter 7: Interrupt/Resume" weight: 7 --- -Goal of this chapter: understand Interrupt/Resume and implement an approval flow so users can confirm before sensitive tool operations. +Goal of this chapter: understand the Interrupt/Resume mechanism, implement a Tool approval flow, and allow users to confirm before sensitive operations. 
-## Code location +## Code Location - Entry code: [cmd/ch07/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch07/main.go) -## Full tutorial +## Prerequisites -- [ch07_interrupt_resume.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md) +Same as Chapter 1: you need to configure a working ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4: -## What you learn +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root (defaults to current directory if not set) +``` -- How to pause an execution at a safe boundary and request user input. -- How to resume from checkpoints to support long-running or approval-gated tasks. +## Running + +Execute in the `examples/quickstart/chatwitheino` directory: + +```bash +# Set the project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch07 +``` + +Example output: + +``` +you> Please execute the command echo hello + +⚠️ Approval Required ⚠️ +Tool: execute +Arguments: {"command":"echo hello"} + +Approve this action? 
(y/n): y +[tool result] hello + +hello +``` + +## From Automatic Execution to Human Approval: Why Interrupt is Needed + +In previous chapters, our Agent automatically executed all Tool calls, but this can be dangerous in certain scenarios: + +**Risks of automatic execution:** + +- Deleting files: accidentally removing important data +- Sending emails: sending incorrect content +- Executing commands: running dangerous operations +- Modifying configuration: breaking system settings + +**The role of Interrupt:** + +- **Interrupt is the Agent's pause mechanism**: pauses before critical operations, waiting for user confirmation +- **Interrupt can carry information**: shows the user what operation is about to be executed +- **Interrupt is resumable**: continues execution after user approval, returns an error after rejection + +**Simple analogy:** + +- **Automatic execution** = "autopilot" (fully trusting the system) +- **Interrupt** = "manual override" (critical decisions are made by humans) + +## Key Concepts + +### Interrupt Mechanism + +`Interrupt` is the core mechanism for human-machine collaboration in Eino. + +**Core idea: pause before executing critical operations, and continue after user confirmation.** + +A Tool that requires approval is executed in **two phases**: + +1. **First call (triggers interrupt)**: The Tool saves the current arguments, then returns an interrupt signal. The Runner pauses execution and returns an Interrupt event to the caller. +2. **Resume after user approval**: The Runner calls the Tool again. This time, the Tool detects that it has been "previously interrupted" and directly reads the user's approval result to execute (or reject). 
+ +**Simplified pseudocode:** + +``` +func myTool(ctx, args): + if first call: + save args + return interrupt signal // Runner pauses, shows approval prompt + else: // Second call after Resume + if user approved: + return execute operation(saved args) + else: + return "Operation rejected by user" +``` + +**Full code with key field explanations:** + +```go +// Triggering an interrupt in a Tool +func myTool(ctx context.Context, args string) (string, error) { + // wasInterrupted: whether this is the second call after Resume (false on first call, true after Resume) + // storedArgs: arguments saved via StatefulInterrupt on the first call, retrievable after Resume + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + + if !wasInterrupted { + // First call: trigger interrupt and save args for use after Resume + return "", tool.StatefulInterrupt(ctx, &ApprovalInfo{ + ToolName: "my_tool", + ArgumentsInJSON: args, + }, args) // Third parameter is the state to save (retrieved via storedArgs after Resume) + } + + // Second call after Resume: read user's approval result + // isTarget: whether this Resume targets the current Tool (one Resume targets only one Tool) + // hasData: whether the Resume carries approval result data + // data: the approval result passed in by the user + isTarget, hasData, data := tool.GetResumeContext[*ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return doSomething(storedArgs) // Execute the actual operation using saved arguments + } + return "Operation rejected by user", nil + } + + // Other cases (isTarget=false means this Resume does not target the current Tool): re-interrupt + return "", tool.StatefulInterrupt(ctx, &ApprovalInfo{ + ToolName: "my_tool", + ArgumentsInJSON: storedArgs, + }, storedArgs) +} +``` + +### ApprovalMiddleware + +`ApprovalMiddleware` is a generic approval middleware that can intercept specific Tool calls: + +```go +type approvalMiddleware struct { + *adk.BaseChatModelAgentMiddleware 
+} + +func (m *approvalMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + tCtx *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + // Only intercept Tools that require approval + if tCtx.Name != "execute" { + return endpoint, nil + } + + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + + if !wasInterrupted { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: args, + }, args) + } + + isTarget, hasData, data := tool.GetResumeContext[*commontool.ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return endpoint(ctx, storedArgs, opts...) + } + if data.DisapproveReason != nil { + return fmt.Sprintf("tool '%s' disapproved: %s", tCtx.Name, *data.DisapproveReason), nil + } + return fmt.Sprintf("tool '%s' disapproved", tCtx.Name), nil + } + + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) 
+ }, nil +} + +func (m *approvalMiddleware) WrapStreamableToolCall( + _ context.Context, + endpoint adk.StreamableToolCallEndpoint, + tCtx *adk.ToolContext, +) (adk.StreamableToolCallEndpoint, error) { + // If the agent is configured with StreamingShell, execute will use streaming calls; this method must be implemented to intercept it + if tCtx.Name != "execute" { + return endpoint, nil + } + return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) { + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + if !wasInterrupted { + return nil, tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: args, + }, args) + } + + isTarget, hasData, data := tool.GetResumeContext[*commontool.ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return endpoint(ctx, storedArgs, opts...) + } + if data.DisapproveReason != nil { + return singleChunkReader(fmt.Sprintf("tool '%s' disapproved: %s", tCtx.Name, *data.DisapproveReason)), nil + } + return singleChunkReader(fmt.Sprintf("tool '%s' disapproved", tCtx.Name)), nil + } + + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return nil, tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) + }, nil +} +``` + +### CheckPointStore + +`CheckPointStore` is the key component for implementing interrupt and resume: + +```go +type CheckPointStore interface { + // Save a checkpoint + Put(ctx context.Context, key string, checkpoint *Checkpoint) error + + // Get a checkpoint + Get(ctx context.Context, key string) (*Checkpoint, error) +} +``` + +**Why is CheckPointStore needed?** + +- Saves state on interrupt: Tool arguments, execution position, etc. 
+- Loads state on resume: continues execution from the interrupt point +- Supports cross-process recovery: can resume even after a process restart + +## Interrupt/Resume Implementation + +### 1. Configure Runner with CheckPointStore + +```go +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: adkstore.NewInMemoryStore(), // In-memory storage +}) +``` + +### 2. Configure Agent with ApprovalMiddleware + +```go +agent, err := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ + // ... other configuration + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ + newApprovalMiddleware[M](), // Add the approval middleware + newSafeToolMiddleware[M](), // Convert Tool errors to strings (interrupt-type errors continue to propagate upward) + }, +}) +``` + +### 3. Handle Interrupt Events + +```go +checkPointID := sessionID + +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history), adk.WithCheckPointID(checkPointID)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{ + ShowToolCalls: true, + ShowToolResults: true, + CaptureInterrupt: true, +}) +if err != nil { + return err +} + +assistantText := result.AssistantText +if result.InterruptInfo != nil { + // Note: it is recommended to use the same stdin reader for both "user input" and "approval y/n" + // to avoid approval input being treated as the next you> message + assistantText, err = handleInterrupt[M](ctx, runner, checkPointID, result.InterruptInfo, reader) + if err != nil { + return err + } +} + +_ = session.Append(msgops.NewAssistant[M](assistantText, nil)) +``` + +## Interrupt/Resume Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ User: execute command echo hello │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call │ + │ execute │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ ApprovalMiddleware │ + │ Intercepts Tool 
call │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Triggers Interrupt │ + │ Saves state to Store │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Returns Interrupt │ + │ event; waits for │ + │ user approval │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ User inputs y/n │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.ResumeWith... │ + │ Resumes execution │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Executes execute │ + │ or returns rejection │ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Interrupt**: The Agent's pause mechanism that halts before critical operations to await confirmation +- **Resume**: Resumes execution—continues after user approval or returns an error after rejection +- **ApprovalMiddleware**: A generic approval middleware that intercepts specific Tool calls +- **CheckPointStore**: Saves interrupt state; supports cross-process recovery +- **Human-machine collaboration**: Critical decisions are confirmed by humans, improving safety + +## Further Thinking + +**Other Interrupt scenarios:** + +- Multi-option approval: user selects one of multiple options +- Parameter completion: user provides missing parameters +- Conditional branching: user decides the execution path + +**Approval strategies:** + +- Whitelist: only approve sensitive operations +- Blacklist: approve all operations except safe ones +- Dynamic rules: decide whether to approve based on argument content diff --git a/content/en/docs/eino/quick_start/chapter_08_graph_tool.md b/content/en/docs/eino/quick_start/chapter_08_graph_tool.md index e842eadb624..882451766ca 100644 --- a/content/en/docs/eino/quick_start/chapter_08_graph_tool.md +++ b/content/en/docs/eino/quick_start/chapter_08_graph_tool.md @@ -1,24 +1,331 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 8: Graph Tool (complex workflows)" +title: "Chapter 8: Graph Tool (Complex 
Workflows)" weight: 8 --- -Goal of this chapter: understand the Graph Tool concept and build more complex workflows using the compose package. +Goal of this chapter: understand the Graph Tool concept, implement parallel chunk retrieval for large files, and introduce the compose package to build complex workflows. -## Code location +## Code Location - Entry code: [cmd/ch08/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch08/main.go) - RAG implementation: [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go) -## Full tutorial +## Prerequisites -- [ch08_graph_tool.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md) +Same as Chapter 1: you need to configure a working ChatModel (OpenAI or Ark). -## What you learn +## Running -- How to decompose a complex task into a deterministic execution graph. -- How to parallelize “chunking + retrieval” for large files and aggregate results back into a final answer. +Execute in the `examples/quickstart/chatwitheino` directory: + +```bash +# Set the project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch08 +``` + +Example output: + +``` +you> Please analyze the WebSocket handshake section in the RFC6455 document +[assistant] Let me analyze the document for you... +[tool call] answer_from_document(file_path: "rfc6455.txt", question: "WebSocket handshake process") +[tool result] Found 3 relevant fragments, generating answer... +[assistant] According to the RFC6455 document, the WebSocket handshake process is as follows... +``` + +## From Simple Tools to Graph Tools: Why Complex Workflows Are Needed + +In Chapter 4 we created simple Tools, each performing a single task. But in real-world scenarios, many tasks require multiple steps working together. 
+ +**Limitations of simple Tools:** + +- Single responsibility: each Tool does only one thing +- No parallelism: multiple independent tasks cannot execute simultaneously +- Hard to reuse: complex logic is difficult to split and compose + +**Important note: this chapter only demonstrates a small portion of compose/graph/workflow capabilities.** + +From a broader perspective, Eino's `compose` package provides very general, deterministic orchestration capabilities: you can organize any system that needs "deterministic business flows" into executable pipelines using `compose`'s Graph/Chain/Workflow, and it can **natively orchestrate all Eino components** (such as ChatModel, Prompt, Tools, Retriever, Embedding, Indexer, etc.), with a complete **callback** system and **interrupt/resume + checkpoint** support. + +**The role of Graph Tool:** + +- **Graph Tool is a Tool-wrapped compose workflow**: it packages `compose.Graph / compose.Chain / compose.Workflow` compilable orchestration artifacts as a Tool that an Agent can call +- **Supports parallelism/branching/composition**: provided by compose (parallel, branching, field mapping, subgraphs, etc.); Graph Tool simply exposes them as a Tool entry point +- **Supports state management and persistence**: passing data between nodes and saving/restoring run state via checkpoints +- **Supports interrupt and resume**: both workflow-internal interrupts (triggered inside a node) and tool-level interrupt wrapping (nested interrupt scenarios) + +**Simple analogy:** + +- **Simple Tool** = "single-step operation" (read a file) +- **Graph Tool** = "pipeline" (read → chunk → score → filter → generate answer) + +## Key Concepts + +### compose.Workflow + +`compose.Workflow` is the core component for building workflows in Eino: + +```go +wf := compose.NewWorkflow[Input, Output]() + +// Add nodes +wf.AddLambdaNode("load", loadFunc).AddInput(compose.START) +wf.AddLambdaNode("chunk", chunkFunc).AddInput("load") +wf.AddLambdaNode("score", 
scoreFunc).AddInput("chunk") +wf.AddLambdaNode("answer", answerFunc).AddInput("score") + +// Connect to end node +wf.End().AddInput("answer") +``` + +**Core concepts:** + +- **Node**: a processing unit in the workflow +- **Edge**: data flow between nodes +- **START**: the workflow entry point +- **END**: the workflow exit point + +### BatchNode + +`BatchNode` is used for parallel processing of multiple tasks: + +```go +scorer := batch.NewBatchNode(&batch.NodeConfig[Task, Result]{ + Name: "ChunkScorer", + InnerTask: scoreOneChunk, // Processing function for a single task + MaxConcurrency: 5, // Maximum concurrency +}) +``` + +**How it works:** + +1. Receives a list of tasks as input +2. Executes each task in parallel (limited by MaxConcurrency) +3. Collects and returns all results + +### FieldMapping + +`FieldMapping` is used to pass data across nodes: + +```go +wf.AddLambdaNode("answer", answerFunc). + AddInputWithOptions("filter", // Get data from the filter node + []*compose.FieldMapping{compose.ToField("TopK")}, + compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, // Get data from the START node + []*compose.FieldMapping{compose.MapFields("Question", "Question")}, + compose.WithNoDirectDependency()) +``` + +**Why is FieldMapping needed?** + +- Passing data between non-adjacent nodes +- Merging multiple data sources into a single node +- Renaming data fields + +## Graph Tool Implementation + +### 1. Define Input/Output Structures + +```go +type Input struct { + FilePath string `json:"file_path" jsonschema:"description=Absolute path to the uploaded document file"` + Question string `json:"question" jsonschema:"description=The question to answer from the document"` +} + +type Output struct { + Answer string `json:"answer"` + Sources []string `json:"sources"` +} +``` + +### 2. 
Build the Workflow + +```go +func buildWorkflow(cm model.BaseChatModel) *compose.Workflow[Input, Output] { + wf := compose.NewWorkflow[Input, Output]() + + // load: read the file + wf.AddLambdaNode("load", compose.InvokableLambda( + func(ctx context.Context, in Input) ([]*schema.Document, error) { + data, err := os.ReadFile(in.FilePath) + if err != nil { + return nil, err + } + return []*schema.Document{{Content: string(data)}}, nil + }, + )).AddInput(compose.START) + + // chunk: split into chunks + wf.AddLambdaNode("chunk", compose.InvokableLambda( + func(ctx context.Context, docs []*schema.Document) ([]*schema.Document, error) { + var out []*schema.Document + for _, d := range docs { + out = append(out, splitIntoChunks(d.Content, 800)...) + } + return out, nil + }, + )).AddInput("load") + + // score: parallel scoring + scorer := batch.NewBatchNode(&batch.NodeConfig[scoreTask, scoredChunk]{ + Name: "ChunkScorer", + InnerTask: newScoreWorkflow(cm), + MaxConcurrency: 5, + }) + + wf.AddLambdaNode("score", compose.InvokableLambda( + func(ctx context.Context, in scoreIn) ([]scoredChunk, error) { + tasks := make([]scoreTask, len(in.Chunks)) + for i, c := range in.Chunks { + tasks[i] = scoreTask{Text: c.Content, Question: in.Question} + } + return scorer.Invoke(ctx, tasks) + }, + )). + AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + + // filter: sort descending by score, keep up to top-3 chunks with score ≥ 3. 
+ wf.AddLambdaNode("filter", compose.InvokableLambda( + func(ctx context.Context, scored []scoredChunk) ([]scoredChunk, error) { + sort.Slice(scored, func(i, j int) bool { + return scored[i].Score > scored[j].Score + }) + const maxK = 3 + var top []scoredChunk + for _, c := range scored { + if c.Score < 3 { + break + } + top = append(top, c) + if len(top) == maxK { + break + } + } + return top, nil + }, + )).AddInput("score") + + // answer: synthesize a response from top-k chunks, or return a not-found message if empty. + wf.AddLambdaNode("answer", compose.InvokableLambda( + func(ctx context.Context, in synthIn) (Output, error) { + if len(in.TopK) == 0 { + return Output{ + Answer: fmt.Sprintf("No relevant content found in the document for: %q", in.Question), + }, nil + } + return synthesize(ctx, cm, in) + }, + )). + AddInputWithOptions("filter", []*compose.FieldMapping{compose.ToField("TopK")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + + wf.End().AddInput("answer") + + return wf +} +``` + +### 3. Wrap as a Tool + +```go +func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, error) { + wf := buildWorkflow(cm) + return graphtool.NewInvokableGraphTool[Input, Output]( + wf, + "answer_from_document", + "Search a large uploaded document for content relevant to a question and synthesize a "+ + "cited answer from the most relevant passages. 
"+ + "Use this instead of read_file when the document may be too large to fit in context.", + ) +} +``` + +**Key code snippet** (note: this is a simplified snippet that cannot be run directly; see [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go) for the full code): + +```go +func BuildTool[M adk.MessageType](ctx context.Context, cm model.BaseModel[M]) (tool.BaseTool, error) { +// Build the workflow +wf := compose.NewWorkflow[Input, Output]() + +// Add nodes +wf.AddLambdaNode("load", loadFunc).AddInput(compose.START) +wf.AddLambdaNode("chunk", chunkFunc).AddInput("load") +wf.AddLambdaNode("score", scoreFunc). + AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + +// Wrap as a Tool +return graphtool.NewInvokableGraphTool[Input, Output](wf, "answer_from_document", "...") +} +``` + +## Graph Tool Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ Input: file_path, question │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ load: read file │ + │ Output: []*Document │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ chunk: split │ + │ Output: []*Document │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ score: parallel │ + │ scoring │ + │ (MaxConcurrency=5) │ + │ Output: []scoredChunk│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ filter: select top-k │ + │ Output: []scoredChunk│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ answer: generate │ + │ answer │ + │ Output: Output │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return result │ + │ {answer, sources} │ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Graph Tool**: wraps complex workflows as a Tool, supporting multi-step 
coordination +- **compose.Workflow**: the core component for building workflows +- **BatchNode**: parallel processing of multiple tasks +- **FieldMapping**: passing data across nodes +- **Interrupt and resume**: Graph Tool supports the Checkpoint mechanism + +## Further Thinking + +**Other Graph Tool applications:** + +- Multi-document RAG: processing multiple documents in parallel +- Multi-model collaboration: different models handling different tasks +- Complex decision trees: selecting different branches based on conditions + +**Performance optimization:** + +- Adjust MaxConcurrency to control parallelism +- Use caching to avoid redundant computation +- Use streaming output to improve user experience diff --git a/content/en/docs/eino/quick_start/chapter_09_skill_console.md b/content/en/docs/eino/quick_start/chapter_09_skill_console.md index ee7a57add54..7edc2a8de49 100644 --- a/content/en/docs/eino/quick_start/chapter_09_skill_console.md +++ b/content/en/docs/eino/quick_start/chapter_09_skill_console.md @@ -1,59 +1,56 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] title: "Chapter 9: Skill (Console)" weight: 9 --- -Goal of this chapter: on top of Chapter 8 (RAG + Interrupt/Resume + Checkpoint), introduce the `skill` middleware so the agent can discover and load reusable skill documents (`SKILL.md`) and invoke them via tool calls. +Goal of this chapter: building on Chapter 8 (RAG + Interrupt/Resume + Checkpoint), introduce the `skill` package and use `skill middleware` to inject and manage skills, enabling the Agent to discover and load a set of reusable skill documents (`SKILL.md`) and use them via tool calls when needed. 
-## Code location +## Code Location -- Entry: [cmd/ch09/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch09/main.go) +- Entry code: [cmd/ch09/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch09/main.go) - Sync script: [scripts/sync_eino_ext_skills.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/scripts/sync_eino_ext_skills.go) ## Prerequisites -- Same as Chapter 1: configure a ChatModel (OpenAI or Ark) -- Prepare skills provided by the eino-ext PR (`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) +- Same as Chapter 1: you need to configure a working ChatModel (OpenAI or Ark) +- Have the skills documents from the `eino-ext` PR ready (`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) -Why these four? +`skill middleware` supports integrating various skills. This chapter only uses the four Eino-related skills as examples to demonstrate how to integrate skills with `skill middleware`. Why these four? -ChatWithEino is positioned as “help users learn Eino and assist with Eino coding using AI.” These four skills cover the key knowledge areas: +ChatWithEino is positioned as "helping users learn the Eino framework and attempting AI-assisted Eino code writing." These four skill documents cover exactly the key knowledge areas needed for this goal. -- `eino-guide`: entry point and navigation (where to start, how to run quickly) -- `eino-component`: component interfaces and implementation references (Model/Embedding/Retriever/Tool/Callback, etc.) -- `eino-compose`: orchestration and deterministic workflow references (Graph/Chain/Workflow, etc.) -- `eino-agent`: ADK/Agent references (Agent/Runner/Middleware/Filesystem/Human-in-the-loop, etc.) 
+Sources of skills can be: -Skill sources: +- The `eino-ext` repository local path (the script automatically reads `/skills/...`) +- Or a directory where you have installed skills (the directory should contain the four subdirectories mentioned above) -- Local path to the `eino-ext` repository (the script reads `/skills/...`) -- Or any directory where skills are already installed (containing the above subdirectories) +## From Graph Tool to Skill: Why "Skill Documents" Are Needed -## From Graph Tool to Skill: why “skill docs” +Chapter 8 solved the problem of "how to turn complex workflows into a callable Tool" (Graph Tool). But when building an Agent for framework learning/development assistance, you encounter another type of problem: **how to inject a set of stable, reusable knowledge and instructions into the Agent and let it load them on demand at runtime?** -Chapter 8 solves “how to make a complex workflow callable as a Tool” (Graph Tool). But for a framework-learning/development assistant agent, there is another problem: **how to inject stable, reusable knowledge and instructions into the agent, and let it load them on demand at runtime**. +This is the role of Skill: -That is the role of Skills: +- **Tool** is more like "actions/capabilities": reading files, running workflows, calling external systems +- **Skill** is more like "reusable knowledge/instruction packages": a set of markdown files (`SKILL.md` + `reference/*.md`) describing "how to do a certain type of task" -- **Tool** is more like an “action/capability”: read files, run workflows, call external systems -- **Skill** is more like a “reusable knowledge/instruction pack”: a set of markdown files (`SKILL.md` + `reference/*.md`) that describe “how to do something” +`Skill middleware` is responsible for integrating skills into the agent. After registering the skill middleware, the Agent can read a specific Skill on demand via the `skill` tool. 
Simple analogy: -- **Tool** = “what you can do” (function/interface) -- **Skill** = “how to do it” (reusable handbook/manual) +- **Tool** = "what can be done" (function/interface) +- **Skill** = "how to do it" (reusable manual/operation guide) -## Run +## Running -In `quickstart/chatwitheino`, do: +Execute in the `quickstart/chatwitheino` directory: -### 1) Sync eino-ext skills into a local directory +### 1) Sync eino-ext skills to a local directory -To let the `skill` middleware discover skills, place them under a single directory and follow the scan convention: +For the `skill` middleware to "discover" these skills, they need to be placed in a unified directory that satisfies the scanning convention: - `EINO_EXT_SKILLS_DIR//SKILL.md` @@ -66,32 +63,33 @@ go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/e Notes: - `-src` supports two forms: - - The root of the `eino-ext` repo (the script reads `/skills/...`) - - A directory where skills are already installed (should contain `eino-guide/`, `eino-component/`, etc.) + - The `eino-ext` repository root (the script automatically reads `/skills/...`) + - A directory where you have installed skills (should contain `eino-guide/`, `eino-component/`, etc. subdirectories) - `-dest` defaults to `./skills/eino-ext` (can be omitted) ### 2) Start Chapter 9 ```bash -EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext go run ./cmd/ch09 +export EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext +go run ./cmd/ch09 ``` -Output example (snippet): +Example output (excerpt): ``` Skills dir: /.../skills/eino-ext Enter your message (empty line to exit): ``` -## Enable Skill in DeepAgent +## Enabling Skill in DeepAgent -Skill invocation is not automatic. You must register the `skill` middleware when building the agent. 
It’s a three-step setup: +The "Skill is callable" behavior in this chapter does not happen automatically—you need to register the `Skill middleware` when building the Agent. The core steps are: -1. Use a local filesystem backend (this chapter uses `eino-ext/adk/backend/local`) to provide file reading/Glob -2. Use `skill.NewBackendFromFilesystem` to turn `EINO_EXT_SKILLS_DIR` into a skill backend -3. Use `skill.NewMiddleware` to create the middleware and attach it to DeepAgent’s `Handlers` +1. Use a local filesystem backend (this chapter uses `eino-ext/adk/backend/local`) to provide file reading/Glob capabilities +2. Use `skill.NewBackendFromFilesystem` to turn `EINO_EXT_SKILLS_DIR` into a Skill Backend +3. Use `skill.NewTyped[M]` to generate a generic `Skill middleware` and add it to the DeepAgent's `Handlers` -**Key snippet (simplified; see cmd/ch09/main.go for full code):** +**Key code snippet (note: this is a simplified snippet that cannot be run directly; see cmd/ch09/main.go for the full code):** ```go backend, _ := localbk.NewBackend(ctx, &localbk.Config{}) @@ -100,43 +98,43 @@ skillBackend, _ := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesys Backend: backend, BaseDir: skillsDir, // = $EINO_EXT_SKILLS_DIR }) -skillMiddleware, _ := skill.NewMiddleware(ctx, &skill.Config{ +skillMiddleware, _ := skill.NewTyped[M](ctx, &skill.TypedConfig[M]{ Backend: skillBackend, }) -agent, _ := deep.New(ctx, &deep.Config{ +agent, _ := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ ChatModel: cm, Backend: backend, StreamingShell: backend, - Handlers: []adk.ChatModelAgentMiddleware{ + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ skillMiddleware, - // ... other middlewares like approval/safeTool/retry + // ... other middlewares such as approval/safeTool/retry, etc. 
}, }) ``` -Notes: +Additional notes: -- This quickstart checks `EINO_EXT_SKILLS_DIR` existence at runtime: if it exists, it registers `skillMiddleware`; otherwise it skips it (the agent still runs and can use RAG tools). -- Skill tool input is JSON: `{"skill": ""}`, e.g. `{"skill":"eino-guide"}`. +- This quickstart checks for the existence of `EINO_EXT_SKILLS_DIR` in code to ensure it "can still run without skills configured": the `skillMiddleware` is only registered if the directory exists; otherwise it is skipped (you can still have conversations and use the RAG tool). +- The Skill tool's input is a JSON: `{"skill": ""}`, for example `{"skill":"eino-guide"}`. -## Quick verification (recommended) +## Quick Verification (Recommended) -After startup, send a prompt that forces a skill tool call to verify that skills are discovered and loadable: +After starting, enter a command that explicitly asks the model to call the skill tool (to verify that skills have been discovered and can be loaded): ``` Use the skill tool with skill="eino-guide" and tell me what the entry point is for getting started. 
``` -You should see output similar to: +You should see output similar to the following in the console: - `[tool result] Launching skill: eino-guide` -- Tool result includes `Base directory for this skill: .../eino-guide` +- The tool result contains `Base directory for this skill: .../eino-guide` -## What you will see +## What You Will See - When the model calls the skill tool, the console prints: - `[tool call] ...` - - `[tool result] ...` (truncated) -- Sessions are stored under `SESSION_DIR` (default `./data/sessions`) and can be resumed: + - `[tool result] ...` (results are truncated for display) +- Sessions are saved by default in `./data/sessions_agentic`, and can be restored: - `go run ./cmd/ch09 --session ` diff --git a/content/en/docs/eino/quick_start/chapter_11_turnloop.md b/content/en/docs/eino/quick_start/chapter_11_turnloop.md new file mode 100644 index 00000000000..6ca94b0208c --- /dev/null +++ b/content/en/docs/eino/quick_start/chapter_11_turnloop.md @@ -0,0 +1,247 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: "Chapter 11: TurnLoop — Preemption, Abort, and Multi-Turn Lifecycle" +weight: 11 +--- + +In the previous chapter, we used `adk.Runner` to build a complete A2UI web application. It works fine, but try this scenario: + +> You ask the Agent a complex question, it starts calling tools and generating a long answer... but you suddenly realize you asked the wrong thing and want to switch to a different question. + +In the previous chapter's Runner mode, your only options are to wait for it to finish or refresh the page and lose everything. + +This chapter introduces `adk.TurnLoop`, enabling two new user-facing capabilities for the Agent: **preemption** and **abort**. + +## Prerequisites + +Same as Chapter 1: you need a configured, usable ChatModel (OpenAI or Ark). See the "Prerequisites" section in Chapter 1 for details. + +## Run & Experience + +In the `quickstart/chatwitheino` directory, execute: + +```bash +go run . 
+``` + +Open your browser to `http://localhost:8080`, then try the following: + +### Experience Preemption + +1. Send a question that triggers a long answer, e.g., "Explain all of Eino's components in detail" +2. **While the Agent is still responding**, send a new message, e.g., "Never mind, just tell me what ChatModel is" +3. Observe: the old response stops immediately, and the Agent begins answering the new question + +### Experience Abort + +1. Send a question +2. **While the Agent is responding**, click the **Abort button** in the upper right corner +3. Observe: the Agent stops immediately and produces no further output + +Neither of these capabilities existed in the previous chapter's Runner version. Below we explain how they're implemented. + +## Code Location + +- Entry code: [main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) +- Agent construction: [agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) +- TurnLoop server: [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## Why Runner Can't Do This + +In the previous chapter's `cmd/ch10`, each `/sessions/:id/chat` request calls `runner.Run(ctx, messages)` once. Runner is a **single-turn** model — call once, execute once, done. If the user sends another message while the Agent is executing, Runner has no "running loop" to receive it. + +TurnLoop is a **persistent multi-turn execution loop**. It stays idle between turns, ready to receive new input via `Push()` and respond immediately at any time. Because there's a continuously running loop, preemption and abort become possible — you can interrupt an ongoing turn or stop the entire loop directly. + + + + + + + + + +
| Capability | Ch10 (Runner, single-turn) | Ch11 (TurnLoop, multi-turn) |
| --- | --- | --- |
| Streaming output | ✅ | ✅ |
| Approval / interrupt | ✅ | ✅ |
| Cross-turn persistence, real-time response to new input | ❌ Each `Run()` is independent | ✅ `Push()` at any time |
| Preempt an ongoing response | ❌ | ✅ `Push(item, WithPreempt(...))` |
| Abort Agent | ❌ | ✅ `loop.Stop(WithImmediate())` |
| Flexible per-turn input construction | ❌ Business layer assembles manually | ✅ `GenInput` callback |
    + +## TurnLoop's Core Model + +TurnLoop is a **push-based event loop that manages Agent execution in units of turns**. Unlike Runner's "call once, execute once" model, TurnLoop runs continuously: after a turn ends, it enters idle waiting; when new items arrive, it immediately starts the next turn. + +``` +Push(item) → [queue] → GenInput(items) → Agent.Run() → OnAgentEvents(events) + ↑ │ + └──── idle wait / next turn ←──┘ +``` + +Key concepts: + +- **Item**: The carrier of user input. In this example, defined as `ChatItem`, which can carry user messages or approval decisions +- **GenInput**: Builds Agent input from queued items (choosing which items to consume and which to keep for the next turn) +- **OnAgentEvents**: Receives the Agent's output event stream, responsible for rendering and persistence +- **Push**: Pushes a new item into the queue, optionally with preemption options + +## One Session Per TurnLoop + +In this example's web scenario, each chat session corresponds to one TurnLoop instance. When the user sends their first message, the server creates a TurnLoop for that session and calls `Run()` to start it; subsequent messages are fed into the same loop via `Push()`. This loop stays idle between turns until the session is deleted or the user aborts. + +This is TurnLoop's most typical usage pattern: **the loop's lifecycle is bound to the user session**. A long-running TurnLoop makes preemption and abort natural operations — because "the running loop" always exists, and new input can be fed in at any time. 
+ +## Normal Flow: idle → new message → response → idle + +The simplest scenario is the user asking questions sequentially, waiting for answers, then asking the next: + +```go +// When the user sends the first message, create and start TurnLoop +loop := adk.NewTurnLoop(cfg) +loop.Push(&ChatItem{Query: "hello"}) +loop.Run(ctx) +// → GenInput builds input → Agent executes → OnAgentEvents streams output +// → Turn ends, TurnLoop enters idle waiting + +// User sends second message (loop is idle) +loop.Push(&ChatItem{Query: "explain Eino's architecture"}) +// → TurnLoop wakes up, starts new turn: GenInput → Agent → OnAgentEvents → idle +``` + +This flow is indistinguishable from the previous chapter's Runner in user experience — the difference is that TurnLoop's loop **persists**, without needing to be recreated each time. Once the user sends a new message while the Agent is still responding, we enter the "preemption" scenario below. + +## How Preemption Works + +When the user sends a new message while the Agent is responding, the business layer triggers preemption with a single line of code: + +```go +loop.Push(item, adk.WithPreempt[*ChatItem, M](adk.AfterToolCalls)) +``` + +After TurnLoop receives this instruction: + +1. Waits for the current tool call to complete (`AfterToolCalls` means don't interrupt executing tools to avoid inconsistent state) +2. Cancels the current turn — OnAgentEvents' context is cancelled, the old turn exits +3. Takes new items from the queue, builds input via GenInput, starts a new turn + +Preemption mode can be chosen based on business needs: + + + + + + +
| Mode | Specific Behavior |
| --- | --- |
| `AfterToolCalls` | Waits for currently executing tool calls to complete, then cancels the current turn and starts a new one |
| `AfterChatModel` | Waits for the current model call to complete, then cancels the current turn and starts a new one |
| `AnySafePoint` | Cancels at any safe point (e.g., between tool calls, between model calls) and immediately starts a new turn |
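
The three modes differ only in which step boundary counts as a safe point for cancelling the running turn. The sketch below is a toy model in plain Go and not Eino's implementation: `stepKind`, `preemptMode`, `safePoint`, and `runTurn` are hypothetical names, and it assumes that a preempt request arriving mid-step stays pending until the first step boundary the chosen mode allows.

```go
package main

import "fmt"

// stepKind is a hypothetical label for one unit of work inside a turn.
type stepKind int

const (
	modelCall stepKind = iota
	toolCall
)

// preemptMode mirrors the three modes described in this section.
// These are toy names, not the adk constants.
type preemptMode int

const (
	afterToolCalls preemptMode = iota
	afterChatModel
	anySafePoint
)

// safePoint reports whether the turn may be cancelled right after a step
// of the given kind has finished, under mode m.
func safePoint(m preemptMode, finished stepKind) bool {
	switch m {
	case afterToolCalls:
		return finished == toolCall
	case afterChatModel:
		return finished == modelCall
	default: // anySafePoint: every step boundary qualifies
		return true
	}
}

// runTurn walks the steps of one turn in order. preemptAt is the index of the
// step during which a preempt request arrives (-1 for none). It returns how
// many steps completed before the turn was cancelled.
func runTurn(steps []stepKind, preemptAt int, m preemptMode) int {
	pending := false
	for i, k := range steps {
		if i == preemptAt {
			pending = true
		}
		// Step i always runs to completion in this model; cancellation is
		// only considered at the boundary after it.
		if pending && safePoint(m, k) {
			return i + 1
		}
	}
	return len(steps)
}

func main() {
	turn := []stepKind{modelCall, toolCall, modelCall, toolCall, modelCall}
	// The preempt request arrives while step 1 (a tool call) is running.
	for _, m := range []preemptMode{afterToolCalls, afterChatModel, anySafePoint} {
		fmt.Println(runTurn(turn, 1, m))
	}
}
```

Under these assumptions the program prints 2, 3, 2: the tool call in flight always finishes, and only the `afterChatModel` variant lets one more model call complete before the turn is cancelled.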
    + +> In this example, TurnLoop runs in a separate goroutine while the HTTP handler needs to write the event stream to an SSE response. The two coordinate via channels (see the `iterEnvelope`/`iterResult` and `handlerDone` signal mechanism in [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)). These are HTTP adaptation layer details and not part of the TurnLoop API itself. + +## How Abort Works + +Abort is simpler — directly stop the entire TurnLoop: + +```go +loop.Stop(adk.WithImmediate()) // Immediate cancel, don't wait for current turn +loop.Wait() // Wait for complete exit +``` + +### Three Stop Modes + + + + + + +
| Mode | Specific Behavior |
| --- | --- |
| `loop.Stop()` | Turn boundary exit: waits for the current turn to complete, then exits |
| `loop.Stop(WithImmediate())` | Immediate exit: cancels the current turn's context |
| `loop.Stop(WithGraceful())` | Safe point exit: exits at the next safe point (e.g., between tool calls) |
    + +## TurnLoop Configuration + +When creating a TurnLoop, specify callbacks and options via `TurnLoopConfig`: + +```go +cfg := adk.TurnLoopConfig[*ChatItem, M]{ + // GenInput: called at the start of each turn, decides "what the Agent sees this turn" + // Selects items from the queue to build Agent input, returns Consumed (processed this turn) and Remaining (kept for subsequent turns) + GenInput: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], items []*ChatItem) (*adk.GenInputResult[*ChatItem, M], error) { + // ...build AgentInput, persist user messages... + }, + + // PrepareAgent: called once per turn, returns the Agent to use this turn + // This example returns the same Agent directly, but you could dynamically select different Agents based on items + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], consumed []*ChatItem) (adk.TypedAgent[M], error) { + return agent, nil + }, + + // OnAgentEvents: receives the Agent's event stream, responsible for rendering output and persisting intermediate messages + // This example transfers the event stream to the HTTP handler via channel for SSE output + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[*ChatItem, M], events *adk.AsyncIterator[*adk.TypedAgentEvent[M]]) error { + // ...hand events to HTTP handler, wait for consumption to complete... + }, + + // The following three fields are for declarative checkpoint (approval recovery), detailed in the next section + GenResume: makeGenResume(), + Store: checkpointStore, + CheckpointID: sessionID, +} + +loop := adk.NewTurnLoop(cfg) +``` + + + + + + + + +
| Callback | When Called | Responsibility |
| --- | --- | --- |
| `GenInput` | When items are in the queue | Select which items to consume, build Agent input (can decide which items to keep for next turn) |
| `PrepareAgent` | After `GenInput` | Return the Agent instance for this turn, supports dynamic Agent configuration adjustment |
| `OnAgentEvents` | When Agent produces event stream | Consume events, render output, persist results; the core entry point for business-layer Agent output handling |
| `GenResume` | When resuming from checkpoint | Extract approval results from newly Pushed items, build `ResumeParams` to automate approval recovery |
| `Store` + `CheckpointID` | | Enable declarative checkpoint; TurnLoop automatically handles execution state save and restore |
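
The consume/keep decision that `GenInput` makes can be illustrated without the ADK types. The following is a minimal sketch under stated assumptions, not the example project's actual policy: `item` is a hypothetical stand-in for `ChatItem`, and `splitQueue` applies an invented rule (an approval decision is handled alone; otherwise all leading user messages are batched into one turn).

```go
package main

import "fmt"

// item is a hypothetical stand-in for the ChatItem type used in this chapter.
type item struct {
	Query    string
	Approval bool // true when the item carries an approval decision
}

// splitQueue decides which queued items this turn consumes and which stay
// queued for a later turn. The rule here is illustrative only.
func splitQueue(queue []item) (consumed, remaining []item) {
	if len(queue) == 0 {
		return nil, nil
	}
	// An approval decision at the head of the queue is processed by itself.
	if queue[0].Approval {
		return queue[:1], queue[1:]
	}
	// Otherwise batch every user message up to the first approval decision.
	n := 0
	for n < len(queue) && !queue[n].Approval {
		n++
	}
	return queue[:n], queue[n:]
}

func main() {
	queue := []item{
		{Query: "hello"},
		{Query: "explain Eino's architecture"},
		{Approval: true},
		{Query: "thanks"},
	}
	consumed, remaining := splitQueue(queue)
	fmt.Println(len(consumed), len(remaining))
}
```

In the real callback the two slices would presumably be returned as the consumed and remaining parts of the `GenInput` result; the exact field names and types should be taken from the ADK source rather than from this sketch.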
    + +> For the complete callback implementations, refer to [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go). + +## Declarative Checkpoint: Automating Approval Recovery + +In Chapter 7 (Runner mode), approval recovery required the business layer to manually call `runner.ResumeWithParams()` and determine whether "this is a normal execution or a recovery execution." TurnLoop provides a more concise approach — declare `Store` and `CheckpointID` in the configuration (see previous section), and TurnLoop automatically handles save and restore: + +1. When Agent execution reaches an approval interrupt, TurnLoop automatically saves execution state to `Store` (using `CheckpointID` as key) +2. After the user makes an approval decision, the business layer creates a new TurnLoop (with the **same** `CheckpointID`) and Pushes the approval item +3. When the new TurnLoop `Run()`s, it detects the checkpoint exists and **automatically calls `GenResume`** (instead of `GenInput`) to get recovery parameters +4. Agent continues execution from the interrupt point + +`GenResume`'s responsibility is to extract the approval result from newly Pushed items and build `ResumeParams`: + +```go +GenResume: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], + canceledItems, unhandledItems, newItems []*ChatItem, +) (*adk.GenResumeResult[*ChatItem, M], error) { + // newItems contains the item Pushed during approval recovery + item := newItems[0] + return &adk.GenResumeResult[*ChatItem, M]{ + ResumeParams: &adk.ResumeParams{ + InterruptID: item.InterruptID, + ApprovalResult: item.ApprovalResult, + }, + }, nil +} +``` + +Compared to Runner's `ResumeWithParams()`, declarative checkpoint frees the business layer from managing the "normal execution vs recovery execution" branching — TurnLoop automatically chooses `GenInput` or `GenResume` based on whether a checkpoint exists. 
+ +## Chapter Summary + +- **TurnLoop** is a persistent multi-turn execution loop whose lifecycle is bound to the user session +- **Normal flow**: `Push(item)` → GenInput → Agent → OnAgentEvents → idle → wait for next Push +- **Preemption**: `Push(item, WithPreempt(AfterToolCalls))` — one line of code cancels the current turn and starts a new one +- **Abort**: `loop.Stop(WithImmediate())` — one line of code terminates the entire loop +- **Declarative checkpoint**: Configure `Store` + `CheckpointID`, and TurnLoop automatically handles interrupt save and restore +- For complete callback implementations, refer to [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## Series Wrap-Up: Complete Agent Application Skeleton + +By this chapter, we've used a runnable Agent to thread together Eino's core capabilities: + +- **Runtime**: Runner / TurnLoop drive execution, supporting streaming output, preemption, and abort +- **Tool layer**: Filesystem / Shell tool capabilities, with safe tool error handling +- **Middleware**: Pluggable middleware/handlers for cross-cutting concerns like error handling, retry, and approval +- **Observability**: callbacks/trace capabilities connect critical paths for debugging and production monitoring +- **Human-Agent collaboration**: interrupt/resume + checkpoint supports approval, parameter supplementation, branch selection, and other interactive flows +- **Deterministic orchestration**: compose (graph/chain/workflow) organizes complex business flows into maintainable, reusable execution graphs +- **Business delivery**: A2UI protocol presents Agent capabilities to users as streaming UI +- **Execution control**: TurnLoop provides preemption, abort, and multi-turn lifecycle management to accommodate complex interaction needs in real business scenarios + +You can incrementally replace/extend any component on this skeleton: models, tools, storage, workflows, frontend rendering 
protocols — without starting from scratch. diff --git a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md index e00ac16fbe2..a12b39e3fd1 100644 --- a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md +++ b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md @@ -9,9 +9,9 @@ weight: 1 ## 1. API Breaking Changes -### 1.1 filesystem Shell Interface Renamed +### 1.1 filesystem Shell Interface Rename -**Location**: `adk/filesystem/backend.go` **Change Description**: Shell-related interfaces have been renamed and no longer embed the `Backend` interface. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Change Description**: Shell-related interfaces were renamed and no longer embed the `Backend` interface. **Before (v0.7.x)**: ```go type ShellBackend interface { @@ -41,41 +41,72 @@ type StreamingShell interface { - `ShellBackend` renamed to `Shell` - `StreamingShellBackend` renamed to `StreamingShell` -- Interfaces no longer embed `Backend`. If your implementation depends on the composite interface, you need to implement them separately **Migration Guide**: +- Interfaces no longer embed `Backend`; if your implementation relies on the combined interface, you need to implement them separately **Migration Guide**: ```go // Before type MyBackend struct {} func (b *MyBackend) Execute(...) {...} -// MyBackend implementing ShellBackend needed to implement all Backend methods +// MyBackend implementing ShellBackend required implementing all Backend methods too // After type MyShell struct {} func (s *MyShell) Execute(...) 
{...} -// MyShell only needs to implement Shell interface methods +// MyShell only needs to implement the Shell interface methods // If you also need Backend functionality, implement both interfaces separately ``` --- +### 1.2 Filesystem Backend: Read Return Value Breaking Change + +- **Location**: adk/filesystem/backend.go +- **Change Description**: Backend.Read's return value changed incompatibly from returning a string to returning a *FileContent struct + +**Before (v0.7.x)**: + +```go +type Backend interface { + ... + Read(ctx context.Context, req *ReadRequest) (string, error) + ... + } +``` + +**After (v0.8.0)**: + +```go +type Backend interface { + ... + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + ... + } +``` + +**Impact:** + +- v0.7.x's Read interface returned `string`. v0.8.0's Read interface returns a pointer to the `FileContent` struct (`*FileContent`), which is a breaking change. +- For Backend implementors: you need to change the Read method implementation to return `*FileContent` instead of `string`. +- For Backend consumers: you need to upgrade to a Backend implementation that supports v0.8 and update calls to `Backend.Read` to handle the new `*FileContent` return value. + ## 2. Behavioral Breaking Changes -### 2.1 AgentEvent Sending Mechanism Change +### 2.1 AgentEvent Emission Mechanism Change -**Location**: `adk/chatmodel.go` **Change Description**: `ChatModelAgent`'s `AgentEvent` sending mechanism changed from eino callback mechanism to Middleware mechanism. **Before (v0.7.x)**: +**Location**: `adk/chatmodel.go` **Change Description**: `ChatModelAgent`'s `AgentEvent` emission mechanism changed from eino callback mechanism to Middleware mechanism. 
**Before (v0.7.x)**: -- `AgentEvent` was sent through eino's callback mechanism -- If users customized ChatModel or Tool Decorator/Wrapper, and the original ChatModel/Tool had embedded Callback points, `AgentEvent` would be sent **inside** the Decorator/Wrapper -- This applied to all ChatModels implemented in eino-ext, but may not apply to most user-implemented Tools and Tools provided by eino **After (v0.8.0)**: -- `AgentEvent` is sent through Middleware mechanism -- `AgentEvent` is sent **outside** user-customized Decorator/Wrapper **Impact**: +- `AgentEvent` was sent via eino's callback mechanism +- If users customized ChatModel or Tool Decorators/Wrappers, and the original ChatModel/Tool had embedded Callback points internally, `AgentEvent` would be sent **inside** the Decorator/Wrapper +- This applied to all ChatModels implemented in eino-ext, but might not apply to most user-implemented Tools or first-party Tools provided by eino **After (v0.8.0)**: +- `AgentEvent` is sent via Middleware mechanism +- `AgentEvent` is sent **outside** of user-customized Decorators/Wrappers **Impact**: - Under normal circumstances, users won't notice this change -- If users previously implemented their own ChatModel or Tool Decorator/Wrapper, the relative position of event sending will change -- Position change may cause `AgentEvent` content to change: previous events didn't include Decorator/Wrapper modifications, current events will include them **Reason for Change**: -- In normal business scenarios, we want emitted events to include Decorator/Wrapper modifications **Migration Guide**: If you previously wrapped ChatModel or Tool through Decorator/Wrapper, you need to implement the `ChatModelAgentMiddleware` interface instead: +- If users previously implemented their own ChatModel or Tool Decorators/Wrappers, the relative position of event emission changes +- The position change may also cause `AgentEvent` content to change: previously events didn't include changes made by 
Decorators/Wrappers, now events include them **Reason for Change**: +- Under normal business scenarios, emitted events should include changes made by Decorators/Wrappers **Migration Guide**: If you previously wrapped ChatModel or Tool via Decorators/Wrappers, migrate to implementing the `ChatModelAgentMiddleware` interface: ```go -// Before: Wrapping ChatModel through Decorator/Wrapper +// Before: wrapping ChatModel via Decorator/Wrapper type MyModelWrapper struct { inner model.BaseChatModel } @@ -85,19 +116,19 @@ func (w *MyModelWrapper) Generate(ctx context.Context, input []*schema.Message, return w.inner.Generate(ctx, input, opts...) } -// After: Implement WrapModel method of ChatModelAgentMiddleware +// After: implement ChatModelAgentMiddleware's WrapModel method type MyMiddleware struct{} func (m *MyMiddleware) WrapModel(ctx context.Context, chatModel model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) { return &myWrappedModel{inner: chatModel}, nil } -// For Tool Wrappers, implement WrapInvokableToolCall / WrapStreamableToolCall methods instead +// For Tool Wrappers, migrate to implementing WrapInvokableToolCall / WrapStreamableToolCall methods ``` ### 2.2 filesystem.ReadRequest.Offset Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `Offset` field changed from 0-based to 1-based. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Change Description**: The `Offset` field changed from 0-based to 1-based. **Before (v0.7.x)**: ```go type ReadRequest struct { @@ -112,6 +143,7 @@ type ReadRequest struct { ```go type ReadRequest struct { + FilePath string // Offset specifies the starting line number (1-based) for reading. // Line 1 is the first line of the file. 
@@ -124,21 +156,21 @@ type ReadRequest struct { **Migration Guide**: ```go -// Before: Read from line 0 (i.e., first line) +// Before: reading from line 0 (i.e., the first line) req := &ReadRequest{Offset: 0, Limit: 100} -// After: Read from line 1 (i.e., first line) +// After: reading from line 1 (i.e., the first line) req := &ReadRequest{Offset: 1, Limit: 100} // If you previously used Offset: 10 to mean starting from line 11 -// Now you need to use Offset: 11 +// Now use Offset: 11 ``` --- ### 2.3 filesystem.FileInfo.Path Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `FileInfo.Path` field is no longer guaranteed to be an absolute path. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Change Description**: The `FileInfo.Path` field no longer guarantees an absolute path. **Before (v0.7.x)**: ```go type FileInfo struct { @@ -160,18 +192,25 @@ type FileInfo struct { **Impact**: -- Code that depends on `Path` being an absolute path may have issues -- Need to check and handle relative path cases +- Code that depends on `Path` being an absolute path may break +- You need to check for and handle relative paths --- ### 2.4 filesystem.WriteRequest Behavior Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `WriteRequest` write behavior changed from "error if file exists" to "overwrite if file exists". **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Change Description**: `WriteRequest`'s write behavior changed from "error if file exists" to "overwrite if file exists." **Before (v0.7.x)**: ```go // WriteRequest comment: // The file will be created if it does not exist, or error if file exists. +type WriteRequest struct { + // FilePath is the absolute path of the file to write. Must start with '/'. + // The file will be created if it does not exist, or error if file exists. + FilePath string + + ... 
+} ``` **After (v0.8.0)**: @@ -179,19 +218,26 @@ type FileInfo struct { ```go // WriteRequest comment: // Creates the file if it does not exist, overwrites if it exists. +type WriteRequest struct { + // FilePath is the path of the file to write. + FilePath string + + .... +} ``` **Impact**: -- Code that previously relied on "error if file exists" behavior will no longer error, but directly overwrite -- May cause unexpected data loss **Migration Guide**: -- If you need to preserve the original behavior, check if the file exists before writing +- Code that relied on the "error if file exists" behavior will no longer error and will silently overwrite instead +- May lead to unexpected data loss **Migration Guide**: +- If you need to preserve the original behavior, check whether the file exists before writing +- The old FilePath represented an absolute path; the new version no longer specifies FilePath as an absolute path. Scenarios that depend on absolute paths need corresponding FilePath adaptation --- ### 2.5 GrepRequest.Pattern Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `GrepRequest.Pattern` changed from literal matching to regular expression matching. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Change Description**: `GrepRequest.Pattern` changed from literal string matching to regex matching. **Before (v0.7.x)**: ```go // Pattern is the literal string to search for. This is not a regular expression. 
@@ -208,16 +254,16 @@ type FileInfo struct { **Impact**: - Search patterns containing regex special characters will behave differently -- For example, searching for `interface{}` now needs to be escaped as `interface\{\}` **Migration Guide**: +- For example, searching for `interface{}` now requires escaping to `interface\{\}` **Migration Guide**: ```go -// Before: Literal search +// Before: literal search req := &GrepRequest{Pattern: "interface{}"} -// After: Regex search, need to escape special characters +// After: regex search, special characters need escaping req := &GrepRequest{Pattern: "interface\\{\\}"} -// Or if searching for literals containing . * + ?, also need to escape +// Or if searching for literals containing . * + ?, also need escaping // Before req := &GrepRequest{Pattern: "config.json"} // After @@ -226,11 +272,37 @@ req := &GrepRequest{Pattern: "config\\.json"} --- +### 2.6 EditRequest.FilePath Semantic Change + +**Location**: `adk/filesystem/backend.go` **Change Description**: The `EditRequest.FilePath` comment no longer requires an absolute path. **Before (v0.7.x)**: + +```go +type EditRequest struct { + // FilePath is the absolute path of the file to edit. Must start with '/'. + FilePath string + .... + } +``` + +**After (v0.8.0)**: + +```go +type EditRequest struct { + // FilePath is the path of the file to edit. + FilePath string +} +``` + +**Impact**: + +- In the old version, `FilePath` was documented as an absolute path; the new version no longer guarantees that `FilePath` is absolute. Logic that depended on `FilePath` being absolute needs to be adapted accordingly. + ## Migration Recommendations -1. **Handle compile errors first**: Type changes (like Shell interface renaming) will cause compilation failures, need to fix first -2. **Pay attention to semantic changes**: `ReadRequest.Offset` changed from 0-based to 1-based, `Pattern` changed from literal to regex - these won't cause compile errors but will change runtime behavior -3. 
**Check file operations**: `WriteRequest` overwrite behavior change may cause data loss, requires additional checks -4. **Migrate Decorator/Wrapper**: If you have custom ChatModel/Tool Decorator/Wrapper, change to implement `ChatModelAgentMiddleware` -5. Upgrade backend implementations as needed: If using local/ark agentkit backend provided by eino-ext, upgrade to corresponding alpha versions: [local backend v0.2.0-alpha](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.0-alpha.1), [ark agentkit backend v0.2.0-alpha](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Fagentkit%2Fv0.2.0-alpha.1) -6. **Test verification**: After migration, perform comprehensive testing, especially for code involving file operations and search functionality +1. **Address compilation errors first**: Type changes (like Shell interface rename) will cause compilation failures and need to be fixed first +2. **Pay attention to semantic changes**: `ReadRequest.Offset` changing from 0-based to 1-based, `Pattern` changing from literal to regex — these won't cause compilation errors but will change runtime behavior +3. **Check file operations**: `WriteRequest`'s overwrite behavior change may lead to data loss and needs extra verification +4. **Migrate Decorators/Wrappers**: If you have custom ChatModel/Tool Decorators/Wrappers, migrate to implementing `ChatModelAgentMiddleware` +5. **Upgrade backend implementations as needed**: If using local/ark agentkit backends from eino-ext, upgrade to the corresponding latest versions: [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1) [adk/backend/agentkit/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Fagentkit%2Fv0.2.1) +6. 
**Test and verify**: After migration, perform comprehensive testing, especially code involving file operations and search functionality diff --git a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md index 24e6d20ae02..fd4b75a41f0 100644 --- a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md +++ b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md @@ -1,20 +1,17 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: v0.8.*-adk middlewares' +title: v0.8.*-adk middlewares weight: 8 --- -This document introduces the main new features and improvements in Eino ADK v0.8.*. - -> 💡 -> Currently in the v0.8.0.Beta version stage: [https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1) +This document introduces the major new features and improvements in Eino ADK v0.8.*. ## Version Highlights -v0.8 is a significant feature enhancement release that introduces a new middleware interface architecture, adds multiple practical middlewares, and provides enhanced observability support. +v0.8 is a significant feature enhancement release that introduces a brand-new middleware interface architecture, adds multiple practical middlewares, and provides enhanced observability support.
    @@ -28,9 +25,9 @@ Agent-level Callback support
    ## 1. ChatModelAgentMiddleware Interface > 💡 -> **Core Update**: A new middleware interface providing more flexible Agent extension mechanisms +> **Core Update**: A brand-new middleware interface providing more flexible Agent extension mechanisms -`ChatModelAgentMiddleware` is the most important architectural update in v0.8, providing unified extension points for `ChatModelAgent` and Agents built on top of it (such as `DeepAgent`). +`ChatModelAgentMiddleware` is the most important architectural update in v0.8, providing unified extension points for `ChatModelAgent` and Agents built upon it (such as `DeepAgent`). **Advantages over AgentMiddleware**: @@ -41,14 +38,14 @@ Agent-level Callback support Configuration ManagementScattered in closuresCentralized in struct fields -**Interface Methods**: +**Interface methods**: - `BeforeAgent` - Modify configuration before Agent runs -- `BeforeModelRewriteState` - Process state before model invocation -- `AfterModelRewriteState` - Process state after model invocation +- `BeforeModelRewriteState` - Process state before model call +- `AfterModelRewriteState` - Process state after model call - `WrapInvokableToolCall` - Wrap synchronous tool calls - `WrapStreamableToolCall` - Wrap streaming tool calls -- `WrapModel` - Wrap model invocation +- `WrapModel` - Wrap model calls **Usage**: @@ -59,27 +56,27 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ }) ``` -See [Eino ADK: ChatModelAgentMiddleware](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) for details +See [Eino ADK: ChatModelAgentMiddleware](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) for details. 
--- ### 1.1 Summarization Middleware > 💡 -> **Function**: Automatic conversation history summarization to prevent exceeding model context window limits +> **Feature**: Automatic conversation history summarization to prevent exceeding model context window limits -📚 **Detailed Documentation**: [Middleware: Summarization](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_summarization) +📚 **Detailed documentation**: [Middleware: Summarization](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_summarization) -When the token count of conversation history exceeds a threshold, automatically calls LLM to generate a summary and compress the context. +When the conversation history token count exceeds a threshold, automatically calls an LLM to generate a summary, compressing the context. -**Core Capabilities**: +**Core capabilities**: - Configurable trigger conditions (token threshold) - Support for retaining recent user messages -- Support for recording complete conversation history to files -- Provides pre and post processing hooks +- Support for recording complete conversation history to file +- Pre/post processing hooks -**Quick Start**: +**Quick start**: ```go mw, err := summarization.New(ctx, &summarization.Config{ @@ -93,19 +90,19 @@ mw, err := summarization.New(ctx, &summarization.Config{ ### 1.2 ToolReduction Middleware > 💡 -> **Function**: Tool result compression to optimize context usage efficiency +> **Feature**: Tool result compression to optimize context usage efficiency -📚 **Detailed Documentation**: [Middleware: ToolReduction](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_toolreduction) +📚 **Detailed documentation**: [Middleware: ToolReduction](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_toolreduction) Provides two-phase tool output management: - - - + + +
-| Phase | Trigger Time | Effect |
-| Truncation | After tool returns | Truncate overlong output, save to file |
-| Clear | Before model invocation | Clear historical tool results, free up tokens |
+| Phase | Trigger | Effect |
+| Truncation | After tool returns | Truncates overlong output, saves to file |
+| Clear | Before model call | Clears historical tool results, frees tokens |
    -**Quick Start**: +**Quick start**: ```go mw, err := reduction.New(ctx, &reduction.Config{ @@ -118,57 +115,57 @@ mw, err := reduction.New(ctx, &reduction.Config{ ### 1.3 Filesystem Middleware > 💡 -> **Function**: File system operation toolset +> **Feature**: File system operation toolset -📚 **Detailed Documentation**: [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) +📚 **Detailed documentation**: [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) -**New Capabilities**: +**New capabilities**: -- **Grep Enhancement**: Support for full regular expression syntax -- **New Options**: `CaseInsensitive`, `EnableMultiline`, `FileType` filtering -- **Custom Tool Names**: All filesystem tools support custom naming +- **Grep enhancement**: Full regular expression syntax support +- **New options**: `CaseInsensitive`, `EnableMultiline`, `FileType` filtering +- **Custom tool names**: All filesystem tools support custom naming ### 1.4 Skill Middleware > 💡 -> **Function**: Dynamic loading and execution of Skills +> **Feature**: Dynamic loading and execution of Skills -📚 **Detailed Documentation**: [Middleware: Skill](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_skill) +📚 **Detailed documentation**: [Middleware: Skill](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_skill) -**New Capabilities**: +**New capabilities**: -- **Context Modes**: Support for `fork` and `isolate` context modes -- **Custom Configuration**: Support for custom system prompts and tool descriptions -- **FrontMatter Extension**: Support for specifying agent and model via FrontMatter +- **Context mode**: Supports `fork` and `isolate` context modes +- **Custom configuration**: Supports custom system prompts and tool descriptions +- **FrontMatter extension**: Supports specifying agent and model via FrontMatter ### 1.5 
PlanTask Middleware > 💡 -> **Function**: Task planning and execution tools +> **Feature**: Task planning and execution tools -📚 **Detailed Documentation**: [Middleware: PlanTask](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_plantask) +📚 **Detailed documentation**: [Middleware: PlanTask](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_plantask) -Supports Agent creation and management of task plans, suitable for complex task scenarios requiring step-by-step execution. +Supports the Agent in creating and managing task plans, suitable for complex task scenarios requiring step-by-step execution. ### 1.6 ToolSearch Middleware > 💡 -> **Function**: Tool search with dynamic retrieval from a large number of tools +> **Feature**: Tool search, supporting dynamic retrieval from large tool collections -📚 **Detailed Documentation**: [Middleware: ToolSearch](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_toolsearch) +📚 **Detailed documentation**: [Middleware: ToolSearch](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_toolsearch) -When there are many tools, dynamically selects the most relevant tools through semantic search to avoid context overload. +When the number of tools is large, dynamically selects the most relevant tools through semantic search to avoid context overload. 
### 1.7 PatchToolCalls Middleware > 💡 -> **Function**: Patch dangling tool calls to ensure message history completeness +> **Feature**: Patches dangling tool calls to ensure message history integrity -📚 **Detailed Documentation**: [Middleware: PatchToolCalls](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_patchtoolcalls) +📚 **Detailed documentation**: [Middleware: PatchToolCalls](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_patchtoolcalls) -Scans message history and inserts placeholder messages for tool calls missing responses. Suitable for scenarios where tool calls are interrupted or cancelled. +Scans message history and inserts placeholder messages for tool calls missing responses. Suitable for scenarios where tool calls were interrupted or cancelled. -**Quick Start**: +**Quick start**: ```go mw, err := patchtoolcalls.New(ctx, nil) @@ -177,14 +174,14 @@ mw, err := patchtoolcalls.New(ctx, nil) ## 2. Agent Callback Support > 💡 -> **Function**: Agent-level callback mechanism for observation and tracing +> **Feature**: Agent-level callback mechanism for observation and tracing -Supports registering callbacks throughout the Agent execution lifecycle for logging, tracing, monitoring, and other functions. +Supports registering callbacks throughout the Agent's entire execution lifecycle for logging, tracing, monitoring, and other functions. 
-**Core Types**: +**Core types**: -- `AgentCallbackInput` - Callback input containing Agent input or resume information -- `AgentCallbackOutput` - Callback output containing Agent event stream +- `AgentCallbackInput` - Callback input, containing Agent input or resume information +- `AgentCallbackOutput` - Callback output, containing Agent event stream **Usage**: @@ -205,16 +202,16 @@ agent.Run(ctx, input, adk.WithCallbacks( )) ``` -See [Eino ADK: Agent Callback](/docs/eino/core_modules/eino_adk/adk_agent_callback) for details +See [Eino ADK: Agent Callback](/docs/eino/core_modules/eino_adk/adk_agent_callback) for details. --- ## 3. Language Setting > 💡 -> **Function**: Global language settings +> **Feature**: Global language setting -Supports global setting of ADK language preferences, affecting the language of built-in prompts and messages. +Supports global language preference for ADK, affecting built-in prompts and message language. **Usage**: @@ -228,7 +225,7 @@ adk.SetLanguage(adk.LanguageEnglish) // Set to English (default) ## Middleware Usage Recommendations > 💡 -> **Recommended Combination**: The following middlewares can be combined to cover most long conversation scenarios +> **Recommended combination**: The following middlewares can be used together to cover most long-conversation scenarios ```go handlers := []adk.ChatModelAgentMiddleware{ @@ -250,30 +247,30 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ > 💡 > Before upgrading to v0.8, please review the Breaking Changes documentation for all incompatible changes -📚 **Complete Documentation**: [Eino v0.8 Breaking Changes](/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes) +📚 **Full documentation**: [Eino v0.8 Breaking Changes](/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) -**Change Overview**: +**Changes overview**: - + - +
-| Type | Change Item |
+| Type | Change |
 | API Change | `ShellBackend` → `Shell` interface rename |
 | Behavior Change | `AgentEvent` sending mechanism changed to Middleware |
 | Behavior Change | `ReadRequest.Offset` changed from 0-based to 1-based |
 | Behavior Change | `FileInfo.Path` no longer guaranteed to be absolute path |
-| Behavior Change | `WriteRequest` changed from error on file exists to overwrite |
+| Behavior Change | `WriteRequest` changed from error on existing file to overwrite |
 | Behavior Change | `GrepRequest.Pattern` changed from literal to regular expression |
    ## Upgrade Guide -For detailed migration steps and code examples, please refer to: [Eino v0.8 Breaking Changes](/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes) +For detailed migration steps and code examples, refer to: [Eino v0.8 Breaking Changes](/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) -**Quick Checklist**: +**Quick checklist**: -1. Check if you are using `ShellBackend` / `StreamingShellBackend` interface (needs renaming) +1. Check if you use `ShellBackend` / `StreamingShellBackend` interfaces (need renaming) 2. Check `ReadRequest.Offset` usage (0-based → 1-based) -3. Check `GrepRequest.Pattern` usage (literal → regex, special characters need escaping) -4. Check if you depend on `WriteRequest`'s "error on file exists" behavior +3. Check `GrepRequest.Pattern` usage (literal → regular expression, special characters need escaping) +4. Check if you depend on `WriteRequest`'s "error on existing file" behavior 5. Check if you depend on `FileInfo.Path` being absolute path 6. If you have custom ChatModel/Tool Decorator/Wrapper, consider migrating to `ChatModelAgentMiddleware` 7. Run tests to verify functionality diff --git a/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md new file mode 100644 index 00000000000..ab4929edba6 --- /dev/null +++ b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md @@ -0,0 +1,61 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: v0.9.* agentic-runtime +weight: 9 +--- + +The theme of V0.9 is `agentic-runtime`. This release focuses on ADK's message protocol, Agent runtime control, and multi-turn runtime capabilities. 
While preserving the default `*schema.Message` path, it introduces `AgenticMessage` and supporting generic abstractions, laying the foundation for richer model-native Agent protocols, server-side tool calls, and runtime interruption and recovery. + +## 1. AgenticMessage and ADK Support + +V0.9 introduces `schema.AgenticMessage` for representing more complete Agentic message structures compared to the traditional `schema.Message`. + +- `AgenticMessage` adopts a content block model, supporting structured fragments for text, reasoning content, tool calls, tool results, server-side tools, MCP tools, and multimodal content. +- `[]ContentBlock` can more completely preserve the temporal ordering of blocks in different model protocol responses; new block types are also better suited for structures like tool use, reasoning, and streaming metadata in OpenAI Responses API, Claude, Gemini, and other protocols. +- `components/model` introduces the `AgenticModel` component for integrating model implementations that use `AgenticMessage` as input/output. +- ADK provides typed agent, typed event, typed runner, and typed `ChatModelAgent` support for the `AgenticMessage` path, enabling AgenticModel to enter ADK's Agent lifecycle. + +## 2. ChatModelAgent Capability Extensions + +V0.9 systematically enhances `ChatModelAgent`'s runtime control, model call reliability, and middleware extension points. + +### Cancel + +- New Agent Cancel capability for externally terminating a running Agent. +- Supports safe point cancellation, recursive cancellation, cancel timeout escalation, and checkpoint persistence during cancellation. +- Interrupts during cancellation are unified into cancel semantics; callers can distinguish active cancellation from normal business failures via `CancelError`. + +### Model Retry + +- Retry is extended from simple error retry to `ShouldRetry(ctx, RetryContext) -> RetryDecision`. 
+- Retry decisions can read model output, reject outputs that don't meet conditions, modify the next input, append model options, and override backoff. + +### Model Failover + +- New Model Failover capability for switching to a backup model after a model call failure. +- Failover decisions can read the failed attempt's output, error, original input, and attempt number, and select the next model to use. +- Supports input rewriting for backup models; also supports preferring the last successfully used model to reduce the cost of starting from a fixed primary model each time. + +### Middleware Enhancements + +- `ChatModelAgentMiddleware` adds `AfterAgent` for executing cleanup logic after Agent succeeds. +- Summarization, reduction, skill, filesystem, plan-task, patch-tool-calls, and other middlewares are generified to support the `AgenticMessage` path. +- Summarization middleware adds `TypedMiddleware.Summarize`, consolidating synchronous summarization capability from a standalone function into the middleware. +- Filesystem middleware enhances multimodal reading capabilities and adds PDF page validation. +- New `agentsmd` middleware for loading and injecting `AGENTS.md`-style project instructions. +- `ChatModelAgentState` adds `ToolInfos` and `DeferredToolInfos` as the primary path for middleware to adjust the model-visible tool set. +- `ToolInfos` represents tools directly visible to the current model call; `DeferredToolInfos` represents candidate tools discoverable by the model through tool search mechanisms. +- Tool search middleware supports three types of tool loading: using model-native tool search capability to load from deferred tools on demand; providing a fixed-schema `ToolSearchTool` per model protocol requirements for the model to search deferred tools; using Eino's custom `tool_search` tool independent of model-side protocols to retrieve tools and append hits to regular `ToolInfos`. 
+- Compose adds `AgenticToolsNode`; `ToolsNode` adds tool name and argument alias support. + +## 3. TurnLoop + +V0.9 introduces `TurnLoop` for elevating one-shot Agent runs into continuously running, externally-driven turn-level runtimes. + +- Multi-turn oriented: `TurnLoop` continuously receives external input; each turn independently plans input, constructs the Agent, and consumes events — suitable for long-lived interactive Agents. +- Input merging support: `GenInput` decides at turn boundaries which inputs to consume this turn and which to keep waiting, enabling applications to implement batching, deduplication, and merging of consecutive user inputs. +- Preemption support: `Push` with preempt options atomically writes new input and requests cancellation of the current turn, allowing high-priority input to interrupt a running Agent. +- Declarative checkpoint/resume support: on recovery, applications don't need to restore the input queue themselves; `TurnLoop` distinguishes interrupted inputs, unhandled inputs, and newly arrived inputs, and applications only need to declare how these inputs re-enter subsequent turns. diff --git a/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md new file mode 100644 index 00000000000..6e558b7475b --- /dev/null +++ b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md @@ -0,0 +1,198 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: Eino V0.9 Migration Notes +weight: 1 +--- + +This document lists API and semantic changes that existing users need to be aware of when upgrading from V0.8.x to V0.9 `agentic-runtime`. New capabilities not listed here generally don't affect the existing `*schema.Message` path. 
+ +## Explicit API Changes + +### Agent Transfer / Workflow Agent / Supervisor Marked as NOT RECOMMENDED + +V0.9 marks the entire multi-Agent collaboration mode based on Agent Transfer (full context sharing) as **NOT RECOMMENDED**. Affected public APIs include: + +**Agent Transfer related**: + +- `SetSubAgents` +- `AgentWithOptions` / `WithDisallowTransferToParent` / `WithHistoryRewriter` +- `ChatModelAgentConfig.Exit` / `ChatModelAgentConfig.OutputKey` +- `AgentWithDeterministicTransferTo` +- `OnSetSubAgents` / `OnSetAsSubAgent` / `OnDisallowTransferToParent` + +**Workflow Agent**: + +- `NewSequentialAgent` / `SequentialAgentConfig` +- `NewParallelAgent` / `ParallelAgentConfig` +- `NewLoopAgent` / `LoopAgentConfig` + +**Supervisor**: + +- `supervisor.New` / `supervisor.Config` + +> 💡 +> These APIs still work and won't cause compilation failures, but are not recommended for new projects. Experience has shown that the transfer mode where Agents share full conversation context doesn't outperform the tool-call mode in practice. + +Recommended migration directions: + +- Use `ChatModelAgent` + `AgentTool` (wrap sub-Agents as tools, invoke on demand). +- Use `DeepAgent` (structured subtask delegation). +- Both approaches offer better controllability, observability, and prompt cache efficiency. + +### ChatModelAgentMiddleware Adds AfterAgent + +`ChatModelAgentMiddleware` adds a new `AfterAgent` method. Types that manually implement this interface need to add this method, otherwise compilation will fail. + +Recommended approach: + +- If the middleware doesn't need special cleanup logic, embed `*adk.BaseChatModelAgentMiddleware`. +- If the middleware needs to clean up state, record events, or collect statistics after Agent succeeds, implement `AfterAgent(ctx, state)`. + +Impact scope: + +- Only affects user code that explicitly implements `ChatModelAgentMiddleware`. +- Code extending via `BaseChatModelAgentMiddleware` composition remains compatible. 
+ +### AgentMiddleware Struct Deprecated + +The `AgentMiddleware` struct and `ChatModelAgentConfig.Middlewares` field are marked as **Deprecated** and will be removed in a future version. + +> 💡 +> Both `AgentMiddleware` and the `Middlewares` field are deprecated. Please migrate to the interface-based Handlers (`ChatModelAgentMiddleware`) approach. + +Migration approach: + +- Migrate logic from `Middlewares []AgentMiddleware` to `Handlers []ChatModelAgentMiddleware`. +- `AgentMiddleware.BeforeChatModel` → implement `ChatModelAgentMiddleware.BeforeModelRewriteState`. +- `AgentMiddleware.AfterChatModel` → implement `ChatModelAgentMiddleware.AfterModelRewriteState`. +- `AgentMiddleware.WrapToolCall` → implement `ChatModelAgentMiddleware.WrapInvokableToolCall` / `WrapStreamableToolCall`. +- `AgentMiddleware.AdditionalInstruction` → modify `state.Instruction` in `BeforeModelRewriteState`. +- `AgentMiddleware.AdditionalTools` → modify `state.ToolInfos` in `BeforeModelRewriteState`. +- If the middleware doesn't need special logic, embed `*adk.BaseChatModelAgentMiddleware` for default no-op implementations. + +Impact scope: + +- All code using `AgentMiddleware` in `ChatModelAgentConfig.Middlewares` needs migration. +- In the current version, both approaches can coexist (Handlers execute after Middlewares), but early migration is recommended to avoid compilation failures when the deprecated APIs are removed in a future version. + +### summarization.SummarizeMessages Removed + +`summarization.SummarizeMessages` and `summarization.SummarizeOutput` are no longer exported. + +Migration approach: + +- Continue using `summarization.New` or `summarization.NewTyped` when constructing the summarization middleware. +- To trigger synchronous summarization proactively, use `TypedMiddleware.Summarize`. + +This adjustment consolidates summarization's configuration, state reading, and execution logic within the middleware, avoiding semantic divergence between standalone functions and runtime state. 
+ +## Capabilities Requiring Attention to Semantic Changes + +### Summarization Finalize Post-Processing Semantic Change + +In V0.8.x, the summarization middleware would first execute default summary post-processing, then call the user-configured `Finalize`. Therefore, custom `Finalize` received a `summary` that already included `PreserveUserMessages` replacement, `TranscriptFilePath` injection, and summary preamble. + +In V0.9, if `Config.Finalize` is set, the middleware passes the model-generated raw summary directly to `Finalize` without executing default post-processing. Affected configurations include: + +- `PreserveUserMessages` +- `TranscriptFilePath` + +Migration approach: + +- If you want to keep default post-processing, don't set `Finalize`; let the middleware use the default finalization path. +- If you must customize `Finalize` but still want default post-processing, first construct the default finalizer via `DefaultFinalizer`, then explicitly compose it in your custom logic. +- `DefaultFinalizer` does not automatically read the outer `Config.PreserveUserMessages` and `Config.TranscriptFilePath`; pass them explicitly via `DefaultFinalizerConfig`. +- Code using `NewFinalizer().PreserveSkills(...).Build()` needs special attention: that finalizer only handles preserve skills and doesn't automatically add `PreserveUserMessages` and `TranscriptFilePath`. + +### Tool List Modification Path Adjustment + +`ModelContext.Tools` is no longer the recommended entry point for modifying the tool list. + +Upgrade suggestions: + +- Modify `state.ToolInfos` in `BeforeModelRewriteState`. +- For model-native deferred tool search, modify `state.DeferredToolInfos`. +- Modifying the tool list in `WrapModel` is not recommended; such modifications only affect the current model call and won't be inherited by subsequent middleware, subsequent turns, or checkpoint/resume. 
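The difference between the two modification paths can be shown with a small sketch. It assumes nothing about the real ADK types: `agentState`, `beforeModel`, and `callModelWithExtra` are hypothetical names that only mimic "rewrite the state before the model call" versus "inject a tool for a single call".

```go
package main

import "fmt"

// agentState is a hypothetical stand-in for the persisted agent state.
type agentState struct {
	ToolInfos []string
}

// beforeModel mimics a BeforeModelRewriteState-style hook: it returns a state
// whose tool list includes extra, so later turns and later hooks see the change.
func beforeModel(s agentState, extra string) agentState {
	for _, t := range s.ToolInfos {
		if t == extra {
			return s // already present, keep the list stable
		}
	}
	tools := append(append([]string{}, s.ToolInfos...), extra)
	return agentState{ToolInfos: tools}
}

// callModelWithExtra mimics a per-call injection: the extra tool is visible to
// this one call only, and the state itself is left untouched.
func callModelWithExtra(s agentState, extra string) []string {
	return append(append([]string{}, s.ToolInfos...), extra)
}

func main() {
	s := agentState{ToolInfos: []string{"search"}}

	seen := callModelWithExtra(s, "calc")
	fmt.Println("per-call view:", seen, "state after call:", s.ToolInfos)

	s = beforeModel(s, "calc")
	fmt.Println("state after rewrite:", s.ToolInfos)
}
```

In the sketch, only the state rewrite survives into the next turn, which mirrors why per-call changes are not inherited by subsequent middleware, later turns, or checkpoint/resume.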
+ +### ToolSearch / AgentsMD Middleware Internal Implementation Migration + +The internal implementations of ToolSearch and AgentsMD middleware migrated from `WrapModel` (v0.8.x) to `BeforeModelRewriteState` (v0.9). + +> 💡 +> For users who only use `toolsearch.New()` / `agentsmd.New()`, the public API (Config struct, constructor) is unchanged; no code modifications are needed. + +Semantic changes: + +- **v0.8.x**: Middleware injected tool lists temporarily during model calls via `WrapModel` (through `model.Option`); changes were not persisted and didn't enter agent state. +- **v0.9**: Middleware directly modifies `state.ToolInfos` / `state.DeferredToolInfos` and `state.Messages` (injecting reminder messages) in `BeforeModelRewriteState`; changes are persisted with state. + +Impact: + +- **Checkpoint/Resume**: Reminder messages injected by ToolSearch and dynamic tool search results now persist with checkpoints and are correctly rebuilt on recovery; in v0.8.x this information would be lost after recovery. +- **Other Middleware visibility**: Subsequent middleware's `BeforeModelRewriteState` / `AfterModelRewriteState` can now see ToolSearch's modified `state.ToolInfos`; in v0.8.x these modifications were invisible to other middleware. +- **Prompt Cache**: Since tool list changes now reflect in state (rather than being temporarily injected per model call), the model's KV-cache behavior may differ. + +Points to note: + +- If you have custom middleware that relies on `WrapModel`'s `ModelContext.Tools` to read/modify the tool list, migrate to reading `state.ToolInfos` in `BeforeModelRewriteState`. + +### Model Retry Decision Semantic Enhancement + +`ModelRetryConfig` adds `ShouldRetry`. When `ShouldRetry` is non-nil, `IsRetryAble` is ignored. + +Points to note: + +- The old `IsRetryAble` is still usable for simple error-based retry. +- When using `ShouldRetry`, explicitly handle scenarios where output succeeds but is not business-acceptable. 
+- Interrupts and `ErrStreamCanceled` are not treated as regular retry errors. + +### Cancel Error Semantics + +After V0.9 introduces active cancel semantics, applications need to distinguish between active cancellation, normal errors, and business interrupts. + +Upgrade suggestions: + +- The upper layer should distinguish `CancelError`, normal errors, and business interrupts. +- If the application actively integrates `WithCancel`, don't treat `CancelError` as a normal business failure. + +### AgenticMessage Migration Requires Understanding New Message Structure + +`TypedChatModelAgent[*schema.AgenticMessage]` is the new path for model-native Agentic protocols. Migrating to this path isn't just changing the generic parameter from `*schema.Message` to `*schema.AgenticMessage`; you also need to handle message content according to `AgenticMessage`'s content block structure. + +Points to note: + +- The AgenticMessage path uses `AgenticModel` and `AgenticToolsNode` for tool calls. +- Tool calls and tool results are expressed through `AgenticMessage` content blocks; correctly handling tool call / tool result content blocks is especially important. +- Agent transfer capability doesn't apply to the AgenticMessage path. +- Existing applications that don't need model-native Agentic protocols should continue using the default `*schema.Message` path; only migrate when explicitly wanting to integrate `AgenticModel` protocols. + +### Model Adapters Need to Recognize New Options + +After V0.9 introduces `AgenticModel`, model adapters need to handle call-time options more strictly. `AgenticModel` is an alias for `BaseModel[*schema.AgenticMessage]` and no longer provides enhanced interfaces like `ToolCallingChatModel.WithTools`; tool binding is unified through `model.WithTools` as a `model.Option`. + +Points to note: + +- All model adapters supporting AgenticMessage should read `Options.Tools` and map them to the provider's tool calling protocol. 
+- `AgenticModel` should not require users to first call a `WithTools` method to get a "model instance with tools"; ADK passes the current tool list via `model.WithTools` on each model call. +- If an adapter only reads tools from its own config and ignores `model.WithTools`, in the ChatModelAgent / AgenticToolsNode path the model won't see tools or the tool list won't reflect runtime changes. + +V0.9 also adds to `model.Options`: + +- `DeferredTools` +- `ToolSearchTool` +- `AgenticToolChoice` + +Existing model adapters ignoring these options typically won't cause compilation failures, but will result in deferred tool search, model-native tool search, or agentic tool choice not taking effect. Adapter maintainers should add conversion logic according to their target provider's protocol. + +### ToolInfo Serialization Format Change + +`ToolInfo` adds explicit JSON/Gob encoding/decoding to preserve `ParamsOneOf`. + +Impact: + +- `ToolInfo` enters `ChatModelAgentState.ToolInfos` / `DeferredToolInfos`, so it may enter checkpoints along with Agent state. +- Explicit JSON/Gob encoding/decoding ensures `ParamsOneOf` isn't lost during checkpoint, deep copy, and recovery processes. +- If external systems directly depend on the old `ToolInfo` JSON format, serialization compatibility needs to be re-verified. 
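Why explicit encoding matters can be sketched with the standard library alone. The `toolInfo` type below is a hypothetical stand-in, not the real `schema.ToolInfo`, and the wire shape is an assumption chosen for illustration. Its `params` field is unexported, so the default JSON and Gob encoders would drop it unless the type encodes it explicitly.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
)

// toolInfo is a hypothetical stand-in; params is unexported on purpose.
type toolInfo struct {
	Name   string
	params map[string]string
}

// wireToolInfo is the exported shape written to JSON and Gob (an assumption).
type wireToolInfo struct {
	Name   string            `json:"name"`
	Params map[string]string `json:"params,omitempty"`
}

func (t toolInfo) MarshalJSON() ([]byte, error) {
	return json.Marshal(wireToolInfo{Name: t.Name, Params: t.params})
}

func (t *toolInfo) UnmarshalJSON(b []byte) error {
	var w wireToolInfo
	if err := json.Unmarshal(b, &w); err != nil {
		return err
	}
	t.Name, t.params = w.Name, w.Params
	return nil
}

func (t toolInfo) GobEncode() ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(wireToolInfo{Name: t.Name, Params: t.params})
	return buf.Bytes(), err
}

func (t *toolInfo) GobDecode(b []byte) error {
	var w wireToolInfo
	if err := gob.NewDecoder(bytes.NewReader(b)).Decode(&w); err != nil {
		return err
	}
	t.Name, t.params = w.Name, w.Params
	return nil
}

// roundTripJSON encodes and decodes through JSON.
func roundTripJSON(in toolInfo) (toolInfo, error) {
	var out toolInfo
	b, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(b, &out)
	return out, err
}

// roundTripGob encodes and decodes through Gob, as a checkpoint store might.
func roundTripGob(in toolInfo) (toolInfo, error) {
	var out toolInfo
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return out, err
	}
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	in := toolInfo{Name: "search", params: map[string]string{"query": "string"}}
	j, errJ := roundTripJSON(in)
	g, errG := roundTripGob(in)
	fmt.Println(j.params, errJ, g.params, errG)
}
```

External systems that parse the serialized form directly would see whatever shape the explicit encoder emits, which is why compatibility with the old JSON format is worth re-verifying.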
diff --git a/content/en/docs/eino/release_notes_and_migration/v02_second_release.md b/content/en/docs/eino/release_notes_and_migration/v02_second_release.md index e2f407a6dc0..f6a86208533 100644 --- a/content/en/docs/eino/release_notes_and_migration/v02_second_release.md +++ b/content/en/docs/eino/release_notes_and_migration/v02_second_release.md @@ -3,94 +3,94 @@ Description: "" date: "2026-03-02" lastmod: "" tags: [] -title: 'Eino: v0.2.*-second release' +title: v0.2.*-second release weight: 2 --- ## v0.2.6 -> Release Date: 2024-11-27 +> Release date: 2024-11-27 ### Features -- Added streaming Pre and Post StateHandler +- Added streaming Pre and Post StateHandlers - Support for StateChain -- Added MessageParser node to convert ChatModel output Message to business-customized structures +- New MessageParser node that converts ChatModel output Messages into custom business structures - Parse(ctx context.Context, m *Message) (T, error) -- Support for WithNodeKey() when Chain AppendNode +- Support for WithNodeKey() when appending nodes to a Chain ### BugFix -- Fixed the issue where the first Chunk was modified during ConcatMessage due to lack of deep Copy. -- During ConcatMessage, FinishReason now only retains the valid value from the last Chunk +- Fixed an issue where ConcatMessage did not deep copy, causing the first Chunk to be modified. 
+- ConcatMessage now only retains the FinishReason from the last valid Chunk ## v0.2.5 -> Release Date: 2024-11-21 +> Release date: 2024-11-21 ### BugFix -- Fixed panic caused by disabling keywords like include in Gonja +- Fixed a panic caused by Gonja disabling keywords like include ## v0.2.4 -> Release Date: 2024-11-20 +> Release date: 2024-11-20 ### Features -- Added TokenUsage field in Eino Message ResponseMeta -- Eino Message ToolsCall sorted by index +- Added TokenUsage field to Eino Message ResponseMeta +- Eino Message ToolCalls sorted by index ### BugFix ## v0.2.3 -> Release Date: 2024-11-12 +> Release date: 2024-11-12 ### Features -- Support for context Timeout and Cancel during Graph invocation +- Graph invocation now supports context Timeout and Cancel ### BugFix -- FinishReason may be returned in any chunk, not necessarily in the last chunk -- callbacks.HandlerBuilder no longer provides a default Needed() method. This method defaults to returning false, which causes all aspect functions to fail in embedded callbacks.HandlerBuilder scenarios +- FinishReason may be returned in any chunk, not necessarily the last one +- callbacks.HandlerBuilder no longer provides a default Needed() method. 
This method previously returned false by default, which would cause all aspect functions to be disabled when embedding callbacks.HandlerBuilder ## v0.2.2 -> Release Date: 2024-11-12 +> Release date: 2024-11-12 ### Features -- Added FinishReason field in Message -- Added GetState[T]() method to get State struct in nodes +- Added FinishReason field to Message +- Added GetState[T]() method for retrieving the State struct within nodes - Lazy Init Gonja SDK ### BugFix ## v0.2.1 -> Release Date: 2024-11-07 +> Release date: 2024-11-07 ### BugFix -- Fixed the SSTI vulnerability in the Jinja chat template(langchaingo gonja template injection) +- Fixed the SSTI vulnerability in the Jinja chat template [langchaingo gonja template injection vulnerability](https://bytedance.larkoffice.com/docx/UvqxdlFfSoTIr1xtsQ5cIZTVn2b) ## v0.2.0 -> Release Date: 2024-11-07 +> Release date: 2024-11-07 ### Features - Callback API refactoring (compatible update) - - For component implementers: Hidden and deprecated callbacks.Manager, providing simpler utility functions for injecting callback aspects. - - For Handler implementers: Provides template methods for quick callbacks.Handler implementation, encapsulating details such as component type checking, input/output type assertion and conversion. Users only need to provide specific implementations of specific callback methods for specific components. - - Runtime mechanism: For a specific callback aspect timing during a run, additional filtering of handlers to execute is performed based on component type and specific methods implemented by Handler. -- Added Host Multi-Agent: Implemented Host mode Multi-Agent, where Host performs intent recognition and then redirects to various Specialist Agents for specific generation. -- React Agent API changes (incompatible) + - For component implementers: hides and deprecates callbacks.Manager, provides simpler utility functions for injecting callback aspects. 
+ - For Handler implementers: provides template methods for quick callbacks.Handler implementation, encapsulating component type determination, input/output type assertion and conversion details. Users only need to provide specific implementations for specific callback methods of specific components. + - Runtime mechanism: for a specific callback aspect timing during a run, additionally filters out handlers that need to be executed based on component type and the Handler's specific method implementations. +- New Host Multi-Agent: implements Host mode Multi-Agent, where the Host performs intent recognition and then routes to individual Specialist Agents for specific generation. +- React Agent API changes (breaking) - - Removed AgentCallback definition, changed to quickly inject ChatModel and Tool CallbackHandlers through BuildAgentCallback utility function. Usage: + - Removed AgentCallback definition, replaced with BuildAgentCallback utility function for quickly injecting ChatModel and Tool CallbackHandlers. Usage: ```go func BuildAgentCallback(modelHandler *model.CallbackHandler, toolHandler *tool.CallbackHandler) callbacks.Handler { @@ -98,27 +98,27 @@ weight: 2 } ``` - - This achieves semantic alignment between AgentCallback and components, allowing ctx to be returned and using extended tool.CallbackInput, tool.CallbackOutput. - - Removed react.Option definition. React Agent now uses the common agent.Option definition for Agent, facilitating orchestration at the multi-agent level. + - This achieves semantic alignment between AgentCallback and components, can return ctx, and can use the extended tool.CallbackInput and tool.CallbackOutput. + - Removed react.Option definition. React Agent now uses the common agent.Option definition, facilitating composition and orchestration at the multi-agent level. - - WithAgentCallback is no longer needed to inject special AgentCallback, new usage: + - WithAgentCallback is no longer needed to inject the special AgentCallback. 
New usage: ``` agent.WithComposeOptions(compose.WithCallbacks(xxxCallbackHandler)) ``` -- Added Document Parser interface definition: As a dependency of the Loader component, responsible for parsing io.Reader into Document, and provides ExtParser implementation for parsing based on file extension. +- New Document Parser interface definition: as a dependency of the Loader component, responsible for parsing io.Reader into Document, with an ExtParser implementation that parses based on file extension. ### BugFix -- Fixed potential null pointer exception caused by embedding.GetCommonOptions and indexer.GetCommonOptions not checking apply for null. -- During Graph runtime, preProcessor and postProcessor use the current ctx. +- Fixed potential null pointer exception caused by embedding.GetCommonOptions and indexer.GetCommonOptions not null-checking apply. +- During Graph execution, preProcessor and postProcessor now use the current ctx. ## v0.2.0-dev.1 -> Release Date: 2024-11-05 +> Release date: 2024-11-05 ### Features -- Initial design and support for Checkpoint mechanism, available for early trial +- Initial design and support for Checkpoint mechanism, available for early testing ### BugFix diff --git a/content/zh/docs/eino/Cookbook.md b/content/zh/docs/eino/Cookbook.md index db3b6c8f616..df2aaeebce8 100644 --- a/content/zh/docs/eino/Cookbook.md +++ b/content/zh/docs/eino/Cookbook.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: Cookbook @@ -63,6 +63,27 @@ weight: 3 adk/multiagent/integration-excel-agentExcel Agent (ADK 集成版)ADK 集成版 Excel Agent,包含 Planner、Executor、Replanner、Reporter +### Agent + + + + +
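<table>
<tr><td>Directory</td><td>Name</td><td>Description</td></tr>
<tr><td>adk/agent/ralph-loop</td><td>Ralph Loop</td><td>Autonomous iteration mode: an outer <code>for</code> loop with <code>Runner.Run</code> implements single-turn iteration. The Agent perceives prior work through the file system; a verification gate checks for BUG markers before accepting completion claims</td></tr>
</table>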
    + +### Cancel (取消) + + + + +
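<table>
<tr><td>Directory</td><td>Name</td><td>Description</td></tr>
<tr><td>adk/cancel/graceful-exit</td><td>Graceful Exit</td><td>Demonstrates Agent Cancel + Resume: captures terminal signals, then cancels nested Agents using <code>CancelAfterChatModel</code> + <code>WithRecursive</code> mode, waits for a safe point to save Checkpoint, then resumes execution</td></tr>
</table>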
    + +### Middlewares (中间件) + + + + +
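<table>
<tr><td>Directory</td><td>Name</td><td>Description</td></tr>
<tr><td>adk/middlewares/skill</td><td>Skill Middleware</td><td>Loads Agent skills from the file system (e.g., log_analyzer), demonstrating skill middleware usage</td></tr>
</table>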
    + ### GraphTool (图工具) @@ -209,6 +230,7 @@ weight: 3 +
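<tr><td>quickstart/chat</td><td>Chat QuickStart</td><td>The most basic LLM conversation example, including template, generation, streaming output</td></tr>
<tr><td>quickstart/eino_assistant</td><td>Eino Assistant</td><td>Complete RAG application example, including knowledge indexing, Agent service, Web interface</td></tr>
<tr><td>quickstart/todoagent</td><td>Todo Agent</td><td>Simple Todo management Agent example</td></tr>
<tr><td>quickstart/chatwitheino</td><td>Chat with Eino (Tutorial)</td><td>9-chapter progressive tutorial, from ChatModel → Runner → Session → Tool → Middleware → Callback → Interrupt → GraphTool → Skill, building a complete Agent step by step</td></tr>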
    --- diff --git a/content/zh/docs/eino/FAQ.md b/content/zh/docs/eino/FAQ.md index dc39656ace1..f898896337a 100644 --- a/content/zh/docs/eino/FAQ.md +++ b/content/zh/docs/eino/FAQ.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] title: FAQ -weight: 11 +weight: 10 --- # Q: cannot use openapi3.TypeObject (untyped string constant "object") as *openapi3.Types value in struct literal,cannot use types (variable of type string) as *openapi3.Types value in struct literal @@ -13,11 +13,7 @@ weight: 11 # Q: Agent 流式调用时不会进入 ToolsNode 节点。或流式效果丢失,表现为非流式。 -- 先更新 eino 版本到最新 - -不同的模型在流式模式下输出工具调用的方式可能不同: 某些模型(如 OpenAI) 会直接输出工具调用;某些模型 (如 Claude) 会先输出文本,然后再输出工具调用。因此需要使用不同的方法来判断,这个字段用来指定判断模型流式输出中是否包含工具调用的函数。 - -ReAct Agent 的 Config 中有一个 StreamToolCallChecker 字段,如未填写,Agent 会使用“非空包”是否包含工具调用判断: +- 先更新 eino 版本到最新不同的模型在流式模式下输出工具调用的方式可能不同: 某些模型(如 OpenAI) 会直接输出工具调用;某些模型 (如 Claude) 会先输出文本,然后再输出工具调用。因此需要使用不同的方法来判断,这个字段用来指定判断模型流式输出中是否包含工具调用的函数。ReAct Agent 的 Config 中有一个 StreamToolCallChecker 字段,如未填写,Agent 会使用“非空包”是否包含工具调用判断: ```go func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -45,9 +41,7 @@ func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[ } ``` -上述默认实现适用于:模型输出的 Tool Call Message 中只有 Tool Call。 - -默认实现不适用的情况:在输出 Tool Call 前,有非空的 content chunk。此时,需要自定义 tool Call checker 如下: +上述默认实现适用于:模型输出的 Tool Call Message 中只有 Tool Call。默认实现不适用的情况:在输出 Tool Call 前,有非空的 content chunk。此时,需要自定义 tool Call checker 如下: ```go toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -74,9 +68,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes 上面这个自定义 StreamToolCallChecker,在模型常规输出 answer 时,需要判断**所有包**是否包含 ToolCall,从而导致“流式判断”的效果丢失。如果希望尽可能保留“流式判断”效果,解决这一问题的建议是: > 💡 -> 尝试添加 prompt 来约束模型在工具调用时不额外输出文本,例如:“如果需要调用 tool,直接输出 tool,不要输出文本”。 -> -> 不同模型受 prompt 影响可能不同,实际使用时需要自行调整 prompt 并验证效果。 +> 
尝试添加 prompt 来约束模型在工具调用时不额外输出文本,例如:“如果需要调用 tool,直接输出 tool,不要输出文本”。不同模型受 prompt 影响可能不同,实际使用时需要自行调整 prompt 并验证效果。 # Q: [github.com/bytedance/sonic/loader](http://github.com/bytedance/sonic/loader): invalid reference to runtime.lastmoduledatap @@ -91,9 +83,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes Eino 目前不支持批处理,可选方法有两种 1. 每次请求按需动态构建 graph,额外成本不高。 这种方法需要注意 Chain Parallel 要求其中并行节点数量大于一, -2. 自定义批处理节点,节点内自行批处理任务 - -代码示例:[https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) +2. 自定义批处理节点,节点内自行批处理任务代码示例:[https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) # Q: eino 支持把模型结构化输出吗 @@ -101,9 +91,7 @@ Eino 目前不支持批处理,可选方法有两种 1. 部分模型支持直接配置(比如 openai 的 response format),可以看下模型配置里有没有。 2. 通过 tool call 功能获得 -3. 写 prompt 要求模型 - -得到模型结构化输出后,可以用 schema.NewMessageJSONParser 把 message 转换成你需要的 struct +3. 写 prompt 要求模型得到模型结构化输出后,可以用 schema.NewMessageJSONParser 把 message 转换成你需要的 struct # Q: 如何获取模型(chat model)输出的 Reasoning Content/推理/深度思考 内容: @@ -115,14 +103,8 @@ Eino 目前不支持批处理,可选方法有两种 1. context.canceled: 在执行 graph 或者 agent 时,用户侧传入了一个可以 cancel 的 context,并发起了取消。排查应用层代码的 context cancel 操作。此报错与 eino 框架无关。 2. Context deadline exceeded: 可能是两种情况: - 1. 在执行 graph 或者 agent 时,用户侧传入了一个带 timeout 的 context,触发了超时。 - 2. 给 ChatModel 或者其他外部资源配置了 timeout 或带 timeout 的 httpclient,触发了超时。 - -查看抛出的 error 中的 `node path: [node name x]`,如果 node name 不是 ChatModel 等带外部调用的节点,大概率是 2-a 这种情况,反之大概率是 2-b 这种情况。 - -如果怀疑是 2-a 这种情况,自行排查下上游链路那个环节给 context 设置了 timeout,常见的可能性如 faas 平台等。 - -如果怀疑是 2-b 这种情况,看下节点是否自行配置了超时,比如 Ark ChatModel 配置了 Timeout,或者 OpenAI ChatModel 配置了 HttpClient(内部配置了 Timeout)。如果都没有配置,但依然超时了,可能是模型侧 SDK 的默认超时。已知 Ark SDK 默认超时 10 分钟,Deepseek SDK 默认超时 5 分钟。 +3. 在执行 graph 或者 agent 时,用户侧传入了一个带 timeout 的 context,触发了超时。 +4. 
给 ChatModel 或者其他外部资源配置了 timeout 或带 timeout 的 httpclient,触发了超时。查看抛出的 error 中的 `node path: [node name x]`,如果 node name 不是 ChatModel 等带外部调用的节点,大概率是 2-a 这种情况,反之大概率是 2-b 这种情况。如果怀疑是 2-a 这种情况,自行排查下上游链路那个环节给 context 设置了 timeout,常见的可能性如 faas 平台等。如果怀疑是 2-b 这种情况,看下节点是否自行配置了超时,比如 Ark ChatModel 配置了 Timeout,或者 OpenAI ChatModel 配置了 HttpClient(内部配置了 Timeout)。如果都没有配置,但依然超时了,可能是模型侧 SDK 的默认超时。已知 Ark SDK 默认超时 10 分钟,Deepseek SDK 默认超时 5 分钟。 # Q:想要在子图中获取父图的 State 怎么做 @@ -138,37 +120,268 @@ eino-ext 支持的多模态输入输出场景,可以查阅 [https://www.cloudw # Q: 升级到 0.6.x 版本后,有不兼容问题 -根据先前社区公告规划 [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397),已发布 eino V0.6.1 版本。重要更新内容为移除了 getkin/kin-openapi 依赖以及所有 OpenAPI 3.0 相关代码。 +根据先前社区公告规划 [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397),已发布 eino V0.6.1 版本。重要更新内容为移除了 getkin/kin-openapi 依赖以及所有 OpenAPI 3.0 相关代码。eino-ext 部分 module 报错 undefined: schema.NewParamsOneOfByOpenAPIV3 等问题,升级报错的 eino-ext module 到最新版本即可。如果 schema 改造比较复杂,可以使用 [JSONSchema 转换方法](https://bytedance.larkoffice.com/wiki/ZMaawoQC4iIjNykzahwc6YOknXf)文档中的工具方法辅助转换。 -eino-ext 部分 module 报错 undefined: schema.NewParamsOneOfByOpenAPIV3 等问题,升级报错的 eino-ext module 到最新版本即可。 - -如果 schema 改造比较复杂,可以使用 JSONSchema 转换工具方法辅助转换。 +> 💡 -# Q: Eino-ext 提供的 ChatModel 有哪些模型是支持 Response API 形式调用嘛? 
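+# Q: After creating a model, calling it fails with: 400 Bad Request, message: code: missing_required_parameter; message: Missing required parameter: 'input'. -- In Eino-Ext, only the ARK Chat Model can currently create a ResponsesAPI ChatModel, via **NewResponsesAPIChatModel**; other models do not yet support creating or using ResponsesAPI, - - If you hit this error, check whether the base URL supplied when creating the chat model is the chat completion URL or the ResponseAPI URL; in the vast majority of cases the Response API Base URL was passed by mistake +- If you hit this error, check whether the base URL supplied when creating the chat model is the chat completion URL or the ResponseAPI URL; in the vast majority of cases the Response API Base URL was passed by mistake # Q: How do I troubleshoot ChatModel call errors? For example, [NodeRunError] failed to create chat completion: error, status code: 400, status: 400 Bad Request. -These errors come from the model API (such as GPT, Ark, Gemini). The general approach is to check whether the actual HTTP Request sent to the model API has missing fields, wrong field values, a wrong BaseURL, and so on. Log the actual HTTP Request, then verify and adjust it by sending it directly over HTTP (for example with Curl on the command line or with Postman). Once the problem is located, fix the corresponding Eino code. - -To log the actual HTTP Request sent to the model API, see this code sample: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) +These errors come from the model API (such as GPT, Ark, Gemini). The general approach is to check whether the actual HTTP Request sent to the model API has missing fields, wrong field values, a wrong BaseURL, and so on. Log the actual HTTP Request, then verify and adjust it by sending it directly over HTTP (for example with Curl on the command line or with Postman). Once the problem is located, fix the corresponding Eino code. To log the actual HTTP Request sent to the model API, see this code sample: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) # Q: The gemini chat model created from the eino-ext repository does not support passing multimodal input via Image URL? How do I adapt? The gemini Chat model in the Eino-ext repository now supports URL-type input. Run go get github.com/cloudwego/eino-ext/components/model/gemini to update to [components/model/gemini/v0.1.22](https://github.com/cloudwego/eino-ext/releases/tag/components%2Fmodel%2Fgemini%2Fv0.1.22), currently the latest version, then test whether passing an Image URL meets your needs -# Q: Before calling a tool (including MCP tools), a JSON Unmarshal failure is reported; how to resolve it +# Q: The Tool Call produced by the model is problematic (arguments are not valid JSON, a non-existent tool is called, parameter names changed, etc.); how do I handle it?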
+ +模型(LLM)产生的 Tool Call 可能存在多种问题,Eino 提供了多层防御机制来应对。以下按问题类型分别介绍: + +## 1. Tool Call 参数不是合法 JSON(Unmarshal 失败) + +**典型报错:** `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` **根因:** ChatModel 产生的 Tool Call 中,Argument 字段是 string。Eino 在调用工具前会做 JSON Unmarshal,如果模型输出的 JSON 不合法(多余前缀/后缀、特殊字符转义、缺失大括号、超长截断等),则会报错。**方案 A:ToolArgumentsHandler(推荐)**在 `ToolsNodeConfig`(或 ADK 的 `ToolsConfig`)中配置 `ToolArgumentsHandler`,在工具执行前对参数进行预处理和修复: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: tools, + ToolArgumentsHandler: func(ctx context.Context, name, arguments string) (string, error) { + // 在此修复常见 JSON 格式问题,如缺失大括号、多余前缀等 + return fixJSON(arguments), nil + }, + }, + }, +}) +``` + +一个 JSON 修复的参考实现:[eino-examples/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix)**执行顺序:** `ArgumentsAliases 别名替换 → ToolArgumentsHandler → 工具执行` + +## 2. 模型调用了不存在的工具(Tool Name 幻觉) + +**典型报错:** `tool xxx not found in toolsNode indexes` **根因:** 模型可能"幻觉"出不存在的工具名称。**方案:UnknownToolsHandler** 配置后,当模型调用不存在的工具时,不会直接报错,而是由 Handler 返回一段提示文本,让模型自行纠正: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + UnknownToolsHandler: func(ctx context.Context, name, input string) (string, error) { + return fmt.Sprintf("Tool '%s' does not exist. Available tools: %s. Please retry.", name, availableToolNames), nil + }, +} +``` + +## 3. 
工具名称或参数名称发生变化(Schema 迁移导致的兼容性问题) + +**场景:** 工具重命名(如 `search` → `web_search`),或参数字段重命名(如 `q` → `query`),但模型可能仍使用旧名称。这在使用 LLM Cache 或对话历史中记录了旧工具 Schema 时尤为常见。**方案:ToolAliases** 为工具配置名称别名和参数别名,框架在调度时自动解析: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolAliases: map[string]compose.ToolAliasConfig{ + "web_search": { + NameAliases: []string{"search", "web-search"}, // 旧工具名 → 当前工具名 + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, // 旧参数名 → 当前参数名 + }, + }, + }, +} +``` + +> 💡 +> ToolAliases 的参数别名替换发生在 ToolArgumentsHandler 之前。完整的执行顺序为:Name Alias 解析 → Arguments Alias 替换 → ToolArgumentsHandler → 工具执行。 + +## 4. 工具执行失败后,让模型自行纠错(而非中断流程) + +**场景:** Tool 执行报错(如文件不存在、权限不足、API 调用失败)时,默认会中断 Agent 流程。但通常更好的做法是将错误信息作为正常的 Tool Result 返回给模型,由模型自动纠错重试。**方案 A:ADK Middleware(WrapInvokableToolCall)**在 ADK Agent 中,通过 `ChatModelAgentMiddleware` 的 `WrapInvokableToolCall` 方法将错误转换为字符串结果: + +```go +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err // 中断错误不转换 + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +参考:[quickstart/chatwitheino Ch05 Middleware](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)**方案 B:compose 层 ToolCallMiddlewares** 在 compose 层直接使用 `ToolCallMiddlewares`,适用于直接使用 Graph/ToolsNode 的场景: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolCallMiddlewares: []compose.ToolMiddleware{ + { + Invokable: func(next compose.InvokableToolEndpoint) compose.InvokableToolEndpoint { + return func(ctx context.Context, in *compose.ToolInput) (*compose.ToolOutput, error) { + output, err := next(ctx, in) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + return &compose.ToolOutput{Result: fmt.Sprintf("[tool error] %v", err)}, nil + } + return output, nil + } + }, + }, + }, +} +``` + +参考:[eino-examples/components/tool/middlewares/errorremover](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/errorremover) -ChatModel 产生的 Tool Call 中,Argument 字段是 string。Eino 框架在根据这个 Argument string 调用工具时,会先做 JSON Unmarshal。这时,如果 Argument string 不是合法的 JSON,则 JSON Unmarshal 会失败,报出类似这样的错误:`failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` +> 💡 +> 注意:在转换错误时,必须先检查 `compose.IsInterruptRerunError`。InterruptRerun 错误是框架用于 Human-in-the-loop 等场景的控制流信号,不应被吞掉。 + +## 总结 -解决这个问题的根本途径是依靠模型输出合法的 Tool Call Argument。在工程方面,我们可以尝试修复一些常见的 JSON 格式问题,如多余的前缀、后缀,特殊字符转义问题,缺失的大括号等,但无法保证 100% 的修正。一个类似的修复实现可以参考代码样例:[https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) + + + + + + +
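<table>
<tr><td>Problem</td><td>Mechanism</td><td>Where to configure</td></tr>
<tr><td>Arguments are not valid JSON</td><td><code>ToolArgumentsHandler</code></td><td><code>ToolsNodeConfig</code> / <code>ToolsConfig</code></td></tr>
<tr><td>Call to a non-existent tool</td><td><code>UnknownToolsHandler</code></td><td><code>ToolsNodeConfig</code> / <code>ToolsConfig</code></td></tr>
<tr><td>Tool name or parameter name changed</td><td><code>ToolAliases</code></td><td><code>ToolsNodeConfig</code> / <code>ToolsConfig</code></td></tr>
<tr><td>Tool execution fails and the model should self-correct</td><td>Middleware error conversion</td><td>ADK <code>Handlers</code>, <code>ToolCallMiddlewares</code></td></tr>
</table>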
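The `ToolArgumentsHandler` example above calls a `fixJSON` helper without defining it. The following is one possible best-effort sketch using only the standard library. It is a hypothetical helper, not the implementation in the eino-examples `jsonfix` sample, and it only covers three cases: extra text before the first `{`, extra text after the last `}`, and missing closing braces or brackets. Anything it cannot repair is returned unchanged, so the caller still sees the original Unmarshal error.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fixJSON is a hypothetical best-effort repair for tool-call arguments.
// It returns the input unchanged when it cannot produce valid JSON.
func fixJSON(arguments string) string {
	s := strings.TrimSpace(arguments)
	if json.Valid([]byte(s)) {
		return s
	}
	// Drop any prefix before the first '{' (for example "Arguments: {...}").
	start := strings.Index(s, "{")
	if start < 0 {
		return arguments
	}
	s = s[start:]
	// Drop any suffix after the last '}' (for example trailing commentary).
	if end := strings.LastIndex(s, "}"); end >= 0 {
		if candidate := s[:end+1]; json.Valid([]byte(candidate)) {
			return candidate
		}
	}
	// Track unclosed braces and brackets, ignoring characters inside strings.
	var stack []byte
	inString, escaped := false, false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false
		case inString && c == '\\':
			escaped = true
		case c == '"':
			inString = !inString
		case inString:
			// ordinary character inside a string literal
		case c == '{':
			stack = append(stack, '}')
		case c == '[':
			stack = append(stack, ']')
		case c == '}' || c == ']':
			if len(stack) > 0 {
				stack = stack[:len(stack)-1]
			}
		}
	}
	if inString {
		return arguments // a truncated string literal is not repaired here
	}
	var b strings.Builder
	b.WriteString(s)
	for i := len(stack) - 1; i >= 0; i-- {
		b.WriteByte(stack[i])
	}
	if fixed := b.String(); json.Valid([]byte(fixed)) {
		return fixed
	}
	return arguments
}

func main() {
	fmt.Println(fixJSON(`Arguments: {"q":"go"}`))
	fmt.Println(fixJSON(`{"a":{"b":[1,2`))
}
```

Every repaired candidate is re-checked with `json.Valid`, so the helper never returns text that is still invalid while claiming it was fixed. As the original answer notes, repairs like this cannot guarantee 100% correction.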
    # Q:如何可视化一个 graph/chain/workflow 的拓扑结构? 利用 `GraphCompileCallback` 机制在 `graph.Compile` 的过程中将拓扑结构导出。一个导出为 mermaid 图的代码样例:[https://github.com/cloudwego/eino-examples/tree/main/devops/visualize](https://github.com/cloudwego/eino-examples/tree/main/devops/visualize) +# Q: Eino 中使用 Flow/react Agent 场景下如何获取工具调用的 Tool Call Message 以及本次调用工具的 Tool Result 结果? + - Flow/React Agent 场景下获取中间结构参考文档 [Eino: ReAct Agent 使用手册](/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual) +- 此外还可以将 Flow/React Agent 替换成 ADK 的 ChatModel Agent 具体可参考 [Eino ADK: 概述](/zh/docs/eino/core_modules/eino_adk/agent_preview) + +# Q: 在使用 Eino 开发 Agent 时,定义了一个不需要任何参数的工具(Tool)。为什么在调用部分大模型时,会遇到类似 JSON Schema 校验失败(如 `unknown msg type` 或格式不支持)的报错?该如何规范解决? + +**A: 问题根因:**在 Function Calling / 工具调用的生态中,许多大模型厂商对下发的 JSON Schema 都有着严格的格式校验逻辑。如果在定义无参工具时,开发者错误地传入了空的参数映射或空结构体(例如导致框架生成 `{"type": "object", "properties": {}}` 这样虽然语法合法但无实际意义的 Schema),部分模型的校验引擎会将其判定为不符合预期的异常格式,进而直接拒绝请求。**框架机制与代码行为:** + +- 在 Eino 框架的核心定义(`eino/schema/tool.go`)中,`schema.ToolInfo` 结构体专门使用 `ParamsOneOf` 字段来描述参数。 +- 框架设计上明确允许:对于不需要参数的工具,`ParamsOneOf` 应当为 `nil`。 +- 当 `ParamsOneOf` 为 `nil` 时,Eino 的底层组件在向各类模型 Provider 构建请求时,会直接省略工具的 `parameters` 字段,从而从根本上避免触发模型的强校验规则。**最佳实践:**在 Eino 中构造无参工具时,**切勿使用空结构体或空 Map 去初始化参数描述**,应直接让 `ParamsOneOf` 保持默认的 `nil` 状态。 + +```go +tool := &schema.ToolInfo{ + Name: "fetch_current_time", + Desc: "获取当前系统时间,无需任何参数", + // 最佳实践:明确置为 nil,或直接不声明该字段 + ParamsOneOf: nil, +} +``` + +**(注:如果使用的是 **utils.InferTool** 等反射推导工具,且入参为空结构体时,需注意确保使用的 Eino 扩展版本已正确处理了空属性的过滤,或考虑根据需要手动覆盖其参数定义。)** + +# Q: 如何在 Agent 外部获取 Session Values(如 deep agent 的 TODOs)? 
+ +在 ADK 中,`adk.GetSessionValues(ctx)` 和 `adk.AddSessionValue(ctx, key, value)` 依赖 Agent 运行期间注入到 context 中的 `runSession`。这意味着它们**只能在 Agent 的执行上下文内使用**——例如在 Middleware、Handler 或 Tool 回调函数中。当用户通过 Runner 的 `Run` 方法获取到 `AsyncIterator` 并在外部消费 `AgentEvent` 时,此时已经不在 Agent 的执行上下文中,因此无法通过 `adk.GetSessionValues` 获取到 Session Values。如果需要在 Agent 运行过程中实时获取 Session Values(例如在消费流式事件的同时),可以考虑使用 Middleware/Callback Handler 的回调将所需数据通过其他渠道(如 channel)传递出来。 + +# Q: 多个同名 SubAgent 并发执行时,如何区分它们发出的 AgentEvent? + +**场景:** 使用 DeepAgent 时,多个同名 SubAgent(如 `general-purpose`)可能并发执行。在通过 Runner 消费 `AsyncIterator[*AgentEvent]` 时,不同实例发出的事件难以区分。**方案:包装 Agent,通过 CustomizedOutput 注入标识符** `AgentOutput` 提供了 `CustomizedOutput any` 字段,可以用于承载自定义数据。通过包装 Agent 的 `Run` 方法,在每个发出的事件上注入唯一标识: + +```go +type wrappedAgent struct { + adk.Agent + identifier int +} + +func (w *wrappedAgent) Run(ctx context.Context, input *adk.AgentInput, options ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter := w.Agent.Run(ctx, input, options...) + newIter, newGen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer newGen.Close() + for { + event, ok := iter.Next() + if !ok { + break + } + // 注意:event.Output 可能为 nil(如错误事件、action-only 事件) + if event.Output == nil { + event.Output = &adk.AgentOutput{} + } + event.Output.CustomizedOutput = w.identifier + newGen.Send(event) + } + }() + return newIter +} +``` + +**使用方式:** + +```go +agent1 := &wrappedAgent{Agent: generalAgent, identifier: 1} +agent2 := &wrappedAgent{Agent: generalAgent, identifier: 2} +// 将 agent1、agent2 作为 SubAgent 传入 DeepAgent +``` + +**消费端区分:** + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Output != nil && event.Output.CustomizedOutput != nil { + id := event.Output.CustomizedOutput.(int) + fmt.Printf("Event from agent instance %d\n", id) + } +} +``` + +> 💡 +> 注意事项: +> +> 1. event.Output 可能为 nil,设置 CustomizedOutput 前必须做 nil 检查。 +> 2. 
此包装仅覆盖 Run 方法。如果 Agent 实现了 ResumableAgent 接口(如 DeepAgent 创建的 Agent),Resume 方法通过嵌入的 Agent 直接调用,其事件不会被注入标识符。如需完整覆盖,需要同时包装 Resume 方法。 +> 3. 此方案是 workaround,适合快速解决区分问题。CustomizedOutput 不会被持久化到 Checkpoint。 + +# Q: 如何在某个 Skill 被触发时才加载对应的 ToolInfo?/ 如何用 Skill 强制模型调用指定工具? + +这两个问题的根源在于对 Skill 和 Tool 概念的混淆。**Skill 的本质是 Prompt。** Skill 中间件在触发时,会向对话中插入一条新的 UserMessage,其内容就是该 Skill 的 Prompt 文本。你可以在 Skill Prompt 中写明"请调用 xxx 工具,参数为 yyy",但这仍然只是提示词——模型是否遵循,取决于 Prompt Engineering 的质量和模型本身的随机性。**Tool(ToolInfo)的本质是请求参数。** ToolInfo 列表作为 ChatModel 请求的 `tools` 参数发送给模型,告诉模型"你可以调用哪些工具"。除非使用 ToolSearch 动态加载(Claude、GPT 5.4+ 等支持),否则 ToolInfo 必须在请求时一并传递。**关于"Skill 触发时动态加载 ToolInfo":** 要实现这个效果,意味着当 Skill Prompt 被插入对话时,同时往本次请求的 `[]ToolInfo` 中追加该 Skill 所需的工具定义。这完全是用户侧的自定义行为——你需要:1) 识别当前轮次是否触发了 Skill;2) 确定该 Skill 需要哪些 Tool;3) 在构造 ChatModel 请求前,将对应的 ToolInfo 追加到 `[]ToolInfo`。需要注意,`[]ToolInfo` 位于 Prompt Cache 的前部,动态追加新工具极大概率会破坏 Prompt Cache,导致缓存命中率下降和延迟增加。如果在意缓存效率,应在初始化时就把所有可能用到的工具一次性传入。**关于"用 Skill 强制模型调用指定工具":** Skill 只是向模型发送了一段文字提示,模型是否严格遵循取决于 Prompt 的清晰度、模型自身的 instruction-following 能力以及上下文干扰。这本质上是 Prompt Engineering 问题,存在固有的不确定性。如果业务要求 100% 确定调用某个工具,可以在 LLM 请求中指定 ToolChoice 强制模型选择该工具,或在应用层代码中直接调用该工具而非依赖模型决策。 + +> 💡 +> 推荐做法:Skill 触发时希望模型"大概率"调用某工具 → 在 Skill Prompt 中明确写出工具名称、参数格式和调用指令;需要动态控制可用工具集 → 使用 ToolSearch 或在 ChatModel 中间件中根据上下文动态修改 `[]ToolInfo`;必须 100% 调用某工具 → 在应用层代码中直接调用,不依赖模型决策;担心 Prompt Cache 失效 → 初始化时传入所有可能用到的 ToolInfo,避免动态增删。 + +# Q: Supervisor 子 Agent 转回主 Agent 报错 / transfer_to_agent 转发后子 Agent 收到的用户内容变更 + +这些问题均与 ADK 的 AgentTransfer 机制有关。Supervisor 是基于 AgentTransfer 实现的多 Agent 协作模式。AgentTransfer 机制存在以下已知局限: + +- **上下文全量共享**:Supervisor 与 SubAgent 之间、SubAgent 之间强制共享完整上下文,导致 token 开销大、延迟高。 +- **注意力稀释**:全量共享的上下文对子 Agent 而言往往冗余,稀释了子 Agent 对其真正任务的关注度,降低执行质量。 +- **上下文污染**:转发过程中产生的 "Successfully transferred to xxx" 消息会残留在上下文中,可能误导后续 Agent 的 Tool Call 决策(形成错误的 few-shot 示例)。 +- **强制注入工具**:机制要求注入 Transfer Tool(以及可能的 Exit Tool),增加了 ToolInfo 列表的复杂度。 + +> 💡 +> 基于上述原因,ADK 中的 AgentTransfer / 
Supervisor 模式目前标记为「不推荐使用」。 + +**推荐替代方案:** 使用 DeepAgent 或 ChatModelAgent + AgentTool 组合。这种模式下: + +- 每个 AgentTool 拥有独立封装的上下文,不会相互污染,速度更快、成本更低,通常效果更好。 +- 不会产生 "Successfully transferred to xxx" 等干扰消息,避免对模型决策形成误导。 + +# Q: DeepSeek V4 模型 tool call 场景下 reason content 回传有问题,如何解决? + +DeepSeek V4 模型在 tool call 场景下,reason content 的回传存在已知问题,多位业务同学反馈遇到此情况。 + +**解决方式:** 升级对应的 eino-ext deepseek 模块到最新版本即可修复。 + +```shell +go get github.com/cloudwego/eino-ext/components/model/deepseek@latest +``` -# Q: Gemini 模型报错 missing a `thought_signature` +升级后重新运行,确认 reason content 回传是否恢复正常。 diff --git a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md index 731c634fbdf..1b6055116be 100644 --- a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md +++ b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: 编排的设计理念 diff --git a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md index 39ef62359a2..9c7025c0804 100644 --- a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md +++ b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md @@ -101,7 +101,7 @@ Collect 和 Transform 两种流式范式,目前只在编排场景有用到。 上面的 Concat message stream 是 Eino 框架自动提供的能力,即使不是 message,是任意的 T,只要满足特定的条件,Eino 框架都会自动去做这个 StreamReader[T] 到 T 的转化,这个条件是:**在编排中,当一个组件的上游输出是 StreamReader[T],但是组件只提供了 T 作为输入的业务接口时,框架会自动将 StreamReader[T] concat 成 T,再输入给这个组件。** > 💡 -> 框架自动将 StreamReader[T] concat 成 T 的过程,可能需要用户提供一个 Concat function。详见 [Eino: 
编排的设计理念](/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles) 中关于“合并帧”的章节。 +> 框架自动将 StreamReader[T] concat 成 T 的过程,可能需要用户提供一个 Concat function。详见 [Eino: 编排的设计理念](/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles#share-FaVnd9E2foy4fAxtbTqcsgq3n5f) 中关于“合并帧”的章节。 另一方面,考虑一个相反的例子。还是 React Agent,这次是一个更完整的编排示意图: diff --git a/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md b/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md index 67f1559297a..a9f6438c4c0 100644 --- a/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md +++ b/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md @@ -371,8 +371,8 @@ result, err := runnable.Invoke(ctx, input, compose.WithCallbacks(helper)) 工具的实现方式有多种,可以参考如下方式: -- 基于 HTTP API 的 tool 实现: [如何使用 openapi 创建 tool/function call ?](/zh/docs/eino/usage_guide/how_to_guide/openapi_tool_creation) -- 基于 gRPC 的 tool 实现: [如何使用 proto3 创建 tool/function call ? ](/zh/docs/eino/usage_guide/how_to_guide/proto3_tool_creation) -- 基于 thrift 的 tool 实现: [如何使用 thrift idl 创建 tool/function call ? ](/zh/docs/eino/usage_guide/how_to_guide/thrift_idl_tool_creation) +- 基于 HTTP API 的 tool 实现: [如何使用 openapi 创建 tool/function call ?](https://bytedance.larkoffice.com/wiki/FjXzwf3exijtKyk2hh7cAmnZn1g) +- 基于 gRPC 的 tool 实现: [如何使用 proto3 创建 tool/function call ? ](https://bytedance.larkoffice.com/wiki/EPkawUVbdiGwxCkWCJTcAMQonbh) +- 基于 thrift 的 tool 实现: [如何使用 thrift idl 创建 tool/function call ? 
](https://bytedance.larkoffice.com/wiki/PcHfwo6x0iOrXxkIjJecez8xnNg) - 基于本地函数的工具实现: [如何创建一个 tool ?](/zh/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool) - …… diff --git a/content/zh/docs/eino/core_modules/components/document_transformer_guide.md b/content/zh/docs/eino/core_modules/components/document_transformer_guide.md index fa0d2764a98..b9a2a6ae272 100644 --- a/content/zh/docs/eino/core_modules/components/document_transformer_guide.md +++ b/content/zh/docs/eino/core_modules/components/document_transformer_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: Document Transformer 使用说明 @@ -160,9 +160,11 @@ for idx, doc := range outDocs { ## **已有实现** -1. Markdown Header Splitter: 基于 Markdown 标题进行文档分割 [Splitter - markdown](/zh/docs/eino/ecosystem_integration/document/splitter_markdown) -2. Text Splitter: 基于文本长度或分隔符进行文档分割 [Splitter - semantic](/zh/docs/eino/ecosystem_integration/document/splitter_semantic) -3. Document Filter: 基于规则过滤文档内容 [Splitter - recursive](/zh/docs/eino/ecosystem_integration/document/splitter_recursive) + + + + +
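+| Splitter | Chinese README | English README |
+| --- | --- | --- |
+| markdown | README_zh.md | README.md |
+| recursive | README_zh.md | README.md |
+| semantic | README_zh.md | README.md |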
    ## **自行实现参考** diff --git a/content/zh/docs/eino/core_modules/components/embedding_guide.md b/content/zh/docs/eino/core_modules/components/embedding_guide.md index 329129f3d87..029bbe92266 100644 --- a/content/zh/docs/eino/core_modules/components/embedding_guide.md +++ b/content/zh/docs/eino/core_modules/components/embedding_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: Embedding 使用说明 diff --git a/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md b/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md index 2561cabb5b5..8660b0f0abb 100644 --- a/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md +++ b/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-03" +date: "2026-05-17" lastmod: "" tags: [] title: ToolsNode&Tool 使用说明 @@ -282,6 +282,82 @@ type ToolInfo struct { Tool 组件使用 ToolOption 来定义可选参数, ToolsNode 没有抽象公共的 option。每个具体的实现可以定义自己的特定 Option,通过 WrapToolImplSpecificOptFn 函数包装成统一的 ToolOption 类型。 +## Tool 别名(Alias)🏷️ alpha/09 + +Tool 别名功能允许为工具配置**名称别名**和**参数别名**,使 LLM 使用别名调用工具时能自动解析到真实工具和规范参数。 + +### 配置结构 + +```go +// ToolAliasConfig 配置单个工具的名称和参数别名 +type ToolAliasConfig struct { + // NameAliases 是工具的替代名称列表 + // 如果模型返回这些名称中的任何一个,将解析为规范工具名 + NameAliases []string + + // ArgumentsAliases 将规范参数 key 映射到其别名列表 + // key=规范名, value=[]别名 + // 例: {"query": ["q", "search_term"], "limit": ["max_results", "count"]} + ArgumentsAliases map[string][]string +} +``` + +在 `ToolsNodeConfig` 中通过 `ToolAliases` 字段配置: + +```go +config := &compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool, weatherTool}, + ToolAliases: map[string]ToolAliasConfig{ + "search": { + NameAliases: []string{"find", "query", "search_v1"}, + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, + "limit": {"max_results", "count"}, + }, + }, + }, +} +toolsNode, err := compose.NewToolNode(ctx, 
config) +``` + +### 动态覆盖 + +通过 `WithToolAliases()` 调用选项可在运行时覆盖全局别名配置: + +```go +// 覆盖别名配置(保留原工具列表) +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolAliases(map[string]compose.ToolAliasConfig{ + "search": { + NameAliases: []string{"new_alias"}, + }, + }), +) + +// 同时覆盖工具列表和别名 +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolList(newSearchTool), + compose.WithToolAliases(map[string]compose.ToolAliasConfig{...}), +) +``` + +### 执行流程 + +工具调用时的处理顺序: + +1. **名称解析**:LLM 返回的工具名(可能是别名)通过 indexes 查找解析为规范工具名 +2. **参数重映射**:JSON 参数中的别名 key 自动替换为规范 key +3. **ToolArgumentsHandler**(如已配置):接收规范工具名和已重映射的参数 +4. **工具执行**:使用规范名称和参数调用工具 + +### 注意事项 + +- 名称别名**不能**与其他工具的规范名或已注册的别名冲突 +- 参数别名**不能**与工具 JSON Schema 中已有的属性名冲突 +- 当别名 key 和规范 key **同时存在**于参数 JSON 中时,规范 key 优先,别名 key 保持原样 +- 为不存在的工具名配置别名会被**静默忽略** +- 别名功能同时支持**标准工具**和**增强型工具** + ## **使用方式** ### **标准工具使用** diff --git a/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md b/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md index f195e80e1d6..ef9539c7302 100644 --- a/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md +++ b/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] title: Eino Dev 可视化调试插件功能指南 @@ -166,6 +166,7 @@ go mod tidy > 1. 确保目标调试的编排产物至少执行过一次 `Compile()`。 > 2. `devops.Init()` 的执行必须要在调用 `Compile()` 之前。 > 3. 用户需要保证 `devops.Init()` 执行后主进程不能退出。 +> 4. 
v0.1.9 起,调试服务默认监听地址由 `0.0.0.0` 变更为 `127.0.0.1`(仅允许本地连接)。如需远程调试,请通过 `WithDevServerIP` 显式指定监听 IP,例如:`devops.Init(ctx, devops.WithDevServerIP("0.0.0.0"))`。 如在 `main()` 函数中增加调试服务启动代码 @@ -223,7 +224,7 @@ func main() { > 注意事项 > > - 本地电脑调试:系统可能会弹出网络接入警告,允许接入即可。 -> - 远程服务器调试:需要你保证端口可访问。 +> - 远程服务器调试:需要保证端口可访问。此外,v0.1.9 起默认仅监听 `127.0.0.1`,远程调试必须在 `devops.Init()` 时通过 `WithDevServerIP` 指定可被远端访问的 IP(如 `0.0.0.0`)。 IP 和 Port 配置完成后,点击确认,调试插件会自动连接到目标调试服务器。如果成功连接,连接状态指示器会变成绿色。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md index bd3e960647d..6c0fe4d8656 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: PatchToolCalls @@ -10,19 +10,15 @@ weight: 8 adk/middlewares/patchtoolcalls > 💡 -> PatchToolCalls 中间件用于修复消息历史中「悬空的工具调用」(dangling tool calls)问题。本中间件在 v0.8.0 版本引入。 +> PatchToolCalls 中间件用于修复消息历史中「悬空的工具调用」(dangling tool calls)问题。在 v0.8.0 版本引入。同时支持 `*schema.Message` 和 `*schema.AgenticMessage` 两种消息类型。 ## 概述 -在多轮对话场景中,可能会出现 Assistant 消息包含工具调用(ToolCalls),但对话历史中缺少对应的 Tool 消息响应的情况。这种「悬空的工具调用」会导致某些模型 API 报错或产生异常行为。 - -**常见场景:** +在多轮对话场景中,可能出现 Assistant 消息包含工具调用(ToolCalls),但对话历史中缺少对应 Tool 响应的情况。这种「悬空的工具调用」会导致某些模型 API 报错或产生异常行为。**常见场景:** - 用户在工具执行完成前发送了新消息,导致工具调用被中断 - 会话恢复时,部分工具调用结果丢失 -- Human-in-the-loop 场景下,用户取消了工具执行 - -PatchToolCalls 中间件会在每次模型调用前扫描消息历史,为缺少响应的工具调用自动插入占位符消息。 +- Human-in-the-loop 场景下,用户取消了工具执行 PatchToolCalls 中间件会在每次模型调用前(`BeforeModelRewriteState` 钩子)扫描消息历史,为缺少响应的工具调用自动插入占位符消息。 ## 快速开始 @@ -33,48 +29,64 @@ import ( "github.com/cloudwego/eino/adk/middlewares/patchtoolcalls" ) -// 使用默认配置创建中间件 +// 使用默认配置(cfg 可传 nil) mw, err := 
patchtoolcalls.New(ctx, nil) if err != nil { // 处理错误 } -// 与 ChatModelAgent 一起使用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, Middlewares: []adk.ChatModelAgentMiddleware{mw}, }) ``` -## 配置项 +## API 参考 + +### Config ```go type Config struct { - // PatchedContentGenerator 自定义生成占位符消息内容的函数 - // 可选,不设置时使用默认消息 PatchedContentGenerator func(ctx context.Context, toolName, toolCallID string) (string, error) } ``` - +
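 | Field | Type | Required | Description |
 | --- | --- | --- | --- |
-| PatchedContentGenerator | `func(ctx, toolName, toolCallID string) (string, error)` | No | Custom function that generates the placeholder message content. It receives the tool name and call ID and returns the content to fill in |
+| PatchedContentGenerator | `func(ctx context.Context, toolName, toolCallID string) (string, error)` | No | Custom function that generates the placeholder message content. When unset, the built-in default message template is used |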
    -### 默认占位符消息 +### New + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +创建 PatchToolCalls 中间件。`cfg` 可为 `nil`,此时使用默认配置。内部调用 `NewTyped[*schema.Message]`。 + +### NewTyped + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +泛型版本构造函数,支持 `*schema.Message` 和 `*schema.AgenticMessage`。`cfg` 可为 `nil`。 + +- 当 `M = *schema.Message` 时,通过 `ToolCallID` 字段匹配 Tool 消息 +- 当 `M = *schema.AgenticMessage` 时,通过 `ContentBlock.FunctionToolResult.CallID` 匹配 -如果不设置 `PatchedContentGenerator`,中间件会使用默认的占位符消息: +### 默认占位符消息 -**英文(默认):** +如果不设置 `PatchedContentGenerator`,中间件使用内置模板(通过 `fmt.Sprintf` 格式化,`%s` 依次对应 toolName 和 toolCallID):**英文(默认):** ``` -Tool call {toolName} with id {toolCallID} was cancelled - another message came in before it could be completed. +Tool call %s with id %s was canceled - another message came in before it could be completed. ``` **中文:** ``` -工具调用 {toolName}(ID 为 {toolCallID})已被取消——在其完成之前收到了另一条消息。 +工具调用 %s(ID 为 %s)已被取消——在其完成之前收到了另一条消息。 ``` 可通过 `adk.SetLanguage()` 切换语言。 @@ -91,10 +103,24 @@ mw, err := patchtoolcalls.New(ctx, &patchtoolcalls.Config{ }) ``` -### 结合其他中间件使用 +### 泛型用法(AgenticMessage) + +```go +mw, err := patchtoolcalls.NewTyped[*schema.AgenticMessage](ctx, nil) +if err != nil { + // 处理错误 +} + +agent, err := adk.NewTypedChatModelAgent[*schema.AgenticMessage](ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: yourChatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{mw}, +}) +``` + +### 结合其他中间件 ```go -// PatchToolCalls 通常应该放在中间件链的前面 +// PatchToolCalls 通常放在中间件链的前面 // 确保在其他中间件处理消息之前修复悬空的工具调用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, @@ -108,35 +134,28 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 工作原理 - - -**处理逻辑:** - -1. 在 `BeforeModelRewriteState` 钩子中执行 -2. 遍历所有消息,查找包含 `ToolCalls` 的 Assistant 消息 -3. 
对于每个 ToolCall,检查后续消息中是否存在对应的 Tool 消息(通过 `ToolCallID` 匹配) -4. 如果找不到对应的 Tool 消息,则插入一个占位符消息 -5. 返回修复后的消息列表 +> 💡 +> 对于 `*schema.Message`,通过 `msg.Role == schema.Tool && msg.ToolCallID` 匹配;对于 `*schema.AgenticMessage`,通过 `ContentBlock.FunctionToolResult.CallID` 匹配。 -## 示例场景 +### 示例场景 -### 修复前的消息历史 +**修复前:** ``` -[User] "帮我查询天气" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: 晴天,25°C" -[User] "不用查位置了,直接告诉我北京的天气" <- 用户中断 +[User] "帮我查询天气" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: 晴天,25°C" +[User] "不用查位置了,直接告诉我北京的天气" <- 用户中断 ``` -### 修复后的消息历史 +**修复后:** ``` -[User] "帮我查询天气" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: 晴天,25°C" -[Tool] "call_2: 工具调用 get_location(ID 为 call_2)已被取消..." <- 自动插入 -[User] "不用查位置了,直接告诉我北京的天气" +[User] "帮我查询天气" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: 晴天,25°C" +[Tool] "call_2: 工具调用 get_location(ID 为 call_2)已被取消..." 
<- 自动插入 +[User] "不用查位置了,直接告诉我北京的天气" ``` ## 多语言支持 @@ -153,7 +172,8 @@ adk.SetLanguage(adk.LanguageEnglish) // 英文(默认) ## 注意事项 > 💡 -> 此中间件仅在 `BeforeModelRewriteState` 钩子中修改本次运行的历史消息,不会影响实际存储的消息历史。修复只是临时的,仅用于本轮 agent 调用。 +> `BeforeModelRewriteState` 返回的 state 会被框架持久化到 agent 内部状态(参见 `wrappers.go` 中的 `ProcessState` 调用)。因此 PatchToolCalls 插入的占位符消息**会保留在后续迭代中**,不需要每轮重复修补。 - 建议将此中间件放在中间件链的**前面**,确保其他中间件处理的是完整的消息历史 -- 如果你的场景需要持久化修复后的消息,请在 `PatchedContentGenerator` 中实现相应逻辑 +- `cfg` 参数可传 `nil`,等价于 `&Config{}` +- 如果消息列表为空(`len(state.Messages) == 0`),中间件直接返回,不做任何处理 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md index daf5662c286..1622e947a52 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md @@ -1,33 +1,28 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: PlanTask weight: 6 --- -# PlanTask 中间件 - -adk/middlewares/plantask - > 💡 -> 本中间件在 v0.8.0 版本引入。 +> 本中间件在 v0.8.0 版本引入。包路径:`github.com/cloudwego/eino/adk/middlewares/plantask` ## 概述 -`plantask` 是一个任务管理中间件,让 Agent 可以创建和管理任务列表。中间件通过 `BeforeAgent` 钩子注入四个工具: - -- **TaskCreate**: 创建任务 -- **TaskGet**: 查看任务详情 -- **TaskUpdate**: 更新任务 -- **TaskList**: 列出所有任务 +`plantask` 是一个任务管理中间件,通过 `BeforeAgent` 钩子向 Agent 注入四个工具,使其具备结构化任务规划能力: -主要用途: + + + + + + +
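+| Tool | Purpose |
+| --- | --- |
+| TaskCreate | Create a task |
+| TaskGet | Get the details of a single task |
+| TaskUpdate | Update task status or fields, set dependencies, delete a task |
+| TaskList | List summaries of all tasks |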
    -- 跟踪复杂任务的进度 -- 把大任务拆成小步骤 -- 管理任务间的依赖关系 +核心用途:将复杂请求拆解为可跟踪的小任务,管理任务间依赖关系,让用户看到执行进度。 --- @@ -38,7 +33,7 @@ adk/middlewares/plantask │ Agent │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ BeforeAgent: 注入任务工具 │ │ +│ │ BeforeAgent: 注入任务工具 (带 sync.Mutex 保证并发安全) │ │ │ │ - TaskCreate │ │ │ │ - TaskGet │ │ │ │ - TaskUpdate │ │ @@ -53,7 +48,7 @@ adk/middlewares/plantask │ │ │ 存储结构: │ │ baseDir/ │ -│ ├── .highwatermark # ID 计数器 │ +│ ├── .highwatermark # 已分配的最大 ID(纯数字文本) │ │ ├── 1.json # 任务 #1 │ │ ├── 2.json # 任务 #2 │ │ └── ... │ @@ -63,126 +58,151 @@ adk/middlewares/plantask --- -## 配置 +## API + +### 构造函数 + +```go +// 泛型版本,支持 *schema.Message 和 *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) + +// 非泛型版本,等价于 NewTyped[*schema.Message] +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` + +### Config ```go type Config struct { Backend Backend // 存储后端,必填 - BaseDir string // 任务文件目录,必填 + BaseDir string // 任务文件存储目录,必填 } ``` -- 注意这个 Backend 的实现,应该是 session 维度隔离的,不同的 session 对应不同的 Backend(任务列表) +> 💡 +> Backend 应该是 session 维度隔离的——不同会话对应不同的 Backend 实例(即不同的任务列表)。 ---- +### Backend 接口 -## Backend 接口 +`Backend` 定义在 `plantask` 包内,是 `filesystem.Backend` 的精简子集,仅保留任务存储所需的四个方法: ```go type Backend interface { LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - Read(ctx context.Context, req *ReadRequest) (string, error) + Read(ctx context.Context, req *ReadRequest) (*filesystem.FileContent, error) Write(ctx context.Context, req *WriteRequest) error Delete(ctx context.Context, req *DeleteRequest) error } ``` +其中类型别名关系: + +```go +type FileInfo = filesystem.FileInfo // Path, IsDir, Size, ModifiedAt +type LsInfoRequest = filesystem.LsInfoRequest // Path string +type ReadRequest = filesystem.ReadRequest // FilePath, Offset, Limit +type WriteRequest = filesystem.WriteRequest // FilePath, Content string + +// 
DeleteRequest 是 plantask 包自定义的(filesystem 包无此类型) +type DeleteRequest struct { + FilePath string +} +``` + +> 💡 +> 注意 `Read` 返回 `*filesystem.FileContent`(含 `Content string` 字段),不是裸 string。导入路径为 `github.com/cloudwego/eino/adk/filesystem`。 + --- ## 任务结构 ```go type task struct { - ID string `json:"id"` // 任务 ID - Subject string `json:"subject"` // 标题 - Description string `json:"description"` // 描述 - Status string `json:"status"` // 状态 - Blocks []string `json:"blocks"` // 阻塞哪些任务 - BlockedBy []string `json:"blockedBy"` // 被哪些任务阻塞 - ActiveForm string `json:"activeForm"` // 进行时文案 - Owner string `json:"owner"` // 负责 agent - Metadata map[string]any `json:"metadata"` // 自定义数据 + ID string `json:"id"` + Subject string `json:"subject"` + Description string `json:"description"` + Status string `json:"status"` + Blocks []string `json:"blocks"` + BlockedBy []string `json:"blockedBy"` + ActiveForm string `json:"activeForm,omitempty"` + Owner string `json:"owner,omitempty"` + Metadata map[string]any `json:"metadata,omitempty"` } ``` ### 状态 - - + + - +
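-| Status | Description |
+| Status value | Description |
 | --- | --- |
-| `pending` | Pending (default) |
+| `pending` | Pending (default at creation) |
 | `in_progress` | In progress |
 | `completed` | Completed |
-| `deleted` | Deleted (the file is removed) |
+| `deleted` | Deleted (the JSON file is physically removed, and the ID is removed from the dependency lists of other tasks) |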
    -状态流转:`pending` → `in_progress` → `completed`,任何状态都可以直接 `deleted`。 +状态流转:`pending` → `in_progress` → `completed`;任何状态均可直接设为 `deleted`。 --- -## 工具 +## 工具参数 ### TaskCreate -创建任务。 +工具名常量:`TaskCreateToolName = "TaskCreate"` - - - - + + + +
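 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
-| subject | string | Yes | Title |
-| description | string | Yes | Description |
-| activeForm | string | No | Present-continuous label, e.g. "Running tests" |
-| metadata | object | No | Custom data |
+| subject | string | Yes | Task title (imperative form) |
+| description | string | Yes | Detailed task description, including context and acceptance criteria |
+| activeForm | string | No | Present-continuous label (e.g. "Running tests"), shown to the user while the task is in_progress |
+| metadata | object | No | Custom key-value pairs |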
    -什么时候用: - -- 任务比较复杂,有 3 步以上 -- 用户给了一堆事情要做 -- 需要让用户看到进度 - -什么时候不用: - -- 就一个简单任务 -- 三两下就能搞定的事 +创建后任务 ID 自动递增(基于 `.highwatermark` 文件),状态初始为 `pending`。 ### TaskGet -查看任务详情。 +工具名常量:`TaskGetToolName = "TaskGet"` - +
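 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
-| taskId | string | Yes | Task ID |
+| taskId | string | Yes | Task ID (a string of digits only) |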
    -返回任务的完整信息:标题、描述、状态、依赖关系等。 +返回任务的完整信息:subject、description、status、blocks、blockedBy、owner。 ### TaskUpdate -更新任务。 +工具名常量:`TaskUpdateToolName = "TaskUpdate"` - - - - - - + + + + + +
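 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
 | taskId | string | Yes | Task ID |
 | subject | string | No | New title |
 | description | string | No | New description |
-| activeForm | string | No | The new present-continuous label |
-| status | string | No | New status |
-| addBlocks | []string | No | Tasks to add as blocked by this task |
-| addBlockedBy | []string | No | Tasks to add as blocking this task |
-| owner | string | No | Responsible agent |
-| metadata | object | No | Custom data (set to null to delete) |
+| activeForm | string | No | New present-continuous label |
+| status | string | No | New status, enum: `pending` / `in_progress` / `completed` / `deleted` |
+| addBlocks | []string | No | IDs of tasks blocked by the current task (written in both directions) |
+| addBlockedBy | []string | No | IDs of tasks that block the current task (written in both directions) |
+| owner | string | No | Name of the responsible agent |
+| metadata | object | No | Merged into the existing metadata; setting a key to null deletes that key |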
    -注意: +关键行为: -- `status: "deleted"` 会直接删掉任务文件 -- 加依赖时会检查循环依赖 -- 所有任务都完成后会自动清理 +- `status: "deleted"` 会物理删除任务文件,并从所有其他任务的 blocks/blockedBy 中移除该 ID +- 添加依赖时会进行**循环依赖检测**,若形成环则报错 +- 当**所有任务均为 completed** 时,自动删除全部任务文件(清理机制) ### TaskList -列出所有任务,不需要参数。 +工具名常量:`TaskListToolName = "TaskList"` -返回每个任务的摘要:ID、状态、标题、负责 agent、依赖关系。 +无参数。返回所有任务的摘要列表(按 ID 排序),每条格式为: + +``` +#ID [status] subject [owner: xxx] [blocked by #x, #y] +``` --- @@ -191,8 +211,7 @@ type task struct { ```go ctx := context.Background() -// plantask middleware 正常情况下应该 session 维度的 -// 不同的 session 对应不同的任务列表 +// Backend 应该是 session 维度隔离的 middleware, err := plantask.New(ctx, &plantask.Config{ Backend: myBackend, BaseDir: "/tasks", @@ -213,39 +232,40 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ 1. 收到复杂任务 │ ▼ -2. TaskCreate 创建任务 +2. TaskCreate 创建多个子任务 - #1: 分析需求 - - #2: 写代码 + - #2: 实现代码 + - #3: 编写测试 │ ▼ 3. TaskUpdate 设置依赖 - - #2 依赖 #1 - - #3 依赖 #2 + - #2 addBlockedBy: ["1"] + - #3 addBlockedBy: ["2"] │ ▼ -4. TaskList 看看有啥任务 +4. TaskList 查看可用任务 │ ▼ -5. TaskUpdate 开始干活 - - #1 改成 in_progress +5. TaskUpdate #1 → in_progress │ ▼ -6. 干完了 TaskUpdate - - #1 改成 completed +6. 完成后 TaskUpdate #1 → completed │ ▼ 7. 循环 4-6 直到全部完成 │ ▼ -8. 自动清理 +8. 
全部 completed → 自动清理所有文件 ``` --- ## 依赖管理 -- **blocks**: 我完成了,这些任务才能开始 -- **blockedBy**: 这些任务完成了,我才能开始 +- **blocks**: "我完成了,这些任务才能开始" +- **blockedBy**: "这些任务完成了,我才能开始" + +依赖写入是**双向**的:对 Task A 执行 `addBlocks: ["2"]`,会同时在 Task #2 的 `blockedBy` 中写入 A 的 ID。 ``` Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) @@ -253,33 +273,32 @@ Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) #1 完成后 #2 才能开始 ``` -循环依赖会报错: +循环依赖检测通过 DFS 可达性判断实现: ``` #1 blocks #2 -#2 blocks #1 ← 不行,循环了 +#2 blocks #1 ← 报错:would create a cyclic dependency ``` --- -## 自动清理 - -所有任务都 `completed` 后,会自动把任务文件都删掉。 - ---- - -## 注意事项 +## 实现细节 -- 任务文件以 JSON 格式存储在 `BaseDir` 目录下,文件名为 `{id}.json` -- `.highwatermark` 文件用于记录已分配的最大任务 ID,确保 ID 不重复 -- 所有工具操作都有互斥锁保护,并发安全 -- 工具的 description 里已经包含了详细的使用指南,Agent 会根据这些指南来使用工具 + + + + + + + + +
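+| Mechanism | Description |
+| --- | --- |
+| ID allocation | The `.highwatermark` file stores the current maximum ID; creating a task increments it by 1 |
+| Concurrency safety | The four tools share a single `sync.Mutex`, so calls on the same middleware instance run serially |
+| File format | One `{id}.json` file per task; JSON serialization uses `sonic` |
+| Automatic cleanup | After TaskUpdate marks a task as completed, it checks whether all tasks are completed and, if so, deletes them in bulk |
+| ID validation | Digits-only regular expression `^\d+$` |
+| Delete cascade | Deleting a task walks all task files and removes references to that ID |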
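
The dependency section states that cycle detection is a DFS reachability check, and the table lists a digits-only ID rule. A minimal sketch of both follows. The names `wouldCreateCycle` and `taskIDPattern` are hypothetical, and the in-memory `blocks` map stands in for whatever the middleware reads from the task files, so this shows the technique rather than the package's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// taskIDPattern mirrors the digits-only rule listed in the table.
var taskIDPattern = regexp.MustCompile(`^\d+$`)

// wouldCreateCycle reports whether adding the edge "from blocks to" would
// close a cycle. blocks maps a task ID to the IDs it blocks. The new edge
// creates a cycle exactly when "from" is already reachable from "to".
func wouldCreateCycle(blocks map[string][]string, from, to string) bool {
	visited := map[string]bool{}
	var dfs func(id string) bool
	dfs = func(id string) bool {
		if id == from {
			return true
		}
		if visited[id] {
			return false
		}
		visited[id] = true
		for _, next := range blocks[id] {
			if dfs(next) {
				return true
			}
		}
		return false
	}
	return dfs(to)
}

func main() {
	// Existing dependencies: #1 blocks #2, #2 blocks #3.
	blocks := map[string][]string{"1": {"2"}, "2": {"3"}}

	fmt.Println(taskIDPattern.MatchString("12"), taskIDPattern.MatchString("a1"))
	fmt.Println(wouldCreateCycle(blocks, "3", "1")) // #3 blocks #1 would close the loop
	fmt.Println(wouldCreateCycle(blocks, "1", "3")) // #1 blocks #3 only adds a shortcut
}
```

Checking reachability before writing the edge means a rejected update leaves both task files untouched, which matters because dependencies are written in both directions.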
    --- ## 多语言支持 -工具的 description 支持中英文切换,通过 `adk.SetLanguage()` 设置: +工具的 description 支持中英文双语,通过全局设置切换: ```go // 使用中文 description @@ -289,4 +308,4 @@ adk.SetLanguage(adk.LanguageChinese) adk.SetLanguage(adk.LanguageEnglish) ``` -这个设置是全局的,会影响所有 ADK 内置的 prompt 和工具 description。 +此设置影响所有 ADK 内置的 prompt 和工具 description。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md index 2753ec9ae44..275cfb3f391 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md @@ -1,17 +1,17 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Skill weight: 3 --- -Skill Middleware 为 Eino ADK Agent 提供了 Skill 支持,使 Agent 能够动态发现和使用预定义的技能来更准确、高效地完成任务。 +Skill Middleware 为 Eino ADK Agent 提供 Skill 支持,使 Agent 能够动态发现和使用预定义的技能来完成任务。 # 什么是 Skill -Skill 是包含指令、脚本和资源的文件夹,Agent 可以按需发现和使用这些 Skill 来扩展自身能力。 Skill 的核心是一个 `SKILL.md` 文件,包含元数据(至少需要 name 和 description)和指导 Agent 执行特定任务的说明。 +Skill 是包含指令、脚本和资源的文件夹,Agent 可以按需发现和使用这些 Skill 来扩展自身能力。核心是 `SKILL.md` 文件,包含元数据(至少需要 name 和 description)和指导 Agent 执行任务的说明。 ``` my-skill/ @@ -23,9 +23,11 @@ my-skill/ Skill 使用**渐进式展示(Progressive Disclosure)**来高效管理上下文: -1. **发现(Discovery)**:启动时,Agent 仅加载每个可用 Skill 的名称和描述,足以判断何时可能需要使用该 Skill -2. **激活****(Activation)**:当任务匹配某个 Skill 的描述时,Agent 将完整的 `SKILL.md` 内容读入上下文 -3. **执行(Execution)**:Agent 遵循指令执行任务,也可以根据需要加载其他文件或执行捆绑的代码这种方式让 Agent 保持快速响应,同时能够按需访问更多上下文。 + + +1. **发现(Discovery)**:Agent 仅加载每个可用 Skill 的 name 和 description,足以判断何时可能需要使用该 Skill +2. **激活(Activation)**:当任务匹配某个 Skill 时,Agent 将完整的 `SKILL.md` 内容读入上下文 +3. 
**执行(Execution)**:Agent 遵循指令执行任务,按需加载其他文件或执行捆绑代码 > 💡 > Ref: [https://agentskills.io/home](https://agentskills.io/home) @@ -34,7 +36,7 @@ Skill 使用**渐进式展示(Progressive Disclosure)**来高效管理上下 ## FrontMatter -Skill 的元数据结构,用于在发现阶段快速展示 Skill 信息,避免加载完整内容: +Skill 的元数据结构,从 SKILL.md 的 YAML frontmatter 中解析。用于在发现阶段快速展示 Skill 信息: ```go type FrontMatter struct { @@ -48,14 +50,14 @@ type FrontMatter struct { - - - - - + + + + +
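 | Field | Type | Description |
 | --- | --- | --- |
-| Name | string | Unique identifier of the Skill. The Agent invokes the Skill by this name; a short, meaningful name is recommended (e.g. `pdf-processing`, `web-research`). Corresponds to the `name` field in the SKILL.md frontmatter |
-| Description | string | Description of what the Skill does. This is the key basis on which the Agent decides whether to use the Skill, so it should clearly state the applicable scenarios and capabilities. Corresponds to the `description` field in the SKILL.md frontmatter |
-| Context | ContextMode | Context mode. Allowed values: `fork_with_context` (create a new Agent that runs with a copy of the message history), `fork` (create a new Agent that runs with an isolated context). Empty means inline mode (the Skill content is returned directly) |
-| Agent | string | Name of the Agent to use. Used together with the `Context` field; the corresponding Agent factory function is obtained through `AgentHub`. When empty, the default Agent is used |
-| Model | string | Name of the model to use. The corresponding model instance is obtained through `ModelHub`. In Context mode it is passed to the Agent factory; in inline mode it switches the model used by subsequent ChatModel calls |
+| Name | string | Unique identifier of the Skill. A short, meaningful name is recommended (e.g. `pdf-processing`, `web-research`) |
+| Description | string | Description of what the Skill does. It is the key basis on which the Agent decides whether to use the Skill, and should clearly state the applicable scenarios and capabilities |
+| Context | ContextMode | Context mode. Allowed values: `fork` (isolated context), `fork_with_context` (copies the message history). Empty means inline mode |
+| Agent | string | Name of the Agent to use, together with `Context`; the corresponding Agent is obtained through `AgentHub`. When empty, the default Agent is used |
+| Model | string | Name of the model to use; the corresponding model instance is obtained through `ModelHub` |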
    -### ContextMode 上下文模式 +### ContextMode ```go const ( @@ -67,13 +69,13 @@ const ( - - + +
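 | Mode | Description |
 | --- | --- |
 | Inline (default) | The Skill content is returned directly as the tool result, and the current Agent continues processing |
-| ForkWithContext | Creates a new Agent, copies the current conversation history, runs the Skill task independently, and returns the result |
-| Fork | Creates a new Agent with an isolated context (containing only the Skill content), runs independently, and returns the result |
+| `fork_with_context` | Creates a new Agent, copies the current conversation history, runs the Skill task independently, and returns the result |
+| `fork` | Creates a new Agent with an isolated context (containing only the Skill content), runs independently, and returns the result |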
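
To make the difference between the three modes concrete, the sketch below computes which messages a sub-agent would start from. Everything here is a stand-in: `contextMode` and `initialMessages` are hypothetical names, messages are plain strings rather than `*schema.Message`, and appending the Skill content after the copied history is an assumption about ordering, not something the table specifies.

```go
package main

import "fmt"

// contextMode is a local stand-in for the ContextMode values named in the table.
type contextMode string

const (
	modeInline          contextMode = ""                  // empty value means inline mode
	modeFork            contextMode = "fork"              // isolated context
	modeForkWithContext contextMode = "fork_with_context" // copies the history
)

// initialMessages is a hypothetical helper showing which messages a sub-agent
// would start from in each mode. The second return value is false when no
// sub-agent is created (inline mode, or any unrecognized value in this sketch).
func initialMessages(mode contextMode, history []string, skillContent string) ([]string, bool) {
	switch mode {
	case modeFork:
		// Isolated: the sub-agent sees only the Skill content.
		return []string{skillContent}, true
	case modeForkWithContext:
		// Copy the history so the caller's slice is never modified.
		msgs := make([]string, 0, len(history)+1)
		msgs = append(msgs, history...)
		return append(msgs, skillContent), true
	default:
		// Inline: the Skill content goes back to the current Agent as a tool result.
		return nil, false
	}
}

func main() {
	history := []string{"user: analyze this log", "assistant: loading skill"}
	for _, mode := range []contextMode{modeInline, modeFork, modeForkWithContext} {
		msgs, forked := initialMessages(mode, history, "SKILL.md body")
		fmt.Printf("mode=%q forked=%v messages=%d\n", mode, forked, len(msgs))
	}
}
```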
    ## Skill -完整的 Skill 结构,包含元数据和实际指令内容: +完整的 Skill 结构,包含元数据和指令内容: ```go type Skill struct { @@ -85,18 +87,14 @@ type Skill struct { - - - + + +
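 | Field | Type | Description |
 | --- | --- | --- |
-| FrontMatter | FrontMatter | Embedded metadata struct, including `Name`, `Description`, `Context`, `Agent`, `Model` |
-| Content | string | Body of the SKILL.md file after the frontmatter. Contains the Skill's detailed instructions, workflow, examples, etc.; the Agent reads it after activating the Skill |
-| BaseDirectory | string | Absolute path of the Skill directory. The Agent can use this path to access other resource files in the Skill directory (scripts, templates, reference documents, etc.) |
+| FrontMatter | FrontMatter | Embedded metadata struct |
+| Content | string | Body of SKILL.md after the frontmatter, containing detailed instructions, workflow, examples, etc. |
+| BaseDirectory | string | Absolute path of the Skill directory; the Agent can use it to access other resource files in the directory |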
    ## Backend -Skill 后端接口,定义了技能的检索方式。Backend 接口将技能的存储与使用解耦,提供以下优势: - -- **灵活的存储方式**:技能可以存储在本地文件系统、数据库、远程服务、云存储等任意位置 -- **可扩展性**:团队可以根据需求实现自定义 Backend,如从 Git 仓库动态加载、从配置中心获取等 -- **测试友好**:可以轻松创建 Mock Backend 进行单元测试 +Skill 后端接口,将技能的存储与使用解耦: ```go type Backend interface { @@ -107,13 +105,13 @@ type Backend interface { - - + +
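 | Method | Description |
 | --- | --- |
-| List | Lists the metadata of all available skills. Called when the Agent starts, to build the description of the skill tool so the Agent knows which skills are available |
-| Get | Fetches the full content of a skill by name. Called when the Agent decides to use a skill; returns the complete Skill struct with detailed instructions |
+| List | Lists the metadata of all available skills. Called when the Agent starts, to build the description of the skill tool |
+| Get | Fetches the full content of a skill by name. Called when the Agent decides to use a skill |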
    -### **NewBackendFromFilesystem** +### NewBackendFromFilesystem -基于 `filesystem.Backend` 接口的后端实现,在指定的目录下读取技能: +基于 `filesystem.Backend` 接口的后端实现,扫描指定目录下的一级子目录读取技能: ```go type BackendFromFilesystemConfig struct { @@ -126,8 +124,8 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - - + +
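 | Field | Type | Required | Description |
 | --- | --- | --- | --- |
-| Backend | filesystem.Backend | Yes | File system backend implementation, used for file operations |
-| BaseDir | string | Yes | Path of the skill root directory. All first-level subdirectories under it are scanned, and those containing a `SKILL.md` file are treated as skills |
+| Backend | filesystem.Backend | Yes | File system backend implementation, used for file operations |
+| BaseDir | string | Yes | Path of the skill root directory. First-level subdirectories under it are scanned for directories containing a `SKILL.md` file |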
    工作方式: @@ -137,303 +135,143 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - 解析 YAML frontmatter 获取元数据 - 深层嵌套的 `SKILL.md` 文件会被忽略 -### **filesystem.Backend 实现** - -`filesystem.Backend` 接口有以下两种实现可供选择,详见 [Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) +`filesystem.Backend` 接口有两种实现可供选择,详见 FileSystem Backend 文档。 ## AgentHub 和 ModelHub -当 Skill 使用 Context 模式(fork/isolate)时,需要配置 AgentHub 和 ModelHub: +当 Skill 使用 Context 模式(fork / fork\_with\_context)时,需要通过 AgentHub 和 ModelHub 提供 Agent 实例和模型实例。 + +> 💡 +> 以下展示非泛型别名类型(即 `*schema.Message` 特化)。泛型版本 `TypedAgentHub[M]`、`TypedModelHub[M]` 可用于 `*schema.AgenticMessage` 场景,接口签名一致,仅消息类型参数不同。 ```go -// AgentHubOptions contains options passed to AgentHub.Get when creating an agent for skill execution. -type AgentHubOptions struct { - // Model is the resolved model instance when a skill specifies a "model" field in frontmatter. - // nil means the skill did not specify a model override; implementations should use their default. - Model model.ToolCallingChatModel +// AgentHubOptions 传递给 AgentHub.Get 的选项 +type AgentHubOptions = TypedAgentHubOptions[*schema.Message] + +type TypedAgentHubOptions[M adk.MessageType] struct { + // Model 为技能 frontmatter 中指定的模型实例(通过 ModelHub 解析)。 + // nil 表示技能未指定模型覆盖,实现方应使用默认模型。 + Model model.BaseModel[M] } -// AgentHub provides agent instances for context mode (fork/fork_with_context) execution. -type AgentHub interface { - // Get returns an Agent by name. When name is empty, implementations should return a default agent. - // The opts parameter carries skill-level overrides (e.g., model) resolved by the framework. 
- Get(ctx context.Context, name string, opts *AgentHubOptions) (adk.Agent, error) +// AgentHub 为 Context 模式提供 Agent 实例 +type AgentHub = TypedAgentHub[*schema.Message] + +type TypedAgentHub[M adk.MessageType] interface { + // Get 根据名称返回 Agent。name 为空时应返回默认 Agent。 + Get(ctx context.Context, name string, opts *TypedAgentHubOptions[M]) (adk.TypedAgent[M], error) } -// ModelHub 提供模型实例 -type ModelHub interface { - Get(ctx context.Context, name string) (model.ToolCallingChatModel, error) +// ModelHub 根据名称解析模型实例 +type ModelHub = TypedModelHub[*schema.Message] + +type TypedModelHub[M adk.MessageType] interface { + Get(ctx context.Context, name string) (model.BaseModel[M], error) } ``` -### +> 💡 +> 注意:`AgentHubOptions.Model` 和 `ModelHub.Get` 的返回类型为 `model.BaseModel[M]`,而非旧版文档中的 `model.ToolCallingChatModel`。 -## 初始化 +## SubAgentInput 和 SubAgentOutput -创建 Skill Middleware(推荐使用 `NewMiddleware`): +这两个结构体在自定义 fork 模式行为时使用: ```go -func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +type SubAgentInput = TypedSubAgentInput[*schema.Message] + +type TypedSubAgentInput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string // 原始 JSON 参数 + SkillContent string // 构建好的 Skill 内容 + History []M // 对话历史(仅 fork_with_context 模式) + ToolCallID string // 工具调用 ID(仅 fork_with_context 模式) +} + +type SubAgentOutput = TypedSubAgentOutput[*schema.Message] + +type TypedSubAgentOutput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string + Messages []M // 子 Agent 产生的所有消息 + Results []string // 提取的 assistant 消息文本内容 +} ``` -Config 中配置为: +# 初始化 + +## Config ```go -type Config struct { - // Backend 技能后端实现,必填 - Backend Backend - - // SkillToolName 技能工具名称,默认 "skill" - SkillToolName *string - - // AgentHub 提供 Agent 工厂函数,用于 Context 模式 - // 当 Skill 使用 "context: fork" 或 "context: isolate" 时必填 - AgentHub AgentHub - - // ModelHub 提供模型实例,用于 Skill 指定模型 - ModelHub ModelHub - - // CustomSystemPrompt 自定义系统提示词 - 
CustomSystemPrompt SystemPromptFunc - - // CustomToolDescription 自定义工具描述 +type Config = TypedConfig[*schema.Message] + +type TypedConfig[M adk.MessageType] struct { + Backend Backend + SkillToolName *string + AgentHub TypedAgentHub[M] + ModelHub TypedModelHub[M] + + CustomSystemPrompt SystemPromptFunc CustomToolDescription ToolDescriptionFunc + CustomToolParams func(ctx context.Context, defaults map[string]*schema.ParameterInfo) (map[string]*schema.ParameterInfo, error) + BuildContent func(ctx context.Context, skill Skill, rawArgs string) (string, error) + BuildForkMessages func(ctx context.Context, in TypedSubAgentInput[M]) ([]M, error) + FormatForkResult func(ctx context.Context, in TypedSubAgentOutput[M]) (string, error) } ``` - - - - - - + + + + + + + + + +
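    Field Type Required Default Description
    Backend
    Backend
  • Skill backend implementation. It stores and retrieves skills; use the built-in
    LocalBackend
    or a custom implementation
    SkillToolName
    *string
    "skill"
    Name of the skill tool. The Agent calls the skill tool by this name. If your Agent already has a tool with the same name, set this field to a custom name to avoid the conflict
    AgentHub
    AgentHub
  • Provides Agent factory functions. Required when a Skill uses either
    context: fork
    context: isolate
    as its context mode
    ModelHub
    ModelHub
  • Provides model instances. Used when a Skill specifies the
    model
    field
    CustomSystemPrompt
    SystemPromptFunc
    Default: built-in prompt. Custom system prompt function
    CustomToolDescription
    ToolDescriptionFunc
    Default: built-in description. Custom tool description function
    Backend
    Backend
    - Skill backend implementation, responsible for storing and retrieving skills
    SkillToolName
    *string
    "skill"
    Name of the skill tool. If a tool with the same name already exists, set a custom name to avoid the conflict
    AgentHub
    TypedAgentHub[M]
    - Provides Agent instances. Required when a Skill uses either
    context: fork
    fork_with_context
    as its context mode
    ModelHub
    TypedModelHub[M]
    - Provides model instances. In Context mode the model is passed to AgentHub; in inline mode WrapModel switches the model used by subsequent ChatModel calls
    CustomSystemPrompt
    SystemPromptFunc
    Default: built-in prompt. Custom system prompt. Signature:
    func(ctx, toolName) string
    CustomToolDescription
    ToolDescriptionFunc
    Default: built-in description. Custom tool description. Signature:
    func(ctx, skills []FrontMatter) string
    CustomToolParams
    func
    skill
    parameter (the default)
    Custom tool parameter schema. Receives the default parameters and returns the customized ones; the
    skill
    parameter always stays required
    BuildContent
    func
    Default: built-in formatting. Customizes how Skill content is generated; extra context can be injected into the content
    BuildForkMessages
    func
    Default: see below. Customizes the initial messages passed to the sub-Agent in fork mode. Defaults:
    fork
    [UserMessage(content)]
    fork_with_context
    [history..., ToolMessage(content, callID)]
    FormatForkResult
    func
    Default: concatenated content. Customizes how the sub-Agent result is formatted. By default the contents of the assistant messages are concatenated and returned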
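The defaults listed for `BuildForkMessages` and `FormatForkResult` can be sketched as follows. This is only an illustration of the documented shapes: the `message` struct is a local stand-in for `*schema.Message`, the function names are hypothetical, and the newline separator used when joining assistant contents is an assumption (the table only says the contents are concatenated).

```go
package main

import "strings"

// message is a local stand-in for *schema.Message; it is not the Eino type.
type message struct {
	Role       string
	Content    string
	ToolCallID string
}

const (
	modeFork            = "fork"
	modeForkWithContext = "fork_with_context"
)

// defaultForkMessages mirrors the documented defaults:
//   fork              -> [UserMessage(content)]
//   fork_with_context -> [history..., ToolMessage(content, callID)]
func defaultForkMessages(mode, content string, history []message, callID string) []message {
	if mode == modeForkWithContext {
		msgs := make([]message, 0, len(history)+1)
		msgs = append(msgs, history...)
		return append(msgs, message{Role: "tool", Content: content, ToolCallID: callID})
	}
	return []message{{Role: "user", Content: content}}
}

// defaultForkResult joins the text of all assistant messages produced by
// the sub-agent. The "\n" separator is an assumption made for this sketch.
func defaultForkResult(msgs []message) string {
	var parts []string
	for _, m := range msgs {
		if m.Role == "assistant" && m.Content != "" {
			parts = append(parts, m.Content)
		}
	}
	return strings.Join(parts, "\n")
}
```

A custom `BuildForkMessages` would typically start from the same shape and prepend or rewrite messages, which is why seeing the default layout first is useful.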
    -# 快速开始 - -以从本地加载 pdf skill 为例, 完整代码见 [https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill](https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill)。 - -- 在工作目录中创建 skills 目录: +## NewMiddleware ```go -workdir/ -├── skills/ -│ └── pdf/ -│ ├── scripts -│ │ └── analyze.py -│ └── SKILL.md -└── other files +func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) ``` -- 创建本地 filesystem backend,基于 backend 创建 Skill middleware: +创建 Skill Middleware,返回 `adk.ChatModelAgentMiddleware`,传入 `ChatModelAgentConfig.Handlers` 使用。 -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/skill" - "github.com/cloudwego/eino-ext/adk/backend/local" -) +> 💡 +> 泛型版本 `NewTyped[M](ctx, config)` 返回 `adk.TypedChatModelAgentMiddleware[M]`,可用于 `*schema.AgenticMessage` 类型的 Agent。 -ctx := context.Background() +## 使用示例 -be, err := local.NewBackend(ctx, &local.Config{}) +```go +// 1. 创建 Backend +backend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ + Backend: fsBackend, + BaseDir: "/path/to/skills", +}) if err != nil { - log.Fatal(err) + return err } -skillBackend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ - Backend: be, - BaseDir: skillsDir, +// 2. 创建 Middleware +handler, err := skill.NewMiddleware(ctx, &skill.Config{ + Backend: backend, + AgentHub: myAgentHub, // 可选,仅 fork 模式需要 + ModelHub: myModelHub, // 可选,仅使用 model 字段时需要 }) if err != nil { - log.Fatalf("Failed to create skill backend: %v", err) + return err } -sm, err := skill.NewMiddleware(ctx, &skill.Config{ - Backend: skillBackend, -}) -``` - -- 基于 backend 创建本地 Filesystem Middleware,供 agent 读取 skill 其他文件以及执行脚本: - -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -fsm, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: be, - StreamingShell: be, -}) -``` - -- 创建 Agent 并配置 middlewares - -```go +// 3. 
传入 Agent 的 Handlers agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LogAnalysisAgent", - Description: "An agent that can analyze logs", - Instruction: "You are a helpful assistant.", - Model: cm, - Handlers: []adk.ChatModelAgentMiddleware{fsm, sm}, + // ... 其他配置 + Handlers: []adk.ChatModelAgentMiddleware{handler}, }) ``` - -- 调用 Agent,观察结果 - -```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: agent, -}) - -input := fmt.Sprintf("Analyze the %s file", filepath.Join(workDir, "test.log")) -log.Println("User: ", input) - -iterator := runner.Query(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - if event.Err != nil { - log.Printf("Error: %v\n", event.Err) - break - } - - prints.Event(event) -} -``` - -agent 输出: - -```yaml -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: skill -arguments: {"skill":"log_analyzer"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Launching skill: log_analyzer -Base directory for this skill: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer -# SKILL.md content - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: execute -arguments: {"command": "python3 /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer/scripts/analyze.py /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Analysis Result for /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log: -Total Errors: 2 -Total Warnings: 1 - -Error Details: -Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -Line 5: [2024-05-20 10:03:05] ERROR: Connection timed out. - -Warning Details: -Line 2: [2024-05-20 10:01:23] WARNING: High memory usage detected. 
- - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -answer: Here's the analysis result of the log file: - -### Summary -- **Total Errors**: 2 -- **Total Warnings**: 1 - -### Detailed Entries -#### Errors: -1. Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -2. Line5: [2024-05-2010:03:05] ERROR: Connection timed out. - -#### Warnings: -1. Line2: [2024-05-2010:01:23] WARNING: High memory usage detected. - -The log file contains critical issues related to database connectivity and a warning about memory usage. Let me know if you need further analysis! -``` - -# 原理 - -Skill middleware 向 Agent 增加 system prompt 与 skill tool,system prompt 内容如下,{tool_name} 为 skill 工具的工具名: - -```python -# Skills System - -**How to Use Skills (Progressive Disclosure):** - -Skills follow a **progressive disclosure** pattern - you see their name and description above, but only read full instructions when needed: - -1. **Recognize when a skill applies**: Check if the user's task matches a skill's description -2. **Read the skill's full instructions**: Use the '{tool_name}' tool to load skill -3. **Follow the skill's instructions**: tool result contains step-by-step workflows, best practices, and examples -4. **Access supporting files**: Skills may include helper scripts, configs, or reference docs - use absolute paths - -**When to Use Skills:** -- User's request matches a skill's domain (e.g., "research X" -> web-research skill) -- You need specialized knowledge or structured workflows -- A skill provides proven patterns for complex tasks - -**Executing Skill Scripts:** -Skills may contain Python scripts or other executable files. Always use absolute paths. - -**Example Workflow:** - -User: "Can you research the latest developments in quantum computing?" - -1. Check available skills -> See "web-research" skill -2. Call '{tool_name}' tool to read the full skill instructions -3. Follow the skill's research workflow (search -> organize -> synthesize) -4. 
Use any helper scripts with absolute paths - -Remember: Skills make you more capable and consistent. When in doubt, check if a skill exists for the task! -``` - -Skill 工具接收需要加载 skill name,返回对应 SKILL.md 中的完整内容,在工具描述中告知 agent 所有可使用的 skill 的 name 和 description: - -```sql -Execute a skill within the main conversation - - -When users ask you to perform tasks, check if any of the available skills below can help complete the task more effectively. Skills provide specialized capabilities and domain knowledge. - -How to invoke: -- Use this tool with the skill name only (no arguments) -- Examples: - - `skill: pdf` - invoke the pdf skill - - `skill: xlsx` - invoke the xlsx skill - - `skill: ms-office-suite:pdf` - invoke using fully qualified name - -Important: -- When a skill is relevant, you must invoke this tool IMMEDIATELY as your first action -- NEVER just announce or mention a skill in your text response without actually calling this tool -- This is a BLOCKING REQUIREMENT: invoke the relevant Skill tool BEFORE generating any other response about the task -- Only use skills listed in below -- Do not invoke a skill that is already running -- Do not use this tool for built-in CLI commands (like /help, /clear, etc.) 
- - - -{{- range .Matters }} - - -{{ .Name }} - - -{{ .Description }} - - -{{- end }} - -``` - -运行举例: - - - -> 💡 -> Skill Middleware 仅提供了如上图所示的加载 SKILL.md 能力,如果 Skill 需要 agent 具备读取文件、执行脚本等能力,需要用户另外为 agent 配置。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md index f59fe4dbc17..e8b5fce7d83 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md @@ -1,210 +1,343 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: Summarization weight: 4 --- +> 💡 +> 本中间件在 v0.8.0 版本引入。包路径:`github.com/cloudwego/eino/adk/middlewares/summarization` + ## 概述 -Summarization 中间件会在对话的 token 数量超过配置阈值时,自动压缩对话历史。这有助于在长对话中保持上下文连续性,同时控制在模型的 token 限制范围内。 +Summarization 中间件在对话 token 数超过阈值时自动调用摘要模型压缩对话历史,使长对话在模型上下文窗口内保持连贯。中间件挂载在 `BeforeModelRewriteState` 钩子上,每轮模型调用前检查触发条件,触发后执行:计数 → 摘要生成(含重试/降级)→ 后处理 → 替换 state。 -> 💡 -> 本中间件在 v0.8.0 版本引入。 +## 泛型体系 -## 快速开始 +本包全部核心类型和函数均提供 **Typed 泛型版本**(`M adk.MessageType`)与 **非泛型别名**(固定为 `*schema.Message`)。 -```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/summarization" -) + + + + + + + + + + + + + + + +
    Generic version Non-generic alias (= Typed\[*schema.Message\])
    TypedConfig[M]
    Config
    NewTyped[M](ctx, *TypedConfig[M])
    New(ctx, *Config)
    TypedTokenCounterFunc[M]
    TokenCounterFunc
    TypedGenModelInputFunc[M]
    GenModelInputFunc
    TypedGetFailoverModelFunc[M]
    GetFailoverModelFunc
    TypedFinalizeFunc[M]
    FinalizeFunc
    TypedCallbackFunc[M]
    CallbackFunc
    TypedUserMessageFilterFunc[M]
    UserMessageFilterFunc
    TypedPreserveUserMessages[M]
    PreserveUserMessages
    TypedRetryConfig[M]
    RetryConfig
    TypedFailoverConfig[M]
    FailoverConfig
    TypedFailoverContext[M]
    FailoverContext
    TypedFinalizerBuilder[M]
    FinalizerBuilder
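The pairing in the table above (a `Typed...[M]` generic plus an alias fixed to `*schema.Message`) is an ordinary Go pattern. The sketch below reproduces it with local stand-in types so it compiles on its own; none of these declarations are the real Eino ones, and the `MaxTokens`/`Counter` fields are invented purely for the example.

```go
package main

import "errors"

// Message and AgenticMessage are local stand-ins for the schema types.
type Message struct{ Content string }
type AgenticMessage struct{ Blocks []string }

// MessageType constrains M to the two supported message pointer types.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedTokenCounterFunc and TypedConfig are the generic forms.
type TypedTokenCounterFunc[M MessageType] func(msgs []M) int

type TypedConfig[M MessageType] struct {
	MaxTokens int
	Counter   TypedTokenCounterFunc[M]
}

// Non-generic aliases: identical types, just with M fixed to *Message.
type Config = TypedConfig[*Message]
type TokenCounterFunc = TypedTokenCounterFunc[*Message]

type TypedMiddleware[M MessageType] struct{ cfg TypedConfig[M] }

// NewTyped is the generic constructor.
func NewTyped[M MessageType](cfg *TypedConfig[M]) (*TypedMiddleware[M], error) {
	if cfg == nil {
		return nil, errors.New("config is required")
	}
	c := *cfg // copy so the caller's config is not mutated
	if c.Counter == nil {
		c.Counter = func(msgs []M) int { return len(msgs) }
	}
	return &TypedMiddleware[M]{cfg: c}, nil
}

// New is the non-generic constructor; it simply delegates to NewTyped.
func New(cfg *Config) (*TypedMiddleware[*Message], error) {
	return NewTyped[*Message](cfg)
}

// OverBudget reports whether the counted size exceeds MaxTokens.
func (m *TypedMiddleware[M]) OverBudget(msgs []M) bool {
	return m.cfg.Counter(msgs) > m.cfg.MaxTokens
}
```

Because `Config` is an alias rather than a new type, a `*Config` can be passed wherever `*TypedConfig[*Message]` is expected, so existing non-generic call sites keep compiling unchanged.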
    -// Create the middleware with the minimal configuration -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, // Required: the model used to generate the summary -}) -if err != nil { - // Handle the error -} +Unless stated otherwise, the type signatures in the rest of this document use the generic form `M`. With the non-generic aliases, `M` = `*schema.Message`. -// Use it with ChatModelAgent -agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, - Middlewares: []adk.ChatModelAgentMiddleware{mw}, -}) +### Constructors + +```go +// Generic version: supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) + +// Non-generic version: equivalent to NewTyped[*schema.Message] +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) ``` -## Configuration options +## TypedConfig[M] configuration options - - - - - - - - - - - + + + + + + + + + + + + +
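    Field Type Required Default Description
    Model model.BaseChatModel
  • Chat model used to generate the summary
    ModelOptions []model.Option
  • Options passed to the model when generating the summary
    TokenCounter TokenCounterFunc ~4 characters/token Custom token counting function
    Trigger *TriggerCondition 190,000 tokens Condition that triggers summarization
    UserInstruction string Built-in prompt Custom summarization instruction
    TranscriptFilePath string
  • Path to the full conversation transcript file
    GenModelInput GenModelInputFunc
  • Custom preprocessing function for the summary model input
    Finalize FinalizeFunc
  • Custom post-processing function for the final messages
    Callback CallbackFunc
  • Called after Finalize to observe state changes (read-only)
    EmitInternalEvents bool false Whether to emit internal events
    PreserveUserMessages *PreserveUserMessages Enabled: true Whether to preserve the original user messages in the summary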
    Model
    model.BaseModel[M]
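    Model used to generate the summary
    ModelOptions
    []model.Option
    Options passed to the summary model
    TokenCounter
    TypedTokenCounterFunc[M]
    Default: uses total\_tokens of the most recent assistant message as the baseline and estimates newer messages at ~4 characters/token. Custom token counting function
    Trigger
    *TriggerCondition
    Default: ContextTokens=160,000. Condition that triggers summarization
    UserInstruction
    string
    Default: built-in prompt. Custom user-level summarization instruction that overrides the default instruction
    TranscriptFilePath
    string
    Path to the full conversation transcript file, appended to the summary to remind the model where the original context lives. Only takes effect when Finalize is not set
    GenModelInput
    TypedGenModelInputFunc[M]
    Default: sysInstruction → contextMsgs → userInstruction. Gives full control over how the summary model input is built
    Finalize
    TypedFinalizeFunc[M]
    Default: built-in post-processing. Custom summary post-processing; once set, the middleware no longer runs any default post-processing
    Callback
    TypedCallbackFunc[M]
    Called after Finalize with the arguments
    before, after adk.TypedChatModelAgentState[M]
    (passed by value); read-only
    EmitInternalEvents
    bool
    Default: false. Whether to emit internal events at key points
    PreserveUserMessages
    *TypedPreserveUserMessages[M]
    Default: Enabled: true. Preserves the original user messages in the summary. Only takes effect when Finalize is not set
    Retry
    *TypedRetryConfig[M]
    Default: nil (no retry). Retry policy for summary generation with the primary model
    Failover
    *TypedFailoverConfig[M]
    Default: nil. Failover policy used after the primary model fails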
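The `TokenCounter` and `Trigger` rows describe a baseline-plus-estimate count and an any-of trigger. A rough sketch of that logic is shown below under several assumptions that the document does not confirm: the `msg` struct is a local stand-in, characters are counted as runes, the estimate rounds up, and a threshold of zero is treated as "disabled".

```go
package main

// msg is a local stand-in for a chat message. TotalTokens models the usage
// figure reported alongside an assistant reply (0 when unknown).
type msg struct {
	Role        string
	Content     string
	TotalTokens int
}

// triggerCondition mirrors the two documented thresholds.
type triggerCondition struct {
	ContextTokens   int
	ContextMessages int
}

// estimateTokens takes the usage of the most recent assistant message as the
// baseline and estimates every later message at roughly 4 characters per
// token. Rounding up and rune counting are assumptions of this sketch.
func estimateTokens(msgs []msg) int {
	base, start := 0, 0
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" && msgs[i].TotalTokens > 0 {
			base, start = msgs[i].TotalTokens, i+1
			break
		}
	}
	chars := 0
	for _, m := range msgs[start:] {
		chars += len([]rune(m.Content))
	}
	return base + (chars+3)/4
}

// shouldSummarize checks the message count first, then the token estimate;
// either condition alone is enough. Zero thresholds are treated as disabled,
// which is an assumption rather than documented behavior.
func shouldSummarize(t triggerCondition, msgs []msg) bool {
	if t.ContextMessages > 0 && len(msgs) > t.ContextMessages {
		return true
	}
	return t.ContextTokens > 0 && estimateTokens(msgs) > t.ContextTokens
}
```

In production a tokenizer-backed counter would replace `estimateTokens`; the sketch only shows why the default stays cheap: it reuses a figure the model already reported and estimates just the tail.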
    -### TriggerCondition 结构 +> 💡 +> **Finalize 覆盖语义**:一旦设置了自定义 `Finalize`,中间件将**跳过所有默认后处理**——`PreserveUserMessages` 和 `TranscriptFilePath` 均不再生效。如需在自定义 Finalize 中复用默认后处理逻辑,请使用 `DefaultFinalizer` 函数。 + +## 子配置结构体 + +### TriggerCondition + +满足**任一**条件即触发摘要。 ```go type TriggerCondition struct { - // ContextTokens 当总 token 数量超过此阈值时触发摘要 - ContextTokens int + ContextTokens int // token 数超过此阈值时触发 + ContextMessages int // 消息数超过此阈值时触发 } ``` -### PreserveUserMessages 结构 +### TypedPreserveUserMessages\[M\] + +启用后,将摘要中 `...` 区段替换为最近的原始用户消息。 ```go -type PreserveUserMessages struct { - // Enabled 是否启用保留用户消息功能 - Enabled bool - - // MaxTokens 保留用户消息的最大 token 数 - // 只保留最近的用户消息,直到达到此限制 - // 默认为 TriggerCondition.ContextTokens 的 1/3 - MaxTokens int +type TypedPreserveUserMessages[M adk.MessageType] struct { + Enabled bool + MaxTokens int // 保留用户消息的最大 token 数;默认为 TriggerCondition.ContextTokens / 3 + Filter TypedUserMessageFilterFunc[M] // 过滤函数,返回 false 则不保留该消息 } ``` -### 配置示例 +### TypedRetryConfig[M] -**自定义 Token 阈值** +```go +type TypedRetryConfig[M adk.MessageType] struct { + MaxRetries *int // 默认 3 + ShouldRetry func(ctx context.Context, resp M, err error) bool // 默认 err != nil 时重试 + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // 默认指数退避 + 抖动 +} +``` + +### TypedFailoverConfig[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Trigger: &summarization.TriggerCondition{ - ContextTokens: 100000, // 在 100k tokens 时触发 - }, -}) +type TypedFailoverConfig[M adk.MessageType] struct { + MaxRetries *int // 默认 3 + ShouldFailover func(ctx context.Context, resp M, err error) bool // 默认 err != nil 时降级 + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration + GetFailoverModel TypedGetFailoverModelFunc[M] // 返回 (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error) +} ``` -**自定义 Token 计数器** +### TypedFailoverContext[M] + +传递给 `GetFailoverModel` 回调的上下文。 ```go -mw, 
err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - TokenCounter: func(ctx context.Context, input *summarization.TokenCounterInput) (int, error) { - // 使用你的 tokenizer - return yourTokenizer.Count(input.Messages) - }, -}) +type TypedFailoverContext[M adk.MessageType] struct { + Attempt int // 当前降级尝试次数,从 1 开始 + SystemInstruction M // 系统指令(中间件内部设置,不可配置) + UserInstruction M // 用户指令 + OriginalMessages []M // 原始完整对话 + LastModelResponse M // 上次尝试的模型响应 + LastErr error +} ``` -**设置对话记录文件路径** +### TypedTokenCounterInput[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, +type TypedTokenCounterInput[M adk.MessageType] struct { + Messages []M + Tools []*schema.ToolInfo +} +``` + +## 函数类型签名速查 + +```go +type TypedTokenCounterFunc[M] func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error) +type TypedGenModelInputFunc[M] func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error) +type TypedGetFailoverModelFunc[M] func(ctx context.Context, failoverCtx *TypedFailoverContext[M]) (model.BaseModel[M], []M, error) +type TypedFinalizeFunc[M] func(ctx context.Context, originalMessages []M, summary M) ([]M, error) +type TypedCallbackFunc[M] func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error +type TypedUserMessageFilterFunc[M] func(ctx context.Context, msg M) (bool, error) +``` + +## DefaultFinalizer + +`DefaultFinalizer` 是一个独立的工厂函数,返回与中间件默认后处理逻辑一致的 `TypedFinalizeFunc[M]`。当你需要在自定义 `Finalize` 中复用默认逻辑(保留用户消息、附加 transcript 路径等)时使用。 + +```go +func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error) +``` + +### DefaultFinalizerConfig[M] + +```go +type DefaultFinalizerConfig[M adk.MessageType] struct { + PreserveUserMessages *TypedPreserveUserMessages[M] // 默认 Enabled=true,MaxTokens=30000 + TranscriptFilePath string +} +``` + +**示例**:在自定义 Finalize 中先执行默认后处理,再添加系统消息: + +```go +defaultFinalize, err := 
summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{ TranscriptFilePath: "/path/to/transcript.txt", }) +if err != nil { + // handle error +} + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, originalMessages, summary) + if err != nil { + return nil, err + } + // 在摘要前添加系统消息 + return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil + }, +} ``` -**自定义 Finalize 函数** +## FinalizerBuilder + +`TypedFinalizerBuilder[M]` 提供链式 API 构建 `TypedFinalizeFunc[M]`,支持链接多个处理器(Handler)和一个可选的自定义终结器(Custom)。 ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Finalize: func(ctx context.Context, originalMessages []adk.Message, summary adk.Message) ([]adk.Message, error) { - // 自定义逻辑构建最终消息 - return []adk.Message{ - schema.SystemMessage("你的系统提示词"), - summary, - }, nil - }, -}) +func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M] +func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message] + +func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error) ``` -**使用 Callback 观察状态变化****/存储** +执行顺序:Handler 按注册顺序依次对 summary 进行变换 → Custom 确定最终输出消息列表。若未设置 Custom,则返回 `[]M{summary}`。 + +### PreserveSkills + +在摘要压缩后保留 Skill 中间件加载过的技能内容,确保 agent 在上下文窗口压缩后仍保留技能知识。 ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Callback: func(ctx context.Context, before, after adk.ChatModelAgentState) error { - log.Printf("Summarization completed: %d messages -> %d messages", - len(before.Messages), len(after.Messages)) - return nil - }, -}) +type 
PreserveSkillsConfig struct { + SkillToolName string // 技能工具名,需与 Skill 中间件一致。默认 "skill" + MaxSkills *int // 最多保留技能数。默认 5;0 表示禁用 + MaxTokensPerSkill *int // 单个技能最大 token 数,超出截断。默认 5000 + SkillsTokenBudget *int // 所有技能总 token 预算。默认 25000 +} ``` -**控制用户消息保留** +**示例**: ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - PreserveUserMessages: &summarization.PreserveUserMessages{ - Enabled: true, - MaxTokens: 50000, // 保留最多 50k tokens 的用户消息 - }, -}) +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{}). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## 工作原理 +## Summarize 方法 -```mermaid -flowchart TD - A[BeforeModelRewriteState] --> B{Token 数量超过阈值?} - B -->|否| C[返回原始状态] - B -->|是| D[发送 BeforeSummarize 事件] - D --> E{有自定义 GenModelInput?} - E -->|是| F[调用 GenModelInput] - E -->|否| G[调用模型生成摘要] - F --> G - G --> H{有自定义 Finalize?} - H -->|是| I[调用 Finalize] - H -->|否| L{有自定义 Callback?} - I --> L - L -->|是| M[调用 Callback] - L -->|否| J[发送 AfterSummarize 事件] - M --> J - J --> K[返回新状态] - - style A fill:#e3f2fd - style G fill:#fff3e0 - style D fill:#e8f5e9 - style J fill:#e8f5e9 - style K fill:#c8e6c9 - style C fill:#f5f5f5 - style M fill:#fce4ec - style F fill:#fff3e0 - style I fill:#fff3e0 +`TypedMiddleware[M]` 暴露 `Summarize` 方法,可在中间件自动触发之外手动执行一次摘要: + +```go +func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error) ``` +该方法执行完整的摘要流程(生成 → 后处理 → Callback → 事件),但**不检查触发条件**。返回替换后的消息列表。 + +## 工作原理 + + + +**触发条件检查**:先检查 `ContextMessages`(消息数),再通过 `TokenCounter` 计算 token 数与 `ContextTokens` 对比。满足任一即触发。 + +**默认后处理**(未设置 Finalize 时): + +1. 将摘要中 `...` 替换为最近的原始用户消息(受 `PreserveUserMessages` 控制) +2. 
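Append the `TranscriptFilePath` hint +3. Add the summary preamble and the continuation instruction + ## Internal events -When EmitInternalEvents is set to true, the middleware emits events at key points: +When `EmitInternalEvents = true`, the middleware emits events through `adk.TypedSendEvent`: - - + + +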
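    Event type Trigger timing Payload
    ActionTypeBeforeSummarize Before the summary is generated Original message list
    ActionTypeAfterSummarize After summarization completes Final message list
    ActionTypeBeforeSummarize
    After the trigger condition is met, before the model is called
    TypedBeforeSummarizeAction[M]{Messages}
    : original message list
    ActionTypeGenerateSummary
    After each model generation attempt (including retries and failover)
    TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()}
    ActionTypeAfterSummarize
    After summarization completes and Finalize has run
    TypedAfterSummarizeAction[M]{Messages}
    : final message list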
    -**Usage example** +Events are wrapped in `TypedCustomizedAction[M]` and placed in the `adk.AgentAction.CustomizedAction` field. `GenerateSummaryPhase` has two values: `GenerateSummaryPhasePrimary` (primary model/retry) and `GenerateSummaryPhaseFailover` (failover). + +## Usage examples + +### Minimal configuration ```go mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - EmitInternalEvents: true, + Model: yourChatModel, }) -// Listen for events in your event handler +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: yourChatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) +``` + +### Custom trigger condition + retry + failover + +```go +mw, err := summarization.New(ctx, &summarization.Config{ + Model: yourChatModel, + Trigger: &summarization.TriggerCondition{ + ContextTokens: 100000, + ContextMessages: 80, + }, + TranscriptFilePath: "/path/to/transcript.txt", + Retry: &summarization.RetryConfig{ + MaxRetries: ptrOf(2), + }, + Failover: &summarization.FailoverConfig{ + MaxRetries: ptrOf(3), + GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) { + return backupModel, nil, nil // returning a nil input reuses the default input + }, + }, +}) +``` + +### FinalizerBuilder + PreserveSkills + DefaultFinalizer + +```go +defaultFinalize, _ := summarization.DefaultFinalizer[*schema.Message]( + &summarization.DefaultFinalizerConfig[*schema.Message]{ + TranscriptFilePath: "/path/to/transcript.txt", + }, +) + +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{ + MaxSkills: ptrOf(3), + }). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, origMsgs, summary) + if err != nil { + return nil, err + } + return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## Best practices +## Notes -1. 
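**Set TranscriptFilePath**: always provide the path to the conversation transcript file so the model can consult the original conversation when needed. -2. **Tune the token threshold**: adjust `Trigger.MaxTokens` to the size of the model's context window. Setting it to 80-90% of the model limit is generally recommended. -3. **Customize the token counter**: in production, implement a custom `TokenCounter` that matches the model's tokenizer to get accurate counts. +1. **Set TranscriptFilePath**: providing the path to the conversation transcript file is strongly recommended, so that after summarization the model can trace details back to the original record. +2. **Tune the trigger threshold**: set `Trigger.ContextTokens` to 80-90% of the model's context window. The default of 160,000 suits models with a 200k window. +3. **Customize TokenCounter**: in production, implement a counter that exactly matches the model's tokenizer. The default estimator uses `ResponseMeta.Usage.TotalTokens` of the most recent assistant message as the baseline and estimates newer messages at ~4 characters/token. +4. **Finalize override**: once `Finalize` is set, `PreserveUserMessages` and `TranscriptFilePath` no longer take effect automatically. To reuse them, use `DefaultFinalizer` or `FinalizerBuilder`. +5. **GetFailoverModel constraint**: the callback must return a non-nil model and a non-empty list of input messages. diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md index db4d192d2e6..96a47b99fc3 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md @@ -1,25 +1,23 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-17" lastmod: "" tags: [] title: Reduction weight: 5 --- -# Reduction middleware - -adk/middlewares/reduction +`adk/middlewares/reduction` > 💡 > This middleware was introduced in v0.8.0. ## Overview -The `reduction` middleware controls the number of tokens taken up by tool results and provides two strategies: +The `reduction` middleware manages the number of tokens that tool output occupies in an Agent conversation, in two phases: -1. **Truncation**: truncates overly long output as soon as the tool returns and saves the full content to the Backend -2. **Clear**: when the total token count exceeds the threshold, stores old tool results in the file system +1. **Truncation**: triggered as soon as a tool call returns. When a single output exceeds `MaxLengthForTrunc`, the full content is stored in the Backend and the message is replaced with a truncated summary. +2. 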
**清理(Clear)**:模型调用前触发(`BeforeModelRewriteState`)。总 token 超过 `MaxTokensForClear` 时,遍历历史消息,将旧的工具参数和结果卸载到 Backend。 --- @@ -30,9 +28,10 @@ Tool 调用返回结果 │ ▼ ┌─────────────────────────────────────────────────────────────┐ -│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapEnhancedInvokableToolCall / WrapEnhancedStreamable │ │ │ -│ Truncation 策略(可跳过) │ +│ Truncation(可通过 SkipTruncation 跳过) │ │ 结果长度 > MaxLengthForTrunc? │ │ 是 → 截断内容,完整内容存到 Backend │ │ 否 → 原样返回 │ @@ -45,9 +44,12 @@ Tool 调用返回结果 ┌─────────────────────────────────────────────────────────────┐ │ BeforeModelRewriteState │ │ │ -│ Clear 策略(可跳过) │ +│ Clear(可通过 SkipClear 跳过) │ │ 总 token > MaxTokensForClear? │ -│ 是 → 把旧的工具结果存到 Backend,替换成文件路径 │ +│ 是 → ClearMessageRewriter 预处理 │ +│ → 旧工具结果存到 Backend,替换为文件路径 │ +│ → ClearAtLeastTokens 最小释放量检查 │ +│ → ClearPostProcess 回调 │ │ 否 → 不处理 │ └─────────────────────────────────────────────────────────────┘ │ @@ -57,95 +59,75 @@ Tool 调用返回结果 --- -## 配置 +## 泛型体系 -### Config 主配置 +本中间件采用 ADK 标准泛型模式,同时支持 `*schema.Message` 和 `*schema.AgenticMessage`: ```go -type Config struct { - // Backend 存储后端,用于保存截断/清理的内容 - // 当 SkipTruncation 为 false 时必填 - Backend Backend - - // SkipTruncation 跳过截断阶段 - SkipTruncation bool - - // SkipClear 跳过清理阶段 - SkipClear bool - - // ReadFileToolName 读取文件的工具名 - // 内容卸载到文件后,agent 需要使用此工具读取 - // 默认 "read_file" - ReadFileToolName string +// 泛型配置,M 约束为 adk.MessageType +type TypedConfig[M adk.MessageType] struct { ... 
} - // RootDir 保存内容的根目录 - // 默认 "/tmp" - // 截断内容保存到 {RootDir}/trunc/{tool_call_id} - // 清理内容保存到 {RootDir}/clear/{tool_call_id} - RootDir string - - // MaxLengthForTrunc 触发截断的最大长度 - // 默认 50000 - MaxLengthForTrunc int +// 向后兼容别名 +type Config = TypedConfig[*schema.Message] +``` - // TokenCounter token 计数器 - // 用于判断是否需要触发清理 - // 默认使用 字符数/4 估算 - TokenCounter func(ctx context.Context, msg []adk.Message, tools []*schema.ToolInfo) (int64, error) +构造函数同样提供泛型和非泛型两种: - // MaxTokensForClear 触发清理的 token 阈值 - // 默认 30000 - MaxTokensForClear int64 +```go +func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` - // ClearRetentionSuffixLimit 保留最近多少轮对话不清理 - // 默认 1 - ClearRetentionSuffixLimit int +--- - // ClearPostProcess 清理完成后的回调 - // 可用于保存或通知当前状态 - ClearPostProcess func(ctx context.Context, state *adk.ChatModelAgentState) context.Context +## 配置 - // ToolConfig 针对特定工具的配置 - // 优先级高于全局配置 - ToolConfig map[string]*ToolReductionConfig -} -``` +### TypedConfig[M] 主配置 + + + + + + + + + + + + + + + + + + + + +
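+| Field | Type | Description |
+| --- | --- | --- |
+| `Backend` | `Backend` | Storage backend. Required when `SkipTruncation` is false; may be nil when only Clear is used and no offload is needed. |
+| `SkipTruncation` | `bool` | Skips the truncation phase. |
+| `SkipClear` | `bool` | Skips the clear phase. |
+| `ReadFileToolName` | `string` | Name of the tool used to read offloaded content. Default: `"read_file"`. |
+| `RootDir` | `string` | Root directory for saved content. Default: `"/tmp"`. Truncated content is stored at `{RootDir}/trunc/{tool_call_id}`; cleared content is stored at `{RootDir}/clear/{tool_call_id}`. |
+| `GenTruncOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom path generator for truncation offload files. When set, `RootDir` no longer applies to truncation. Useful when `tool_call_id` is not unique. |
+| `GenClearOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom path generator for clear offload files. When set, `RootDir` no longer applies to clearing. |
+| `MaxLengthForTrunc` | `int` | Maximum character length before truncation is triggered. Default: `50000`. |
+| `TruncExcludeTools` | `[]string` | Names of tools that are never truncated. |
+| `TokenCounter` | `func(ctx, []M, []*schema.ToolInfo) (int64, error)` | Token counting function. The default estimates tokens as character count / 4; replacing it with `tiktoken-go/tokenizer` is recommended. |
+| `MaxTokensForClear` | `int64` | Token threshold that triggers clearing. Default: `160000`. |
+| `ClearRetentionSuffixLimit` | `int` | Number of most recent assistant-message rounds that are never cleared. Default: `1`. |
+| `ClearAtLeastTokens` | `int64` | Minimum number of tokens a clear pass must free. If the pass would free fewer, it is not applied (this avoids needlessly invalidating the prompt cache). Default: `0`. |
+| `ClearExcludeTools` | `[]string` | Names of tools that are never cleared. |
+| `ClearMessageRewriter` | `func(ctx, M, []M) ([]M, error)` | Message rewrite callback that runs before clearing. It receives the toolCallMsg and its corresponding toolResponseMsgs. It can be used to rewrite write_file/edit_file calls into a system-reminder. Returning nil removes that group of messages. |
+| `ClearPostProcess` | `func(ctx, *adk.TypedChatModelAgentState[M]) context.Context` | Callback invoked after clearing completes; it can save state or send notifications. Returns the (possibly updated) context. |
+| `ToolConfig` | `map[string]*ToolReductionConfig` | Per-tool configuration keyed by tool name. Takes precedence over the global settings. |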
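The interaction between `TokenCounter`, `MaxTokensForClear`, and `ClearAtLeastTokens` in the table above can be sketched as follows. This is a minimal standalone illustration, not the middleware's actual code: `estimateTokens` and `shouldClear` are hypothetical helper names, messages are modeled as plain strings, and "character count" is assumed here to mean Unicode code points.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens approximates a token count as characters / 4.
// Assumption: "characters" are counted as Unicode code points.
func estimateTokens(texts []string) int64 {
	var chars int64
	for _, t := range texts {
		chars += int64(utf8.RuneCountInString(t))
	}
	return chars / 4
}

// shouldClear mirrors the documented gating (hypothetical helper):
//   - nothing happens unless the total exceeds maxTokensForClear;
//   - the pass is applied only if it frees at least clearAtLeastTokens.
func shouldClear(before, after, maxTokensForClear, clearAtLeastTokens int64) bool {
	if before <= maxTokensForClear {
		return false // under the threshold: skip clearing
	}
	return before-after >= clearAtLeastTokens
}

func main() {
	msgs := []string{"hello world!", "tool output ..."}
	fmt.Println("estimated tokens:", estimateTokens(msgs))
	fmt.Println("apply clear pass:", shouldClear(200000, 150000, 160000, 10000))
}
```

With `ClearAtLeastTokens` left at 0, the second condition is always satisfied, so any pass that runs over the threshold is applied.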
    ### ToolReductionConfig 工具级配置 ```go type ToolReductionConfig struct { - // Backend 此工具使用的存储后端 - Backend Backend - - // SkipTruncation 跳过此工具的截断 + Backend Backend SkipTruncation bool - - // TruncHandler 自定义截断处理器 - // 不设置时使用默认处理器 - TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) - - // SkipClear 跳过此工具的清理 - SkipClear bool - - // ClearHandler 自定义清理处理器 - // 不设置时使用默认处理器 - ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) + TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) + SkipClear bool + ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) } ``` +- `TruncHandler` / `ClearHandler` 为 nil 且未跳过时,使用全局默认 handler。 +- `Backend` 为该工具独立的存储后端,可覆盖全局 Backend。 + ### ToolDetail 工具详情 ```go type ToolDetail struct { - // ToolContext 工具元信息(工具名、调用 ID) - ToolContext *adk.ToolContext - - // ToolArgument 输入参数 - ToolArgument *schema.ToolArgument - - // ToolResult 输出结果 - ToolResult *schema.ToolResult + ToolContext *adk.ToolContext + ToolArgument *schema.ToolArgument + ToolResult *schema.ToolResult // 非流式 + StreamToolResult *schema.StreamReader[*schema.ToolResult] // 流式 } ``` @@ -153,23 +135,12 @@ type ToolDetail struct { ```go type TruncResult struct { - // NeedTrunc 是否需要截断 - NeedTrunc bool - - // ToolResult 截断后的工具结果 - // NeedTrunc 为 true 时必填 - ToolResult *schema.ToolResult - - // NeedOffload 是否需要卸载到存储 - NeedOffload bool - - // OffloadFilePath 卸载文件路径 - // NeedOffload 为 true 时必填 - OffloadFilePath string - - // OffloadContent 卸载内容 - // NeedOffload 为 true 时必填 - OffloadContent string + NeedTrunc bool + ToolResult *schema.ToolResult // NeedTrunc && 非流式时必填 + StreamToolResult *schema.StreamReader[*schema.ToolResult] // NeedTrunc && 流式时必填 + NeedOffload bool + OffloadFilePath string // NeedOffload 时必填 + OffloadContent string // NeedOffload 时必填 } ``` @@ -177,30 +148,26 @@ type TruncResult struct { ```go type ClearResult struct { - // NeedClear 是否需要清理 - NeedClear bool - - // 
ToolArgument 清理后的工具参数 - // NeedClear 为 true 时必填 - ToolArgument *schema.ToolArgument - - // ToolResult 清理后的工具结果 - // NeedClear 为 true 时必填 - ToolResult *schema.ToolResult - - // NeedOffload 是否需要卸载到存储 - NeedOffload bool + NeedClear bool + ToolArgument *schema.ToolArgument // NeedClear 时必填 + ToolResult *schema.ToolResult // NeedClear 时必填 + NeedOffload bool + OffloadFilePath string // NeedOffload 时必填 + OffloadContent string // NeedOffload 时必填 +} +``` - // OffloadFilePath 卸载文件路径 - // NeedOffload 为 true 时必填 - OffloadFilePath string +### Backend 接口 - // OffloadContent 卸载内容 - // NeedOffload 为 true 时必填 - OffloadContent string +```go +// 定义于 reduction/internal,通过类型别名导出 +type Backend interface { + Write(context.Context, *filesystem.WriteRequest) error } ``` +`filesystem.WriteRequest` 包含 `FilePath string` 和 `Content string` 两个字段。 + --- ## 创建中间件 @@ -208,67 +175,75 @@ type ClearResult struct { ### 基本用法 ```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/reduction" -) +import "github.com/cloudwego/eino/adk/middlewares/reduction" -// 使用默认配置 middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, // 必填:存储后端 + Backend: myBackend, }) -// 与 ChatModelAgent 一起使用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, + Model: chatModel, Middlewares: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +### 泛型用法(AgenticMessage) + +```go +middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{ + Backend: myBackend, + TokenCounter: myAgenticTokenCounter, +}) + +agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: chatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware}, +}) +``` + ### 自定义配置 ```go -config := &reduction.Config{ +middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, RootDir: "/data/agent", MaxLengthForTrunc: 30000, 
MaxTokensForClear: 100000, ClearRetentionSuffixLimit: 2, - TokenCounter: myTokenCounter, + ClearAtLeastTokens: 10000, + TruncExcludeTools: []string{"search_tool"}, + ClearExcludeTools: []string{"read_file"}, + ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) { + // 将 write_file 调用重写为 system-reminder + return []*schema.Message{schema.UserMessage("file written")}, nil + }, ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context { log.Printf("Clear completed, messages: %d", len(state.Messages)) return ctx }, ToolConfig: map[string]*reduction.ToolReductionConfig{ - "grep": { - Backend: grepBackend, - SkipTruncation: false, - }, - "read_file": { - Backend: readFileBackend, - SkipClear: true, // 读文件工具不需要清理 - }, + "grep": {Backend: grepBackend}, + "read_file": {SkipClear: true}, }, -} - -middleware, err := reduction.New(ctx, config) +}) ``` -### 仅使用截断策略 +### 仅截断 ```go middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, - SkipClear: true, // 跳过清理阶段 + SkipClear: true, }) ``` -### 仅使用清理策略 +### 仅清理 ```go middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, - SkipTruncation: true, // 跳过截断阶段 + SkipTruncation: true, + MaxTokensForClear: 100000, + // Backend 为 nil 时,清理仍会替换内容为占位符,但不执行 offload }) ``` @@ -278,29 +253,37 @@ middleware, err := reduction.New(ctx, &reduction.Config{ ### Truncation(截断) -在 `WrapInvokableToolCall` / `WrapStreamableToolCall` 中处理: +在 `WrapInvokableToolCall` / `WrapStreamableToolCall` / `WrapEnhancedInvokableToolCall` / `WrapEnhancedStreamableToolCall` 中处理: 1. 工具返回结果 -2. 调用 TruncHandler 判断是否需要截断 -3. 如需截断,将完整内容存到 Backend -4. 返回截断后的内容,包含提示文字告知 agent 完整内容的位置 +2. 检查 `TruncExcludeTools`,命中则跳过 +3. 查找 ToolConfig → 全局 defaultConfig,获取 TruncHandler +4. TruncHandler 判定:读取完整输出,检查所有 text 部分总长度是否超过 `MaxLengthForTrunc` +5. 超过则:保留首尾各 `MaxLengthForTrunc/(textParts*2)` 字符作为预览,完整内容存到 Backend +6. 
返回截断通知,告知 agent 完整内容的文件路径 + +> 💡 +> 对于流式工具,默认 TruncHandler 会等待完整流读取完毕后再决定是否截断。若需严格增量流式行为,请为该工具提供自定义 TruncHandler。 ### Clear(清理) 在 `BeforeModelRewriteState` 中处理: -1. 用 TokenCounter 计算总 token -2. 超过 MaxTokensForClear 才处理 -3. 从旧消息开始遍历,跳过已处理的和最近 ClearRetentionSuffixLimit 轮 -4. 对范围内的每个工具调用,调用 ClearHandler -5. 需要清理的,写入 Backend,把消息里的结果替换成文件路径 -6. 调用 ClearPostProcess 回调 +1. 用 `TokenCounter` 计算总 token +2. 未超过 `MaxTokensForClear` 则跳过 +3. 确定清理范围:从第一条未处理的 assistant 消息开始,到 `len(messages) - ClearRetentionSuffixLimit` 轮结束 +4. 若配置了 `ClearMessageRewriter`,先对范围内消息执行重写预处理 +5. 遍历范围内的 tool call 消息,跳过 `ClearExcludeTools` +6. 对每个 tool call 调用 ClearHandler,替换参数和结果 +7. 如设置了 `ClearAtLeastTokens`:先在副本上操作,对比清理前后 token 差值,不达标则放弃本次清理 +8. 达标后执行实际 offload 写入,更新 state.Messages +9. 调用 `ClearPostProcess` --- ## 多语言支持 -截断和清理的提示文字支持中英文,通过 `adk.SetLanguage()` 切换: +截断和清理的提示文字支持中英文自动切换: ```go adk.SetLanguage(adk.LanguageChinese) // 中文 @@ -311,7 +294,11 @@ adk.SetLanguage(adk.LanguageEnglish) // 英文(默认) ## 注意事项 -- 当 `SkipTruncation` 为 false 时,`Backend` 必须设置 -- 默认 TokenCounter 用 `字符数 / 4` 估算,对于中文不精准,建议使用 `github.com/tiktoken-go/tokenizer` 替换 -- 已处理过的消息会打标记,不会重复处理 -- `ToolConfig` 中的配置优先级高于全局配置 +- `SkipTruncation` 为 false 时,`Backend` **必须**设置 +- 默认 TokenCounter 用字符数/4 估算,建议使用 `github.com/tiktoken-go/tokenizer` 替换 +- 已处理过的消息通过 Extra 字段打标记 `_reduction_mw_processed`,不会重复处理 +- `ToolConfig` 中配置优先级高于全局;若 ToolConfig 中仅设置了 `SkipTruncation: false` 但未提供 `TruncHandler`,则回退到默认 handler +- `GenTruncOffloadFilePath` / `GenClearOffloadFilePath` 适用于 tool_call_id 不唯一的场景(如 retry),防止文件覆盖 +- `ClearMessageRewriter` 在清理范围确定后、逐工具清理前执行,适合将 write/edit 类调用压缩为简短提示 +- `ClearAtLeastTokens` 设为 0 表示只要超阈值就执行清理;大于 0 时可避免微量清理破坏 prompt cache +- Legacy API(`NewClearToolResult`、`NewToolResultMiddleware`)已废弃,建议迁移到 `New` / `NewTyped` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md 
index de9bf1ac22a..7fac9f60cb2 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md @@ -1,26 +1,26 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: ToolSearch weight: 7 --- -# ToolSearch 中间件 - -adk/middlewares/dynamictool/toolsearch - -> 💡 -> 本中间件在 v0.8.0 版本引入。 - ## 概述 `toolsearch` 中间件实现动态工具选择。当工具库很大时,把所有工具都传给模型会撑爆上下文。这个中间件的做法是: -1. 添加一个 `tool_search` 元工具,接受正则表达式搜索工具名 +1. 添加一个 `tool_search` 元工具,接受关键字查询或直接选择来搜索工具 2. 初始时隐藏所有动态工具 -3. 模型调用 `tool_search` 后,匹配的工具才会出现在后续调用中 +3. 模型调用 `tool_search` 后,匹配的工具才会出现在后续调用中支持三种运行模式(配置层面为两个值,但 `UseModelToolSearch=true` 存在两种端到端行为): + +- **默认模式**(`UseModelToolSearch=false`):中间件自行管理工具可见性。在每次 Model 调用前通过 `BeforeModelRewriteState` 根据 `tool_search` 的调用结果过滤 `state.ToolInfos`,逐步将选中的动态工具加回模型可见列表 +- **模型原生模式 — 纯服务端检索**(`UseModelToolSearch=true`,模型自行检索 DeferredTools):中间件把动态工具移入 `state.DeferredToolInfos`,通过 `model.WithDeferredTools` 传递给模型。如果模型原生支持 server-side 工具检索(如 Claude 的 tool search),模型直接从 DeferredTools 中搜索和选择,**无需调用 tool_search tool** +- **模型原生模式 — 客户端代理检索**(`UseModelToolSearch=true`,模型通过调用 `tool_search` 发现工具):与上一模式相同的中间件配置,但模型不具备自主检索 DeferredTools 的能力,而是通过调用 `tool_search` 工具(由 `model.WithToolSearchTool` 注册),客户端的 `modelToolSearchTool` 执行搜索并返回结构化的 `ToolSearchResult`(含匹配工具的完整 ToolInfo),模型据此选择工具 + +> 💡 +> 包路径:github.com/cloudwego/eino/adk/middlewares/dynamictool/toolsearch --- @@ -31,17 +31,33 @@ Agent 初始化 │ ▼ ┌───────────────────────────────────────────┐ -│ BeforeAgent │ -│ - 注入 tool_search 工具 │ -│ - 把 DynamicTools 加到 Tools 列表 │ +│ BeforeAgent │ +│ - 注入 tool_search 工具 │ +│ - 把 DynamicTools 加到 Tools 列表 │ +│ - 模型原生模式下设置 │ +│ runCtx.ToolSearchTool │ └───────────────────────────────────────────┘ │ ▼ ┌────────────────────────────────────────────┐ -│ WrapModel │ -│ 每次 Model 调用前: │ -│ 1. 扫描消息历史,找到历史中所有 tool_search 的返回结果。 │ -│ 2. 
全量 Tools 减去未被选中的 DynamicTools,作为本次 Model 调用的工具列表。 │ +│ BeforeModelRewriteState │ +│ (每次 Model 调用前执行) │ +│ │ +│ 1. 插入 │ +│ User 消息,列出所有可搜索的工具名 │ +│ │ +│ 首次调用时(初始化): │ +│ 默认模式: │ +│ 从 ToolInfos 中移除 DynamicTools │ +│ 模型原生模式: │ +│ DynamicTools → DeferredToolInfos │ +│ ToolInfos 中移除 DynamicTools │ +│ 和 tool_search │ +│ │ +│ 后续调用(默认模式-前向选择): │ +│ 扫描消息历史,收集 tool_search 返回的 │ +│ matches,把匹配的 DynamicTools 加回 │ +│ ToolInfos │ └────────────────────────────────────────────┘ │ ▼ @@ -56,34 +72,81 @@ Agent 初始化 type Config struct { // 可动态搜索和加载的工具列表 DynamicTools []tool.BaseTool + + // 是否使用模型原生的工具搜索能力 + // + // 为 true 时,中间件将工具搜索委托给模型的原生能力。 + // + // 为 false 时(默认),中间件通过在每次 Model 调用前 + // 根据 tool_search 结果过滤工具列表来管理工具可见性。 + // 注意:这种方式可能会使模型的 KV-cache 失效 + // (因为工具列表在调用之间会变化)。 + UseModelToolSearch bool } ``` --- -## tool_search 工具 +## 构造函数 -中间件注入的工具。 +```go +// 标准构造函数,使用 *schema.Message +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) -**参数:** +// 泛型构造函数,支持 *schema.Message 和 *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +## `New` 内部调用 `NewTyped[*schema.Message]`。如果你使用 `TypedChatModelAgent`(如 Agentic 模式),请直接使用 `NewTyped`。 + +## tool_search 工具 + +中间件注入的元工具。**参数:** - + +
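 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
-| `regex_pattern` | string | Yes | Regular expression matched against tool names |
+| `query` | string | Yes | Query string used to find tools. Supports three modes: keyword search, direct selection with `select:`, and required match with `+keyword` |
+| `max_results` | integer | No | Maximum number of results to return (default: 5). Applies only to keyword search; direct selection is not subject to this limit |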
    -**返回:** +**查询模式:** + + + + + + +
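+| Mode | Syntax | Description |
+| --- | --- | --- |
+| Keyword search | `"weather forecast"` | Matches keywords against tool names and descriptions and sorts results by relevance score. Tool names are split on camelCase boundaries and on the `_` / `__` (MCP) separators |
+| Direct selection | `"select:tool_a,tool_b"` | Selects one or more tools by exact name, comma-separated. Not limited by `max_results` |
+| Required match | `"+slack send message"` | Keywords prefixed with `+` are required: tools that do not contain them are filtered out. The remaining keywords are used for ranking |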
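The three query modes in the table above could be told apart with a small parser like the one below. This is a hedged sketch only: `parsedQuery` and `parseQuery` are hypothetical names introduced for illustration, and the real middleware may tokenize queries differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parsedQuery is a hypothetical representation of a tool_search query.
type parsedQuery struct {
	Selected []string // exact tool names from "select:a,b"
	Required []string // keywords that were prefixed with "+"
	Optional []string // remaining keywords, used for ranking
}

// parseQuery classifies a query string into one of the documented modes.
func parseQuery(q string) parsedQuery {
	var pq parsedQuery
	q = strings.TrimSpace(q)

	// Direct selection: "select:tool_a,tool_b".
	if strings.HasPrefix(q, "select:") {
		for _, name := range strings.Split(strings.TrimPrefix(q, "select:"), ",") {
			if name = strings.TrimSpace(name); name != "" {
				pq.Selected = append(pq.Selected, name)
			}
		}
		return pq
	}

	// Keyword search, with optional "+keyword" required terms.
	// Keywords are lowercased because matching is case-insensitive.
	for _, word := range strings.Fields(q) {
		if strings.HasPrefix(word, "+") {
			if kw := strings.TrimPrefix(word, "+"); kw != "" {
				pq.Required = append(pq.Required, strings.ToLower(kw))
			}
			continue
		}
		pq.Optional = append(pq.Optional, strings.ToLower(word))
	}
	return pq
}

func main() {
	fmt.Printf("%+v\n", parseQuery("select:tool_a,tool_b"))
	fmt.Printf("%+v\n", parseQuery("+slack send message"))
}
```

Checking for the `select:` prefix first keeps direct selection independent of keyword handling, which matches the note that `max_results` does not apply to it.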
    + +**返回值(默认模式):** ```json -{ - "selectedTools": ["tool_a", "tool_b"] -} +{"matches": ["tool_a", "tool_b"]} ``` ---- +**返回值(模型原生模式):** 返回结构化的 `schema.ToolResult`,包含匹配工具的完整 `ToolInfo`,供模型原生处理。 + +## 关键字搜索评分机制 + +关键字搜索使用多层评分系统,对每个关键字分别计算最高得分后累加: + + + + + + + +
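+| Match rule | Score |
+| --- | --- |
+| A part of the split tool name exactly equals the keyword | 10 |
+| A part of the split tool name contains the keyword (substring) | 5 |
+| The full tool name contains the keyword | 3 |
+| The tool description contains the keyword | 2 |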
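The scoring tiers in the table above can be illustrated with the sketch below. It is an approximation written for this document, not the middleware's source: all function and type names are hypothetical, each keyword is assumed to earn only its single best tier, and the camelCase splitter is simplified (a run of capitals such as `HTTPServer` is left whole).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// splitName breaks a tool name on "_" / "__" and on lower-to-upper
// camelCase boundaries. Empty segments produced by "__" are dropped.
func splitName(name string) []string {
	var parts []string
	for _, seg := range strings.Split(name, "_") {
		runes := []rune(seg)
		start := 0
		for i := 1; i < len(runes); i++ {
			if unicode.IsLower(runes[i-1]) && unicode.IsUpper(runes[i]) {
				parts = append(parts, string(runes[start:i]))
				start = i
			}
		}
		if start < len(runes) {
			parts = append(parts, string(runes[start:]))
		}
	}
	return parts
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// keywordScore returns the best tier one keyword earns against one tool.
// Matching is case-insensitive.
func keywordScore(name, description, keyword string) int {
	kw := strings.ToLower(keyword)
	if kw == "" {
		return 0
	}
	best := 0
	for _, p := range splitName(name) {
		lp := strings.ToLower(p)
		if lp == kw {
			best = maxInt(best, 10) // exact part match
		} else if strings.Contains(lp, kw) {
			best = maxInt(best, 5) // substring of a part
		}
	}
	if strings.Contains(strings.ToLower(name), kw) {
		best = maxInt(best, 3) // substring of the full name
	}
	if strings.Contains(strings.ToLower(description), kw) {
		best = maxInt(best, 2) // substring of the description
	}
	return best
}

// totalScore sums the per-keyword best scores.
func totalScore(name, description string, keywords []string) int {
	total := 0
	for _, kw := range keywords {
		total += keywordScore(name, description, kw)
	}
	return total
}

type scoredTool struct {
	Name  string
	Score int
}

// rank sorts by score (descending), breaking ties by name (ascending).
func rank(tools []scoredTool) {
	sort.Slice(tools, func(i, j int) bool {
		if tools[i].Score != tools[j].Score {
			return tools[i].Score > tools[j].Score
		}
		return tools[i].Name < tools[j].Name
	})
}

func main() {
	fmt.Println(splitName("mcp__slack__send_message"))
	fmt.Println(totalScore("mcp__slack__send_message", "Send a Slack message", []string{"slack", "send"}))
}
```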
    + +> 💡 +> 每个关键字对每个规则取最高分(intMax),不会叠加同一工具内多个 part 的匹配分数。多个关键字的得分相加为总分。得分相同时按工具名字典序排列。 + +工具名会按 `_`(下划线)、`__`(MCP 服务器与工具分隔符)和 camelCase 边界拆分为多个部分进行匹配。例如 `mcp__slack__send_message` 会拆分为 `["mcp", "slack", "send", "message"]`,`NotebookEdit` 会拆分为 `["Notebook", "Edit"]`。匹配不区分大小写。 ## 使用示例 +### 默认模式(中间件管理工具可见性) + ```go middleware, err := toolsearch.New(ctx, &toolsearch.Config{ DynamicTools: []tool.BaseTool{ @@ -103,35 +166,70 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ }) ``` +### 模型原生模式 + +```go +middleware, err := toolsearch.New(ctx, &toolsearch.Config{ + DynamicTools: []tool.BaseTool{ + weatherTool, + stockTool, + currencyTool, + }, + UseModelToolSearch: true, +}) +if err != nil { + return err +} + +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: myModel, // 需要模型支持原生 tool search + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +配置完全相同,但端到端行为取决于模型适配器的实现: + +- 如果模型原生支持 server-side 检索(如 Claude):模型直接从 `DeferredToolInfos` 中搜索和选择工具,`tool_search` 工具不会被调用 +- 如果模型通过客户端代理检索:模型发起 `tool_search` 调用 → 客户端 `modelToolSearchTool` 执行搜索 → 返回结构化 `ToolSearchResult`(含完整 ToolInfo)→ 模型据此选择工具 + --- ## 工作原理 ### BeforeAgent -1. 获取所有 DynamicTool -2. 使用 DynamicTools 创建 `tool_search` 工具 -3. 把 `tool_search` 和所有 DynamicTools 加到 `runCtx.Tools`,此时 Agent 中的 Tools 为全量 +1. 获取所有 DynamicTool 的 ToolInfo,校验无重复工具名 +2. 根据 `UseModelToolSearch` 创建对应类型的 `tool_search` 工具 +3. 把 `tool_search` 和所有 DynamicTools 加到 `runCtx.Tools`(此时 Agent 中为全量工具) +4. 模型原生模式下,设置 `runCtx.ToolSearchTool`,框架会通过 `model.WithToolSearchTool` 传递给模型 + +### BeforeModelRewriteState(每次 Model 调用前) + +**通用逻辑:** + +- 确保消息列表中存在 `` 提醒(以 User 消息插入,列出所有可搜索的工具名)**首次调用 — 初始化(两种模式):** -### WrapModel + +
    +默认模式
    state.ToolInfos
    中移除所有 DynamicTools,使模型初始只能看到静态工具和
    tool_search
    +模型原生模式1. 将 DynamicTools 从
    state.ToolInfos
    提取到
    state.DeferredToolInfos
    2. 从
    state.ToolInfos
    中移除
    tool_search
    (由模型原生处理)
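The first-call initialization for the two modes can be sketched with plain string slices standing in for the tool lists. This is an illustrative model only: `toolLists` and `initToolLists` are hypothetical names, `Visible` stands in for `state.ToolInfos`, and `Deferred` stands in for `state.DeferredToolInfos`.

```go
package main

import "fmt"

// toolLists is a simplified stand-in for the two tool lists on the agent state.
type toolLists struct {
	Visible  []string // tools the model sees directly
	Deferred []string // tools handed to the model for native retrieval
}

// initToolLists partitions the full tool set on the first model call.
func initToolLists(all []string, dynamic map[string]bool, useModelToolSearch bool) toolLists {
	var out toolLists
	for _, name := range all {
		switch {
		case dynamic[name]:
			// Dynamic tools are always hidden at first; in model-native
			// mode they are additionally offered as deferred tools.
			if useModelToolSearch {
				out.Deferred = append(out.Deferred, name)
			}
		case useModelToolSearch && name == "tool_search":
			// In model-native mode tool_search is handled by the model itself.
		default:
			out.Visible = append(out.Visible, name)
		}
	}
	return out
}

func main() {
	all := []string{"calculator", "tool_search", "weather", "stock"}
	dynamic := map[string]bool{"weather": true, "stock": true}

	fmt.Printf("default: %+v\n", initToolLists(all, dynamic, false))
	fmt.Printf("native:  %+v\n", initToolLists(all, dynamic, true))
}
```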
    -每次 Model 调用前: +**后续调用 — 前向选择(仅默认模式):** -1. 遍历消息历史,找所有 `tool_search` 的返回结果 +1. 遍历消息历史,找所有 `tool_search` 返回结果中 JSON `matches` 字段 2. 收集已选中的工具名 -3. 从全量工具中过滤掉未选中的 DynamicTools -4. 用过滤后的工具列表调用 Model +3. 把匹配的 DynamicTools 加回 `state.ToolInfos`(累加,不会移除已添加的工具) -### 工具选择流程 +### 工具选择流程(默认模式) ``` 第一轮: - Model 只能看到 tool_search - Model 调用 tool_search(regex_pattern="weather.*") - 返回 {"selectedTools": ["weather_forecast", "weather_history"]} + Model 只能看到 tool_search + 静态工具 + Model 调用 tool_search(query="weather forecast") + 返回 {"matches": ["weather_forecast", "weather_history"]} 第二轮: - Model 能看到 tool_search + weather_forecast + weather_history + Model 能看到 tool_search + 静态工具 + weather_forecast + weather_history Model 调用 weather_forecast(...) ``` @@ -139,7 +237,10 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 注意事项 -- DynamicTools 不能为空 -- 正则匹配的是工具名,不是描述 -- 选中的工具会一直保持可用,除非 tool_search 调用结果被删除或修改 -- 可以多次调用 tool_search,结果会累加 +- `DynamicTools` 不能为空,且工具名不能重复 +- 关键字搜索匹配工具名和描述,不区分大小写 +- 在默认模式下,选中的工具会一直保持可用(基于消息历史中 `tool_search` 结果累加) +- 可以多次调用 `tool_search`,结果会累加 +- 默认模式下,每次 Model 调用前工具列表可能变化,这可能导致模型 KV-cache 失效 +- 模型原生模式需要 ChatModel 支持 `model.WithToolSearchTool` 和/或 `model.WithDeferredTools` 选项。具体走哪条路径(纯服务端检索 vs 客户端代理检索)取决于模型适配器的实现 +- `` 提醒以 **User 消息**(而非 System 消息)插入到消息列表中,位于第一条非 System 消息之前 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md index 878423730f0..857e98b0c00 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md @@ -1,298 +1,257 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-19" lastmod: "" tags: [] title: ChatModelAgentMiddleware weight: 8 --- -## 概述 +`ChatModelAgentMiddleware` 是自定义 `ChatModelAgent`(及基于它的 `DeepAgent`)行为的核心接口。自 v0.8.0 引入,在后续版本持续演进。 -## 
ChatModelAgentMiddleware 接口 +## 类型约定 -`ChatModelAgentMiddleware` 定义了自定义 `ChatModelAgent` 行为的接口。 +本文使用默认 `M = *schema.Message` 的别名。泛型原始类型以 `Typed` 前缀命名: -**重要说明:** 此接口专为 `ChatModelAgent` 及基于它构建的 Agent(如 `DeepAgent`)设计。 - -> 💡 -> ChatModelAgentMiddleware 接口在 v0.8.0 版本引入 +```go +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +type BaseChatModelAgentMiddleware = TypedBaseChatModelAgentMiddleware[*schema.Message] +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +type ModelContext = TypedModelContext[*schema.Message] +``` -### 为什么使用 ChatModelAgentMiddleware 而非 AgentMiddleware? +当需使用 `*schema.AgenticMessage` 时,直接使用 `Typed` 泛型版本即可。 - - - - - -
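-| Feature | AgentMiddleware (struct) | ChatModelAgentMiddleware (interface) |
-| --- | --- | --- |
-| Extensibility | Closed; users cannot add new methods | Open; users can implement custom handlers |
-| Context propagation | Callbacks return only an error | All methods return `(context.Context, ..., error)` |
-| Configuration management | Scattered across closures | Centralized in struct fields |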
    +--- -### 接口定义 +## 接口定义 ```go type ChatModelAgentMiddleware interface { - // BeforeAgent 在每次 agent 运行前调用,允许修改 instruction 和 tools 配置 + // ── 生命周期 Hook ── + + // BeforeAgent:agent 运行前调用一次,可修改 instruction、tools 配置 BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // BeforeModelRewriteState 在每次模型调用前调用 - // 返回的 state 会被持久化到 agent 内部状态并传递给模型 - // 返回的 context 会传播到模型调用和后续 handler + // AfterAgent:agent 成功终止后调用(最终回答或 return-directly 工具结果) + // 错误终止(超迭代、context 取消、model 错误)时不调用 + AfterAgent(ctx context.Context, state *ChatModelAgentState) (context.Context, error) + + // BeforeModelRewriteState:每次模型调用前调用 + // 返回的 state 被持久化,可修改 Messages、ToolInfos、DeferredToolInfos BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // AfterModelRewriteState 在每次模型调用后调用 - // 输入的 state 包含模型响应作为最后一条消息 + // AfterModelRewriteState:每次模型调用后调用 + // 输入 state 包含模型响应作为最后一条消息 AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // WrapInvokableToolCall 用自定义行为包装工具的同步执行 - // 如果不需要包装,返回原始 endpoint 和 nil error - // 仅对实现了 InvokableTool 的工具调用此方法 - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + // ── Wrapper ── - // WrapStreamableToolCall 用自定义行为包装工具的流式执行 - // 如果不需要包装,返回原始 endpoint 和 nil error - // 仅对实现了 StreamableTool 的工具调用此方法 + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) - - // WrapEnhancedInvokableToolCall 用自定义行为包装增强型工具的同步执行 WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) 
(EnhancedInvokableToolCallEndpoint, error) - - // WrapEnhancedStreamableToolCall 用自定义行为包装增强型工具的流式执行 WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) - // WrapModel 用自定义行为包装聊天模型 - // 如果不需要包装,返回原始 model 和 nil error - // 在请求时调用,每次模型调用前都会执行 - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -### 使用 BaseChatModelAgentMiddleware - -嵌入 `*BaseChatModelAgentMiddleware` 以获得默认的空操作实现: - -```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware -} - -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - return ctx, state, nil + // WrapModel:包装 ChatModel,参数类型为 model.BaseModel[M](非 ToolCallingChatModel) + // 框架单独处理 WithTools 绑定,不经过用户 wrapper + WrapModel(ctx context.Context, m model.BaseModel[M], mc *ModelContext) (model.BaseModel[M], error) } ``` ---- - -## 工具调用端点类型 - -工具包装使用函数类型而非接口,更清晰地表达了包装的意图: +> 💡 +> 嵌入 `*BaseChatModelAgentMiddleware` 可获得所有方法的空操作默认实现,只需覆盖关心的方法。 -```go -// InvokableToolCallEndpoint 是同步工具调用的函数签名 -type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +### AgentMiddleware 已废弃 -// StreamableToolCallEndpoint 是流式工具调用的函数签名 -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) +> 💡 +> `AgentMiddleware` 结构体及 `ChatModelAgentConfig.Middlewares` 字段已标记为 Deprecated,将在未来版本中移除。所有新代码应使用 `ChatModelAgentMiddleware`(interface-based Handlers)。 -// EnhancedInvokableToolCallEndpoint 是增强型同步工具调用的函数签名 -type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +`AgentMiddleware` 是结构体,有固有局限——用户无法扩展方法,回调仅返回 error 无法传播 context。`ChatModelAgentMiddleware` 
是接口: -// EnhancedStreamableToolCallEndpoint 是增强型流式工具调用的函数签名 -type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) -``` +- Hook 方法返回 `(context.Context, ..., error)`,支持 context 传播 +- Wrapper 方法通过 endpoint 链传播修改后的 context +- 自定义 handler 可携带任意内部状态 -### 为什么使用分离的端点类型? +迁移映射: -之前的 `ToolCall` 接口同时包含 `InvokableRun` 和 `StreamableRun`,但大多数工具只实现其中一个。 -分离的端点类型使得: + + + + + + + +
+| AgentMiddleware field | ChatModelAgentMiddleware replacement |
+| --- | --- |
+| `AdditionalInstruction` | Modify `runCtx.Instruction` in `BeforeAgent` |
+| `AdditionalTools` | Modify `runCtx.Tools` in `BeforeAgent` |
+| `BeforeChatModel` | `BeforeModelRewriteState` |
+| `AfterChatModel` | `AfterModelRewriteState` |
+| `WrapToolCall` | `WrapInvokableToolCall` / `WrapStreamableToolCall` |
    -- 只有当工具实现相应接口时才调用对应的包装方法 -- wrapper 作者更清晰的契约 -- 关于实现哪个方法没有歧义 +当前版本两者可共存(Handlers 在 Middlewares 之后执行),但应尽早迁移。 --- -## ChatModelAgentContext +## 上下文类型 + +### ChatModelAgentContext -`ChatModelAgentContext` 包含在每次 `ChatModelAgent` 运行前传递给 handler 的运行时信息。 +`BeforeAgent` 的输入,每次 Run 前调用一次: ```go type ChatModelAgentContext struct { - // Instruction 是当前 Agent 执行的指令 - // 包括 agent 配置的指令、框架和 AgentMiddleware 追加的额外指令, - // 以及之前 BeforeAgent handler 应用的修改 + // 当前 instruction(含 agent 配置 + 框架追加 + 前序 handler 修改) Instruction string - // Tools 是当前为 Agent 执行配置的原始工具(无任何 wrapper 或 tool middleware) - // 包括 AgentConfig 中传入的工具、框架隐式添加的工具(如 transfer/exit 工具), - // 以及 middleware 已添加的其他工具 + // 原始工具列表(含框架隐式工具如 transfer/exit) Tools []tool.BaseTool - // ReturnDirectly 是当前配置为使 Agent 直接返回的工具名称集合 + // 配置为"直接返回"的工具名集合 ReturnDirectly map[string]bool + + // 模型原生工具搜索能力的 ToolInfo + // 由 handler 设置后,框架通过 model.WithToolSearchTool 传递给模型 + ToolSearchTool *schema.ToolInfo } ``` ---- - -## ChatModelAgentState +### ChatModelAgentState -`ChatModelAgentState` 表示对话过程中聊天模型 agent 的状态。这是 `ChatModelAgentMiddleware` 和 `AgentMiddleware` 回调的主要状态类型。 +每次模型调用前后传递的**持久化状态**(跨 iteration 保持): ```go type ChatModelAgentState struct { - // Messages 包含当前对话会话中的所有消息 - Messages []Message + // 当前会话的所有消息 + Messages []*schema.Message + + // 传递给模型的工具定义(via model.WithTools),可在 BeforeModelRewriteState 中修改 + ToolInfos []*schema.ToolInfo + + // 延迟检索工具定义(via model.WithDeferredTools),用于模型原生搜索能力 + // 未使用时为 nil + DeferredToolInfos []*schema.ToolInfo } ``` ---- +> 💡 +> 修改 `ToolInfos` / `DeferredToolInfos` 的推荐位置是 `BeforeModelRewriteState`——这是工具配置的 source of truth。不要在 `WrapModel` 中修改工具列表。 -## ToolContext +### ModelContext -`ToolContext` 提供被包装工具的元数据。在请求时创建,包含当前工具调用的信息。 +`WrapModel` 和 `Before/AfterModelRewriteState` 的上下文: ```go -type ToolContext struct { - // Name 是工具名称 - Name string +type ModelContext struct { + // Deprecated: 使用 ChatModelAgentState.ToolInfos 替代 + Tools []*schema.ToolInfo + + // 模型重试配置 + ModelRetryConfig *ModelRetryConfig - // 
CallID 是此特定工具调用的唯一标识符 - CallID string + // 模型容灾切换配置 + ModelFailoverConfig *ModelFailoverConfig[*schema.Message] } ``` -### 使用示例:工具调用包装 +### ToolContext + +工具包装的元数据: ```go -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - log.Printf("Tool %s (call %s) starting with args: %s", tCtx.Name, tCtx.CallID, argumentsInJSON) - - result, err := endpoint(ctx, argumentsInJSON, opts...) - - if err != nil { - log.Printf("Tool %s failed: %v", tCtx.Name, err) - return "", err - } - - log.Printf("Tool %s completed with result: %s", tCtx.Name, result) - return result, nil - }, nil +type ToolContext struct { + Name string // 工具名称 + CallID string // 本次调用唯一标识 } ``` --- -## ModelContext +## 工具调用端点类型 -`ModelContext` 包含传递给 `WrapModel` 的上下文信息。在请求时创建,包含当前模型调用的工具配置。 +工具包装使用函数类型而非接口。根据工具实现的接口,框架调用对应的 Wrap 方法: ```go -type ModelContext struct { - // Tools 是当前配置给 agent 的工具列表 - // 在请求时填充,包含将发送给模型的工具 - Tools []*schema.ToolInfo +// 标准工具 +type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - // ModelRetryConfig 包含模型的重试配置 - // 在请求时从 agent 的 ModelRetryConfig 填充 - // 用于 EventSenderModelWrapper 适当地包装流错误 - ModelRetryConfig *ModelRetryConfig -} +// 增强型工具(使用 ToolArgument/ToolResult) +type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) ``` -### 使用示例:模型包装 - -```go -func (h *MyHandler) WrapModel(ctx context.Context, m 
model.BaseChatModel, mc *adk.ModelContext) (model.BaseChatModel, error) { - return &myModelWrapper{ - inner: m, - tools: mc.Tools, - }, nil -} - -type myModelWrapper struct { - inner model.BaseChatModel - tools []*schema.ToolInfo -} +> 💡 +> 每个 Wrap 方法**仅在工具实现了对应接口时才被调用**。例如,工具只实现了 `InvokableTool`,则只会调用 `WrapInvokableToolCall`,不会调用 `WrapStreamableToolCall`。 -func (w *myModelWrapper) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - log.Printf("Model called with %d tools", len(w.tools)) - return w.inner.Generate(ctx, msgs, opts...) -} +--- -func (w *myModelWrapper) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return w.inner.Stream(ctx, msgs, opts...) -} -``` +## 执行顺序 ---- +### Model 调用生命周期(由外到内) + +1. ~~AgentMiddleware.BeforeChatModel~~(**Deprecated**,将移除) +2. **ChatModelAgentMiddleware.BeforeModelRewriteState** +3. `failoverModelWrapper`(内部 — 模型容灾切换,如配置) +4. `retryModelWrapper`(内部 — 失败重试) +5. `eventSenderModelWrapper` 预处理(内部 — 准备事件发送) +6. **ChatModelAgentMiddleware.WrapModel** 预处理(先注册 → 先执行) +7. `callbackInjectionModelWrapper`(内部) +8. **Model.Generate / Stream** +9. `callbackInjectionModelWrapper` 后处理 +10. **ChatModelAgentMiddleware.WrapModel** 后处理(先注册 → 后执行) +11. `eventSenderModelWrapper` 后处理 +12. `retryModelWrapper` 后处理 +13. **ChatModelAgentMiddleware.AfterModelRewriteState** +14. ~~AgentMiddleware.AfterChatModel~~(**Deprecated**,将移除) + +### Tool 调用生命周期(由外到内) + +1. `eventSenderToolHandler`(内部 — 发送工具结果事件) +2. `ToolsConfig.ToolCallMiddlewares` +3. ~~AgentMiddleware.WrapToolCall~~(**Deprecated**,将移除) +4. **ChatModelAgentMiddleware.WrapXxxToolCall**(先注册 → 最外层) +5. `cancelMonitoredToolHandler`(内部 — 取消监控) +6. 
**Tool.InvokableRun / StreamableRun** ## 运行时本地存储 API -`SetRunLocalValue`、`GetRunLocalValue` 和 `DeleteRunLocalValue` 提供在当前 agent Run() 调用期间存储、获取和删除值的能力。 +在当前 agent `Run()` 期间存取键值对。值与中断/恢复兼容——序列化后随 checkpoint 持久化。 ```go -// SetRunLocalValue 设置一个在当前 agent Run() 调用期间持久化的键值对 -// 值的作用域限于此特定执行,不会在不同的 Run() 调用或 agent 实例之间共享 -// -// 存储在这里的值与中断/恢复周期兼容 - 它们会被序列化并在 agent 恢复时还原 -// 对于自定义类型,必须在 init() 函数中使用 schema.RegisterName[T]() 注册以确保正确序列化 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func SetRunLocalValue(ctx context.Context, key string, value any) error - -// GetRunLocalValue 获取在当前 agent Run() 调用期间设置的值 -// 值的作用域限于此特定执行,不会在不同的 Run() 调用或 agent 实例之间共享 -// -// 通过 SetRunLocalValue 存储的值与中断/恢复周期兼容 - 它们会被序列化并在 agent 恢复时还原 -// 对于自定义类型,必须在 init() 函数中使用 schema.RegisterName[T]() 注册以确保正确序列化 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果找到值返回 (value, true, nil),如果未找到返回 (nil, false, nil), -// 如果在 agent 执行上下文之外调用返回错误 func GetRunLocalValue(ctx context.Context, key string) (any, bool, error) - -// DeleteRunLocalValue 删除在当前 agent Run() 调用期间设置的值 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func DeleteRunLocalValue(ctx context.Context, key string) error ``` -### 使用示例:跨 handler 点共享数据 +> 💡 +> 自定义类型必须在 `init()` 中通过 `schema.RegisterName[T]()` 注册,以确保 gob 序列化正确。这些函数只能在 `ChatModelAgentMiddleware` 回调内调用。 + +### 示例:跨回调共享状态 ```go func init() { - schema.RegisterName[*MyCustomData]("my_package.MyCustomData") + schema.RegisterName[*ToolStats]("mypackage.ToolStats") } -type MyCustomData struct { +type ToolStats struct { Count int Name string } -type MyHandler struct { +type MyMiddleware struct { *adk.BaseChatModelAgentMiddleware } -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - result, err := 
endpoint(ctx, argumentsInJSON, opts...) - - data := &MyCustomData{Count: 1, Name: tCtx.Name} - if err := adk.SetRunLocalValue(ctx, "my_handler.last_tool", data); err != nil { - log.Printf("Failed to set run local value: %v", err) - } - +// 在工具调用后记录统计 +func (m *MyMiddleware) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) + + _ = adk.SetRunLocalValue(ctx, "last_tool", &ToolStats{Count: 1, Name: tCtx.Name}) return result, err }, nil } -func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - if val, found, err := adk.GetRunLocalValue(ctx, "my_handler.last_tool"); err == nil && found { - if data, ok := val.(*MyCustomData); ok { - log.Printf("Last tool was: %s (count: %d)", data.Name, data.Count) +// 在模型调用后读取统计 +func (m *MyMiddleware) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + if val, found, _ := adk.GetRunLocalValue(ctx, "last_tool"); found { + if stats, ok := val.(*ToolStats); ok { + log.Printf("上一次工具: %s (count=%d)", stats.Name, stats.Count) } } return ctx, state, nil @@ -303,219 +262,79 @@ func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatM ## SendEvent API -`SendEvent` 允许在 agent 执行期间向事件流发送自定义 `AgentEvent`。 +在 agent 执行期间向事件流发送自定义 `AgentEvent`,调用方遍历事件流时可收到: ```go -// SendEvent 在 agent 执行期间向事件流发送自定义 AgentEvent -// 允许 ChatModelAgentMiddleware 实现发出自定义事件, -// 这些事件将被遍历 agent 事件流的调用者接收 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func SendEvent(ctx context.Context, event *AgentEvent) error ``` ---- - -## State 类型(即将弃用) - -`State` 保存 agent 
运行时状态,包括消息和用户可扩展存储。 - -**⚠️ 弃用警告:** 此类型将在 v1.0.0 中设为未导出。请在 `ChatModelAgentMiddleware` 和 `AgentMiddleware` 回调中使用 `ChatModelAgentState`。不建议直接使用 `compose.ProcessState[*State]`,该用法将在 v1.0.0 中停止工作;请使用 handler API。 - -```go -type State struct { - Messages []Message - extra map[string]any // 未导出,通过 SetRunLocalValue/GetRunLocalValue 访问 - - // 以下为内部字段 - 请勿直接访问 - // 为与现有 checkpoint 向后兼容而保持导出 - ReturnDirectlyToolCallID string - ToolGenActions map[string]*AgentAction - AgentName string - RemainingIterations int - - internals map[string]any -} -``` - ---- - -## 架构图 - -下图展示了 `ChatModelAgentMiddleware` 在 `ChatModelAgent` 执行过程中的工作原理: - -``` -Agent.Run(input) - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ BeforeAgent(ctx, *ChatModelAgentContext) │ -│ 输入: 当前 Instruction、Tools 等 Agent 运行环境 │ -│ 输出: 修改后的 Agent 运行环境 │ -│ 作用: Run 开始时调用一次,修改整个 Run 生命周期的配置 │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ ReAct Loop │ -│ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ BeforeModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ 输入: 消息历史等持久化状态,以及 Model 运行环境 │ │ │ -│ │ │ 输出: 修改后的持久化状态,返回新 ctx │ │ │ -│ │ │ 作用: 修改跨 iteration 的持久化状态(主要是消息列表) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ WrapModel(ctx, BaseChatModel, *ModelContext) │ │ │ -│ │ │ 输入: 被 wrap 的 ChatModel,以及 Model 运行环境 │ │ │ -│ │ │ 输出: 包装后的 Model (洋葱模型) │ │ │ -│ │ │ 作用: 修改单次 Model 请求的输入、输出和配置 │ │ │ -│ │ │ │ │ │ │ -│ │ │ ▼ │ │ │ -│ │ │ ┌───────────────┐ │ │ │ -│ │ │ │ Model │ │ │ │ -│ │ │ │ Generate/Stream│ │ │ │ -│ │ │ └───────────────┘ │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ 
-│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ AfterModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ 输入: 消息历史等持久化状态(含 Model 响应), │ │ │ -│ │ │ 以及 Model 运行环境 │ │ │ -│ │ │ 输出: 修改后的持久化状态 │ │ │ -│ │ │ 作用: 修改跨 iteration 的持久化状态(主要是消息列表) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌──────────────────┐ │ │ -│ │ │ Model 返回内容? │ │ │ -│ │ └──────────────────┘ │ │ -│ │ │ │ │ │ -│ │ 最终响应 │ │ ToolCalls │ │ -│ │ │ ▼ │ │ -│ │ │ ┌─────────────────────────────────────┐ │ │ -│ │ │ │ WrapInvokableToolCall / WrapStream │ │ │ -│ │ │ │ ableToolCall(ctx, endpoint, *TC) │ │ │ -│ │ │ │ 输入: 被 wrap 的 Tool 以及 │ │ │ -│ │ │ │ Tool 运行环境 │ │ │ -│ │ │ │ 输出: 包装后的 endpoint (洋葱模型)│ │ │ -│ │ │ │ 作用: 修改单次 Tool 请求的 │ │ │ -│ │ │ │ 输入、输出和配置 │ │ │ -│ │ │ │ │ │ │ │ -│ │ │ │ ▼ │ │ │ -│ │ │ │ ┌─────────────┐ │ │ │ -│ │ │ │ │ Tool.Run() │ │ │ │ -│ │ │ │ └─────────────┘ │ │ │ -│ │ │ └─────────────────────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ │ │ (结果加入 Messages) │ │ -│ │ │ │ │ │ -│ │ │ ┌─────────┘ │ │ -│ │ │ │ │ │ -│ │ │ └──────────► 继续循环 │ │ -│ │ │ │ │ -│ └─────────────────────┼─────────────────────────────────────────────┘ │ -│ │ │ -│ ▼ │ -│ 循环直到完成或达到 maxIterations │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ - Agent.Run() 结束 -``` - -### Handler 方法说明 - - - - - - - - - -
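-| Method | Input | Output | Scope |
-| --- | --- | --- | --- |
-| `BeforeAgent` | Agent runtime environment (`*ChatModelAgentContext`) | Modified agent runtime environment | The entire Run lifecycle; called only once |
-| `BeforeModelRewriteState` | Persistent state + model runtime environment | Modified persistent state | Persistent state across iterations (message list) |
-| `WrapModel` | The ChatModel being wrapped + model runtime environment | Wrapped model | Input, output, and configuration of a single model request |
-| `AfterModelRewriteState` | Persistent state (including the response) + model runtime environment | Modified persistent state | Persistent state across iterations (message list) |
-| `WrapInvokableToolCall` | The tool being wrapped + tool runtime environment | Wrapped endpoint | Input, output, and configuration of a single tool request |
-| `WrapStreamableToolCall` | The tool being wrapped + tool runtime environment | Wrapped endpoint | Input, output, and configuration of a single tool request |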
    +仅能在 `ChatModelAgentMiddleware` 回调内调用。 --- -## 执行顺序 +## State 类型 -### Model 调用生命周期(从外到内的 wrapper 链) - -1. `AgentMiddleware.BeforeChatModel`(hook,在模型调用前运行) -2. `ChatModelAgentMiddleware.BeforeModelRewriteState`(hook,可在模型调用前修改状态) -3. `retryModelWrapper`(内部 - 失败时重试,如已配置) -4. `eventSenderModelWrapper` 预处理(内部 - 准备事件发送) -5. `ChatModelAgentMiddleware.WrapModel` 预处理(wrapper,在请求时包装,先注册的先运行) -6. `callbackInjectionModelWrapper`(内部 - 如未启用则注入回调) -7. `Model.Generate/Stream` -8. `callbackInjectionModelWrapper` 后处理 -9. `ChatModelAgentMiddleware.WrapModel` 后处理(wrapper,先注册的后运行) -10. `eventSenderModelWrapper` 后处理(内部 - 发送模型响应事件) -11. `retryModelWrapper` 后处理(内部 - 处理重试逻辑) -12. `ChatModelAgentMiddleware.AfterModelRewriteState`(hook,可在模型调用后修改状态) -13. `AgentMiddleware.AfterChatModel`(hook,在模型调用后运行) - -### Tool 调用生命周期(从外到内) - -1. `eventSenderToolHandler`(内部 ToolMiddleware - 在所有处理后发送工具结果事件) -2. `ToolsConfig.ToolCallMiddlewares`(ToolMiddleware) -3. `AgentMiddleware.WrapToolCall`(ToolMiddleware) -4. `ChatModelAgentMiddleware.WrapInvokableToolCall/WrapStreamableToolCall`(在请求时包装,先注册的在最外层) -5. 
`Tool.InvokableRun/StreamableRun` +> 💡 +> `State` 仅为 checkpoint 向后兼容而保持导出。**不要直接使用**——请在 `ChatModelAgentMiddleware` 回调中使用 `ChatModelAgentState`,用 `SetRunLocalValue/GetRunLocalValue` 替代原 `State.Extra`。`compose.ProcessState[*State]` 用法将在 v1.0.0 中停止工作。 --- ## 迁移指南 -### 从 AgentMiddleware 迁移到 ChatModelAgentMiddleware +### 从 compose.ProcessState[*State] 迁移 -**之前(AgentMiddleware):** +**之前:** ```go -middleware := adk.AgentMiddleware{ - BeforeChatModel: func(ctx context.Context, state *adk.ChatModelAgentState) error { - return nil - }, -} +compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { + st.Extra["myKey"] = myValue + return nil +}) ``` -**之后(ChatModelAgentMiddleware):** +**之后:** ```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware +// 写入 +if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { + return ctx, state, err } -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - newCtx := context.WithValue(ctx, myKey, myValue) - return newCtx, state, nil +// 读取 +if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { + // use val } ``` -### 从 compose.ProcessState[*State] 迁移 +### 适配 AfterAgent(v0.9 新增) -**之前:** +`AfterAgent` 在 agent **成功终止**后调用(最终回答或 return-directly 工具结果),可用于后处理: ```go -compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { - st.Extra["myKey"] = myValue - return nil -}) +func (m *MyMiddleware) AfterAgent(ctx context.Context, state *adk.ChatModelAgentState) (context.Context, error) { + log.Printf("Agent 完成,共 %d 条消息", len(state.Messages)) + // 可在此做审计、统计、清理等 + return ctx, nil +} ``` -**之后(使用 SetRunLocalValue/GetRunLocalValue):** +> 💡 +> `AfterAgent` 按注册顺序调用(与 `BeforeAgent` 一致)。任一 handler 返回 error 后,后续 handler 不再调用(fail-fast),错误发送到事件流。 -```go -if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { - return ctx, state, err -} +### 适配 ToolInfos / 
DeferredToolInfos(v0.9 新增) -if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { +`ChatModelAgentState` 新增了 `ToolInfos` 和 `DeferredToolInfos` 字段,取代 `ModelContext.Tools` 成为工具配置的 source of truth: + +```go +func (m *MyMiddleware) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + // 动态过滤工具 + filtered := make([]*schema.ToolInfo, 0, len(state.ToolInfos)) + for _, t := range state.ToolInfos { + if shouldInclude(t.Name) { + filtered = append(filtered, t) + } + } + state.ToolInfos = filtered + return ctx, state, nil } ``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md index 5291c0c17df..936376cfe4a 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md @@ -1,125 +1,162 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem Backend weight: 1 --- -> 💡 -> Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) +> 💡Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) ## 背景与目的 -在 AI Agent 场景中,Agent 往往需要与文件系统交互——读取文件内容、搜索代码、编辑配置、执行命令等。然而,不同的运行环境对文件系统的访问方式差异很大: +AI Agent 需要与文件系统交互(读取、搜索、编辑、执行命令),但不同运行环境的访问方式差异很大:本地磁盘、远程沙箱、内存模拟、对象存储等。若每种环境单独实现文件操作逻辑,会导致 Middleware/Agent 代码与底层存储耦合。 -- **本地开发环境**:直接操作本机文件系统,零配置即可使用 -- **云端沙箱环境**:通过远程 API 操作隔离的沙箱文件系统,需要认证和网络通信 -- **测试环境**:需要内存级别的模拟文件系统,无需真实磁盘 I/O -- **自定义存储**:可能需要对接 OSS、数据库等非传统文件系统 +`filesystem.Backend` 接口解决这一问题——作为**统一文件系统操作协议**: -如果每种环境都各自实现一套文件操作逻辑,会导致 Middleware 和 Agent 代码与底层存储实现耦合,难以复用和测试。 - 
-为了解决这一问题,Eino ADK 抽象出 `filesystem.Backend` 接口,作为**统一的文件系统操作协议**。它的设计目标是: - -1. **解耦存储与业务**:Middleware 只依赖 Backend 接口,不关心底层是本地磁盘、远程沙箱还是内存模拟 -2. **可插拔替换**:通过切换 Backend 实现,同一个 Agent 可以在不同环境中运行,无需修改任何业务代码 -3. **易于测试**:内置 `InMemoryBackend` 实现,方便在单元测试中模拟文件系统行为 -4. **可扩展性**:所有方法使用结构体参数,未来新增字段不会破坏已有实现的兼容性 +1. **解耦存储与业务** — Middleware 只依赖接口,不关心底层实现 +2. **可插拔替换** — 切换 Backend 即可在不同环境运行,无需修改业务代码 +3. **易于测试** — 内置 `InMemoryBackend`,无需真实磁盘 I/O +4. **向前兼容** — 所有方法使用结构体参数,新增字段不破坏已有实现 ## Backend 接口 ```go type Backend interface { - // 列出指定路径下的文件和目录信息 LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - // 读取文件内容,支持按行分页(offset + limit) Read(ctx context.Context, req *ReadRequest) (*FileContent, error) - // 在指定路径中搜索匹配 pattern 的内容,返回匹配列表 GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) - // 根据 glob pattern 和路径查找匹配的文件 GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) - // 写入或创建文件 Write(ctx context.Context, req *WriteRequest) error - // 替换文件中的字符串内容 Edit(ctx context.Context, req *EditRequest) error } ``` -### 扩展接口 + + + + + + + + +
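+| Method | Function | Returns |
+| --- | --- | --- |
+| `LsInfo` | Lists file and directory information under the given path | `[]FileInfo` |
+| `Read` | Reads file content, with line-based pagination (offset + limit) | `*FileContent` |
+| `GrepRaw` | Searches files for content matching the pattern | `[]GrepMatch` |
+| `GlobInfo` | Finds files matching a glob pattern | `[]FileInfo` |
+| `Write` | Writes or creates a file | `error` |
+| `Edit` | Replaces string content in a file | `error` |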
    -除核心文件操作外,Backend 还可以选择性地实现 Shell 命令执行能力: +## 扩展接口 + +### Shell / StreamingShell + +Backend 可选择性实现命令执行能力。当 Backend 同时实现 `Shell` 或 `StreamingShell` 时,Filesystem Middleware 会额外注册 `execute` 工具。两者**互斥**,不可同时配置。 ```go -// Shell 提供同步命令执行能力 type Shell interface { Execute(ctx context.Context, input *ExecuteRequest) (result *ExecuteResponse, err error) } -// StreamingShell 提供流式命令执行能力,适用于长时间运行的命令 type StreamingShell interface { ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (result *schema.StreamReader[*ExecuteResponse], err error) } ``` -当 Backend 同时实现了 `Shell` 或 `StreamingShell` 接口时,Filesystem Middleware 会额外注册 `execute` 工具,允许 Agent 执行 shell 命令。 +### MultiModalReader + +可选扩展接口,支持多模态文件读取(图片、PDF 等),返回结构化的 `MultiFileContent`。 + +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` + +当 Backend 实现此接口且 Middleware 配置 `UseMultiModalRead = true` 时,`read_file` 工具将使用多模态读取。 + +## 核心数据类型 -### 核心数据类型 +### 请求类型 - - - - - - - - - - + + + + + + + + +
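-| Type | Description |
-| --- | --- |
-| `FileInfo` | File/directory information: path, whether it is a directory, size, modification time |
-| `FileContent` | File content + line number information |
-| `GrepMatch` | Search match result: content, path, line number |
-| `ReadRequest` | Read request: path, offset (starting line, 1-based), limit (number of lines to read) |
-| `GrepRequest` | Search request: pattern (regex supported), path, glob filter, file type filter, etc. |
-| `WriteRequest` | Write request: path, content |
-| `EditRequest` | Edit request: path, old string, new string, whether to replace all occurrences |
-| `ExecuteRequest` | Command execution request: command string, whether to run in the background |
-| `ExecuteResponse` | Command execution result: output, exit code, whether the output was truncated |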
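+| Type | Fields | Description |
+| --- | --- | --- |
+| `LsInfoRequest` | `Path string` | Directory path to list |
+| `ReadRequest` | `FilePath string`; `Offset int`; `Limit int` | File path; starting line number (1-based, values below 1 are treated as 1); maximum number of lines to read (0 = all) |
+| `MultiModalReadRequest` | Embeds `ReadRequest`; `Pages string` | Inherits all ReadRequest fields; Pages specifies the PDF page range (e.g. "1-5", "3") |
+| `GrepRequest` | `Pattern string`; `Path string`; `Glob string`; `FileType string`; `CaseInsensitive bool`; `EnableMultiline bool`; `AfterLines int`; `BeforeLines int` | Regex search pattern (ripgrep syntax); search directory; glob file filter; file type filter (e.g. "go", "py"); ignore case; enable multiline matching; show N lines after each match; show N lines before each match |
+| `GlobInfoRequest` | `Pattern string`; `Path string` | Glob expression (supports `*`, `**`, `?`, `[abc]`); directory where the search starts |
+| `WriteRequest` | `FilePath string`; `Content string` | Target file path; content to write |
+| `EditRequest` | `FilePath string`; `OldString string`; `NewString string`; `ReplaceAll bool` | File path; exact string to be replaced (non-empty); replacement string; when false, OldString must occur exactly once in the file |
+| `ExecuteRequest` | `Command string`; `RunInBackendGround bool` | Command string to execute; whether to run in the background |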
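
The `Offset` and `Limit` rules for `ReadRequest` can be sketched in plain Go. This is only an illustration of the semantics listed in the table, not the code any Backend actually runs: `paginateLines` is a hypothetical helper, and returning an empty string for an offset past the end of the file is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// paginateLines is a hypothetical helper, not part of the eino API.
// It applies the rules from the table: offset is 1-based, values below 1
// are treated as 1, and limit 0 means "read to the end".
func paginateLines(content string, offset, limit int) string {
	if offset < 1 {
		offset = 1
	}
	lines := strings.Split(content, "\n")
	start := offset - 1
	if start >= len(lines) {
		// Assumption: an offset past the last line yields no content.
		return ""
	}
	end := len(lines)
	if limit > 0 && start+limit < end {
		end = start + limit
	}
	return strings.Join(lines[start:end], "\n")
}

func main() {
	text := "Hello, World!\nLine 2\nLine 3"
	// Read one line starting at line 2.
	fmt.Println(paginateLines(text, 2, 1))
}
```

A real Backend may differ in details such as trailing newlines or line-number prefixes, so treat the helper as a reading aid for the table only.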
+### Response types
+
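+| Type | Fields | Description |
+| --- | --- | --- |
+| `FileInfo` | `Path string`; `IsDir bool`; `Size int64`; `ModifiedAt string` | File/directory path; whether it is a directory; file size (bytes); last modification time (ISO 8601 format) |
+| `FileContent` | `Content string` | Plain-text content of the file |
+| `MultiFileContent` | `*FileContent`; `Parts []FileContentPart` | Embeds FileContent; multimodal output parts. Parts and FileContent are mutually exclusive: when Parts is non-empty, FileContent is ignored |
+| `FileContentPart` | `Type FileContentPartType`; `MIMEType string`; `Data []byte` | Content type (`"image"`, `"pdf"`); MIME type (e.g. "image/png"); raw binary data |
+| `GrepMatch` | `Content string`; `Path string`; `Line int` | Content of the matched line; file path; 1-based line number |
+| `ExecuteResponse` | `Output string`; `ExitCode *int`; `Truncated bool` | Command output; exit code (a pointer, may be nil); whether the output was truncated |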
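
The rule that `Parts` takes precedence over the embedded `FileContent` can be sketched as follows. The types below are local stand-ins that mirror the fields listed in the table; they are assumptions for illustration and not the real `filesystem` package types, and `describe` is a hypothetical consumer.

```go
package main

import "fmt"

// Local stand-ins mirroring the table above; not the real filesystem types.
type FileContent struct {
	Content string
}

type FileContentPart struct {
	Type     string
	MIMEType string
	Data     []byte
}

type MultiFileContent struct {
	*FileContent
	Parts []FileContentPart
}

// describe is a hypothetical consumer: it prefers Parts when present and
// falls back to the embedded text content otherwise.
func describe(m *MultiFileContent) string {
	if len(m.Parts) > 0 {
		return fmt.Sprintf("%d part(s), first is %s", len(m.Parts), m.Parts[0].MIMEType)
	}
	if m.FileContent != nil {
		return "text: " + m.Content
	}
	return "empty"
}

func main() {
	textOnly := &MultiFileContent{FileContent: &FileContent{Content: "hi"}}
	fmt.Println(describe(textOnly))

	withParts := &MultiFileContent{
		FileContent: &FileContent{Content: "ignored"},
		Parts:       []FileContentPart{{Type: "image", MIMEType: "image/png"}},
	}
	fmt.Println(describe(withParts))
}
```

Checking the embedded pointer for nil before reading `Content` matters here, because a result that carries only `Parts` may leave `FileContent` unset.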
    + +### 常量 + +```go +type FileContentPartType string + +const ( + FileContentPartTypeImage FileContentPartType = "image" + FileContentPartTypePDF FileContentPartType = "pdf" +) +``` + ## 内置实现:InMemoryBackend -`InMemoryBackend` 是框架内置的 Backend 实现,将文件存储在内存 map 中,主要用于: +`InMemoryBackend` 将文件存储在内存 map 中,主要用于: -- **单元测试**:无需真实文件系统即可测试 Agent 和 Middleware 的文件操作逻辑 -- **轻量场景**:不需要持久化的临时文件操作 -- **工具结果卸载**:Filesystem Middleware 的大型工具结果卸载功能默认使用 InMemoryBackend 存储 +- **单元测试** — 无需真实文件系统即可测试 Agent/Middleware 的文件操作逻辑 +- **轻量场景** — 不需要持久化的临时文件操作 +- **工具结果卸载** — Filesystem Middleware 的大型工具结果卸载功能默认使用 InMemoryBackend + +### 构造函数 ```go -import "github.com/cloudwego/eino/adk/filesystem" +func NewInMemoryBackend() *InMemoryBackend +``` -ctx := context.Background() +零参数构造,返回空的内存文件系统。 + +### 使用示例 + +```go backend := filesystem.NewInMemoryBackend() +ctx := context.Background() -// 写入文件 -err := backend.Write(ctx, &filesystem.WriteRequest{ +// 写入 +_ = backend.Write(ctx, &filesystem.WriteRequest{ FilePath: "/example/test.txt", Content: "Hello, World!\nLine 2\nLine 3", }) -// 读取文件(支持分页) -content, err := backend.Read(ctx, &filesystem.ReadRequest{ +// 读取(分页) +content, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/example/test.txt", Offset: 1, Limit: 10, }) -// 列出目录 -files, err := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/example", -}) +// 列目录 +files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{Path: "/example"}) -// 搜索内容(支持正则) -matches, err := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Pattern: "Hello", - Path: "/example", +// 搜索(正则) +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Pattern: "Hello", + Path: "/example", + CaseInsensitive: true, }) -// 编辑文件 -err = backend.Edit(ctx, &filesystem.EditRequest{ +// 编辑 +_ = backend.Edit(ctx, &filesystem.EditRequest{ FilePath: "/example/test.txt", OldString: "Hello", NewString: "Hi", @@ -127,18 +164,22 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ }) ``` -特性: +### 实现特性 -- 线程安全(基于 
`sync.RWMutex`) -- GrepRaw 支持正则匹配、大小写不敏感、上下文行数等高级选项 -- GrepRaw 内部采用并行处理(最多 10 个 worker) +- **线程安全** — 基于 `sync.RWMutex`,读操作使用读锁,写操作使用写锁 +- **GrepRaw 并行处理** — 多文件搜索时最多启动 10 个 worker 并行匹配 +- **正则支持** — 支持完整正则、大小写不敏感 (`(?i)` 前缀)、多行模式 +- **上下文行** — GrepRaw 支持 BeforeLines/AfterLines 显示匹配行前后的上下文 +- **Glob 匹配** — 使用 `doublestar` 库支持 `**` 递归匹配 +- **FileType 映射** — 内置 70+ 种文件类型到扩展名的映射表(go、py、ts、rust 等) +- **不实现 Shell** — InMemoryBackend 不实现 Shell/StreamingShell 接口 ## 外部实现 以下 Backend 实现位于 [eino-ext](https://github.com/cloudwego/eino-ext) 仓库: -- **Local Backend** — 本地文件系统实现,直接操作本机磁盘,零配置开箱即用 -- **Ark Agentkit Sandbox Backend** — 火山引擎 Agentkit 远程沙箱实现,在隔离的云端环境中执行文件操作 +- **Local Backend** (`github.com/cloudwego/eino-ext/adk/backend/local`) — 本地文件系统实现,直接操作本机磁盘 +- **Ark Agentkit Sandbox** (`github.com/cloudwego/eino-ext/adk/backend/agentkit`) — 火山引擎 Agentkit 远程沙箱实现 ### 实现对比 @@ -148,18 +189,17 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ 网络依赖无无需要 配置复杂度零配置零配置需要凭证 持久化否是是 -Shell 支持否支持(含流式)支持 -适用场景测试/临时开发/本地环境多租户/生产环境 +Shell 支持否Shell + StreamingShellShell +MultiModalReader否视实现而定视实现而定 +适用场景测试 / 临时存储开发 / 本地环境多租户 / 生产环境 ## 自定义实现 -如需对接自定义存储(如 OSS、数据库等),只需实现 `Backend` 接口即可: +实现 `Backend` 接口即可对接自定义存储。如需命令执行,额外实现 `Shell` 或 `StreamingShell`;如需多模态读取,实现 `MultiModalReader`。 ```go -type MyBackend struct { - // ... -} +type MyBackend struct { /* ... */ } func (b *MyBackend) LsInfo(ctx context.Context, req *filesystem.LsInfoRequest) ([]filesystem.FileInfo, error) { // 自定义实现 @@ -169,7 +209,29 @@ func (b *MyBackend) Read(ctx context.Context, req *filesystem.ReadRequest) (*fil // 自定义实现 } -// ... 
实现其余方法 -``` +func (b *MyBackend) GrepRaw(ctx context.Context, req *filesystem.GrepRequest) ([]filesystem.GrepMatch, error) { + // 自定义实现 +} + +func (b *MyBackend) GlobInfo(ctx context.Context, req *filesystem.GlobInfoRequest) ([]filesystem.FileInfo, error) { + // 自定义实现 +} -如果需要支持命令执行,还可以额外实现 `Shell` 或 `StreamingShell` 接口。 +func (b *MyBackend) Write(ctx context.Context, req *filesystem.WriteRequest) error { + // 自定义实现 +} + +func (b *MyBackend) Edit(ctx context.Context, req *filesystem.EditRequest) error { + // 自定义实现 +} + +// 可选:实现 Shell +func (b *MyBackend) Execute(ctx context.Context, input *filesystem.ExecuteRequest) (*filesystem.ExecuteResponse, error) { + // 自定义实现 +} + +// 可选:实现 MultiModalReader +func (b *MyBackend) MultiModalRead(ctx context.Context, req *filesystem.MultiModalReadRequest) (*filesystem.MultiFileContent, error) { + // 自定义实现 +} +``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md index 47767e5af6c..956755f238f 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Ark Agentkit Sandbox diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md new file mode 100644 index 00000000000..9bdd0a6bb62 --- /dev/null +++ 
b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md @@ -0,0 +1,201 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: 本地文件系统 +weight: 2 +--- + +## Local Backend + +**Package**: `github.com/cloudwego/eino-ext/adk/backend/local` + +> 💡 +> eino v0.8.0+ 需使用 local backend v0.2.1 及以上版本。 + +Local Backend 是 Eino ADK FileSystem 的本地实现,直接操作本机文件系统。实现了 `filesystem.Backend`(文件操作)和 `filesystem.StreamingShell`(流式命令执行)两个接口。 + +**核心特性**:零配置、原生性能、强制绝对路径、流式命令执行、可选命令验证。 + +--- + +## 安装 + +```bash +go get github.com/cloudwego/eino-ext/adk/backend/local +``` + +## 配置 + +```go +type Config struct { + // 可选:命令验证函数,用于 ExecuteStreaming 的安全控制。 + // 返回 non-nil error 时拒绝执行。 + ValidateCommand func(string) error +} +``` + +## 快速开始 + +```go +backend, err := local.NewBackend(ctx, &local.Config{}) + +// 写入文件(必须绝对路径;文件已存在则覆盖) +err = backend.Write(ctx, &filesystem.WriteRequest{ + FilePath: "/tmp/hello.txt", + Content: "Hello, Local Backend!", +}) + +// 读取文件(支持行级分页) +fc, err := backend.Read(ctx, &filesystem.ReadRequest{ + FilePath: "/tmp/hello.txt", + Offset: 1, // 起始行号(1-based) + Limit: 50, // 最大行数,0 表示全部 +}) +``` + +### 与 Agent 集成 + +```go +import ( + "github.com/cloudwego/eino/adk" + fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" + "github.com/cloudwego/eino-ext/adk/backend/local" +) + +backend, _ := local.NewBackend(ctx, &local.Config{}) + +middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ + Backend: backend, // 必填:注册 ls/read/write/edit/glob/grep 工具 + StreamingShell: backend, // 可选:注册流式 execute 工具 +}) + +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +> 💡 +> 中间件 Config 中 `Shell` 与 `StreamingShell` 互斥。Local Backend 仅实现 `StreamingShell`(流式命令执行),不实现非流式 `Shell`。 + +--- + +## 实现的接口与方法 + +### filesystem.Backend + + + + + + + + + +
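+| Method | Signature | Description |
+| --- | --- | --- |
+| `LsInfo` | `(ctx, *LsInfoRequest) ([]FileInfo, error)` | Lists directory contents |
+| `Read` | `(ctx, *ReadRequest) (*FileContent, error)` | Reads a file with line-based pagination (Offset is 1-based; Limit 0 = all) |
+| `Write` | `(ctx, *WriteRequest) error` | Writes a file; creates parent directories automatically; overwrites the file if it already exists |
+| `Edit` | `(ctx, *EditRequest) error` | String replacement; supports `ReplaceAll`; returns an error when `OldString` is not unique (in non-ReplaceAll mode) |
+| `GrepRaw` | `(ctx, *GrepRequest) ([]GrepMatch, error)` | ripgrep-based search with full regex syntax; supports case-insensitive matching, multiline matching, and context lines |
+| `GlobInfo` | `(ctx, *GlobInfoRequest) ([]FileInfo, error)` | Matches files by glob pattern; supports `*` / `**` / `?` / `[abc]` |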
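
The uniqueness rule for `Edit` can be sketched in plain Go. This is a minimal illustration under stated assumptions, not the Local Backend's implementation: `applyEdit` is a hypothetical helper, and treating a missing `OldString` as an error is an assumption the table does not state.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// applyEdit is a hypothetical helper, not part of the eino API. It mirrors
// the rule from the table: without replaceAll, oldStr must be unique.
func applyEdit(content, oldStr, newStr string, replaceAll bool) (string, error) {
	if oldStr == "" {
		return "", errors.New("old string must not be empty")
	}
	n := strings.Count(content, oldStr)
	if n == 0 {
		// Assumption: a missing old string is reported as an error.
		return "", errors.New("old string not found")
	}
	if !replaceAll && n > 1 {
		return "", fmt.Errorf("old string occurs %d times; set replaceAll or make it unique", n)
	}
	return strings.ReplaceAll(content, oldStr, newStr), nil
}

func main() {
	// Two occurrences without replaceAll: rejected.
	if _, err := applyEdit("old text, old text", "old text", "new text", false); err != nil {
		fmt.Println("rejected:", err)
	}
	// Same input with replaceAll: both occurrences are replaced.
	out, _ := applyEdit("old text, old text", "old text", "new text", true)
	fmt.Println(out)
}
```

Requiring uniqueness by default guards against an agent replacing more text than it intended when a short `OldString` happens to appear several times.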
+
+### filesystem.StreamingShell
+
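+| Method | Signature | Description |
+| --- | --- | --- |
+| `ExecuteStreaming` | `(ctx, *ExecuteRequest) (*StreamReader[*ExecuteResponse], error)` | Executes a shell command in streaming mode with real-time output; supports running in the background (`RunInBackendGround`) |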
    + +--- + +## 使用示例 + +### 搜索内容(正则) + +```go +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Path: "/home/user/project", + Pattern: "TODO|FIXME", // ripgrep 正则语法 + Glob: "*.go", + CaseInsensitive: true, +}) +``` + +### 编辑文件 + +```go +backend.Edit(ctx, &filesystem.EditRequest{ + FilePath: "/tmp/file.txt", + OldString: "old text", + NewString: "new text", + ReplaceAll: true, +}) +``` + +### 流式执行命令 + +```go +reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ + Command: "tail -f /var/log/app.log", +}) +for { + resp, err := reader.Recv() + if err == io.EOF { + break + } + fmt.Print(resp.Output) +} +``` + +### 带命令验证 + +```go +backend, _ := local.NewBackend(ctx, &local.Config{ + ValidateCommand: func(cmd string) error { + allowed := map[string]bool{"ls": true, "cat": true, "grep": true} + parts := strings.Fields(cmd) + if len(parts) == 0 || !allowed[parts[0]] { + return fmt.Errorf("command not allowed: %s", parts[0]) + } + return nil + }, +}) +``` + +--- + +## 路径要求 + +所有文件路径必须为绝对路径(以 `/` 开头)。相对路径可通过 `filepath.Abs()` 转换。 + +--- + +## 与 Agentkit Backend 对比 + + + + + + + + + + +
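+| Feature | Local | Agentkit |
+| --- | --- | --- |
+| Execution model | Direct local execution | Remote sandbox |
+| Network dependency | None | Required |
+| Configuration complexity | Zero configuration | Credentials required |
+| Security model | OS permissions + ValidateCommand | Isolated sandbox |
+| Streaming output | Supported (StreamingShell) | Not supported |
+| Platform support | Unix/Linux/macOS | Any |
+| Use cases | Development / local environments | Multi-tenant / production environments |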
    + +--- + +## FAQ + +**Q: GrepRaw 支持正则吗?** + +A: 支持。底层使用 ripgrep(`rg`),支持完整正则语法。系统需安装 ripgrep,否则报错 `ripgrep (rg) is not installed or not in PATH`。安装方式见 [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation) 。 + +**Q: Write 是创建还是覆盖?** + +A: 覆盖。`Write` 使用 `O_CREATE|O_TRUNC` 标志,文件已存在则覆盖内容,不存在则创建(含自动创建父目录)。 + +**Q: Windows 支持吗?** + +A: 不支持。`ExecuteStreaming` 依赖 `/bin/sh`。文件操作本身可在任意平台运行,但命令执行仅限 Unix 系。 + +**Q: Local Backend 支持非流式 Execute 吗?** + +A: 不支持。Local 仅实现 `StreamingShell`(`ExecuteStreaming`),未实现 `Shell`(`Execute`)。中间件 Config 中 `Shell` 与 `StreamingShell` 互斥,选其一即可。 diff --git "a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" "b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" deleted file mode 100644 index 00a3dff1644..00000000000 --- "a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" +++ /dev/null @@ -1,231 +0,0 @@ ---- -Description: "" -date: "2026-03-24" -lastmod: "" -tags: [] -title: 本地文件系统 -weight: 2 ---- - -## Local Backend - -Package: `github.com/cloudwego/eino-ext/adk/backend/local` - -注意:如果 eino 版本是 v0.8.0 及以上,需要使用 local backend 的 [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1) 版本。 - -### 概述 - -Local Backend 是 EINO ADK FileSystem 的本地文件系统实现,直接操作本机文件系统,提供原生性能和零配置体验。 - -#### 核心特性 - -- 零配置 - 开箱即用 -- 原生性能 - 直接文件系统访问,无网络开销 -- 路径安全 - 强制使用绝对路径 -- 流式执行 - 支持命令输出实时流 -- 命令验证 - 可选的安全验证钩子 - -### 安装 - -```bash -go get github.com/cloudwego/eino-ext/adk/backend/local -``` - -### 配置 - -```go -type Config struct { - // 可选: 命令验证函数,用于 Execute() 安全控制 - ValidateCommand 
func(string) error -} -``` - -### 快速开始 - -#### 基本用法 - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/adk/backend/local" - "github.com/cloudwego/eino/adk/filesystem" -) - -func main() { - ctx := context.Background() - - backend, err := local.NewBackend(ctx, &local.Config{}) - if err != nil { - panic(err) - } - - // 写入文件(必须是绝对路径) - err = backend.Write(ctx, &filesystem.WriteRequest{ - FilePath: "/tmp/hello.txt", - Content: "Hello, Local Backend!", - }) - - // 读取文件 - fcontent, err := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/tmp/hello.txt", - }) - fmt.Println(fcontent.Content) -} -``` - -#### 带命令验证 - -```go -func validateCommand(cmd string) error { - allowed := map[string]bool{"ls": true, "cat": true, "grep": true} - parts := strings.Fields(cmd) - if len(parts) == 0 || !allowed[parts[0]] { - return fmt.Errorf("command not allowed: %s", parts[0]) - } - return nil -} - -backend, _ := local.NewBackend(ctx, &local.Config{ - ValidateCommand: validateCommand, -}) -``` - -#### 与 Agent 集成 - -```go -import ( - "github.com/cloudwego/eino/adk" - fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -// 创建 Backend -backend, _ := local.NewBackend(ctx, &local.Config{}) - -// 创建 Middleware -middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ - Backend: backend, - StreamingShell: backend, -}) - -// 创建 Agent -agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LocalFileAgent", - Description: "具有本地文件系统访问能力的 AI Agent", - Model: chatModel, - Handlers: []adk.ChatModelAgentMiddleware{middleware}, -}) -``` - -### API 参考 - - - - - - - - - - - -
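-| Method | Description |
-| --- | --- |
-| `LsInfo` | Lists directory contents |
-| `Read` | Reads file content (supports pagination, 200 lines by default) |
-| `Write` | Creates a new file (returns an error if it already exists) |
-| `Edit` | Replaces file content |
-| `GrepRaw` | Searches file content (literal matching) |
-| `GlobInfo` | Finds files by pattern |
-| `Execute` | Executes a shell command |
-| `ExecuteStreaming` | Executes a command in streaming mode |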
    - -#### 示例 - -```go -// 列出目录 -files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/home/user", -}) - -// 读取文件(分页) -fcontent, _ := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/path/to/file.txt", - Offset: 0, - Limit: 50, -}) - -// 搜索内容(字面量匹配,非正则) -matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Path: "/home/user/project", - Pattern: "TODO", - Glob: "*.go", -}) - -// 查找文件 -files, _ := backend.GlobInfo(ctx, &filesystem.GlobInfoRequest{ - Path: "/home/user", - Pattern: "**/*.go", -}) - -// 编辑文件 -backend.Edit(ctx, &filesystem.EditRequest{ - FilePath: "/tmp/file.txt", - OldString: "old", - NewString: "new", - ReplaceAll: true, -}) - -// 执行命令 -result, _ := backend.Execute(ctx, &filesystem.ExecuteRequest{ - Command: "ls -la /tmp", -}) - -// 流式执行 -reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ - Command: "tail -f /var/log/app.log", -}) -for { - resp, err := reader.Recv() - if err == io.EOF { - break - } - fmt.Print(resp.Stdout) -} -``` - -### 路径要求 - -所有路径必须是绝对路径(以 `/` 开头): - -```go -// 正确 -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "/home/user/file.txt"}) - -// 错误 -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "./file.txt"}) -``` - -转换相对路径: - -```go -absPath, _ := filepath.Abs("./relative/path") -``` - -### 与 Agentkit Backend 对比 - - - - - - - - - - -
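-| Feature | Local | Agentkit |
-| --- | --- | --- |
-| Execution model | Direct local execution | Remote sandbox |
-| Network dependency | None | Required |
-| Configuration complexity | Zero configuration | Credentials required |
-| Security model | OS permissions | Isolated sandbox |
-| Streaming output | Supported | Not supported |
-| Platform support | Unix/Linux/macOS | Any |
-| Use cases | Development / local environments | Multi-tenant / production environments |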
    - -### 常见问题 - -**Q: 为什么运行 grep 命令报错 ripgrep (rg) is not installed or not in PATH. Please install it: ****[https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation)** - -local 的 Grep 命令默认依赖** ripgrep **指令,如系统没有预装 ripgrep 则需要通过文档安装 ripgrep - -**Q: GrepRaw 支持正则吗?** - -支持正则匹配,GrepRaw 底层使用的是 ripgrep 命令做的 Grep 操作 - -**Q: Windows 支持吗?** - -不支持,依赖 `/bin/sh`。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md index c4edcf4e9d1..ce12d13f1c4 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: AgentsMD @@ -9,51 +9,27 @@ weight: 9 ## 概述 -`agentsmd` 是 Eino ADK 提供的一个中间件,用于在每次模型调用时**自动将 Agents.md 文件内容注入到模型输入消息中**。注入是瞬态的——内容在模型调用时动态添加,不会持久化到会话状态中,因此**不会被摘要/压缩中间件处理**。 - -**核心价值**:通过 Agents.md 文件为 Agent 定义系统级的行为指令和上下文信息(类似 Claude Code 的 CLAUDE.md),无需手动管理 system prompt 的拼接。 - -**包路径**:`github.com/cloudwego/eino/adk/middlewares/agentsmd` - ---- +`agentsmd` 是 Eino ADK 的中间件,在每次模型调用时**自动将 Agents.md 文件内容注入到消息序列中**。注入的消息会被框架持久化到 agent 内部状态,但通过**幂等性检查**(`Extra["__agentsmd_content__"]` 标记)确保不会重复注入。由于注入内容在首次出现时即固定,**不会随后续摘要/压缩而变化**。**核心价值**:通过 Agents.md 文件为 Agent 定义系统级行为指令与上下文(类似 Claude Code 的 CLAUDE.md),无需手动管理 system prompt 拼接。**包路径**:`github.com/cloudwego/eino/adk/middlewares/agentsmd` ## 快速开始 -### 最小化示例 - ```go -package main - -import ( - "context" - "fmt" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/middlewares/agentsmd" -) - -func main() { - ctx := context.Background() - - // 1. 准备 Backend(文件读取后端) - backend := NewLocalFileBackend("/path/to/project") - - // 2. 
创建 agentsmd 中间件 - mw, err := agentsmd.New(ctx, &agentsmd.Config{ - Backend: backend, - AgentsMDFiles: []string{"/home/user/project/agents.md"}, - }) - if err != nil { - panic(err) - } - - // 3. 将中间件配置到 Agent - // agent := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // Middlewares: []adk.ChatModelAgentMiddleware{mw}, - // }) - _ = mw - fmt.Println("agentsmd middleware created successfully") +ctx := context.Background() + +// 1. 创建 agentsmd 中间件 +mw, err := agentsmd.New(ctx, &agentsmd.Config{ + Backend: myBackend, // 实现 agentsmd.Backend 接口 + AgentsMDFiles: []string{"/project/agents.md"}, +}) +if err != nil { + panic(err) } + +// 2. 配置到 Agent +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) ``` --- @@ -64,84 +40,86 @@ func main() { ```go type Config struct { - // Backend 提供文件访问能力,用于加载 Agents.md 文件。 - // 可以使用本地文件系统、远程存储或任何其他后端实现。 - // 必填。 - Backend Backend - - // AgentsMDFiles 指定要加载的 Agents.md 文件路径的有序列表。 - // 文件按照给定顺序加载和注入。 - // 文件内部支持 @import 语法进行递归引入(最大深度 5)。 - AgentsMDFiles []string - - // AllAgentsMDMaxBytes 限制所有加载的 Agents.md 内容的总字节大小。 - // 文件按顺序加载;一旦累计大小超过此限制,剩余文件将被跳过。 - // 每个单独的文件始终完整加载。 - // 0 表示无限制。 + Backend Backend + AgentsMDFiles []string AllAgentsMDMaxBytes int - - // OnLoadWarning 是一个可选的回调函数,在加载过程中发生非致命错误时调用 - // (如文件未找到、循环 @import、深度超限等)。 - // 如果为 nil,警告通过 log.Printf 输出。 - // - // 注意:Backend.Read 的非 os.ErrNotExist 错误(如权限被拒、I/O 错误) - // 不会被视为警告,而是会中止加载过程。 - OnLoadWarning func(filePath string, err error) + OnLoadWarning func(filePath string, err error) } ``` -### 配置参数说明 +### 参数说明 - - - - + + + +
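 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
-| `Backend` | `Backend` |  | - | File-reading backend, responsible for the actual file I/O |
-| `AgentsMDFiles` | `[]string` |  | - | List of Agents.md file paths to load (at least one) |
-| `AllAgentsMDMaxBytes` | `int` |  | `0` (unlimited) | Upper limit on the total byte size of all files |
-| `OnLoadWarning` | `func(string, error)` |  | `log.Printf` | Callback for non-fatal errors |
+| `Backend` | `Backend` |  |  | File-reading backend, responsible for the actual file I/O |
+| `AgentsMDFiles` | `[]string` |  |  | List of Agents.md file paths to load (at least one); files are loaded and injected in order |
+| `AllAgentsMDMaxBytes` | `int` |  | `0` (unlimited) | Upper limit on the total byte size of all files; once exceeded, later files are skipped, but each individual file is always loaded in full |
+| `OnLoadWarning` | `func(string, error)` |  | `log.Printf` | Callback for non-fatal errors (missing file, circular @import, depth limit exceeded, etc.) |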
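+### Validation rules
+
+`New` / `NewTyped` validate the Config at creation time:
+
+- `Config` must not be nil
+- `Backend` must not be nil
+- `AgentsMDFiles` must contain at least one path
+- `AllAgentsMDMaxBytes` must not be negative
+
 ---

+## Constructors
+
+### New: standard constructor
+
+```go
+func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error)
+```
+
+Returns a `ChatModelAgentMiddleware` (that is, `TypedChatModelAgentMiddleware[*schema.Message]`), suitable for a standard `ChatModelAgent`.
+
+### NewTyped: generic constructor
+
+```go
+func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error)
+```
+
+The generic version supports both `*schema.Message` and `*schema.AgenticMessage`. `New` calls `NewTyped[*schema.Message]` internally.
+
 ## Backend interface

 ### Interface definition

 ```go
 type Backend interface {
-    // Read reads the file content.
-    // If the file does not exist, implementations should return an error wrapping os.ErrNotExist
-    // (so that errors.Is(err, os.ErrNotExist) returns true).
-    // This lets the loader silently skip missing files and report them through OnLoadWarning.
-    // Other errors (such as permission denied or I/O errors) abort the loading process.
     Read(ctx context.Context, req *ReadRequest) (*FileContent, error)
 }
 ```

 ### Type definitions

-```go
-// ReadRequest defines the request parameters for reading a file
-type ReadRequest struct {
-    FilePath string // file path
-    Offset   int    // starting line number (1-based)
-}
+`ReadRequest` and `FileContent` are aliases of the same-named types in the `github.com/cloudwego/eino/adk/filesystem` package:

-// FileContent defines the returned file content
-type FileContent struct {
-    Content string // text content of the file
-}
+```go
+type ReadRequest = filesystem.ReadRequest
+type FileContent = filesystem.FileContent
 ```

+> 💡
+> **Backend implementation requirements**
+>
+> - When a file does not exist, the Backend **must** return an error wrapping `os.ErrNotExist` (so that `errors.Is(err, os.ErrNotExist)` is `true`); the loader relies on this to tell a missing file from a genuine I/O error
+> - Other errors (permission denied, I/O errors) **abort the entire loading process** and are not treated as warnings
+> - The `Read` method should be safe for concurrent use
+
 ---

 ## @import syntax

-Agents.md files support the `@import` syntax, which can recursively include other files.
+Agents.md files support the `@path` syntax for recursively including other files.

 ### Syntax

-In an Agents.md file, reference other files with `@path/filename`:
-
 ```markdown
 # Project instructions

@@ -152,68 +130,66 @@ Agents.md files support the `@import` syntax, which can recursively include other files.
 @rules/api-conventions.md
 ```

-### Rules
+### Matching rules
+
+The loader scans file content with the regular expression `@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)` and then applies the following filter:
+
+- **Paths containing /**: always treated as an @import (for example `@rules/style.md`)
+- **Paths without /**: treated as an @import only when the extension is in the allow list; otherwise ignored
+
+**Allowed extensions**: `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml`
+
+This design avoids mistaking `@someone`, `@example.com`, and similar text for import targets.
+
+### Resolution behavior

-1. **Path resolution**: relative paths resolve against the directory of the current file; absolute paths are used as-is
-2. **Maximum recursion depth**: 5 levels (deeper imports are skipped and trigger `OnLoadWarning`)
-3. **Circular reference detection**: circular references are detected and skipped automatically (triggers `OnLoadWarning`)
-4. **Global deduplication**: the same file is never loaded twice
-5. **Supported file extensions** (when the path contains no `/`): `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml`
-6. **False-positive filtering**: an `@reference` with no `/` and an extension outside the allow list is ignored (so `@someone` or `@example.com` is not treated as an import)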
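+| Rule | Description |
+| --- | --- |
+| Path resolution | Relative paths resolve against the directory of the current file; absolute paths are used as-is |
+| Maximum recursion depth | 5 levels (deeper imports are skipped and trigger `OnLoadWarning`) |
+| Circular reference detection | A path already present in the current ancestor chain is skipped (triggers `OnLoadWarning`) |
+| Global deduplication | Within a single load, each file path is read and injected only once |
+| Original text preserved | Files referenced by @import are appended as separate sections; the `@path` text in the original is not removed |
+| Byte budget | Once the cumulative byte count exceeds `AllAgentsMDMaxBytes`, subsequent imports are skipped |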
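
The matching and path-resolution rules can be sketched in a few lines of Go using only the standard library. The regular expression and the extension allow list are taken from the text; the helper names `extractImports` and `resolveImport` are hypothetical and are not the loader's actual API, and details such as case sensitivity of extensions are assumptions.

```go
package main

import (
	"path"
	"regexp"
	"strings"
)

// importRe is the pattern quoted in the matching rules.
var importRe = regexp.MustCompile(`@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)`)

// allowedExt lists the extensions accepted for targets that contain no slash.
var allowedExt = map[string]bool{
	".md": true, ".txt": true, ".mdx": true, ".yaml": true,
	".yml": true, ".json": true, ".toml": true,
}

// extractImports (hypothetical helper) returns the @import targets found in content.
// A target is kept if it contains "/" or if its extension is in the allow list.
func extractImports(content string) []string {
	var targets []string
	for _, m := range importRe.FindAllStringSubmatch(content, -1) {
		target := m[1]
		if strings.Contains(target, "/") || allowedExt[path.Ext(target)] {
			targets = append(targets, target)
		}
	}
	return targets
}

// resolveImport (hypothetical helper) resolves a target against the importing file:
// absolute targets are used as-is, relative ones are joined to the file's directory.
func resolveImport(currentFile, target string) string {
	if path.IsAbs(target) {
		return target
	}
	return path.Join(path.Dir(currentFile), target)
}
```

With this sketch, `@someone` and `@example.com` are filtered out because they contain no slash and carry no allowed extension, while `@rules/style.md` inside `/project/Agents.md` resolves to `/project/rules/style.md`.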
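-### @import directory layout example
+### Directory layout example

 ```
 project/
 ├── Agents.md                  # main entry file
 ├── rules/
-│   ├── code-style.md          # code style rules
-│   ├── api-conventions.md     # API conventions
-│   └── testing.md             # testing rules
+│   ├── code-style.md          # @rules/code-style.md
+│   ├── api-conventions.md     # @rules/api-conventions.md
+│   └── testing.md
 └── context/
-    └── architecture.md        # architecture notes
+    └── architecture.md
 ```

 ---

 ## How it works

+### Implementation hook
+
+The middleware implements the `BeforeModelRewriteState` method of the `TypedChatModelAgentMiddleware` interface (**not** WrapModel). This hook fires before every model call, when the state is rewritten.
+
 ### Injection flow

+### Message sequence after injection
+
 ```
-User message + history messages
-         │
-         ▼
-┌────────────────────────────────┐
-│ agentsmd middleware            │
-│ (WrapModel)                    │
-│                                │
-│ 1. Load Agents.md              │
-│ 2. Cache in RunLocal           │
-│ 3. Build the injected message  │
-└────────────────────────────────┘
-         │
-         ▼
-┌──────────────────────────────────────┐
-│ Message sequence after injection     │
-│                                      │
-│ [System] system prompt               │
-│ [User] ← Agents.md content injected  │ ← inserted before the first User message
-│ [User] user history message 1        │
-│ [Assistant] assistant reply 1        │
-│ [User] current user message          │
-└──────────────────────────────────────┘
-         │
-         ▼
-  Model call (Generate / Stream)
+[System] system prompt
+[User] ← Agents.md content (tagged via Extra)
+[User] user history message 1
+[Assistant] assistant reply 1
+[User] current user message
 ```

 ### Key mechanisms

-1. **Transient injection**: Agents.md content is inserted temporarily at model call time only and is not written to `ChatModelAgentState`, so summarization/compression middleware never processes it
-2. **Run-level cache**: within a single Agent `Run()`, the loaded Agents.md content is cached in `RunLocalValue`; later model calls (such as multi-round tool calls) reuse the cache and avoid repeated reads
-3. **Insertion position**: the content is inserted as a `User` role message before the first User message; if there is no User message, it is appended to the end
-4. **Internationalization**: the formatted output adapts to Chinese or English automatically (based on the system locale)
+**1. Persistent injection with an idempotency guarantee**
+
+The framework persists the state returned by `BeforeModelRewriteState` into the agent's internal state (`st.Messages = state.Messages`). The injected message is tagged with `Extra["__agentsmd_content__"]`. Each time the hook runs, it first scans for this tag and, if the tag is present, returns the original state unchanged to avoid duplicate injection. In effect, the content is injected and persisted on the first model call, and later iterations do not insert it again.
+
+**2. Run-level cache**
+
+Within a single `Run()`, the content loaded the first time is cached in RunLocal storage via `adk.SetRunLocalValue`. Later model calls (such as multi-round tool calls) reuse the cache through `adk.GetRunLocalValue`. Every new `Run()` reloads the files, so file changes take effect on the next Run.
+
+**4. Insertion position**
+
+The content is inserted as a `User` role message **before the first User message**. If the message sequence contains no User message, it is appended to the end.
+
+**5. Content formatting**
+
+The loaded file content is formatted before injection:
+
+- It is wrapped in an outer `` tag
+- It includes an i18n header (telling the model to follow the instructions) and footer (noting that the context may not be relevant)
+- Each file is shown separately, prefixed with `文件内容:{路径}(指令):`
+- The language (Chinese or English) is controlled globally through `adk.SetLanguage`

 ---

@@ -221,13 +197,11 @@ project/

 ### Middleware order

-**Place the agentsmd middleware after the summarization/compression middleware.** This ensures that Agents.md content:
-
-- is not compressed away by the summarization middleware
-- is available in full on every model call
+> 💡
+> **Place the agentsmd middleware after the summarization/compression middleware.** This way the Agents.md content is not compressed by summarization, and every model call receives the full instructions.

 ```go
-Middlewares: []adk.ChatModelAgentMiddleware{
+Handlers: []adk.ChatModelAgentMiddleware{
     summarizationMiddleware, // summarize first
     agentsMDMiddleware,      // then inject Agents.md
 }
@@ -237,44 +211,51 @@ Middlewares: []adk.ChatModelAgentMiddleware{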
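 | Scenario | Behavior |
 | --- | --- |
-| File does not exist ( `os.ErrNotExist` ) | The file is skipped; triggers `OnLoadWarning` |
-| Circular `@import` | The circular file is skipped; triggers `OnLoadWarning` |
-| `@import` depth exceeds 5 levels | Skipped; triggers `OnLoadWarning` |
+| File does not exist (`os.ErrNotExist`) | The file is skipped; triggers `OnLoadWarning` |
+| Circular @import | The circular file is skipped; triggers `OnLoadWarning` |
+| @import depth exceeds 5 levels | Skipped; triggers `OnLoadWarning` |
 | Cumulative size exceeds `AllAgentsMDMaxBytes` | Subsequent files are skipped; triggers `OnLoadWarning` (the first file is always loaded in full) |
 | Permission denied / I/O error | Loading is aborted and an error is returned |
-| All files are empty | Nothing is injected; the input messages are passed through unchanged |
+| All files are empty | Nothing is injected; the messages are passed through unchanged |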
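-### Backend implementation requirements
-
-- When a file does not exist, the Backend **must** return an error wrapping `os.ErrNotExist` (`fmt.Errorf("... : %w", os.ErrNotExist)`); otherwise the loader cannot tell a missing file from a genuine I/O error
-- The `Read` method should be safe for concurrent use
-
 ### Performance considerations

-- Set `AllAgentsMDMaxBytes` sensibly so that injected content does not take up too much of the model's context window
-- Agents.md content is loaded only once per `Run()` (Run-level cache), but **every new `Run()` reloads it**, so changes to the files take effect on the next Run
-- Avoid using `@import` for too many files in Agents.md; the recursion depth limit is 5 levels
+- Set `AllAgentsMDMaxBytes` sensibly so that injected content does not take up too much of the context window
+- Agents.md content is loaded only once per `Run()` (Run-level cache), but **every new Run() reloads it**
+- Avoid using @import for too many files; the recursion depth limit is 5 levels

 ### Tips for writing Agents.md

 - Keep the content concise and include only instructions that genuinely affect model behavior
-- Use `@import` to separate concerns (code style, API conventions, architecture notes, etc.)
-- Avoid including large amounts of code samples or data in Agents.md, so as not to waste the context window
-- File content is passed to the model wrapped in `` tags, and the model treats it as system-level instructions
+- Use @import to split content by concern (code style, API conventions, architecture notes, etc.)
+- Avoid including large amounts of code samples or data, so as not to waste the context window
+- File content is passed to the model wrapped in `` tags

 ---

 ## FAQ

 **Q: Is the Agents.md content saved to the conversation history?**
-A: No. The content is injected dynamically at model call time and is not written to `ChatModelAgentState`, so it does not appear in the conversation history.
+
+A: Yes. The state returned by `BeforeModelRewriteState` is persisted by the framework. Because of the idempotency check (the `Extra["__agentsmd_content__"]` tag), however, the content is injected only once, on the first model call, and later iterations skip it. Placing agentsmd after summarization is recommended so that the injected content is not compressed.

 **Q: What happens if an Agents.md file does not exist?**
-A: The file is skipped and the `OnLoadWarning` callback fires (default `log.Printf`); the overall load does not fail.
+
+A: The file is skipped and the `OnLoadWarning` callback fires (default `log.Printf`); loading of the other files is unaffected.

 **Q: What directory are @import paths relative to?**
-A: The directory of the current file. For example, `@rules/style.md` in `/project/Agents.md` is resolved to `/project/rules/style.md`.
+
+A: The directory of the current file. For example, `@rules/style.md` in `/project/Agents.md` resolves to `/project/rules/style.md`.

 **Q: If several files @import the same file, is it loaded more than once?**
-A: No. The loader maintains a global deduplication map, so the same file path is read and injected only once.
+
+A: No. The loader maintains a global deduplication map (`seen`), so the same path is read and injected only once.
+
+**Q: Is the @path reference in the original text replaced?**
+
+A: No. The @import file is appended after the original text as a separate section, and the original content stays unchanged.
+
+**Q: What is the difference between New and NewTyped?**
+
+A: `New` returns a `ChatModelAgentMiddleware` (that is, `TypedChatModelAgentMiddleware[*schema.Message]`) and suits standard Agents. `NewTyped` is the generic version; it additionally supports the `*schema.AgenticMessage` type for Agentic Model scenarios.
diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md
index e96a2709d8c..906affe379e 100644
--- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md
+++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md
@@ -1,187 +1,221 @@
 ---
 Description: ""
-date: "2026-03-24"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: FileSystem
 weight: 2
 ---

-> 💡 Package: [github.com/cloudwego/eino/adk/middlewares/filesystem](https://github.com/cloudwego/eino/tree/main/adk/middlewares/filesystem)
+The FileSystem middleware injects a set of file system tools into the Agent (ls, read\_file, write\_file, edit\_file, glob, grep), plus an optional command execution tool (execute), so that the Agent can interact with a local or remote file system.

-## Overview
-
-The FileSystem Middleware gives an Agent access to the file system. It operates on the file system through the [FileSystem Backend](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/filesystem_backend) interface and automatically injects a set of file tools and the matching system prompt into the Agent, so the Agent can read, write, search, and edit files directly.
-
-Core features:
-
-- **File system tool injection**: automatically registers ls, read_file, write_file, edit_file, glob, grep, and related tools
-- **Shell command execution**: optionally injects the execute tool, with synchronous and streaming execution
-- **Per-tool configuration**: each tool can have its name, description, and implementation customized, or be disabled
-- **Multilingual prompts**: tool descriptions and the system prompt can be switched between Chinese and English
+```
+import "github.com/cloudwego/eino/adk/middlewares/filesystem"
+```

-## Creating the middleware
+---

-Use the `New` function to create the middleware (it returns a `ChatModelAgentMiddleware`):
+## Quick start

 ```go
-import "github.com/cloudwego/eino/adk/middlewares/filesystem"
+import (
+    "context"
+    "github.com/cloudwego/eino/adk"
+    "github.com/cloudwego/eino/adk/middlewares/filesystem"
+)

+// 1. Create the middleware
 middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{
-    Backend: myBackend,
-    // To enable shell command execution, set Shell or StreamingShell
-    Shell: myShell,
+    Backend: myBackend, // implements the filesystem.Backend interface
 })
-if err != nil {
-    // handle error
-}

+// 2. Inject it into the Agent
 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
     // ...
     Middlewares: []adk.ChatModelAgentMiddleware{middleware},
 })
 ```

+---
+
+## Constructors
+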
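+| Function signature | Description |
+| --- | --- |
+| `New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error)` | Recommended. Returns a `ChatModelAgentMiddleware`, which can modify Instruction and Tools dynamically through the `BeforeAgent` hook. |
+| `NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error)` | Generic version. The type parameter `M` supports `*schema.Message` and `*schema.AgenticMessage`. `New` is equivalent to `NewTyped[*schema.Message]`. |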
+
 > 💡
-> `New` returns a `ChatModelAgentMiddleware`, which offers better context propagation (it modifies the Agent's instruction and tools at run time through the `BeforeAgent` hook).
+> **Deprecated**: `NewMiddleware(ctx, *Config) (AgentMiddleware, error)` is the legacy constructor; new code should use `New`. `NewMiddleware` returns the `AgentMiddleware` struct, which lacks the flexibility of the `BeforeAgent` hook. It also enables "large result offloading" by default (see below), a feature that has been removed from the `New` path.

-## MiddlewareConfig options
+---

-```go
-type MiddlewareConfig struct {
-    // Backend provides file system operations
-    // Required
-    Backend filesystem.Backend
-
-    // Shell provides shell command execution
-    // If set, the execute tool is registered
-    // Optional, mutually exclusive with StreamingShell
-    Shell filesystem.Shell
-
-    // StreamingShell provides streaming shell command execution
-    // If set, the streaming execute tool is registered (supports real-time output)
-    // Optional, mutually exclusive with Shell
-    StreamingShell filesystem.StreamingShell
-
-    // Per-tool configuration below, all optional
-    LsToolConfig        *ToolConfig // ls tool configuration
-    ReadFileToolConfig  *ToolConfig // read_file tool configuration
-    WriteFileToolConfig *ToolConfig // write_file tool configuration
-    EditFileToolConfig  *ToolConfig // edit_file tool configuration
-    GlobToolConfig      *ToolConfig // glob tool configuration
-    GrepToolConfig      *ToolConfig // grep tool configuration
-
-    // CustomSystemPrompt overrides the default system prompt
-    // Optional, defaults to ToolsSystemPrompt
-    CustomSystemPrompt *string
-
-    // The following fields are deprecated; use the matching *ToolConfig.Desc instead
-    // CustomLsToolDesc, CustomReadFileToolDesc, CustomGrepToolDesc,
-    // CustomGlobToolDesc, CustomWriteFileToolDesc, CustomEditToolDesc
-}
-```
+## MiddlewareConfig
+
+`MiddlewareConfig` is the configuration struct used by `New` / `NewTyped`.

-### ToolConfig
+### Core fields

-Each tool can be configured independently through `ToolConfig`:
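+| Field | Type | Description |
+| --- | --- | --- |
+| `Backend` | `filesystem.Backend` | Required. Provides the file system operations behind the 6 tools ls, read\_file, write\_file, edit\_file, glob, and grep. The interface is defined in the `github.com/cloudwego/eino/adk/filesystem` package. |
+| `Shell` | `filesystem.Shell` | Optional. Provides command execution; when set, the `execute` tool is registered. Mutually exclusive with `StreamingShell` |
+| `StreamingShell` | `filesystem.StreamingShell` | Optional. Provides streaming command execution; when set, the streaming `execute` tool is registered. Mutually exclusive with `Shell` |
+| `UseMultiModalRead` | `bool` | Optional, defaults to `false`. When enabled, the `read_file` tool becomes an `EnhancedInvokableTool` that can return multimodal content such as images and PDFs. The Backend must also implement the filesystem.MultiModalReader interface |
+| `CustomSystemPrompt` | `*string` | Optional. Overrides the system prompt appended to the Agent Instruction. When `nil`, no system prompt is appended |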
+
+### Tool configuration fields
+
+Each tool has a matching `*ToolConfig` field for customizing its name or description, replacing its implementation, or disabling it:
+
+| Field | Tool |
+| --- | --- |
+| `LsToolConfig` | ls |
+| `ReadFileToolConfig` | read\_file |
+| `WriteFileToolConfig` | write\_file |
+| `EditFileToolConfig` | edit\_file |
+| `GlobToolConfig` | glob |
+| `GrepToolConfig` | grep |
+
+> The `execute` tool currently cannot be customized through `ToolConfig`; whether it is registered depends only on whether `Shell` / `StreamingShell` is set.
+
+---
+
+## ToolConfig

 ```go
 type ToolConfig struct {
-    // Name overrides the tool name
-    // Optional; the default name (such as "ls" or "read_file") is used when unset
-    Name string
-
-    // Desc overrides the tool description
-    // Optional; the default description is used when unset
-    Desc *string
-
-    // CustomTool provides a custom tool implementation
-    // If set, it replaces the default Backend-based implementation
-    // Optional
-    CustomTool tool.BaseTool
-
-    // Disable disables this tool
-    // If true, the tool is not registered
-    // Optional, defaults to false
-    Disable bool
+    Name       string        // overrides the tool name; an empty string uses the default
+    Desc       *string       // overrides the tool description; nil uses the default
+    CustomTool tool.BaseTool // custom tool implementation; replaces the Backend default when set
+    Disable    bool          // when true, the tool is not registered
 }
 ```

-Example: customize a tool name and disable writes:
+**Priority**: `Disable=true` > `CustomTool` > Backend default implementation.
+
+---
+
+## Tool name constants

 ```go
-middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{
-    Backend: myBackend,
-    ReadFileToolConfig: &filesystem.ToolConfig{
-        Name: "cat_file", // custom name
-    },
-    WriteFileToolConfig: &filesystem.ToolConfig{
-        Disable: true, // disable the write tool
-    },
-})
+const (
+    ToolNameLs        = "ls"
+    ToolNameReadFile  = "read_file"
+    ToolNameWriteFile = "write_file"
+    ToolNameEditFile  = "edit_file"
+    ToolNameGlob      = "glob"
+    ToolNameGrep      = "grep"
+    ToolNameExecute   = "execute"
+)
 ```

+---
+
 ## Injected tools

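-| Tool | Default name | Description | Condition |
-| --- | --- | --- | --- |
-| List directory | `ls` | Lists the files and directories under the given path | Injected when Backend is not nil |
-| Read file | `read_file` | Reads file content, with line-based paging (offset + limit) | Injected when Backend is not nil |
-| Write file | `write_file` | Creates or overwrites a file | Injected when Backend is not nil |
-| Edit file | `edit_file` | Replaces a string in a file | Injected when Backend is not nil |
-| Glob search | `glob` | Finds files by glob pattern | Injected when Backend is not nil |
-| Content search | `grep` | Searches file content by pattern, with several output modes | Injected when Backend is not nil |
-| Command execution | `execute` | Runs a shell command | Requires Shell or StreamingShell to be configured |
+| Tool | Default name | Registration condition | What it does |
+| --- | --- | --- | --- |
+| ls | `ls` | Backend ≠ nil | Lists the files and subdirectories in a directory |
+| read\_file | `read_file` | Backend ≠ nil | Reads file content with offset/limit paging. With `UseMultiModalRead` enabled it can read images and PDFs |
+| write\_file | `write_file` | Backend ≠ nil | Creates a file or overwrites an existing one |
+| edit\_file | `edit_file` | Backend ≠ nil | Edits by exact string replacement; supports `replace_all` |
+| glob | `glob` | Backend ≠ nil | Matches file paths by glob pattern |
+| grep | `grep` | Backend ≠ nil | Searches file content by regular expression, with several output modes and paging |
+| execute | `execute` | Shell ≠ nil or StreamingShell ≠ nil | Runs a shell command |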
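
The priority rule given for `ToolConfig` (`Disable=true` > `CustomTool` > Backend default) can be read as a small decision function. The sketch below is only an illustration of that ordering: `ToolConfig` here is a simplified local stand-in (a boolean replaces the real `CustomTool` field), `toolSource` is a hypothetical name, and a non-nil Backend is assumed because the field is required.

```go
package main

// ToolConfig is a simplified stand-in for the real struct; HasCustomTool
// replaces the CustomTool field for illustration.
type ToolConfig struct {
	HasCustomTool bool
	Disable       bool
}

// toolSource (hypothetical) reports which implementation backs a file tool.
// It returns "" when the tool is not registered, "custom" when a custom tool
// takes over, and "backend" for the default Backend-based implementation.
func toolSource(cfg *ToolConfig) string {
	if cfg == nil {
		return "backend" // no per-tool config: use the default
	}
	if cfg.Disable {
		return "" // Disable wins over everything else
	}
	if cfg.HasCustomTool {
		return "custom"
	}
	return "backend"
}
```

Checking `Disable` first is what makes a tool with both `Disable: true` and a custom implementation stay unregistered.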
-Each tool can be disabled (`Disable: true`) or given a custom implementation (`CustomTool`) through its `*ToolConfig`.
+---

-## Multilingual support
+## Backend interface

-Tool descriptions and built-in prompts default to English. To switch to Chinese, use `adk.SetLanguage()`:
+`Backend` is defined in the `github.com/cloudwego/eino/adk/filesystem` package. The middleware package re-exports the request/response types through type aliases (such as `ReadRequest` and `FileContent`), but **the Backend interface itself must be referenced from the adk/filesystem package**.

 ```go
-import "github.com/cloudwego/eino/adk"
-
-adk.SetLanguage(adk.LanguageChinese) // switch to Chinese
-adk.SetLanguage(adk.LanguageEnglish) // switch to English (default)
+type Backend interface {
+    LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error)
+    Read(ctx context.Context, req *ReadRequest) (*FileContent, error)
+    GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error)
+    GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error)
+    Write(ctx context.Context, req *WriteRequest) error
+    Edit(ctx context.Context, req *EditRequest) error
+}
 ```

-You can also customize each tool's text through `ToolConfig.Desc` or `CustomSystemPrompt`.
+### Shell and StreamingShell

-## [deprecated] Tool result offloading
+```go
+type Shell interface {
+    Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error)
+}

-> 💡
-> This feature will be deprecated in 0.8.0. Please migrate to Middleware: ToolReduction
+type StreamingShell interface {
+    ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error)
+}
+```

-> Note: tool result offloading is available only with the legacy `Config` + `NewMiddleware` function. The recommended `MiddlewareConfig` + `New` does not include it; use the ToolReduction middleware alongside if you need it.
+The two are mutually exclusive; only one may be set. `StreamingShell` supports streaming output and suits long-running commands.

-When a tool call result is too large (for example reading a large file, or grep matching a lot of content), putting the full result into the conversation context leads to:
+---

-- a sharp increase in tokens
-- pollution of the Agent's history context
-- worse inference efficiency
+## MultiModalReader extension interface

-For this reason, the legacy Middleware (`NewMiddleware`) provides an automatic offloading mechanism:
+When `UseMultiModalRead = true`, the Backend must additionally implement the `MultiModalReader` interface:

-- When the result size exceeds the threshold (20,000 tokens by default), the full content is not returned to the LLM directly
-- The actual result is saved to the file system (Backend)
-- The context contains only a summary and the file path (the Agent can call the `read_file` tool again to read it on demand)
+```go
+type MultiModalReader interface {
+    MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error)
+}
+```

-The feature is enabled by default and can be configured through `Config` (not `MiddlewareConfig`):
+**Behavior**:

-```go
-type Config struct {
-    // ... Backend, Shell, StreamingShell, ToolConfig and other fields are the same as in MiddlewareConfig
+- The `read_file` tool is upgraded from `InvokableTool` to `EnhancedInvokableTool` and returns multimodal results through `schema.ToolResult.Parts`
+- The default implementation can read image files (PNG, JPG, etc.) and PDF files (the `pages` parameter selects a page range, at most 20 pages per call)
+- A multimodal capability suffix is appended to the tool description automatically; it is not appended if the description was customized through `ReadFileToolConfig.Desc`

-    // disable automatic offloading
-    WithoutLargeToolResultOffloading bool
+> 💡
+> When using `ChatModelAgentMiddleware`, the `WrapEnhancedInvokableToolCall` method must be implemented for the multimodal read\_file tool to take effect.
+
+```go
+// MultiModalReadRequest extends ReadRequest
+type MultiModalReadRequest struct {
+    ReadRequest
+    Pages string // PDF page range, such as "1-5", "3", "10-20"
+}

-    // custom trigger threshold (default 20000 tokens)
-    LargeToolResultOffloadingTokenLimit int
+// MultiFileContent is the returned result
+type MultiFileContent struct {
+    *FileContent           // plain-text result
+    Parts []FileContentPart // multimodal result (mutually exclusive with FileContent; FileContent is ignored when Parts is non-empty)
+}

-    // custom path generator for offloaded files
-    // default path format: /large_tool_result/{ToolCallID}
-    LargeToolResultOffloadingPathGen func(ctx context.Context, input *compose.ToolInput) (string, error)
+type FileContentPart struct {
+    Type     FileContentPartType // "image" or "pdf"
+    MIMEType string              // such as "image/png", "application/pdf"
+    Data     []byte              // raw binary data
 }
 ```
+
+---
+
+## Deprecated: legacy Config and large result offloading
+
+> 💡
+> The following applies only to the legacy `NewMiddleware` + `Config` path. The `New` / `NewTyped` path **does not include** large result offloading.
+
+On top of `MiddlewareConfig`, the legacy `Config` additionally provides a Large Tool Result Offloading mechanism:
+
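+| Field | Description |
+| --- | --- |
+| `WithoutLargeToolResultOffloading bool` | Set to `true` to disable offloading; defaults to `false` (enabled) |
+| `LargeToolResultOffloadingTokenLimit int` | Token threshold; defaults to `20000` |
+| `LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error)` | Offload path generator; defaults to `/large_tool_result/{ToolCallID}` |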
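+
+**Trigger condition**: offloading is triggered when the number of characters in the tool result is greater than `tokenLimit × 4`.
+
+**Offload behavior**: the full result is written to a file through `Backend.Write` and the original return value is replaced with a summary (the first 10 lines plus a file path hint). The Agent can read the full result page by page with `read_file`.
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md b/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md
index 05419d80660..2bc6c0a9782 100644
--- a/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md
+++ b/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md
@@ -1,521 +1,116 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: Agent 协作
 weight: 4
 ---

-# Agent collaboration
+# Multi-Agent collaboration

-The overview document already gives a basic description of Agent collaboration. The following walks through the design and implementation of the collaboration and composition primitives alongside the code:
+Eino ADK provides two main ways for Agents to collaborate:

-## Collaboration primitives
+## AgentAsTool (recommended)

-### Ways Agents collaborate
+A sub-Agent is wrapped as a Tool, and the parent Agent decides on its own when to call it through a ToolCall. The sub-Agent runs independently and its result is returned to the parent Agent's context.
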
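-| Collaboration mode | Description |
-| --- | --- |
-| Transfer | Hands the task over to another Agent directly. The current Agent exits once it finishes and does not track the execution state of the Agent it transferred to |
-| ToolCall (AgentAsTool) | Calls an Agent as a ToolCall, waits for its response, and can take the called Agent's output into the next round of processing |
-
-### Context strategy for AgentInput
-
-| Context strategy | Description |
-| --- | --- |
-| Full upstream conversation | Receives the complete conversation record of this Agent's upstream Agent |
-| Fresh task description | Ignores the upstream Agent's complete conversation record and provides a fresh task summary as the sub-Agent's AgentInput |
-
-### Decision autonomy
-
-| Decision autonomy | Description |
-| --- | --- |
-| Autonomous | Inside the Agent, based on its available downstream Agents, it picks a downstream Agent on its own when it needs help. The decision is usually made by an LLM, but even when it follows preset logic it still counts as autonomous when viewed from outside the Agent |
-| Preset | The next Agent to run after an Agent finishes is fixed in advance. The execution order is predetermined and predictable |
-
-### Composition primitives
-
-| Type | Description | Run mode | Collaboration mode | Context strategy | Decision autonomy |
-| --- | --- | --- | --- | --- | --- |
-| SubAgents | Combines the user-supplied agent as the parent Agent with the user-supplied subAgents list as child Agents into an Agent that decides autonomously; Name and Description identify and describe that Agent. Currently an Agent can have only one parent Agent. The SetSubAgents function can build a tree-shaped Multi-Agent. Within that tree, each AgentName must be unique |  | Transfer | Full upstream conversation | Autonomous |
-| Sequential | Combines the user-supplied SubAgents list into a Sequential Agent that runs them one after another; Name and Description identify and describe the Sequential Agent. When it runs, the SubAgents are executed in order and the run ends after every Agent has executed once. |  | Transfer | Full upstream conversation | Preset |
-| Parallel | Combines the user-supplied SubAgents list into a Parallel Agent that runs them concurrently on the same context; Name and Description identify and describe the Parallel Agent. When it runs, the SubAgents execute concurrently and the run ends after all of them finish. |  | Transfer | Full upstream conversation | Preset |
-| Loop | Combines the user-supplied SubAgents list into a Loop Agent that runs them in array order, over and over; Name and Description identify and describe the Loop Agent. When it runs, the SubAgents execute in order and the run ends after all of them finish. |  | Transfer | Full upstream conversation | Preset |
-| AgentAsTool | Converts an Agent into a Tool that other Agents use like an ordinary Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided in adk supports AgentAsTool |  | ToolCall | Fresh task description | Autonomous |
-
-## Context passing
-
-When building a multi-Agent system, it is essential that Agents share information efficiently and accurately. Eino ADK provides two core context-passing mechanisms for different collaboration needs: History and SessionValues.
-
-### History
-
-#### Concept
-
-History corresponds to the "full upstream conversation" context strategy. Every AgentEvent produced by any Agent in a multi-Agent system is saved to History. When a new Agent is invoked (Workflow / Transfer), the AgentEvents in History are converted and appended to its AgentInput.
-
-By default, Assistant or Tool Messages from other Agents are converted to User Messages. This amounts to telling the current LLM: "Just now, Agent_A called some_tool, which returned some_result. Now it is your turn to decide."
-
-In this way, the behavior of other Agents is treated as "external information" or a "statement of fact" given to the current Agent rather than as its own behavior, which avoids confusing the LLM's context.
-
-In Eino ADK, when the AgentInput for an Agent is built, the History it can see is "all AgentEvents produced before me".
-
-ParallelWorkflowAgent deserves a note: two sub-Agents running in parallel (A and B) cannot see each other's AgentEvents during execution, because neither of them runs before the other.
-
-#### RunPath
-
-Every AgentEvent in History is produced "by a specific Agent in a specific execution sequence", so each AgentEvent has its own RunPath. RunPath exists only to convey this information and carries no other function in the eino framework.
-
-The table below gives the concrete RunPath of each Agent at execution time under the various orchestration modes:
-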
    ExampleRunPath
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent]
  • Agent(after function call): [Agent]
  • Agent1: [SequentialAgent, LoopAgent, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2]
  • Agent1: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1, Agent2]
  • Agent3: [SequentialAgent, LoopAgent, Agent3]
  • Agent4: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent4]
  • Agent5: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent5]
  • Agent6: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent6]
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent, SubAgent, Agent]
  • - -#### 自定义 - -有些情况下在 Agent 运行前需要对 History 的内容进行调整,此时通过 AgentWithOptions 可以自定义 Agent 从 History 中生成 AgentInput 的方式: - -```go -// github.com/cloudwego/eino/adk/flow.go - -type HistoryRewriter func(ctx context.Context, entries []*HistoryEntry) ([]Message, error) - -func WithHistoryRewriter(h HistoryRewriter) AgentOption -``` - -### SessionValues - -#### 概念 +这是最灵活、最可组合的协作模式: -SessionValues 是在一次运行中持续存在的全局临时 KV 存储,用于支持跨 Agent 的状态管理和数据共享,一次运行中的任何 Agent 可以在任何时间读写 SessionValues。 - -Eino ADK 提供了多种方法供 Agent 运行时内部并发安全的读写 Session Values: - -```go -// github.com/cloudwego/eino/adk/runctx.go - -// 获取全部 SessionValues -func GetSessionValues(ctx context.Context) map[string]any -// 批量设置 SessionValues -func AddSessionValues(ctx context.Context, kvs map[string]any) -// 指定 key 获取 SessionValues 中的一个值,key 不存在时第二个返回值为 false,否则为 true -func GetSessionValue(ctx context.Context, key string) (any, bool) -// 设置单个 SessionValues -func AddSessionValue(ctx context.Context, key string, value any) -``` - -需要注意的是,由于 SessionValues 机制基于 Context 来实现,而 Runner 运行会对 Context 重新初始化,因此在 Run 方法外通过 `AddSessionValues` 或 `AddSessionValue` 注入 SessionValues 是不生效的。 - -如果您需要在 Agent 运行前就注入数据到 SessionValues 中,需要使用专用的 Option 来协助实现,用法如下: - -```go -// github.com/cloudwego/eino/adk/call_option.go -// WithSessionValues 在 Agent 运行前注入 SessionValues -func WithSessionValues(v map[string]any) AgentRunOption - -// 用法: -runner := adk.NewRunner(ctx, adk.RunnerConfig{Agent: agent}) -iterator := runner.Run(ctx, []adk.Message{schema.UserMessage("xxx")}, - adk.WithSessionValues(map[string]any{ - PlanSessionKey: 123, - UserInputSessionKey: []adk.Message{schema.UserMessage("yyy")}, - }), -) -``` - -## Transfer SubAgents - -### 概念 - -Transfer 对应【Transfer 协作方式】,Agent 运行时产生带有包含 TransferAction 的 AgentEvent 后,Eino ADK 会调用 Action 指定的 Agent,被调用的 Agent 被称为子 Agent(SubAgent)。 - -TransferAction 可以使用 `NewTransferToAgentAction` 快速创建: - -```go -import "github.com/cloudwego/eino/adk" - -event := adk.NewTransferToAgentAction("dest agent name") -``` - 
-为了让 Eino ADK 在接受到 TransferAction 可以找到子 Agent 实例并运行,在运行前需要先调用 `SetSubAgents` 将可能的子 Agent 注册到 Eino ADK 中: - -```go -// github.com/cloudwego/eino/adk/flow.go -func SetSubAgents(ctx context.Context, agent Agent, subAgents []Agent) (Agent, error) -``` - -> 💡 -> Transfer 的含义是将任务**移交**给子 Agent,而不是委托或者分配,因此: -> -> 1. 区别于 ToolCall,通过 Transfer 调用子 Agent,子 Agent 运行结束后,不会再调用父 Agent 总结内容或进行下一步操作。 -> 2. 调用子 Agent 时,子 Agent 的输入仍然是原始输入,父 Agent 的输出会作为上下文供子 Agent 参考。 - -在触发 SetSubAgents 时,父子 Agent 双方都需要进行处理来完成初始化操作,Eino ADK 定义了 `OnSubAgents` 接口用于支持此功能: - -```go -// github.com/cloudwego/eino/adk/interface.go -type OnSubAgents interface { - OnSetSubAgents(ctx context.Context, subAgents []Agent) error - OnSetAsSubAgent(ctx context.Context, parent Agent) error - OnDisallowTransferToParent(ctx context.Context) error -} -``` - -如果 Agent 实现了 `OnSubAgents` 接口,`SetSubAgents` 中会调用相应的方法向 Agent 注册,例如 `ChatModelAgent` 的实现 - -### 示例 - -接下来以一个多功能对话 Agent 演示 Transfer 能力,目标是搭建一个可以查询天气或者与用户对话的 Agent,Agent 结构如下: - - - -三个 Agent 均使用 ChatModelAgent 实现: +- 父 Agent 保持控制权,可基于子 Agent 结果继续推理 +- 子 Agent 接收独立的任务描述,不继承父 Agent 的完整对话历史 +- 多个子 Agent 可并行调用 ```go import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" ) -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -type GetWeatherInput struct { - City string `json:"city"` -} - -func NewWeatherAgent() adk.Agent { - weatherTool, err := utils.InferTool( - "get_weather", - "Gets the current weather for a specific city.", // English description - 
func(ctx context.Context, input *GetWeatherInput) (string, error) { - return fmt.Sprintf(`the temperature in %s is 25°C`, input.City), nil - }, - ) - if err != nil { - log.Fatal(err) - } - - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WeatherAgent", - Description: "This agent can get the current weather for a given city.", - Instruction: "Your sole purpose is to get the current weather for a given city by using the 'get_weather' tool. After calling the tool, report the result directly to the user.", - Model: newChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{weatherTool}, - }, - }, - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewChatAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ChatAgent", - Description: "A general-purpose agent for handling conversational chat.", // English description - Instruction: "You are a friendly conversational assistant. Your role is to handle general chit-chat and answer questions that are not related to any specific tool-based tasks.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewRouterAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "RouterAgent", - Description: "A manual router that transfers tasks to other expert agents.", - Instruction: `You are an intelligent task router. 
Your responsibility is to analyze the user's request and delegate it to the most appropriate expert agent.If no Agent can handle the task, simply inform the user it cannot be processed.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} -``` - -之后使用 Eino ADK 的 Transfer 能力搭建 Multi-Agent 并运行,ChatModelAgent 实现了 OnSubAgent 接口,在 adk.SetSubAgents 方法中会使用此接口向 ChatModelAgent 注册父/子 Agent,不需要用户处理 TransferAction 生成问题: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" -) - -func main() { - weatherAgent := NewWeatherAgent() - chatAgent := NewChatAgent() - routerAgent := NewRouterAgent() - - ctx := context.Background() - a, err := adk.SetSubAgents(ctx, routerAgent, []adk.Agent{chatAgent, weatherAgent}) - if err != nil { - log.Fatal(err) - } - - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - - // query weather - println("\n\n>>>>>>>>>query weather<<<<<<<<<") - iter := runner.Query(ctx, "What's the weather in Beijing?") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } - - // failed to route - println("\n\n>>>>>>>>>failed to route<<<<<<<<<") - iter = runner.Query(ctx, "Book me a flight from New York to London tomorrow.") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } -} -``` - -运行结果: - -```yaml ->>>>>>>>>query weather<<<<<<<<< -Agent[RouterAgent]: 
-assistant: -tool_calls: -{Index: ID:call_SKNsPwKCTdp1oHxSlAFt8sO6 Type:function Function:{Name:transfer_to_agent Arguments:{"agent_name":"WeatherAgent"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{201 17 218} -====== -Agent[RouterAgent]: transfer to WeatherAgent -====== -Agent[WeatherAgent]: -assistant: -tool_calls: -{Index: ID:call_QMBdUwKj84hKDAwMMX1gOiES Type:function Function:{Name:get_weather Arguments:{"city":"Beijing"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{255 15 270} -====== -Agent[WeatherAgent]: -tool: the temperature in Beijing is 25°C -tool_call_id: call_QMBdUwKj84hKDAwMMX1gOiES -tool_call_name: get_weather -====== -Agent[WeatherAgent]: -assistant: The current temperature in Beijing is 25°C. -finish_reason: stop -usage: &{286 11 297} -====== - ->>>>>>>>>failed to route<<<<<<<<< -Agent[RouterAgent]: -assistant: I'm unable to assist with booking flights. Please use a relevant travel service or booking platform to make your reservation. -finish_reason: stop -usage: &{206 23 229} -====== -``` - -OnSubAgents 的另外两个方法在 Agent 作为 SetSubAgents 中的子 Agent 时被调用: - -- OnSetAsSubAgent 用来注册向 Agent 注册其父 Agent 信息 -- OnDisallowTransferToParent 在 Agent 设置 WithDisallowTransferToParent option 时会被调用,用来告知 Agent 不要产生向父 Agent 的 TransferAction。 - -```go -adk.SetSubAgents( - ctx, - Agent1, - []adk.Agent{ - adk.AgentWithOptions(ctx, Agent2, adk.WithDisallowTransferToParent()), +// 创建子 Agent +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "搜索并总结相关信息", + Instruction: "你是一个研究助手...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool}, + }, }, -) +}) + +// 包装为 Tool +agentTool := adk.NewAgentTool(ctx, subAgent) + +// 父 Agent 注册子 Agent Tool +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "coordinator", + Description: "协调任务的主 Agent", + Instruction: "你是一个任务协调者...", + Model: chatModel, + 
ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) ``` -### 静态配置 Transfer - -AgentWithDeterministicTransferTo 是一个 Agent Wrapper,在原 Agent 执行完后生成预设的 TransferAction,从而实现静态配置 Agent 跳转的能力: - -```go -// github.com/cloudwego/eino/adk/flow.go +### AgentTool 选项 -type DeterministicTransferConfig struct { - Agent Agent - ToAgentNames []string -} - -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent -``` - -在 Supervisor 模式中,子 Agent 执行完毕后固定回到 Supervisor,由 Supervisor 生成下一步任务目标。此时可以使用 AgentWithDeterministicTransferTo: + + + + +
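+| Option | Description |
+| --- | --- |
+| `WithFullChatHistoryAsInput()` | Passes the parent Agent's full conversation history as the sub-Agent's input (by default only the model-generated request argument is passed) |
+| `WithAgentInputSchema(schema)` | Customizes the sub-Agent's input schema |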
    - +### 事件流透传 -```go -// github.com/cloudwego/eino/adk/prebuilt/supervisor.go +当 `ToolsConfig.EmitInternalEvents = true` 时,子 Agent 的事件会实时透传到父 Agent 的事件流,允许终端用户看到子 Agent 的中间过程。 -type SupervisorConfig struct { - Supervisor adk.Agent - SubAgents []adk.Agent -} +> 💡 +> 透传的事件不影响父 Agent 的状态或 checkpoint,仅用于用户展示。唯一例外是 Interrupted action,会通过 CompositeInterrupt 跨边界传播以支持中断恢复。 -func NewSupervisor(ctx context.Context, conf *SupervisorConfig) (adk.Agent, error) { - subAgents := make([]adk.Agent, 0, len(conf.SubAgents)) - supervisorName := conf.Supervisor.Name(ctx) - for _, subAgent := range conf.SubAgents { - subAgents = append(subAgents, adk.AgentWithDeterministicTransferTo(ctx, &adk.DeterministicTransferConfig{ - Agent: subAgent, - ToAgentNames: []string{supervisorName}, - })) - } +### 预构建示例:DeepAgents - return adk.SetSubAgents(ctx, conf.Supervisor, subAgents) -} -``` +[DeepAgents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) 是 AgentAsTool 模式的最佳实践:主 Agent 通过 **TaskTool** 将子任务委派给子 Agent 执行,配合 **WriteTodos** 进行任务规划和进度追踪。 ## Workflow Agents -WorkflowAgent 支持以代码中预设好的流程运行 Agents。Eino ADK 提供了三种基础 Workflow Agent:Sequential、Parallel、Loop,它们之间可以互相嵌套以完成更复杂的任务。 - -默认情况下,Workflow 中每个 Agent 的输入由 History 章节中介绍的方式生成,可以通过 WithHistoryRewriter 自定 AgentInput 生成方式。 - -当 Agent 产生 ExitAction Event 后,Workflow Agent 会立刻退出,无论之后有没有其他需要运行的 Agent。 - -详解与用例参考请见:[Eino ADK: Workflow Agents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow) - -### SequentialAgent - -SequentialAgent 会按照你提供的顺序,依次执行一系列 Agent: - - - -```go -type SequentialAgentConfig struct { - Name string - Description string - SubAgents []Agent -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -### LoopAgent - -LoopAgent 基于 SequentialAgent 实现,在 SequentialAgent 运行完成后,再次从头运行: - - - -```go -type LoopAgentConfig struct { - Name string - Description string - SubAgents []Agent - - MaxIterations int // 最大循环次数 -} - -func NewLoopAgent(ctx context.Context, 
config *LoopAgentConfig) (Agent, error)
-```
-
-### ParallelAgent
-
-ParallelAgent runs several Agents concurrently:
+Deterministic orchestration, for multi-step tasks with a fixed flow:
-
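+| Type | Description | Constructor |
+| --- | --- | --- |
+| Sequential | Runs the sub-Agents one after another, in array order | `adk.NewSequentialAgent` |
+| Parallel | Runs all sub-Agents concurrently and finishes once all of them complete | `adk.NewParallelAgent` |
+| Loop | Runs the sub-Agent sequence repeatedly until BreakLoop or until MaxIterations is exceeded | `adk.NewLoopAgent` |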
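-```go
-type ParallelAgentConfig struct {
-    Name        string
-    Description string
-    SubAgents   []Agent
-}
+Workflow Agents pass context to one another through Transfer: the output of an upstream Agent is automatically appended to the input Messages of the downstream Agent.
-func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error)
-```
+# Context Passing
-## AgentAsTool
+## SessionValues
-When an Agent needs only a clear, explicit instruction to run, rather than the full run context (History), it can be converted into a Tool and invoked as one:
+A global KV store shared across Agents; within a single run, any Agent can read and write it in a concurrency-safe way:
 ```go
-func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool
+// Read/write API
+adk.AddSessionValue(ctx, "key", value)
+val, ok := adk.GetSessionValue(ctx, "key")
+adk.AddSessionValues(ctx, map[string]any{"k1": v1, "k2": v2})
+all := adk.GetSessionValues(ctx)
 ```
-Once converted into a Tool, the Agent can be called by any ChatModel that supports function calling, and by any LLM-driven Agent; how it is called depends on the Agent implementation.
-
-Message history isolation: an Agent used as a Tool does not inherit the message history (History) of its parent Agent.
-
-SessionValues sharing: it does, however, share the parent Agent's SessionValues, reading and writing the same KV map.
-
-Internal event emission: an Agent used as a Tool is still an Agent and produces AgentEvents. By default, these internal AgentEvents are not emitted through the `AsyncIterator` returned by `Runner`. In business scenarios that need to expose the internal AgentTool's AgentEvents to the user, enable internal event emission in the `ToolsConfig` of the AgentTool's parent `ChatModelAgent`:
+> 💡
+> SessionValues is implemented on top of Context, and the Runner reinitializes the Context when it runs. To inject data before a run, use the `WithSessionValues` Option:
 ```go
-// from adk/chatmodel.go
-
-type ToolsConfig struct {
-    // other configurations...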
-
-    // EmitInternalEvents indicates whether internal events from agentTool should be emitted
-    // to the parent generator via a tool option injection at run-time.
-    EmitInternalEvents bool
-}
+iter := runner.Run(ctx, messages,
+    adk.WithSessionValues(map[string]any{
+        "user_id": "123",
+    }),
+)
 ```
-
-These internal events do not enter the parent agent's context (apart from the final message, which would enter it anyway), and the various AgentActions do not take effect either (InterruptAction excepted).
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md b/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md
index 1a87c5db017..cf7b28dad77 100644
--- a/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md
+++ b/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md
@@ -1,118 +1,133 @@
 ---
 Description: ""
-date: "2025-11-20"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: Agent Runner and Extensions
 weight: 6
 ---
-# Agent Runner
+# Runner
-## Definition
+Runner is the execution entry point for an Agent. It manages the Agent lifecycle, context initialization, Checkpoint persistence, and interrupt/resume. **Every Agent should be run through a Runner.**
-Runner is the core engine in Eino ADK that executes Agents. Its main job is to manage and control the whole Agent lifecycle, such as handling multi-Agent collaboration and saving and passing context; cross-cutting capabilities such as interrupt and callback also rely on Runner. Every Agent should be run through a Runner.
+## Basic Usage
-## Interrupt & Resume
-
-Agent Runner provides runtime interrupt and resume: a running Agent can actively interrupt its execution and save its current state, and execution can later resume from the interruption point. This is commonly used when the Agent's processing needs external input, involves long waits, or can be paused.
-
-The three key points in a single interrupt-to-resume cycle are described below:
+```go
+import "github.com/cloudwego/eino/adk"
+
+// Create a Runner
+runner := adk.NewRunner(ctx, adk.RunnerConfig{
+    Agent:           agent,
+    EnableStreaming: true,
+    CheckPointStore: store, // optional; required for interrupt/resume
+})
+
+// Option 1: Query, which sends the user's question directly
+iter := runner.Query(ctx, "Search today's news for me")
+
+// Option 2: Run, which takes the full Messages
+iter = runner.Run(ctx, []*schema.Message{
+    schema.UserMessage("Hello"),
+}, adk.WithSessionValues(map[string]any{"user": "alice"}))
+
+// Consume the event stream
+for {
+    event, ok := iter.Next()
+    if !ok {
+        break
+    }
+    // handle event
+}
+```
-1. Interrupted Action: the Agent raises an interrupt event, which Agent Runner intercepts
-2. Checkpoint: after intercepting the event, Agent Runner saves the current run state
-3. 
Resume: once the run conditions are ready again, Agent Runner resumes the run from the checkpoint
+## Generics Support
-### Interrupted Action
+```go
+type TypedRunner[M MessageType] struct { ... }
+type Runner = TypedRunner[*schema.Message]
-During execution, an Agent can actively interrupt the Runner by producing an AgentEvent that contains an Interrupted Action.
+func NewTypedRunner[M MessageType](conf TypedRunnerConfig[M]) *TypedRunner[M]
+```
-When Interrupted in the Event is non-nil, Agent Runner treats it as an interrupt:
+The `*schema.AgenticMessage` path is constructed with `NewTypedRunner`.
-```go
-// github.com/cloudwego/eino/adk/interface.go
-type AgentAction struct {
-    // other actions
-    Interrupted *InterruptInfo
-    // other actions
-}
+## Interrupt & Resume
-// github.com/cloudwego/eino/adk/interrupt.go
-type InterruptInfo struct {
-    Data any
-}
-```
+An Agent can actively interrupt itself while running. The Runner saves the state automatically (this requires a configured `CheckPointStore`), and the run can later resume from the checkpoint.
-When an interrupt occurs, the InterruptInfo struct can carry custom interrupt information. This information:
+### Interrupt
-1. Is passed to the caller, and can be used to explain the reason for the interrupt
-2. Is passed back to the interrupted Agent on resume if the Agent run needs to be resumed later, so the Agent can resume based on it
+An Agent triggers an interrupt by emitting an event that contains `Interrupted`:
 ```go
-// For example, when ChatModelAgent interrupts, it sends the following AgentEvent:
-h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{
-    Interrupted: &InterruptInfo{
-        Data: &ChatModelAgentInterruptInfo{Data: data, Info: info},
+gen.Send(&adk.AgentEvent{
+    Action: &adk.AgentAction{
+        Interrupted: &adk.InterruptInfo{Data: myData},
     },
-}})
+})
 ```
-### State Persistence (Checkpoint)
-
-When the Runner captures an Event carrying an Interrupted Action, it immediately terminates the current execution flow. If:
+### State Persistence
-1. A CheckPointStore is set in the Runner
+After capturing the interrupt, the Runner stores the run state (input, conversation history, InterruptInfo) in the `CheckPointStore`, keyed by CheckPointID:
 ```go
-// github.com/cloudwego/eino/adk/runner.go
-type RunnerConfig struct {
-    // other fields
-    CheckPointStore CheckPointStore
-}
-
-// github.com/cloudwego/eino/adk/interrupt.go
 type CheckPointStore interface {
     Set(ctx context.Context, key string, value []byte) error
     Get(ctx context.Context, key string) ([]byte, bool, error)
 }
 ```
-1. 
A CheckPointID is passed in through the AgentRunOption WithCheckPointID when calling the Runner
+Pass the CheckPointID through an Option when calling:
 ```go
-// github.com/cloudwego/eino/adk/interrupt.go
-func WithCheckPointID(id string) AgentRunOption
+iter := runner.Run(ctx, messages, adk.WithCheckPointID("cp-123"))
 ```
-After terminating the run, the Runner persists the current run state (original input, conversation history, and so on) together with the InterruptInfo raised by the Agent to the CheckPointStore, keyed by CheckPointID.
-
 > 💡
-> To preserve the original types of data held in interfaces, Eino ADK serializes the run state with gob ([https://pkg.go.dev/encoding/gob](https://pkg.go.dev/encoding/gob)). Custom types must therefore be registered in advance with gob.Register or gob.RegisterName (the latter is recommended: the former uses the path plus type name as the default name, so neither the type's location nor its name may change). Eino automatically registers the framework's built-in types.
+> ADK serializes the run state with gob. Custom types must be registered in advance with gob.RegisterName. The framework's built-in types are registered automatically.
-### Resume
-
-After a run is interrupted, call the Runner's Resume method with the CheckPointID used at interruption to resume the run:
+### Resuming
 ```go
-// github.com/cloudwego/eino/adk/runner.go
-func (r *Runner) Resume(ctx context.Context, checkPointID string, opts ...AgentRunOption) (*AsyncIterator[*AgentEvent], error)
+// Simple resume: implicitly resumes all interrupt points
+iter, err := runner.Resume(ctx, "cp-123")
+
+// Targeted resume: specify targets and data
+iter, err = runner.ResumeWithParams(ctx, "cp-123", &adk.ResumeParams{
+    Targets: map[string]any{
+        "agent-address": resumeData,
+    },
+})
 ```
-Resuming an Agent run requires the interrupted Agent to implement the ResumableAgent interface. The Runner reads the run state from the CheckPointStore and resumes the run; the InterruptInfo and the EnableStreaming setting from the previous run are provided to the Agent as input:
+Resuming requires the interrupted Agent to implement the `ResumableAgent` interface:
 ```go
-// github.com/cloudwego/eino/adk/interface.go
-type ResumableAgent interface {
-    Agent
-
-    Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent]
-}
-
-// github.com/cloudwego/eino/adk/interrupt.go
-type ResumeInfo struct {
-    EnableStreaming bool
-    *InterruptInfo
+type TypedResumableAgent[M MessageType] interface {
+    TypedAgent[M]
+    Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]]
 }
 ```
-To pass new information to the Agent on Resume, define an AgentRunOption and pass it in when calling Runner.Resume.
+# Multi-Turn Runtime: TurnLoop
+
+For scenarios that need multi-turn interaction (chat applications, ongoing conversations), ADK provides the `TurnLoop` runtime:
+
+- **Push-based 
event loop**: pushing a new message triggers an Agent run
+- **Preempt**: when the user sends a new message while the Agent is running, the current run can be cancelled
+- **Stop**: stops the event loop
+- **Declarative Checkpoint/Resume**: TurnLoop manages input bookkeeping automatically; the application layer only declares the resume strategy
+
+See: [Agent Cancel and TurnLoop Quickstart](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)
+
+# Agent Cancel
+
+A runtime cancellation capability added in v0.9. It supports:
+
+- **CancelMode bitmask combinations**: `CancelModelStream | CancelToolCalls`
+- **CancelHandle.Wait()**: waits for cancellation to complete
+- **TurnLoop integration**: Preempt triggers Cancel automatically
+
+See: [Agent Cancel and TurnLoop Quickstart](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md
deleted file mode 100644
index 6c4a682268c..00000000000
--- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md
+++ /dev/null
@@ -1,897 +0,0 @@
----
-Description: ""
-date: "2026-03-24"
-lastmod: ""
-tags: []
-title: ChatModelAgent
-weight: 1
----
-
-# ChatModelAgent Overview
-
-## Import Path
-
-`import "github.com/cloudwego/eino/adk"`
-
-## What Is ChatModelAgent
-
-`ChatModelAgent` is a core prebuilt Agent in Eino ADK. It encapsulates the complex logic of interacting with a large language model (LLM) and using tools to complete tasks.
-
-## ChatModelAgent ReAct Pattern
-
-`ChatModelAgent` uses the [ReAct](https://react-lm.github.io/) pattern internally, which aims to solve complex problems by having the ChatModel "think" explicitly, step by step. Once tools are configured for `ChatModelAgent`, its internal execution flow follows the ReAct pattern:
-
-- Call the ChatModel (Reason)
-- The LLM returns a tool call request (Action)
-- ChatModelAgent executes the tool (Act)
-- It returns the tool result to the ChatModel (Observation) and starts a new cycle, until the ChatModel decides no Tool call is needed and finishes.
-
-When no tools are configured, `ChatModelAgent` degrades to a single ChatModel call.
-
-
-
-ToolsConfig can be used to configure Tools for ChatModelAgent:
-
-```go
-// github.com/cloudwego/eino/adk/chatmodel.go
-
-type ToolsConfig struct {
-    compose.ToolsNodeConfig
-
-    // Names of the tools that will make agent return directly when the tool is called.
-    // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned.
- ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} -``` - -ToolsConfig 复用了 Eino Graph ToolsNodeConfig,详细参考:[Eino: ToolsNode&Tool 使用说明](/zh/docs/eino/core_modules/components/tools_node_guide)。额外提供了 ReturnDirectly 配置,ChatModelAgent 调用配置在 ReturnDirectly 中的 Tool 后会直接退出。 - -## ChatModelAgent 配置字段 - -> 💡 -> 注意:GenModelInput 默认情况下,会通过 adk.GetSessionValues() 并以 F-String 的格式渲染 Instruction,如需关闭此行为,可定制 GenModelInput 方法。 - -```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. 
- // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int - - // ModelRetryConfig configures retry behavior for the ChatModel. - // When set, the agent will automatically retry failed ChatModel calls - // based on the configured policy. - // Optional. If nil, no retry will be performed. - ModelRetryConfig *ModelRetryConfig -} - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. - ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} - -type GenModelInput func(ctx context.Context, instruction string, input *AgentInput) ([]Message, error) -``` - -- `Name`:Agent 名称 -- `Description`:Agent 描述 -- `Instruction`:调用 ChatModel 时的 System Prompt,支持 f-string 渲染 -- `Model`:运行所使用的 ChatModel,要求支持工具调用 -- `ToolsConfig`:工具配置 - - ToolsConfig 复用了 Eino Graph ToolsNodeConfig,详细参考:[Eino: ToolsNode&Tool 使用说明](/zh/docs/eino/core_modules/components/tools_node_guide)。 - - ReturnDirectly:当 ChatModelAgent 调用配置在 ReturnDirectly 中的 Tool 后,将携带结果立刻退出,不会按照 react 模式返回 ChatModel。如果命中了多个 Tool,只有首个 Tool 会返回。Map key 为 Tool 名称。 - - EmitInternalEvents:当通过 adk.AgentTool() 将一个 Agent 通过 ToolCall 的形式当成 SubAgent 时,默认情况下,这个 SubAgent 不会发送 AgentEvent,只将最终结果作为 ToolResult 返回。 -- `GenModelInput`:Agent 被调用时会使用该方法将 `Instruction` 和 `AgentInput` 转换为调用 ChatModel 的 Messages。Agent 提供了默认的 GenModelInput 方法: - 1. 将 `Instruction` 作为 `System Message` 加到 `AgentInput.Messages` 前 - 2. 
将 `SessionValues` 为 variables 渲染到步骤 1 的 message list 中 - -> 💡 -> 默认的 `GenModelInput` 使用 pyfmt 渲染,message list 中的文本会被作为 pyfmt 模板,这意味着文本中的 '{' 与 '}' 都会被视为关键字,如果希望直接输入这两个字符,需要进行转义 '{{'、'}}' - -- `OutputKey`:配置后,ChatModelAgent 运行产生的最后一条 Message 将会以 `OutputKey` 为 key 设置到 `SessionValues` 中 -- `MaxIterations`:react 模式下 ChatModel 最大生成次数,超过时 Agent 会报错退出,默认值为 20 -- `Exit`:Exit 是一个特殊的 Tool,当模型调用这个工具并执行后,ChatModelAgent 将直接退出,效果与 `ToolsConfig.ReturnDirectly` 类似。ADK 提供了一个默认 ExitTool 实现供用户使用: - -```go -type ExitTool struct{} - -func (et ExitTool) Info(_ context.Context) (*schema.ToolInfo, error) { - return ToolInfoExit, nil -} - -func (et ExitTool) InvokableRun(ctx context.Context, argumentsInJSON string, _ ...tool.Option) (string, error) { - type exitParams struct { - FinalResult string `json:"final_result"` - } - - params := &exitParams{} - err := sonic.UnmarshalString(argumentsInJSON, params) - if err != nil { - return "", err - } - - err = SendToolGenAction(ctx, "exit", NewExitAction()) - if err != nil { - return "", err - } - - return params.FinalResult, nil -} -``` - -- `ModelRetryConfig`: 配置后,ChatModel 请求过程中发生的各种错误(包括直接返回错误、流式响应过程中发生错误等),都会按照配置的策略选择是否以及何时进行重试。如果是流式响应过程中发生错误,则这一次流式响应依然会第一时间通过 AgentEvent 的形式返回出去。如果这次流式响应过程中的错误,按照配置的策略,会进行重试,则消费 AgentEvent 中的 message stream,会得到 `WillRetryError`。用户可以处理这个 error,做对应的上屏展示等处理,示例如下: - -```go -iterator := agent.Run(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - - if event.Err != nil { - handleFinalError(event.Err) - break - } - - // Process streaming output - if event.Output != nil && event.Output.MessageOutput.IsStreaming { - stream := event.Output.MessageOutput.MessageStream - for { - msg, err := stream.Recv() - if err == io.EOF { - break // Stream completed successfully - } - if err != nil { - // Check if this error will be retried (more streams coming) - var willRetry *adk.WillRetryError - if errors.As(err, &willRetry) { - log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt) - break // 
Wait for next event with new stream - } - // Original error - won't retry, agent will stop and the next AgentEvent probably will be an error - log.Printf("Final error (no retry): %v", err) - break - } - // Display chunk to user - displayChunk(msg) - } - } -} -``` - -## ChatModelAgent Transfer - -`ChatModelAgent` 支持将其他 Agent 的元信息转为自身的 Tool ,经由 ChatModel 判断实现动态 Transfer: - -- `ChatModelAgent` 实现了 `OnSubAgents` 接口,使用 `SetSubAgents` 为 `ChatModelAgent` 设置子 Agents 后,`ChatModelAgent` 会增加一个 `Transfer Tool`,并且在 prompt 中指示 ChatModel 在需要 transfer 时调用这个 Tool 并以 transfer 目标 AgentName 作为 Tool 输入。 - -```go -const ( - TransferToAgentInstruction = `Available other agents: %s - -Decision rule: -- If you're best suited for the question according to your description: ANSWER -- If another agent is better according its description: CALL '%s' function with their agent name - -When transferring: OUTPUT ONLY THE FUNCTION CALL` -) - -func genTransferToAgentInstruction(ctx context.Context, agents []Agent) string { - var sb strings.Builder - for _, agent := range agents { - sb.WriteString(fmt.Sprintf("\n- Agent name: %s\n Agent description: %s", - agent.Name(ctx), agent.Description(ctx))) - } - - return fmt.Sprintf(TransferToAgentInstruction, sb.String(), TransferToAgentToolName) -} -``` - -- `Transfer Tool` 运行会设置 Transfer Event,指定跳转到目标 Agent 上,完成后 ChatModelAgent 退出。 -- Agent Runner 接收到 Transfer Event 后,跳转到目标 Agent 上执行,完成 Transfer 操作 - -## ChatModelAgent AgentAsTool - -当需要被调用的 Agent 不需要完整的运行上下文,仅需要明确清晰的入参即可正确运行时,该 Agent 可以转换为 Tool 交由 `ChatModelAgent` 判断调用: - -- ADK 中提供了工具方法,可以方便地将 Eino ADK Agent 转化为 Tool 供 ChatModelAgent 调用: - -```go -// github.com/cloudwego/eino/adk/agent_tool.go - -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool -``` - -- 被转换为 Tool 后的 Agent 可以通过 `ToolsConfig` 直接注册在 ChatModelAgent 中 - -```go -bookRecommender := NewBookRecommendAgent() -bookRecommendeTool := NewAgentTool(ctx, bookRecommender) - -a, err := adk.NewChatModelAgent(ctx, 
&adk.ChatModelAgentConfig{ - // ... - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{bookRecommendeTool}, - }, - }, -}) -``` - -## ChatModelAgent Middleware - -`ChatModelAgentMiddleware` 是 `ChatModelAgent` 的扩展机制,允许开发者在 Agent 执行的各个阶段注入自定义逻辑: - - - -`ChatModelAgentMiddleware` 定义为 interface,开发者可以实现此 interface 并通过配置到 `ChatModelAgentConfig` 使其在 `ChatModelAgent` 中生效: - -```go -type ChatModelAgentMiddleware interface { - // ... -} - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... - Handlers: []adk.ChatModelAgentMiddleware{ - &MyMiddleware{}, - }, -}) -``` - -**使用 BaseChatModelAgentMiddleware** - -`BaseChatModelAgentMiddleware` 提供所有方法的默认空实现。通过嵌入它,可以只覆盖需要的方法: - -```go -type MyMiddleware struct { - *adk.BaseChatModelAgentMiddleware - // 自定义字段 - logger *log.Logger -} - -// 只需覆盖需要的方法 -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - m.logger.Printf("Messages count: %d", len(state.Messages)) - return ctx, state, nil -} -``` - -### BeforeAgent - -在每次 Agent 运行前调用,可用于修改指令和工具配置。ChatModelAgentContext 定义了 BeforeAgent 中可读写的内容: - -```go -type ChatModelAgentContext struct { - // InstructionAgent 是当前 Agent 的指令 - Instruction string - // Tools 是当前配置的原始工具列表 - Tools []tool.BaseTool - // ReturnDirectly 配置调用后直接返回的工具名称集合 - ReturnDirectly map[string]bool -} - -type ChatModelAgentMiddleware interface { - // ... - BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // ... 
-} -``` - -例子: - -```go -func (m *MyMiddleware) BeforeAgent( - ctx context.Context, - runCtx *adk.ChatModelAgentContext, -) (context.Context, *adk.ChatModelAgentContext, error) { - // 拷贝 runCtx,避免修改输入 - nRunCtx := *runCtx - - // 修改指令 - nRunCtx.Instruction += "\n\n请始终使用中文回复。" - - // 添加工具 - nRunCtx.Tools = append(runCtx.Tools, myCustomTool) - - // 设置工具直接返回 - nRunCtx.ReturnDirectly["my_tool"] = true - - return ctx, &nRunCtx, nil -} -``` - -### BeforeModelRewriteState / AfterModelRewriteState - -在每次模型调用前/后调用,可用于检查和修改消息历史。ModelContext 定义了只读内容,ChatModelAgentState 定义了可读写内容: - -```go -type ModelContext struct { - // Tools 包含当前配置给 Agent 的工具列表 - // 在请求时填充,包含将要发送给模型的工具信息 - Tools []*schema.ToolInfo - - // ModelRetryConfig 包含模型的重试配置 - // 从 Agent 的 ModelRetryConfig 填充 - ModelRetryConfig *ModelRetryConfig -} - -type ChatModelAgentState struct { - // Messages 包含当前会话中的所有消息 - Messages []Message -} - -type ChatModelAgentMiddleware interface { - BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // 拷贝 state,避免修改入参 - nState := *state - - // 检查消息历史 - if len(state.Messages) > 50 { - // 截断过旧的消息 - nState.Messages = state.Messages[len(state.Messages)-50:] - } - return ctx, &nState, nil -} - -func (m *MyMiddleware) AfterModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // 模型响应是最后一条消息 - lastMsg := state.Messages[len(state.Messages)-1] - m.logger.Printf("Model response: %s", lastMsg.Content) - return ctx, state, nil -} -``` - -### WrapModel - 
-包装模型调用,可用于拦截和修改模型的输入输出: - -```go -type ChatModelAgentMiddleware interface { - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) WrapModel( - ctx context.Context, - chatModel model.BaseChatModel, - mc *adk.ModelContext, -) (model.BaseChatModel, error) { - return &loggingModel{ - inner: chatModel, - logger: m.logger, - }, nil -} - -type loggingModel struct { - inner model.BaseChatModel - logger *log.Logger -} - -func (m *loggingModel) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - m.logger.Printf("Input messages: %d", len(msgs)) - resp, err := m.inner.Generate(ctx, msgs, opts...) - m.logger.Printf("Output: %v, error: %v", resp != nil, err) - return resp, err -} - -func (m *loggingModel) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return m.inner.Stream(ctx, msgs, opts...) 
-} -``` - -### WrapInvokableToolCall / WrapStreamableToolCall - -包装工具调用,可用于拦截和修改工具的输入输出: - -```go -// InvokableToolCallEndpoint 是工具调用的函数签名。 -// Middleware 开发者围绕这个 Endpoint 添加自定义逻辑。 -type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) - -// StreamableToolCallEndpoint 是流式工具调用的函数签名。 -// Middleware 开发者围绕这个 Endpoint 添加自定义逻辑。 -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - -type ToolContext struct { - // Name 说明了本次调用工具的名称 - Name string - // CallID 说明了本次调用工具的 ToolCallID - CallID string -} - -type ChatModelAgentMiddleware interface { - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) - WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) WrapInvokableToolCall( - ctx context.Context, - endpoint adk.InvokableToolCallEndpoint, - tCtx *adk.ToolContext, -) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - m.logger.Printf("Calling tool: %s (ID: %s)", tCtx.Name, tCtx.CallID) - start := time.Now() - - result, err := endpoint(ctx, argumentsInJSON, opts...) 
- - m.logger.Printf("Tool %s completed in %v", tCtx.Name, time.Since(start)) - return result, err - }, nil -} -``` - -# ChatModelAgent 使用示例 - -## 场景说明 - -创建一个图书推荐 Agent,Agent 将能够根据用户的输入推荐相关图书。 - -## 代码实现 - -### 步骤 1: 定义工具 - -图书推荐 Agent 需要一个根据能够根据用户要求(题材、评分等)检索图书的工具 `book_search` 。 - -利用 Eino 提供的工具方法可以方便地创建(可参考[如何创建一个 tool ?](/zh/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool)): - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" -) - -type BookSearchInput struct { - Genre string `json:"genre" jsonschema:"description=Preferred book genre,enum=fiction,enum=sci-fi,enum=mystery,enum=biography,enum=business"` - MaxPages int `json:"max_pages" jsonschema:"description=Maximum page length (0 for no limit)"` - MinRating int `json:"min_rating" jsonschema:"description=Minimum user rating (0-5 scale)"` -} - -type BookSearchOutput struct { - Books []string -} - -func NewBookRecommender() tool.InvokableTool { - bookSearchTool, err := utils.InferTool("search_book", "Search books based on user preferences", func(ctx context.Context, input *BookSearchInput) (output *BookSearchOutput, err error) { - // search code - // ... 
- return &BookSearchOutput{Books: []string{"God's blessing on this wonderful world!"}}, nil - }) - if err != nil { - log.Fatalf("failed to create search book tool: %v", err) - } - return bookSearchTool -} -``` - -### 步骤 2: 创建 ChatModel - -Eino 提供了多种 ChatModel 封装(如 openai、gemini、doubao 等,详见 [Eino: ChatModel 使用说明](/zh/docs/eino/core_modules/components/chat_model_guide)),这里以 openai ChatModel 为例: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/components/model" -) - -func NewChatModel() model.ToolCallingChatModel { - ctx := context.Background() - apiKey := os.Getenv("OPENAI_API_KEY") - openaiModel := os.Getenv("OPENAI_MODEL") - - cm, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: apiKey, - Model: openaiModel, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - return cm -} -``` - -### 步骤 3: 创建 ChatModelAgent - -除了配置 ChatModel 和工具外,还需要配置描述 Agent 功能用途的 Name 和 Description,以及指示 ChatModel 的 Instruction,Instruction 最终会作为 system message 被传递给 ChatModel。 - -```go -import ( - "context" - "fmt" - "log" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/compose" -) - -func NewBookRecommendAgent() adk.Agent { - ctx := context.Background() - - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "BookRecommender", - Description: "An agent that can recommend books", - Instruction: `You are an expert book recommender. Based on the user's request, use the "search_book" tool to find relevant books. 
Finally, present the results to the user.`, - Model: NewChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender()}, - }, - }, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - - return a -} -``` - -### - -### 步骤 4: 通过 Runner 运行 - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" - - "github.com/cloudwego/eino-examples/adk/intro/chatmodel/subagents" -) - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - iter := runner.Query(ctx, "recommend a fiction book to me") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======", msg) - } -} -``` - -## 运行结果 - -```yaml -message: -assistant: -tool_calls: -{Index: ID:call_o2It087hoqj8L7atzr70EnfG Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{140 24 164} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_o2It087hoqj8L7atzr70EnfG -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's blessing on this wonderful world!". It's a great choice for readers looking for an exciting story. Enjoy your reading! 
-finish_reason: stop -usage: &{185 31 216} -====== -``` - -# ChatModelAgent 中断与恢复 - -## 介绍 - -`ChatModelAgent` 使用了 Eino Graph 实现,因此在 agent 中可以复用 Eino Graph 的 Interrupt&Resume 能力。 - -- Interrupt 时,通过在工具中返回特殊错误使 Graph 触发中断并向外抛出自定义信息,在恢复时 Graph 会重新运行此工具: - -```go -// github.com/cloudwego/eino/adk/interrupt.go - -func NewInterruptAndRerunErr(extra any) error -``` - -- Resume 时,支持自定义 ToolOption,用于在恢复时传递额外信息到 Tool 中: - -```go -import ( - "github.com/cloudwego/eino/components/tool" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} -``` - -## 示例 - -下面我们将基于上面【ChatModelAgent 使用示例】小节中的代码,为 `BookRecommendAgent` 增加一个工具 `ask_for_clarification`,当用户提供的信息不足以支持推荐时,Agent 将调用这个工具向用户询问更多信息,`ask_for_clarification` 使用了 Interrupt&Resume 能力来实现向用户“询问”。 - -### 步骤 1 : 新增 Tool 支持中断 - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" - "github.com/cloudwego/eino/compose" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} - -type AskForClarificationInput struct { - Question string `json:"question" jsonschema:"description=The specific question you want to ask the user to get the missing information"` -} - -func NewAskForClarificationTool() tool.InvokableTool { - t, err := utils.InferOptionableTool( - "ask_for_clarification", - "Call this tool when the user's request is ambiguous or lacks the necessary information to proceed. 
Use it to ask a follow-up question to get the details you need, such as the book's genre, before you can use other tools effectively.", - func(ctx context.Context, input *AskForClarificationInput, opts ...tool.Option) (output string, err error) { - o := tool.GetImplSpecificOptions[askForClarificationOptions](nil, opts...) - if o.NewInput == nil { - return "", compose.NewInterruptAndRerunErr(input.Question) - } - return *o.NewInput, nil - }) - if err != nil { - log.Fatal(err) - } - return t -} -``` - -### 步骤 2: 添加 Tool 到 Agent 中 - -```go -func NewBookRecommendAgent() adk.Agent { - // xxx - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // xxx - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender(), NewAskForClarificationTool()}, - }, - // Tool 内部通过 AgentTool() 调用 SubAgent 时,是否将这个 SubAgent 的 AgentEvent 输出 - EmitInternalEvents: true, - }, - }) - // xxx -} -``` - -### 步骤 3: Agent Runner 配置 CheckPointStore - -在 Runner 中配置 `CheckPointStore`(例子中使用最简单的 InMemoryStore),并在调用 Agent 时传入 `CheckPointID`,用于在恢复时使用。另外,在中断时,Graph 会将 `InterruptInfo` 放入 `Interrupted.Data` 中: - -```go -func newInMemoryStore() compose.CheckPointStore { - return &inMemoryStore{ - mem: map[string][]byte{}, - } -} - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - EnableStreaming: true, // you can disable streaming here - Agent: a, - CheckPointStore: newInMemoryStore(), - }) - iter := runner.Query(ctx, "recommend a book to me", adk.WithCheckPointID("1")) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil && event.Action.Interrupted != nil { - fmt.Printf("\ninterrupt happened, info: %+v\n", event.Action.Interrupted.Data.(*adk.ChatModelAgentInterruptInfo).RerunNodesExtra["ToolNode"]) - continue - } - msg, err := event.Output.MessageOutput.GetMessage() - if 
err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======\n\n", msg) - } - - scanner := bufio.NewScanner(os.Stdin) - fmt.Print("\nyour input here: ") - scanner.Scan() - fmt.Println() - nInput := scanner.Text() - - iter, err := runner.Resume(ctx, "1", adk.WithToolOptions([]tool.Option{subagents.WithNewInput(nInput)})) - if err != nil { - log.Fatal(err) - } - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - prints.Event(event) - } -} -``` - -### 运行结果 - -运行后会发生中断 - -``` -message: -assistant: -tool_calls: -{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{219 37 256} -====== - - -interrupt happened, info: &{ToolCalls:[{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]}] ExecutedTools:map[] RerunTools:[call_3HAobzkJvW3JsTmSHSBRftaG] RerunExtraMap:map[call_3HAobzkJvW3JsTmSHSBRftaG:Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?]} -your input here: -``` - -stdin 输入后,从 CheckPointStore 取出之前中断状态,结合补全的输入,继续运行 - -``` -new input is: -recommend me a fiction book - -message: -tool: recommend me a fiction book -tool_call_id: call_3HAobzkJvW3JsTmSHSBRftaG -tool_call_name: ask_for_clarification -====== - - -message: -assistant: -tool_calls: -{Index: ID:call_3fC5OqPZLls11epXMv7sZGAF Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{272 24 296} -====== - - -message: -tool: {"Books":["God's 
blessing on this wonderful world!"]} -tool_call_id: call_3fC5OqPZLls11epXMv7sZGAF -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's Blessing on This Wonderful World!" Enjoy your reading! -finish_reason: stop -usage: &{317 20 337} -====== -``` - -# 总结 - -`ChatModelAgent` 是 ADK 核心 Agent 实现,充当应用程序 "思考" 的部分,利用 LLM 强大的功能进行推理、理解自然语言、作出决策、生成相应、进行工具交互。 - -`ChatModelAgent` 的行为是非确定性的,通过 LLM 来动态的决定使用哪些工具,或转交控制权到其他 Agent 上。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md new file mode 100644 index 00000000000..b8fa2373179 --- /dev/null +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md @@ -0,0 +1,306 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: ChatModelAgent +weight: 1 +--- + +# ChatModelAgent 概述 + +`import "github.com/cloudwego/eino/adk"` + +## 什么是 ChatModelAgent + +`ChatModelAgent` 是 Eino ADK 的核心 Agent 实现——以 ChatModel 为决策器、以 Tools 为行动空间、通过 ReAct Loop 自主推进问题求解。 + +关于 ChatModelAgent 的概念、ReAct Loop、Middleware 体系的完整介绍,见:[ChatModelAgent 介绍](/zh/docs/eino/overview/eino_adk_quickstart) + +## ReAct Loop + +当配置了 Tools 时,ChatModelAgent 按 ReAct 模式循环执行: + +1. **Reason**:调用 ChatModel,模型决定下一步行动 +2. **Action**:模型返回 ToolCall 请求 +3. **Act**:执行对应 Tool +4. 
**Observation**:将 Tool 结果注入上下文,开始新一轮循环 + +循环持续直到模型判断无需再调用 Tool。未配置 Tools 时退化为单次 ChatModel 调用。 + +# 配置 + +## TypedChatModelAgentConfig + +```go +type TypedChatModelAgentConfig[M MessageType] struct { + Name string + Description string + Instruction string + + Model model.BaseModel[M] // 必填。使用 Tools 时须支持 model.WithTools + + ToolsConfig ToolsConfig + GenModelInput TypedGenModelInput[M] + + Exit tool.BaseTool // NOT RECOMMENDED + OutputKey string // NOT RECOMMENDED + MaxIterations int // 默认 20 + + Handlers []TypedChatModelAgentMiddleware[M] + Middlewares []AgentMiddleware // 旧版兼容 + + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +// 默认别名 +type ChatModelAgentConfig = TypedChatModelAgentConfig[*schema.Message] +``` + +### 字段说明 + + + + + + + + + + + + + + +
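| Field | Description |
| --- | --- |
| `Name` | Agent name. Required when the agent is used as an AgentTool |
| `Description` | Description of the agent's capabilities. Required when the agent is used as an AgentTool |
| `Instruction` | System prompt. Supports `{Key}` placeholders; the default `GenModelInput` renders them with SessionValues |
| `Model` | Required. Of type `model.BaseModel[M]`; must support `model.WithTools` when Tools are used |
| `ToolsConfig` | Tool configuration; see below |
| `GenModelInput` | Custom input transformation. By default, Instruction is used as the System Message and rendered f-string style |
| `MaxIterations` | Maximum number of ReAct loop iterations; exceeding it exits with an error. Defaults to 20 |
| `Handlers` | Interface-style middleware (`TypedChatModelAgentMiddleware[M]`); recommended |
| `Middlewares` | Struct-style middleware (`AgentMiddleware`); kept for backward compatibility |
| `ModelRetryConfig` | Retry policy applied when a model call fails |
| `ModelFailoverConfig` | Switches to a backup model when a model call fails. Requires `GetFailoverModel` and `ShouldFailover` |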
    + +> 💡 +> 默认 GenModelInput 使用 pyfmt 渲染,Messages 中的 `{` 和 `}` 会被视为占位符。如需直接输出这两个字符,用 `{{` 和 `}}` 转义。 + +### ToolsConfig + +```go +type ToolsConfig struct { + compose.ToolsNodeConfig + + ReturnDirectly map[string]bool // 调用后直接返回的 Tool 名称 + EmitInternalEvents bool // 透传 AgentTool 内部事件 +} +``` + +- **ReturnDirectly**:命中的 Tool 执行后 Agent 立即退出,不再回调模型。多个命中时取首个 +- **EmitInternalEvents**:当子 Agent 通过 AgentTool 调用时,将子 Agent 事件实时透传到父 Agent 事件流 + +### 构造函数 + +```go +func NewChatModelAgent(ctx context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) +func NewTypedChatModelAgent[M MessageType](ctx context.Context, config *TypedChatModelAgentConfig[M]) (*TypedChatModelAgent[M], error) +``` + +# Middleware(ChatModelAgentMiddleware) + +## 接口定义 + +```go +type TypedChatModelAgentMiddleware[M MessageType] interface { + BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + AfterAgent(ctx context.Context, state *TypedChatModelAgentState[M]) (context.Context, error) + + BeforeModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + AfterModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + + WrapModel(ctx context.Context, m model.BaseModel[M], mc *TypedModelContext[M]) (model.BaseModel[M], error) + + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint 
EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) +} + +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +``` + +使用 `*BaseChatModelAgentMiddleware` 嵌入可只覆盖需要的方法: + +```go +type MyMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *MyMiddleware) BeforeModelRewriteState( + ctx context.Context, + state *adk.ChatModelAgentState, + mc *adk.ModelContext, +) (context.Context, *adk.ChatModelAgentState, error) { + // 自定义逻辑 + return ctx, state, nil +} +``` + +## 钩子点位 + + + + + + + + + +
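| Hook | Timing | What it can modify |
| --- | --- | --- |
| `BeforeAgent` | Before the agent runs (once only) | Instruction, Tools, ReturnDirectly, ToolSearchTool |
| `AfterAgent` | After the agent finishes successfully | Reads the final state (no modification) |
| `BeforeModelRewriteState` | Before each model call | Messages, ToolInfos, DeferredToolInfos (persisted to state) |
| `AfterModelRewriteState` | After each model call | Messages (including the model response), ToolInfos (persisted to state) |
| `WrapModel` | Wraps the model call | Retry, failover, event sending (do not modify Messages) |
| `WrapToolCall` | Wraps tool calls | Permission checks, logging, output rewriting |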
    + +> 💡 +> `BeforeModelRewriteState` 返回的 state 会被框架持久化到 agent 内部状态。因此该钩子中的修改(如压缩 Messages、过滤 ToolInfos)会影响后续所有迭代。 + +## 核心类型 + +### ChatModelAgentContext(BeforeAgent 参数) + +```go +type ChatModelAgentContext struct { + Instruction string + Tools []tool.BaseTool + ReturnDirectly map[string]bool + ToolSearchTool *schema.ToolInfo // 模型原生 ToolSearch 能力 +} +``` + +### ChatModelAgentState(BeforeModel/AfterModel 参数) + +```go +type TypedChatModelAgentState[M MessageType] struct { + Messages []M + ToolInfos []*schema.ToolInfo // 传给模型的工具列表 + DeferredToolInfos []*schema.ToolInfo // 服务端延迟检索的工具列表 +} + +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +``` + +### ModelContext(WrapModel 参数) + +```go +type TypedModelContext[M MessageType] struct { + Tools []*schema.ToolInfo // Deprecated: 用 state.ToolInfos + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +type ModelContext = TypedModelContext[*schema.Message] +``` + +## 执行顺序 + +**模型调用链**(外到内): + +1. `AgentMiddleware.BeforeChatModel` +2. **BeforeModelRewriteState** +3. failover wrapper(内置) +4. retry wrapper(内置) +5. event sender wrapper(内置) +6. **WrapModel**(先注册 = 最外层) +7. callback injection(内置) +8. 实际模型调用 +9. **AfterModelRewriteState** +10. `AgentMiddleware.AfterChatModel` + +**工具调用链**(外到内): + +1. event sender(内置) +2. `ToolsConfig.ToolCallMiddlewares` +3. `AgentMiddleware.WrapToolCall` +4. **WrapToolCall**(先注册 = 最外层) +5. callback injection(内置) +6. 实际工具调用 + +# AgentAsTool + +将子 Agent 包装为 Tool,父 Agent 通过 ToolCall 自主调用: + +```go +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "搜索并总结信息", + Model: chatModel, + // ... +}) + +agentTool := adk.NewAgentTool(ctx, subAgent) + +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... 
+    ToolsConfig: adk.ToolsConfig{
+        ToolsNodeConfig: compose.ToolsNodeConfig{
+            Tools: []tool.BaseTool{agentTool},
+        },
+    },
+})
+```
+
+Generic version: `adk.NewTypedAgentTool[M](ctx, agent, options...)`
+
+Options: `WithFullChatHistoryAsInput()` (pass the full chat history) and `WithAgentInputSchema(schema)` (custom input schema)
+
+# ModelRetry
+
+Once configured, failed ChatModel calls are retried automatically. If an error occurs during a streaming response, the current stream is still returned through AgentEvent, and consuming the MessageStream yields a `WillRetryError`:
+
+```go
+agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+    // ...
+    ModelRetryConfig: &adk.ModelRetryConfig{
+        // retry policy configuration
+    },
+})
+
+// Handle WillRetryError while consuming the event stream
+stream := event.Output.MessageOutput.MessageStream
+for {
+    msg, err := stream.Recv()
+    if err == io.EOF {
+        break
+    }
+    if err != nil {
+        var willRetry *adk.WillRetryError
+        if errors.As(err, &willRetry) {
+            log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt)
+            break // wait for the next event
+        }
+        break
+    }
+    displayChunk(msg)
+}
+```
+
+# ModelFailover
+
+Once configured, a failed model call switches to a backup model:
+
+```go
+agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+    Model: primaryModel,
+    ModelFailoverConfig: &adk.ModelFailoverConfig{
+        MaxRetries: 1, // 0 means no failover
+        ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool {
+            return true // decide whether to fail over based on the error type
+        },
+        GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) (model.BaseChatModel, []*schema.Message, error) {
+            return backupModel, nil, nil // nil messages: reuse the original input
+        },
+    },
+})
+```
+
+# Cancel
+
+Runtime cancellation, added in v0.9. See [Agent Cancel and TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart).
+
+```go
+cancelOpt, cancelFn := adk.WithCancel()
+iter := runner.Run(ctx, messages, cancelOpt)
+
+// Cancel later (CancelMode values can be combined as a bitmask)
+handle := cancelFn(adk.CancelAfterChatModel | adk.CancelAfterToolCalls)
+handle.Wait() // wait for the cancellation to complete
+```
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md
new file mode 100644
index 00000000000..021d583d4f0
--- /dev/null
+++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md @@ -0,0 +1,176 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: ChatModel Failover 功能文档 +weight: 1 +--- + +> 💡 +> 本功能目前在 alpha/09 灰度中 + +## 概述 + +`ChatModelAgent` 内置模型故障转移(Failover)能力:主模型调用失败时自动切换备用模型,支持 Generate(同步)和 Stream(流式)。通过 `ModelFailoverConfig[M]` 配置,与 `TypedModelRetryConfig[M]`(同模型重试)正交组合。 + +> 本文以默认 `*schema.Message` 类型为例。泛型用法请将 API 替换为对应的 `Typed` 前缀版本,消息类型参数化为 `M MessageType`。 + +## 核心数据结构 + +### ModelFailoverConfig[M] + +```go +type ModelFailoverConfig[M MessageType] struct { + // 最大故障转移次数。0 表示不 failover; + // 1 表示 GetFailoverModel 最多被调用 1 次。 + // 含 lastSuccessModel 时先尝试它,再调用 GetFailoverModel。 + MaxRetries uint + + // 判断是否触发 failover。ctx.Err() != nil 时不论返回值均停止。 + // 与 ModelRetryConfig 组合时,outputErr 为 *RetryExhaustedError; + // 原始错误通过 RetryExhaustedError.LastErr 获取。 + // 流式场景下 outputMessage 可能携带已接收的部分消息。 + // 配置 ModelFailoverConfig 时此字段必填。 + ShouldFailover func(ctx context.Context, outputMessage M, outputErr error) bool + + // 选择下一个模型并可选地转换输入消息。 + // failoverCtx.FailoverAttempt 从 1 开始。 + // 返回 nil failoverModelInputMessages 表示沿用原始输入。 + // 返回非 nil failoverErr 立即终止 failover。 + // 配置 ModelFailoverConfig 时此字段必填。 + GetFailoverModel func(ctx context.Context, failoverCtx *FailoverContext[M]) ( + failoverModel model.BaseModel[M], + failoverModelInputMessages []M, + failoverErr error, + ) +} +``` + +### FailoverContext[M] + +```go +type FailoverContext[M MessageType] struct { + FailoverAttempt uint // 当前尝试编号,从 1 开始 + InputMessages []M // 转换前的原始输入 + LastOutputMessage M // 上次失败的输出(流式下为部分消息) + // 与 ModelRetryConfig 组合时为 *RetryExhaustedError + LastErr error // 上次失败的错误 +} +``` + +## 快速接入 + +### 基础用法:双模型故障转移 + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-agent", + Instruction: "You are a helpful assistant.", + Model: primaryModel, // model.BaseModel[*schema.Message],必填 + + 
ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, // 最多 1 次 failover(共 2 次调用) + + ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool { + return !errors.Is(err, context.Canceled) && + !errors.Is(err, context.DeadlineExceeded) + }, + + GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil // nil 消息 → 沿用原始输入 + }, + }, +}) +``` + +> 💡 +> `model.BaseChatModel` 是 `model.BaseModel[*schema.Message]` 的类型别名,两者可互换使用。 + +### 故障转移时转换输入 + +当备用模型不支持某些功能(如图片输入)时: + +```go +ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, _ error) bool { + return true + }, + GetFailoverModel: func(_ context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + // 过滤掉图片内容,降级到纯文本模型 + return textModel, filterTextOnly(fc.InputMessages), nil + }, +}, +``` + +### 结合 Retry + +Failover 与 Retry 正交组合。语义:**每个模型先按 Retry 策略重试,重试耗尽后触发 Failover 切换**。 + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: primaryModel, + // ... + + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 2, + IsRetryAble: func(_ context.Context, err error) bool { + return isTransientError(err) + }, + }, + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, err error) bool { + // err 此时为 *RetryExhaustedError + return true + }, + GetFailoverModel: func(_ context.Context, _ *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil + }, + }, +}) +``` + +## 流式 Failover 行为 + + + + + + +
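| Scenario | Behavior |
| --- | --- |
| `Stream()` initialization fails | Same as Generate: the failover decision is triggered directly |
| Error mid-stream | Chunks already received are concatenated into `LastOutputMessage` and passed to `ShouldFailover`; once failover is decided, the current stream is closed and restarted with the new model |
| Client impact | Events already sent during the failed attempt are not retracted. Clients should reset partial results when a new stream round arrives, or deduplicate by metadata |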
    + +> 💡 +> `ErrStreamCanceled`(调用方主动放弃流)不触发 failover,直接返回。 + +## Model 调用链执行顺序 + +Failover 在包装链中的位置(从外到内): + +``` +1. AgentMiddleware.BeforeChatModel + 2. ChatModelAgentMiddleware.BeforeModelRewriteState + 3. failoverModelWrapper ← failover 在此层 + 4. retryModelWrapper ← 每个 failover 模型内部重试 + 5. eventSenderModelWrapper + 6. ChatModelAgentMiddleware.WrapModel(先注册的在最外层) + 7. callbackInjectionModelWrapper(failover 启用时由 failoverProxyModel 内部处理) + 8. failoverProxyModel / Model.Generate|Stream + 9. ChatModelAgentMiddleware.AfterModelRewriteState +10. AgentMiddleware.AfterChatModel +``` + +## 注意事项 + +- **必填校验**:`ShouldFailover` 和 `GetFailoverModel` 在配置 `ModelFailoverConfig` 时均为必填,缺少任一在 `NewChatModelAgent` 时返回错误。`Model` 字段始终必填。 +- **Attempt 编号**:`FailoverAttempt` 从 1 开始。单次 Model 调用最多执行 `1 + MaxRetries` 次(初始 1 次 + failover 最多 MaxRetries 次)。 +- **输入消息**:`GetFailoverModel` 返回 `nil` 消息时沿用原始输入;返回非 `nil` 时替代原始输入。 +- **与 Retry 组合时的错误类型**:`ShouldFailover` 和 `FailoverContext.LastErr` 收到的是 `*RetryExhaustedError`,原始错误通过 `RetryExhaustedError.LastErr` 获取。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md index 3fcfb76163f..c412653db50 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md @@ -1,196 +1,208 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: DeepAgents -weight: 5 +weight: 3 --- -## DeepAgents 概述 +> 💡 +> 本功能要求 eino >= v0.5.14。 -DeepAgents 是在 ChatModelAgent (详见:[Eino ADK: ChatModelAgent](/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model))的基础上实现的一种开箱即用的 agent 方案。你无需自己去拼装提示词、工具或上下文管理,就可以立即获得一个可运行的 agent,并仍可使用 ChatModelAgent 的扩展能力来为 agent 增加业务功能,如添加自定义 tools 和 middleware 等。 +## 概述 -**包含内容:** +DeepAgents 是基于 ChatModelAgent 的开箱即用方案。无需手动拼装提示词、工具或上下文管理,即可获得具备规划、文件系统、Shell 执行和子 Agent 
委派能力的 Agent,同时保留 ChatModelAgent 的全部扩展能力(自定义 tools、middleware、handlers)。 -- **规划能力** —— 通过 `write_todos` 进行任务拆解与进度跟踪 -- **文件系统** —— 提供 `read_file`、`write_file`、`edit_file`、`ls`、`glob`、`grep`,用于读取和写入上下文 -- **Shell 访问** —— 使用 `execute` 运行命令 -- **子 Agent** —— 通过 `task` 将工作委派给拥有独立上下文窗口的子智能体 -- **智能默认配置** —— 内置 Prompt,教模型如何高效使用这些工具 -- **上下文管理** —— 长对话历史自动摘要,大体量输出自动保存到文件 - - SummarizationMiddleware、ReductionMiddleware 正在建设中 +**内置能力**: -### ImportPath +- **规划** — `write_todos` 工具进行任务拆解与进度跟踪 +- **文件系统** — `ls`、`read_file`、`write_file`、`edit_file`、`glob`、`grep` +- **Shell** — `execute`(支持流式) +- **子 Agent** — `task` 工具将任务委派到上下文隔离的子智能体 +- **智能默认** — 内置 Prompt 教模型高效使用工具 +- **上下文管理** — 大体量输出自动保存到文件 -Eino 版本需大于等于 v0.5.14 +### Import ```go -import github.com/cloudwego/eino/adk/prebuilt/deep +import "github.com/cloudwego/eino/adk/prebuilt/deep" -agent, err := deep.New(ctx, &deep.Config{}) +agent, err := deep.New(ctx, &deep.Config{ + ChatModel: myModel, +}) ``` -### DeepAgents 结构 - -DeepAgents 核心思想在于通过一个主 agent(MainAgent)来协调、规划、委派或自主执行任务。主 agent 利用其内置的 ChatModel 和一系列工具来与外部世界交互或将复杂任务分解给专门的子 agents(SubAgents)。 - - +--- -上图展示了 DeepAgents 的核心组件与它们之间的调用关系: +## Config 完整定义 -- 主 Agent: 系统的入口和总指挥,接收初始任务,以 ReAct 方式调用工具完成任务并负责最终结果的呈现。 -- ChatModel (ToolCallingChatModel): 通常是一个具备工具调用能力的大语言模型,负责理解任务、推理、选择并调用工具。 -- Tools: MainAgent 可用的一系列能力的集合,包括: - - WriteTodos: 内置的规划工具,用于将复杂任务拆解为结构化的待办事项列表。 - - TaskTool: 一个特殊的工具,作为调用子 Agent 的统一入口。 - - BuiltinTools、CustomTools: DeepAgents 内置的通用工具以及用户根据业务需求自定义的各类工具。 -- SubAgents: 负责执行具体、独立的子任务,与 MainAgent 上下文独立。 - - GeneralPurpose: 通用子 Agent,具有与 MainAgent 相同的 Tools(除了 TaskTool),用于在“干净”的上下文中执行子任务。 - - CustomSubAgents: 用户根据业务需求自定义的各种子 Agent。 +```go +type Config = TypedConfig[*schema.Message] -### 内置能力 +type TypedConfig[M adk.MessageType] struct { + Name string // Agent 标识名 + Description string // 用途描述 + ChatModel model.BaseModel[M] // 必填;需支持 model.WithTools + Instruction string // 系统提示词;为空时使用内置默认 Prompt -#### Filesystem + // 子 Agent(绑定到 TaskTool) + SubAgents 
[]adk.TypedAgent[M] -> 💡 -> 目前处于 alpha 状态 + // 自定义工具 + ToolsConfig adk.ToolsConfig + MaxIteration int // 最大推理迭代次数 -创建 DeepAgents 时配置相关 Backend,DeepAgents 会自动加载相应工具: + // 文件系统(三选一或组合) + Backend filesystem.Backend // 注册 ls/read_file/write_file/edit_file/glob/grep + Shell filesystem.Shell // 注册 execute(与 StreamingShell 互斥) + StreamingShell filesystem.StreamingShell // 注册 execute(流式,与 Shell 互斥) -``` -type Config struct { - // ... - Backend filesystem.Backend - Shell filesystem.Shell - StreamingShell filesystem.StreamingShell - // ... -} -``` + // 内置功能开关 + WithoutWriteTodos bool // true 时关闭 write_todos 工具 + WithoutGeneralSubAgent bool // true 时关闭默认 general-purpose 子 Agent - - - - - -
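| Config | Function | Tools added |
| --- | --- | --- |
| Backend | Provides file system access; optional | read_file, write_file, edit_file, glob, grep |
| Shell | Provides Shell capability; optional; mutually exclusive with StreamingShell | execute |
| StreamingShell | Provides Shell capability with streamed results; optional; mutually exclusive with Shell | execute (streaming) |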
    + // TaskTool 描述生成器(自定义 task 工具的 description) + TaskToolDescriptionGenerator func(ctx context.Context, agents []adk.TypedAgent[M]) (string, error) -DeepAgents 内引用 filesystem middleware 来实现内置 filesystem,此 middleware 更详细的能力说明见:[Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) + // 扩展 + Middlewares []adk.AgentMiddleware // struct-based 中间件 + Handlers []adk.TypedChatModelAgentMiddleware[M] // interface-based handlers -### 任务拆解与规划 + // 模型容错 + ModelRetryConfig *adk.TypedModelRetryConfig[M] + ModelFailoverConfig *adk.ModelFailoverConfig[M] -WriteTodos 的 Description 描述了任务拆解、规划的原则,主 Agent 通过调用 WriteTodos 工具,在上下文中添加子任务列表来启发后续推理、执行过程: + // 输出存储(通过 AddSessionValue 写入会话) + OutputKey string +} +``` - +### 构造函数 -1. 模型接收用户输入。 -2. 模型调用 WriteTodos 工具,参数为依照 WriteTodos Description 产生的任务列表。这次工具调用被添加到上下文中,供后续参考。 -3. 模型依照上下文中的 todos,调用 TaskTool 完成第一个 todo。 -4. 再次调用 WriteTodos ,更新 Todos 执行进度。 +```go +// 标准版(M = *schema.Message) +func New(ctx context.Context, cfg *Config) (adk.ResumableAgent, error) -> 💡 -> 对简单任务来说,每次都调用 WriteTodos 可能会起到反效果。WriteTodos Description 中添加了一些比较通用的正反例子来避免不调用或过度调用 WriteTodos。使用 DeepAgents 时,可以根据实际业务场景添加更多 prompt 来让 WriteTodos 在合适的时候被调用。 +// 泛型版(支持 *schema.AgenticMessage) +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedResumableAgent[M], error) +``` > 💡 -> WriteTodos 会被默认添加到 Agent 中,配置 `WithoutWriteTodos=true` 可以关闭 WriteTodos。 +> 返回 ResumableAgent(包含 Resume 方法),可与 Runner 的 checkpoint/resume 机制配合使用。 -### 任务委派与 SubAgents 调用 - -**TaskTool** +--- -所有子 Agent 会被绑定到 TaskTool 上,当主 Agent 分配子任务给子 Agent 处理时,它会调用 TaskTool,并指明需要哪个子代理及执行的任务。TaskTool 随后将任务路由到指定的子代理,并在其执行完毕后,将结果返回给主 Agent。TaskTool 的默认 Description 会说明调用子 Agent 的通用规则并拼接每个子 Agent 的 Description,开发者可以通过配置 `TaskToolDescriptionGenerator` 来自定义 TaskTool 的 Description。 +## 架构 -> 当用户配置了 Config.SubAgents 时,这些 Agent 会基于 ChatModelAgent AgentAsTool 的能力绑定到 TaskTool 上 + -**上下文隔离** +- **主 Agent**:系统入口,以 ReAct 方式调用工具完成任务 +- 
**ChatModel**(`model.BaseModel[M]`):负责推理与工具选择 +- **Tools**: + - `write_todos`:内置规划工具,将任务拆解为结构化 TODO 列表 + - `task`:子 Agent 调用入口(路由参数:`subagent_type`、`description`) + - 内置工具(文件系统/Shell)+ 用户自定义工具(`ToolsConfig`) +- **SubAgents**:上下文隔离,独立执行子任务 + - `general-purpose`:默认子 Agent,拥有与主 Agent 相同的工具(除 task)和配置 + - 自定义子 Agent(`Config.SubAgents`) -Agent 之间的上下文隔离: +--- -- 信息传递: 主 Agent 与子 Agent 之间不共享上下文。子 Agent 仅接收主 Agent 分配的子任务目标,不会接收整个任务的处理过程;主 Agent 仅接收子 Agent 的处理结果,不会接受子 Agent 的处理过程。 -- 避免污染: 这种隔离确保了子 Agent 的执行过程(如大量的工具调用和中间步骤)不会“污染”主代理的上下文,主代理只接收简洁、明确的最终答案。 +## 内置文件系统 -**general-purpose** + + + + + +
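| Config field | Registered tools | Description |
| --- | --- | --- |
| `Backend` | ls, read_file, write_file, edit_file, glob, grep | File system operations |
| `Shell` | execute | Non-streaming command execution; mutually exclusive with StreamingShell |
| `StreamingShell` | execute (streaming) | Streaming command execution; mutually exclusive with Shell |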
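
The mapping in this table can be restated as a small, library-free sketch. `fsConfig` and `registeredTools` are hypothetical, with booleans standing for "this field is set". Returning an error when both shells are set is an assumption about how the mutual exclusion might be enforced, not a description of what `deep.New` does.

```go
package main

import (
	"errors"
	"fmt"
)

// fsConfig is a hypothetical stand-in for the three filesystem-related fields.
type fsConfig struct {
	HasBackend        bool
	HasShell          bool
	HasStreamingShell bool
}

// registeredTools lists the tool names the table associates with each field.
func registeredTools(c fsConfig) ([]string, error) {
	if c.HasShell && c.HasStreamingShell {
		// Assumed handling of the documented mutual exclusion.
		return nil, errors.New("set either Shell or StreamingShell, not both")
	}
	var tools []string
	if c.HasBackend {
		tools = append(tools, "ls", "read_file", "write_file", "edit_file", "glob", "grep")
	}
	if c.HasShell || c.HasStreamingShell {
		tools = append(tools, "execute")
	}
	return tools, nil
}

func main() {
	tools, err := registeredTools(fsConfig{HasBackend: true, HasStreamingShell: true})
	fmt.Println(tools, err)
}
```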
    -DeepAgents 会默认增加一个子 Agent:general-purpose。general-purpose 具有和主 Agent 相同的 system prompt 和工具(除了 TaskTool),当任务没有专门的子 Agent 来解决时,主 Agent 可以调用 general-purpose 来隔离上下文。开发者可以通过配置 `WithoutGeneralSubAgent=true` 去掉此 Agent。 +内部使用 FileSystem Middleware 实现。 -### 与其他 Agent 对比 +--- -- 对比 ReAct Agent +## 任务规划:write_todos - - 优势:DeepAgents 通过内置 WriteTodos 强化任务拆解与规划;同时隔离多 Agents 上下文,在大规模、多步骤任务中通常效果更优。 - - 劣势:制定计划与调用子 Agent 会带来额外的模型请求,增加耗时与 token 成本;若任务拆分不合理,可能对效果产生反作用。 -- 对比 Plan-and-Execute + - - 优势:DeepAgents 将 Plan/RePlan 作为工具供主 Agent 自由调用,可以在任务中跳过不必要的规划,整体上减少模型调用次数、降低耗时与成本。 - - 劣势:任务规划与委派由一次模型调用完成,对模型能力要求更高,提示词调优也相对更困难。 +`write_todos` 工具将结构化 TODO 列表写入会话(key: `deep_agent_session_key_todos`),供后续推理参考。 -## DeepAgents 使用示例 +**TODO 结构**: -### 场景说明 +```go +type TODO struct { + Content string `json:"content"` + ActiveForm string `json:"activeForm"` + Status string `json:"status"` // "pending" | "in_progress" | "completed" +} +``` -Excel Agent 是一个“看得懂 Excel 的智能助手”,它先把问题拆解成步骤,再一步步执行并校验结果。它能理解用户问题与上传的文件内容,提出可行的解决方案,并选择合适的工具(系统命令、生成并运行 Python 代码、网络查询等等)完成任务。 +**工作流程**: -在真实业务里,你可以把 Excel Agent 当成一位“Excel 专家 + 自动化工程师”。当你交付一个原始表格和目标描述,它会给出方案并完成执行: +1. 模型接收用户输入 +2. 调用 `write_todos` 拆解任务,写入上下文 +3. 按 TODO 逐项执行(调用 task 或直接工具) +4. 再次调用 `write_todos` 更新进度 -- **数据清理与格式化**:从一个包含大量数据的 Excel 文件中完成去重、空值处理、日期格式标准化操作。 -- **数据分析与报告生成**:从销售数据中提取每月的销售总额,聚合统计、透视,最终生成并导出图表报告。 -- **自动化预算计算**:根据不同部门的预算申请,自动计算总预算并生成部门预算分配表。 -- **数据匹配与合并**:将多个不同来源的客户信息表进行匹配合并,生成完整的客户信息数据库。 +> 💡 +> 对简单任务,每次都调用 write_todos 可能适得其反。内置 Prompt 已包含正反例指导何时使用。可通过自定义 Instruction 进一步调优。配置 WithoutWriteTodos=true 可完全关闭。 -用 DeepAgents 搭建的 Excel Agent 结构如下: +--- - +## 子 Agent 委派:task 工具 -1. 在主 Agent 添加 ReadFile 工具,允许主 Agent 查看文件内容,辅助子任务制定 -2. 
添加 Code 和 WebSearch 两个子 Agent:Code 可以编写 python 代码来操作 excel 表格;WebSearch 可以搜索信息并总结。 +**TaskTool** 是所有子 Agent 的统一调用入口: -### 代码实现 +- 参数:`subagent_type`(目标子 Agent 名称)、`description`(任务描述) +- 内部通过 `adk.NewTypedAgentTool` 将每个子 Agent 包装为工具 +- 默认 Description 包含所有可用子 Agent 的名称与说明;可通过 `TaskToolDescriptionGenerator` 自定义 -[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) +**上下文隔离**: -### 运行结果 +- 子 Agent 仅接收主 Agent 分配的任务描述,不共享对话历史 +- 主 Agent 仅接收子 Agent 的最终结果,中间步骤不回传 +- 避免大量工具调用和中间推理"污染"主 Agent 上下文 -```yaml -name: ExcelAgent -path: [{ExcelAgent}] -tool name: task -arguments: {"subagent_type":"CodeAgent","description":"请帮我处理一个CSV文件操作任务。具体要求如下:1. 读取当前目录下的questions.csv文件;2. 提取该CSV文件中的第一列数据;3. 将提取出的第一列数据保存到一个新的CSV文件中,新文件命名为first_column.csv。请使用pandas库来完成这些操作,确保代码能够正确执行并处理可能出现的文件路径问题。"} +**general-purpose 子 Agent**: -name: ExcelAgent -path: [{ExcelAgent}] -tool name: python_runner -arguments: {"code":"```python\nimport pandas as pd\nimport os\n\n# 获取当前工作目录并构建文件路径\nbase_dir = os.getcwd()\ninput_path = os.path.join(base_dir, 'questions.csv')\noutput_path = os.path.join(base_dir, 'first_column.csv')\n\ntry:\n # 读取CSV文件\n df = pd.read_csv(input_path)\n # 提取第一列数据\n first_column = df.iloc[:, 0]\n # 保存为新CSV文件(不包含索引)\n first_column.to_csv(output_path, index=False)\n print(f\"成功将第一列数据保存至: {output_path}\")\nexcept FileNotFoundError:\n print(f\"错误: 未找到文件 {input_path}\")\nexcept Exception as e:\n print(f\"处理过程中发生错误: {str(e)}\")\n```"} +- 默认创建,拥有与主 Agent 相同的工具(除 task)、Instruction 和 ModelFailoverConfig +- 用于在隔离上下文中执行无专门子 Agent 的通用任务 +- 配置 `WithoutGeneralSubAgent=true` 可关闭 -name: ExcelAgent -path: [{ExcelAgent}] -tool response: 成功将第一列数据保存至: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv +--- +## 与其他方案对比 -name: ExcelAgent -path: [{ExcelAgent}] -answer: 任务已完成。已成功读取当前目录下的 `questions.csv` 文件,提取第一列数据,并将结果保存至 
`first_column.csv`。具体输出路径如下: + + + + +
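| Dimension | DeepAgents vs ReAct | DeepAgents vs Plan-and-Execute |
| --- | --- | --- |
| Strengths | Built-in planning plus sub-agent context isolation; better results on multi-step tasks | Plan/RePlan is exposed as a tool and called on demand, reducing unnecessary planning overhead |
| Weaknesses | Planning and sub-agent calls add model requests, latency, and token cost | Planning and delegation happen in a single call, which demands more of the model |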
    -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +--- -代码已处理路径拼接和异常捕获(如文件不存在或格式错误),确保执行稳定性。 +## 使用示例 -name: ExcelAgent -path: [{ExcelAgent}] -tool response: 任务已完成。已成功读取当前目录下的 `questions.csv` 文件,提取第一列数据,并将结果保存至 `first_column.csv`。具体输出路径如下: +### Excel Agent 场景 -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` + -代码已处理路径拼接和异常捕获(如文件不存在或格式错误),确保执行稳定性。 +- 主 Agent 配置 ReadFile 工具辅助任务制定 +- 添加 Code(Python 操作 Excel)和 WebSearch 两个子 Agent -name: ExcelAgent -path: [{ExcelAgent}] -answer: 已成功将 `questions.csv` 表格中的第一列数据提取至新文件 `first_column.csv`,文件保存路径为 -: +### 代码 -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c4 -4b131973/first_column.csv` +完整示例:[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) -操作过程中已处理路径拼接和异常捕获(如文件不存在、格式错误等问题),确保数据 -提取完整性和文件生成稳定性。若需要调整文件路径或对数据格式有进一步要求,请随时告知 -。 +```go +agent, err := deep.New(ctx, &deep.Config{ + Name: "ExcelAgent", + ChatModel: myModel, + Backend: localBackend, + SubAgents: []adk.Agent{codeAgent, webSearchAgent}, + ToolsConfig: adk.ToolsConfig{ + InvokableTools: []tool.InvokableTool{readFileTool}, + }, +}) ``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md index 1f3ba69429e..e90c0266c67 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Plan-Execute Agent -weight: 4 +weight: 2 --- ## Plan-Execute Agent 概述 @@ -275,7 +275,7 @@ func 
newPlanExecuteAgent(ctx context.Context) adk.Agent { replanner := newReplanner(ctx, model) // 组合为 PlanExecuteAgent(固定 execute - replan 最大迭代 10 次) - planExecuteAgent, err := planexecute.NewPlanExecuteAgent(ctx, &planexecute.PlanExecuteConfig{ + planExecuteAgent, err := planexecute.New(ctx, &planexecute.PlanExecuteConfig{ Planner: planner, Executor: executor, Replanner: replanner, diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md deleted file mode 100644 index f583790aa52..00000000000 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md +++ /dev/null @@ -1,499 +0,0 @@ ---- -Description: "" -date: "2026-03-02" -lastmod: "" -tags: [] -title: Supervisor Agent -weight: 3 ---- - -## Supervisor Agent 概述 - -### Import Path - -`import ``github.com/cloudwego/eino/adk/prebuilt/supervisor` - -### 什么是 Supervisor Agent? - -Supervisor Agent 是一种中心化多 Agent 协作模式,由一个监督者(Supervisor Agent) 和多个子 Agent(SubAgents)组成。Supervisor 负责任务的分配、子 Agent 执行过程的监控,以及子 Agent 完成后的结果汇总与下一步决策;子 Agent 则专注于执行具体任务,并在完成后通过 WithDeterministicTransferTo 自动将任务控制权交回 Supervisor。 - - - -该模式适用于需要动态协调多个专业 Agent 完成复杂任务的场景,例如: - -- 科研项目管理(Supervisor 分配调研、实验、报告撰写任务给不同子 Agent)。 -- 客户服务流程(Supervisor 根据用户问题类型,分配给技术支持、售后、销售等子 Agent)。 - -### Supervisor Agent 结构 - -Supervisor 模式的核心结构如下: - -- **Supervisor Agent**:作为协作核心,具备任务分配逻辑(如基于规则或 LLM 决策),可通过 `SetSubAgents` 将子 Agent 纳入管理。 -- **SubAgents**:每个子 Agent 被 WithDeterministicTransferTo 增强,预设 `ToAgentNames` 为 Supervisor 名称,确保任务完成后自动转让回 Supervisor。 - -### Supervisor Agent 特点 - -1. **确定性回调**:子 Agent 执行完毕(未中断)后,通过 WithDeterministicTransferTo 自动触发 Transfer 事件,将任务控制权交回 Supervisor,避免协作流程中断。 -2. **中心化控制**:Supervisor 统一管理子 Agent,可根据子 Agent 的执行结果动态调整任务分配(如分配给其他子 Agent 或直接生成最终结果)。 -3. **松耦合扩展**:子 Agent 可独立开发、测试和替换,只需确保实现 Agent 接口并绑定到 Supervisor,即可接入协作流程。 -4. 
**支持中断与恢复**:若子 Agent 或 Supervisor 支持 `ResumableAgent` 接口,协作流程可在中断后恢复,保持任务上下文连续性。 - -### Supervisor Agent 运行流程 - -Supervisor 模式的典型协作流程如下: - -1. **任务启动**:Runner 触发 Supervisor 运行,输入初始任务(如“完成一份 LLM 发展历史报告”)。 -2. **任务分配**:Supervisor 根据任务需求,通过 Transfer 事件将任务转让给指定子 Agent(如“调研 Agent”)。 -3. **子 Agent 执行**:子 Agent 执行具体任务(如调研 LLM 关键里程碑),并生成执行结果事件。 -4. **自动回调**:子 Agent 完成后,WithDeterministicTransferTo 触发 Transfer 事件,将任务转让回 Supervisor。 -5. **结果处理**:Supervisor 接收子 Agent 的结果,决定下一步(如分配给“报告撰写 Agent”继续处理,或直接输出最终结果)。 - -## Supervisor Agent 使用示例 - -### 场景说明 - -创建一个科研报告生成系统: - -- **Supervisor**:基于用户输入的研究主题,分配任务给“调研 Agent”和“撰写 Agent”,并汇总最终报告。 -- **调研 Agent**:负责生成研究计划(如 LLM 发展的关键阶段)。 -- **撰写 Agent**:负责根据调研计划撰写完整报告。 - -### 代码实现 - -#### 步骤 1:实现子 Agent - -首先创建两个子 Agent,分别负责调研和撰写任务: - -```go -// 调研 Agent:生成研究计划 -func NewResearchAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ResearchAgent", - Description: "Generates a detailed research plan for a given topic.", - Instruction: ` -You are a research planner. Given a topic, output a step-by-step research plan with key stages and milestones. -Output ONLY the plan, no extra text.`, - Model: model, - }) - return agent -} - -// 撰写 Agent:根据研究计划撰写报告 -func NewWriterAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WriterAgent", - Description: "Writes a report based on a research plan.", - Instruction: ` -You are an academic writer. Given a research plan, expand it into a structured report with details and analysis. 
-Output ONLY the report, no extra text.`, - Model: model, - }) - return agent -} -``` - -#### 步骤 2:实现 Supervisor Agent - -创建 Supervisor Agent,定义任务分配逻辑(此处简化为基于规则:先分配给调研 Agent,再分配给撰写 Agent): - -```go -// Supervisor Agent:协调调研和撰写任务 -func NewReportSupervisor(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportSupervisor", - Description: "Coordinates research and writing to generate a report.", - Instruction: ` -You are a project supervisor. Your task is to coordinate two sub-agents: -- ResearchAgent: generates a research plan. -- WriterAgent: writes a report based on the plan. - -Workflow: -1. When receiving a topic, first transfer the task to ResearchAgent. -2. After ResearchAgent finishes, transfer the task to WriterAgent with the plan as input. -3. After WriterAgent finishes, output the final report.`, - Model: model, - }) - return agent -} -``` - -#### 步骤 3:组合 Supervisor 与子 Agent - -使用 `NewSupervisor` 将 Supervisor 和子 Agent 组合: - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/prebuilt/supervisor" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func main() { - ctx := context.Background() - - // 1. 创建 LLM 模型(如 GPT-4o) - model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: "YOUR_API_KEY", - Model: "gpt-4o", - }) - - // 2. 创建子 Agent 和 Supervisor - researchAgent := NewResearchAgent(model) - writerAgent := NewWriterAgent(model) - reportSupervisor := NewReportSupervisor(model) - - // 3. 组合 Supervisor 与子 Agent - supervisorAgent, _ := supervisor.New(ctx, &supervisor.Config{ - Supervisor: reportSupervisor, - SubAgents: []adk.Agent{researchAgent, writerAgent}, - }) - - // 4. 
运行 Supervisor 模式 - iter := supervisorAgent.Run(ctx, &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("Write a report on the history of Large Language Models."), - }, - EnableStreaming: true, - }) - - // 5. 消费事件流(打印结果) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Output != nil && event.Output.MessageOutput != nil { - msg, _ := event.Output.MessageOutput.GetMessage() - println("Agent[" + event.AgentName + "]:\n" + msg.Content + "\n===========") - } - } -} -``` - -### 运行结果 - -```markdown -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [ResearchAgent] -=========== -Agent[ResearchAgent]: -1. **Scope Definition & Background Research** - - Task: Define "Large Language Model" (LLM) for the report (e.g., size thresholds, key characteristics: transformer-based, large-scale pretraining, general-purpose). - - Task: Identify foundational NLP/AI concepts pre-LLMs (statistical models, early neural networks, word embeddings) to contextualize origins. - - Milestone: 3-day literature review of academic definitions, industry reports, and AI historiographies to finalize scope. - -2. **Chronological Periodization** - - Task: Divide LLM history into distinct eras (e.g., Pre-2017: Pre-transformer foundations; 2017-2020: Transformer revolution & early LLMs; 2020-Present: Scaling & mainstream adoption). - - Task: Map key events, models, and breakthroughs per era (e.g., 2017: "Attention Is All You Need"; 2018: GPT-1/BERT; 2020: GPT-3; 2022: ChatGPT; 2023: Llama 2). - - Milestone: 10-day timeline draft with annotated model releases, research papers, and technological shifts. - -3. **Key Technical Milestones** - - Task: Deep-dive into critical innovations (transformer architecture, pretraining-fine-tuning paradigm, scaling laws, in-context learning). - - Task: Extract details from seminal papers (authors, institutions, methodologies, performance benchmarks). 
- - Milestone: 1-week analysis of 5-7 foundational papers (e.g., Vaswani et al. 2017; Radford et al. 2018; Devlin et al. 2018) with technical summaries. - -4. **Stakeholder Mapping** - - Task: Identify key organizations (OpenAI, Google DeepMind, Meta AI, Microsoft Research) and academic labs (Stanford, Berkeley) driving LLM development. - - Task: Document institutional contributions (e.g., OpenAI’s GPT series, Google’s BERT/PaLM, Meta’s Llama) and research priorities (open vs. closed models). - - Milestone: 5-day stakeholder profile draft with org-specific timelines and model lineages. - -5. **Technical Evolution & Innovation Trajectory** - - Task: Analyze shifts in architecture (from RNNs/LSTMs to transformers), training paradigms (pretraining + fine-tuning → instruction tuning → RLHF), and compute scaling (parameters, data size, GPU usage over time). - - Task: Link technical changes to performance improvements (e.g., GPT-1 (124M params) vs. GPT-4 (100B+ params): task generalization, emergent abilities). - - Milestone: 1-week technical trajectory report with data visualizations (param scaling, benchmark scores over time). - -6. **Impact & Societal Context** - - Task: Research LLM impact on NLP tasks (translation, summarization, QA) and beyond (education, content creation, policy). - - Task: Document cultural/industry shifts (rise of prompt engineering, "AI-native" products, public perception post-ChatGPT). - - Milestone: 5-day impact analysis integrating case studies (e.g., GitHub Copilot, healthcare LLMs) and media/scholarly discourse. - -7. **Challenges & Critiques (Historical Perspective)** - - Task: Track historical limitations (pre-2020: data sparsity, task specificity; post-2020: bias, misinformation, energy use) and responses (e.g., 2019: BERT bias audits; 2023: EU AI Act). - - Task: Cite key critiques (e.g., "On the Dangers of Stochastic Parrots," 2021) and industry/academic reactions. 
- - Milestone: 5-day challenge timeline linking issues to their emergence and mitigation efforts. - -8. **Synthesis & Narrative Drafting** - - Task: Integrate chronological, technical, and societal data into a coherent narrative (origins → revolution → scaling → mainstream impact). - - Task: Outline report structure (Abstract, Introduction, Era-by-Era Analysis, Key Innovations, Stakeholders, Impact, Challenges, Conclusion). - - Milestone: 1-week first draft of full report (8,000–10,000 words). - -9. **Validation & Fact-Checking** - - Task: Verify model release dates, paper citations, parameter counts, and stakeholder claims via primary sources (original papers, official press releases, archived GitHub repos). - - Task: Cross-check with secondary sources (AI history books, expert interviews, peer-reviewed historiographies). - - Milestone: 3-day validation report flagging/correcting inaccuracies. - -10. **Finalization & Revision** - - Task: Edit for clarity, narrative flow, and consistency; refine visuals (timelines, param scaling charts). - - Task: Format references (APA/MLA) and appendices (model comparison table, key paper list). - - Milestone: 2-day final report submission. -=========== -Agent[ResearchAgent]: - -=========== -Agent[ResearchAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [WriterAgent] -=========== -Agent[WriterAgent]: -# The History of Large Language Models: From Foundations to Mainstream Revolution - - -## Abstract -Large Language Models (LLMs) represent one of the most transformative technological innovations of the 21st century, enabling machines to understand, generate, and manipulate human language with unprecedented fluency. This report traces the historical trajectory of LLMs, from their conceptual roots in early natural language processing (NLP) to their current status as mainstream tools. 
It examines key technical milestones—including the invention of the transformer architecture, the rise of pretraining-fine-tuning paradigms, and the scaling of model parameters—and contextualizes these within the contributions of academic labs and tech giants. The report also analyzes societal impacts, from revolutionizing NLP tasks to sparking debates over bias, misinformation, and AI regulation. By synthesizing chronological, technical, and cultural data, this history reveals how LLMs evolved from niche research experiments to agents of global change. - - -## 1. Introduction: Defining Large Language Models -A **Large Language Model (LLM)** is a type of machine learning model designed to process and generate human language by learning patterns from massive text datasets. Key characteristics include: (1) a transformer-based architecture, enabling parallel processing of text sequences; (2) large-scale pretraining on diverse corpora (e.g., books, websites, articles); (3) general-purpose functionality, allowing adaptation to tasks like translation, summarization, or dialogue without task-specific engineering; and (4) scale, typically defined by billions (or tens of billions) of parameters (adjustable weights that capture linguistic patterns). - -LLMs emerged from decades of NLP research, building on foundational concepts like statistical models (e.g., n-grams), early neural networks (e.g., recurrent neural networks [RNNs]), and word embeddings (e.g., Word2Vec, GloVe). By the 2010s, these predecessors had laid groundwork for "language understanding," but were limited by task specificity (e.g., a model trained for translation could not summarize text) and data sparsity. LLMs addressed these gaps by prioritizing scale, generality, and architectural innovation—ultimately redefining the boundaries of machine language capability. - - -## 2. 
Era-by-Era Analysis: The Evolution of LLMs - -### 2.1 Pre-2017: Pre-Transformer Foundations (1950s–2016) -The roots of LLMs lie in mid-20th-century NLP, when researchers first sought to automate language tasks. Early efforts relied on rule-based systems (e.g., 1950s machine translation using syntax rules) and statistical methods (e.g., 1990s n-gram models for speech recognition). By the 2010s, neural networks gained traction: RNNs and long short-term memory (LSTM) models (Hochreiter & Schmidhuber, 1997) enabled sequence modeling, while word embeddings (Mikolov et al., 2013) represented words as dense vectors, capturing semantic relationships. - -Despite progress, pre-2017 models faced critical limitations: RNNs/LSTMs processed text sequentially, making them slow to train and unable to handle long-range dependencies (e.g., linking "it" in a sentence to a noun paragraphs earlier). Data was also constrained: models like Word2Vec trained on millions, not billions, of tokens. These bottlenecks set the stage for a paradigm shift. - - -### 2.2 2017–2020: The Transformer Revolution and Early LLMs -The year 2017 marked the dawn of the LLM era with the publication of *"Attention Is All You Need"* (Vaswani et al.), which introduced the **transformer architecture**. Unlike RNNs, transformers use "self-attention" mechanisms to weigh the importance of different words in a sequence simultaneously, enabling parallel computation and capturing long-range dependencies. This breakthrough reduced training time and improved performance on language tasks. - -#### Key Models and Breakthroughs: -- **2018**: OpenAI released **GPT-1** (Radford et al.), the first transformer-based LLM. With 124 million parameters, it introduced the "pretraining-fine-tuning" paradigm: pretraining on a large unlabeled corpus (BooksCorpus) to learn general language patterns, then fine-tuning on task-specific labeled data (e.g., sentiment analysis). 
-- **2018**: Google published **BERT** (Devlin et al.), a bidirectional transformer that processed text from left-to-right *and* right-to-left, outperforming GPT-1 on context-dependent tasks like question answering. BERT’s success popularized "contextual embeddings," where word meaning depends on surrounding text (e.g., "bank" as a financial institution vs. a riverbank). -- **2019**: OpenAI scaled up with **GPT-2** (1.5 billion parameters), demonstrating improved text generation but sparking early concerns about misuse (OpenAI initially delayed full release over fears of disinformation). -- **2020**: Google’s **T5** (Text-to-Text Transfer Transformer) unified NLP tasks under a single "text-to-text" framework (e.g., translating "translate English to French: Hello" to "Bonjour"), simplifying model adaptation. - - -### 2.3 2020–Present: Scaling, Emergence, and Mainstream Adoption -The 2020s saw LLMs transition from research curiosities to global phenomena, driven by exponential scaling of parameters, data, and compute. - -#### Key Developments: -- **2020**: OpenAI’s **GPT-3** (175 billion parameters) marked a turning point. Trained on 45 terabytes of text, it exhibited "few-shot" and "zero-shot" learning—adapting to tasks with minimal examples (e.g., "Write a poem about AI" with no prior poetry training). GPT-3’s release via API (OpenAI Playground) introduced LLMs to developers, enabling early applications like chatbots and code generation. -- **2022**: **ChatGPT** (based on GPT-3.5) brought LLMs to the public. Launched in November, its user-friendly interface and conversational ability sparked a viral explosion (100 million users by January 2023). ChatGPT refined training with **Reinforcement Learning from Human Feedback (RLHF)**, aligning outputs with human preferences (e.g., helpfulness, safety). 
-- **2023**: Meta released **Llama 2** (7B–70B parameters), an open-source LLM that lowered barriers to entry, allowing researchers and startups to fine-tune models without proprietary access. Meanwhile, OpenAI’s **GPT-4** (100B+ parameters) expanded multimodality (text + images) and improved reasoning (e.g., solving math problems, coding). -- **2023–2024**: The "race to scale" continued with models like Google’s **PaLM 2** (540B parameters), Anthropic’s **Claude 2** (200B+ parameters), and open-source alternatives (e.g., Mistral, Falcon). Compute usage skyrocketed: training GPT-3 required ~3.14e23 floating-point operations (FLOPs), equivalent to 355 years of a single GPU’s work. - - -## 3. Key Technical Milestones -### 3.1 The Transformer Architecture (2017) -Vaswani et al.’s *"Attention Is All You Need"* (Google, University of Toronto) replaced RNNs with self-attention, a mechanism that computes "attention scores" between every pair of words in a sequence. For example, in "The cat sat on the mat; it purred," self-attention links "it" to "cat." This parallel processing reduced training time from weeks (for RNNs) to days, enabling larger models. - -### 3.2 Pretraining-Fine-Tuning Paradigm (2018) -GPT-1 and BERT established the now-standard workflow: (1) Pretrain on a large, unlabeled corpus (e.g., Common Crawl, a web scrape of 1.1 trillion tokens) to learn syntax, semantics, and world knowledge; (2) Fine-tune on task-specific data (e.g., GLUE, a benchmark of 10 NLP tasks). This decoupled language learning from task engineering, enabling generalization. - -### 3.3 Scaling Laws and Emergent Abilities (2020s) -In 2020, OpenAI researchers articulated **scaling laws**: model performance improves predictably with increased parameters, data, and compute. By 2022, this led to "emergent abilities"—skills not present in smaller models, such as GPT-3’s in-context learning or GPT-4’s multi-step reasoning. 
- -### 3.4 Instruction Tuning and RLHF (2022) -Post-2020, training shifted from task-specific fine-tuning to **instruction tuning** (training on natural language instructions like "Summarize this article") and **RLHF** (rewarding models for human-preferred outputs). These methods made LLMs more usable: ChatGPT, for instance, follows prompts like "Explain quantum physics like I’m 5" without explicit fine-tuning. - - -## 4. Stakeholders: The Ecosystem of LLM Development -LLM evolution has been driven by a mix of tech giants, academic labs, and startups, each with distinct priorities: - -### 4.1 Tech Giants: Closed vs. Open Models -- **OpenAI** (founded 2015, backed by Microsoft): Pioneered the GPT series, prioritizing commercialization via closed APIs (e.g., ChatGPT Plus, GPT-4 API). Focus: user-friendliness and safety (via RLHF). -- **Google DeepMind**: Developed BERT, T5, and PaLM, integrating LLMs into products like Google Search (via BERT) and Bard. Balances closed (PaLM) and open (T5) models. -- **Meta AI**: Advocated for open science with Llama 1/2 (2023), releasing weights for research and commercial use. Meta’s "open" approach aims to democratize LLM access and accelerate safety research. -- **Microsoft**: Partnered with OpenAI (2019–present), providing Azure compute and integrating GPT into Bing (search), Office (Copilot), and GitHub (Copilot X for coding). - -### 4.2 Academic Labs -- **Stanford NLP**: Contributed to BERT and T5 research; developed HELM (Holistic Evaluation of Language Models), a benchmark for LLM safety and fairness. -- **UC Berkeley**: Studied LLM bias (e.g., 2021 paper "On the Dangers of Stochastic Parrots," critiquing LLMs as "statistical mimics" lacking true understanding). - - -## 5. Impact & Societal Context -### 5.1 Transforming NLP and Beyond -LLMs have redefined NLP performance: By 2023, GPT-4 outperformed humans on the MMLU benchmark (a test of 57 subjects, including math, law, and biology), scoring 86.4% vs. 86.5% for humans. 
Beyond NLP, they have revolutionized: -- **Content Creation**: Tools like Jasper and Copy.ai automate marketing copy; artists use DALL-E (paired with LLMs) for text-to-image generation. -- **Education**: Khan Academy’s Khanmigo tutors students; Coursera uses LLMs for personalized feedback. -- **Coding**: GitHub Copilot (2021) generates code from comments, boosting developer productivity by 55% (Microsoft, 2023). - -### 5.2 Cultural Shifts -- **Prompt Engineering**: The rise of "prompt engineers"—professionals skilled in crafting text inputs to elicit desired LLM outputs—became a new career path. -- **AI-Native Products**: Startups like Character.AI (chatbots with distinct personalities) and Perplexity (AI-powered search) emerged as "LLM-first" services. -- **Public Perception**: Post-ChatGPT, LLMs shifted from "AI hype" to tangible utility, though skepticism persists (e.g., 62% of U.S. adults worry about job displacement, Pew Research, 2023). - - -## 6. Challenges & Critiques: A Historical Perspective -### 6.1 Technical Limitations -- **Pre-2020**: Data sparsity (small corpora limited generalization); task specificity (models like BERT required retraining for new tasks). -- **Post-2020**: **Hallucinations** (fabricating facts, e.g., GPT-3 citing fake research papers); **energy use** (training GPT-3 emitted ~500 tons of CO₂, equivalent to 125 round-trip flights from NYC to London); **computational inequality** (only tech giants can afford 100B+ parameter models). - -### 6.2 Societal Risks -- **Bias**: Early LLMs mirrored training data biases (e.g., BERT associated "doctor" with "male" in 2019 audits). Responses included bias mitigation datasets (e.g., WinoBias) and audits (e.g., Stanford’s Gender Shades). -- **Misinformation**: GPT-2’s realistic text generation prompted calls for regulation; by 2023, deepfakes (e.g., AI-generated political speeches) became a policy focus. 
-- **Regulation**: The EU AI Act (2024) classified LLMs as "high-risk," requiring transparency (e.g., disclosing AI-generated content) and safety testing. - - -## 7. Conclusion: A Revolution in Five Years -The history of LLMs is a story of exponential progress: from the transformer’s 2017 invention to ChatGPT’s 2022 viral explosion, a mere five years. What began as an academic breakthrough—parallelizing text processing with self-attention—evolved into a technology that writes code, tutors students, and shapes global policy. - -Yet challenges persist: scaling has outpaced our understanding of how LLMs "think," and debates over bias, energy use, and access (closed vs. open models) intensify. As we look to the future, this history reminds us that LLMs are not just technical achievements, but mirrors of society—reflecting both our ingenuity and our flaws. Their next chapter will depend on balancing innovation with responsibility, ensuring these models serve as tools for collective progress. - - -## References -- Devlin, J., et al. (2018). *BERT: Pre-training of deep bidirectional transformers for language understanding*. NAACL. -- Hochreiter, S., & Schmidhuber, J. (1997). *Long short-term memory*. Neural Computation. -- Mikolov, T., et al. (2013). *Efficient estimation of word representations in vector space*. ICLR. -- Radford, A., et al. (2018). *Improving language understanding by generative pre-training*. OpenAI. -- Vaswani, A., et al. (2017). *Attention is all you need*. NeurIPS. -- Weidinger, L., et al. (2021). *On the dangers of stochastic parrots: Can language models be too big?*. ACM FAccT. -=========== -Agent[WriterAgent]: - -=========== -Agent[WriterAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -``` - -## WithDeterministicTransferTo - -### 什么是 WithDeterministicTransferTo? 
- -`WithDeterministicTransferTo` 是 Eino ADK 提供的 Agent 增强工具,用于为 Agent 注入任务转让(Transfer)能力 。它允许开发者为目标 Agent 预设固定的任务转让路径,当该 Agent 完成任务(未被中断)时,会自动生成 Transfer 事件,将任务流转到预设的目标 Agent。 - -这一能力是构建 Supervisor Agent 协作模式的基础,确保子 Agent 在执行完毕后能可靠地将任务控制权交回监督者(Supervisor),形成“分配-执行-反馈”的闭环协作流程。 - -### WithDeterministicTransferTo 核心实现 - -#### 配置结构 - -通过 `DeterministicTransferConfig` 定义任务转让的核心参数: - -```go -// 包装方法 -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent - -// 配置详情 -type DeterministicTransferConfig struct { - Agent Agent // 被增强的目标 Agent - ToAgentNames []string // 任务完成后转让的目标 Agent 名称列表 -} -``` - -- `Agent`:需要添加转让能力的原始 Agent。 -- `ToAgentNames`:当 `Agent` 完成任务且未中断时,自动转让任务的目标 Agent 名称列表(按顺序转让)。 - -#### Agent 包装 - -WithDeterministicTransferTo 会对原始 Agent 进行包装,根据其是否实现 `ResumableAgent` 接口(支持中断与恢复),分别返回 `agentWithDeterministicTransferTo` 或 `resumableAgentWithDeterministicTransferTo` 实例,确保增强能力与 Agent 原有功能(如 `Resume` 方法)兼容。 - -包装后的 Agent 会覆盖 `Run` 方法(对 `ResumableAgent` 还会覆盖 `Resume` 方法),在原始 Agent 的事件流基础上追加 Transfer 事件: - -```go -// 对普通 Agent 的包装 -type agentWithDeterministicTransferTo struct { - agent Agent // 原始 Agent - toAgentNames []string // 目标 Agent 名称列表 -} - -// Run 方法:执行原始 Agent 任务,并在任务完成后追加 Transfer 事件 -func (a *agentWithDeterministicTransferTo) Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Run(ctx, input, options...) 
- - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - - // 异步处理原始事件流,并追加 Transfer 事件 - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - - return iterator -} -``` - -对于 `ResumableAgent`,额外实现 `Resume` 方法,确保恢复执行后仍能触发确定性转让: - -```go -type resumableAgentWithDeterministicTransferTo struct { - agent ResumableAgent // 支持恢复的原始 Agent - toAgentNames []string // 目标 Agent 名称列表 -} - -// Resume 方法:恢复执行原始 Agent 任务,并在完成后追加 Transfer 事件 -func (a *resumableAgentWithDeterministicTransferTo) Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Resume(ctx, info, opts...) - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - return iterator -} -``` - -#### 事件流追加 Transfer 事件 - -`appendTransferAction` 是实现确定性转让的核心逻辑,它会消费原始 Agent 的事件流,在 Agent 任务正常结束(未中断)后,自动生成并发送 Transfer 事件到目标 Agent: - -```go -func appendTransferAction(ctx context.Context, aIter *AsyncIterator[*AgentEvent], generator *AsyncGenerator[*AgentEvent], toAgentNames []string) { - defer func() { - // 异常处理:捕获 panic 并通过事件传递错误 - if panicErr := recover(); panicErr != nil { - generator.Send(&AgentEvent{Err: safe.NewPanicErr(panicErr, debug.Stack())}) - } - generator.Close() // 事件流结束,关闭生成器 - }() - - interrupted := false - - // 1. 转发原始 Agent 的所有事件 - for { - event, ok := aIter.Next() - if !ok { // 原始事件流结束 - break - } - generator.Send(event) // 转发事件给调用方 - - // 检查是否发生中断(如 InterruptAction) - if event.Action != nil && event.Action.Interrupted != nil { - interrupted = true - } else { - interrupted = false - } - } - - // 2. 
若未中断且存在目标 Agent,生成 Transfer 事件 - if !interrupted && len(toAgentNames) > 0 { - for _, toAgentName := range toAgentNames { - // 生成转让消息(系统提示 + Transfer 动作) - aMsg, tMsg := GenTransferMessages(ctx, toAgentName) - // 发送系统提示事件(告知用户任务转让) - aEvent := EventFromMessage(aMsg, nil, schema.Assistant, "") - generator.Send(aEvent) - // 发送 Transfer 动作事件(触发任务转让) - tEvent := EventFromMessage(tMsg, nil, schema.Tool, tMsg.ToolName) - tEvent.Action = &AgentAction{ - TransferToAgent: &TransferToAgentAction{ - DestAgentName: toAgentName, // 目标 Agent 名称 - }, - } - generator.Send(tEvent) - } - } -} -``` - -**关键逻辑**: - -- **事件转发**:原始 Agent 产生的所有事件(如思考、工具调用、输出结果)会被完整转发,确保业务逻辑不受影响。 -- **中断检查**:若 Agent 执行过程中被中断(如 `InterruptAction`),则不触发 Transfer(中断视为任务未正常完成)。 -- **Transfer 事件生成**:任务正常结束后,为每个 `ToAgentNames` 生成两条事件: - 1. 系统提示事件(`schema.Assistant` 角色):告知用户任务将转让给目标 Agent。 - 2. Transfer 动作事件(`schema.Tool` 角色):携带 `TransferToAgentAction`,触发 ADK 运行时将任务转让给 `DestAgentName` 对应的 Agent。 - -## 总结 - -WithDeterministicTransferTo 为 Agent 提供了可靠的任务转让能力,是构建 Supervisor 模式的核心基石;而 Supervisor 模式通过中心化协调与确定性回调,实现了多 Agent 之间的高效协作,显著降低了复杂任务的开发与维护成本。结合两者,开发者可快速搭建灵活、可扩展的多 Agent 系统。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md deleted file mode 100644 index f03c7c390c9..00000000000 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md +++ /dev/null @@ -1,1265 +0,0 @@ ---- -Description: "" -date: "2026-03-09" -lastmod: "" -tags: [] -title: Workflow Agents -weight: 2 ---- - -# Workflow Agents 概述 - -## 导入路径 - -`import ``github.com/cloudwego/eino/adk` - -## 什么是 Workflow Agents - -Workflow Agents 是 eino ADK 中的一种特殊 Agent 类型,它允许开发者以预设的流程来组织和执行多个子 Agent。 - -与基于 LLM 自主决策的 Transfer 模式不同,Workflow Agents 采用**预设决策**的方式,按照代码中定义好的执行流程来运行子 Agent,提供了更可预测和可控的多 Agent 协作方式。 - -Eino ADK 提供了三种基础的 Workflow Agent 类型: - -- **SequentialAgent**:按顺序依次执行子 Agent -- **LoopAgent**:循环执行子 Agent 序列 -- 
**ParallelAgent**:并发执行多个子 Agent - -这些 Workflow Agent 可以相互嵌套,构建更复杂的执行流程,满足各种业务场景需求。 - -# SequentialAgent - -## 功能 - -SequentialAgent 是最基础的 Workflow Agent,它按照配置中提供的顺序,依次执行一系列子 Agent。每个子 Agent 执行完成后,其输出会通过 History 机制传递给下一个子 Agent,形成一个线性的执行链。 - - - -```go -type SequentialAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 子 Agent 列表,按执行顺序排列 -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -SequentialAgent 的执行遵循以下设定: - -1. **线性执行**:严格按照 SubAgents 数组的顺序执行 -2. **History 传递**:每个 Agent 的执行结果都会被添加到 History 中,后续 Agent 可以访问前面 Agent 的执行历史 -3. **提前退出**:如果任何一个子 Agent 产生 ExitAction / Interrupt,整个 Sequential 流程会立即终止 - -SequentialAgent 适用于以下场景: - -- **多步骤处理流程**:如数据预处理 -> 分析 -> 生成报告 -- **管道式处理**:每个步骤的输出作为下个步骤的输入 -- **有依赖关系的任务序列**:后续任务依赖前面任务的结果 - -## 示例 - -示例展示了如何使用 SequentialAgent 创建一个三步骤的文档处理流水线: - -1. **DocumentAnalyzer**:分析文档内容 -2. **ContentSummarizer**:总结分析结果 -3. **ReportGenerator**:生成最终报告 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -// 创建 ChatModel 实例 -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 文档分析 Agent -func NewDocumentAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "DocumentAnalyzer", - Description: "分析文档内容并提取关键信息", - Instruction: "你是一个文档分析专家。请仔细分析用户提供的文档内容,提取其中的关键信息、主要观点和重要数据。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 内容总结 Agent -func NewContentSummarizerAgent() adk.Agent { - a, err := 
adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ContentSummarizer", - Description: "对分析结果进行总结", - Instruction: "基于前面的文档分析结果,生成一个简洁明了的总结,突出最重要的发现和结论。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 报告生成 Agent -func NewReportGeneratorAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportGenerator", - Description: "生成最终的分析报告", - Instruction: "基于前面的分析和总结,生成一份结构化的分析报告,包含执行摘要、详细分析和建议。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建三个处理步骤的 Agent - analyzer := NewDocumentAnalyzerAgent() - summarizer := NewContentSummarizerAgent() - generator := NewReportGeneratorAgent() - - // 创建 SequentialAgent - sequentialAgent, err := adk.NewSequentialAgent(ctx, &adk.SequentialAgentConfig{ - Name: "DocumentProcessingPipeline", - Description: "文档处理流水线:分析 → 总结 → 报告生成", - SubAgents: []adk.Agent{analyzer, summarizer, generator}, - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: sequentialAgent, - }) - - // 执行文档处理流程 - input := "请分析以下市场报告:2024年第三季度,公司营收增长15%,主要得益于新产品线的成功推出。但运营成本也上升了8%,需要优化效率。" - - fmt.Println("开始执行文档处理流水线...") - iter := runner.Query(ctx, input) - - stepCount := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== 步骤 %d: %s ===\n", stepCount, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - stepCount++ - } - } - - fmt.Println("\n文档处理流水线执行完成!") -} -``` - -运行结果为: - -```markdown -开始执行文档处理流水线... - -=== 步骤 1: DocumentAnalyzer === -市场报告关键信息分析: - -1. 营收增长情况: - - 2024年第三季度,公司营收同比增长15%。 - - 营收增长的主要驱动力是新产品线的成功推出。 - -2. 
成本情况: - - 运营成本上涨了8%。 - - 成本上升提醒公司需要进行效率优化。 - -主要观点总结: -- 新产品线推出显著推动了营收增长,显示公司在产品创新方面取得良好成果。 -- 虽然营收提升,但运营成本的增加在一定程度上影响了盈利能力,指出了提升运营效率的重要性。 - -重要数据: -- 营收增长率:15% -- 运营成本增长率:8% - -=== 步骤 2: ContentSummarizer === -总结:2024年第三季度,公司实现了15%的营收增长,主要归功于新产品线的成功推出,体现了公司产品创新能力的显著提升。然而,运营成本同时上涨了8%,对盈利能力构成一定压力,强调了优化运营效率的迫切需求。整体来看,公司在增长与成本控制之间需寻求更好的平衡以保障持续健康发展。 - -=== 步骤 3: ReportGenerator === -分析报告 - -一、执行摘要 -2024年第三季度,公司实现营收同比增长15%,主要得益于新产品线的成功推出,展现了强劲的产品创新能力。然而,运营成本也同比提升了8%,对利润空间形成一定压力。为确保持续的盈利增长,需重点关注运营效率的优化,推动成本控制与收入增长的平衡发展。 - -二、详细分析 -1. 营收增长分析 -- 公司营收增长15%,反映出新产品线市场接受度良好,有效拓展了收入来源。 -- 新产品线的推出体现了公司研发及市场响应能力的提升,为未来持续增长奠定基础。 - -2. 运营成本情况 -- 运营成本上升8%,可能来自原材料价格上涨、生产效率下降或销售推广费用增加等多个方面。 -- 该成本提升在一定程度上抵消了收入增长带来的利润增益,影响整体盈利能力。 - -3. 盈利能力及效率考量 -- 营收与成本增长的不匹配显示出当前运营效率存在改进空间。 -- 优化供应链管理、提升生产自动化及加强成本控制将成为关键措施。 - -三、建议 -1. 加强新产品线后续支持,包括市场推广和客户反馈机制,持续推动营收增长。 -2. 深入分析运营成本构成,识别主要成本驱动因素,制定针对性降低成本的策略。 -3. 推动内部流程优化与技术升级,提升生产及运营效率,缓解成本压力。 -4. 建立动态的财务监控体系,实现对营收与成本的实时跟踪与调整,确保公司财务健康。 - -四、结论 -公司在2024年第三季度展现出了良好的增长动力,但同时面临成本上升带来的挑战。通过持续的产品创新结合有效的成本管理,未来有望实现盈利能力和市场竞争力的双重提升,推动公司稳健发展。 - -文档处理流水线执行完成! -``` - -# LoopAgent - -## 功能 - -LoopAgent 基于 SequentialAgent 实现,它会重复执行配置的子 Agent 序列,直到达到最大迭代次数或某个子 Agent 产生 ExitAction。LoopAgent 特别适用于需要迭代优化、反复处理或持续监控的场景。 - - - -```go -type LoopAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 子 Agent 列表 - MaxIterations int // 最大迭代次数,0 表示无限循环 -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -LoopAgent 的执行遵循以下设定: - -1. **循环执行**:重复执行 SubAgents 序列,每次循环都是一个完整的 Sequential 执行过程 -2. **History 累积**:每次迭代的结果都会累积到 History 中,后续迭代可以访问所有历史信息 -3. **条件退出**:支持通过 ExitAction 或达到最大迭代次数来终止循环,配置 `MaxIterations=0` 时表示无限循环 - -LoopAgent 适用于以下场景: - -- **迭代优化**:如代码优化、参数调优等需要反复改进的任务 -- **持续监控**:定期检查状态并执行相应操作 -- **反复处理**:需要多轮处理才能达到满意结果的任务 -- **自我改进**:Agent 根据前面的执行结果不断改进自己的输出 - -## 示例 - -示例展示了如何使用 LoopAgent 创建一个代码优化循环: - -1. **CodeAnalyzer**:分析代码问题 -2. **CodeOptimizer**:根据分析结果优化代码 -3. 
**ExitController**:判断是否需要退出循环 - -循环会持续执行直到代码质量达到标准或达到最大迭代次数。 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 代码分析 Agent -func NewCodeAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeAnalyzer", - Description: "分析代码质量和性能问题", - Instruction: `你是一个代码分析专家。请分析提供的代码,识别以下问题: -1. 性能瓶颈 -2. 代码重复 -3. 可读性问题 -4. 潜在的 bug -5. 不符合最佳实践的地方 - -如果代码已经足够优秀,请输出 "EXIT: 代码质量已达到标准" 来结束优化流程。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 代码优化 Agent -func NewCodeOptimizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeOptimizer", - Description: "根据分析结果优化代码", - Instruction: `基于前面的代码分析结果,对代码进行优化改进: -1. 修复识别出的性能问题 -2. 消除代码重复 -3. 提高代码可读性 -4. 修复潜在 bug -5. 
应用最佳实践 - -请提供优化后的完整代码。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 创建一个特殊的 Agent 来处理退出逻辑 -func NewExitControllerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ExitController", - Description: "控制优化循环的退出", - Instruction: `检查前面的分析结果,如果代码分析师认为代码质量已达到标准(包含"EXIT"关键词), -则输出 "TERMINATE" 并生成退出动作来结束循环。否则继续下一轮优化。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建优化流程的 Agent - analyzer := NewCodeAnalyzerAgent() - optimizer := NewCodeOptimizerAgent() - controller := NewExitControllerAgent() - - // 创建 LoopAgent,最多执行 5 轮优化 - loopAgent, err := adk.NewLoopAgent(ctx, &adk.LoopAgentConfig{ - Name: "CodeOptimizationLoop", - Description: "代码优化循环:分析 → 优化 → 检查退出条件", - SubAgents: []adk.Agent{analyzer, optimizer, controller}, - MaxIterations: 5, // 最多 5 轮优化 - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: loopAgent, - }) - - // 待优化的代码示例 - codeToOptimize := ` -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -` - - fmt.Println("开始代码优化循环...") - iter := runner.Query(ctx, "请优化以下 Go 代码:\n"+codeToOptimize) - - iteration := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== 第 %d 轮 - %s ===\n", iteration, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - - // 检查是否需要退出 - if event.AgentName == "ExitController" { - if event.Action != nil && event.Action.Exit { - fmt.Println("\n优化循环提前结束!") - break - } - iteration++ - } - } - } - - fmt.Println("\n代码优化循环执行完成!") -} -``` - -运行结果为: - 
-```java -开始代码优化循环... - -=== 第 1 轮 - CodeAnalyzer === -分析提供的代码: - -```go -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -``` - -### 1. 性能瓶颈 - -- 双层循环,时间复杂度为 O(n²),对于较大的数据量,性能不佳。 -- 内层循环当条件满足时立即 break,减少了部分不必要的比较,但整体仍然是二次复杂度。 - -### 2. 代码重复 - -- 代码没有明显重复,但逻辑可以简化。 - -### 3. 可读性问题 - -- 代码的意图不太明确。根据代码逻辑,函数筛选出那些在数组里至少有一个元素比它小的元素。换句话说,剔除了数组里所有等于或最小的数字。 -- 可以通过注释或重命名来增加可读性。 - -### 4. 潜在的 bug - -- 目前没有明显的 bug,但输入为空数组时,返回空数组符合预期。 - -### 5. 不符合最佳实践的地方 - -- 过程较为低效,未使用 Go 语言的特性优化性能。 -- 缺少注释或文档说明。 - ---- - -## 优化建议 - -如果目的确实是要返回所有比数组中某些元素大的元素(排除数组中的最小元素),可以改为: - -- 找出数组中的最小元素 minVal。 -- 遍历数组,将大于 minVal 的元素加入结果。 -- 这样时间复杂度降为 O(n)。 - -优化代码: - -```go -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -### 优化点说明 - -- 使用一次遍历确定最小值,O(n)。 -- 再次遍历构造结果,O(n)。 -- 适当使用切片容量初始化,减少内存分配。 -- 代码意图明确且性能提升明显。 - ---- - -如果你期望的是其他逻辑,请告知我进一步调整。 - -=== 第 1 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // 找出切片中的最小值 - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // 收集所有大于最小值的元素 - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 1 轮 - ExitController === -TERMINATE - -=== 第 2 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 2 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // 找出切片中的最小值 
- minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // 收集所有大于最小值的元素 - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 2 轮 - ExitController === -TERMINATE - -=== 第 3 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 3 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 3 轮 - ExitController === -TERMINATE - -=== 第 4 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 4 轮 - CodeOptimizer === - -```go -// processData 返回切片中所有大于最小元素的元素。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 4 轮 - ExitController === -TERMINATE - -=== 第 5 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 5 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 5 轮 - ExitController === -TERMINATE - -代码优化循环执行完成! 
- -``` - - - - -## BreakLoop - - -在 Loop Agent 中,当某个 Agent 需要中断循环运行时,您可以使用 ADK 提供的对应 Break Action。 - -```go -// BreakLoopAction is a programmatic-only agent action used to prematurely -// terminate the execution of a loop workflow agent. -// When a loop workflow agent receives this action from a sub-agent, it will stop its -// current iteration and will not proceed to the next one. -// It will mark the BreakLoopAction as Done, signalling to any 'upper level' loop agent -// that this action has been processed and should be ignored further up. -// This action is not intended to be used by LLMs. -type BreakLoopAction struct { - // From records the name of the agent that initiated the break loop action. - From string - // Done is a state flag that can be used by the framework to mark when the - // action has been handled. - Done bool - // CurrentIterations is populated by the framework to record at which - // iteration the loop was broken. - CurrentIterations int -} - -// NewBreakLoopAction creates a new BreakLoopAction, signaling a request -// to terminate the current loop. -func NewBreakLoopAction(agentName string) *AgentAction { - return &AgentAction{BreakLoop: &BreakLoopAction{ - From: agentName, - }} -} -``` - -Break Action 在达到中断目的的同时不影响 Loop Agent 外的其他 Agent 运行,而 Exit Action 会立刻中断所有后续的 Agent 运行。 - -以下图为例: - - - -- 当 Agent1 发出 BreakAction 时,Loop Agent 将中断,Sequential 继续运行 Agent3 -- 当 Agent1 发出 ExitAction 时,Sequential 运行流程整体终止,Agent2 / Agent3 均不会运行 - -# ParallelAgent - -## 功能 - -ParallelAgent 允许多个子 Agent 基于相同的输入上下文并发执行,所有子 Agent 同时开始执行,并等待全部完成后结束。这种模式特别适用于可以独立并行处理的任务,能够显著提高执行效率。 - - - -```go -type ParallelAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 并发执行的子 Agent 列表 -} - -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` - -ParallelAgent 的执行遵循以下设定: - -1. **并发执行**:所有子 Agent 同时启动,在独立的 goroutine 中并行执行 -2. **共享输入**:所有子 Agent 接收相同的初始输入和上下文 -3. 
**等待与结果聚合**:内部使用 sync.WaitGroup 等待所有子 Agent 执行完成,收集所有子 Agent 的执行结果并按接收顺序输出 - -另外 Parallel 内部默认包含异常处理机制: - -- **Panic 恢复**:每个 goroutine 都有独立的 panic 恢复机制 -- **错误隔离**:单个子 Agent 的错误不会影响其他子 Agent 的执行 -- **中断处理**:支持子 Agent 的中断和恢复机制 - -ParallelAgent 适用于以下场景: - -- **独立任务并行处理**:多个不相关的任务可以同时执行 -- **多角度分析**:从不同角度同时分析同一个问题 -- **性能优化**:通过并行执行减少总体执行时间 -- **多专家咨询**:同时咨询多个专业领域的 Agent - -## 示例 - -示例展示了如何使用 ParallelAgent 同时从四个不同角度分析产品方案: - -1. **TechnicalAnalyst**:技术可行性分析 -2. **BusinessAnalyst**:商业价值分析 -3. **UXAnalyst**:用户体验分析 -4. **SecurityAnalyst**:安全风险分析 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "sync" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 技术分析 Agent -func NewTechnicalAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "TechnicalAnalyst", - Description: "从技术角度分析内容", - Instruction: `你是一个技术专家。请从技术实现、架构设计、性能优化等技术角度分析提供的内容。 -重点关注: -1. 技术可行性 -2. 架构合理性 -3. 性能考量 -4. 技术风险 -5. 实现复杂度`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 商业分析 Agent -func NewBusinessAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "BusinessAnalyst", - Description: "从商业角度分析内容", - Instruction: `你是一个商业分析专家。请从商业价值、市场前景、成本效益等商业角度分析提供的内容。 -重点关注: -1. 商业价值 -2. 市场需求 -3. 竞争优势 -4. 成本分析 -5. 
盈利模式`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 用户体验分析 Agent -func NewUXAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "UXAnalyst", - Description: "从用户体验角度分析内容", - Instruction: `你是一个用户体验专家。请从用户体验、易用性、用户满意度等角度分析提供的内容。 -重点关注: -1. 用户友好性 -2. 操作便利性 -3. 学习成本 -4. 用户满意度 -5. 可访问性`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 安全分析 Agent -func NewSecurityAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "SecurityAnalyst", - Description: "从安全角度分析内容", - Instruction: `你是一个安全专家。请从信息安全、数据保护、隐私合规等安全角度分析提供的内容。 -重点关注: -1. 数据安全 -2. 隐私保护 -3. 访问控制 -4. 安全漏洞 -5. 合规要求`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建四个不同角度的分析 Agent - techAnalyst := NewTechnicalAnalystAgent() - bizAnalyst := NewBusinessAnalystAgent() - uxAnalyst := NewUXAnalystAgent() - secAnalyst := NewSecurityAnalystAgent() - - // 创建 ParallelAgent,同时进行多角度分析 - parallelAgent, err := adk.NewParallelAgent(ctx, &adk.ParallelAgentConfig{ - Name: "MultiPerspectiveAnalyzer", - Description: "多角度并行分析:技术 + 商业 + 用户体验 + 安全", - SubAgents: []adk.Agent{techAnalyst, bizAnalyst, uxAnalyst, secAnalyst}, - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: parallelAgent, - }) - - // 要分析的产品方案 - productProposal := ` -产品方案:智能客服系统 - -概述:开发一个基于大语言模型的智能客服系统,能够自动回答用户问题,处理常见业务咨询,并在必要时转接人工客服。 - -主要功能: -1. 自然语言理解和回复 -2. 多轮对话管理 -3. 知识库集成 -4. 情感分析 -5. 人工客服转接 -6. 对话历史记录 -7. 
多渠道接入(网页、微信、APP) - -技术架构: -- 前端:React + TypeScript -- 后端:Go + Gin 框架 -- 数据库:PostgreSQL + Redis -- AI模型:GPT-4 API -- 部署:Docker + Kubernetes -` - - fmt.Println("开始多角度并行分析...") - iter := runner.Query(ctx, "请分析以下产品方案:\n"+productProposal) - - // 使用 map 来收集不同分析师的结果 - results := make(map[string]string) - var mu sync.Mutex - - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Printf("分析过程中出现错误: %v", event.Err) - continue - } - - if event.Output != nil && event.Output.MessageOutput != nil { - mu.Lock() - results[event.AgentName] = event.Output.MessageOutput.Message.Content - mu.Unlock() - - fmt.Printf("\n=== %s 分析完成 ===\n", event.AgentName) - } - } - - // 输出所有分析结果 - fmt.Println("\n" + "============================================================") - fmt.Println("多角度分析结果汇总") - fmt.Println("============================================================") - - analysisOrder := []string{"TechnicalAnalyst", "BusinessAnalyst", "UXAnalyst", "SecurityAnalyst"} - analysisNames := map[string]string{ - "TechnicalAnalyst": "技术分析", - "BusinessAnalyst": "商业分析", - "UXAnalyst": "用户体验分析", - "SecurityAnalyst": "安全分析", - } - - for _, agentName := range analysisOrder { - if result, exists := results[agentName]; exists { - fmt.Printf("\n【%s】\n", analysisNames[agentName]) - fmt.Printf("%s\n", result) - fmt.Println("----------------------------------------") - } - } - - fmt.Println("\n多角度并行分析完成!") - fmt.Printf("共收到 %d 个分析结果\n", len(results)) -} -``` - -运行结果为: - -```markdown -开始多角度并行分析... - -=== BusinessAnalyst 分析完成 === - -=== UXAnalyst 分析完成 === - -=== SecurityAnalyst 分析完成 === - -=== TechnicalAnalyst 分析完成 === - -============================================================ -多角度分析结果汇总 -============================================================ - -【技术分析】 -针对该智能客服系统方案,下面从技术实现、架构设计及性能优化等角度进行详细分析: - ---- - -### 一、技术可行性 - -1. **自然语言理解和回复** - - 利用 GPT-4 API 实现自然语言理解和自动回复是当前成熟且可行的方案。GPT-4具备强大的语言理解和生成能力,适合处理复杂、多样的问题。 - -2. 
**多轮对话管理** - - 依赖后端维护上下文状态,结合GPT-4模型能够较好处理多轮交互。需要设计合理的上下文管理机制(例如对话历史维护、关键槽位抽取等),确保上下文信息完整性。 - -3. **知识库集成** - - 可通过向GPT-4 API添加特定的知识库检索结果(检索增强生成),或者通过本地检索接口集成知识库。技术上可行,但对于实时性和准确性有较高要求。 - -4. **情感分析** - - 情感分析功能可以用独立的轻量模型实现(例如基于BERT微调),也可尝试利用GPT-4输出,但成本较高。情感分析能力帮助智能客服更好地理解用户情绪,提升用户体验。 - -5. **人工客服转接** - - 技术上通过建立事件触发规则(如轮次数、情绪阈值、关键词检测)实现自动转人工。系统需支持工单或会话传递机制,并保障会话无缝切换。 - -6. **多渠道接入** - - 网页、微信、App等多渠道接入均可通过统一API网关实现,技术成熟,同时需要处理渠道差异性(消息格式、认证、推送机制等)。 - ---- - -### 二、架构合理性 - -- **前端 React + TypeScript** - 非常适合搭建响应式客服界面,生态成熟,方便多渠道共享组件。 - -- **后端 Go + Gin** - Go语言性能优异,Gin框架轻量且性能高,适合高并发场景。后端承担对接 GPT-4 API、管理状态、多渠道消息转发等职责,选择合理。 - -- **数据库 PostgreSQL + Redis** - - PostgreSQL 负责存储结构化数据,如用户信息、对话历史、知识库元数据。 - - Redis 负责缓存会话状态、热点知识库、限流等,提升访问性能。 - 架构设计符合常见大型互联网产品模式,组件分工明确。 - -- **AI模型 GPT-4 API** - 使用成熟API降低开发难度和模型维护成本;缺点是对网络和API调用依赖度高。 - -- **部署 Docker + Kubernetes** - 容器化和K8s编排能保证系统弹性伸缩、高可用和灰度发布,适合生产环境,符合现代微服务架构趋势。 - ---- - -### 三、性能考量 - -1. **响应时间** - - GPT-4 API调用本身有一定延迟(通常几百毫秒到1秒不等),对响应时间影响较大。需要做好接口异步处理与前端体验设计(如加载动画、部分渐进响应)。 - -2. **并发处理能力** - - 后端Go具有高并发处理优势,配合Redis缓存热点数据,能大幅提升整体吞吐能力。 - - 但GPT-4 API调用受限于OpenAI服务的QPS限制与调用成本,需合理设计调用频率与降级策略。 - -3. **缓存策略** - - 对用户对话上下文和常见问题答案进行缓存,减少重复API调用。 - - 如关键问题先做本地匹配,失败后才调用GPT-4,提升效率。 - -4. **多渠道负载均衡** - - 需要设计统一消息总线和可靠的异步队列,防止某渠道流量突增影响整体系统稳定。 - ---- - -### 四、技术风险 - -1. **GPT-4 API依赖** - - 高度依赖第三方API,风险包括服务中断、接口变更及成本波动。 - - 建议设计本地缓存和有限的替代回答逻辑以应对API异常。 - -2. **多轮对话上下文管理难度** - - 上下文过长或复杂会导致回答质量降低,需要设计限制上下文长度、选择性保留重要信息机制。 - -3. **知识库集成复杂度** - - 如何做到知识库与 ----------------------------------------- - -【商业分析】 -以下是对智能客服系统产品方案的商业角度分析: - -1. 商业价值 -- 提升客户服务效率:自动解答用户问题和常见咨询,减少人工客服压力,降低用人成本。 -- 提升用户体验:多轮对话和情感分析使交互更自然,增强客户满意度和粘性。 -- 数据驱动决策支持:对话历史与知识库集成为企业提供宝贵的用户反馈和行为数据,优化产品和服务。 -- 支持业务扩展:多渠道接入(网页、微信、APP)满足不同客户接入习惯,提升覆盖率。 - -2. 市场需求 -- 市场对智能客服的需求持续增长,特别是在电商、金融、医疗、教育等行业,客户服务自动化是企业数字化转型的重要方向。 -- 随着AI技术的成熟,企业期望借助大语言模型提升客服智能化水平。 -- 用户对即时响应、全天候服务的需求增加,推动智能客服系统的广泛采用。 - -3. 
竞争优势 -- 采用先进的GPT-4大语言模型,拥有较强的自然语言理解与生成能力,提升问答准确率和对话自然度。 -- 情感分析功能有助于精准识别用户情绪,动态调整回复策略,提高客户满意度。 -- 多渠道接入设计满足企业多元化客户触达需求,增强产品适用性。 -- 技术架构采用微服务、容器化部署,便于弹性扩展和维护,提升系统稳定性和扩展能力。 - -4. 成本分析 -- AI模型调用成本较高,依赖GPT-4 API,需根据调用量和响应速度调整预算。 -- 技术研发投入较大,涉及前后端、多渠道融合、AI和知识库管理。 -- 运维和服务器成本需考虑多渠道并发访问。 -- 长期来看,人工客服人数可显著减少,节省人力成本。 -- 可通过云服务降低硬件初期投入,但云资源使用需精细管理以控制费用。 - -5. 盈利模式 -- SaaS订阅服务:按月/年向企业客户收取服务费,基于接入渠道数、并发量和功能级别分层定价。 -- 按调用次数或对话数收费,适合业务波动较大的客户。 -- 增值服务:数据分析报告定制、行业知识库集成、人工客服协同工具等收费。 -- 中大型客户可提供定制开发和技术支持,收取项目费用。 -- 通过持续优化模型和服务,增加客户留存和续费率。 - -综上,该智能客服系统基于成熟技术与AI优势,具备良好的商业价值和市场潜力。其多渠道接入和情感分析等功能增强竞争力,但需合理控制AI调用成本和运营费用。建议重点推进SaaS订阅和增值服务,结合市场推广,快速占领客户资源,提升盈利能力。 ----------------------------------------- - -【用户体验分析】 -针对该智能客服系统方案,我将从用户体验、易用性、用户满意度及可访问性等角度进行分析: - -1. 用户友好性 -- 自然语言理解和回复能力提升了用户与系统的沟通体验,使用户能够用自然话语表达需求,降低交流障碍。 -- 多轮对话管理允许系统理解上下文,减少重复解释,增强对话连贯性,进一步提升用户体验。 -- 情感分析功能有助于系统识别用户情绪,做出更贴心的回应,提高互动的个性化和人性化。 -- 多渠道接入覆盖用户常用的访问途径,方便用户随时随地获取服务,提升友好度。 - -2. 操作便利性 -- 自动回答常见业务咨询能够减轻用户等待时间和操作负担,提高响应速度。 -- 人工客服转接机制确保复杂问题可被及时处理,保障服务连续性和操作的无缝衔接。 -- 对话历史记录方便用户回顾咨询内容,避免重复查询,提升操作便利。 -- 使用现代技术栈(React、TypeScript)为前端交互提供良好性能和响应速度,间接增强操作流畅性。 - -3. 学习成本 -- 基于自然语言处理,用户无需学习特殊指令,降低使用门槛。 -- 多轮对话自然衔接,让用户更易理解系统响应逻辑,减少迷惑和挫败感。 -- 不同渠道的一致性界面(如在网页和微信中保持类似体验)有助于用户迅速上手。 -- 通过情感分析提供的更精准反馈,减少用户因误解而频繁尝试的时间成本。 - -4. 用户满意度 -- 快速准确的自动回复和多轮对话减少用户等待和重复输入,提升满意度。 -- 情感分析让系统更懂用户情绪,带来更温暖的交互体验,增加用户粘性。 -- 人工客服介入保障复杂问题得到妥善处理,提高服务质量感知。 -- 多渠道覆盖满足不同用户的使用场景,增强整体满意度。 - -5. 可访问性 -- 多渠道接入覆盖网页、微信、APP,适应不同用户的设备和环境,提升可访问性。 -- 方案未明确提及无障碍设计(如屏幕阅读器兼容、高对比度模式等),这可能是未来需要补充的部分。 -- 前端采用React和TypeScript,有利于实现响应式设计和无障碍功能,但需确保开发规范落地。 -- 后端架构和部署方案保证系统的稳定性和扩展性,间接提升用户持续可访问性。 - -总结: -该智能客服系统方案在用户体验和易用性方面考虑较为充分,利用大语言模型实现自然多轮对话、情感分析和知识库集成,满足用户多样化需求。同时,多渠道接入增强了系统的覆盖能力。建议在具体落地时,强化无障碍设计,实现更全面的可访问性保障,同时继续优化对话策略以提升用户满意度。 ----------------------------------------- - -【安全分析】 -针对该智能客服系统方案,结合信息安全、数据保护及隐私合规等方面,展开如下分析: - -一、数据安全 - -1. 数据传输安全 -- 建议系统所有客户端与服务器间通信均采用TLS/SSL加密,保障数据在传输过程中的机密性与完整性。 -- 由于支持多渠道接入(网页、微信、APP),需确保每个入口均严格实施加密传输。 - -2. 
数据存储安全 -- PostgreSQL存储对话历史、用户资料等敏感信息,需启用数据库加密(如透明数据加密TDE或字段级加密),防止数据泄露。 -- Redis作为缓存,可能存储临时会话数据,也需开启访问认证与加密传输。 -- 对用户敏感数据实行最小存储原则,避免无关数据超范围保存。 -- 数据备份过程中需加密保存,且备份访问同样受控。 - -3. API调用安全 -- GPT-4 API调用产生大量用户数据交互,应评估其数据处理及存储政策,确保符合数据安全要求。 -- 增加调用权限管理,限制API密钥访问范围和权限,避免被滥用。 - -4. 日志安全 -- 系统日志中避免存储明文敏感信息,尤其是个人身份信息、对话内容。日志访问需严格控制。 - -二、隐私保护 - -1. 个人数据处理 -- 采集和存储用户个人数据(姓名、联系方式、账务信息等)必须明确告知用户,并征得用户同意。 -- 实施数据匿名化/去标识化技术,尤其是对话历史中的身份信息处理。 - -2. 用户隐私权利 -- 满足相关法律法规(例如《个人信息保护法》、《GDPR》)中用户的访问、更正、删除数据的权利。 -- 提供隐私政策明确披露数据收集、使用和共享情况。 - -3. 交互隐私 -- 多轮对话和情感分析等功能应考虑避免过度侵犯用户隐私,例如敏感情绪数据的使用透明告知和限制。 - -4. 第三方合规 -- GPT-4 API由第三方提供,需确保其服务符合相关隐私合规要求及数据保护标准。 - -三、访问控制 - -1. 用户身份验证 -- 系统中涉及用户身份信息查询和管理时,需建立可靠的身份认证机制。 -- 支持多因素认证增强安全性。 - -2. 权限管理 -- 后端管理接口及人工客服转接模块需采用基于角色的访问控制(RBAC),确保操作权限最小化。 -- 对访问敏感数据的操作需有详细审计和监控。 - -3. 会话管理 -- 对多渠道的会话要有有效的会话管理机制,防止会话劫持。 -- 对话历史访问权限应限制仅允许相关用户或授权人员访问。 - -四、安全漏洞 - -1. 应用安全 -- 前端React+TypeScript应防止XSS、CSRF攻击,合理使用Content Security Policy(CSP)。 -- 后端Go应用需防止SQL注入、请求伪造和权限缺失。Gin框架提供中间件支持,建议充分利用安全模块。 - -2. AI模型风险 -- GPT-4 API本身输入输出可能存在敏感信息泄露或模型误用风险,需限制输入内容、过滤敏感信息。 -- 防止生成恶意回答或信息泄露,建立内容审核机制。 - -3. 容器和部署安全 -- Docker容器须采用安全镜像,及时打补丁。Kubernetes集群网络策略和访问控制需完善。 -- 容器运行权限最小化,避免容器逃逸风险。 - -五、合规要求 - -1. 数据保护法规 -- 根据运营地域,需符合《个人信息保护法》(PIPL)、《欧盟通用数据保护条例》(GDPR)或其他相关法律要求。 -- 明确用户数据的采集、处理、传输和存储流程符合法规。 - -2. 用户隐私告知及同意 -- 应提供清晰的隐私政策和使用条款,说明数据用途及处理方式。 -- 实现用户同意管理(Consent Management)机制。 - -3. 数据跨境传输合规 -- 若系统涉及跨境数据流,需评估合规风险和采取相应技术 ----------------------------------------- - -多角度并行分析完成! 
-共收到 4 个分析结果 -``` - -# 总结 - -Workflow Agents 为 Eino ADK 提供了强大的多 Agent 协作能力,通过合理选择和组合这些 Workflow Agent,开发者可以构建出高效、可靠的多 Agent 协作系统,满足各种复杂的业务需求。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md b/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md index a3629db244d..67f85bcf201 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md @@ -1,390 +1,198 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Agent 抽象 weight: 3 --- -# Agent 定义 +# Agent 接口 -Eino 定义了 Agent 的基础接口,实现此接口的 Struct 可被视为一个 Agent: +ADK 的所有功能围绕 `Agent` 接口展开: ```go -// github.com/cloudwego/eino/adk/interface.go +// github.com/cloudwego/eino/adk -type Agent interface { +type TypedAgent[M MessageType] interface { Name(ctx context.Context) string Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] + Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] } + +// 默认类型别名(使用 *schema.Message) +type Agent = TypedAgent[*schema.Message] ``` - - - - + + + +
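-| Method | Description |
-| --- | --- |
-| Name | The Agent's name, which serves as its identifier |
-| Description | A description of the Agent's role, mainly so that other Agents can understand and judge its responsibilities or capabilities |
-| Run | The Agent's core execution method; returns an iterator through which the caller continuously receives the events the Agent produces |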
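+| Method | Description |
+| --- | --- |
+| Name | Identifying name of the Agent |
+| Description | Role description that lets other Agents or the framework understand the Agent's capabilities |
+| Run | Core execution method; returns the event stream asynchronously (Future pattern) |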
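
The Future-style contract of `Run` (start the work asynchronously, hand back an iterator right away) can be sketched with the standard library alone. This is a minimal illustration, not the eino implementation: `Event`, `Iterator`, and `EchoAgent` are hypothetical stand-ins, and the `context.Context` parameters of the real interface are omitted for brevity.

```go
package main

import "fmt"

// Event is a simplified stand-in for an agent event (not the eino type).
type Event struct {
	AgentName string
	Content   string
}

// Iterator is a minimal stand-in for an async event iterator, backed by a channel.
type Iterator struct {
	ch <-chan Event
}

// Next blocks until an event arrives; ok is false once the producer closes the stream.
func (it *Iterator) Next() (Event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// EchoAgent is a hypothetical agent that emits one event per input message.
type EchoAgent struct{ name string }

func (a *EchoAgent) Name() string        { return a.name }
func (a *EchoAgent) Description() string { return "echoes each input message as an event" }

// Run starts the work in a goroutine and returns the iterator immediately.
func (a *EchoAgent) Run(messages []string) *Iterator {
	ch := make(chan Event)
	go func() {
		defer close(ch) // closing the channel ends iteration for the consumer
		for _, m := range messages {
			ch <- Event{AgentName: a.name, Content: "echo: " + m}
		}
	}()
	return &Iterator{ch: ch}
}

// collect drains the iterator with the usual Next() loop.
func collect(it *Iterator) []string {
	var out []string
	for {
		ev, ok := it.Next()
		if !ok {
			return out
		}
		out = append(out, ev.Content)
	}
}

func main() {
	agent := &EchoAgent{name: "echo"}
	for _, line := range collect(agent.Run([]string{"hello", "world"})) {
		fmt.Println(line)
	}
}
```

The unbuffered channel means the producer only advances as fast as the caller consumes events, which is one simple way to get ordered, blocking consumption.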
    -## AgentInput - -Run 方法接收 AgentInput 作为 Agent 的输入: - -```go -type AgentInput struct { - Messages []Message - EnableStreaming bool -} - -type Message = *schema.Message -``` - -Agent 通常以 ChatModel 为核心,因此规定 Agent 的输入为 `Messages`, 与调用 Eino ChatModel 的类型相同。`Messages` 中可以包括用户指令、对话历史、背景知识、样例数据等任何你希望传递给 Agent 的数据。例如: +## MessageType 约束 ```go -import ( - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/schema" -) - -input := &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("What's the capital of France?"), - schema.AssistantMessage("The capital of France is Paris.", nil), - schema.UserMessage("How far is it from London? "), - }, +type MessageType interface { + *schema.Message | *schema.AgenticMessage } ``` -`EnableStreaming` 用于向 Agent **建议**其输出模式,但它并非一个强制性约束。它的核心思想是控制那些同时支持流式和非流式输出的组件的行为,例如 ChatModel,而仅支持一种输出方式的组件,`EnableStreaming` 不会影响他们的行为。另外在 `AgentOutput.IsStreaming` 字段会标明实际输出类型。运行表现为: +所有 ADK 泛型类型使用 `[M MessageType]` 参数化。`*schema.Message` 支持完整 ADK 特性;`*schema.AgenticMessage` 用于 v0.9 新增的结构化内容块模式。 -- 当 `EnableStreaming=false` 时,对于那些既能流式也能非流式输出的组件,此时会使用一次性返回完整结果的非流式模式。 -- 当 `EnableStreaming=true` 时,对于 Agent 内部能够流式输出的组件(如 ChatModel 调用),应以流的形式逐步返回结果。如果某个组件天然不支持流式,它仍然可以按其原有的非流式方式工作。 +## 类型别名速查 -如下图所示,ChatModel 既可以输出非流也可以输出流,Tool 只能输出非流,即: - -- 当 `EnableStream=false` 时,二者均输出非流 -- 当 `EnableStream=true` 时,ChatModel 输出流,Tool 因为不具备输出流的能力,仍然输出非流。 - - - -## AgentRunOption - -`AgentRunOption` 由 Agent 实现定义,可以在请求维度修改 Agent 配置或者控制 Agent 行为。 - -Eino ADK 提供了一些通用定义的 Option,供用户使用: - -- `WithSessionValues`:设置跨 Agent 读写数据 -- `WithSkipTransferMessages`:配置后,当 Event 为 Transfer SubAgent 时,Event 中的消息不会追加到 History 中 - -Eino ADK 提供了 `WrapImplSpecificOptFn` 和 `GetImplSpecificOptions` 两个方法,供 Agent 包装与读取自定义的 `AgentRunOption`。 - -当使用 `GetImplSpecificOptions` 方法读取 `AgentRunOptions` 时,与所需类型(如例子中的 options)不符的 AgentRunOption 会被忽略。 + + + + + + + +
+| Generic type | Default alias |
+| --- | --- |
+| `TypedAgent[*schema.Message]` | `Agent` |
+| `TypedAgentInput[*schema.Message]` | `AgentInput` |
+| `TypedAgentEvent[*schema.Message]` | `AgentEvent` |
+| `TypedAgentOutput[*schema.Message]` | `AgentOutput` |
+| `TypedMessageVariant[*schema.Message]` | `MessageVariant` |
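
How a union constraint and a default alias fit together can be shown in a few lines of plain Go. The sketch below is an assumption-laden illustration: `Message` and `AgenticMessage` are local stand-ins for the `schema` types, and `countMessages` is a hypothetical helper, not part of ADK.

```go
package main

import "fmt"

// Message and AgenticMessage are simplified stand-ins for the schema types.
type Message struct{ Content string }
type AgenticMessage struct{ Blocks []string }

// MessageType restricts M to one of the two pointer types.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedAgentInput mirrors the shape of the generic input type.
type TypedAgentInput[M MessageType] struct {
	Messages        []M
	EnableStreaming bool
}

// AgentInput is an alias, so it is the same type as TypedAgentInput[*Message].
type AgentInput = TypedAgentInput[*Message]

// countMessages is a hypothetical helper that works for either message type.
func countMessages[M MessageType](in *TypedAgentInput[M]) int {
	return len(in.Messages)
}

func main() {
	in := AgentInput{Messages: []*Message{{Content: "hi"}, {Content: "there"}}}

	// Compiles because the alias names exactly the same type.
	var same TypedAgentInput[*Message] = in

	agentic := TypedAgentInput[*AgenticMessage]{
		Messages: []*AgenticMessage{{Blocks: []string{"text"}}},
	}
	fmt.Println(countMessages(&same), countMessages(&agentic))
}
```

Because the short names are aliases rather than new named types, code written against `AgentInput` and code written against `TypedAgentInput[*Message]` interoperate without conversion.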
    -例如可以定义 `WithModelName`,在请求维度要求 Agent 修改调用的模型: +# AgentInput ```go -// github.com/cloudwego/eino/adk/call_option.go -// func WrapImplSpecificOptFn[T any](optFn func(*T)) AgentRunOption -// func GetImplSpecificOptions[T any](base *T, opts ...AgentRunOption) *T - -import "github.com/cloudwego/eino/adk" - -type options struct { - modelName string -} - -func WithModelName(name string) adk.AgentRunOption { - return adk.WrapImplSpecificOptFn(func(t *options) { - t.modelName = name - }) -} - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - o := &options{} - o = adk.GetImplSpecificOptions(o, opts...) - // run code... +type TypedAgentInput[M MessageType] struct { + Messages []M + EnableStreaming bool } ``` -除此之外,AgentRunOption 具有一个 `DesignateAgent` 方法,调用该方法可以在调用多 Agent 系统时指定 Option 生效的 Agent: - -```go -func genOpt() { - // 指定 option 仅对 agent_1 和 agent_2 生效 - opt := adk.WithSessionValues(map[string]any{}).DesignateAgent("agent_1", "agent_2") -} -``` +- **Messages**:用户指令、对话历史、背景知识等,与 ChatModel 输入格式一致 +- **EnableStreaming**:建议 Agent 使用流式输出。支持流式的组件(如 ChatModel)会逐步返回;不支持的组件不受影响 -## AsyncIterator +# AgentEvent -`Agent.Run` 返回了一个迭代器 `AsyncIterator[*AgentEvent]`: +Agent 运行过程中产出的事件: ```go -// github.com/cloudwego/eino/adk/utils.go - -type AsyncIterator[T any] struct { - ... -} - -func (ai *AsyncIterator[T]) Next() (T, bool) { - ... 
+type TypedAgentEvent[M MessageType] struct { + AgentName string + RunPath []RunStep + Output *TypedAgentOutput[M] + Action *AgentAction + Err error } ``` -它代表一个异步迭代器(异步指生产与消费之间没有同步控制),允许调用者以一种有序、阻塞的方式消费 Agent 在运行过程中产生的一系列事件。 - -- `AsyncIterator` 是一个泛型结构体,可以用于迭代任何类型的数据。当前在 Agent 接口中, Run 方法返回的迭代器类型被固定为 `AsyncIterator[*AgentEvent]` 。这意味着,你从这个迭代器中获取的每一个元素,都将是一个指向 `AgentEvent` 对象的指针。`AgentEvent` 会在后续章节中详细说明。 -- 迭代器的主要交互方式是通过调用其 `Next()` 方法。这个方法的行为是 阻塞式 的,每次调用 `Next()` ,程序会暂停执行,直到以下两种情况之一发生: - - Agent 产生了一个新的 `AgentEvent` : `Next()` 方法会返回这个事件,调用者可以立即对其进行处理。 - - Agent 主动关闭了迭代器 : 当 Agent 不会再产生任何新的事件时(通常是 Agent 运行结束),它会关闭这个迭代器。此时 `Next()` 调用会结束阻塞并在第二个返回值返回 false,告知调用者迭代已经结束。 - -通常情况下,你需要使用 for 循环处理 `AsyncIterator`: +## AgentOutput ```go -iter := myAgent.Run(xxx) // get AsyncIterator from Agent.Run - -for { - event, ok := iter.Next() - if !ok { - break - } - // handle event +type TypedAgentOutput[M MessageType] struct { + MessageOutput *TypedMessageVariant[M] + CustomizedOutput any } ``` -`AsyncIterator` 可以由 `NewAsyncIteratorPair` 创建,该函数返回的另一个参数 `AsyncGenerator` 用来生产数据: +`MessageVariant` 统一处理流式与非流式消息: ```go -// github.com/cloudwego/eino/adk/utils.go - -func NewAsyncIteratorPair[T any]() (*AsyncIterator[T], *AsyncGenerator[T]) -``` - -Agent.Run 返回 AsyncIterator 旨在让调用者实时地接收到 Agent 产生的一系列 AgentEvent,因此 Agent.Run 通常会在 Goroutine 中运行 Agent 从而立刻返回 AsyncIterator 供调用者监听: - -```go -import "github.com/cloudwego/eino/adk" - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - // handle input - iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() - go func() { - defer func() { - // recover code - gen.Close() - }() - // agent run code - // gen.Send(event) - }() - return iter +type TypedMessageVariant[M MessageType] struct { + IsStreaming bool + Message M + MessageStream *schema.StreamReader[M] + Role schema.RoleType // *schema.Message 路径 + AgenticRole schema.AgenticRoleType // *schema.AgenticMessage 路径 
+ ToolName string } ``` -## AgentWithOptions - -使用 `AgentWithOptions` 方法可以在 Eino ADK Agent 中进行一些通用配置。 - -与 `AgentRunOption` 不同的是,`AgentWithOptions` 在运行前生效,并且不支持自定义 option。 +- `IsStreaming=true` → 从 `MessageStream` 逐帧读取 +- `IsStreaming=false` → 从 `Message` 一次性获取 +- `Role`/`ToolName`:仅 `*schema.Message` 路径有效(Assistant 或 Tool) +- `AgenticRole`:仅 `*schema.AgenticMessage` 路径有效 -```go -// github.com/cloudwego/eino/adk/flow.go -func AgentWithOptions(ctx context.Context, agent Agent, opts ...AgentOption) Agent -``` - -Eino ADK 当前内置支持的配置有: - -- `WithDisallowTransferToParent`:配置该 SubAgent 不允许 Transfer 到 ParentAgent,会触发该 SubAgent 的 `OnDisallowTransferToParent` 回调方法 -- `WithHistoryRewriter`:配置后该 Agent 在执行前会通过该方法重写接收到的上下文信息 - -# AgentEvent +## AgentAction -AgentEvent 是 Agent 在其运行过程中产生的核心事件数据结构。其中包含了 Agent 的元信息、输出、行为和报错: +控制多 Agent 协作的行为信号: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentEvent struct { - AgentName string - - RunPath []RunStep - - Output *AgentOutput - - Action *AgentAction - - Err error +type AgentAction struct { + Exit bool + Interrupted *InterruptInfo + TransferToAgent *TransferToAgentAction // NOT RECOMMENDED + BreakLoop *BreakLoopAction + CustomizedAction any } - -// EventFromMessage 构建普通 event -func EventFromMessage(msg Message, msgStream MessageStream, role schema.RoleType, toolName string) *AgentEvent ``` -## AgentName & RunPath - -`AgentName` 和 `RunPath` 字段是由框架自动进行填充,它们提供了关于事件来源的重要上下文信息,在复杂的、由多个 Agent 构成的系统中至关重要。 +- **Interrupted**:中断 Runner 运行,携带自定义数据,支持后续 Resume +- **BreakLoop**:中止 LoopAgent 的循环 +- **Exit**:立即退出多 Agent 系统 +- **TransferToAgent**:(不推荐)任务转让,建议使用 AgentAsTool 替代 -```go -type RunStep struct { - agentName string -} -``` +# AgentRunOption -- `AgentName` 标明了是哪一个 Agent 实例产生了当前的 AgentEvent 。 -- `RunPath` 记录了到达当前 Agent 的完整调用链路。`RunPath` 是一个 `RunStep` 切片,它按顺序记录了从最初的入口 Agent 到当前产生事件的 Agent 的所有 `AgentName`。 - -## AgentOutput +请求维度的 Agent 配置。ADK 内置: -`AgentOutput` 封装了 Agent 产生的输出。 +- `WithSessionValues(map[string]any)`:注入跨 Agent 共享的 
KV 数据 +- `WithCallbacks(...callbacks.Handler)`:添加回调处理器 +- `WithCancel()`:启用 Agent Cancel 能力(详见 [Cancel 与 TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)) -Message 输出设置在 MessageOutput 字段中,其他类型的自定义输出设置在 CustomizedOutput 字段中: +自定义 Option: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentOutput struct { - MessageOutput *MessageVariant - - CustomizedOutput any +type myOptions struct { + modelName string } -type MessageVariant struct { - IsStreaming bool +func WithModelName(name string) adk.AgentRunOption { + return adk.WrapImplSpecificOptFn(func(t *myOptions) { + t.modelName = name + }) +} - Message Message - MessageStream MessageStream - // message role: Assistant or Tool - Role schema.RoleType - // only used when Role is Tool - ToolName string +// 在 Run 中读取 +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + o := adk.GetImplSpecificOptions(&myOptions{}, opts...) + // 使用 o.modelName ... } ``` -`MessageOutput` 字段的类型 `MessageVariant` 是一个核心数据结构,主要功能为: - -1. 统一处理流式与非流式消息:`IsStreaming` 是一个标志位。值为 true 表示当前 `MessageVariant` 包含的是一个流式消息(从 MessageStream 读取),为 false 则表示包含的是一个非流式消息(从 Message 读取): - - - 流式 : 随着时间的推移,逐步返回一系列消息片段,最终构成一个完整的消息(MessageStream)。 - - 非流式 : 一次性返回一个完整的消息(Message)。 -2. 
提供便捷的元数据访问:Message 结构体内部包含了一些重要的元信息,如消息的 Role(Assistant 或 Tool),为了方便快速地识别消息类型和来源, MessageVariant 将这些常用的元数据提升到了顶层: - - - `Role`:消息的角色,Assistant / Tool - - `ToolName`:如果消息角色是 Tool ,这个字段会直接提供工具的名称。 - -这样做的好处是,代码在需要根据消息类型进行路由或决策时, 无需深入解析 Message 对象的具体内容 ,可以直接从 MessageVariant 的顶层字段获取所需信息,从而简化了逻辑,提高了代码的可读性和效率。 - -## AgentAction - -Agent 产生包含 AgentAction 的 Event 可以控制多 Agent 协作,比如立刻退出、中断、跳转等: +`DesignateAgent` 可将 Option 限定到指定 Agent: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentAction struct { - Exit bool - - Interrupted *InterruptInfo - - TransferToAgent *TransferToAgentAction - - BreakLoop *BreakLoopAction - - CustomizedAction any -} - -type InterruptInfo struct { - Data any -} - -type TransferToAgentAction struct { - DestAgentName string -} +opt := adk.WithSessionValues(map[string]any{"key": "val"}).DesignateAgent("agent_1") ``` -Eino ADK 当前预设 Action 有四种: +# AsyncIterator -1. 退出:当 Agent 产生 Exit Action 时,Multi-Agent 会立刻退出 +`Run` 返回的异步事件迭代器: ```go -func NewExitAction() *AgentAction { - return &AgentAction{Exit: true} +iter := agent.Run(ctx, input) +for { + event, ok := iter.Next() + if !ok { + break + } + // 处理 event } ``` -1. 跳转:当 Agent 产生 Transfer Action 时,会跳转到目标 Agent 运行 +`Next()` 阻塞直到有新事件或迭代结束。Agent 实现通常在 goroutine 中写入 Generator,立即返回 Iterator: ```go -func NewTransferToAgentAction(destAgentName string) *AgentAction { - return &AgentAction{TransferToAgent: &TransferToAgentAction{DestAgentName: destAgentName}} +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer gen.Close() + // 执行逻辑,通过 gen.Send(event) 产出事件 + }() + return iter } ``` -1. 
中断:当 Agent 产生 Interrupt Action 时,会中断 Runner 的运行。由于中断可能发生在任何位置,同时中断时需要向外传递独特的信息,Action 中提供了 `Interrupted` 字段供 Agent 设置自定义数据,Runner 接收到 Interrupted 不为空的 Action 时则认为产生了中断。Interrupt & Resume 内部机制较为复杂,在 【Eino ADK: Agent Runner】-【Eino ADK: Interrupt & Resume】章节会展开详述。 - -```go -// 例如 ChatModelAgent 中断时,会发送如下的 AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, - }, -}}) -``` - -4. 中止循环:当 LoopAgent 的一个子 Agent 发出 BreakLoopAction 时,对应的 LoopAgent 会停止循环并正常退出。 - # 语言设置 -ADK 提供了 `SetLanguage` 函数用于设置内置提示词(prompt)的语言。这影响所有 ADK 内置组件和中间件生成的提示词语言。本能力在 [alpha/08](https://github.com/cloudwego/eino/releases/tag/v0.8.0-alpha.13) 版本引入。 - -## API - ```go -// Language 表示 ADK 内置提示词的语言设置 -type Language uint8 - -const ( - // LanguageEnglish 表示英文(默认) - LanguageEnglish Language = iota - // LanguageChinese 表示中文 - LanguageChinese -) - -// SetLanguage 设置 ADK 内置提示词的语言 -// 默认语言是英文(如果未显式设置) -func SetLanguage(lang Language) error +adk.SetLanguage(adk.LanguageChinese) // 或 adk.LanguageEnglish(默认) ``` -## 使用示例 - -```go -import "github.com/cloudwego/eino/adk" - -// 设置为中文 -err := adk.SetLanguage(adk.LanguageChinese) -if err != nil { - // 处理错误 -} - -// 设置为英文(默认) -err = adk.SetLanguage(adk.LanguageEnglish) -``` - -## 影响范围 - -语言设置会影响以下组件的内置提示词: - - - - - - - -
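-| Component / Middleware | Affected prompts |
-| --- | --- |
-| FileSystem Middleware | File system tool descriptions, system prompt, execute-tool prompt |
-| Reduction Middleware | Prompt text for truncating or clearing tool results |
-| Skill Middleware | Skill system prompt, skill tool descriptions |
-| ChatModelAgent | Built-in system prompt |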
    - -> 💡 -> 建议在程序初始化时设置语言,因为语言设置是全局生效的。在运行时更改语言可能导致同一会话中出现混合语言的提示词。 +影响 ADK 内置提示词(FileSystem、Reduction、Skill、ChatModelAgent 等组件)。建议在程序初始化时设置。 > 💡 -> 语言设置仅影响 ADK 内置的提示词。你自定义的提示词(如 Agent 的 Instruction)需要自行处理国际化。 +> 语言设置仅影响 ADK 内置提示词。自定义 Instruction 需自行处理国际化。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md b/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md index d32061c34a9..1afa28ea851 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] title: 概述 @@ -9,154 +9,72 @@ weight: 2 # 什么是 Eino ADK? -Eino ADK 参考 [Google-ADK](https://google.github.io/adk-docs/agents/) 的设计,提供了 Go 语言 的 Agents 开发的灵活组合框架,即 Agent、Multi-Agent 开发框架。Eino ADK 为多 Agent 交互时,沉淀了通用的 上下文传递、事件流分发和转换、任务控制权转让、中断与恢复、通用切面等能力。 适用场景广泛、模型无关、部署无关,让 Agent、Multi-Agent 开发更加简单、便利,并提供完善的生产级应用的治理能力。 +Eino ADK 是 Go 语言的 Agent 开发框架,提供: -Eino ADK 旨在帮助开发者开发、管理 Agent 应用。提供灵活且鲁棒的开发环境,助力开发者搭建 对话智能体、非对话智能体、复杂任务、工作流等多种多样的 Agent 应用。 +- **ChatModelAgent**:以 LLM 为决策器的 ReAct Agent,支持工具调用、自主推理、运行时增强(Middleware) +- **Workflow Agents**:确定性编排原语(Sequential / Loop / Parallel) +- **Runner / TurnLoop**:Agent 执行入口,支持事件流、checkpoint/resume、多轮抢占 +- **多 Agent 协作**:AgentAsTool(推荐)、Workflow 组合 -# ADK 框架 +适用场景广泛、模型无关、部署无关。 -Eino ADK 的整体模块构成,如下图所示: - - +# ADK 架构 ## Agent Interface -Eino ADK 的核心是 Agent 抽象(Agent Interface),ADK 的所有功能设计均围绕 Agent 抽象展开。详解请见 [Eino ADK: Agent 抽象 [New]](/zh/docs/eino/core_modules/eino_adk/agent_interface) +ADK 的所有功能围绕 `Agent` 接口展开: ```go type Agent interface { Name(ctx context.Context) string Description(ctx context.Context) string - - // Run runs the agent. - // The returned AgentEvent within the AsyncIterator must be safe to modify. - // If the returned AgentEvent within the AsyncIterator contains MessageStream, - // the MessageStream MUST be exclusive and safe to be received directly. 
- // NOTE: it's recommended to use SetAutomaticClose() on the MessageStream of AgentEvents emitted by AsyncIterator, - // so that even the events are not processed, the MessageStream can still be closed. Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] } ``` -`Agent.Run` 的定义为: - -1. 从入参 AgentInput、AgentRunOption 和可选的 Context Session 中获取任务详情及相关数据 -2. 执行任务,并将执行过程、执行结果写入到 AgentEvent Iterator - -`Agent.Run` 要求 Agent 的实现以 Future 模式异步执行,核心分成三步,具体可参考 ChatModelAgent 中 Run 方法的实现: +`Run` 的语义: -1. 创建一对 Iterator、Generator -2. 启动 Agent 的异步任务,并传入 Generator,处理 AgentInput。Agent 在这个异步任务执行核心逻辑(例如 ChatModelAgent 调用 LLM),并在产生新的事件时写入到 Generator 中,供 Agent 调用方在 Iterator 中消费 -3. 启动 2 中的任务后立即返回 Iterator +1. 从 `AgentInput` 和 Context 中获取任务信息 +2. 异步执行任务,产出的事件写入 `AsyncIterator` +3. 启动异步任务后立即返回 Iterator(Future 模式) -## 多 Agent 协作 +## ChatModelAgent -围绕 Agent 抽象,Eino ADK 提供多种简单易用、场景丰富的组合原语,可支撑开发丰富多样的 Multi-Agent 协同策略,比如 Supervisor、Plan-Execute、Group-Chat 等 Multi-Agent 场景。从而实现不同的 Agent 分工合作模式,处理更复杂的任务。详解请见 [Eino ADK: Agent 组合](/zh/docs/eino/core_modules/eino_adk/agent_collaboration) +ADK 的核心实现。以 ChatModel 为决策器,通过 ReAct Loop 自主推进问题求解。 -Eino ADK 定义的 Agent 协作过程中的协作原语如下: +**ChatModelAgent = ChatModel + Tools + ReAct Loop + Middleware** -- Agent 间协作方式 +详细介绍见:[Eino ADK: ChatModelAgent 介绍](/zh/docs/eino/overview/eino_adk_quickstart) - - - - -
    协助方式描述
    Transfer直接将任务转让给另外一个 Agent,本 Agent 则执行结束后退出,不关心转让 Agent 的任务执行状态
    ToolCall(AgentAsTool)将 Agent 当成 ToolCall 调用,等待 Agent 的响应,并可获取被调用Agent 的输出结果,进行下一轮处理
    +## 多 Agent 协作 -- AgentInput 的上下文策略 +> 💡 +> 推荐方式:**AgentAsTool** — 将子 Agent 转为 Tool,父 Agent 通过 ToolCall 调用并获取结果。这是最灵活、最可组合的协作模式。 - - - + + +
    上下文策略描述
    上游 Agent 全对话获取本 Agent 的上游 Agent 的完整对话记录
    全新任务描述忽略掉上游 Agent 的完整对话记录,给出一个全新的任务总结,作为子 Agent 的 AgentInput 输入
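| Collaboration mode | Mechanism | When to use |
| --- | --- | --- |
| AgentAsTool (recommended) | The sub-agent is wrapped as a Tool; the parent Agent decides on its own whether to call it | Delegating subtasks, composing capabilities |
| Workflow | Deterministic orchestration with Sequential / Loop / Parallel | Multi-step tasks with a fixed flow |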
    -- 决策自主性 +详见:[Agent 协作](/zh/docs/eino/core_modules/eino_adk/agent_collaboration) - - - - -
    决策自主性描述
    自主决策在 Agent 内部,基于其可选的下游 Agent, 如需协助时,自主选择下游 Agent 进行协助。 一般来说,Agent 内部是基于 LLM 进行决策,不过即使是基于预设逻辑进行选择,从 Agent 外部看依然视为自主决策
    预设决策事先预设好一个Agent 执行任务后的下一个 Agent。 Agent 的执行顺序是事先确定、可预测的
    +## Runner -围绕协作原语,Eino ADK 提供了如下的几种 Agent 组合原语: +Runner 是 Agent 的执行入口。只有通过 Runner 执行时才能使用: - - - - - - - -
    类型描述运行模式协作方式上下文策略决策自主性
    SubAgents将用户提供的 agent 作为 父Agent,用户提供的 subAgents 列表作为 子Agents,组合而成可自主决策的 Agent,其中的 Name 和 Description 作为该 Agent 的名称标识和描述。
  • 当前限定一个 Agent 只能有一个 父 Agent
  • 可采用 SetSubAgents 函数,构建 「多叉树」 形式的 Multi-Agent
  • 在这个「多叉树」中,AgentName 需要保持唯一
  • Transfer上游 Agent 全对话自主决策
    Sequential将用户提供的 SubAgents 列表,组合成按照顺序依次执行的 Sequential Agent,其中的 Name 和 Description 作为 Sequential Agent 的名称标识和描述。Sequential Agent 执行时,将 SubAgents 列表,按照顺序依次执行,直至将所有 Agent 执行一遍后结束。Transfer上游 Agent 全对话预设决策
    Parallel将用户提供的 SubAgents 列表,组合成基于相同上下文,并发执行的 Parallel Agent,其中的 Name 和 Description 作为 Parallel Agent 的名称标识和描述。Parallel Agent 执行时,将 SubAgents 列表,并发执行,待所有 Agent 执行完成后结束。Transfer上游 Agent 全对话预设决策
    Loop将用户提供的 SubAgents 列表,按照数组顺序依次执行,循环往复,组合成 Loop Agent,其中的 Name 和 Description 作为 Loop Agent 的名称标识和描述。Loop Agent 执行时,将 SubAgents 列表,顺序执行,待所有 Agent 执行完成后结束。Transfer上游 Agent 全对话预设决策
    AgentAsTool将一个 Agent 转换成 Tool,被其他的 Agent 当成普通的 Tool 使用。一个 Agent 能否将其他 Agent 当成 Tool 进行调用,取决于自身的实现。Eino ADK 中提供的 ChatModelAgent 支持 AgentAsTool 的功能ToolCall全新任务描述自主决策
    - -## ChatModelAgent - -`ChatModelAgent` 是 Eino ADK 对 Agent 的关键实现,它封装了与大语言模型的交互逻辑,实现了 ReAct 范式的 Agent,基于 Eino 中的 Graph 编排出 ReAct Agent 控制流,通过 callbacks.Handler 导出 ReAct Agent 运行过程中产生的事件,转换成 AgentEvent 返回。 - -想要进一步了解 ChatModelAgent,请见:[Eino ADK: ChatModelAgent [New]](/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model) +- **事件流输出**:Query/Run → AsyncIterator[AgentEvent] +- **Checkpoint / Resume**:持久化运行状态,支持中断恢复 +- **TurnLoop**:多轮运行时,Push/Preempt/Stop ```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. 
- MaxIterations int -} +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // 可选 +}) -func NewChatModelAgent(_ context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) { - // omit code -} +iter := runner.Query(ctx, "你的问题") ``` -# AgentRunner - -AgentRunner 是 Agent 的执行器,为 Agent 运行所需要的拓展功能加以支持,详解请见:[Eino ADK: Agent 扩展](/zh/docs/eino/core_modules/eino_adk/agent_extension) - -只有通过 Runner 执行 agent 时,才可以使用 ADK 的如下功能: - -- Interrupt & Resume -- 切面机制 -- Context 环境的预处理 - - ```go - type RunnerConfig struct { - Agent Agent - EnableStreaming bool - - CheckPointStore compose.CheckPointStore - } - - func NewRunner(_ context.Context, conf RunnerConfig) *Runner { - // omit code - } - ``` +详见:[Agent Runner 与扩展](/zh/docs/eino/core_modules/eino_adk/agent_extension) | [Agent Cancel 与 TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md b/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md index 1b833a6b6e4..f36e3f913d2 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-01-30" +date: "2026-05-19" lastmod: "" tags: [] title: Quickstart @@ -9,85 +9,47 @@ weight: 1 # Installation -Eino 自 0.5.0 版本正式提供 ADK 功能供用户使用,您可以在项目中输入下面命令来升级 Eino: +Eino ADK 自 v0.5.0 起可用,v0.9.0 为当前推荐版本: ```go -// stable >= eino@v0.5.0 go get github.com/cloudwego/eino@latest ``` -# Agent +# 核心概念 -### 什么是 Eino ADK +**Eino ADK** 是 Go 语言的 Agent 开发框架。核心原语是 **ChatModelAgent**——以 ChatModel 为决策器、以 Tools 为行动空间、通过 ReAct Loop 自主推进问题求解的智能体。 -Eino ADK 参考 [Google-ADK](https://google.github.io/adk-docs/agents/) 的设计,提供了 Go 语言 的 Agents 开发的灵活组合框架,即 Agent、Multi-Agent 开发框架,并为多 Agent 交互场景沉淀了通用的上下文传递、事件流分发和转换、任务控制权转让、中断与恢复、通用切面等能力。 +> 💡 +> 如果你只读一篇文档,请读:[Eino ADK: ChatModelAgent 
介绍](/zh/docs/eino/overview/eino_adk_quickstart) -### 什么是 Agent - -Agent 是 Eino ADK 的核心,它代表一个独立的、可执行的智能任务单元。你可以把它想象成一个能够理解指令、执行任务并给出回应的“智能体”。每个 Agent 都有明确的名称和描述,使其可以被其他 Agent 发现和调用。 - -任何需要与大语言模型(LLM)交互的场景都可以抽象为一个 Agent。例如: - -- 一个用于查询天气信息的 Agent。 -- 一个用于预定会议的 Agent。 -- 一个能够回答特定领域知识的 Agent。 - -### Eino ADK 中的 Agent - -Eino ADK 中的所有功能设计均围绕 Agent 抽象设计展开: - -```go -type Agent interface { - Name(ctx context.Context) string - Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput) *AsyncIterator[*AgentEvent] -} -``` - -基于 Agent 抽象,ADK 提供了三大类基础拓展: - -- `ChatModel Agent`: 应用程序的“思考”部分,利用 LLM 作为核心,理解自然语言,进行推理、规划、生成响应,并动态决定如何执行或使用哪些工具。 -- `Workflow Agents`:应用程序的协调管理部分,基于预定义的逻辑,按照自身类型(顺序 / 并发 / 循环)控制子 Agent 执行流程。Workflow Agents 产生确定性的,可预测的执行模式,不同于 ChatModel Agent 生成的动态随机的决策。 - - 顺序 (Sequential Agent):按顺序依次执行子 Agents - - 循环 (Loop Agent):重复执行子 Agents,直至满足特定的终止条件 - - 并行 (Parallel Agent):并行执行多个子 Agents -- `Custom Agent`:通过接口实现自己的 Agent,允许定义高度定制的复杂 Agent - -基于基础扩展,您可以针对自己的需求排列组合这些基础 Agents,构建所需要的 Multi-Agent 系统。另外,Eino 从日常实践经验出发,内置提供了几种开箱即用的 Multi-Agent 最佳范式: - -- Supervisor: 监督者模式,监督者 Agent 控制所有通信流程和任务委托,并根据当前上下文和任务需求决定调用哪个 Agent。 -- Plan-Execute:计划-执行模式,Plan Agent 生成含多个步骤的计划,Execute Agent 根据用户 query 和计划来完成任务。Execute 后会再次调用 Plan,决定完成任务 / 重新进行规划。 - -下方表格和图提供了这些基础拓展与封装的特点,区别,与关系。后续章节中将展开介绍每种类型的原理与细节: +## 组件地图 - - - - + + + + + +
    类别ChatModel AgentWorkflow AgentsCustom LogicEinoBuiltInAgent(supervisor, plan-execute)
    功能思考,生成,工具调用控制 Agent 之间的执行流程运行自定义逻辑开箱即用的 Multi-agent 模式封装
    核心LLM预确定的执行流程(顺序,并发,循环)自定义代码基于 Eino 实践积累的经验,对前三者的高度封装
    用途生成,动态决策结构化处理,编排定制需求特定场景内的开箱即用
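| Component | Responsibility | Documentation |
| --- | --- | --- |
| ChatModelAgent | ReAct Loop: reason → act → observe feedback, deciding autonomously | ChatModelAgent Introduction |
| Middleware | Injects behavior at lifecycle points of the ReAct Loop (compression, search, retry, and so on) | ChatModelAgentMiddleware |
| Runner | Entry point for a single Agent run: Query / Run → event stream | Agent Runner and Extensions |
| TurnLoop | Multi-turn runtime: Push / Preempt / Stop plus declarative checkpoint/resume | Agent Cancel and TurnLoop |
| DeepAgents | Prebuilt Agent: task planning (PlanTask) plus subtask delegation (TaskTool) | DeepAgents |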
    - +## 其他 Agent 类型 -# ADK Examples +除 ChatModelAgent 外,ADK 还提供确定性编排原语: -[Eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk) 项目中提供了多种 ADK 的实施样例,您可以参考样例代码与简介,对 adk 能力构建初步的认知: +- **Workflow Agents**:Sequential / Loop / Parallel Agent,用于预定义流程的结构化编排。 +- **Custom Agent**:实现 `Agent` 接口即可接入框架。 - - - - - - - - - -
    项目路径简介结构图
    顺序工作流案例该示例代码展示了基于 eino adk 的 Workflow 模式构建的一个顺序执行的多智能体工作流。
  • 顺序工作流构建:通过 adk.NewSequentialAgent 创建一个名为 ResearchAgent 的顺序执行智能体,内部包含两个子智能体(SubAgents)PlanAgent 和 WriterAgent,分别负责研究计划制定和报告撰写。
  • 子智能体职责明确:PlanAgent 接收研究主题,生成详细且逻辑清晰的研究计划;WriterAgent 根据该研究计划撰写结构完整的学术报告。
  • 输入输出串联:PlanAgent 输出的研究计划作为 WriterAgent 的输入,形成清晰的上下游数据流,体现业务步骤的顺序依赖。
  • 循环工作流案例该示例代码基于 eino adk 的 Workflow 模式中的 LoopAgent,构建了一个反思迭代型智能体框架。
  • 迭代反思框架:通过 adk.NewLoopAgent 创建 ReflectionAgent,包含两个子智能体 MainAgent 和 CritiqueAgent,支持最多 5 次迭代,形成主任务解决与批判反馈的闭环。
  • 主智能体(MainAgent):负责根据用户任务生成初步解决方案,追求准确完整的答案输出。
  • 批判智能体(CritiqueAgent):对主智能体输出进行质量审查,反馈改进意见,若结果满意则终止循环,提供最终总结。
  • 循环机制:利用 LoopAgent 的迭代能力,实现在多轮反思中不断优化解决方案,提高输出质量和准确性。
  • 并行工作流案例该示例代码基于 eino adk 的 Workflow 模式中的 ParallelAgent,构建了一个并发信息搜集框架:
  • 并发运行框架:通过 adk.NewParallelAgent 创建 DataCollectionAgent,包含多个信息采集子智能体。
  • 子智能体职责分配:每个子智能体负责一个渠道的信息采集与分析,彼此之间无需交互,功能边界清晰。
  • 并发运行:Parallel Agent 能够同时从多个数据源启动信息收集任务,处理效率相较于串行方式显著提升。
  • supervisor该用例采用单层 Supervisor 管理两个功能较为综合的子 Agent:Research Agent 负责检索任务,Math Agent 负责多种数学运算(加、乘、除),但所有数学运算均由同一个 Math Agent 内部统一处理,而非拆分为多个子 Agent。此设计简化了代理层级,适合任务较为集中且不需要过度拆解的场景,便于快速部署和维护。
    layered-supervisor该用例实现了多层级智能体监督体系,顶层 Supervisor 管理 Research Agent 和 Math Agent,Math Agent 又进一步细分为 Subtract、Multiply、Divide 三个子 Agent。顶层 Supervisor 负责将研究任务和数学任务分配给下级 Agent,Math Agent 作为中层监督者再将具体数学运算任务分派给其子 Agent。
  • 多层级智能体结构:实现了一个顶层 Supervisor Agent,管理两个子智能体 ——Research Agent(负责信息检索)和 Math Agent(负责数学运算)。
  • Math Agent 内部再细分三个子智能体:Subtract Agent、Multiply Agent 和 Divide Agent,分别处理减法、乘法和除法运算,体现多级监督和任务委派。
  • 这种分层管理结构体现了复杂任务的细粒度拆解和多级任务委派,适合任务分类清晰且计算复杂的场景。
    plan-execute 案例本示例基于 eino adk 实现 plan-execute-replan 模式的多 Agent 旅行规划系统,核心功能是处理用户复杂旅行请求(如 “3 天北京游,需从纽约出发的航班、酒店推荐、必去景点”),通过 “计划 - 执行 - 重新计划” 循环完成任务:1. 计划(Plan):
    Planner Agent
    基于大模型生成分步执行计划(如 “第一步查北京天气,第二步搜纽约到北京航班”);2. 执行(Execute):
    Executor Agent
    调用 ** 天气(get_weather)、航班(search_flights)、酒店(search_hotels)、景点(search_attractions)** 等 Mock 工具执行每一步,若用户输入信息缺失(如未说明预算),则调用
    ask_for_clarification
    工具追问;3. 重新计划(Replan):
    Replanner Agent
    根据工具执行结果评估是否需要调整计划(如航班无票则重新选日期)。Execute 和 Replan 不断循环运行,直至完成计划中的所有步骤;4. 支持会话轨迹跟踪(CozeLoop 回调)和状态管理,最终输出完整旅行方案。从结构上看,plan-execute-replan 分为两层:
  • 第二层是由 execute + replan agent 构成的 loop agent,即 replan 后可能需要重新 execute(重新规划后需要查询旅行信息 / 请求用户继续澄清问题)
  • 第一层是由 plan agent + 第二层构造的 loop agent 构成的 sequential agent,即 plan 仅执行一次,然后交由 loop agent 执行
  • 书籍推荐 agent(运行中断与恢复)该代码展示了基于 eino adk 框架构建的一个书籍推荐聊天智能体实现,体现了 Agent 运行中断与恢复功能。
  • Agent 构建:通过 adk.NewChatModelAgent 创建一个名为 BookRecommender 的聊天智能体,用于根据用户请求推荐书籍。
  • 工具集成:集成了两个工具 —— 搜索书籍的 BookSearch 工具 和 询问澄清信息的 AskForClarification 工具,支持多轮交互和信息补充。
  • 状态管理:实现了简单的内存 CheckPoint 存储,支持会话的断点续接,保证上下文连续性。
  • 事件驱动:通过迭代 runner.Query 和 runner.Resume 获取事件流,处理执行过程中的各种事件及错误。
  • 自定义输入:支持动态接收用户输入,利用工具选项传入新的查询请求,灵活驱动任务流程。
  • +> 💡 +> Graph(确定性编排)与 Agent(自主决策)是两种不同的 AI 应用形态。当核心问题是"自主决策 + 运行时增强"时,推荐使用 ChatModelAgent。详见 ChatModelAgent 介绍中的"为什么不继续使用 flow/react"。 -# What's Next +# 示例 -经过 Quickstart 概览,您应该对 Eino ADK 与 Agent 有了基础的认知。 +[eino-examples/adk](https://github.com/cloudwego/eino-examples/tree/main/adk) 提供了完整的 ADK 示例代码: -接下来的文章将深入介绍 ADK 的核心概念,助您理解 Eino ADK 的工作原理并更好的使用它: +- **ChatModelAgent 入门**:[chatmodel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/chatmodel) — 书籍推荐 Agent,含中断与恢复 +- **DeepAgents**:[deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) — 任务规划 + 子任务委派 +- **Workflow**:[sequential](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/sequential) / [loop](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/loop) / [parallel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/parallel) +- **Multi-Agent**:[supervisor](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/supervisor) / [plan-execute](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/plan-execute-replan) - +# What's Next diff --git a/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md b/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md new file mode 100644 index 00000000000..04a64e15dac --- /dev/null +++ b/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/_index.md @@ -0,0 +1,540 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Agent Cancel 与 TurnLoop 快速入门 +weight: 10 +--- + +Eino ADK 中 **Agent 取消** 和 **TurnLoop** 两项核心特性的快速入门指南。自 [v0.9.0-alpha.9](https://github.com/cloudwego/eino/releases/tag/v0.9.0-alpha.9) 版本引入。 + +## 类型约定 + +本文示例统一使用以下泛型实例化: + +- `T = string`(推送给 TurnLoop 的业务项类型) +- `M = *schema.Message`(Agent 消息类型,即标准 `Message`) + +ADK 中相关类型别名: + +```go +type Agent = TypedAgent[*schema.Message] +type 
AgentInput = TypedAgentInput[*schema.Message] +type AgentEvent = TypedAgentEvent[*schema.Message] +``` + +当需要使用 `*schema.AgenticMessage` 时,将 `M` 替换为对应类型即可,所有 API 签名完全对称。 + +--- + +## 第一部分:Agent 取消 + +### 场景 + +用户向 agent 发送请求后,因等待过长或需求变更,希望取消当前执行。 + +### 核心 API + +```go +// 创建取消选项和取消函数 +cancelOpt, cancelFunc := adk.WithCancel() + +// 启动 agent,传入取消选项 +iter := runner.Run(ctx, []*schema.Message{schema.UserMessage("你好")}, cancelOpt) + +// 发起取消(可在任意 goroutine 调用) +handle, contributed := cancelFunc(adk.WithAgentCancelMode(adk.CancelImmediate)) +// contributed == true: 本次调用影响了执行结果 +// contributed == false: agent 已结束或取消已完成,本次调用无实际效果 + +err := handle.Wait() +``` + +`CancelHandle.Wait()` 的三种返回值: + +```go +switch { +case err == nil: + // 取消成功 +case errors.Is(err, adk.ErrCancelTimeout): + // 安全点超时,已自动升级为立即取消 +case errors.Is(err, adk.ErrExecutionEnded): + // agent 在取消生效前已自然结束 +} +``` + +### 三种取消模式 + + + + + + +
| Mode | Behavior | When to use |
| --- | --- | --- |
| `CancelImmediate` | Interrupts immediately, without waiting for a safe point | Emergency stop, timeout fallback |
| `CancelAfterChatModel` | Cancels once the current ChatModel call has completed | The full model response is needed |
| `CancelAfterToolCalls` | Cancels once all current tool calls have completed | Tool side effects must run to completion |
nil + }, +}) + +loop.Push("消息 1") +loop.Push("消息 2") +loop.Run(ctx) // 非阻塞,启动后台处理 +loop.Push("消息 3") // 运行中仍可推入 +loop.Stop() +result := loop.Wait() // 阻塞至退出 +``` + +### 核心回调 + + + + + + + +
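| Callback | Required | Responsibility |
| --- | --- | --- |
| `GenInput` | Yes | Receives every buffered item and returns `Consumed` (handled in this turn) and `Remaining` (left for later turns). Items that appear in neither are discarded. |
| `PrepareAgent` | Yes | Builds the Agent from the Consumed items (sets the prompt, tools, middleware, and so on) |
| `OnAgentEvents` | No | Handles the agent event stream. When unset, the default drains the events and returns the first error |
| `GenResume` | No | Called when resuming from a checkpoint; decides how to merge interrupted/unhandled/new items |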
    + +> 💡 +> `OnAgentEvents` 中**不要传播 CancelError**——框架会自动处理。Stop 导致的 `CancelError` 作为 `ExitReason` 传播;Preempt 导致的 `CancelError` 被框架吞掉,循环继续下一轮。回调仅在自身出现致命错误时才应返回 non-nil error。 + +### 抢占(Preempt) + +```go +// 推送紧急消息,在安全点取消当前 agent +accepted, ack := loop.Push("紧急消息!", adk.WithPreempt[string, *schema.Message](adk.AnySafePoint)) + +if accepted { + <-ack // 等待抢占信号被提交(当前 turn 保证会被取消) +} +``` + +抢占是原子操作——"推入新消息"和"取消当前 agent"作为整体执行: + +1. 紧急消息入缓冲区 +2. 当前 agent 在安全点被取消 +3. TurnLoop 自动开始新 turn +4. `GenInput` 收到所有缓冲项目(含紧急消息),重新决策 + +> 💡 +> `WithPreempt` 始终使用安全点取消,**不自动设置 WithRecursive**。而 `WithPreemptTimeout` 会自动启用 `WithRecursive`——超时升级为立即取消时,嵌套子 agent 也会被终止。 + +### 带超时 / 带延迟的抢占 + +```go +// 安全点等待,5 秒超时后升级为立即取消(自动递归) +loop.Push("紧急", adk.WithPreemptTimeout[string, *schema.Message](adk.AnySafePoint, 5*time.Second)) + +// 2 秒宽限期后再发起抢占 +loop.Push("新消息", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + adk.WithPreemptDelay[string, *schema.Message](2*time.Second), +) +``` + +### 条件抢占:WithPushStrategy + +当抢占决策依赖当前 turn 状态时,使用 `WithPushStrategy` 避免 TOCTOU 竞态: + +```go +loop.Push(urgentItem, adk.WithPushStrategy( + func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message]) []adk.PushOption[string, *schema.Message] { + if tc == nil { + return nil // 当前无活跃 turn,无需抢占 + } + if isLowPriority(tc.Consumed) { + return []adk.PushOption[string, *schema.Message]{ + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + } + } + return nil // 当前是高优先级任务,不抢占 + }, +)) +``` + +### 在 OnAgentEvents 中感知抢占和停止 + +`TurnContext` 提供 `Preempted` 和 `Stopped` 两个信号通道: + +```go +OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + + select { + case <-tc.Preempted: + log.Println("当前 turn 被抢占,正在收尾...") + case <-tc.Stopped: + log.Printf("循环正在停止,原因: %s", tc.StopCause()) + default: + } + + if event.Err != nil { + return event.Err + } + // 
处理事件... + } + return nil +}, +``` + +> 💡 +> `Preempted` / `Stopped` 仅在对应的取消调用实际 "contribute" 到当前 turn 的 `CancelError` 时才关闭。如果取消已被其他信号最终确定,通道保持打开。 + +### 停止 TurnLoop + +```go +// 等当前 turn 完成后退出(ExitReason 为 nil) +loop.Stop() + +// 立即中止当前 agent(递归传播到嵌套 agent) +loop.Stop(adk.WithImmediate()) + +// 安全点停止(递归传播,无超时) +loop.Stop(adk.WithGraceful()) + +// 带超时的安全点停止(超时后升级为立即取消) +loop.Stop(adk.WithGracefulTimeout(10 * time.Second)) + +// 空闲后自动关停(持续空闲 30 秒后停止) +loop.Stop(adk.UntilIdleFor(30 * time.Second)) +``` + +> 💡 +> 可多次调用 `Stop()` 升级取消策略。典型模式:先 `WithGraceful()`,超时后再 `WithImmediate()`。 + +### 附带停止原因 + +```go +loop.Stop( + adk.WithGraceful(), + adk.WithStopCause("quota exceeded"), +) +result := loop.Wait() +log.Printf("停止原因: %s", result.StopCause) +``` + +--- + +## 第三部分:声明式 Checkpoint 恢复 + +### 场景 + +Agent 被取消或中断后,下次启动时自动从断点恢复,而非从头开始。TurnLoop 自动管理输入簿记(bookkeeping),应用层只需声明 interrupted/unhandled/new items 如何重入后续 turn。 + +### 配置 Checkpoint + +在 `TurnLoopConfig` 中同时设置 `Store` 和 `CheckpointID` 即可启用: + +```go +store := NewMyCheckpointStore() // 实现 CheckPointStore 接口 + +cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(items[0])}}, + Consumed: items[:1], + Remaining: items[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // GenResume:从 checkpoint 恢复时调用 + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) 
+ return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + Store: store, + CheckpointID: "session-123", +} +``` + +### 恢复流程 + +`Run()` 启动时自动查询 Store: + + + + + + +
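| Checkpoint state | Behavior |
| --- | --- |
| A mid-turn checkpoint exists (the agent was interrupted while executing) | Calls `GenResume`, hands the interrupted/unhandled/new items to the application to decide, then resumes execution |
| A between-turns checkpoint exists (the loop was stopped between turns) | Adds the previously buffered items to the buffer and processes them normally through `GenInput` |
| No checkpoint exists | Starts from scratch |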
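
For the mid-turn case, `GenResume` has to decide how the three item groups re-enter later turns. One possible policy, sketched below without importing eino, is to concatenate them in the order interrupted, unhandled, new. `mergeForResume` is a hypothetical helper; it copies into a freshly allocated slice so that appending can never write into spare capacity of the slices the framework passed in, which is a general Go `append` pitfall rather than anything specific to eino.

```go
package main

import "fmt"

// mergeForResume is a hypothetical helper for a GenResume callback. It returns
// interrupted, unhandled, and fresh items in that order, in a new backing array.
func mergeForResume(interrupted, unhandled, fresh []string) []string {
	all := make([]string, 0, len(interrupted)+len(unhandled)+len(fresh))
	all = append(all, interrupted...)
	all = append(all, unhandled...)
	all = append(all, fresh...)
	return all
}

func main() {
	all := mergeForResume(
		[]string{"interrupted 1"},
		[]string{"unhandled 1"},
		[]string{"new 1"},
	)
	// Resume with the first item and leave the rest for later turns.
	consumed, remaining := all[:1], all[1:]
	fmt.Println(consumed)
	fmt.Println(remaining)
}
```

The function assumes at least one item is present before slicing `all[:1]`; a production callback would want to guard against an empty result.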
    + +```go +// 第一次运行 +loop := adk.NewTurnLoop(cfg) +loop.Push("消息 1") +loop.Run(ctx) +loop.Stop(adk.WithGraceful()) +exit := loop.Wait() +log.Printf("checkpoint 尝试: %v, err: %v", exit.CheckpointAttempted, exit.CheckpointErr) + +// 第二次运行(相同 cfg,包含相同 CheckpointID) +loop2 := adk.NewTurnLoop(cfg) +loop2.Push("新消息") // 作为 newItems 传入 GenResume +loop2.Run(ctx) // 自动检测 checkpoint 并恢复 +result := loop2.Wait() +``` + +### 跳过 Checkpoint + +```go +loop.Stop(adk.WithSkipCheckpoint()) // 本次退出不保存 checkpoint +``` + +### 实现 CheckPointStore + +```go +type CheckPointStore interface { + Get(ctx context.Context, checkPointID string) ([]byte, bool, error) + Set(ctx context.Context, checkPointID string, checkPoint []byte) error +} +``` + +可选实现 `CheckPointDeleter` 以支持显式删除过期 checkpoint: + +```go +type CheckPointDeleter interface { + Delete(ctx context.Context, checkPointID string) error +} +``` + +正常退出(未保存新 checkpoint)时,TurnLoop 会尝试删除先前加载的 checkpoint 以防过期恢复。**只有实现了 CheckPointDeleter 的 Store 才会执行删除**;否则由 Store 自身管理生命周期。 + +> 💡 +> 使用 `Store` 时,泛型参数 `T` 必须支持 `encoding/gob` 编解码——TurnLoop 通过 gob 持久化 runner checkpoint 和 item 簿记信息。 + +--- + +## 第四部分:完整示例 + +模拟一个支持优先级调度、抢占和 checkpoint 恢复的聊天服务: + +```go +package main + +import ( + "context" + "log" + "strings" + "time" + + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/schema" +) + +func main() { + ctx := context.Background() + store := adk.NewInMemoryStore() + + cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + // 按优先级排序后,只消费第一条,其余留给后续轮次 + sorted := sortByPriority(items) + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(sorted[0])}}, + Consumed: sorted[:1], + Remaining: sorted[1:], // 不在两者中的项目会被丢弃 + }, nil + }, + + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], 
interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) + return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return buildAgent(consumed), nil + }, + + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + // 感知抢占/停止信号,做收尾处理 + select { + case <-tc.Preempted: + log.Println("被更高优先级消息抢占") + case <-tc.Stopped: + log.Printf("服务关停: %s", tc.StopCause()) + default: + } + if event.Err != nil { + // 不传播 CancelError,框架自动处理 + return event.Err + } + log.Printf("[%s] %s", event.AgentName, extractText(event)) + } + return nil + }, + + Store: store, + CheckpointID: "chat-session-001", + } + + loop := adk.NewTurnLoop(cfg) + loop.Push("你好,帮我查一下天气") + loop.Run(ctx) + + // 1 秒后发送紧急消息抢占 + time.AfterFunc(1*time.Second, func() { + loop.Push("停!先帮我处理这个紧急问题", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + ) + }) + + // 5 秒后优雅关停 + time.AfterFunc(5*time.Second, func() { + loop.Stop( + adk.WithGracefulTimeout(3*time.Second), + adk.WithStopCause("service shutdown"), + ) + }) + + result := loop.Wait() + log.Printf("退出原因: %v", result.ExitReason) + log.Printf("未处理消息: %v", result.UnhandledItems) + log.Printf("停止原因: %s", result.StopCause) + log.Printf("checkpoint: attempted=%v, err=%v", result.CheckpointAttempted, result.CheckpointErr) + + // 下次以相同 cfg 启动将自动从 checkpoint 恢复 +} +``` + +--- + +## 常见问题 + +### Q: 安全点取消会不会永远等不到安全点? + +会。如果 agent 陷入长时间运行的 tool 或 model 调用,安全点可能迟迟不到。**务必配合 WithAgentCancelTimeout 使用**,超时后自动升级为 `CancelImmediate`。 + +### Q: `WithRecursive` 什么时候需要? 
+ +默认取消仅影响根 agent。当 agent 层级中包含 AgentTool 嵌套的子 agent,且你希望子 agent 也在安全点响应取消时,才需要。不确定时,先不加。 + +### Q: 泛型参数 T 有什么要求? + +当配置了 `Store` 时,`T` 必须可被 `encoding/gob` 编解码。基础类型(`string`、`int` 等)和全导出字段的 struct 默认支持。若 `T` 包含 interface 字段,需通过 `gob.Register` 注册具体类型。 + +### Q: `Push` 在 loop 停止后会怎样? + +`Push` 返回 `(false, closedCh)`。这些 "late items" 不会进入 checkpoint,可在 `Wait()` 返回后通过 `result.TakeLateItems()` 回收。一旦调用 `TakeLateItems()`,后续 `Push` 会 panic 以防数据静默丢失。 + +### Q: 多次调用 `Stop()` 会怎样? + +安全——每次调用可以升级取消策略。典型模式: + +```go +loop.Stop(adk.WithGraceful()) // 先尝试优雅停止 +time.AfterFunc(3*time.Second, func() { + loop.Stop(adk.WithImmediate()) // 3 秒后升级为立即取消 +}) +``` + +### Q: `GenInput` 返回的 items 不在 Consumed 也不在 Remaining 会怎样? + +会被丢弃。这是刻意设计——允许 `GenInput` 在决策时过滤掉不需要的项目。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md b/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md new file mode 100644 index 00000000000..332d89ec21d --- /dev/null +++ b/content/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart/agent_cancel_and_turnloop_api_doc.md @@ -0,0 +1,1299 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Agent 取消与 TurnLoop API 文档 +weight: 1 +--- + +# Agent 取消与 TurnLoop API 文档 + +## 概述 + +本文档描述 Eino ADK(Agent Development Kit)中的核心高级特性: + +1. **Agent 取消**:优雅或立即取消运行中 agent 的机制 +2. 
**TurnLoop**:基于推送的事件循环,用于管理 agent 执行周期(依赖 Agent 取消功能) + +--- + +## Agent 取消 API + +### 概述 + +Agent 取消功能提供对运行中 agent 的细粒度控制。支持立即取消和安全点取消(等待特定执行点,如 chat model 调用后或 tool 调用后)。默认情况下,取消模式仅影响根 agent;嵌套在 AgentTool 内的子 agent 不会收到取消通知。使用 `WithRecursive()` 可将取消传播到整个 agent 层级(包括 AgentTool 内的嵌套子 agent),在层级中任意位置先到达安全点时触发取消。 + +**Checkpoint 保证**:无论使用哪种 `CancelMode`,取消都会在 Runner 维度保存 checkpoint,取消后可通过 `Runner.Resume` 或 `Runner.ResumeWithParams` 恢复执行。使用 `WithRecursive` 时,子 agent 也会尝试触发取消,并将自身的中断信息级联向上传递,最终在根 agent 层面生成一个包含子 agent checkpoint 的完整 checkpoint,从而支持从子 agent 中断处恢复。 + +### 核心类型 + +#### `CancelMode` + +指定 agent 应在何时被取消。模式可以通过位运算 OR 组合使用。 + +```go +type CancelMode int + +const ( + // CancelImmediate 立即取消 agent,无需等待 ChatModel 或 ToolCalls 安全点。 + // 默认仅中断根 agent;AgentTool 内的子 agent 通过 context 取消作为副作用被清理。 + // 使用 WithRecursive 可将显式的 immediate-cancel 信号传播到子 agent, + // 以实现干净的 teardown(带 grace period)。 + CancelImmediate CancelMode = 0 + + // CancelAfterChatModel 在根 agent 的下一次 chat model 调用完成后取消。 + // 默认仅根 agent 检查此安全点;AgentTool 内的嵌套子 agent 不感知取消。 + // 使用 WithRecursive 将取消传播到所有子 agent——哪个 ChatModel 先完成就触发取消。 + // 注意:此安全点仅在 model 返回了 tool calls 时才会被检查——因为 tool calls + // 意味着后续还有执行(调 tool → 再调 model → ...),此时取消才有意义。 + // 若 model 直接产出最终回答(无 tool calls),执行流走向结束,不经过此检查点。 + CancelAfterChatModel CancelMode = 1 << iota + + // CancelAfterToolCalls 在根 agent 的下一轮 tool 调用全部完成后取消。 + // 默认仅根 agent 检查此安全点。使用 WithRecursive 传播到所有子 agent。 + CancelAfterToolCalls +) +``` + +#### `CancelHandle` + +用于等待取消完成的句柄。 + +```go +type CancelHandle struct{ /* unexported fields */ } + +func (h *CancelHandle) Wait() error +``` + +**Wait 返回值:** + +- `error`: + - `nil`:取消成功(详见 CancelError 的 Interrupt 吸收机制) + - `ErrCancelTimeout`:安全点取消超时,已自动升级为立即取消(取消本身仍然成功) + - `ErrExecutionEnded`:agent 在取消生效前已结束(正常完成或出错),没有可取消的执行 + +#### `AgentCancelFunc` + +用于取消运行中 agent 的函数类型。 + +```go +type AgentCancelFunc func(...AgentCancelOption) (*CancelHandle, bool) +``` + +**返回值:** + +- `CancelHandle`: + - 返回时表示取消请求已提交 + - 通过 `Wait()` 
等待取消完成并获取结果 +- `bool`: + - 表示本次调用是否“contribute”到当前执行的 `CancelError` + - `true`:本次调用的取消选项在 `CancelError` 最终确定前被纳入 + - `false`:取消已最终确定(例如已 handled 或 execution 已结束),本次调用不会影响 `CancelError` + - TurnLoop 会利用该返回值提供 `TurnContext.Preempted` / `TurnContext.Stopped` 的严格语义 + +#### `AgentCancelOption` + +配置一次 agent 取消请求的不透明选项类型。用户通常不自行实现该类型,而是使用 `WithAgentCancelMode`、`WithAgentCancelTimeout` 和 `WithRecursive` 创建选项。 + +```go +type AgentCancelOption func(*agentCancelConfig) +``` + +#### `AgentCancelInfo` + +取消操作的信息。 + +```go +type AgentCancelInfo struct { + Mode CancelMode // 使用的取消模式 + Escalated bool // 是否升级为立即取消 + Timeout bool // 是否超时 +} +``` + +#### `CancelError` + +当 agent 被取消时通过 `AgentEvent.Err` 发送的错误。使用 `errors.As` 提取。 + +**Interrupt 吸收机制**:当取消处于活跃状态时,**任何** interrupt——无论是取消安全点节点产生的还是业务逻辑产生的(如 tool 中的 `tool.Interrupt`)——都会被转换为 `CancelError`。取消会"吸收"业务 interrupt。这是有意为之: + +- 在并发执行中(并行 workflow、并发 tool 调用),取消引发的 interrupt 和业务 interrupt 可能作为单一复合信号到达,无法拆分。 +- 即使在顺序执行中,在活跃取消期间将业务 interrupt 视为 CancelError 也能提供一致的语义——调用方只需处理 `CancelError` 一种信号,无需区分"取消引发的 interrupt"和"恰好在取消期间触发的业务 interrupt"。 +- 业务 interrupt **不会丢失**——checkpoint 保留了完整的 interrupt 层级。恢复运行(`Runner.Resume`)时,agent 重新执行中断的代码路径,业务 interrupt 会自然重新触发。 + +```go +type CancelError struct { + Info *AgentCancelInfo + + InterruptContexts []*InterruptCtx // 用于定向恢复的上下文(可配合 ResumeWithParams) +} +``` + +### 函数 + +#### `WithCancel` + +创建一个启用取消功能的 `AgentRunOption`。返回选项和取消函数。 + +```go +func WithCancel() (AgentRunOption, AgentCancelFunc) +``` + +**返回值:** + +- `AgentRunOption`:传递给 `Run()` 或 `Resume()` 的选项 +- `AgentCancelFunc`:用于取消的函数 + +**示例:** + +```go +cancelOpt, cancelFunc := WithCancel() +iter := runner.Run(ctx, messages, cancelOpt) + +// 之后,取消 agent +handle, contributed := cancelFunc(WithAgentCancelMode(CancelAfterChatModel)) +if contributed { + // 本次调用的取消选项已生效 + switch err := handle.Wait(); { + case err == nil: + // 取消成功 + case errors.Is(err, ErrExecutionEnded): + // agent 在取消生效前已结束 + case errors.Is(err, ErrCancelTimeout): + // 
安全点取消超时,已自动升级为立即取消 + } +} +``` + +### 选项 + +#### `WithAgentCancelMode` + +设置 agent 取消操作的取消模式。 + +```go +func WithAgentCancelMode(mode CancelMode) AgentCancelOption +``` + +**参数:** + +- `mode CancelMode`:要使用的取消模式 + +**示例:** + +```go +handle, _ := cancelFunc(WithAgentCancelMode(CancelAfterToolCalls)) +_ = handle.Wait() +``` + +#### `WithAgentCancelTimeout` + +设置取消操作的超时时间。仅适用于安全点模式。 + +```go +func WithAgentCancelTimeout(timeout time.Duration) AgentCancelOption +``` + +**参数:** + +- `timeout time.Duration`:超时时长 + +**行为:** + +- 仅对 `CancelAfterChatModel` / `CancelAfterToolCalls` 生效;若在超时内仍未到达安全点,将自动升级为 `CancelImmediate`。升级后同样会保存 checkpoint,可通过 `Runner.Resume` 恢复 +- `timeout <= 0` 不会设置有效 deadline,因此不会触发超时升级 +- 当发生超时升级时,`CancelHandle.Wait()` 返回 `ErrCancelTimeout`,同时 `CancelError.Info.Timeout=true` 且 `CancelError.Info.Escalated=true` + +**示例:** + +```go +handle, _ := cancelFunc( + WithAgentCancelMode(CancelAfterChatModel), + WithAgentCancelTimeout(5*time.Second), +) +_ = handle.Wait() +``` + +#### `WithRecursive` + +启用递归取消传播。默认情况下,取消模式仅影响根 agent;AgentTool 内的子 agent 不会收到取消通知。`WithRecursive` 使取消传播到所有子 agent: + +- **CancelAfterChatModel / CancelAfterToolCalls**:子 agent 检查各自的安全点,哪个先到达就触发取消。 +- **CancelImmediate**:子 agent 收到显式的 immediate-cancel 信号以实现干净的 teardown;根 agent 使用 grace period 收集子 agent 的 interrupt。 + +启用 `WithRecursive` 后,不仅根 agent 会保存 checkpoint,正在执行的 AgentTool 内的子 agent 也会保存各自的 checkpoint。恢复时可从子 agent 中断处继续,无需从根 agent 重新执行。 + +一旦任何一次取消调用包含了 `WithRecursive`,该标志在整个取消生命周期内保持有效(单调升级)。 + +```go +func WithRecursive() AgentCancelOption +``` + +**示例:** + +```go +// 取消时传播到嵌套子 agent +handle, _ := cancelFunc( + WithAgentCancelMode(CancelAfterChatModel), + WithRecursive(), +) +_ = handle.Wait() + +// 升级:先非递归取消,后续调用添加递归 +handle1, _ := cancelFunc(WithAgentCancelMode(CancelAfterChatModel)) +handle2, _ := cancelFunc(WithRecursive()) // 升级为递归,所有子 agent 现在开始检查安全点 +``` + +### 哨兵错误 + +#### `ErrCancelTimeout` + +当取消操作超时时,由 `CancelHandle.Wait` 返回。 + +```go +var ErrCancelTimeout = 
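errors.New("cancel timed out")
+```
+
+#### `ErrExecutionEnded`
+
+Returned by `CancelHandle.Wait` when the agent has already ended (completed normally or failed) before the cancel took effect.
+
+Note: business interrupts that occur while a cancellation is active are absorbed into `CancelError` (see the CancelError documentation), so they result in `nil` (cancel succeeded), **not** `ErrExecutionEnded`. This error is returned only when the execution ended completely without any interrupt.
+
+```go
+var ErrExecutionEnded = errors.New("execution already ended")
+```
+
+#### `ErrStreamCanceled`
+
+When `CancelImmediate` fires while streaming output is in progress, the framework aborts the underlying stream immediately and returns `ErrStreamCanceled` from `.Recv()` on `AgentEvent.Output.MessageOutput.MessageStream`. This applies to both the streaming response of a ChatModel and the streaming output of a StreamableTool: both streams are exposed to the user through `AgentEvent.Output.MessageOutput.MessageStream`, and the cancel monitoring mechanism is fully symmetric.
+
+**When it occurs**: only during `CancelImmediate` (including automatic escalation after a safe-point cancel times out), while a ChatModel or StreamableTool is streaming output. Safe-point cancels (`CancelAfterChatModel` / `CancelAfterToolCalls`) do not produce this error, because they wait for the safe point before interrupting.
+
+**Where it occurs**: `ErrStreamCanceled` appears in `AgentEvent.Output.MessageOutput.MessageStream.Recv()`, not in `AgentEvent.Err`. The Runner then emits a separate event whose `AgentEvent.Err` is a `*CancelError`, indicating that the cancel has completed. Note that this event does not contain `AgentEvent.Action.Interrupted`: `Action.Interrupted` is used only for business interrupts, while cancellation is always delivered through `CancelError`.
+
+```go
+var ErrStreamCanceled error = &StreamCanceledError{}
+```
+
+#### `StreamCanceledError`
+
+The concrete error type of `ErrStreamCanceled`. The type is exported so that stream cancel errors can be gob-serialized when a checkpoint is saved; business code can usually just check `errors.Is(err, ErrStreamCanceled)`.
+
+```go
+type StreamCanceledError struct{}
+
+func (e *StreamCanceledError) Error() string
+```
+
+```go
+// Handling ErrStreamCanceled while processing streaming events
+for {
+	event, ok := events.Next()
+	if !ok {
+		break
+	}
+
+	if event.Output != nil && event.Output.MessageOutput != nil && event.Output.MessageOutput.IsStreaming {
+		stream := event.Output.MessageOutput.MessageStream
+		for {
+			chunk, err := stream.Recv()
+			if err != nil {
+				if errors.Is(err, ErrStreamCanceled) {
+					// The stream was aborted by an immediate cancel (ChatModel or StreamableTool); a CancelError arrives in a later event
+					break
+				}
+				// io.EOF marks the normal end of the stream; any other error also ends it,
+				// so stop reading instead of looping on a failed stream
+				break
+			}
+			// Process chunk... 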
+			_ = chunk
+		}
+	}
+
+	if event.Err != nil {
+		var cancelErr *CancelError
+		if errors.As(event.Err, &cancelErr) {
+			// Cancel completed; CancelError carries the cancel mode, interrupt contexts, and related information
+			break
+		}
+	}
+}
+```
+
+## TurnLoop API
+
+### Overview
+
+`TurnLoop` is a push-based event loop that manages agent execution in units of turns. Users push items into the TurnLoop's buffer, and the TurnLoop processes them through the configured agent. This design enables flexible, event-driven agent workflows.
+
+**Note**: some TurnLoop features (such as preemption and stopping) depend on agent cancellation.
+
+### Core Types
+
+#### `TurnLoop[T, M]`
+
+The main event loop instance. Create it with `NewTurnLoop()`, then call `Run()` to start it.
+
+```go
+type TurnLoop[T any, M MessageType] struct { ... }
+```
+
+#### `MessageType`
+
+Constrains the message types that ADK can use. Currently only `*schema.Message` and `*schema.AgenticMessage` are supported; external packages cannot extend this union.
+
+```go
+type MessageType interface {
+	*schema.Message | *schema.AgenticMessage
+}
+```
+
+#### `TypedAgent[M]`
+
+The agent interface that the TurnLoop actually runs in each turn.
+
+```go
+type TypedAgent[M MessageType] interface {
+	Name(ctx context.Context) string
+	Description(ctx context.Context) string
+
+	Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]]
+}
+
+type Agent = TypedAgent[*schema.Message]
+```
+
+#### `TypedAgentInput[M]`
+
+The input passed to the agent.
+
+```go
+type TypedAgentInput[M MessageType] struct {
+	Messages        []M
+	EnableStreaming bool
+}
+
+type AgentInput = TypedAgentInput[*schema.Message]
+```
+
+#### `TypedAgentEvent[M]`
+
+The events emitted while the agent runs. The TurnLoop's `OnAgentEvents` callback consumes this type.
+
+```go
+type TypedAgentEvent[M MessageType] struct {
+	AgentName string
+	RunPath   []RunStep
+	Output    *TypedAgentOutput[M]
+	Action    *AgentAction
+	Err       error
+}
+
+type AgentEvent = TypedAgentEvent[*schema.Message]
+```
+
+#### `TurnLoopConfig[T, M]`
+
+The configuration struct used to create a TurnLoop.
+
+```go
+type TurnLoopConfig[T any, M MessageType] struct {
+	// GenInput receives the TurnLoop instance and all buffered items, and decides what to process.
+	// It returns which items to consume now and which to keep for later turns.
+	// The loop parameter allows calling Push() or Stop() directly inside the callback.
+	// Required.
+	GenInput func(ctx context.Context, loop *TurnLoop[T, M], items []T) (*GenInputResult[T, M], error)
+
+	// GenResume is called at most once during Run(). When CheckpointID is configured,
+	// 
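Run() looks up the checkpoint in the Store:
+	//   - If the checkpoint contains runner state (the agent was interrupted mid-turn),
+	//     Run() calls GenResume to plan the resume turn.
+	//   - Otherwise (no checkpoint, or a between-turns checkpoint), GenResume is not called
+	//     and the TurnLoop proceeds normally through GenInput.
+	// Parameters:
+	//   - inFlightItems: items being processed when the previous run was cancelled or hit a business interrupt
+	//   - unhandledItems: items that were buffered but not processed when the previous run exited
+	//   - newItems: new items buffered through Push() before Run() was called
+	//
+	// Returns a GenResumeResult describing how to resume the interrupted agent turn
+	// (optional ResumeParams) and how to manipulate the buffer (Consumed/Remaining).
+	// Optional; required only when resume is needed.
+	GenResume func(ctx context.Context, loop *TurnLoop[T, M], inFlightItems, unhandledItems, newItems []T) (*GenResumeResult[T, M], error)
+
+	// PrepareAgent returns a configured TypedAgent to handle the consumed items.
+	// Called once per turn with the items GenInput decided to consume.
+	// The loop parameter allows calling Push() or Stop() directly inside the callback.
+	// Required.
+	PrepareAgent func(ctx context.Context, loop *TurnLoop[T, M], consumed []T) (TypedAgent[M], error)
+
+	// OnAgentEvents handles the events emitted by the agent.
+	// TurnContext provides per-turn information and control:
+	//   - tc.Consumed: the consumed items that triggered this agent execution
+	//   - tc.Loop: allows calling Push() or Stop() directly inside the callback
+	//   - tc.Preempted / tc.Stopped: signals available while handling events
+	//
+	// Error handling: the returned error is only for the callback itself to abort the TurnLoop.
+	// The callback never needs to propagate a CancelError it sees; the framework handles it:
+	//   - On Stop: the framework propagates the CancelError as ExitReason and the TurnLoop terminates.
+	//   - On Preempt: the framework does not propagate the CancelError; if the callback also returns nil, the TurnLoop moves on to the next turn.
+	// In practice, return a non-nil error only when an internal callback failure must terminate the TurnLoop.
+	//
+	// Optional. If not provided, events are drained and the first error (including a CancelError triggered by Stop) is returned as ExitReason.
+	OnAgentEvents func(ctx context.Context, tc *TurnContext[T, M], events *AsyncIterator[*TypedAgentEvent[M]]) error
+
+	// Store is the checkpoint store used for persistence and resume. Optional.
+	// Together with CheckpointID, it enables automatic checkpoint resume.
+	// The TurnLoop always persists the runner checkpoint bytes and item bookkeeping
+	// (InFlightItems, UnhandledItems) with gob encoding, so T must be gob-encodable when a Store is used.
+	Store CheckPointStore
+
+	// CheckpointID works with Store to enable declarative, automatic checkpoint resume.
+	// On Run(), the TurnLoop queries the Store with this ID:
+	//   - If a checkpoint with runner state exists (mid-turn interrupt), it calls GenResume to plan the resume turn.
+	//   - If a checkpoint without runner state exists (between-turns), it buffers the stored unhandled items
+	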
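//     and then proceeds normally through GenInput.
+	//   - If no checkpoint exists, the TurnLoop starts from scratch.
+	//
+	// On exit, if the TurnLoop saves a new checkpoint, it saves it under the same CheckpointID.
+	// If no new checkpoint is saved, the TurnLoop tries to delete the old checkpoint under the same CheckpointID
+	// to prevent a stale resume (requires the Store to implement CheckPointDeleter).
+	// Use WithSkipCheckpoint() to skip checkpoint saving explicitly.
+	CheckpointID string
+}
+```
+
+#### `TurnContext[T, M]`
+
+The per-turn context available to the `OnAgentEvents` callback.
+
+```go
+type TurnContext[T any, M MessageType] struct {
+	// Loop is the TurnLoop instance; Push()/Stop() can be called on it inside the callback.
+	Loop *TurnLoop[T, M]
+
+	// Consumed is the set of consumed items that triggered this agent execution.
+	Consumed []T
+
+	// Preempted is closed when at least one preemptive Push (Push + WithPreempt)
+	// actually contributed to the current turn's CancelError.
+	// "Contribute" means the options of that cancel call were incorporated before the CancelError was finalized.
+	// If no preemption contributed in this turn (for example, the cancel was already finalized), the channel stays open.
+	//
+	// Preempted and Stopped may both be closed within the same turn, when the two signals arrive
+	// one after the other while the agent is still being cancelled. Signals that arrive after the
+	// cancel is fully handled do not contribute.
+	Preempted <-chan struct{}
+
+	// Stopped is closed when the cancel call issued by Stop() actually contributed to the current turn's CancelError.
+	// If Stop did not contribute (for example, the cancel was already finalized), the channel stays open.
+	//
+	// For the relationship between Preempted and Stopped, see the Preempted documentation.
+	Stopped <-chan struct{}
+
+	// StopCause returns the business-level stop cause passed through WithStopCause.
+	// The value is meaningful only after the Stopped channel is closed. Before that it returns an empty string.
+	StopCause func() string
+}
+```
+
+#### `GenInputResult[T, M]`
+
+The result returned by the `GenInput` callback.
+
+```go
+type GenInputResult[T any, M MessageType] struct {
+	// RunCtx is the execution context for this turn (optional).
+	// If set, it is used for PrepareAgent, the agent's Run/Resume, and OnAgentEvents.
+	// It must be derived from the ctx passed to GenInput to keep the TurnLoop's cancel semantics and inherited values.
+	// For example:
+	//   runCtx := context.WithValue(ctx, traceKey{}, extractTraceID(items))
+	//   return &GenInputResult[T, M]{RunCtx: runCtx, ...}, nil
+	// If nil, the TurnLoop's context is used.
+	RunCtx context.Context
+
+	// Input is the agent input to execute
+	Input *TypedAgentInput[M]
+
+	// RunOpts are the options for this agent run.
+	// Note: there is no need to pass WithCheckPointID here; the TurnLoop injects the checkpointID into the Runner automatically.
+	RunOpts []AgentRunOption
+
+	// Consumed is the set of items selected for processing in this turn:
+	// these items are removed from the buffer and passed as the input of PrepareAgent.
+	Consumed 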
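[]T
+
+	// Remaining is the set of items kept in the buffer for future turns:
+	// the TurnLoop pushes Remaining back into the buffer before the agent starts running this turn.
+	//
+	// Note: any item in items that is in neither Consumed nor Remaining is dropped.
+	Remaining []T
+}
+```
+
+#### `GenResumeResult[T, M]`
+
+The result returned by the `GenResume` callback.
+
+```go
+type GenResumeResult[T any, M MessageType] struct {
+	// RunCtx is the execution context for this resume turn (optional).
+	RunCtx context.Context
+
+	// RunOpts are the options for this agent resume run.
+	// Note: there is no need to pass WithCheckPointID here; the TurnLoop injects the checkpointID into the Runner automatically.
+	RunOpts []AgentRunOption
+
+	// ResumeParams contains the parameters for resuming the interrupted agent (optional).
+	ResumeParams *ResumeParams
+
+	// Consumed is the set of items selected for processing in this resume turn:
+	// these items are removed from the buffer and passed as the input of PrepareAgent.
+	Consumed []T
+
+	// Remaining is the set of items kept in the buffer for future turns:
+	// the TurnLoop pushes Remaining back into the buffer before the agent is resumed this turn.
+	//
+	// Note: any item in (inFlightItems, unhandledItems, newItems) that is in neither Consumed nor Remaining
+	// is dropped.
+	Remaining []T
+}
+```
+
+#### `InterruptError`
+
+When the agent raises a business interrupt (`AgentAction.Interrupted`) that causes the TurnLoop to exit, `ExitReason` is `*InterruptError`. It means the agent paused voluntarily at a business-defined interrupt point rather than being cancelled.
+
+```go
+type InterruptError struct {
+	// InterruptContexts provides the interrupt contexts needed for targeted resume.
+	// Each context represents one interrupted location in the agent hierarchy.
+	InterruptContexts []*InterruptCtx
+}
+
+func (e *InterruptError) Error() string
+```
+
+**Behavior:**
+
+- `*InterruptError` triggers a TurnLoop checkpoint save; on resume, the items the original turn was processing are available through the `inFlightItems` parameter of `GenResume`
+- `InterruptContexts` can be used to build `ResumeParams.Targets`, which is passed to `Runner.ResumeWithParams` through `GenResumeResult.ResumeParams`
+- Unlike `CancelError`, `InterruptError` represents a voluntary business-side pause; interrupts that occur while a cancellation is active are still absorbed into `CancelError`
+
+#### `TurnLoopExitState[T, M]`
+
+The state returned when the TurnLoop exits, containing the exit reason and the unhandled items.
+
+```go
+type TurnLoopExitState[T any, M MessageType] struct {
+	// ExitReason is the reason the TurnLoop exited.
+	// nil means a normal exit (Stop() was called and the TurnLoop finished normally).
+	// A non-nil value may be a context error, a callback error, *CancelError, and so on.
+	// When Stop() cancels a running agent, ExitReason is *CancelError.
+	// This field does not include checkpoint errors; see CheckpointErr.
+	ExitReason error
+
+	// 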
UnhandledItems contains items that were buffered but not processed,
+	// that is, items for which Push returned true but that no turn consumed.
+	// Always valid regardless of ExitReason.
+	UnhandledItems []T
+
+	// InFlightItems contains the items the interrupted turn was processing.
+	// Both cancellation (Stop + WithImmediate, WithGraceful, or WithGracefulTimeout) and business interrupts
+	// populate this field; it is empty if the agent completed normally before the cancel took effect.
+	// On resume, it is passed through to the user as the inFlightItems parameter of GenResume.
+	InFlightItems []T
+
+	// StopCause is the business-level stop cause passed through WithStopCause.
+	// Empty if Stop was not called or no cause was provided.
+	StopCause string
+
+	// CheckpointAttempted reports whether the TurnLoop tried to save a checkpoint on exit.
+	// It is true only when Store is configured, CheckpointID is set, the TurnLoop was not idle on exit,
+	// WithSkipCheckpoint was not used, and the exit was triggered by Stop() or a business interrupt.
+	CheckpointAttempted bool
+
+	// CheckpointErr is the error from saving the checkpoint, if any.
+	// nil when CheckpointAttempted is false (no save attempted) or when the save succeeded.
+	CheckpointErr error
+
+	// TakeLateItems returns the items pushed after the TurnLoop stopped
+	// (the items for which Push returned false). These items are not included in the checkpoint.
+	//
+	// The function is idempotent: the first call computes and caches the result, and later calls return the same slice.
+	//
+	// After TakeLateItems is called, any later Push() panics,
+	// to prevent items from being silently lost.
+	//
+	// Safe to call from any goroutine after Wait() returns.
+	// If TakeLateItems is never called, late items are garbage-collected normally.
+	TakeLateItems func() []T
+}
+```
+
+### Functions
+
+#### `NewTurnLoop`
+
+Creates a new TurnLoop without starting it. The returned TurnLoop accepts `Push` and `Stop` calls immediately; pushed items are buffered until `Run` is called.
+
+Panics if `GenInput` or `PrepareAgent` is nil.
+
+```go
+func NewTurnLoop[T any, M MessageType](cfg TurnLoopConfig[T, M]) *TurnLoop[T, M]
+```
+
+**Parameters:**
+
+- `cfg TurnLoopConfig[T, M]`: the TurnLoop configuration
+
+**Returns:**
+
+- `*TurnLoop[T, M]`: a TurnLoop instance that has not been started
+
+**Example:**
+
+```go
+loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{
+	GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) {
+		return &GenInputResult[string, *schema.Message]{
+			Input:    &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(items[0])}},
+			Consumed: items,
+		}, nil
+	},
+	PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) 
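(TypedAgent[*schema.Message], error) {
+		return myAgent, nil
+	},
+})
+
+// Items can be pushed, or the reference passed around, before the loop starts
+_, _ = loop.Push("initial_item")
+loop.Run(ctx)
+```
+
+### Methods
+
+All methods are safe to call before the TurnLoop is started (a lenient API):
+
+- `Push`: the item is buffered and processed once `Run` is called.
+- `Stop`: sets the stop flag, so a subsequent `Run` exits immediately.
+- `Wait`: blocks until `Run` has been called and the TurnLoop has exited. If `Run` is never called, `Wait` blocks forever.
+
+> Note: items pushed before the loop starts are processed once `Run` is first called.
+
+#### `Run`
+
+Starts the TurnLoop's processing goroutine. This method is non-blocking: the TurnLoop runs in the background, and the result is obtained through `Wait`.
+
+If `CheckpointID` is configured in `TurnLoopConfig` and a matching checkpoint exists in the `Store`, the TurnLoop automatically attempts to resume from it; otherwise it starts from scratch and processes the items already passed to `Push()`. Calling `Run` multiple times is an idempotent no-op: only the first call starts the TurnLoop.
+
+```go
+func (l *TurnLoop[T, M]) Run(ctx context.Context)
+```
+
+**Parameters:**
+
+- `ctx context.Context`: the context for the lifetime of the TurnLoop
+
+**Example:**
+
+```go
+loop := NewTurnLoop(cfg)
+loop.Run(context.Background())
+```
+
+#### `Push`
+
+Adds an item to the TurnLoop's buffer for processing. This method is non-blocking and thread-safe.
+
+```go
+func (l *TurnLoop[T, M]) Push(item T, opts ...PushOption[T, M]) (bool, <-chan struct{})
+```
+
+**Parameters:**
+
+- `item T`: the item to add to the buffer
+- `opts ...PushOption[T, M]`: optional push options
+
+**Returns:**
+
+- `bool`: `false` if the TurnLoop has already stopped (the item is still retained and can be retrieved through `TurnLoopExitState.TakeLateItems()`); otherwise `true` (including when `Run` has not been called yet, in which case the item is buffered)
+- `<-chan struct{}`: non-nil only when `WithPreempt` / `WithPreemptTimeout` is used. The caller can wait for this channel to close to confirm that the TurnLoop has received the preempt signal and submitted the cancel request, meaning the current turn is now certain to be preempted. Timing:
+  - If an agent is currently running: the channel closes after the TurnLoop calls cancel
+  - If no agent is currently running (the TurnLoop is idle or not yet started): the channel closes immediately (nothing to cancel)
+  - If you do not need the acknowledgment, ignore this return value
+
+**Example:**
+
+```go
+// Plain push
+ok, _ := loop.Push("message1")
+if !ok {
+	// The loop has stopped; the item can be retrieved through TakeLateItems()
+}
+
+// Preemptive push: push a new item and request cancellation of the current turn
+ok, ack := loop.Push("urgent_message", WithPreempt[string, *schema.Message](AnySafePoint))
+if !ok {
+	// The loop has stopped
+} else {
+	<-ack // Wait for the acknowledgment: the preempt signal was received and the current turn will be cancelled
+}
+```
+
+##### SafePoint type
+
+`SafePoint` describes the boundary at which an agent may be cancelled. Values can be combined with bitwise OR to accept multiple safe points.
+
+`SafePoint` is used only by the preemption API (`WithPreempt`/`WithPreemptTimeout`). A key design constraint: **preemption always targets a safe point**. The user's intent is to cancel at a well-defined boundary, not to abort immediately. Immediate cancel is reachable only through timeout escalation (via 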
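`WithPreemptTimeout`) and is not something the user selects directly. That is why `SafePoint` has no "immediate" value and why `WithPreempt` requires a non-zero `SafePoint` (it panics otherwise).
+
+`SafePoint` maps internally to `CancelMode`, but this detail is hidden from TurnLoop users.
+
+```go
+type SafePoint int
+
+const (
+	AfterChatModel SafePoint = 1 << iota             // allow the agent to be cancelled after the current chat-model call completes
+	AfterToolCalls                                   // allow the agent to be cancelled after the current round of tool calls completes
+	AnySafePoint   = AfterChatModel | AfterToolCalls // shorthand for AfterChatModel | AfterToolCalls
+)
+```
+
+##### `PushOption[T, M]`
+
+An opaque option type that configures a single `Push` call. Users normally do not implement this type themselves; they create options with `WithPreempt`, `WithPreemptTimeout`, `WithPreemptDelay`, or `WithPushStrategy`.
+
+```go
+type PushOption[T any, M MessageType] func(*pushConfig[T, M])
+```
+
+##### `WithPreempt`
+
+Pushes a new item and, at the same time, requests cancellation of the current agent turn at the specified `SafePoint`. After the cancel completes, the TurnLoop starts a new turn, and `GenInput` sees all buffered items (including the one just pushed). Use `WithPreemptTimeout` to add a timeout that escalates to an immediate abort.
+
+Because safe points fire at turn-level boundaries (after the chat model returns, or after all tool calls complete), **no nested agent is running when the cancel happens**: nested agents inside AgentTools have either not started yet (AfterChatModel) or already finished (AfterToolCalls). A `WithPreempt` cancel therefore does not involve nested agents. `WithPreemptTimeout`, by contrast, also terminates nested agents running inside AgentTools when the timeout escalates to an immediate cancel.
+
+`WithPreempt` and `WithPreemptTimeout` are mutually exclusive; if both are passed to the same `Push` call, the later one wins.
+
+`safePoint` must not be zero; passing `SafePoint(0)` panics.
+
+```go
+func WithPreempt[T any, M MessageType](safePoint SafePoint) PushOption[T, M]
+```
+
+**Parameters:**
+
+- `safePoint SafePoint`: the safe point at which the agent yields
+
+**Example:**
+
+```go
+_, _ = loop.Push("urgent_item", WithPreempt[string, *schema.Message](AnySafePoint))
+_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AfterToolCalls))
+```
+
+##### `WithPreemptTimeout`
+
+Like `WithPreempt`, but with a timeout. If the agent does not reach a safe point within the timeout, the preemption escalates to an immediate cancel. On escalation, nested agents inside AgentTools also receive the cancel signal and are terminated.
+
+`timeout <= 0` sets no effective deadline, so no timeout escalation is triggered.
+
+`safePoint` must not be zero; passing `SafePoint(0)` panics.
+
+```go
+func WithPreemptTimeout[T any, M MessageType](safePoint SafePoint, timeout time.Duration) PushOption[T, M]
+```
+
+**Parameters:**
+
+- `safePoint SafePoint`: the safe point at which the agent yields
+- `timeout time.Duration`: the period after which the preemption escalates to an immediate cancel
+
+**Example:**
+
+```go
+_, _ = 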
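loop.Push("urgent_item", WithPreemptTimeout[string, *schema.Message](AnySafePoint, 5*time.Second))
+```
+
+##### `WithPreemptDelay`
+
+Sets a delay before the preemption takes effect. Must be used together with `WithPreempt` or `WithPreemptTimeout`.
+
+`delay <= 0` is equivalent to setting no delay.
+
+```go
+func WithPreemptDelay[T any, M MessageType](delay time.Duration) PushOption[T, M]
+```
+
+**Parameters:**
+
+- `delay time.Duration`: the delay before preemption
+
+**Example:**
+
+```go
+_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AnySafePoint), WithPreemptDelay[string, *schema.Message](2*time.Second))
+```
+
+##### `WithPushStrategy`
+
+Resolves push options dynamically from the current turn state. The callback receives the current turn's context and `TurnContext` (nil if there is no active turn) and returns the list of `PushOption` values to actually apply.
+
+When `WithPushStrategy` is used, all other `PushOption` values passed in the same `Push` call are ignored. The returned options must not contain another `WithPushStrategy`; a nested strategy is silently stripped.
+
+The TurnLoop first holds the current run loop under an internal lock and takes a snapshot of the current turn, then invokes the callback on that stable snapshot; the loop cannot advance to the next turn between the callback reading the turn state and the final push decision.
+
+```go
+func WithPushStrategy[T any, M MessageType](fn func(ctx context.Context, tc *TurnContext[T, M]) []PushOption[T, M]) PushOption[T, M]
+```
+
+**Parameters:**
+
+- `fn func(ctx context.Context, tc *TurnContext[T, M]) []PushOption[T, M]`: the strategy callback
+  - `ctx`: the context of the current turn (`context.Background()` when there is no active turn)
+  - `tc`: the `TurnContext` of the current turn (`nil` when there is no active turn)
+
+**Example:**
+
+```go
+_, _ = loop.Push(urgentItem, WithPushStrategy(func(ctx context.Context, tc *TurnContext[MyItem, *schema.Message]) []PushOption[MyItem, *schema.Message] {
+	if tc == nil {
+		return nil // between turns: plain push
+	}
+	if isLowPriority(tc.Consumed) {
+		return []PushOption[MyItem, *schema.Message]{WithPreempt[MyItem, *schema.Message](AnySafePoint)}
+	}
+	return nil // do not preempt high-priority work
+}))
+```
+
+#### `Stop`
+
+Sends a stop signal to the TurnLoop and returns immediately (non-blocking).
+
+Without options, the current agent turn runs to completion and the TurnLoop exits at the turn boundary without starting a new turn. In this case `ExitReason` is `nil`.
+
+Use `WithImmediate()` to abort the running agent turn immediately. Use `WithGraceful()` to cancel at the nearest safe point and propagate recursively to nested agents. Use `WithGracefulTimeout()` to cancel at a safe point with an escalation deadline. Use `UntilIdleFor()` to defer the stop until the TurnLoop has been continuously idle for a given duration, after which it shuts down automatically.
+
+Can be called before `Run`, in which case the subsequent `Run` exits immediately.
+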
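+Multiple calls are allowed; later calls update the cancel options. A `Stop()` call without `UntilIdleFor` shuts the TurnLoop down immediately, even if an earlier `UntilIdleFor` is still waiting. Note that `WithSkipCheckpoint` and `WithStopCause` have sticky semantics; see their respective documentation.
+
+If the running agent does not support the `WithCancel` `AgentRunOption`, all cancel-related options (`WithImmediate`, `WithGraceful`, `WithGracefulTimeout`) degrade to "exit the TurnLoop on entering the next iteration": the current agent turn runs to completion, and then the TurnLoop exits.
+
+Call `Wait()` to block until the TurnLoop has fully exited and to obtain the result.
+
+```go
+func (l *TurnLoop[T, M]) Stop(opts ...StopOption)
+```
+
+**Parameters:**
+
+- `opts ...StopOption`: optional stop options
+
+**Example:**
+
+```go
+// No options: exit at the turn boundary (stop after the current turn completes; ExitReason is nil)
+loop.Stop()
+
+// Abort the current agent turn immediately
+loop.Stop(WithImmediate())
+
+// Stop at a safe point (graceful shutdown, propagated recursively to nested agents)
+loop.Stop(WithGraceful())
+
+// Stop at a safe point with a timeout
+loop.Stop(WithGracefulTimeout(10 * time.Second))
+
+// Stop when idle (shut down automatically after 30 seconds of continuous idleness)
+loop.Stop(UntilIdleFor(30 * time.Second))
+```
+
+##### `StopOption`
+
+An opaque option type that configures a single `Stop` call. Users normally do not implement this type themselves; they create options with `WithImmediate`, `WithGraceful`, `WithGracefulTimeout`, `UntilIdleFor`, `WithSkipCheckpoint`, or `WithStopCause`.
+
+```go
+type StopOption func(*stopConfig)
+```
+
+##### `WithImmediate`
+
+Aborts the running agent turn immediately, without waiting for any safe point. Nested agents inside AgentTools also receive the cancel signal and are terminated.
+
+This is the most aggressive stop mode, suited to cases where stopping as fast as possible is the priority; if you also know that no future resume is needed, add `WithSkipCheckpoint()`.
+
+```go
+func WithImmediate() StopOption
+```
+
+**Example:**
+
+```go
+loop.Stop(WithImmediate())
+```
+
+##### `WithGraceful`
+
+Requests a graceful stop: waits at the nearest safe point (after tool calls or after a chat-model call) and propagates recursively to nested agents. No time limit is set; use `WithGracefulTimeout` to add a timeout after which the stop escalates to an immediate cancel.
+
+`WithGraceful` and `WithGracefulTimeout` are mutually exclusive; if both are passed to the same `Stop` call, the later one wins.
+
+```go
+func WithGraceful() StopOption
+```
+
+**Example:**
+
+```go
+loop.Stop(WithGraceful())
+```
+
+##### `WithGracefulTimeout`
+
+Like `WithGraceful`, but with a timeout. If the agent does not reach a safe point within `gracePeriod`, the stop escalates to an immediate cancel.
+
+`gracePeriod` must be positive; passing zero or a negative value panics.
+
+```go
+func WithGracefulTimeout(gracePeriod time.Duration) StopOption
+```
+
+**Parameters:**
+
+- `gracePeriod time.Duration`: the period after which the stop escalates to an immediate cancel
+
+**Example:**
+
+```go
+loop.Stop(WithGracefulTimeout(10 * time.Second))
+```
+
+##### `UntilIdleFor`
+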
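+Defers the stop until the TurnLoop has been continuously idle (blocked between turns with no pending items) for the specified duration. The timer resets to zero whenever a new item arrives.
+
+Useful when business code monitors agent activity from the outside and wants to shut the TurnLoop down after a period without work, without racing against concurrent `Push` calls.
+
+`UntilIdleFor` does not affect a running agent; it takes effect only while the TurnLoop is idle. Cancel options in the same call (`WithImmediate`, `WithGraceful`, `WithGracefulTimeout`) are silently ignored. `UntilIdleFor` can be combined with non-cancel options (`WithSkipCheckpoint`, `WithStopCause`).
+
+To escalate to an immediate shutdown during the idle wait, issue a new `Stop` call without `UntilIdleFor`; it overrides the idle wait:
+
+```go
+loop.Stop(UntilIdleFor(30 * time.Second)) // wait for idleness
+// ... later, if an immediate abort is needed:
+loop.Stop(WithImmediate()) // overrides the idle wait and shuts down immediately
+```
+
+Only the duration of the first `UntilIdleFor` takes effect; later calls with a different duration are ignored.
+
+`duration` must be positive; passing zero or a negative value panics.
+
+```go
+func UntilIdleFor(duration time.Duration) StopOption
+```
+
+**Parameters:**
+
+- `duration time.Duration`: how long the loop must stay idle
+
+**Example:**
+
+```go
+// Shut down automatically after 30 seconds of continuous idleness
+loop.Stop(UntilIdleFor(30 * time.Second))
+
+// Idle shutdown without saving a checkpoint
+loop.Stop(UntilIdleFor(30*time.Second), WithSkipCheckpoint())
+```
+
+##### `WithSkipCheckpoint`
+
+Tells the TurnLoop not to persist a checkpoint for this Stop. Suitable when the caller is certain it will not resume later.
+
+The flag is sticky: once any `Stop()` call sets it, later escalation calls cannot undo it.
+
+```go
+func WithSkipCheckpoint() StopOption
+```
+
+**Example:**
+
+```go
+// Stop permanently without saving a checkpoint
+loop.Stop(WithSkipCheckpoint())
+
+// Combined with a cancel option: abort immediately and do not save a checkpoint
+loop.Stop(WithImmediate(), WithSkipCheckpoint())
+```
+
+##### `WithStopCause`
+
+Attaches a business-level stop cause string to the Stop call.
+
+The cause is exposed in two places:
+
+- `TurnLoopExitState.StopCause`: available after `Wait()` returns
+- `TurnContext.StopCause()`: available in `OnAgentEvents` after `<-tc.Stopped` is closed
+
+If several `Stop()` calls provide a cause, the first non-empty value wins.
+
+```go
+func WithStopCause(cause string) StopOption
+```
+
+**Parameters:**
+
+- `cause string`: the business-level stop cause
+
+**Example:**
+
+```go
+loop.Stop(WithStopCause("user session timeout"))
+
+// Combined usage
+loop.Stop(
+	WithGraceful(),
+	WithStopCause("quota exceeded"),
+)
+```
+
+#### `Wait`
+
+Blocks until the TurnLoop exits and returns the exit state. Safe to call from multiple goroutines; all callers receive the same result. It blocks until `Run` has been called and the TurnLoop has exited; if `Run` is never called, it blocks forever.
+
+```go
+func (l *TurnLoop[T, M]) Wait() *TurnLoopExitState[T, M]
+```
+
+**Returns:**
+
+- `*TurnLoopExitState[T, M]`: the exit state, including the exit reason, unhandled items, checkpoint 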
status, and the business stop cause
+
+**Example:**
+
+```go
+loop.Stop()
+result := loop.Wait()
+if result.ExitReason != nil {
+	log.Printf("loop exited with error: %v", result.ExitReason)
+}
+```
+
+### Extension Interfaces
+
+#### `CheckPointStore`
+
+The storage interface for saving and loading checkpoints. `TurnLoopConfig.Store` uses this interface; the TurnLoop enables automatic resume and persistence only when it is configured together with `CheckpointID`.
+
+```go
+type CheckPointStore interface {
+	Get(ctx context.Context, checkPointID string) ([]byte, bool, error)
+	Set(ctx context.Context, checkPointID string, checkPoint []byte) error
+}
+```
+
+#### `CheckPointDeleter`
+
+An optional extension of `CheckPointStore`. A Store that implements this interface supports explicit checkpoint deletion.
+
+When no new checkpoint is saved, the TurnLoop tries to delete the previously loaded checkpoint to prevent a stale resume. **The deletion happens only if the Store implements CheckPointDeleter**; otherwise the lifetime of stale checkpoints is managed by the Store itself.
+
+```go
+type CheckPointDeleter interface {
+	Delete(ctx context.Context, checkPointID string) error
+}
+```
+
+---
+
+## Usage Examples
+
+### Basic Agent Cancellation
+
+```go
+ctx := context.Background()
+runner := NewRunner(ctx, RunnerConfig{
+	Agent: myAgent,
+})
+
+// Enable cancellation
+cancelOpt, cancelFunc := WithCancel()
+iter := runner.Run(ctx, messages, cancelOpt)
+
+// In another goroutine, cancel after the chat model completes
+go func() {
+	time.Sleep(2 * time.Second)
+	handle, _ := cancelFunc(
+		WithAgentCancelMode(CancelAfterChatModel),
+		WithAgentCancelTimeout(5*time.Second),
+	)
+	err := handle.Wait()
+	if err != nil {
+		log.Printf("cancel failed: %v", err)
+	}
+}()
+
+// Handle events
+for {
+	event, ok := iter.Next()
+	if !ok {
+		break
+	}
+	if event.Err != nil {
+		var cancelErr *CancelError
+		if errors.As(event.Err, &cancelErr) {
+			log.Printf("agent cancelled: mode=%v, escalated=%v",
+				cancelErr.Info.Mode, cancelErr.Info.Escalated)
+		}
+		break
+	}
+	// Handle the event
+}
+```
+
+### Basic TurnLoop Usage
+
+```go
+ctx := context.Background()
+
+loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{
+	GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) {
+		// Process all items and bind a trace context to this turn
+		runCtx := context.WithValue(ctx, traceKey{}, 
extractTrace(items[0]))
+		return &GenInputResult[string, *schema.Message]{
+			RunCtx:   runCtx,
+			Input:    &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(strings.Join(items, "\n"))}},
+			Consumed: items,
+		}, nil
+	},
+	PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) (Agent, error) {
+		return myAgent, nil
+	},
+	OnAgentEvents: func(ctx context.Context, tc *TurnContext[string, *schema.Message], events *AsyncIterator[*TypedAgentEvent[*schema.Message]]) error {
+		for {
+			event, ok := events.Next()
+			if !ok {
+				break
+			}
+			if event.Err != nil {
+				var cancelErr *CancelError
+				if errors.As(event.Err, &cancelErr) {
+					// The TurnLoop captures the cancel and turns it into the exit state; the callback does not need to return it.
+					continue
+				}
+				return event.Err
+			}
+			// Handle the event
+		}
+		return nil
+	},
+})
+
+// Items can be pushed before the loop starts
+_, _ = loop.Push("user message 1")
+_, _ = loop.Push("user message 2")
+
+// Start the loop
+loop.Run(ctx)
+
+// Stop and wait (exit at the turn boundary; ExitReason is nil)
+loop.Stop()
+result := loop.Wait()
+```
+
+### TurnLoop with Preemption
+
+```go
+loop := NewTurnLoop(TurnLoopConfig[string, *schema.Message]{...})
+loop.Run(ctx)
+
+// Push an urgent item and preempt the current agent
+_, ack := loop.Push("urgent_message", WithPreempt[string, *schema.Message](AnySafePoint))
+if ack != nil {
+	<-ack
+}
+
+// Or with a delay
+_, _ = loop.Push("item", WithPreempt[string, *schema.Message](AnySafePoint), WithPreemptDelay[string, *schema.Message](1*time.Second))
+```
+
+### Declarative Checkpoint Resume with TurnLoop
+
+```go
+ctx := context.Background()
+
+// First run: configure Store and CheckpointID to enable automatic checkpoints
+cfg := TurnLoopConfig[string, *schema.Message]{
+	GenInput: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], items []string) (*GenInputResult[string, *schema.Message], error) {
+		return &GenInputResult[string, *schema.Message]{
+			Input:    &TypedAgentInput[*schema.Message]{Messages: []Message{schema.UserMessage(items[0])}},
+			Consumed: items,
+		}, nil
+	},
+	GenResume: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], inFlightItems, 
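unhandledItems, newItems []string) (*GenResumeResult[string, *schema.Message], error) {
+		// Copy into a fresh slice so that append cannot write into the backing array of inFlightItems
+		all := append([]string(nil), inFlightItems...)
+		all = append(append(all, unhandledItems...), newItems...)
+		return &GenResumeResult[string, *schema.Message]{
+			Consumed: all,
+		}, nil
+	},
+	PrepareAgent: func(ctx context.Context, loop *TurnLoop[string, *schema.Message], consumed []string) (Agent, error) {
+		return myAgent, nil
+	},
+	Store:        myStore,
+	CheckpointID: "my-session-id",
+}
+
+loop := NewTurnLoop(cfg)
+_, _ = loop.Push("message1")
+loop.Run(ctx)
+
+// Stop the run
+loop.Stop(WithGraceful())
+exit := loop.Wait()
+if exit.CheckpointErr != nil {
+	log.Printf("checkpoint save failed: %v", exit.CheckpointErr)
+}
+
+// Resume from the checkpoint: use the same cfg (with the same CheckpointID);
+// Run() detects the checkpoint and resumes from it automatically
+loop2 := NewTurnLoop(cfg)
+_, _ = loop2.Push("new_message") // the new item is passed to GenResume as newItems
+loop2.Run(ctx)
+result2 := loop2.Wait()
+_ = result2
+```
+
+---
+
+## Best Practices
+
+### Agent Cancellation
+
+1. **Choose the right mode**: use a safe-point mode (`CancelAfterChatModel`, `CancelAfterToolCalls`) for a graceful cancel, and `CancelImmediate` for emergencies
+2. **Set a timeout**: set a timeout for safe-point modes to avoid waiting indefinitely
+3. **Handle CancelError**: check event errors for `CancelError` to tell cancellation apart from failure
+4. **Understand interrupt absorption**: business interrupts that occur while a cancellation is active are absorbed into `CancelError`, but the checkpoint keeps the full data, and the business interrupt fires again naturally on resume
+5. **Resumability**: use the `InterruptContexts` in `CancelError` for targeted resume
+6. **Recursive propagation**: by default a cancel affects only the root agent. When the agent hierarchy contains child agents nested in AgentTools, use `WithRecursive()` to propagate the cancel to all child agents. When in doubt, leave `WithRecursive()` off and enable it only when child agents clearly need to respond to cancel safe points as well
+
+### TurnLoop
+
+1. **Handle all events**: if you provide `OnAgentEvents`, consume the event iterator completely; if you do not, the framework drains the events automatically
+2. **Observe preemption and stop**: in `OnAgentEvents`, use `TurnContext.Preempted` / `TurnContext.Stopped` (with `select`) to observe preemption and stop; note that they are closed only when the corresponding cancel call actually contributed to this turn's `CancelError`
+3. **Declarative checkpoints**: configure both `Store` and `CheckpointID` in `TurnLoopConfig` to enable automatic checkpoint resume; `Run()` detects an existing checkpoint and resumes from it automatically
+4. **Resuming a run**: create a new TurnLoop with the same `CheckpointID` and call `Run()`; the framework detects the checkpoint and calls `GenResume`; buffer new items with `Push()` before `Run()`
+5. **Stale checkpoint cleanup**: when no new checkpoint is saved, the framework deletes the previously loaded checkpoint to prevent a stale resume; **the deletion happens only if the Store implements the CheckPointDeleter interface**
+6. 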
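**Distinguish CancelError from business interrupts**: `*CancelError` is the cancel path, and `*InterruptError` is a voluntary business-side interrupt; both may produce a checkpoint and hand the in-flight items back through the `inFlightItems` parameter of `GenResume`
+7. **Skipping checkpoints**: when you are certain there will be no resume, use `WithSkipCheckpoint()` to avoid unnecessary checkpoint writes; the flag stays sticky across escalation calls
+8. **Business stop cause**: attach a business-level stop cause with `WithStopCause()`, separate from the technical `ExitReason`; in `OnAgentEvents`, read it with `tc.StopCause()` after `<-tc.Stopped`
+9. **gob compatibility of T**: when a `Store` is used, `T` must be gob-encodable, because the framework persists the runner bytes and item bookkeeping with gob
+10. **Stop escalation**: `Stop()` can be called multiple times; later calls update the cancel options (for example, escalating from `WithGraceful()` to `WithImmediate()`)
+11. **Idle shutdown**: use `UntilIdleFor()` to shut down automatically when there is no work, avoiding races with concurrent `Push` calls
+12. **Context derivation**: for per-turn tracing, derive `RunCtx` from `ctx` in `GenInput`/`GenResume`
+13. **Late item recovery**: when `Push()` returns `false`, the item is not lost; retrieve it through `TurnLoopExitState.TakeLateItems()`; note that `Push()` must not be called after `TakeLateItems`
+14. **Separate exit result from checkpoint result**: `ExitReason` reflects why the loop itself exited, while `CheckpointAttempted` + `CheckpointErr` reflect the outcome of checkpoint persistence; evaluate the two independently
+
+### Using Them Together
+
+1. **Preempt vs. stop**: use `WithPreempt()` for urgent items during execution, and `Stop()` for final shutdown
+2. **Conditional preemption**: when the preemption decision depends on the current turn state, use `WithPushStrategy` rather than reading the state and then calling `Push`; it runs on an atomic snapshot and avoids TOCTOU races
+3. 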
**Context cancellation**: cancelling the `ctx` passed to `Run(ctx)` aborts the current turn and makes the loop exit (`ExitReason` is usually `context.Canceled`/`context.DeadlineExceeded`); `Stop()` is better suited to an orderly shutdown, with the cancel strategy controlled through `WithGraceful`/`WithGracefulTimeout`
diff --git a/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md b/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md
index 761f9bf5116..b1686c4ed49 100644
--- a/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md
+++ b/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md
@@ -1,6 +1,6 @@
 ---
 Description: ""
-date: "2026-03-16"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: ReAct Agent User Manual
diff --git a/content/zh/docs/eino/ecosystem_integration/_index.md b/content/zh/docs/eino/ecosystem_integration/_index.md
index dd106a516b6..07e32c60ada 100644
--- a/content/zh/docs/eino/ecosystem_integration/_index.md
+++ b/content/zh/docs/eino/ecosystem_integration/_index.md
@@ -1,67 +1,10 @@
 ---
 Description: ""
-date: "2026-01-20"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: Component Integration
-weight: 6
+weight: 5
 ---
-## Component Integration
-### ChatModel
-
-- openai: [ChatModel - OpenAI](https://github.com/cloudwego/eino-ext/blob/main/components/model/openai/README.md)
-- ark: [ChatModel - ARK](https://github.com/cloudwego/eino-ext/blob/main/components/model/ark/README.md)
-- ollama: [ChatModel - Ollama](https://github.com/cloudwego/eino-ext/blob/main/components/model/ollama/README.md)
-
-### Document
-
-#### Loader
-
-- file: [Loader - local file](/zh/docs/eino/ecosystem_integration/document/loader_local_file)
-- s3: [Loader - amazon s3](/zh/docs/eino/ecosystem_integration/document/loader_amazon_s3)
-- web url: [Loader - web url](/zh/docs/eino/ecosystem_integration/document/loader_web_url)
-
-#### Parser
-
-- html: [Parser - html](/zh/docs/eino/ecosystem_integration/document/parser_html)
-- pdf: [Parser - pdf](/zh/docs/eino/ecosystem_integration/document/parser_pdf)
-
-#### Transformer
-
-- 
markdown splitter: [Splitter - markdown](/zh/docs/eino/ecosystem_integration/document/splitter_markdown)
-- recursive splitter: [Splitter - recursive](/zh/docs/eino/ecosystem_integration/document/splitter_recursive)
-- semantic splitter: [Splitter - semantic](/zh/docs/eino/ecosystem_integration/document/splitter_semantic)
-
-### Embedding
-
-- ark: [Embedding - ARK](/zh/docs/eino/ecosystem_integration/embedding/embedding_ark)
-- openai: [Embedding - OpenAI](/zh/docs/eino/ecosystem_integration/embedding/embedding_openai)
-
-### Indexer
-
-- volc vikingdb: [Indexer - volc VikingDB](/zh/docs/eino/ecosystem_integration/indexer/indexer_volc_vikingdb)
-- Milvus 2.5+: [Indexer - Milvus 2 (v2.5+)](/zh/docs/eino/ecosystem_integration/indexer/indexer_milvusv2)
-- Milvus 2.4: [Indexer - Milvus](/zh/docs/eino/ecosystem_integration/indexer/indexer_milvus)
-- OpenSearch 3: [Indexer - OpenSearch 3](/zh/docs/eino/ecosystem_integration/indexer/indexer_opensearch3)
-- OpenSearch 2: [Indexer - OpenSearch 2](/zh/docs/eino/ecosystem_integration/indexer/indexer_opensearch2)
-- ElasticSearch 9: [Indexer - Elasticsearch 9](/zh/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch9)
-- Elasticsearch 8: [Indexer - ES8](/zh/docs/eino/ecosystem_integration/indexer/indexer_es8)
-- ElasticSearch 7: [Indexer - Elasticsearch 7 ](/zh/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch7)
-
-### Retriever
-
-- volc vikingdb: [Retriever - volc VikingDB](/zh/docs/eino/ecosystem_integration/retriever/retriever_volc_vikingdb)
-- Milvus 2.5+: [Retriever - Milvus 2 (v2.5+) ](/zh/docs/eino/ecosystem_integration/retriever/retriever_milvusv2)
-- Milvus 2.4: [Retriever - Milvus](/zh/docs/eino/ecosystem_integration/retriever/retriever_milvus)
-- OpenSearch 3: [Retriever - OpenSearch 3](/zh/docs/eino/ecosystem_integration/retriever/retriever_opensearch3)
-- OpenSearch 2: [Retriever - OpenSearch 2](/zh/docs/eino/ecosystem_integration/retriever/retriever_opensearch2)
-- ElasticSearch 9: 
[Retriever - Elasticsearch 9](/zh/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch9) -- ElasticSearch 8: [Retriever - ES8](/zh/docs/eino/ecosystem_integration/retriever/retriever_es8) -- ElasticSearch 7: [Retriever - ES 7](/zh/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch7) - -### Tools - -- googlesearch: [Tool - Googlesearch](/zh/docs/eino/ecosystem_integration/tool/tool_googlesearch) -- duckduckgo search: [Tool - DuckDuckGoSearch](/zh/docs/eino/ecosystem_integration/tool/tool_duckduckgo_search) diff --git a/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md b/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md index 429009520f3..1189ebfa9fa 100644 --- a/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md +++ b/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-19" lastmod: "" tags: [] title: ChatModel @@ -31,3 +31,24 @@ weight: 1 - 上述链接直接指向 GitHub 仓库的最新文档 - 中文文档和英文文档内容同步更新 - 如需查看历史版本或提交文档修改建议,请访问 GitHub 仓库 + +# AgenticModel 组件列表 + +本分类的各组件详细文档请参考 GitHub README: + + + + + + + + +
| Component | Chinese Docs | English Docs |
| --- | --- | --- |
| AgenticARK | README.zh_CN.md | README.md |
| AgenticDeepSeek | README_zh.md | README.md |
| AgenticOpenAI | README.zh_CN.md | README.md |
| AgenticGemini | README.zh_CN.md | README.md |
| AgenticQwen | README_zh.md | README.md |
    + +--- + +**说明**: + +- 上述链接直接指向 GitHub 仓库的最新文档 +- AgenticModel 是面向 Agentic 场景的模型接口,支持 Server Tools、MCP Tools、前缀缓存等高级能力 +- 如需查看历史版本或提交文档修改建议,请访问 GitHub 仓库 diff --git a/content/zh/docs/eino/overview/_index.md b/content/zh/docs/eino/overview/_index.md index 186b15725bf..88b5fc6ae0d 100644 --- a/content/zh/docs/eino/overview/_index.md +++ b/content/zh/docs/eino/overview/_index.md @@ -350,7 +350,7 @@ Eino 框架由几个部分组成: - [Eino Devops](https://github.com/cloudwego/eino-ext/tree/main/devops):可视化开发、可视化调试等。 - [EinoExamples](https://github.com/cloudwego/eino-examples):是包含示例应用程序和最佳实践的代码仓库。 -详见:[Eino 框架结构说明](/zh/docs/eino/overview/eino_框架结构说明) +详见:[Eino 框架结构说明](/zh/docs/eino/overview/eino_architecture) ## 详细文档 diff --git a/content/zh/docs/eino/overview/eino_adk_quickstart.md b/content/zh/docs/eino/overview/eino_adk_quickstart.md new file mode 100644 index 00000000000..0409bda79b6 --- /dev/null +++ b/content/zh/docs/eino/overview/eino_adk_quickstart.md @@ -0,0 +1,255 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: 五分钟上手 Eino ADK +weight: 9 +--- + +本文面向已了解 Eino 的开发者,聚焦 ADK 中最重要的自主决策原语:**ChatModelAgent** 及其运行时增强机制 **ChatModelAgentMiddleware**。 + +## 先认识 ChatModelAgent + +当我们谈论 "Agent" 时,绝大多数时候指的是:以大模型为核心,配备工具,能够自主决策并解决复杂现实问题的实体。`ChatModelAgent` 就是 Eino ADK 对这一概念的直接实现。 + +**ChatModelAgent = 以 ChatModel 作为决策器、以 Tools 作为行动空间、以工具反馈和历史记录作为下一轮决策上下文的 ReAct Agent。** + +四个关键部分: + +1. **ChatModel**:大模型,负责推理与决策。 +2. **Tools**:工具集合,定义 Agent 可执行的行动范围。 +3. **反馈**:工具执行结果回到模型上下文,成为下一轮决策的依据。 +4. **历史记录**:完整保留问题求解过程中的推理轨迹、工具调用和工具结果。 + +因此,`ChatModelAgent` 不是一次模型调用,而是一次可持续推进的问题求解过程。 + +## ChatModelAgent 的执行结构:ReAct Loop + +`ChatModelAgent` 的核心能力是**自主决策**——在一次 `Run` 中,模型可以反复推理、行动、获取反馈,直到问题被解决。支撑这种能力的执行结构就是 ReAct Loop。 + +自主决策需要四个要素同时存在: + +1. **决策器(ChatModel)**:每一轮根据当前上下文,判断下一步该做什么。 +2. **行动空间(Tools)**:定义 Agent 能采取的具体行动。 +3. **反馈信号(Tool Feedback)**:行动的结果被注入上下文,成为后续决策的依据——这使 Agent 能根据真实执行结果修正方向,而不是一次猜测到底。 +4. 
**累积上下文(History)**:完整保留推理轨迹、工具调用与工具结果。每一轮模型看到的不是独立的单次提问,而是从问题开始到当前为止的完整求解过程。 + +这四者缺一不可:没有决策器就无法推理,没有行动空间就无法执行,没有反馈就无法修正,没有累积上下文就无法基于历史做出更好的判断。 + + + +关键特征:**累积上下文驱动的渐进式决策**。每一轮循环不是从零开始,而是在此前所有推理与行动的完整轨迹之上继续推进。模型的每一次决策都基于不断增长的问题求解上下文做出,这让 Agent 能处理需要多步推理、试错、修正的复杂任务。 + +## 什么让你的 ChatModelAgent 不同 + +ReAct Loop 的结构是固定的。那什么让**你的** ChatModelAgent 有别于其他人的,能针对你的具体问题? + +四个维度: + +1. **ChatModel** — 选择哪个模型做决策。 +2. **Instruction** — 系统指令:角色定义、行为约束、少样本示例。 +3. **Tools** — 工具集合:决定 Agent 可以做什么。 +4. **Middleware(ChatModelAgentMiddleware)** — 在 ReAct Loop 的特定生命周期点位上注入行为:拦截、修改、增强循环中的输入和输出。 + +前三者定义了 Agent "是什么"——决策能力、角色约束、行动范围。 + +Middleware 定义了 Agent "怎么跑"——它不改变 Loop 的结构(推理 → 行动 → 反馈始终不变),而是控制循环运行时的具体行为。例如:模型调用前压缩上下文、运行前动态注入工具、工具调用时做权限检查、模型失败时重试或切换备用模型。这些都是在 Loop 的特定点位上做的运行时增强。 + +## Middleware:在 ReAct Loop 中注入行为 + +构建 ChatModelAgent 时,你会遇到这些典型问题: + +- **Agent 需要读写文件、执行命令?** → 需要在运行前注入一组通用工具。 +- **Agent 需要复用一组预定义的指令和知识?** → 需要把可复用能力打包成 Skill,按需加载。 +- **上下文越来越长,超出模型窗口怎么办?** → 需要在每次模型调用前自动压缩历史。 +- **工具太多,全部塞进 prompt 会稀释注意力?** → 需要按需搜索和加载工具。 +- **模型偶尔调用失败或返回垃圾?** → 需要自动重试或切换备用模型。 + +这些需求的共同点:它们不需要改变 ReAct Loop 的结构,只需要在循环的特定点位上拦截和增强。这就是 Middleware 做的事。 + +对应的内置 Middleware: + + + + + + + + +
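| Scenario | Middleware | What it does |
| --- | --- | --- |
| File system access is needed | FileSystem | Injects tools such as ls/read/write/edit/grep/execute before the run |
| Predefined capabilities should be reused | Skill | Packages instructions, knowledge, and tools into skill units that are loaded on demand |
| Context exceeds the model window | Reduction / Summarization | Compresses messages and tool results before each model call |
| Too many tools | ToolSearch | Searches for and loads Tools on demand instead of exposing all of them at once |
| Model calls are unstable | ModelRetry / ModelFailover | Retries or fails over at the level of a single model call |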
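    + +Each Middleware is implemented by injecting behavior at one of the hook points in the ReAct Loop. The figure below shows where each `ChatModelAgentMiddleware` hook sits in the loop: + + + +The hook points are summarized below: + + + + + + + + + + 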
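| Hook | Timing | Typical use |
| --- | --- | --- |
| `BeforeAgent` | Before the Agent runs (once only) | Enhance the Instruction, inject Tools |
| `BeforeModelRewriteState` | Before each model call | Modify Messages / ToolInfos |
| `AfterModelRewriteState` | After each model call | Modify the model response or patch state |
| `WrapModel` | Around a single model call | Retry, failover, rewrite the model output |
| `WrapToolCall` | Around a single tool call | Permissions, safety, output rewriting |
| `AfterAgent` | After the Agent finishes successfully | Post-processing, state cleanup |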
    + +完整 Middleware 速查见文末附录。 + +## 快速上手:创建并运行 ChatModelAgent + +`Runner` 是执行 Agent 的入口。它把一次用户请求转化为一次 Agent 运行,负责单次运行配置、事件流输出、流式开关,以及 checkpoint / resume 等运行期能力。最小用法是:把 `ChatModelAgent` 放进 `RunnerConfig`,然后调用 `Query` 或 `Run`。 + +以下示例展示了如何创建一个最简 ChatModelAgent,并通过 Runner 执行: + +```go +package main + +import ( + "context" + "fmt" + "log" + + "github.com/cloudwego/eino-ext/components/model/ark" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" +) + +func main() { + ctx := context.Background() + + // 1. 创建 ChatModel + chatModel, err := ark.NewChatModel(ctx, &ark.ChatModelConfig{ + Model: "doubao-seed-1-8-251228", + APIKey: "your_api_key", // 替换为你的 API Key + }) + if err != nil { + log.Fatal(err) + } + + // 2. 创建 ChatModelAgent + agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-assistant", + Description: "一个可以使用工具回答问题的助手。", + Instruction: "你是一个有帮助的助手。请根据可用工具回答用户问题。", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{ + // 注册你的工具,例如 webSearchTool + }, + }, + }, + // Handlers: []adk.ChatModelAgentMiddleware{...}, // 注册 Middleware + }) + if err != nil { + log.Fatal(err) + } + + // 3. 通过 Runner 执行 Agent + runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + }) + + // 4. 
发送用户请求并消费事件流 + iter := runner.Query(ctx, "帮我搜索一下今天的新闻") + for { + event, ok := iter.Next() + if !ok { + break + } + fmt.Println(event) + } +} +``` + +核心流程:`NewChatModelAgent` → `NewRunner` → `Runner.Query/Run` → 消费 `AsyncIterator` 事件流。 + +更多基础示例可参考:[Eino: 快速开始](/zh/docs/eino/quick_start)。 + +## 延伸阅读:DeepAgents + +DeepAgents 是一个预构建的 ChatModelAgent,核心价值在于两个预置 Middleware: + +- **WriteTodos(PlanTask)**:让主 Agent 在执行前显式规划任务列表,并在执行过程中持续追踪进度。复杂问题不再靠模型"一口气想完",而是先拆解、再逐步推进。 +- **TaskTool**:让主 Agent 把子任务委派给子 Agent 执行,子 Agent 独立完成后将结果汇总回主循环。这使得单个 Agent 的能力边界可以通过组合来扩展。 + +此外,DeepAgents 还预置了系统提示词和可选的 FileSystem Middleware,开箱即可处理需要任务规划和多 Agent 协作的场景。 + +``` +DeepAgents = ChatModelAgent + + WriteTodos(任务规划与追踪) + + TaskTool(子任务委派) + + 可选 FileSystem + + 预置系统提示词 +``` + +进一步阅读: + +- Eino ADK Deep Agents 完整指南:[Eino ADK: DeepAgents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) +- DeepAgents 示例:[eino-examples/adk/multiagent/deep at main · cloudwego/eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) + +## 延伸阅读:为什么不继续使用 flow/react? + +回到第一性原理:Graph 和 Agent 是两种本质不同的 AI 应用形态。 + +- **Graph** 的核心是**确定性**:开发者预定义拓扑结构,节点间的流转关系在编译期就已确定。输入是结构化的,输出是可预测的。 +- **Agent** 的核心是**自主性**:LLM 在运行时动态决定下一步行动,执行路径不可预知,输出是全过程事件流。 + +`flow/react` 本质上是用 Graph 的方式来"模拟" Agent——把 ReAct 的推理循环展开为静态的节点和边。这可行,但本质上是一种错位:用确定性编排来承载动态决策。当 Agent 的复杂度增长时,这个错位会产生系统性问题: + +1. **交付物不匹配**:Graph 面向"最终结果",而 Agent 的交付物是全过程(推理轨迹、中间工具调用、状态变化)。用 Graph 做 Agent 时,中间过程只能通过 Callback 等旁路抽取——可行,但属于补丁。 +2. **运行模式不匹配**:Graph 是同步执行模型,而 Agent 天然是异步的长时运行。事件流输出、checkpoint / resume、中断恢复等运行期能力,需要框架在 Agent 维度统一管理,而非散落在 Graph 节点的回调中。 +3. 
**扩展点不匹配**:Agent 的运行时增强(上下文压缩、工具动态加载、模型重试、安全控制)本质上是对决策循环的拦截和注入。在 Graph 中,这些能力没有统一的挂载点,只能散落在各个节点或边上;在 ChatModelAgent 中,它们有明确的生命周期钩子(Middleware)。 + +因此,flow/react 不是被废弃,而是回到它最匹配的位置:**确定性流程编排**。当核心问题是"自主决策 + 运行时增强"时,正确的抽象是 `ChatModelAgent + ChatModelAgentMiddleware`。 + +进一步阅读: + +- Agent 还是 Graph?AI 应用路线辨析:[Agent 还是 Graph?AI 应用路线辨析](/zh/docs/eino/overview/graph_or_agent) + + + +## 附录:Middleware 速查 + +### 实例一览 + + + + + + + + + + + + + + + + + +
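| Middleware | Description |
| --- | --- |
| Reduction | Truncates overly long tool output or writes it to the file system, to avoid exceeding token limits |
| Summarization | Compresses the message history by summarizing it |
| Skill | Exposes reusable instructions and knowledge as a Tool that the Agent loads on demand |
| FileSystem | File operation toolset: ls/read/write/edit/glob/grep/execute |
| ToolSearch | `tool_search` meta-tool that searches for and loads tools on demand (reduces the footprint of the always-resident tool list) |
| PatchToolCall | Patches dangling tool calls (tool calls with missing tool results) in the message history |
| SafeTool | Intercepts tool execution errors at the WrapToolCall level and returns them to the model as readable text, so the Agent can correct itself instead of aborting |
| ModelRetry | Retries according to a policy when a model call fails [built-in config] |
| ModelFailover | Switches to a backup model when a model call fails [built-in config] |
| AgentsMD | Injects the Agents.md knowledge file into the model context to improve context quality |
| PlanTask | Persistent task-management toolset (create/get/update/list) with dependency tracking |
| WriteTodos | Lightweight TODO-list tool that lets the Agent create and track structured to-do items [built into DeepAgent] |
| TaskTool | Sub-Agent delegation tool that lets the main Agent hand subtasks to sub-Agents for independent execution [built into DeepAgent] |
| Permission | Permission control for tool calls [WIP] |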
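    + +> Note: In code, ModelRetry / ModelFailover are built-in fields of `ChatModelAgentConfig` (`ModelRetryConfig` / `ModelFailoverConfig`) and conceptually correspond to the `WrapModel` hook. SafeTool is an example pattern (see ChatWithEino ch05) implemented as a user-defined Middleware. WriteTodos / TaskTool are built into DeepAgent and are not exported separately. Permission is a planned capability. + +### Categories + + + + + + + + + 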
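| Category | Problem it solves | Includes |
| --- | --- | --- |
| Extending general-purpose Tools | Gives the Agent more capabilities | FileSystem, Skill, ToolSearch, PlanTask, WriteTodos, TaskTool |
| Handling errors during the ReAct process | Improves reliability | ModelRetry, ModelFailover, SafeTool, PatchToolCall |
| Keeping the context window within its limit | Prevents token overflow | Reduction, Summarization, ToolSearch |
| Safety and permissions | Constrains Agent behavior | Permission |
| Improving context quality | Lets the model see better context | Skill, AgentsMD |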
    + +ToolSearch 跨两个类别:既是"扩展 Tool"(提供按需工具发现能力),也是"保证上下文窗口"(避免一次性加载过多工具描述)。 + +进一步阅读: + +- ChatModelAgent Middleware 详解:[Eino ADK: ChatModelAgentMiddleware](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) diff --git a/content/zh/docs/eino/overview/graph_or_agent.md b/content/zh/docs/eino/overview/graph_or_agent.md index 96d9c21c323..5aca3312163 100644 --- a/content/zh/docs/eino/overview/graph_or_agent.md +++ b/content/zh/docs/eino/overview/graph_or_agent.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Agent 还是 Graph?AI 应用路线辨析 diff --git a/content/zh/docs/eino/quick_start/_index.md b/content/zh/docs/eino/quick_start/_index.md index 9333db3afd1..a08ae5fe2fb 100644 --- a/content/zh/docs/eino/quick_start/_index.md +++ b/content/zh/docs/eino/quick_start/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 快速开始 @@ -69,7 +69,7 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . - + @@ -78,7 +78,8 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . - + +
 | Chapter | Topic | Entry |
-| Chapter 1 | ChatModel and Message (Console) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md |
+| Chapter 1 | ChatModel and AgenticMessage (Console) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md |
 | Chapter 2 | Agent and Runner (Console, multi-turn) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch02_chatmodel_agent_runner_console.md |
 | Chapter 3 | Memory and Session (persistent conversations) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch03_memory_session_jsonl.md |
 | Chapter 4 | Tool and file system access | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md |
 | Chapter 7 | Interrupt/Resume | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md |
 | Chapter 8 | Graph Tool (complex workflows) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md |
 | Chapter 9 | Skill (Console) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch09_skill.md |
-| Final chapter | A2UI (Web) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md |
+| Chapter 10 | A2UI (Web) | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md |
+| Chapter 11 | TurnLoop | https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch11_turnloop.md |
    ## 最终交付:一个可扩展的端到端 Agent 应用骨架 diff --git a/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md b/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md index 4dcb14ee4eb..d3e3bf27276 100644 --- a/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md +++ b/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] title: 第一章:ChatModel 与 Message(Console) @@ -56,15 +56,16 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开 - - - - - - - - - + + + + + + + + + +
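 | Chapter | Topic | Core content | Capability gained |
-| Chapter 1 | ChatModel and Message | Understand the Component abstraction and implement a single-turn conversation | Basic conversation |
-| Chapter 2 | Agent and Runner | Introduce the execution abstraction and implement multi-turn conversation | Session management |
-| Chapter 3 | Memory and Session | Persist conversation history and support session recovery | Persistence |
-| Chapter 4 | Tool and file system | Add file access to read source code | Tool calling |
-| Chapter 5 | Middleware | Middleware mechanism for handling cross-cutting concerns in one place | Better extensibility |
-| Chapter 6 | Callback | Callback mechanism for monitoring Agent execution | Observability |
-| Chapter 7 | Interrupt and Resume | Interrupt and resume, supporting long-running tasks | Better reliability |
-| Chapter 8 | Graph and Tool | Orchestrate complex workflows with Graph | Complex orchestration |
-| Chapter 9 | A2UI | Agent-to-UI integration | Production-grade application |
+| Chapter 1 | ChatModel and AgenticMessage | Understand the Component abstraction and implement a single-turn conversation | Basic conversation |
+| Chapter 2 | Agent and Runner | Introduce the execution abstraction and implement multi-turn conversation | Session management |
+| Chapter 3 | Memory and Session | Persist conversation history and support session recovery | Persistence |
+| Chapter 4 | Tool and file system | Add file access to read source code | Tool calling |
+| Chapter 5 | Middleware | Middleware mechanism for handling cross-cutting concerns in one place | Better extensibility |
+| Chapter 6 | Callback | Callback mechanism for monitoring Agent execution | Observability |
+| Chapter 7 | Interrupt and Resume | Interrupt and resume, supporting long-running tasks | Better reliability |
+| Chapter 8 | Graph and Tool | Orchestrate complex workflows with Graph | Complex orchestration |
+| Chapter 9 | Skill | Use the Skill middleware to load and reuse skill documents | Knowledge reuse |
+| Final chapter | A2UI | Agent-to-UI integration | Production-grade application |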
    **为什么这样设计?** @@ -77,7 +78,7 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开 --- -本章目标:理解 Eino 的 Component 抽象,用最小代码调用一次 ChatModel(支持流式输出),并掌握 `schema.Message` 的基本用法。 +本章目标:理解 Eino 的 Component 抽象,用最小代码调用一次 ChatModel(支持流式输出),并掌握如何用 `schema.AgenticMessage` 组织模型输入和流式输出。 ## 代码位置 @@ -88,11 +89,12 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开 Eino 定义了一组 Component 接口(`ChatModel`、`Tool`、`Retriever`、`Loader` 等),每个接口描述一类可替换的能力: ```go -type BaseChatModel interface { - Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error) - Stream(ctx context.Context, input []*schema.Message, opts ...Option) ( - *schema.StreamReader[*schema.Message], error) +type BaseModel[M any] interface { + Generate(ctx context.Context, input []M, opts ...Option) (M, error) + Stream(ctx context.Context, input []M, opts ...Option) (*schema.StreamReader[M], error) } + +type AgenticModel = BaseModel[*schema.AgenticMessage] ``` **接口带来的好处:** @@ -103,15 +105,29 @@ type BaseChatModel interface { 本章只涉及 `ChatModel`,后续章节会逐步引入 `Tool`、`Retriever` 等 Component。 -## schema.Message:对话的基本单位 +本示例代码默认使用 `model.AgenticModel`,也就是 `model.BaseModel[*schema.AgenticMessage]`。这样后续章节可以在同一套消息结构里表达文本、reasoning、工具调用、工具结果等内容。 + +## schema.AgenticMessage:对话的基本单位 -`Message` 是 Eino 里对话数据的基本结构: +`AgenticMessage` 是本 Quickstart 使用的对话数据结构: + +在一次模型调用中,模型可能会返回多个有序事件,例如先输出 `reasoning`,再调用 server tool,随后继续 `reasoning`,接着调用 function tool。`AgenticMessage` 会用 `ContentBlock` 按顺序保存这些结构化事件。 ```go -type Message struct { - Role RoleType // system / user / assistant / tool - Content string // 文本内容 - ToolCalls []ToolCall // 仅 assistant 消息可能有 +type AgenticMessage struct { + Role AgenticRoleType + ContentBlocks []*ContentBlock + ResponseMeta *AgenticResponseMeta + Extra map[string]any +} + +type ContentBlock struct { + Type ContentBlockType + Reasoning *Reasoning + UserInputText *UserInputText + AssistantGenText *AssistantGenText + FunctionToolCall *FunctionToolCall + FunctionToolResult *FunctionToolResult // ... 
} ``` @@ -119,18 +135,23 @@ type Message struct { 常用构造函数: ```go -schema.SystemMessage("You are a helpful assistant.") -schema.UserMessage("What is the weather today?") -schema.AssistantMessage("I don't know.", nil) // 第二个参数是 ToolCalls -schema.ToolMessage("tool result", "call_id") +schema.SystemAgenticMessage("You are a helpful assistant.") +schema.UserAgenticMessage("What is the weather today?") + +&schema.AgenticMessage{ + Role: schema.AgenticRoleTypeAssistant, + ContentBlocks: []*schema.ContentBlock{ + schema.NewContentBlock(&schema.AssistantGenText{Text: "I don't know."}), + }, +} ``` **角色语义:** -- `system`:系统指令,通常放在 messages 最前面 +- `system`:系统指令,通常放在消息列表最前面 - `user`:用户输入 - `assistant`:模型回复 -- `tool`:工具调用结果(后续章节涉及) +- 工具调用和工具结果通过 `function_tool_call` / `function_tool_result` content block 表达(后续章节涉及) ## 前置条件 @@ -181,42 +202,47 @@ go run ./cmd/ch01 -- "用一句话解释 Eino 的 Component 设计解决了什 按执行顺序: -1. **创建 ChatModel**:根据 `MODEL_TYPE` 环境变量选择 OpenAI 或 Ark 实现 -2. **构造输入 messages**:`SystemMessage(instruction)` + `UserMessage(query)` -3. **调用 Stream**:所有 ChatModel 实现都必须支持 `Stream()`,返回 `StreamReader[*Message]` +1. **创建 ChatModel**:根据 `MODEL_TYPE` 环境变量选择 OpenAI 或 Ark 的 agentic model +2. **构造输入 messages**:通过 `msgops.NewSystem[M]` / `msgops.NewUser[M]` 创建 `AgenticMessage` +3. **调用 Stream**:使用 `model.BaseModel[M].Stream()`,返回 `StreamReader[M]` 4. 
**打印结果**:迭代 `StreamReader` 逐帧打印 assistant 回复 -关键代码片段(**注意:这是简化后的代码片段,不能直接运行****,完整代码请参考** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): +关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): ```go -// 构造输入 -messages := []*schema.Message{ - schema.SystemMessage(instruction), - schema.UserMessage(query), -} - -// 调用 Stream(所有 ChatModel 都必须实现) -stream, err := cm.Stream(ctx, messages) -if err != nil { - log.Fatal(err) -} -defer stream.Close() +func runTyped[M adk.MessageType](ctx context.Context, instruction, query string) { + cm, err := chatmodel.NewModel[M](ctx) + if err != nil { + log.Fatal(err) + } -for { - chunk, err := stream.Recv() - if errors.Is(err, io.EOF) { - break + messages := []M{ + msgops.NewSystem[M](instruction), + msgops.NewUser[M](query), } + + stream, err := cm.Stream(ctx, messages) if err != nil { log.Fatal(err) } - fmt.Print(chunk.Content) + defer stream.Close() + + for { + frame, err := stream.Recv() + if errors.Is(err, io.EOF) { + break + } + if err != nil { + log.Fatal(err) + } + fmt.Print(msgops.AssistantDeltaText(frame)) + } } ``` ## 本章小结 - **Component 接口**:定义可替换、可组合、可测试的能力边界 -- **Message**:对话数据的基本单位,通过角色区分语义 +- **AgenticMessage**:对话数据的基本单位,通过角色和 content block 区分语义 - **ChatModel**:最基础的 Component,提供 `Generate` 和 `Stream` 两个核心方法 - **实现选择**:通过环境变量或配置切换 OpenAI/Ark 等不同实现,业务代码无需改动 diff --git a/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md b/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md index e7360cf64db..6cf34722257 100644 --- a/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md +++ b/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 
第二章:ChatModelAgent、Runner、AgentEvent(Console 多轮) @@ -113,12 +113,12 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ **ChatModel vs ChatModelAgent:本质区别** - + - - - - + + + +
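-| Dimension | ChatModel | ChatModelAgent |
+| Dimension | ChatModel | ChatModelAgent |
 | Positioning | Component | Agent |
-| Interface | `Generate() / Stream()` | `Run() -> AsyncIterator[*AgentEvent]` |
-| Output | Returns message content directly | Returns an event stream (messages, control actions, etc.) |
-| Capabilities | Plain model invocation | Extensible with tools, middleware, interrupt, etc. |
-| Use cases | Simple conversation scenarios | Complex agent applications |
+| Core interface | `Generate()` / `Stream()` | `Run() -> AsyncIterator[*AgentEvent]` |
+| Output form | Returns message content directly | Returns an event stream (messages, control actions, etc.) |
+| Core capabilities | Plain large language model invocation | Supports extension with tools, middleware, interrupt, etc. |
+| Use cases | Simple conversational interaction | Complex agent application development |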
    **为什么需要 ChatModelAgent?** @@ -163,12 +163,12 @@ type Runner struct { 1. **生命周期管理**:Runner 管理 Agent 的启动、恢复、中断等状态 2. **Checkpoint 支持**:配合 `CheckPointStore` 实现中断恢复(后续章节涉及) 3. **统一入口**:提供 `Run()` 和 `Query()` 等便捷方法 -4. **事件流封装**:将 Agent 的事件流转换为可消费的 `AsyncIterator[*AgentEvent]` +4. **事件流封装**:将 Agent 的事件流转换为可消费的 `AsyncIterator[*TypedAgentEvent[M]]` **使用方式:** ```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ Agent: agent, EnableStreaming: true, }) @@ -239,34 +239,45 @@ for { 没有 tools 时,`ChatModelAgent` 在一次 `Run()` 里只会完成一轮模型调用。多轮对话是通过调用侧维护 history 实现的: -1. 用 `history []*schema.Message` 保存累计对话 -2. 每次用户输入:把 `UserMessage` 追加到 history -3. 调用 `runner.Run(ctx, history)` 得到事件流,消费得到 assistant 文本 -4. 把本轮 assistant 文本追加回 history,进入下一轮 +1. 用 `history []M` 保存累计对话,本示例默认 `M` 为 `*schema.AgenticMessage` +2. 每次用户输入:通过 `msgops.NewUser[M]` 追加到 history +3. 调用 `runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history))` 得到事件流,消费得到 assistant 文本 +4. 通过 `msgops.NewAssistant[M]` 把本轮 assistant 文本追加回 history,进入下一轮 **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go)): ```go -history := make([]*schema.Message, 0, 16) +func runTyped[M adk.MessageType](ctx context.Context, instruction string) { + agent, err := adk.NewTypedChatModelAgent[M](ctx, &adk.TypedChatModelAgentConfig[M]{ + Name: "Ch02Agent", + Instruction: instruction, + Model: cm, + }) + if err != nil { + log.Fatal(err) + } -for { - // 1. 
读取用户输入 - line := readUserInput() - if line == "" { - break + runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + }) + + history := make([]M, 0, 16) + + for { + line := readUserInput() + if line == "" { + break + } + + history = append(history, msgops.NewUser[M](line)) + events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) + result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) + if err != nil { + log.Fatal(err) + } + history = append(history, msgops.NewAssistant[M](result.AssistantText, nil)) } - - // 2. 追加用户消息到 history - history = append(history, schema.UserMessage(line)) - - // 3. 调用 Runner 执行 Agent - events := runner.Run(ctx, history) - - // 4. 消费事件流,收集 assistant 回复 - content := collectAssistantFromEvents(events) - - // 5. 追加 assistant 消息到 history - history = append(history, schema.AssistantMessage(content, nil)) } ``` diff --git a/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md b/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md index 0eba8441743..9968f9d5fab 100644 --- a/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md +++ b/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第三章:Memory 与 Session(持久化对话) @@ -10,11 +10,12 @@ weight: 3 本章目标:实现对话历史的持久化存储,支持跨进程恢复会话。 > **⚠️ 重要说明:业务层概念 vs 框架概念** - +> > 本章介绍的 **Memory、Session、Store 是业务层概念**,**不是 Eino 框架的核心组件**。 - > - +> - **Eino 框架层面**:提供 `adk.Runner`、`adk.NewTypedRunner[M]`、`schema.AgenticMessage` 等基础抽象,框架本身不关心对话历史的存储方式 +> - **业务层层面**:Memory/Session/Store 是本示例项目为了实现持久化对话而设计的业务逻辑,通过组装给 `adk.Runner` 的输入来与 Eino 框架交互 +> > 换句话说,Eino 框架只负责"如何处理消息",而"如何存储消息"完全由业务层决定。本章提供的实现只是一个简单的参考示例,你可以根据自己的业务需求选择完全不同的存储方案(数据库、Redis、云存储等)。 ## 代码位置 @@ -87,7 +88,7 @@ type Session struct { ID string CreatedAt time.Time - messages []*schema.Message // 对话历史 + messages []M // 对话历史,示例默认 M 为 
*schema.AgenticMessage // ... } ``` @@ -120,13 +121,15 @@ type Store struct { 每个 Session 存储为一个 `.jsonl` 文件: ``` -{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z"} -{"role":"user","content":"你好,我是谁?"} -{"role":"assistant","content":"你好!我暂时不知道你是谁..."} -{"role":"user","content":"我叫张三"} -{"role":"assistant","content":"好的,张三,很高兴认识你!"} +{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z","message_kind":"agentic"} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"你好,我是谁?"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"你好!我暂时不知道你是谁..."}}]} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"我叫张三"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"好的,张三,很高兴认识你!"}}]} ``` +会话默认保存在 `./data/sessions_agentic`;如果需要放到其他目录,可以设置 `SESSION_DIR_AGENTIC`。 + **为什么用 JSONL?** - **简单**:每行一个 JSON 对象,易于读写 @@ -141,7 +144,7 @@ type Store struct { ### 1. 创建 Store ```go -sessionDir := "./data/sessions" +sessionDir := "./data/sessions_agentic" store, err := mem.NewStore(sessionDir) if err != nil { log.Fatal(err) @@ -161,7 +164,7 @@ if err != nil { ### 3. 追加用户消息 ```go -userMsg := schema.UserMessage("你好") +userMsg := msgops.NewUser[M]("你好") if err := session.Append(userMsg); err != nil { log.Fatal(err) } @@ -171,14 +174,17 @@ if err := session.Append(userMsg); err != nil { ```go history := session.GetMessages() -events := runner.Run(ctx, history) -content := collectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} ``` ### 5. 
追加助手消息 ```go -assistantMsg := schema.AssistantMessage(content, nil) +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) if err := session.Append(assistantMsg); err != nil { log.Fatal(err) } @@ -187,6 +193,11 @@ if err := session.Append(assistantMsg); err != nil { **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go)): ```go +store, err := mem.NewStore[M](msgops.DefaultSessionDir(msgops.KindOf[M]())) +if err != nil { + log.Fatal(err) +} + // 创建或恢复 Session session, err := store.GetOrCreate(sessionID) if err != nil { @@ -194,18 +205,21 @@ if err != nil { } // 用户输入 -userMsg := schema.UserMessage(line) +userMsg := msgops.NewUser[M](line) if err := session.Append(userMsg); err != nil { log.Fatal(err) } // 调用 Agent history := session.GetMessages() -events := runner.Run(ctx, history) -content := collectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} // 保存助手回复 -assistantMsg := schema.AssistantMessage(content, nil) +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) if err := session.Append(assistantMsg); err != nil { log.Fatal(err) } @@ -217,7 +231,7 @@ if err := session.Append(assistantMsg); err != nil { - **Session 是业务层概念**:由业务代码实现和管理,负责存储和加载对话历史 - **Agent(Runner)是框架层概念**:由 Eino 框架提供,负责处理消息并生成回复 -- **两者的交互点**:业务层通过 `session.GetMessages()` 获取消息列表,传递给 `runner.Run(ctx, history)` 进行处理 +- **两者的交互点**:业务层通过 `session.GetMessages()` 获取消息列表,再通过 `msgops.NormalizeMessagesForModelInput(history)` 生成模型输入,最后传递给 `runner.Run(ctx, messages)` 进行处理 **架构分层:** @@ -281,7 +295,7 @@ if err := session.Append(assistantMsg); err != nil { **框架层 vs 业务层:** -- **Eino 框架层**:提供 `adk.Runner`、`schema.Message` 等基础抽象,不关心消息如何存储 +- **Eino 框架层**:提供 `adk.Runner`、typed runner、`schema.AgenticMessage` 等基础抽象,不关心消息如何存储 - 
**业务层(本章实现)**:Memory/Session/Store 是业务层概念,用于管理对话历史的存储 **业务层概念:** @@ -294,7 +308,7 @@ if err := session.Append(assistantMsg); err != nil { **业务层与框架层的交互:** - 业务层负责存储消息,通过 `session.GetMessages()` 获取消息列表 -- 将消息列表传递给框架层的 `runner.Run(ctx, history)` 进行处理 +- 将消息列表规整为模型输入后,传递给框架层的 `runner.Run(ctx, messages)` 进行处理 - 收集框架层返回的回复,再由业务层保存到存储中 > **💡 提示**:本章的实现只是众多存储方案中的一种简单示例。在实际项目中,你可以根据业务需求选择数据库、Redis、云存储等方案,甚至可以实现更复杂的功能如会话过期清理、搜索、分享等。 diff --git a/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md b/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md index 9f0e38919c1..5db71a20e09 100644 --- a/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md +++ b/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第四章:Tool 与文件系统访问 @@ -135,7 +135,7 @@ backend, err := localbk.NewBackend(ctx, &localbk.Config{}) 添加自定义 Tool✅ 手动注册每个 Tool✅ 手动注册或自动注册 文件系统访问(Backend)❌ 需手动创建并注册所有文件工具✅ 一级配置,自动注册 命令执行(StreamingShell)❌ 需手动创建✅ 一级配置,自动注册 -内置任务管理❌✅
    write_todos
    工具 +内置任务管理❌✅ write_todos 工具 支持子 Agent❌✅ @@ -170,7 +170,7 @@ agent, err := deep.New(ctx, &deep.Config{ Name: "Ch04ToolAgent", Description: "ChatWithDoc agent with filesystem access via LocalBackend.", ChatModel: cm, - Instruction: instruction, + Instruction: agentInstruction, Backend: backend, // 提供文件系统操作能力 StreamingShell: backend, // 提供命令执行能力 MaxIteration: 50, @@ -214,7 +214,7 @@ ls $PROJECT_ROOT go run ./cmd/ch04 ``` -**PROJECT_ROOT 说明:** +**PROJECT_ROOT**** 说明:** - **不设置时**:`PROJECT_ROOT` 默认为当前工作目录(`chatwitheino` 所在目录),Agent 只能访问本示例项目的文件。这对于快速试验已足够。 - **设置后**:指向 Eino 核心库根目录,Agent 可以检索 Eino 框架的完整代码库(核心库、扩展库、示例库)。这是 ChatWithEino 的完整使用场景。 diff --git a/content/zh/docs/eino/quick_start/chapter_05_middleware.md b/content/zh/docs/eino/quick_start/chapter_05_middleware.md index 0bae0820e06..2dac5a0fa76 100644 --- a/content/zh/docs/eino/quick_start/chapter_05_middleware.md +++ b/content/zh/docs/eino/quick_start/chapter_05_middleware.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第五章:Middleware(中间件模式) @@ -182,6 +182,9 @@ func (m *safeToolMiddleware) WrapInvokableToolCall( return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { result, err := endpoint(ctx, args, opts...) if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } // 将错误转换为字符串,而不是返回错误 return fmt.Sprintf("[tool error] %v", err), nil } @@ -305,7 +308,8 @@ agent, err := deep.New(ctx, &deep.Config{ MaxRetries: 5, IsRetryAble: func(_ context.Context, err error) bool { return strings.Contains(err.Error(), "429") || - strings.Contains(err.Error(), "Too Many Requests") + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") }, }, }) @@ -409,9 +413,9 @@ agent, _ := deep.New(ctx, &deep.Config{ - - - + + +
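 | Middleware | Description |
-| reduction | Tool output reduction: when a tool returns overly long content, it is automatically truncated and offloaded to the file system to prevent context overflow |
-| summarization | Automatic conversation-history summarization: when the token count exceeds a threshold, a summary is generated automatically to compress the history |
-| skill | Skill-loading middleware that lets the Agent dynamically load and execute predefined skills |
+| reduction | Tool output reduction: when a tool returns overly long content, it is automatically truncated and offloaded to the file system to prevent context overflow |
+| summarization | Automatic conversation-history summarization: when the token count exceeds a threshold, a summary is generated automatically to compress the history |
+| skill | Skill-loading middleware that lets the Agent dynamically load and execute predefined skills |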
    **Middleware 链示例:** diff --git a/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md b/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md index 0ff7dfadc2d..236bac07914 100644 --- a/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md +++ b/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第六章:Callback 与 Trace(可观测性) @@ -159,12 +159,12 @@ callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) Callback 在组件生命周期的 5 个关键时机触发。下表中 `Timing*` 是 Eino 内部常量名(用于 `TimingChecker` 接口),对应的 Handler 接口方法是右侧所示: - - - - - - + + + + + +
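-| Timing constant | Handler method | Trigger point | Input/Output |
-| `TimingOnStart` | `OnStart` | Before the component starts processing | CallbackInput |
-| `TimingOnEnd` | `OnEnd` | After the component returns successfully | CallbackOutput |
-| `TimingOnError` | `OnError` | When the component returns an error | error |
-| `TimingOnStartWithStreamInput` | `OnStartWithStreamInput` | When the component receives streaming input | StreamReader[CallbackInput] |
-| `TimingOnEndWithStreamOutput` | `OnEndWithStreamOutput` | When the component returns streaming output | StreamReader[CallbackOutput] |
+| Timing constant | Handler method | Trigger point | Input / Output |
+| TimingOnStart | OnStart | Before the component starts processing | CallbackInput |
+| TimingOnEnd | OnEnd | After the component returns successfully | CallbackOutput |
+| TimingOnError | OnError | When the component returns an error | error |
+| TimingOnStartWithStreamInput | OnStartWithStreamInput | When the component receives streaming input | StreamReader[CallbackInput] |
+| TimingOnEndWithStreamOutput | OnEndWithStreamOutput | When the component returns streaming output | StreamReader[CallbackOutput] |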
    **示例:ChatModel 调用流程** @@ -251,48 +251,26 @@ callbacks.AppendGlobalHandlers(handler) ### 2. 集成 CozeLoop ```go -func setupCozeLoop(ctx context.Context) (*cozeloop.Client, error) { - apiToken := os.Getenv("COZELOOP_API_TOKEN") - workspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") - - if apiToken == "" || workspaceID == "" { - return nil, nil // 未配置则跳过 - } - +// Setup CozeLoop tracing (optional) +// Set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { client, err := cozeloop.NewClient( - cozeloop.WithAPIToken(apiToken), - cozeloop.WithWorkspaceID(workspaceID), + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), ) if err != nil { - return nil, err + log.Fatalf("cozeloop.NewClient failed: %v", err) } - - // 注册为全局 Callback + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) - - return client, nil -} -``` - -### 3. 在 main 中使用 - -```go -func main() { - ctx := context.Background() - - // 设置 CozeLoop(可选) - client, err := setupCozeLoop(ctx) - if err != nil { - log.Printf("cozeloop setup failed: %v", err) - } - if client != nil { - defer func() { - time.Sleep(5 * time.Second) // 等待数据上报 - client.Close(ctx) - }() - } - - // 创建 Agent 并运行... 
+ log.Println("CozeLoop tracing enabled") +} else { + log.Println("CozeLoop tracing disabled (set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable)") } ``` diff --git a/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md b/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md index 87b6bd4b55d..74c12919b19 100644 --- a/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md +++ b/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第七章:Interrupt/Resume(中断与恢复) @@ -173,11 +173,15 @@ func (m *approvalMiddleware) WrapInvokableToolCall( return fmt.Sprintf("tool '%s' disapproved", tCtx.Name), nil } - // 重新中断 - return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ - ToolName: tCtx.Name, - ArgumentsInJSON: storedArgs, - }, storedArgs) + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) }, nil } @@ -248,7 +252,7 @@ type CheckPointStore interface { ### 1. 配置 Runner 使用 CheckPointStore ```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ Agent: agent, EnableStreaming: true, CheckPointStore: adkstore.NewInMemoryStore(), // 内存存储 @@ -258,11 +262,11 @@ runner := adk.NewRunner(ctx, adk.RunnerConfig{ ### 2. 配置 Agent 使用 ApprovalMiddleware ```go -agent, err := deep.New(ctx, &deep.Config{ +agent, err := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ // ... 
其他配置 - Handlers: []adk.ChatModelAgentMiddleware{ - &approvalMiddleware{}, // 添加审批中间件 - &safeToolMiddleware{}, // 将 Tool 错误转换为字符串(中断类错误会继续向上抛出) + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ + newApprovalMiddleware[M](), // 添加审批中间件 + newSafeToolMiddleware[M](), // 将 Tool 错误转换为字符串(中断类错误会继续向上抛出) }, }) ``` @@ -272,22 +276,27 @@ agent, err := deep.New(ctx, &deep.Config{ ```go checkPointID := sessionID -events := runner.Run(ctx, history, adk.WithCheckPointID(checkPointID)) -content, interruptInfo, err := printAndCollectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history), adk.WithCheckPointID(checkPointID)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{ + ShowToolCalls: true, + ShowToolResults: true, + CaptureInterrupt: true, +}) if err != nil { return err } -if interruptInfo != nil { +assistantText := result.AssistantText +if result.InterruptInfo != nil { // 注意:建议使用同一个 stdin reader 同时读取「用户输入」与「审批 y/n」 // 避免审批输入被当成下一轮 you> 的消息 - content, err = handleInterrupt(ctx, runner, checkPointID, interruptInfo, reader) + assistantText, err = handleInterrupt[M](ctx, runner, checkPointID, result.InterruptInfo, reader) if err != nil { return err } } -_ = session.Append(schema.AssistantMessage(content, nil)) +_ = session.Append(msgops.NewAssistant[M](assistantText, nil)) ``` ## Interrupt/Resume 执行流程 diff --git a/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md b/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md index c7562e9d062..2126cb87769 100644 --- a/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md +++ b/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第八章:Graph Tool(复杂工作流) @@ -51,7 +51,7 @@ you> 请帮我分析 RFC6455 文档中关于 WebSocket 握手的部分 **重要说明:本章只是展示 compose/graph/workflow 能力的一角。** -从更大的视角看,Eino 的 `compose` 包提供了非常通用、确定性的编排能力:你可以把任何需要"确定性业务流程"的系统,用 `compose` 的 
Graph/Chain/Workflow 组织成可执行的流水线,并且它能够**原生编排 Eino 的所有 component**(如 ChatModel、Prompt、Tools、Retriever、Embedding、Indexer 等),同时具备完整的 **callback** 体系,以及 **interrupt/resume + checkpoint** 支持。 +从更大的视角看,Eino 的 `compose` 包提供了非常通用、确定性的编排能力:你可以把任何需要“确定性业务流程”的系统,用 `compose` 的 Graph/Chain/Workflow 组织成可执行的流水线,并且它能够**原生编排 Eino 的所有 component**(如 ChatModel、Prompt、Tools、Retriever、Embedding、Indexer 等),同时具备完整的 **callback** 体系,以及 **interrupt/resume + checkpoint** 支持。 **Graph Tool 的定位:** @@ -135,8 +135,8 @@ wf.AddLambdaNode("answer", answerFunc). ```go type Input struct { - FilePath string `json:"file_path" jsonschema:"description=Absolute path to the document"` - Question string `json:"question" jsonschema:"description=The question to answer"` + FilePath string `json:"file_path" jsonschema:"description=Absolute path to the uploaded document file"` + Question string `json:"question" jsonschema:"description=The question to answer from the document"` } type Output struct { @@ -192,23 +192,35 @@ func buildWorkflow(cm model.BaseChatModel) *compose.Workflow[Input, Output] { AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) - // filter: 筛选 top-k + // filter: sort descending by score, keep up to top-3 chunks with score ≥ 3. 
wf.AddLambdaNode("filter", compose.InvokableLambda( func(ctx context.Context, scored []scoredChunk) ([]scoredChunk, error) { sort.Slice(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score }) - // 返回 top-3 - if len(scored) > 3 { - scored = scored[:3] + const maxK = 3 + var top []scoredChunk + for _, c := range scored { + if c.Score < 3 { + break + } + top = append(top, c) + if len(top) == maxK { + break + } } - return scored, nil + return top, nil }, )).AddInput("score") - // answer: 生成答案 + // answer: synthesize a response from top-k chunks, or return a not-found message if empty. wf.AddLambdaNode("answer", compose.InvokableLambda( func(ctx context.Context, in synthIn) (Output, error) { + if len(in.TopK) == 0 { + return Output{ + Answer: fmt.Sprintf("No relevant content found in the document for: %q", in.Question), + }, nil + } return synthesize(ctx, cm, in) }, )). @@ -229,7 +241,9 @@ func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, erro return graphtool.NewInvokableGraphTool[Input, Output]( wf, "answer_from_document", - "Search a large document for relevant content and synthesize an answer.", + "Search a large uploaded document for content relevant to a question and synthesize a "+ + "cited answer from the most relevant passages. "+ + "Use this instead of read_file when the document may be too large to fit in context.", ) } ``` @@ -237,6 +251,7 @@ func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, erro **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go)): ```go +func BuildTool[M adk.MessageType](ctx context.Context, cm model.BaseModel[M]) (tool.BaseTool, error) { // 构建工作流 wf := compose.NewWorkflow[Input, Output]() @@ -249,6 +264,7 @@ wf.AddLambdaNode("score", scoreFunc). 
// 封装为 Tool return graphtool.NewInvokableGraphTool[Input, Output](wf, "answer_from_document", "...") +} ``` ## Graph Tool 执行流程 diff --git a/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md b/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md index b6ff586670f..346f5a5f8a5 100644 --- a/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md +++ b/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第十章:A2UI 协议(流式 UI 组件) @@ -23,9 +23,7 @@ Eino 更关注“可组合的智能执行与编排能力”,至于“如何呈 ## 代码位置 -- 入口代码:[main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) -- Agent 构建:[agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) -- 服务端路由:[server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) +- 入口代码(Runner 版):[cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go) - A2UI 子集实现:[a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go) - A2UI 事件流转换:[a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go) - 前端页面:[static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html) @@ -39,7 +37,7 @@ Eino 更关注“可组合的智能执行与编排能力”,至于“如何呈 在 `quickstart/chatwitheino` 目录下执行: ```bash -go run . +go run ./cmd/ch10/ ``` 输出示例: @@ -54,9 +52,11 @@ starting server on http://localhost:8080 ```bash go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/eino-ext -clean -EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . 
+EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run ./cmd/ch10/ ``` +会话默认保存在 `./data/sessions_agentic`。 + ## 从文本到 UI:为什么需要 A2UI 前八章我们实现的 Agent 只输出文本,但现代 AI 应用需要更丰富的交互。 @@ -91,7 +91,7 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . 每一行 SSE(`data: {...}`)承载一个 A2UI Message,Message 是一个“信封结构”,每次只会出现一个字段: -**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 a2ui/types.go):** +**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****a2ui/types.go****):** ```go type Message struct { @@ -122,13 +122,13 @@ type Message struct { 最终 Web 版的核心链路是: -- 后端运行 Agent,得到 `*adk.AsyncIterator[*adk.AgentEvent]` +- 后端运行 Agent,得到 `*adk.AsyncIterator[*adk.TypedAgentEvent[M]]` - 把事件流转换为 A2UI JSONL/SSE 流输出给浏览器(见 [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go)) - 前端解析 SSE 的 `data:` 行并渲染组件树(见 [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)) ### 服务端路由(高层) -与 A2UI 相关的关键接口(见 [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)): +与 A2UI 相关的关键接口(见 [cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go)): - `GET /`:返回前端页面 `static/index.html` - `POST /sessions/:id/chat`:返回 SSE 流(A2UI messages),把 Agent 运行结果边跑边渲染到 UI @@ -137,7 +137,7 @@ type Message struct { ### 事件流转换(高层) -服务端把 `Runner.Run(...)` 的事件流交给 `a2ui.StreamToWriter(...)`,后者负责: +服务端把 `Runner.Run(...)` 的事件流交给 `a2ui.StreamToWriter[M](...)`,后者负责: - 对 user/assistant/tool 的输出做拆分 - 把 tool call / tool result 渲染成 “chip 卡片” @@ -148,7 +148,7 @@ type Message struct { - 前端通过 `fetch('/sessions/:id/chat')` 发起请求,然后从 `res.body` 读取流式字节,按行切分并解析 `data: {...}` 的 JSON(见 [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html))。 -**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 static/index.html):** +**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****static/index.html****):** ```javascript const res 
= await fetch(`/sessions/${id}/chat`, { @@ -220,33 +220,8 @@ while (true) { - **流式输出**:后端以 SSE 推送 A2UI JSONL,前端增量渲染组件树 - **事件到 UI**:把 `AgentEvent` 转为 `tool call / tool result / assistant stream` 的可视化输出 -## 系列收尾:这个 Quickstart Agent 的完整愿景 - -到本章为止,我们用一个可以实际运行的 Agent 串起了 Eino 的核心能力。你可以把它理解为一个可扩展的“端到端 Agent 应用骨架”: - -- 运行时:Runner 驱动执行,支持流式输出与事件模型 -- 工具层:Filesystem / Shell 等 Tool 能力接入,工具错误可被安全处理 -- 中间件:可插拔的 middleware/handler,用于错误处理、重试、审批等横切能力 -- 可观测:callbacks/trace 能力把关键链路打通,便于调试与线上观测 -- 人机协作:interrupt/resume + checkpoint 支持审批、补参、分支选择等交互式流程 -- 确定性编排:compose(graph/chain/workflow)把复杂业务流程组织为可维护、可复用的执行图 -- 业务交付:像 A2UI 这样的 UI 集成,属于业务层自由选择的一环,用来把 Agent 能力以合适的产品形态呈现给用户 - -你可以在这个骨架上逐步替换/扩展任意环节:模型、工具、存储、工作流、前端渲染协议,而不需要推倒重来。 - -## 扩展思考 - -**其他组件类型:** - -- 图表组件(折线图、柱状图、饼图) -- 地图组件 -- 时间线组件 -- 树形组件 -- 标签页组件 +## 下一步 -**高级功能:** +本章的 `cmd/ch10` 使用 `adk.Runner` 实现了完整的 Web 应用。但 Runner 是"一次性"模型——如果用户在 Agent 回答到一半时发出新问题,Runner 没有内置机制来取消当前执行并切换到新输入。 -- 组件交互(点击、拖拽、输入) -- 条件渲染 -- 组件动画 -- 响应式布局 +下一章将引入 `adk.TurnLoop`,为 Agent 增加 **抢占(Preempt)** 和 **中止(Abort)** 能力。 diff --git a/content/zh/docs/eino/quick_start/chapter_09_skill_console.md b/content/zh/docs/eino/quick_start/chapter_09_skill_console.md index e66120d0562..b26b92b74f7 100644 --- a/content/zh/docs/eino/quick_start/chapter_09_skill_console.md +++ b/content/zh/docs/eino/quick_start/chapter_09_skill_console.md @@ -1,13 +1,13 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] title: 第九章:Skill(Console) weight: 9 --- -本章目标:在第八章(RAG + Interrupt/Resume + Checkpoint)基础上,引入 `skill` 中间件,让 Agent 可以发现并加载一组可复用的技能文档(`SKILL.md`),并在需要时通过工具调用使用它们。 +本章目标:在第八章(RAG + Interrupt/Resume + Checkpoint)基础上,引入 `skill` 技能包,采用 `skill middleware` 注入和管理 skills,让 Agent 可以发现并加载一组可复用的技能文档(`SKILL.md`),并在需要时通过工具调用使用它们。 ## 代码位置 @@ -17,21 +17,16 @@ weight: 9 ## 前置条件 - 与第一章一致:需要配置一个可用的 ChatModel(OpenAI 或 Ark) -- 准备好 `eino-ext` PR 提供的 skills(`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) +- 准备好 `eino-ext` PR 提供的 skills 
文档(`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) -为什么是这四个? +`skill middleware` 支持各种 skills 的接入。本章仅以 eino 相关的四个 skills 作为示例,演示如何使用 `skill middleware` 接入 skills。为什么是这四个? -ChatWithEino 的定位是“帮用户学习 Eino 框架、并尝试用 AI 辅助写 Eino 代码”。这四个 skills 正好覆盖了这个目标所需的关键知识面: - -- `eino-guide`:学习入口与导航(从哪里开始、怎么快速跑起来) -- `eino-component`:Component 接口与各类实现参考(Model/Embedding/Retriever/Tool/Callback 等) -- `eino-compose`:编排与确定性工作流参考(Graph/Chain/Workflow 等) -- `eino-agent`:ADK/Agent 相关参考(Agent、Runner、Middleware、Filesystem、Human-in-the-loop 等) +ChatWithEino 的定位是“帮用户学习 Eino 框架、并尝试用 AI 辅助写 Eino 代码”。这四个 skills 文档正好覆盖了这个目标所需的关键知识面。 skills 的来源可以是: - `eino-ext` 仓库本地路径(脚本会自动读取 `/skills/...`) -- 或你已安装 skills 的目录(目录下能看到上述四个子目录) +- 或你已安装 skills 的目录(目录下能看到上述四个子目录)∑ ## 从 Graph Tool 到 Skill:为什么需要“技能文档” @@ -42,6 +37,8 @@ skills 的来源可以是: - **Tool** 更像“动作/能力”:读文件、跑 workflow、调用外部系统 - **Skill** 更像“可复用的知识/指令包”:用一组 markdown(`SKILL.md` + `reference/*.md`)描述“如何做某类事” +而 `Skill middleware` 就是负责把 skills 接入 agent。注册 skill middleware 后,Agent 才能通过 `skill` 工具按需读取某个 Skill。 + 简单类比: - **Tool** = “能做什么”(函数/接口) @@ -53,7 +50,7 @@ skills 的来源可以是: ### 1) 同步 eino-ext skills 到本地目录 -为了让 `skill` 中间件可以“发现”这些 skills,需要把它们放到一个统一目录下,并满足扫描约定: +为了让 `skill` middleware 可以“发现”这些 skills,需要把它们放到一个统一目录下,并满足扫描约定: - `EINO_EXT_SKILLS_DIR//SKILL.md` @@ -73,7 +70,8 @@ go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/e ### 2) 启动 Chapter 9 ```bash -EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext go run ./cmd/ch09 +export EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext +go run ./cmd/ch09 ``` 输出示例(节选): @@ -85,11 +83,11 @@ Enter your message (empty line to exit): ## 在 DeepAgent 中启用 Skill -本章的 “Skill 可被调用” 不是自动发生的,你需要在 Agent 构建时把 `skill` 中间件注册进去。核心就是三步: +本章的 “Skill 可被调用” 不是自动发生的,你需要在 Agent 构建时把 `Skill middleware` 注册进去。核心就是三步: 1. 用本地 filesystem backend(本章用 `eino-ext/adk/backend/local`)提供文件读取/Glob 能力 2. 
用 `skill.NewBackendFromFilesystem` 把 `EINO_EXT_SKILLS_DIR` 变成一个 Skill Backend -3. 用 `skill.NewMiddleware` 生成中间件,并把它塞进 DeepAgent 的 `Handlers` +3. 用 `skill.NewTyped[M]` 生成泛型 `Skill middleware`,并把它塞进 DeepAgent 的 `Handlers` **关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****cmd/ch09/main.go****):** @@ -100,15 +98,15 @@ skillBackend, _ := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesys Backend: backend, BaseDir: skillsDir, // = $EINO_EXT_SKILLS_DIR }) -skillMiddleware, _ := skill.NewMiddleware(ctx, &skill.Config{ +skillMiddleware, _ := skill.NewTyped[M](ctx, &skill.TypedConfig[M]{ Backend: skillBackend, }) -agent, _ := deep.New(ctx, &deep.Config{ +agent, _ := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ ChatModel: cm, Backend: backend, StreamingShell: backend, - Handlers: []adk.ChatModelAgentMiddleware{ + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ skillMiddleware, // ... 其他中间件,比如 approval/safeTool/retry 等 }, @@ -138,5 +136,5 @@ Use the skill tool with skill="eino-guide" and tell me what the entry point is f - 当模型调用 skill 工具时,控制台会打印: - `[tool call] ...` - `[tool result] ...`(对结果做了截断展示) -- 会话保存在 `SESSION_DIR`(默认 `./data/sessions`),支持恢复: +- 会话默认保存在 `./data/sessions_agentic`,支持恢复: - `go run ./cmd/ch09 --session ` diff --git a/content/zh/docs/eino/quick_start/chapter_11_turnloop.md b/content/zh/docs/eino/quick_start/chapter_11_turnloop.md new file mode 100644 index 00000000000..f59f6083833 --- /dev/null +++ b/content/zh/docs/eino/quick_start/chapter_11_turnloop.md @@ -0,0 +1,247 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: 第十一章:TurnLoop — 抢占、中止与多轮生命周期 +weight: 11 +--- + +上一章我们用 `adk.Runner` 实现了完整的 A2UI Web 应用。它能正常工作,但试试这个场景: + +> 你问 Agent 一个复杂问题,它开始调用工具、生成长回答……但你忽然意识到问错了,想换一个问题。 + +在上一章的 Runner 模式下,你只能等它说完,或者刷新页面丢弃一切。 + +本章引入 `adk.TurnLoop`,让 Agent 支持两个用户侧可感知的新能力:**抢占**和**中止**。 + +## 前置条件 + +与第一章一致:需要配置一个可用的 ChatModel(OpenAI 或 Ark),详见第一章的"前置条件"部分。 + +## 运行 & 体验 + +在 `quickstart/chatwitheino` 目录下执行: + +```bash +go run . 
+``` + +打开浏览器访问 `http://localhost:8080`,然后试试以下操作: + +### 体验抢占(Preempt) + +1. 发送一个会触发长回答的问题,例如"详细解释一下 Eino 的所有组件" +2. **在 Agent 还在回答时**,直接发送一条新消息,例如"算了,就告诉我 ChatModel 是什么" +3. 观察:旧回答立即停止,Agent 开始回答新问题 + +### 体验中止(Abort) + +1. 发送一个问题 +2. **在 Agent 回答过程中**,点击右上角的 **Abort 按钮** +3. 观察:Agent 立即停止,不再继续输出 + +这两个能力在上一章的 Runner 版本中都不存在。以下解释它们是如何实现的。 + +## 代码位置 + +- 入口代码:[main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) +- Agent 构建:[agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) +- TurnLoop 服务端:[server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## 为什么 Runner 做不到 + +上一章的 `cmd/ch10` 中,每个 `/sessions/:id/chat` 请求调用一次 `runner.Run(ctx, messages)`。Runner 是**单轮(single-turn)**模型——调用一次、执行一次、结束。如果用户在 Agent 执行过程中又发了一条消息,Runner 没有"正在运行的循环"可以接收它。 + +TurnLoop 则是一个**持久运行的多轮(multi-turn)执行循环**。它在轮次之间保持 idle 等待,随时可以通过 `Push()` 接收新输入并立即响应。正是因为有一个持续运行的循环,抢占和中止才成为可能——你可以打断一个正在进行的轮次,或者直接停止整个循环。 + + + + + + + + + +
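+| Capability | Ch10 (Runner, single-turn) | Ch11 (TurnLoop, multi-turn) |
+| --- | --- | --- |
+| Streaming output | ✅ | ✅ |
+| Approval / interrupt | ✅ | ✅ |
+| Persistent across turns, responds to new input in real time | ❌ each `Run()` is independent | ✅ `Push()` accepts input at any time |
+| Preempt an in-progress answer | ❌ | ✅ `Push(item, WithPreempt(...))` |
+| Abort the Agent | ❌ | ✅ `loop.Stop(WithImmediate())` |
+| Flexible per-turn input construction | ❌ assembled manually in application code | ✅ `GenInput` callback |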
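The "persistent across turns" row can be sketched with plain Go channels. This is only an illustration of a long-lived, push-driven loop that idles between turns; `turnLoop`, `newTurnLoop` and `runSession` are hypothetical names and do not reproduce the `adk.TurnLoop` API.

```go
package main

import "fmt"

// turnLoop is a hypothetical stand-in for a long-lived, push-driven loop.
type turnLoop struct {
	queue chan string   // pending items; the loop blocks (idles) while it is empty
	done  chan struct{} // closed once Run has returned
}

func newTurnLoop() *turnLoop {
	return &turnLoop{queue: make(chan string, 16), done: make(chan struct{})}
}

// Push hands a new item to the loop; callers may do this at any time.
func (l *turnLoop) Push(item string) { l.queue <- item }

// Close signals that no more items will arrive.
func (l *turnLoop) Close() { close(l.queue) }

// Run executes one turn per item and idles between turns.
func (l *turnLoop) Run(onTurn func(turn int, item string)) {
	defer close(l.done)
	turn := 0
	for item := range l.queue { // blocks here while idle
		turn++
		onTurn(turn, item)
	}
}

// runSession drives one loop for a whole session and collects its output.
func runSession(items []string) []string {
	loop := newTurnLoop()
	var out []string
	go loop.Run(func(turn int, item string) {
		out = append(out, fmt.Sprintf("turn %d: %s", turn, item))
	})
	for _, it := range items {
		loop.Push(it)
	}
	loop.Close()
	<-loop.done // wait for the loop goroutine before reading out
	return out
}

func main() {
	for _, line := range runSession([]string{"hello", "explain Eino's architecture"}) {
		fmt.Println(line)
	}
}
```

The point of the sketch is that the loop outlives any single turn, so there is always a running receiver for new input. A one-shot `Run` call has no such receiver once it has started.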
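+
+## The core model of TurnLoop
+
+TurnLoop is a **push-based event loop that manages Agent execution one turn at a time**. Unlike Runner's "call once, execute once" model, TurnLoop keeps running: when a turn ends it goes idle and waits, and when a new item arrives it starts the next turn immediately.
+
+```
+Push(item) → [queue] → GenInput(items) → Agent.Run() → OnAgentEvents(events)
+                ↑                                            │
+                └──────── idle wait / next turn ←────────────┘
+```
+
+Key concepts:
+
+- **Item**: the carrier for user input. This example defines it as `ChatItem`, which can carry either a user message or an approval decision
+- **GenInput**: builds the Agent input from the items in the queue (choosing which items to consume and which to keep for the next turn)
+- **OnAgentEvents**: receives the Agent's output event stream and is responsible for rendering and persistence
+- **Push**: pushes a new item onto the queue, optionally with a preempt option
+
+## One session, one TurnLoop
+
+In the Web scenario of this example, each chat session maps to one TurnLoop instance. When the user sends the first message, the server creates a TurnLoop for that session and calls `Run()` to start it; later messages are fed into the same loop through `Push()`. The loop idles between turns until the session is deleted or the user aborts.
+
+This is the most typical way to use TurnLoop: **the lifecycle of the loop is bound to the user session**. A long-running TurnLoop makes preemption and abort natural operations, because a "running loop" always exists and new input can be fed in at any time.
+
+## Regular flow: idle → new message → answer → idle
+
+The simplest scenario is a user who asks a question, waits for the answer, and then asks the next one:
+
+```go
+// When the user sends the first message, create and start the TurnLoop
+loop := adk.NewTurnLoop(cfg)
+loop.Push(&ChatItem{Query: "hello"})
+loop.Run(ctx)
+// → GenInput builds the input → Agent executes → OnAgentEvents streams the output
+// → the turn ends and TurnLoop goes idle
+
+// The user sends a second message (the loop is idle at this point)
+loop.Push(&ChatItem{Query: "explain Eino's architecture"})
+// → TurnLoop wakes up and starts a new turn: GenInput → Agent → OnAgentEvents → idle
+```
+
+From the user's point of view this flow is no different from the Runner in the previous chapter. The difference is that the TurnLoop **keeps existing** and does not have to be recreated each time. As soon as the user sends a new message while the Agent is still answering, the "preemption" scenario below applies.
+
+## How preemption is implemented
+
+When the user sends a new message while the Agent is answering, the application layer triggers preemption with a single line of code:
+
+```go
+loop.Push(item, adk.WithPreempt[*ChatItem, M](adk.AfterToolCalls))
+```
+
+After receiving this instruction, TurnLoop:
+
+1. Waits for the current tool call to finish (`AfterToolCalls` means a running tool is not interrupted, which avoids inconsistent state)
+2. Cancels the current turn: the context of OnAgentEvents is canceled and the old turn exits
+3. Takes the new item from the queue, builds the input through GenInput, and starts a new turn
+
+The preempt mode selects the safe point that fits the business requirement:
+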
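+| Mode | Behavior |
+| --- | --- |
+| `AfterToolCalls` | Waits for the tool calls currently executing to finish, then cancels the current turn and starts a new one |
+| `AfterChatModel` | Waits for the current model call to finish, then cancels the current turn and starts a new one |
+| `AnySafePoint` | Cancels the current turn at whichever safe point comes first (for example between tool calls or between model calls) and starts a new one |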
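The idea of a safe point can be sketched without any framework: the turn checks for a preempt request only between steps, so a step that has already started always runs to completion. `runTurn` and `preemptDemo` are hypothetical helpers written for this illustration and are not part of the ADK.

```go
package main

import "fmt"

// runTurn executes steps in order and looks for a preempt request only
// after a step has finished, never in the middle of one.
func runTurn(steps []func(), preempt <-chan struct{}) (completed int, preempted bool) {
	for _, step := range steps {
		step()
		completed++
		select {
		case <-preempt: // safe point: the step is done, so the turn can be abandoned
			return completed, true
		default:
		}
	}
	return completed, false
}

// preemptDemo simulates a new message arriving while the second tool runs.
func preemptDemo() (ran []string, completed int, preempted bool) {
	preempt := make(chan struct{})
	steps := []func(){
		func() { ran = append(ran, "tool A") },
		func() {
			close(preempt) // the preempt request arrives mid-step
			ran = append(ran, "tool B")
		},
		func() { ran = append(ran, "tool C") },
	}
	completed, preempted = runTurn(steps, preempt)
	return ran, completed, preempted
}

func main() {
	ran, completed, preempted := preemptDemo()
	fmt.Println(ran, completed, preempted) // [tool A tool B] 2 true
}
```

Tool B finishes even though the request arrived while it was running, and tool C never starts. That is the trade-off the modes in the table express: a later safe point means less risk of inconsistent state, at the cost of a slower reaction to the new input.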
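+
+> In this example the TurnLoop runs in its own goroutine, while the HTTP handler has to write the event stream into the SSE response. The two are coordinated through channels (see `iterEnvelope`/`iterResult` and the `handlerDone` signaling mechanism in [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)). These are details of the HTTP adapter layer and are not part of the TurnLoop API itself.
+
+## How abort is implemented
+
+Abort is simpler: stop the whole TurnLoop directly:
+
+```go
+loop.Stop(adk.WithImmediate()) // cancel immediately, without waiting for the current turn
+loop.Wait()                    // wait until the loop has fully exited
+```
+
+### The three Stop modes
+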
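+| Mode | Behavior |
+| --- | --- |
+| `loop.Stop()` | Exit at the turn boundary: waits for the current turn to finish, then exits |
+| `loop.Stop(WithImmediate())` | Exit immediately: cancels the context of the current turn |
+| `loop.Stop(WithGraceful())` | Exit at a safe point: exits at the next safe point (for example between tool calls) |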
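The three modes differ only in how much in-flight work they let finish. The single-goroutine simulation below makes that visible; `stopMode`, `run` and the step names are hypothetical, and a real loop would receive the stop request from another goroutine with proper synchronization.

```go
package main

import (
	"context"
	"fmt"
)

type stopMode int

const (
	stopAtTurnBoundary stopMode = iota // finish the whole current turn, then exit
	stopGraceful                       // finish the current step, exit at the next safe point
	stopImmediate                      // cancel the current turn's context right away
)

// run executes turns step by step. A stop request "arrives" while the step
// named stopDuring is in flight, and the result lists what happened to each step.
func run(turns [][]string, stopDuring string, mode stopMode) []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	stopRequested := false
	var trace []string
	for _, turn := range turns {
		for _, step := range turn {
			if step == stopDuring {
				stopRequested = true
				if mode == stopImmediate {
					cancel() // the in-flight step observes a canceled context
				}
			}
			if ctx.Err() != nil {
				return append(trace, step+":canceled")
			}
			trace = append(trace, step+":done")
			if stopRequested && mode == stopGraceful {
				return trace // safe point between steps
			}
		}
		if stopRequested {
			return trace // turn boundary
		}
	}
	return trace
}

func main() {
	turns := [][]string{{"a1", "a2", "a3"}, {"b1"}}
	fmt.Println(run(turns, "a2", stopAtTurnBoundary)) // [a1:done a2:done a3:done]
	fmt.Println(run(turns, "a2", stopGraceful))       // [a1:done a2:done]
	fmt.Println(run(turns, "a2", stopImmediate))      // [a1:done a2:canceled]
}
```

In all three cases the second turn (`b1`) never starts; the modes only decide how the current turn ends.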
+
+## Configuring TurnLoop
+
+When creating a TurnLoop, callbacks and options are specified through `TurnLoopConfig`:
+
+```go
+cfg := adk.TurnLoopConfig[*ChatItem, M]{
+    // GenInput: called at the start of every turn; decides what the Agent sees in this turn.
+    // It selects items from the queue to build the Agent input and returns Consumed
+    // (handled in this turn) and Remaining (kept for later turns).
+    GenInput: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], items []*ChatItem) (*adk.GenInputResult[*ChatItem, M], error) {
+        // ...build AgentInput, persist the user message...
+    },
+
+    // PrepareAgent: called once per turn; returns the Agent used for this turn.
+    // This example always returns the same Agent, but a different Agent can be chosen based on items.
+    PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], consumed []*ChatItem) (adk.TypedAgent[M], error) {
+        return agent, nil
+    },
+
+    // OnAgentEvents: receives the Agent's event stream; renders output and persists intermediate messages.
+    // This example hands the event stream to the HTTP handler over a channel for SSE output.
+    OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[*ChatItem, M], events *adk.AsyncIterator[*adk.TypedAgentEvent[M]]) error {
+        // ...hand events to the HTTP handler and wait until they are consumed...
+    },
+
+    // The next three fields enable declarative checkpoints (approval resume); see the next section.
+    GenResume:    makeGenResume(),
+    Store:        checkpointStore,
+    CheckpointID: sessionID,
+}
+
+loop := adk.NewTurnLoop(cfg)
+```
+
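+| Callback | When it is called | Responsibility |
+| --- | --- | --- |
+| `GenInput` | When the queue contains items | Chooses which items to consume and builds the Agent input (can decide which items are kept for the next turn) |
+| `PrepareAgent` | After `GenInput` | Returns the Agent instance used for this turn; allows the Agent configuration to be adjusted dynamically |
+| `OnAgentEvents` | When the Agent produces its event stream | Consumes events, renders output, and persists results; the main entry point for the application layer to handle Agent output |
+| `GenResume` | When resuming from a checkpoint | Extracts the approval result from the newly pushed items and builds `ResumeParams`, which automates approval resume |
+| `Store` + `CheckpointID` | | Enables declarative checkpoints; TurnLoop saves and restores execution state automatically |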
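The `GenInput` row says the callback decides which queued items are consumed now and which wait. One possible policy is sketched below under simplifying assumptions: items are plain strings, at most `maxBatch` items are consumed per turn, and an immediate repeat is dropped from the prompt. The function `genInput` here is a hypothetical helper, not the signature of the ADK callback.

```go
package main

import (
	"fmt"
	"strings"
)

// genInput splits the queue into the items consumed by this turn and the
// items left for later turns, and merges the consumed ones into one prompt.
func genInput(queue []string, maxBatch int) (consumed, remaining []string, prompt string) {
	if maxBatch < 0 {
		maxBatch = 0
	}
	n := len(queue)
	if n > maxBatch {
		n = maxBatch
	}
	consumed, remaining = queue[:n], queue[n:]

	var parts []string
	for i, q := range consumed {
		if i > 0 && q == consumed[i-1] {
			continue // drop an immediate repeat, e.g. a double-clicked send
		}
		parts = append(parts, q)
	}
	return consumed, remaining, strings.Join(parts, "\n")
}

func main() {
	queue := []string{"hi", "hi", "what is ChatModel?", "later"}
	consumed, remaining, prompt := genInput(queue, 3)
	fmt.Println(len(consumed), remaining)
	fmt.Println(prompt)
}
```

Other policies (consume everything, consume only the newest item, route approval items separately) fit the same consumed/remaining shape.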
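+
+> For the complete callback implementations, see [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go).
+
+## Declarative checkpoint: automating approval resume
+
+In Chapter 7 (Runner mode), approval resume requires the application layer to call `runner.ResumeWithParams()` manually and to decide on its own whether a run is a normal execution or a resumed one. TurnLoop offers a simpler approach: declare `Store` and `CheckpointID` in the configuration (see the previous section), and TurnLoop handles saving and resuming automatically:
+
+1. When the Agent reaches an approval interrupt, TurnLoop automatically saves the execution state to `Store` (keyed by `CheckpointID`)
+2. After the user makes an approval decision, the application layer creates a new TurnLoop (with the **same** `CheckpointID`) and pushes the approval item
+3. When the new TurnLoop runs `Run()`, it detects that a checkpoint exists and **automatically calls `GenResume`** (instead of `GenInput`) to obtain the resume parameters
+4. The Agent continues from the interrupt point
+
+The job of `GenResume` is to extract the approval result from the newly pushed items and build `ResumeParams`:
+
+```go
+GenResume: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M],
+    canceledItems, unhandledItems, newItems []*ChatItem,
+) (*adk.GenResumeResult[*ChatItem, M], error) {
+    // newItems contains the item pushed for the approval resume
+    item := newItems[0]
+    return &adk.GenResumeResult[*ChatItem, M]{
+        ResumeParams: &adk.ResumeParams{
+            InterruptID:    item.InterruptID,
+            ApprovalResult: item.ApprovalResult,
+        },
+    }, nil
+}
+```
+
+Compared with Runner's `ResumeWithParams()`, declarative checkpoints free the application layer from managing the "normal execution vs. resumed execution" branch: TurnLoop chooses between `GenInput` and `GenResume` based on whether a checkpoint exists.
+
+## Chapter summary
+
+- **TurnLoop** is a long-running multi-turn execution loop whose lifecycle is bound to the user session
+- **Regular flow**: `Push(item)` → GenInput → Agent → OnAgentEvents → idle → wait for the next Push
+- **Preempt**: `Push(item, WithPreempt(AfterToolCalls))` cancels the current turn and starts a new one in a single line
+- **Abort**: `loop.Stop(WithImmediate())` terminates the whole loop in a single line
+- **Declarative checkpoint**: configure `Store` + `CheckpointID`, and TurnLoop saves and resumes interrupts automatically
+- For the concrete callback implementations, see [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)
+
+## Series wrap-up: a complete Agent application skeleton
+
+Up to this chapter, we have used an Agent that actually runs to tie together Eino's core capabilities:
+
+- **Runtime**: Runner / TurnLoop drive execution, with support for streaming output, preemption, and abort
+- **Tool layer**: Tools such as Filesystem / Shell are plugged in, and tool errors can be handled safely
+- **Middleware**: pluggable middleware/handlers for cross-cutting concerns such as error handling, retries, and approval
+- **Observability**: callbacks/trace connect the key paths, which helps debugging and production monitoring
+- **Human-in-the-loop**: interrupt/resume + checkpoint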
支持审批、补参、分支选择等交互式流程 +- **确定性编排**:compose(graph/chain/workflow)把复杂业务流程组织为可维护、可复用的执行图 +- **业务交付**:A2UI 协议把 Agent 能力以流式 UI 的形式呈现给用户 +- **执行控制**:TurnLoop 提供抢占、中止、多轮生命周期管理,适配真实业务场景的复杂交互需求 + +你可以在这个骨架上逐步替换/扩展任意环节:模型、工具、存储、工作流、前端渲染协议,而不需要推倒重来。 diff --git a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md index 5224228048c..9a409f96d0e 100644 --- a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md +++ b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: v0.8.*-adk middlewares @@ -65,7 +65,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ > 💡 > **功能**: 自动对话历史摘要,防止超出模型上下文窗口限制 -📚 **详细文档**: [Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) +📚 **详细文档**: [Middleware: Summarization](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_summarization) 当对话历史的 Token 数量超过阈值时,自动调用 LLM 生成摘要,压缩上下文。 @@ -247,7 +247,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ > 💡 > 升级到 v0.8 前,请查阅 Breaking Changes 文档了解所有不兼容变更 -📚 **完整文档**: [Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_不兼容更新) +📚 **完整文档**: [Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) **变更概览**: @@ -263,7 +263,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 升级指南 -详细的迁移步骤和代码示例请参考:[Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_不兼容更新) +详细的迁移步骤和代码示例请参考:[Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) **快速检查清单**: diff --git 
"a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_\344\270\215\345\205\274\345\256\271\346\233\264\346\226\260.md" b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes.md similarity index 100% rename from "content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_\344\270\215\345\205\274\345\256\271\346\233\264\346\226\260.md" rename to content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes.md diff --git a/content/zh/docs/eino/release_notes_and_migration/_index.md b/content/zh/docs/eino/release_notes_and_migration/_index.md index 5bfe102c396..07624cb5f08 100644 --- a/content/zh/docs/eino/release_notes_and_migration/_index.md +++ b/content/zh/docs/eino/release_notes_and_migration/_index.md @@ -4,7 +4,7 @@ date: "2026-03-02" lastmod: "" tags: [] title: 发布记录 & 迁移指引 -weight: 8 +weight: 7 --- # 版本管理规范 diff --git a/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md new file mode 100644 index 00000000000..d2a667dfabb --- /dev/null +++ b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md @@ -0,0 +1,61 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: v0.9.* agentic-runtime +weight: 9 +--- + +V0.9 的版本主题是 `agentic-runtime`。该版本主要围绕 ADK 的消息协议、Agent 运行控制和多轮运行时能力展开,在保留 `*schema.Message` 默认路径的同时,引入 `AgenticMessage` 及配套泛型抽象,为更丰富的模型原生 Agent 协议、服务端工具调用、运行中断与恢复打下基础。 + +## 1. 
AgenticMessage 与 ADK 支持 + +V0.9 新增 `schema.AgenticMessage`,用于表达比传统 `schema.Message` 更完整的 Agentic 消息结构。 + +- `AgenticMessage` 采用 content block 模型,支持文本、推理内容、工具调用、工具结果、服务端工具、MCP 工具和多模态内容等结构化片段。 +- `[]ContentBlock` 能更完整地保留不同模型协议响应中的 block 时序;新增 block 类型也更适配 OpenAI Responses API、Claude、Gemini 等协议中的 tool use、reasoning、streaming metadata 等结构。 +- `components/model` 新增 `AgenticModel` 组件,用于接入以 `AgenticMessage` 为输入输出的模型实现。 +- ADK 对 `AgenticMessage` 路径提供 typed agent、typed event、typed runner 和 typed `ChatModelAgent` 支持,使 AgenticModel 能进入 ADK 的 Agent 生命周期。 + +## 2. ChatModelAgent 能力扩展 + +V0.9 对 `ChatModelAgent` 的运行控制、模型调用可靠性和 middleware 扩展点进行了系统增强。 + +### Cancel + +- 新增 Agent Cancel 能力,用于从外部主动终止正在运行的 Agent。 +- 支持安全点取消、递归取消、取消超时升级,以及取消过程中的 checkpoint 持久化。 +- 取消期间发生的 interrupt 会统一进入取消语义,调用方可以通过 `CancelError` 区分主动取消与普通业务失败。 + +### Model Retry + +- Retry 从简单的 error retry 扩展为 `ShouldRetry(ctx, RetryContext) -> RetryDecision`。 +- Retry 决策可以读取模型输出、拒绝不满足条件的输出、修改下一次输入、追加模型 option,并覆盖 backoff。 + +### Model Failover + +- 新增 Model Failover 能力,用于在模型调用失败后切换到备用模型。 +- Failover 决策可以读取失败 attempt 的输出、错误、原始输入和 attempt 序号,并选择下一次使用的模型。 +- 支持为备用模型改写输入;也支持优先复用上一次调用成功的模型,降低每次从固定主模型开始试错的成本。 + +### Middleware 增强 + +- `ChatModelAgentMiddleware` 新增 `AfterAgent`,用于在 Agent 成功结束后执行收尾逻辑。 +- Summarization、reduction、skill、filesystem、plan-task、patch-tool-calls 等 middleware 完成泛型化,支持 `AgenticMessage` 路径。 +- Summarization middleware 新增 `TypedMiddleware.Summarize`,同步 summarization 能力从独立函数转为 middleware 内聚能力。 +- Filesystem middleware 增强多模态读取能力,并增加 PDF pages 校验。 +- 新增 `agentsmd` middleware,用于加载和注入 `AGENTS.md` 风格的项目指令。 +- `ChatModelAgentState` 增加 `ToolInfos` 和 `DeferredToolInfos`,作为 middleware 调整模型可见工具集合的主路径。 +- `ToolInfos` 表示当前模型调用直接可见的工具;`DeferredToolInfos` 表示可由模型通过工具搜索机制按需发现的候选工具。 +- Tool search middleware 支持三类工具加载方式:使用模型侧原生 tool search 能力从 deferred tools 中按需加载;按模型协议要求提供固定 schema 的 `ToolSearchTool`,由模型通过该入口搜索 deferred tools;不依赖模型侧协议,使用 Eino 提供的自定义 `tool_search` tool 检索工具,并把命中的工具追加到常规 `ToolInfos`。 +- Compose 新增 
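`AgenticToolsNode` and `ToolsNode` add support for tool name and argument aliases.
+
+## 3. TurnLoop
+
+V0.9 adds `TurnLoop`, which promotes a one-shot Agent run into a turn-level runtime that can run continuously and be driven externally.
+
+- Built for multi-turn operation: `TurnLoop` continuously receives external input; each turn independently plans its input, constructs the Agent, and consumes events, which suits long-running interactive Agents.
+- Supports input merging: `GenInput` decides at the turn boundary which inputs the current turn consumes and which keep waiting, so applications can implement strategies such as batching, deduplication, and merging consecutive user inputs.
+- Supports preemption: a `Push` with the preempt option atomically writes the new input and requests cancellation of the current turn, so high-priority input can interrupt a running Agent.
+- Supports declarative checkpoint/resume: on resume, the application does not need to restore the input queue itself; `TurnLoop` distinguishes interrupted inputs, not-yet-processed inputs, and inputs that arrive after resume, and the application only declares how these inputs re-enter subsequent turns.
diff --git a/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md
new file mode 100644
index 00000000000..ab08c27725c
--- /dev/null
+++ b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md
@@ -0,0 +1,198 @@
+---
+Description: ""
+date: "2026-05-19"
+lastmod: ""
+tags: []
+title: Eino V0.9 Migration Notes
+weight: 1
+---
+
+This document lists the API and semantic changes that existing users need to watch for when upgrading from V0.8.x to V0.9 `agentic-runtime`. New capabilities not listed here generally do not affect the existing `*schema.Message` path.
+
+## Explicit API Changes
+
+### Agent Transfer / Workflow Agent / Supervisor Marked as NOT RECOMMENDED
+
+V0.9 marks the multi-Agent collaboration patterns based on Agent Transfer (full context sharing) as **NOT RECOMMENDED** as a whole. The affected public APIs include:
+
+**Agent Transfer related**:
+
+- `SetSubAgents`
+- `AgentWithOptions` / `WithDisallowTransferToParent` / `WithHistoryRewriter`
+- `ChatModelAgentConfig.Exit` / `ChatModelAgentConfig.OutputKey`
+- `AgentWithDeterministicTransferTo`
+- `OnSetSubAgents` / `OnSetAsSubAgent` / `OnDisallowTransferToParent`
+
+**Workflow Agent**:
+
+- `NewSequentialAgent` / `SequentialAgentConfig`
+- `NewParallelAgent` / `ParallelAgentConfig`
+- `NewLoopAgent` / `LoopAgentConfig`
+
+**Supervisor**:
+
+- `supervisor.New` / `supervisor.Config`
+
+> 💡
+> These APIs can still be used and will not cause compile failures, but they are not recommended for new projects. Experience shows that the transfer pattern, in which Agents share the full conversation context, does not outperform the tool-calling pattern in practice.
+
+Recommended migration paths:
+
+- Use `ChatModelAgent` + `AgentTool` (wrap sub-Agents as tools and call them on demand).
+- Use `DeepAgent` (structured sub-task delegation).
+- Both approaches provide better controllability, observability, and prompt cache efficiency.
+
+### ChatModelAgentMiddleware Adds AfterAgent
+
+`ChatModelAgentMiddleware` adds an `AfterAgent` method. Types that implement this interface by hand must add the method, otherwise compilation fails.
+
+Recommended approach:
+
+- If the middleware needs no special cleanup logic, embed `*adk.BaseChatModelAgentMiddleware`.
+- If the middleware needs to clean up state, record events, or add statistics after the Agent finishes successfully, implement `AfterAgent(ctx, state)`.
+
+Scope of impact:
+
+- Only affects user code that explicitly implements `ChatModelAgentMiddleware`.
+- Code that extends via composition with `BaseChatModelAgentMiddleware` remains compatible.
+
+### AgentMiddleware Struct Deprecated
+
+The `AgentMiddleware` struct and the `ChatModelAgentConfig.Middlewares` field are marked **Deprecated** and will be removed in a future version.
+
+> 💡
+> Both AgentMiddleware and the Middlewares field are deprecated. Please migrate to the interface-based Handlers (ChatModelAgentMiddleware) approach.
+
+How to migrate:
+
+- Move the logic of each item in `Middlewares []AgentMiddleware` to `Handlers []ChatModelAgentMiddleware`.
+- `AgentMiddleware.BeforeChatModel` → implement `ChatModelAgentMiddleware.BeforeModelRewriteState`.
+- `AgentMiddleware.AfterChatModel` → implement `ChatModelAgentMiddleware.AfterModelRewriteState`.
+- `AgentMiddleware.WrapToolCall` → implement `ChatModelAgentMiddleware.WrapToolCall`.
+- `AgentMiddleware.AdditionalInstruction` → modify `state.Instruction` in `BeforeModelRewriteState`.
+- `AgentMiddleware.AdditionalTools` → modify `state.ToolInfos` in `BeforeModelRewriteState`.
+- If the middleware needs no special logic, embed `*adk.BaseChatModelAgentMiddleware` to get default no-op implementations.
+
+Scope of impact:
+
+- All code that uses `AgentMiddleware` in `ChatModelAgentConfig.Middlewares` needs to migrate.
+- In the current version both approaches can coexist (Handlers run after Middlewares), but migrating early is recommended to avoid compile failures when the field is removed in a future version.
+
+### summarization.SummarizeMessages Removed
+
+`summarization.SummarizeMessages` and `summarization.SummarizeOutput` are no longer exported.
+
+How to migrate:
+
+- When constructing the summarization middleware, keep using `summarization.New` or `summarization.NewTyped`.
+- To actively trigger synchronous summarization, use `TypedMiddleware.Summarize`.
+
+This change consolidates summarization configuration, state reading, and execution logic inside the middleware, so that standalone functions do not diverge from runtime state semantics.
+
+## Capabilities with Semantic Changes to Watch
+
+### Summarization Finalize Post-Processing Semantics Changed
+
+In V0.8.x, the summarization middleware ran the default summary post-processing first and then called the user-configured `Finalize`. As a result, the `summary` received by a custom `Finalize` already included the `PreserveUserMessages` replacement, the `TranscriptFilePath` injection, and the summary preamble.
+
+In V0.9, if `Config.Finalize` is set, the middleware passes the raw summary generated by the model directly to `Finalize` and no longer runs the default post-processing automatically. The affected configuration options include:
+
+- `PreserveUserMessages`
+- `TranscriptFilePath`
+
+How to migrate:
+
+- To keep the default post-processing, do not set `Finalize`; let the middleware use the default finalization path.
+- If you must customize `Finalize` but still want the default post-processing, first build the default finalizer via `DefaultFinalizer`, then compose it explicitly in your custom logic.
+- `DefaultFinalizer` does not automatically read the outer `Config.PreserveUserMessages` and `Config.TranscriptFilePath`; pass them explicitly through `DefaultFinalizerConfig`.
+- Code that uses `NewFinalizer().PreserveSkills(...).Build()` needs particular review: that finalizer only handles preserving skills and does not automatically add `PreserveUserMessages` and `TranscriptFilePath`.
+
+### Tool List Modification Path Adjusted
+
+`ModelContext.Tools` is no longer the recommended entry point for modifying the tool list.
+
+Upgrade recommendations:
+
+- Modify `state.ToolInfos` in `BeforeModelRewriteState`.
+- For model-native deferred tool search, modify `state.DeferredToolInfos`.
+- Modifying the tool list in `WrapModel` is not recommended; the change only affects the current model call and is not inherited by subsequent middleware, subsequent turns, or checkpoint/resume.
+
+### ToolSearch / AgentsMD Middleware Internal Implementation Migrated
+
+The internal implementation of the ToolSearch and AgentsMD middleware moved from `WrapModel` (v0.8.x) to `BeforeModelRewriteState` (v0.9).
+
+> 💡
+> For users who only use `toolsearch.New()` / `agentsmd.New()`, the public API (Config structs, constructors) is unchanged and no code changes are needed.
+
+Semantic changes:
+
+- **v0.8.x**: the middleware temporarily injected the tool list at model call time through `WrapModel` (via `model.Option`); the change was not persisted and did not enter agent state.
+- **v0.9**: the middleware directly modifies `state.ToolInfos` / `state.DeferredToolInfos` and `state.Messages` (injecting reminder messages) in `BeforeModelRewriteState`; the change is persisted with the state.
+
+Impact:
+
+- **Checkpoint/Resume**: the reminder messages and dynamic tool search results injected by ToolSearch are now persisted with the checkpoint and correctly rebuilt on resume; in v0.8.x this information was lost after resume.
+- **Visibility to other middleware**: the `BeforeModelRewriteState` / `AfterModelRewriteState` of subsequent middleware can now see the `state.ToolInfos` modified by ToolSearch, whereas in v0.8.x these modifications were invisible to other middleware.
+- **Prompt Cache**: because tool list changes are now reflected in state (rather than injected temporarily on each model call), the model's KV-cache behavior may differ.
+
+Note:
+
+- If a custom middleware relies on `ModelContext.Tools` in `WrapModel` to read or modify the tool list, migrate it to read `state.ToolInfos` in `BeforeModelRewriteState`.
+
+### Model Retry Decision Semantics Enhanced
+
+`ModelRetryConfig` adds `ShouldRetry`. When `ShouldRetry` is non-nil, `IsRetryAble` is ignored.
+
+Note:
+
+- The old `IsRetryAble` can still be used for simple error-based retries.
+- When using `ShouldRetry`, explicitly handle the case where the output succeeds but is not acceptable to the business logic.
+- Interrupts and `ErrStreamCanceled` are not treated as ordinary retry errors.
+
+### Cancel Error Semantics
+
+With the active cancellation semantics introduced in V0.9, applications need to distinguish active cancellation, ordinary errors, and business interrupts.
+
+Upgrade recommendations:
+
+- Upper layers should distinguish `CancelError`, ordinary errors, and business interrupts.
+- If the application actively uses `WithCancel`, do not treat `CancelError` as an ordinary business failure.
+
+### AgenticMessage Migration Requires Understanding the New Message Structure
+
+`TypedChatModelAgent[*schema.AgenticMessage]` is a new path for model-native Agentic protocols. Migrating to it is more than changing the generic parameter from `*schema.Message` to `*schema.AgenticMessage`; message content must also be handled according to the content block structure of `AgenticMessage`.
+
+Note:
+
+- The AgenticMessage path uses `AgenticModel` and `AgenticToolsNode` to handle tool calls.
+- Tool calls and tool results are expressed as `AgenticMessage` content blocks; the tool call / tool result content blocks in particular must be handled correctly.
+- Agent transfer is not available on the AgenticMessage path.
+- Existing applications that do not need a model-native Agentic protocol should stay on the default `*schema.Message` path; migrate only when you explicitly need to integrate the `AgenticModel` protocol.
+
+### Model Adapters Need to Recognize New Options
+
+With `AgenticModel` introduced in V0.9, model adapters need to handle call-time options more strictly. `AgenticModel` is an alias of `BaseModel[*schema.AgenticMessage]` and no longer provides an enhanced interface like `ToolCallingChatModel.WithTools`; tool binding is uniformly passed in as a `model.Option` via `model.WithTools`.
+
+Note:
+
+- All model adapters that support AgenticMessage should read `Options.Tools` and map it to the provider's tool calling protocol.
+- `AgenticModel` should not require users to first call some `WithTools` method to obtain a "model instance with tools"; ADK passes the current tool list via `model.WithTools` on every model call.
+- If an adapter reads tools only from its own config and ignores `model.WithTools`, then on the ChatModelAgent / AgenticToolsNode path the model will not see the tools, or the tool list will not follow runtime changes.
+
+V0.9 also adds the following to `model.Options`:
+
+- `DeferredTools`
+- `ToolSearchTool`
+- `AgenticToolChoice`
+
+Existing model adapters that ignore these options usually will not fail to compile, but deferred tool search, model-native tool search, or agentic tool choice will not take effect. Adapter maintainers should fill in the conversion logic according to the target provider's protocol.
+
+### ToolInfo Serialization Form Changed
+
+`ToolInfo` adds explicit JSON/Gob encoding and decoding to preserve `ParamsOneOf`.
+
+Impact:
+
+- `ToolInfo` now appears in `ChatModelAgentState.ToolInfos` / `DeferredToolInfos`, so it may enter checkpoints together with the Agent state.
+- The explicit JSON/Gob encoding ensures that `ParamsOneOf` is not lost during checkpoint, deep copy, and restore.
+- If an external system depends directly on the old `ToolInfo` JSON form, re-verify serialization compatibility.
diff --git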
a/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md b/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md
index fbda1b7ed12..4c26f5728b5 100644
--- a/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md
+++ b/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md
@@ -74,7 +74,7 @@ weight: 2
 ### BugFix
-- Fixed the SSTI vulnerability in the Jinja chat template (gonja template injection in langchaingo)
+- Fixed the SSTI vulnerability in the Jinja chat template [gonja template injection in langchaingo](https://bytedance.larkoffice.com/docx/UvqxdlFfSoTIr1xtsQ5cIZTVn2b)
 ## v0.2.0
diff --git a/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png b/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png
new file mode 100644
index 00000000000..9752e8ef3bc
Binary files /dev/null and b/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png differ
diff --git a/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png b/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png
deleted file mode 100644
index d0994449c34..00000000000
Binary files a/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png and /dev/null differ
diff --git a/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png b/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png
new file mode 100644
index 00000000000..31a535951a7
Binary files /dev/null and b/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png differ
diff --git a/static/img/eino/eino_adk_write_todos.png b/static/img/eino/HOJtbxNKWoibi2xzXrAcx0BUndb.png
similarity index 100%
rename from static/img/eino/eino_adk_write_todos.png
rename to static/img/eino/HOJtbxNKWoibi2xzXrAcx0BUndb.png
diff --git a/static/img/eino/A737bctqLoOzNrxbK8Hc5ccmnEb.png b/static/img/eino/Ifu5bvB6conps5xBH5fcFdiCnCW.png
similarity index 100%
rename from static/img/eino/A737bctqLoOzNrxbK8Hc5ccmnEb.png
rename to static/img/eino/Ifu5bvB6conps5xBH5fcFdiCnCW.png
diff --git a/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png b/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png
deleted file mode 100644
index 997eeaf21aa..00000000000
Binary files a/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_excel_using_deep.png b/static/img/eino/PhKjbQyKZoqaM9xyxptcceM9nsg.png
similarity index 100%
rename from static/img/eino/eino_adk_excel_using_deep.png
rename to static/img/eino/PhKjbQyKZoqaM9xyxptcceM9nsg.png
diff --git a/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png b/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png
new file mode 100644
index 00000000000..332a1f260b8
Binary files /dev/null and b/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png differ
diff --git a/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png b/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png
deleted file mode 100644
index 0d005ed243f..00000000000
Binary files a/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png and /dev/null differ
diff --git a/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png b/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png
new file mode 100644
index 00000000000..75c1f0d62be
Binary files /dev/null and b/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png differ
diff --git a/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png b/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png
new file mode 100644
index 00000000000..018eb4f5742
Binary files /dev/null and b/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png differ
diff --git a/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png b/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png
new file mode 100644
index 00000000000..6fe56677c4b
Binary files /dev/null and b/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png differ
diff --git a/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png b/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png
deleted file mode 100644
index 8022d0dc902..00000000000
Binary files a/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_chat_model_agent_view.png b/static/img/eino/eino_adk_chat_model_agent_view.png
deleted file mode 100644
index 4480f271c5c..00000000000
Binary files a/static/img/eino/eino_adk_chat_model_agent_view.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_collaboration_example.png b/static/img/eino/eino_adk_collaboration_example.png
deleted file mode 100644
index d4b4f93b456..00000000000
Binary files a/static/img/eino/eino_adk_collaboration_example.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_collaboration_run_path_sequential.png b/static/img/eino/eino_adk_collaboration_run_path_sequential.png
deleted file mode 100644
index d75265eab0f..00000000000
Binary files a/static/img/eino/eino_adk_collaboration_run_path_sequential.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_deterministic_transfer.png b/static/img/eino/eino_adk_deterministic_transfer.png
deleted file mode 100644
index ac1e2f9e20d..00000000000
Binary files a/static/img/eino/eino_adk_deterministic_transfer.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_directory_structure.png b/static/img/eino/eino_adk_directory_structure.png
deleted file mode 100644
index 3bb9b51236d..00000000000
Binary files a/static/img/eino/eino_adk_directory_structure.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_implementation_nested_loop_sequential.png b/static/img/eino/eino_adk_implementation_nested_loop_sequential.png
deleted file mode 100644
index b8e4e0ced2b..00000000000
Binary files a/static/img/eino/eino_adk_implementation_nested_loop_sequential.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_loop_agent.png b/static/img/eino/eino_adk_loop_agent.png
deleted file mode 100644
index c0037634621..00000000000
Binary files a/static/img/eino/eino_adk_loop_agent.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_loop_definition.png b/static/img/eino/eino_adk_loop_definition.png
deleted file mode 100644
index 61d49ad8595..00000000000
Binary files a/static/img/eino/eino_adk_loop_definition.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_loop_exit.png b/static/img/eino/eino_adk_loop_exit.png
deleted file mode 100644
index b2846e59866..00000000000
Binary files a/static/img/eino/eino_adk_loop_exit.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_message_event.png b/static/img/eino/eino_adk_message_event.png
deleted file mode 100644
index 432abe89ee7..00000000000
Binary files a/static/img/eino/eino_adk_message_event.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_module_architecture.png b/static/img/eino/eino_adk_module_architecture.png
deleted file mode 100644
index 5f90a1cb074..00000000000
Binary files a/static/img/eino/eino_adk_module_architecture.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_overview_sequential.png b/static/img/eino/eino_adk_overview_sequential.png
deleted file mode 100644
index ec96a47852c..00000000000
Binary files a/static/img/eino/eino_adk_overview_sequential.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_parallel_agent.png b/static/img/eino/eino_adk_parallel_agent.png
deleted file mode 100644
index e46c2031c91..00000000000
Binary files a/static/img/eino/eino_adk_parallel_agent.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_parallel_controller_overview.png b/static/img/eino/eino_adk_parallel_controller_overview.png
deleted file mode 100644
index 934ef4de58d..00000000000
Binary files a/static/img/eino/eino_adk_parallel_controller_overview.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_parallel_definition.png b/static/img/eino/eino_adk_parallel_definition.png
deleted file mode 100644
index e46c2031c91..00000000000
Binary files a/static/img/eino/eino_adk_parallel_definition.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_parallel_use_case.png b/static/img/eino/eino_adk_parallel_use_case.png
deleted file mode 100644
index 8d4fcf8bda2..00000000000
Binary files a/static/img/eino/eino_adk_parallel_use_case.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_parallel_yet_another_2.png b/static/img/eino/eino_adk_parallel_yet_another_2.png
deleted file mode 100644
index 934ef4de58d..00000000000
Binary files a/static/img/eino/eino_adk_parallel_yet_another_2.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_plan_execute_replan.png b/static/img/eino/eino_adk_plan_execute_replan.png
deleted file mode 100644
index e55f6b66e51..00000000000
Binary files a/static/img/eino/eino_adk_plan_execute_replan.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_preview_tree.png b/static/img/eino/eino_adk_preview_tree.png
deleted file mode 100644
index 3193c0ec254..00000000000
Binary files a/static/img/eino/eino_adk_preview_tree.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_quick_start_agent_types.png b/static/img/eino/eino_adk_quick_start_agent_types.png
deleted file mode 100644
index de96fda45b2..00000000000
Binary files a/static/img/eino/eino_adk_quick_start_agent_types.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_run_path.png b/static/img/eino/eino_adk_run_path.png
deleted file mode 100644
index 860b928c2d8..00000000000
Binary files a/static/img/eino/eino_adk_run_path.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_run_path_deterministic.png b/static/img/eino/eino_adk_run_path_deterministic.png
deleted file mode 100644
index 36154fa1fad..00000000000
Binary files a/static/img/eino/eino_adk_run_path_deterministic.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_run_path_sub_agent.png b/static/img/eino/eino_adk_run_path_sub_agent.png
deleted file mode 100644
index 6e44d1197a9..00000000000
Binary files a/static/img/eino/eino_adk_run_path_sub_agent.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_self_driving.png b/static/img/eino/eino_adk_self_driving.png
deleted file mode 100644
index 3193c0ec254..00000000000
Binary files a/static/img/eino/eino_adk_self_driving.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequence_diagram.png b/static/img/eino/eino_adk_sequence_diagram.png
deleted file mode 100644
index 9e9d14dd810..00000000000
Binary files a/static/img/eino/eino_adk_sequence_diagram.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequential_agent.png b/static/img/eino/eino_adk_sequential_agent.png
deleted file mode 100644
index 99d71862ccf..00000000000
Binary files a/static/img/eino/eino_adk_sequential_agent.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequential_controller.png b/static/img/eino/eino_adk_sequential_controller.png
deleted file mode 100644
index ff9458c4f31..00000000000
Binary files a/static/img/eino/eino_adk_sequential_controller.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequential_definition.png b/static/img/eino/eino_adk_sequential_definition.png
deleted file mode 100644
index 99d71862ccf..00000000000
Binary files a/static/img/eino/eino_adk_sequential_definition.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequential_quickstart.png b/static/img/eino/eino_adk_sequential_quickstart.png
deleted file mode 100644
index 86c80eecb80..00000000000
Binary files a/static/img/eino/eino_adk_sequential_quickstart.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_sequential_with_loop.png b/static/img/eino/eino_adk_sequential_with_loop.png
deleted file mode 100644
index 71feae7155d..00000000000
Binary files a/static/img/eino/eino_adk_sequential_with_loop.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_streaming.png b/static/img/eino/eino_adk_streaming.png
deleted file mode 100644
index 1ecc91272ce..00000000000
Binary files a/static/img/eino/eino_adk_streaming.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_supervisor.png b/static/img/eino/eino_adk_supervisor.png
deleted file mode 100644
index 5e10e2abb64..00000000000
Binary files a/static/img/eino/eino_adk_supervisor.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_supervisor_definition.png b/static/img/eino/eino_adk_supervisor_definition.png
deleted file mode 100644
index b733e9ba381..00000000000
Binary files a/static/img/eino/eino_adk_supervisor_definition.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_supervisor_example.png b/static/img/eino/eino_adk_supervisor_example.png
deleted file mode 100644
index 042240dc3c2..00000000000
Binary files a/static/img/eino/eino_adk_supervisor_example.png and /dev/null differ
diff --git a/static/img/eino/eino_adk_yet_another_loop.png b/static/img/eino/eino_adk_yet_another_loop.png
deleted file mode 100644
index b2846e59866..00000000000
Binary files a/static/img/eino/eino_adk_yet_another_loop.png and /dev/null differ
diff --git a/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png b/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png
deleted file mode 100644
index 8022d0dc902..00000000000
Binary files a/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png and /dev/null differ