Add minicpm5 tool call parser#23802
Conversation
|
@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review. |
I'll review the |
There was a problem hiding this comment.
Thanks for the PR. The code present deviates too much from what is already established. I don't really see anything that would necessitate the need for these deviations.
I'll let @pwilkin weigh in if he thinks a dedicated parser is necessary or if this can be handled by the autoparser with some tweaks.
| std::string & common_chat_peg_mapper::args_target() { | ||
| return (current_tool && !current_tool->name.empty()) ? current_tool->arguments : args_buffer; | ||
| common_chat_tool_call * tool = active_tool(); | ||
| return (tool && !tool->name.empty()) ? tool->arguments : args_buffer; | ||
| } | ||
|
|
||
| common_chat_tool_call * common_chat_peg_mapper::active_tool() { | ||
| if (committed_tool_idx.has_value()) { | ||
| return &result.tool_calls.at(committed_tool_idx.value()); | ||
| } | ||
| if (pending_tool_call.has_value()) { | ||
| return &pending_tool_call.value(); | ||
| } | ||
| return nullptr; |
There was a problem hiding this comment.
If it's not broke, don't fix it.
| } | ||
| } | ||
|
|
||
| common_peg_parser common_chat_peg_builder::minicpm5_xml_tool_calls(const ordered_json & tools, |
There was a problem hiding this comment.
Inline in the chat param init for the model template.
| auto arg_choice = choice(); | ||
| for (const auto & el : params.at("properties").items()) { | ||
| const std::string & prop_name = el.key(); | ||
| const std::string & prop_type = el.value().value("type", "string"); |
There was a problem hiding this comment.
Please use the established patterns in the repo for determining if a type is a string. See common_schema_info.
| auto replace_all = [](std::string & s, const std::string & from, const std::string & to) { | ||
| if (from.empty()) { | ||
| return; | ||
| } | ||
| for (size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos;) { | ||
| s.replace(pos, from.size(), to); | ||
| pos += to.size(); | ||
| } | ||
| }; | ||
|
|
||
| replace_all(out, "<functionname=", "<function name="); | ||
| replace_all(out, "<paramname=", "<param name="); | ||
|
|
||
| // Some GGUF outputs drop opening tag names but keep attributes, e.g. | ||
| // ` name="python"> name="code">...` instead of `<function name="python">...`. | ||
| static const std::regex LEADING_FUNC_ATTR(R"((?:^|[\n\r])\s*name=\"([^\"]+)\">)"); | ||
| out = std::regex_replace(out, LEADING_FUNC_ATTR, "\n<function name=\"$1\">"); | ||
| static const std::regex PARAM_ATTR(R"(>\s*name=\"([^\"]+)\">)"); | ||
| out = std::regex_replace(out, PARAM_ATTR, "><param name=\"$1\">"); | ||
|
|
||
| static const std::string IM_END = "<|im_end|>"; | ||
| if (out.size() >= IM_END.size() && | ||
| out.compare(out.size() - IM_END.size(), IM_END.size(), IM_END) == 0) { | ||
| out.erase(out.size() - IM_END.size()); | ||
| while (!out.empty() && (out.back() == '\n' || out.back() == ' ')) { | ||
| out.pop_back(); | ||
| } | ||
| } | ||
|
|
||
| return out; |
There was a problem hiding this comment.
All of this should be defined in the parser. None of this looks like it needs to be done post process.
| static const std::string SP_SPACE = "\xC4\xA0"; // U+0120 | ||
| static const std::string SP_NL = "\xC1\x8A"; // U+010A | ||
|
|
||
| for (size_t pos = 0; (pos = out.find(SP_SPACE, pos)) != std::string::npos;) { | ||
| out.replace(pos, SP_SPACE.size(), " "); | ||
| pos += 1; | ||
| } | ||
| for (size_t pos = 0; (pos = out.find(SP_NL, pos)) != std::string::npos;) { | ||
| out.replace(pos, SP_NL.size(), "\n"); | ||
| pos += 1; | ||
| } |
There was a problem hiding this comment.
This seems easy enough to do in a single pass.
| reasoning = p.reasoning(p.until_one_of(TOOL_START_MARKERS)) + p.optional(p.literal("\n")) + | ||
| p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n"))); |
There was a problem hiding this comment.
Too many optionals. This can be a choice with two branches: one that completes a thought with the end thinking tag and one that preempts thinking with the start of a tool call. Use p.space() to consume whitespace.
| p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n"))); | ||
| } | ||
|
|
||
| auto suffix = p.optional(p.literal("<|im_end|>") + p.optional(p.literal("\n"))); |
| // MiniCPM5 tool calls are parsed post-hoc via peg-minicpm5 (XML output). | ||
| // Do not attach JSON-schema GBNF here — it is invalid for this format and | ||
| // can destabilize llama-server. SGLang/vLLM use parsers only for MiniCPM5. | ||
| data.grammar.clear(); | ||
| data.grammar_lazy = false; | ||
| data.grammar_triggers = {}; | ||
| (void) include_grammar; |
There was a problem hiding this comment.
Even if you don't use the JSON schema to grammar implementation, the PEG parser will compose GBNF rules to enforce tool call structure. Please don't deviate from established patterns.
| const std::string normalized_input = params.format == COMMON_CHAT_FORMAT_PEG_MINICPM5 ? | ||
| common_chat_normalize_minicpm5_output(input) : | ||
| input; | ||
|
|
||
| const std::string effective_input = params.generation_prompt.empty() | ||
| ? input | ||
| : params.generation_prompt + input; | ||
| ? normalized_input | ||
| : params.generation_prompt + normalized_input; |
There was a problem hiding this comment.
Normalization should occur in a dedicated mapper. Not here.
| endif() | ||
|
|
||
| llama_build_and_test(test-chat-peg-parser.cpp peg-parser/simple-tokenize.cpp) | ||
| llama_build_and_test(test-chat-peg-parser-minicpm5.cpp) |
There was a problem hiding this comment.
To reiterate, do not deviate from established patterns. Chat template tests are located under test-chat.cpp.
|
Can you add the chat template here? Then we'll be able to just test it with the autoparser and see how much works out of the box and what needs to be possibly fixed. |
|
@pwilkin @CISC @aldehir |
Overview
Add MiniCPM5 tool call support for
llama-server.MiniCPM5 outputs tool calls as XML (
<function name="..."><param name="...">...</param></function>), not JSON. This PR adds apeg-minicpm5parser to detect the template, parse tool calls (including streaming), normalize common output quirks, and expose OpenAI-compatibletool_calls.Also fixes a peg mapper streaming bug (use tool index instead of dangling pointer) and adds Jinja
min/maxfilters needed by the MiniCPM5 template.Test plan
test-chat-peg-parser-minicpm5test-jinja(min/max)llama-server