Skip to content

Add minicpm5 tool call parser#23802

Open
zhangtao2-1 wants to merge 3 commits into
ggml-org:masterfrom
zhangtao2-1:minicpm5_tool_call
Open

Add minicpm5 tool call parser#23802
zhangtao2-1 wants to merge 3 commits into
ggml-org:masterfrom
zhangtao2-1:minicpm5_tool_call

Conversation

@zhangtao2-1
Copy link
Copy Markdown
Contributor

Overview

Add MiniCPM5 tool call support for llama-server.

MiniCPM5 outputs tool calls as XML (<function name="..."><param name="...">...</param></function>), not JSON. This PR adds a peg-minicpm5 parser to detect the template, parse tool calls (including streaming), normalize common output quirks, and expose OpenAI-compatible tool_calls.

Also fixes a peg mapper streaming bug (use tool index instead of dangling pointer) and adds Jinja min/max filters needed by the MiniCPM5 template.

Test plan

  • test-chat-peg-parser-minicpm5
  • test-jinja (min / max)
  • Manual test with MiniCPM5 GGUF + tools on llama-server

@zhangtao2-1 zhangtao2-1 requested review from a team, CISC and ggerganov as code owners May 28, 2026 07:34
@zhangtao2-1
Copy link
Copy Markdown
Contributor Author

@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

Comment thread common/jinja/value.cpp
Comment thread tests/test-jinja.cpp
@CISC
Copy link
Copy Markdown
Member

CISC commented May 28, 2026

@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

I'll review the jinja part, leaving the parser to @ggml-org/llama-common

Copy link
Copy Markdown
Contributor

@aldehir aldehir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. The code present deviates too much from what is already established. I don't really see anything that would necessitate the need for these deviations.

I'll let @pwilkin weigh in if he thinks a dedicated parser is necessary or if this can be handled by the autoparser with some tweaks.

Comment thread common/chat-peg-parser.cpp Outdated
Comment on lines +213 to +225
std::string & common_chat_peg_mapper::args_target() {
return (current_tool && !current_tool->name.empty()) ? current_tool->arguments : args_buffer;
common_chat_tool_call * tool = active_tool();
return (tool && !tool->name.empty()) ? tool->arguments : args_buffer;
}

common_chat_tool_call * common_chat_peg_mapper::active_tool() {
if (committed_tool_idx.has_value()) {
return &result.tool_calls.at(committed_tool_idx.value());
}
if (pending_tool_call.has_value()) {
return &pending_tool_call.value();
}
return nullptr;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not broke, don't fix it.

Comment thread common/chat-peg-parser.cpp Outdated
}
}

common_peg_parser common_chat_peg_builder::minicpm5_xml_tool_calls(const ordered_json & tools,
Copy link
Copy Markdown
Contributor

@aldehir aldehir May 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline in the chat param init for the model template.

Comment thread common/chat-peg-parser.cpp Outdated
auto arg_choice = choice();
for (const auto & el : params.at("properties").items()) {
const std::string & prop_name = el.key();
const std::string & prop_type = el.value().value("type", "string");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the established patterns in the repo for determining if a type is a string. See common_schema_info.

Comment thread common/chat.cpp Outdated
Comment on lines +2280 to +2309
auto replace_all = [](std::string & s, const std::string & from, const std::string & to) {
if (from.empty()) {
return;
}
for (size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos;) {
s.replace(pos, from.size(), to);
pos += to.size();
}
};

replace_all(out, "<functionname=", "<function name=");
replace_all(out, "<paramname=", "<param name=");

// Some GGUF outputs drop opening tag names but keep attributes, e.g.
// ` name="python"> name="code">...` instead of `<function name="python">...`.
static const std::regex LEADING_FUNC_ATTR(R"((?:^|[\n\r])\s*name=\"([^\"]+)\">)");
out = std::regex_replace(out, LEADING_FUNC_ATTR, "\n<function name=\"$1\">");
static const std::regex PARAM_ATTR(R"(>\s*name=\"([^\"]+)\">)");
out = std::regex_replace(out, PARAM_ATTR, "><param name=\"$1\">");

static const std::string IM_END = "<|im_end|>";
if (out.size() >= IM_END.size() &&
out.compare(out.size() - IM_END.size(), IM_END.size(), IM_END) == 0) {
out.erase(out.size() - IM_END.size());
while (!out.empty() && (out.back() == '\n' || out.back() == ' ')) {
out.pop_back();
}
}

return out;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this should be defined in the parser. None of this looks like it needs to be done post process.

Comment thread common/chat.cpp Outdated
Comment on lines +2268 to +2278
static const std::string SP_SPACE = "\xC4\xA0"; // U+0120
static const std::string SP_NL = "\xC1\x8A"; // U+010A

for (size_t pos = 0; (pos = out.find(SP_SPACE, pos)) != std::string::npos;) {
out.replace(pos, SP_SPACE.size(), " ");
pos += 1;
}
for (size_t pos = 0; (pos = out.find(SP_NL, pos)) != std::string::npos;) {
out.replace(pos, SP_NL.size(), "\n");
pos += 1;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems easy enough to do in a single pass.

Comment thread common/chat.cpp Outdated
Comment on lines +2374 to +2375
reasoning = p.reasoning(p.until_one_of(TOOL_START_MARKERS)) + p.optional(p.literal("\n")) +
p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too many optionals. This can be a choice with two branches: one that completes a thought with the end thinking tag and one that preempts thinking with the start of a tool call. Use p.space() to consume whitespace.

Comment thread common/chat.cpp Outdated
p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));
}

auto suffix = p.optional(p.literal("<|im_end|>") + p.optional(p.literal("\n")));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not be needed.

Comment thread common/chat.cpp Outdated
Comment on lines +2395 to +2401
// MiniCPM5 tool calls are parsed post-hoc via peg-minicpm5 (XML output).
// Do not attach JSON-schema GBNF here — it is invalid for this format and
// can destabilize llama-server. SGLang/vLLM use parsers only for MiniCPM5.
data.grammar.clear();
data.grammar_lazy = false;
data.grammar_triggers = {};
(void) include_grammar;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if you don't use the JSON schema to grammar implementation, the PEG parser will compose GBNF rules to enforce tool call structure. Please don't deviate from established patterns.

Comment thread common/chat.cpp Outdated
Comment on lines +2722 to +2728
const std::string normalized_input = params.format == COMMON_CHAT_FORMAT_PEG_MINICPM5 ?
common_chat_normalize_minicpm5_output(input) :
input;

const std::string effective_input = params.generation_prompt.empty()
? input
: params.generation_prompt + input;
? normalized_input
: params.generation_prompt + normalized_input;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normalization should occur in a dedicated mapper. Not here.

Comment thread tests/CMakeLists.txt Outdated
endif()

llama_build_and_test(test-chat-peg-parser.cpp peg-parser/simple-tokenize.cpp)
llama_build_and_test(test-chat-peg-parser-minicpm5.cpp)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reiterate, do not deviate from established patterns. Chat template tests are located under test-chat.cpp.

@zhangtao2-1
Copy link
Copy Markdown
Contributor Author

@aldehir Thanks. Understood on aligning with established patterns — I'll refactor accordingly. Also happy to wait for @pwilkin's input on dedicated parser vs. autoparser before going too far in either direction.

@github-actions github-actions Bot added testing Everything test related jinja parser Issues related to the jinja parser labels May 28, 2026
@pwilkin
Copy link
Copy Markdown
Member

pwilkin commented May 28, 2026

Can you add the chat template here? Then we'll be able to just test it with the autoparser and see how much works out of the box and what needs to be possibly fixed.

@zhangtao2-1 zhangtao2-1 requested a review from pwilkin as a code owner May 28, 2026 10:10
@zhangtao2-1
Copy link
Copy Markdown
Contributor Author

@pwilkin @CISC @aldehir
Pushed 5187d06 with review fixes: align with established patterns (inline PEG, common_schema_info, dedicated minicpm5_mapper, GBNF restored, no generic mapper changes), jinja min/max API + tests, MiniCPM5 template fixture for autoparser experiments, tests in test-chat.cpp. test-chat / test-jinja / test-chat-peg-parser pass. PTAL — especially on dedicated parser vs autoparser.

Comment thread common/jinja/value.cpp Outdated
Comment thread common/jinja/value.cpp
Comment thread tests/test-jinja.cpp Outdated
Comment thread tests/test-jinja.cpp Outdated
Comment thread tests/test-jinja.cpp Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jinja parser Issues related to the jinja parser testing Everything test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants