Add minicpm5 tool call parser by zhangtao2-1 · Pull Request #23802 · ggml-org/llama.cpp

zhangtao2-1 · 2026-05-28T07:34:57Z

Overview

Add MiniCPM5 tool call support for llama-server.

MiniCPM5 outputs tool calls as XML (<function name="..."><param name="...">...</param></function>), not JSON. This PR adds a peg-minicpm5 parser to detect the template, parse tool calls (including streaming), normalize common output quirks, and expose OpenAI-compatible tool_calls.

Also fixes a peg mapper streaming bug (use tool index instead of dangling pointer) and adds Jinja min/max filters needed by the MiniCPM5 template.

Test plan

test-chat-peg-parser-minicpm5
test-jinja (min / max)
Manual test with MiniCPM5 GGUF + tools on llama-server

zhangtao2-1 · 2026-05-28T07:37:09Z

@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

CISC · 2026-05-28T07:49:03Z

@CISC would you mind taking a look at this PR? MiniCPM5 tool call parsing — would appreciate your review.

I'll review the jinja part, leaving the parser to @ggml-org/llama-common

aldehir

Thanks for the PR. The code present deviates too much from what is already established. I don't really see anything that would necessitate the need for these deviations.

I'll let @pwilkin weigh in if he thinks a dedicated parser is necessary or if this can be handled by the autoparser with some tweaks.

aldehir · 2026-05-28T08:10:14Z

 std::string & common_chat_peg_mapper::args_target() {
-    return (current_tool && !current_tool->name.empty()) ? current_tool->arguments : args_buffer;
+    common_chat_tool_call * tool = active_tool();
+    return (tool && !tool->name.empty()) ? tool->arguments : args_buffer;
+}
+
+common_chat_tool_call * common_chat_peg_mapper::active_tool() {
+    if (committed_tool_idx.has_value()) {
+        return &result.tool_calls.at(committed_tool_idx.value());
+    }
+    if (pending_tool_call.has_value()) {
+        return &pending_tool_call.value();
+    }
+    return nullptr;


If it's not broke, don't fix it.

aldehir · 2026-05-28T08:12:10Z

    }
 }
+
+common_peg_parser common_chat_peg_builder::minicpm5_xml_tool_calls(const ordered_json & tools,


Inline in the chat param init for the model template.

aldehir · 2026-05-28T08:13:26Z

+            auto arg_choice = choice();
+            for (const auto & el : params.at("properties").items()) {
+                const std::string & prop_name = el.key();
+                const std::string & prop_type = el.value().value("type", "string");


Please use the established patterns in the repo for determining if a type is a string. See common_schema_info.

aldehir · 2026-05-28T08:16:34Z

+    auto replace_all = [](std::string & s, const std::string & from, const std::string & to) {
+        if (from.empty()) {
+            return;
+        }
+        for (size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos;) {
+            s.replace(pos, from.size(), to);
+            pos += to.size();
+        }
+    };
+
+    replace_all(out, "<functionname=", "<function name=");
+    replace_all(out, "<paramname=", "<param name=");
+
+    // Some GGUF outputs drop opening tag names but keep attributes, e.g.
+    // ` name="python"> name="code">...` instead of `<function name="python">...`.
+    static const std::regex LEADING_FUNC_ATTR(R"((?:^|[\n\r])\s*name=\"([^\"]+)\">)");
+    out = std::regex_replace(out, LEADING_FUNC_ATTR, "\n<function name=\"$1\">");
+    static const std::regex PARAM_ATTR(R"(>\s*name=\"([^\"]+)\">)");
+    out = std::regex_replace(out, PARAM_ATTR, "><param name=\"$1\">");
+
+    static const std::string IM_END = "<|im_end|>";
+    if (out.size() >= IM_END.size() &&
+        out.compare(out.size() - IM_END.size(), IM_END.size(), IM_END) == 0) {
+        out.erase(out.size() - IM_END.size());
+        while (!out.empty() && (out.back() == '\n' || out.back() == ' ')) {
+            out.pop_back();
+        }
+    }
+
+    return out;


All of this should be defined in the parser. None of this looks like it needs to be done post process.

aldehir · 2026-05-28T08:17:19Z

+    static const std::string SP_SPACE = "\xC4\xA0"; // U+0120
+    static const std::string SP_NL    = "\xC1\x8A"; // U+010A
+
+    for (size_t pos = 0; (pos = out.find(SP_SPACE, pos)) != std::string::npos;) {
+        out.replace(pos, SP_SPACE.size(), " ");
+        pos += 1;
+    }
+    for (size_t pos = 0; (pos = out.find(SP_NL, pos)) != std::string::npos;) {
+        out.replace(pos, SP_NL.size(), "\n");
+        pos += 1;
+    }


This seems easy enough to do in a single pass.

aldehir · 2026-05-28T08:22:47Z

+            reasoning = p.reasoning(p.until_one_of(TOOL_START_MARKERS)) + p.optional(p.literal("\n")) +
+                        p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));


Too many optionals. This can be a choice with two branches: one that completes a thought with the end thinking tag and one that preempts thinking with the start of a tool call. Use p.space() to consume whitespace.

aldehir · 2026-05-28T08:23:06Z

+                        p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));
+        }
+
+        auto suffix = p.optional(p.literal("<|im_end|>") + p.optional(p.literal("\n")));


Should not be needed.

aldehir · 2026-05-28T08:25:27Z

+    // MiniCPM5 tool calls are parsed post-hoc via peg-minicpm5 (XML output).
+    // Do not attach JSON-schema GBNF here — it is invalid for this format and
+    // can destabilize llama-server. SGLang/vLLM use parsers only for MiniCPM5.
+    data.grammar.clear();
+    data.grammar_lazy     = false;
+    data.grammar_triggers = {};
+    (void) include_grammar;


Even if you don't use the JSON schema to grammar implementation, the PEG parser will compose GBNF rules to enforce tool call structure. Please don't deviate from established patterns.

aldehir · 2026-05-28T08:26:22Z

+    const std::string normalized_input = params.format == COMMON_CHAT_FORMAT_PEG_MINICPM5 ?
+        common_chat_normalize_minicpm5_output(input) :
+        input;
+
    const std::string effective_input = params.generation_prompt.empty()
-        ? input
-        : params.generation_prompt + input;
+        ? normalized_input
+        : params.generation_prompt + normalized_input;


Normalization should occur in a dedicated mapper. Not here.

aldehir · 2026-05-28T08:27:24Z

 endif()

 llama_build_and_test(test-chat-peg-parser.cpp peg-parser/simple-tokenize.cpp)
+llama_build_and_test(test-chat-peg-parser-minicpm5.cpp)


To reiterate, do not deviate from established patterns. Chat template tests are located under test-chat.cpp.

zhangtao2-1 · 2026-05-28T08:41:50Z

@aldehir Thanks. Understood on aligning with established patterns — I'll refactor accordingly. Also happy to wait for @pwilkin's input on dedicated parser vs. autoparser before going too far in either direction.

pwilkin · 2026-05-28T09:27:25Z

Can you add the chat template here? Then we'll be able to just test it with the autoparser and see how much works out of the box and what needs to be possibly fixed.

zhangtao2-1 · 2026-05-28T10:16:35Z

@pwilkin @CISC @aldehir
Pushed 5187d06 with review fixes: align with established patterns (inline PEG, common_schema_info, dedicated minicpm5_mapper, GBNF restored, no generic mapper changes), jinja min/max API + tests, MiniCPM5 template fixture for autoparser experiments, tests in test-chat.cpp. test-chat / test-jinja / test-chat-peg-parser pass. PTAL — especially on dedicated parser vs autoparser.

Add minicpm5 tool call parser

f50d698

zhangtao2-1 requested review from a team, CISC and ggerganov as code owners May 28, 2026 07:34

CISC reviewed May 28, 2026

View reviewed changes

Comment thread common/jinja/value.cpp

Comment thread tests/test-jinja.cpp

zhangtao2-1 mentioned this pull request May 28, 2026

Eval bug: Toolcalling not working for MiniCPM5-1B #23781

Open

aldehir requested changes May 28, 2026

View reviewed changes

github-actions Bot added testing Everything test related jinja parser Issues related to the jinja parser labels May 28, 2026

Refactor MiniCPM5 PEG parser per review feedback

5187d06

zhangtao2-1 requested a review from pwilkin as a code owner May 28, 2026 10:10

CISC reviewed May 28, 2026

View reviewed changes

Comment thread common/jinja/value.cpp Outdated

Comment thread common/jinja/value.cpp

Comment thread tests/test-jinja.cpp Outdated

Comment thread tests/test-jinja.cpp Outdated

Comment thread tests/test-jinja.cpp Outdated

Fix jinja min/max API to match Jinja2

4188894

		reasoning = p.reasoning(p.until_one_of(TOOL_START_MARKERS)) + p.optional(p.literal("\n")) +
		p.optional(p.literal(THINK_END) + p.optional(p.literal("\n\n")));

Conversation

zhangtao2-1 commented May 28, 2026

Overview

Test plan

Uh oh!

zhangtao2-1 commented May 28, 2026

Uh oh!

Uh oh!

Uh oh!

CISC commented May 28, 2026

Uh oh!

aldehir left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aldehir May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhangtao2-1 commented May 28, 2026

Uh oh!

pwilkin commented May 28, 2026

Uh oh!

zhangtao2-1 commented May 28, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aldehir left a comment •

edited

Loading

aldehir May 28, 2026 •

edited

Loading