fix(tps): correct off-by-one in decode token count for generation TPS by paul90317 · Pull Request #23828 · ggml-org/llama.cpp

paul90317 · 2026-05-28T16:19:28Z

Overview

Generation TPS was computed using an extra decode token in the token count, while the decode time measurement did not include this extra step. This caused an inflated TPS value due to mismatched token/time accounting.

This change fixes the off-by-one issue in decode token counting to ensure consistent alignment between decode tokens and decode duration.
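
The mismatch described above can be illustrated with a minimal sketch. The helper `tokens_per_second` and the numbers used with it are hypothetical and are not taken from the llama.cpp source. The sketch only assumes that the timed decode window covers one fewer step than the number of tokens that was being counted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not the llama.cpp API.
// Converts a token count and an elapsed time in milliseconds into tokens/s.
double tokens_per_second(int64_t n_tokens, double elapsed_ms) {
    assert(elapsed_ms > 0.0); // a zero-length window has no meaningful rate
    return 1e3 * static_cast<double>(n_tokens) / elapsed_ms;
}
```

With made-up numbers, suppose 101 tokens are counted but the 2000 ms decode timer only covers 100 decode steps. `tokens_per_second(101, 2000.0)` gives 50.5, while the aligned count `tokens_per_second(100, 2000.0)` gives 50.0. The relative error is 1/N, so it is most visible for short generations.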

Additional information

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: yes (used for commit message drafting and wording)


paul90317 · 2026-05-28T16:42:02Z

A potential issue may be in the stop condition that uses slot.n_decoded:

// check the limits
if (slot.n_decoded > 0 && slot.has_next_token && !slot.has_budget(params_base)) {
    slot.stop = STOP_TYPE_LIMIT;
    slot.has_next_token = false;
}

This may cause the model to output one more token than the configured limit.

I will investigate a better solution.
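
To make the concern concrete, here is a toy model of the limit check. It is a sketch under stated assumptions: `toy_slot`, its simplified `has_budget()`, and the `count_lag` parameter are hypothetical stand-ins, not the server's actual definitions. It only shows that if the counter used by the check lags the number of emitted tokens by one, the loop stops one token late.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for a generation slot.
struct toy_slot {
    int n_decoded = 0; // tokens counted so far
    int n_predict = 0; // configured generation limit
    // Assumed budget rule: budget remains while the count is below the limit.
    bool has_budget() const { return n_decoded < n_predict; }
};

// Emits tokens until the limit check fires and returns how many were emitted.
// count_lag = 0: the counter matches the emitted tokens.
// count_lag = 1: the counter is one behind the emitted tokens.
int emitted_tokens(int n_predict, int count_lag) {
    toy_slot slot;
    slot.n_predict = n_predict;
    int  emitted        = 0;
    bool has_next_token = true;
    while (has_next_token) {
        emitted++;                           // one token is produced
        slot.n_decoded = emitted - count_lag; // counter as seen by the check
        // same shape as the check quoted above
        if (slot.n_decoded > 0 && has_next_token && !slot.has_budget()) {
            has_next_token = false;
        }
    }
    return emitted;
}
```

In this model, `emitted_tokens(4, 0)` returns 4 and `emitted_tokens(4, 1)` returns 5, which is the "one more token than the configured limit" behaviour described above.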

paul90317 requested a review from a team as a code owner May 28, 2026 16:19

github-actions Bot added examples server labels May 28, 2026


paul90317 wants to merge 1 commit into ggml-org:master from paul90317:master
