
step_provider Server-Side Auto-Fill from RecipeStep #3377

Merged
Trecek merged 5 commits into develop from step-provider-not-server-side-auto-filled-minimax-bypass-in/3366 on May 31, 2026


Collaborator

@Trecek Trecek commented May 31, 2026

Summary

Fix the step_provider auto-fill gap in tools_execution.py: RecipeStep.provider is parsed from YAML but never read server-side, so the Minimax provider is silently bypassed in agent-eval. The _resolve_provider_profile call at line 374 receives step_provider="" for two reasons: the existing auto-fill block at line 440+ runs 66 lines after that call, and the block does not fill step_provider in any case.

The fix adds an early auto-fill block before _resolve_provider_profile that reads _recipe_step.provider when the caller passes an empty step_provider. This maps to Tier 3 in the resolution cascade, correctly subordinate to all config-level overrides (Tiers 0-2). A secondary fix adds step_provider forwarding instructions to the sous-chef SKILL.md for defense-in-depth.
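The early auto-fill rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from tools_execution.py: the `RecipeStep` dataclass here is a stand-in for the real schema, and `autofill_step_provider` is a hypothetical helper name. The only behavior taken from the PR is that the recipe step's provider is used when the caller passes an empty step_provider, and that a caller-supplied value wins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecipeStep:
    """Stand-in for the real RecipeStep schema (assumption: only these fields)."""
    name: str
    provider: str = ""


def autofill_step_provider(step_provider: str, recipe_step: Optional[RecipeStep]) -> str:
    """Hypothetical helper: fill step_provider from the recipe step when empty.

    A non-empty value passed by the caller is returned unchanged, so the
    recipe step only acts as a fallback.
    """
    if step_provider:
        return step_provider
    if recipe_step is not None and recipe_step.provider:
        return recipe_step.provider
    return ""


# The filled value would then be handed to provider resolution.
step = RecipeStep(name="implement", provider="minimax")
filled = autofill_step_provider("", step)
kept = autofill_step_provider("other", step)
```

Keeping the fallback in one small function makes the precedence easy to test in isolation: an explicit caller value is never overwritten, and a missing recipe step leaves the value empty.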

Audit finding (out of scope): RecipeStep.model has the same auto-fill gap. It is declared on the schema but never resolved server-side. This should be tracked as a separate issue.

Requirements

Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

Architecture Impact

No architecture lens diagrams were generated for this PR.

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260530-183127-514496/.autoskillit/temp/make-plan/step_provider_auto_fill_plan_2026-05-30_183600.md

🤖 Generated with Claude Code via AutoSkillIt

Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|---|---|---|---|---|---|---|---|---|---|
| plan* | opus[1m] | 1 | 982 | 21.7k | 1.8M | 101.7k | 159 | 85.4k | 12m 55s |
| review_approach* | sonnet | 1 | 10.8k | 5.7k | 181.2k | 46.4k | 116 | 33.8k | 6m 57s |
| verify* | sonnet | 1 | 52 | 13.3k | 213.0k | 56.7k | 86 | 40.7k | 6m 5s |
| implement* | sonnet | 1 | 424.1k | 6.5k | 679.9k | 48.6k | 60 | 39.1k | 4m 1s |
| audit_impl* | sonnet | 1 | 280 | 8.0k | 240.4k | 37.0k | 35 | 26.8k | 3m 36s |
| prepare_pr* | sonnet | 1 | 72.0k | 3.2k | 190.1k | 27.6k | 20 | 15.7k | 1m 24s |
| compose_pr* | sonnet | 1 | 76.7k | 2.0k | 245.3k | 27.6k | 23 | 15.5k | 57s |
| review_pr* | sonnet | 2 | 365 | 97.4k | 1.6M | 96.0k | 142 | 202.4k | 20m 11s |
| resolve_review* | sonnet | 2 | 444 | 43.8k | 3.6M | 90.7k | 147 | 141.7k | 21m 45s |
| Total | | | 585.8k | 201.6k | 8.7M | 101.7k | | 601.1k | 1h 17m |

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

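| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|---|---|---|---|---|
| plan | 0 | | | |
| review_approach | 0 | | | |
| verify | 0 | | | |
| implement | 140 | 4856.1 | 279.4 | 46.4 |
| audit_impl | 0 | | | |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| review_pr | 0 | | | |
| resolve_review | 25 | 142665.4 | 5668.4 | 1751.1 |
| Total | 165 | 52758.2 | 3643.1 | 1221.6 |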

Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|---|---|---|---|---|---|---|
| opus[1m] | 1 | 982 | 21.7k | 1.8M | 85.4k | 12m 55s |
| sonnet | 8 | 584.8k | 179.9k | 7.0M | 515.7k | 1h 4m |

@Trecek Trecek force-pushed the step-provider-not-server-side-auto-filled-minimax-bypass-in/3366 branch from 0ae1fa4 to 5ab7caa Compare May 31, 2026 02:29
Trecek and others added 5 commits May 30, 2026 19:55
…_provider_profile

When run_skill is called with step_provider="" and a step_name that maps to a
RecipeStep with provider="minimax", the server now resolves step_provider
server-side BEFORE calling _resolve_provider_profile. This closes the 66-line
ordering gap where the late auto-fill block ran after provider resolution and
never included step_provider anyway.

Defense-in-depth: also updated sous-chef SKILL.md to instruct the LLM to
forward step_provider from recipe step provider: fields to run_skill.

Tests: 3 new tests in test_tools_execution_step_resolution.py covering
auto-fill, LLM-override precedence, and observability logging. 1 new
contract test in test_sous_chef_parameter_forwarding.py.

Closes #3366.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ep_provider test

The spy always returns a hardcoded ('minimax', ...) tuple, so the assertion
executor.calls[0].provider_name == 'minimax' cannot distinguish whether
auto-fill logic ran vs. the spy's hardcoded return.  The meaningful check
captured_kwargs['step_provider'] == 'minimax' is already present and sufficient.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
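
The reasoning in the commit above (a spy with a hardcoded return value cannot prove that auto-fill ran) can be illustrated with a small sketch. All names here are hypothetical and do not come from the test suite; the two `run_*` functions are simplified stand-ins for a code path with and without the auto-fill.

```python
def make_spy(captured: dict):
    """Build a spy resolver that records its kwargs and returns a fixed tuple."""
    def spy_resolve(**kwargs):
        captured.update(kwargs)
        # Hardcoded return: identical no matter what step_provider was passed.
        return ("minimax", {})
    return spy_resolve


def run_without_autofill(resolve, step_provider: str, recipe_provider: str):
    """Stand-in for the buggy path: the recipe provider is ignored."""
    return resolve(step_provider=step_provider)


def run_with_autofill(resolve, step_provider: str, recipe_provider: str):
    """Stand-in for the fixed path: an empty step_provider is filled first."""
    return resolve(step_provider=step_provider or recipe_provider)


buggy_kwargs: dict = {}
fixed_kwargs: dict = {}
buggy_result = run_without_autofill(make_spy(buggy_kwargs), "", "minimax")
fixed_result = run_with_autofill(make_spy(fixed_kwargs), "", "minimax")

# The spy's return value is the same on both paths, so asserting on it
# passes even when auto-fill is missing. Only the captured kwargs differ.
same_return = buggy_result == fixed_result
kwargs_differ = buggy_kwargs["step_provider"] != fixed_kwargs["step_provider"]
```

Under these assumptions, an assertion on the returned provider name passes for both paths, while an assertion on the captured step_provider fails on the buggy path, which is why the latter is the meaningful check.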
step_provider was the only recipe-step parameter whose server-side auto-fill
was gated on is_feature_enabled('providers'). The analogous auto-fills for
output_dir, stale_threshold, and idle_output_timeout live in the unconditional
block at the bottom of run_skill. Move step_provider there for consistency,
reusing the already-fetched _recipe_step variable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ntions_step_provider

Path("src/...").read_text() is cwd-relative and raises FileNotFoundError when
pytest is invoked from a non-root directory. Use Path(__file__).parents[2] to
anchor the path to the test file's location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
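
A minimal sketch of the path-anchoring idea from the commit above. The directory layout and file name are assumptions for illustration (a test file two directories below the repository root); `repo_root` is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path


def repo_root(test_file: str, levels_up: int = 2) -> Path:
    """Return the ancestor directory `levels_up` levels above the file's folder.

    parents[0] is the file's own directory, parents[1] its parent, and so on,
    so parents[2] of <root>/tests/contracts/test_x.py is <root>.
    """
    return Path(test_file).parents[levels_up]


# In a real test module this would be repo_root(__file__); a literal path is
# used here so the example does not depend on where it is run from.
root = repo_root("/repo/tests/contracts/test_x.py")
skill_md = root / "src" / "SKILL.md"  # hypothetical target file
```

Because the path is derived from the test file's own location, the lookup no longer depends on the working directory from which pytest is invoked.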
…rs-gate

The first fix placed the auto-fill in the post-gate unconditional block, but
_resolve_provider_profile is called inside the gate, so step_provider must be
filled before the gate check. Move it to the pre-gate position (after _cfg but
before is_feature_enabled) so it runs unconditionally and the filled value
reaches _resolve_provider_profile on the same execution path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
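
The ordering problem described in the commit above can be sketched as follows. This is a simplified model under stated assumptions: `resolve_provider_profile`, the `providers_enabled` flag, and the `"default"` fallback are hypothetical stand-ins for _resolve_provider_profile, the is_feature_enabled('providers') gate, and whatever the real code returns when no provider is resolved.

```python
def resolve_provider_profile(step_provider: str) -> str:
    """Stand-in resolver: uses step_provider if set, else a default profile."""
    return step_provider or "default"


def run_skill_post_gate(step_provider: str, recipe_provider: str,
                        providers_enabled: bool) -> str:
    """Fill placed after the gate: the resolver has already seen ""."""
    profile = "default"
    if providers_enabled:
        profile = resolve_provider_profile(step_provider)
    # Too late: the filled value never reaches the resolver call above.
    step_provider = step_provider or recipe_provider
    return profile


def run_skill_pre_gate(step_provider: str, recipe_provider: str,
                       providers_enabled: bool) -> str:
    """Fill placed before the gate: the resolver sees the filled value."""
    step_provider = step_provider or recipe_provider
    profile = "default"
    if providers_enabled:
        profile = resolve_provider_profile(step_provider)
    return profile


post = run_skill_post_gate("", "minimax", providers_enabled=True)
pre = run_skill_pre_gate("", "minimax", providers_enabled=True)
```

In this model the post-gate placement still resolves the default profile even though the recipe step names a provider, while the pre-gate placement passes the filled value through on the same execution path.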
@Trecek Trecek force-pushed the step-provider-not-server-side-auto-filled-minimax-bypass-in/3366 branch from 5ab7caa to 77f693a Compare May 31, 2026 02:56
@Trecek Trecek added this pull request to the merge queue May 31, 2026
Merged via the queue into develop with commit fb9dd67 May 31, 2026
3 checks passed
@Trecek Trecek deleted the step-provider-not-server-side-auto-filled-minimax-bypass-in/3366 branch May 31, 2026 03:06