[wip] goal shift#23858
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 069206b414
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let objective_changed = previous_goal | ||
| .as_ref() | ||
| .is_some_and(|previous_goal| previous_goal.objective != goal.objective); |
There was a problem hiding this comment.
Inject steering when external creates a goal
When an external goal is created during an active turn, callers pass previous_goal = None, so objective_changed becomes false and the code skips the steering injection below even though it immediately marks the current turn as pursuing that new goal. In that scenario the model continues the turn without ever seeing the user-created objective, while subsequent token/time accounting is attributed to it; treat a missing previous goal as needing an objective steering item when goal.status is active.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
I rechecked this on head 61d766901ce68499ab3bbc200787948df38679f7; this is still present, and the missing-steering case is broader than new-goal creation. The same objective_changed gate also skips steering when an external mutation transitions an existing paused/blocked/complete goal back to Active without changing the objective. In both cases apply_external_goal_set starts accounting the current turn/idle wall clock against the goal immediately, but the model never receives the active objective. Please key the steering decision off the goal becoming active without in-turn objective context, not only previous.objective != goal.objective, and add coverage for previous_goal == None plus status-transition-to-Active cases.
usgenes39-dotcom
left a comment
There was a problem hiding this comment.
I found blocking issues in the goal steering/accounting paths.
The existing inline thread on runtime.rs is still valid on this head; I added a reply because the same predicate also misses external status transitions back to Active, not just first creation. That path starts charging token/time progress to the goal without giving the model the objective, so the accounting and the actual work can diverge.
Separately, the budget-limit steering de-dupe now resets when the same goal is marked active again, which can repeatedly inject the same budget-limited stop message across turns and spend more tokens after the goal is already over budget. CI is also red on this head: rust-ci fails argument-comment lint on the new token_usage(...) test calls, and Bazel still has failing Windows/macOS test jobs.
| if inner.budget_limit_reported_goal_id.as_deref() != Some(goal_id.as_str()) { | ||
| inner.budget_limit_reported_goal_id = None; | ||
| } | ||
| inner.budget_limit_reported_goal_id = None; |
There was a problem hiding this comment.
This unconditional reset breaks the same-goal budget-limit de-dupe. Once a goal reaches BudgetLimited, mark_budget_limit_reported_if_new records that the steering item was already injected. A later on_turn_start or external active re-mark for the same goal calls this method again and clears the recorded id, so the next counted tool finish can inject the identical budget-limit steering for the same already-limited goal. That is a product/economic regression: a stopped-over-budget goal can keep spending model tokens on repeated stop steering and repeated goal-progress accounting. The old conditional preserved the report id when the active goal id did not change; restore that behavior or track a per-goal budget-limit epoch, and add a cross-turn same-goal BudgetLimited test because the current tests only cover same-turn de-duping.
No description provided.