
fix: evals CLI link and preserve config#1755

Open
miguelg719 wants to merge 3 commits into main from miguelgonzalez/stg-1456-fix-evals-cli

Conversation

@miguelg719
Collaborator

@miguelg719 miguelg719 commented Feb 25, 2026

why

After the build migration, pnpm build:cli no longer linked the CLI or preserved overridden configs.

what changed

  • Added bin field in package.json to enable npm linking
  • Implemented smart config merging in the build script that updates tasks/benchmarks from source while preserving user-customized defaults
  • Added auto-linking via npm link --force at the end of the build, with a graceful fallback, so the CLI is linked whenever users run pnpm build:cli
  • Set serverCache: false in initV3 for consistent eval behavior on the API
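
A minimal sketch of what "auto-linking with graceful fallback" could look like, not the actual contents of build-cli.ts. The helper name `linkCli` and its injectable `run` parameter are hypothetical, introduced here so the failure path can be exercised without touching the global npm prefix; only the `npm link --force` command itself comes from the description above.

```typescript
import { execSync } from "node:child_process";

// Hypothetical helper (not from the PR): attempts the link step and
// reports success as a boolean instead of letting the build fail.
function linkCli(
  cwd: string,
  // Default runner shells out to npm; tests can inject a stub instead.
  run: (dir: string) => void = (dir) => {
    execSync("npm link --force", { cwd: dir, stdio: "inherit" });
  },
): boolean {
  try {
    run(cwd);
    return true;
  } catch (err) {
    // Graceful fallback: warn and continue, since the build output itself
    // is still usable without a global link.
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`npm link failed (non-fatal): ${reason}`);
    return false;
  }
}
```

Returning a boolean rather than rethrowing keeps the link step optional: a permissions error on the global prefix produces a warning, and the rest of the build result is unaffected.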

test plan

@changeset-bot

changeset-bot bot commented Feb 25, 2026

⚠️ No Changeset found

Latest commit: 5493ad7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@miguelg719 miguelg719 changed the title fix: evals cli liking and preserving config fix: evals CLI liking and preserving config Feb 25, 2026
@miguelg719 miguelg719 marked this pull request as ready for review February 25, 2026 09:06
@miguelg719 miguelg719 changed the title fix: evals CLI liking and preserving config fix: evals CLI link and preserve config Feb 25, 2026
@greptile-apps
Contributor

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Fixed the evals CLI build process to properly link the binary and preserve user configuration defaults after the build migration. The PR makes four changes:

  • Added bin field in package.json to enable npm linking
  • Implemented smart config merging in the build script that updates tasks/benchmarks from source while preserving user-customized defaults
  • Added auto-linking via npm link --force at the end of the build process with graceful fallback
  • Set serverCache: false in initV3 for consistent eval behavior

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are well-scoped fixes to the build process with proper error handling and graceful degradation. The config merging logic correctly preserves user defaults while updating tasks/benchmarks. The addition of serverCache: false aligns with eval requirements.
  • No files require special attention

Important Files Changed

Filename | Overview
packages/evals/scripts/build-cli.ts | Added config merging logic to preserve user defaults and auto-linking via npm link
packages/evals/package.json | Added bin field pointing to ./dist/cli/cli.js to enable npm linking
packages/evals/initV3.ts | Added serverCache: false to V3 options for evals consistency

Last reviewed commit: 107f640

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, no comments


Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 3 files

Confidence score: 3/5

  • Some regression risk due to a medium-severity config handling change that can silently drop defaults for existing users.
  • packages/evals/scripts/build-cli.ts replaces the entire defaults object, which may ignore newly added default keys in source config after prior builds.
  • Pay close attention to packages/evals/scripts/build-cli.ts - ensure defaults are merged rather than replaced to avoid losing new default entries.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/evals/scripts/build-cli.ts">

<violation number="1" location="packages/evals/scripts/build-cli.ts:48">
P2: Replacing the entire `defaults` object instead of merging it will silently drop any new default keys added to the source config for users who have built before. Use a shallow merge so user overrides are preserved while new source defaults are still picked up.</violation>
</file>
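
The shallow merge suggested in the violation above could be sketched roughly as follows. This is an illustration under assumptions, not the code in build-cli.ts: the `EvalsConfig` type, the `mergeEvalsConfig` name, and the `env` and `trials` keys in the usage example are hypothetical; only the `defaults`, `tasks`, and `benchmarks` field names come from this PR.

```typescript
// Hypothetical config shape; the real evals.config.json may differ.
type EvalsConfig = {
  defaults?: Record<string, unknown>;
  tasks?: unknown[];
  benchmarks?: Record<string, unknown>;
};

// Tasks and benchmarks always come from the source config.
// Defaults are shallow-merged: source keys first, then the user's
// existing dist keys on top, so overrides win but new keys still appear.
function mergeEvalsConfig(
  source: EvalsConfig,
  existing?: EvalsConfig,
): EvalsConfig {
  return {
    ...source,
    defaults: { ...source.defaults, ...existing?.defaults },
  };
}

// Usage: the user previously overrode `trials`; the source later added `env`.
const merged = mergeEvalsConfig(
  { defaults: { env: "local", trials: 3 }, tasks: ["a", "b"] },
  { defaults: { trials: 10 }, tasks: ["stale"] },
);
console.log(JSON.stringify(merged.defaults));
```

Because the merge is shallow, a nested object under `defaults` would still be replaced wholesale by the user's copy; a deep merge would be needed if nested defaults are expected to gain new keys.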
Architecture diagram
sequenceDiagram
    participant Dev as Developer / User
    participant Build as build-cli.ts
    participant FS as Local File System
    participant NPM as npm Registry (Global)
    participant CLI as evals CLI (dist)
    participant Engine as Stagehand Engine

    Note over Dev, NPM: Build & Link Phase (pnpm build:cli)

    Dev->>Build: Execute build script
    Build->>FS: Read packages/evals/evals.config.json
    FS-->>Build: Source config data
    
    opt NEW: If existing config exists in dist/
        Build->>FS: Read dist/cli/evals.config.json
        FS-->>Build: Existing dist config
        Build->>Build: NEW: Preserve 'defaults' from dist config
    end

    Build->>Build: Merge tasks/benchmarks from source
    Build->>FS: NEW: Write merged config to dist/
    Build->>FS: Set executable permissions (chmod 755)
    
    Build->>NPM: NEW: npm link --force
    alt Link Success
        NPM-->>Build: Linked 'evals' binary globally
    else Link Failure
        Build-->>Dev: Console Warning (non-fatal)
    end

    Note over Dev, Engine: Execution Phase

    Dev->>CLI: Run 'evals' command
    CLI->>Engine: initV3() initialization
    Engine->>Engine: CHANGED: Set serverCache: false
    Engine->>Engine: Initialize evaluation suite
    Engine-->>Dev: Evaluation results


miguelg719 and others added 2 commits February 25, 2026 01:14
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>