bug: before_all hook failure aborts entire eval run instead of just the affected suite #910

@christso

Problem

When a before_all workspace hook fails for one eval file, the entire eval run aborts, including all other eval files that haven't started yet.

Reproduction

agentv eval run 'examples/**/*.eval.yaml' '!examples/showcase/multi-model-benchmark/**'

If copilot-log-eval's before_all hook (workspace-setup.mjs) fails, no subsequent evals run.

Expected behavior

A before_all failure should:

  1. Mark all tests in that eval file as errors
  2. Continue running the remaining eval files
  3. Report the failure in the summary
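
For illustration, the three points above could be sketched roughly as follows. This is not agentv's actual runner: `Suite`, `TestResult`, `runSuites`, and `summarize` are hypothetical names invented here, and the sketch assumes each eval file can be modeled as a list of test names plus an optional async `beforeAll` hook.

```typescript
// Hypothetical types; none of these names come from the agentv codebase.
type TestResult = {
  suite: string;
  test: string;
  status: "passed" | "error";
  error?: string;
};

interface Suite {
  file: string;
  tests: string[];
  beforeAll?: () => Promise<void>;
  runTest: (test: string) => Promise<void>;
}

function toMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

async function runSuites(suites: Suite[]): Promise<TestResult[]> {
  const results: TestResult[] = [];
  for (const suite of suites) {
    try {
      // The hook runs inside its own try/catch so a failure cannot escape the loop.
      await suite.beforeAll?.();
    } catch (err) {
      // 1. Mark every test in this eval file as an error.
      for (const test of suite.tests) {
        results.push({
          suite: suite.file,
          test,
          status: "error",
          error: `before_all failed: ${toMessage(err)}`,
        });
      }
      // 2. Continue with the remaining eval files.
      continue;
    }
    for (const test of suite.tests) {
      try {
        await suite.runTest(test);
        results.push({ suite: suite.file, test, status: "passed" });
      } catch (err) {
        results.push({ suite: suite.file, test, status: "error", error: toMessage(err) });
      }
    }
  }
  return results;
}

// 3. Report the failure in the summary instead of exiting early.
function summarize(results: TestResult[]): string {
  const errors = results.filter((r) => r.status === "error").length;
  return `${results.length - errors} passed, ${errors} errored`;
}
```

The key design choice in this sketch is that the hook failure is converted into per-test error results for that file only, so the exit status can still be decided once from the full result set at the end of the run.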

Actual behavior

Error: before_all script failed: Script failed: Process exited with code 1

The process exits immediately. No other evals run.

Workaround

Exclude the failing eval from glob patterns using negation:

agentv eval run 'examples/**/*.eval.yaml' '!examples/features/copilot-log-eval/**'

Labels: bug