Test suite is not xdist-safe at high worker counts (shared relative pybnf_output/ dir races) #421

@wshlavacek

Summary

Several test classes write to (and rmtree) the shared, cwd-relative default pybnf_output/ output directory. Under pytest -n with enough workers, these collide: one class's teardown rmtree('pybnf_output') runs while another class is mid-fit writing into pybnf_output/..., producing nondeterministic FileNotFoundErrors.

Repro

PYBNF_NO_BNGSIM=1 uv run --no-sync pytest -m "not slow" -n auto

On a 6-core machine (6 workers) this flakes; example failures seen:

FAILED tests/test_seed_determinism.py::TestChainRngSpawn::test_chain_rngs_reseed_on_bootstrap
FAILED tests/test_simplex.py::TestSimplex::test_updates
ERROR  tests/test_seed_determinism.py::TestSeedReproducibility::test_mh_seed_reproducibility
  FileNotFoundError: '.../pybnf_output/Initialize/parabola_gen_net.net'

The same command on GitHub's hosted runners (4 workers) is reliably green — it's worker-count-sensitive, not a product bug.

Root cause

  • tests/test_simplex.py: configs default to output_dir = pybnf_output; teardown_class does rmtree('pybnf_output').
  • tests/test_job_class.py: rmtree('pybnf_output', ...) in teardown.
  • tests/test_seed_determinism.py: some configs hit the default pybnf_output/Initialize/ (network-gen) path.

All three share the same relative dir in a shared cwd, so parallel workers race.
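
For illustration, the interleaving can be replayed deterministically in a single process. This is only a sketch of the failure mode, not code from the test suite: the function name and the file name model.net are hypothetical, and the real race happens across xdist worker processes rather than sequential statements.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_shared_dir_race(cwd: Path) -> str:
    """Replay the bad interleaving: class B tears down while class A is mid-fit."""
    shared = cwd / "pybnf_output"  # the same default dir for both classes

    # Class A starts a fit and creates its working subdirectory.
    (shared / "Initialize").mkdir(parents=True)

    # Class B finishes first; its teardown removes the whole shared directory.
    shutil.rmtree(shared)

    # Class A now writes into the directory it created a moment ago.
    try:
        (shared / "Initialize" / "model.net").write_text("placeholder")
    except FileNotFoundError:
        return "FileNotFoundError"
    return "ok"


# Run inside a throwaway directory so nothing in the real cwd is touched.
with tempfile.TemporaryDirectory() as tmp:
    outcome = simulate_shared_dir_race(Path(tmp))
print(outcome)
```

With more workers, more classes run concurrently, so the window in which one teardown overlaps another class's fit is hit more often.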

Proper fix

Give each test class a unique output dir (e.g. pybnf_output_simplex, pybnf_output_jobclass, …) — or a tmp_path-based one — and update the teardown rmtree + any relative-path assertions accordingly. After that, the local pre-push hook can mirror CI at full -n auto instead of being capped.
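
A minimal sketch of the per-class variant, assuming the tests use pytest-style setup_class/teardown_class hooks. OutputDirMixin and TestSimplexSketch are hypothetical names, and how output_dir gets wired into each test's config is not shown here because it depends on the existing test helpers.

```python
import shutil
import tempfile
from pathlib import Path


class OutputDirMixin:
    """Hypothetical mixin: each test class gets its own output directory."""

    @classmethod
    def setup_class(cls):
        # mkdtemp creates a fresh, uniquely named directory, so two classes
        # (or two xdist workers running the same class) never share a path.
        cls.output_dir = Path(
            tempfile.mkdtemp(prefix=f"pybnf_output_{cls.__name__}_")
        )

    @classmethod
    def teardown_class(cls):
        # Removes only this class's directory, never a sibling's.
        shutil.rmtree(cls.output_dir, ignore_errors=True)


class TestSimplexSketch(OutputDirMixin):
    def test_writes_into_own_dir(self):
        # Stand-in for a fit that writes under <output_dir>/Initialize/.
        target = self.output_dir / "Initialize" / "model.net"
        target.parent.mkdir(parents=True)
        target.write_text("placeholder")
        assert target.exists()
```

pytest's built-in tmp_path and tmp_path_factory fixtures would serve the same purpose. Either way the directory is an absolute path outside the cwd, so assertions that currently expect a relative pybnf_output/... path would need to be updated.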

Interim mitigation (shipped)

The pre-push pytest-ci-mirror hook is capped at -n 4 (CI's worker count), keeping it in CI's empirically-green regime rather than local -n auto. See .pre-commit-config.yaml.
