Skip to content

ENTR-18: Docker toolchain, real CI gate on make test, pre-commit hooks#2

Merged
David Shukhin (davidshukhin) merged 7 commits into
mainfrom
dshukhin/entr-18-devops-pre-commit-hooks-dockerfile-ci-matrix-changelog-gate
Jun 11, 2026
Merged

ENTR-18: Docker toolchain, real CI gate on make test, pre-commit hooks#2
David Shukhin (davidshukhin) merged 7 commits into
mainfrom
dshukhin/entr-18-devops-pre-commit-hooks-dockerfile-ci-matrix-changelog-gate

Conversation

@davidshukhin

Copy link
Copy Markdown
Member

Summary

Implements the core of ENTR-18: Docker environment, CI gating on make test, and minimal pre-commit — plus the test-suite repairs required to make that gate real. The old CI skipped the ARM and FPGA jobs with || true/echo, and the Python test scripts always exited 0, so no test had ever actually gated anything. Getting to a genuinely green make test surfaced real bugs, several of them hardware-relevant:

DevOps (the issue's core scope)

  • Dockerfile (python:3.11-slim-trixie + pinned g++-13 + CPU torch): deps-only image, repo bind-mounted, so it rebuilds only when Dockerfile/requirements.txt change. make docker-build / make docker-test.
  • CI: single test job — buildx with GHA layer cache, then make test in the container. This is the required status check for branch protection (ENO-11); the name is intentionally short and stable. Separate lint job runs flake8 + clang-format.
  • Pre-commit: flake8 + clang-format (both pinned to match CI), wired into make lint, one setup line in CONTRIBUTING.md. One-time clang-format pass over arm/ fpga/ test/ in its own commit.
  • Out of scope per the issue: 3.10/3.12 matrix and changelog gate (left as starter tasks).

Bugs found by un-skipping the tests

  • fpga/hls_compat.h never compiled under g++ — rebuilt the ap_* sim types around implicit integral conversions.
  • spatial_hash.h: ap_int<2> loop indices made the 3×3 neighborhood sweep an infinite loop; linear probing stored no keys, so k-NN lookups could never find a populated bucket (always 0 neighbors).
  • Counter widths: 4096-capacity buffers counted with ap_uint<12>, wrapping to 0 at exactly full — new event_cnt_t (13-bit) used for all counts.
  • pwm_output.h: pwm_tick_t was 16-bit but pulse widths reach 200,000 ticks — every motor output truncated mod 65536, including the disarmed safe-minimum. Now 18-bit.
  • aer_interface.h: handshake state register was a non-static uninitialized local — UB every call.
  • arm/collision_predictor.h: TTC formula multiplied Lee's τ = θ/θ̇ by fov/θ, cancelling object size — TTC came out ~30× too large and urgency was permanently 0 (no looming object could ever trigger evasion). Now τ = θ/θ̇ as documented. ⚠️ Flagging for ENTR-15: this changes flight-relevant behavior; the SITL parameter sweeps should validate the thresholds against the corrected TTC scale.
  • arm/Makefile: sim target compiled objects with the ARM cross-compiler but linked with native g++.
  • Python test runners now sys.exit(1) on failure (previously always 0 → CI could never catch a regression); RNG seeded (unseeded data made runs flaky).

Known limitation (intentionally visible)

models/models/FPGA.pth (weight truncation, not a retrain) predicts near-constant/NaN flow regardless of input, so the two model-quality assertions in test_fpga_simulator.py are XFAIL with an explicit degeneracy probe — they re-arm automatically once the ENTR-16 d=128 retrain lands. Everything structural still gates.

Verification

  • make docker-test green locally: Python 5/5, golden model clean, ARM 23/23, FPGA testbench 17/17 assertions
  • make lint exit 0 in the container; pre-commit run --all-files passes
  • Old workflow's convert_weights_to_fpga.py --validate-only step dropped (depends on untracked UNION.pth)

🤖 Generated with Claude Code

The ARM and FPGA test targets had never actually run (CI skipped them
with || true / echo). Getting `make test` genuinely green surfaced and
fixes:

- hls_compat.h: implicit integral conversions for ap_uint/ap_int
  (builtin operators replace ambiguous member overloads), runtime
  bit-slice width, missing converting ctors, explicit double conversion
  on fixed-point types
- spatial_hash.h: 3x3 sweep used ap_int<2> indices ([-2,1]) so dx++
  past 1 never terminated; linear probing stored no keys so lookups
  could never find a bucket — grid_cell_t now carries its owning key
- counters: MAX_EVENTS/RING_BUFFER_SIZE = 4096 needs 13 bits; all
  ap_uint<12> counts wrapped to 0 at exactly full capacity
  (new event_cnt_t typedef)
- pwm_output.h: pwm_tick_t was 16-bit but PWM_MAX_TICKS = 200,000;
  every pulse width (incl. disarmed minimum) truncated mod 65536
- aer_interface.h: handshake state register was a non-static
  uninitialized local
- collision_predictor.h: TTC formula multiplied by fov/theta, which
  cancelled object size out of Lee's tau entirely (TTC ~30x too large,
  urgency permanently 0) — now tau = theta/theta_dot as documented
- encoder_systolic.h: ~20MB of locals overflowed the C-sim stack;
  now static (HLS maps them to BRAM either way)
- arm/Makefile: sim target overrode CXXFLAGS but not CXX, mixing the
  ARM cross-compiler with the native linker
- Python tests: runners now exit nonzero on failure (they always
  exited 0, so CI could never catch a regression); FP32 tolerance on
  the complex-unrolling check; seeded RNG (unseeded data made CI
  flaky); model-quality assertions XFAIL with an explicit degeneracy
  probe until the d=128 retrain (ENTR-16) replaces the truncated
  FPGA.pth weights
- testbench: motor checks normalize ticks back to thrust; AER feed
  emulates the 4-phase handshake; cluster-separation test checks
  cross-cluster edges; untrack built binaries

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
python:3.11-slim-trixie with g++-13 pinned (gcc 14 rejects the
hls_compat.h template bodies) and CPU-only torch. The image carries
dependencies only; `make docker-test` bind-mounts the repo, so it
rebuilds only when Dockerfile/requirements.txt change. C++ test
targets now build with -O2 (the 8400-step pipeline subtests are far
too slow at -O0).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Mechanical reformat with the new .clang-format (Google base, 4-space
indent, 100 cols, include sorting off — HLS headers are
order-dependent). No logic changes; ARM and FPGA suites verified
green after the pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
.flake8 is the single config for make lint, pre-commit, and CI
(error-only gate: E9,F63,F7,F82). make lint now also runs
clang-format --dry-run --Werror over the C++ tree. One-line setup in
CONTRIBUTING.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Replaces the four old jobs (3.10/3.11 matrix, ARM and FPGA jobs that
skipped with '|| true', build-check) with two:

- test: buildx + GHA layer cache, then 'make test' inside the image —
  Python pipeline + golden model + ARM C++ + FPGA testbench all gate
  for real now. This job is the required status check for branch
  protection (ENO-11) — keep the name stable.
- lint: flake8 + clang-format via 'make lint'.

Dropped from the old gate: convert_weights_to_fpga.py --validate-only
(depends on the untracked UNION.pth weights).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root-run containers leave root-owned __pycache__ and test binaries in
the bind-mounted tree, which then break native runs and git clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@davidshukhin David Shukhin (davidshukhin) merged commit 2234e9a into main Jun 11, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant