ENTR-18: Docker toolchain, real CI gate on make test, pre-commit hooks#2
Merged
David Shukhin (davidshukhin) merged 7 commits intoJun 11, 2026
Conversation
The ARM and FPGA test targets had never actually run (CI skipped them with || true / echo). Getting `make test` genuinely green surfaced and fixes: - hls_compat.h: implicit integral conversions for ap_uint/ap_int (builtin operators replace ambiguous member overloads), runtime bit-slice width, missing converting ctors, explicit double conversion on fixed-point types - spatial_hash.h: 3x3 sweep used ap_int<2> indices ([-2,1]) so dx++ past 1 never terminated; linear probing stored no keys so lookups could never find a bucket — grid_cell_t now carries its owning key - counters: MAX_EVENTS/RING_BUFFER_SIZE = 4096 needs 13 bits; all ap_uint<12> counts wrapped to 0 at exactly full capacity (new event_cnt_t typedef) - pwm_output.h: pwm_tick_t was 16-bit but PWM_MAX_TICKS = 200,000; every pulse width (incl. disarmed minimum) truncated mod 65536 - aer_interface.h: handshake state register was a non-static uninitialized local - collision_predictor.h: TTC formula multiplied by fov/theta, which cancelled object size out of Lee's tau entirely (TTC ~30x too large, urgency permanently 0) — now tau = theta/theta_dot as documented - encoder_systolic.h: ~20MB of locals overflowed the C-sim stack; now static (HLS maps them to BRAM either way) - arm/Makefile: sim target overrode CXXFLAGS but not CXX, mixing the ARM cross-compiler with the native linker - Python tests: runners now exit nonzero on failure (they always exited 0, so CI could never catch a regression); FP32 tolerance on the complex-unrolling check; seeded RNG (unseeded data made CI flaky); model-quality assertions XFAIL with an explicit degeneracy probe until the d=128 retrain (ENTR-16) replaces the truncated FPGA.pth weights - testbench: motor checks normalize ticks back to thrust; AER feed emulates the 4-phase handshake; cluster-separation test checks cross-cluster edges; untrack built binaries Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
python:3.11-slim-trixie with g++-13 pinned (gcc 14 rejects the hls_compat.h template bodies) and CPU-only torch. The image carries dependencies only; `make docker-test` bind-mounts the repo, so it rebuilds only when Dockerfile/requirements.txt change. C++ test targets now build with -O2 (the 8400-step pipeline subtests are far too slow at -O0). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Mechanical reformat with the new .clang-format (Google base, 4-space indent, 100 cols, include sorting off — HLS headers are order-dependent). No logic changes; ARM and FPGA suites verified green after the pass. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
.flake8 is the single config for make lint, pre-commit, and CI (error-only gate: E9,F63,F7,F82). make lint now also runs clang-format --dry-run --Werror over the C++ tree. One-line setup in CONTRIBUTING.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Replaces the four old jobs (3.10/3.11 matrix, ARM and FPGA jobs that skipped with '|| true', build-check) with two: - test: buildx + GHA layer cache, then 'make test' inside the image — Python pipeline + golden model + ARM C++ + FPGA testbench all gate for real now. This job is the required status check for branch protection (ENO-11) — keep the name stable. - lint: flake8 + clang-format via 'make lint'. Dropped from the old gate: convert_weights_to_fpga.py --validate-only (depends on the untracked UNION.pth weights). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root-run containers leave root-owned __pycache__ and test binaries in the bind-mounted tree, which then break native runs and git clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements the core of ENTR-18: Docker environment, CI gating on
make test, and minimal pre-commit — plus the test-suite repairs required to make that gate real. The old CI skipped the ARM and FPGA jobs with|| true/echo, and the Python test scripts always exited 0, so no test had ever actually gated anything. Getting to a genuinely greenmake testsurfaced real bugs, several of them hardware-relevant:DevOps (the issue's core scope)
python:3.11-slim-trixie+ pinned g++-13 + CPU torch): deps-only image, repo bind-mounted, so it rebuilds only whenDockerfile/requirements.txtchange.make docker-build/make docker-test.testjob — buildx with GHA layer cache, thenmake testin the container. This is the required status check for branch protection (ENO-11); the name is intentionally short and stable. Separatelintjob runs flake8 + clang-format.make lint, one setup line in CONTRIBUTING.md. One-time clang-format pass overarm/ fpga/ test/in its own commit.Bugs found by un-skipping the tests
fpga/hls_compat.hnever compiled under g++ — rebuilt the ap_* sim types around implicit integral conversions.spatial_hash.h:ap_int<2>loop indices made the 3×3 neighborhood sweep an infinite loop; linear probing stored no keys, so k-NN lookups could never find a populated bucket (always 0 neighbors).ap_uint<12>, wrapping to 0 at exactly full — newevent_cnt_t(13-bit) used for all counts.pwm_output.h:pwm_tick_twas 16-bit but pulse widths reach 200,000 ticks — every motor output truncated mod 65536, including the disarmed safe-minimum. Now 18-bit.aer_interface.h: handshake state register was a non-static uninitialized local — UB every call.arm/collision_predictor.h: TTC formula multiplied Lee's τ = θ/θ̇ byfov/θ, cancelling object size — TTC came out ~30× too large and urgency was permanently 0 (no looming object could ever trigger evasion). Now τ = θ/θ̇ as documented.arm/Makefile:simtarget compiled objects with the ARM cross-compiler but linked with native g++.sys.exit(1)on failure (previously always 0 → CI could never catch a regression); RNG seeded (unseeded data made runs flaky).Known limitation (intentionally visible)
models/models/FPGA.pth(weight truncation, not a retrain) predicts near-constant/NaN flow regardless of input, so the two model-quality assertions intest_fpga_simulator.pyare XFAIL with an explicit degeneracy probe — they re-arm automatically once the ENTR-16 d=128 retrain lands. Everything structural still gates.Verification
make docker-testgreen locally: Python 5/5, golden model clean, ARM 23/23, FPGA testbench 17/17 assertionsmake lintexit 0 in the container;pre-commit run --all-filespassesconvert_weights_to_fpga.py --validate-onlystep dropped (depends on untracked UNION.pth)🤖 Generated with Claude Code