feat: widen ECCVM layout (8 additions/row, 8 wnaf digits/row) by notnotraju · Pull Request #21721 · AztecProtocol/aztec-packages

notnotraju · 2026-03-18T09:51:27Z

Summary

Doubles the width of the ECCVM Precomputed and MSM tables:

WNAF_DIGITS_PER_ROW: 4 → 8 (precomputed table rows halved per scalar)
ADDITIONS_PER_ROW: 4 → 8 (MSM addition rows halved per round)
DOUBLINGS_PER_ROW: stays at 4 (decoupled from additions)

Net effect: for m short scalar muls, MSM rows go from 33·⌈m/4⌉ + 31 to 33·⌈m/8⌉ + 31; precompute rows go from 8m to 4m.
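
The row-count formulas above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch in Python rather than the C++ of the codebase; the helper names `ceil_div`, `msm_rows`, and `precompute_rows` are illustrative and do not exist in barretenberg.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without going through floats."""
    return -(-a // b)

def msm_rows(m: int, additions_per_row: int) -> int:
    """MSM table rows for m short scalar muls, per the formula in the summary."""
    return 33 * ceil_div(m, additions_per_row) + 31

def precompute_rows(m: int, wnaf_digits_per_row: int) -> int:
    """Precompute rows: 32 wNAF digits per short scalar, packed several per row."""
    return m * (32 // wnaf_digits_per_row)

if __name__ == "__main__":
    for m in (8, 64, 1000):
        print(m, msm_rows(m, 4), msm_rows(m, 8),
              precompute_rows(m, 4), precompute_rows(m, 8))
```

Note that the constant `+ 31` term is unchanged by the widening, so the MSM row count only approaches half of its old value as m grows.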

Capacity analysis

Each app circuit adds ~1104 ECCVM rows with a base overhead of ~1494 rows. The MaxCapacityPassing test now computes the max app count from CONST_ECCVM_LOG_N instead of hardcoding it.

| `CONST_ECCVM_LOG_N` | ECCVM rows | Max apps | vs. old 4-wide at LOG_N=15 |
|---|---|---|---|
| 15 | 32768 | 28 | +65% capacity (was 17) |
| 14 | 16384 | 13 | -24% capacity (was 17) |

LOG_N=14 does NOT maintain capacity parity: only 13 apps fit (down from 17). The win here is keeping LOG_N = 15 and getting 28 apps, a ~1.65x increase in stack depth over the previous 17.
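
The capacity figures follow from the approximate per-app and base row counts quoted above. The sketch below reproduces that arithmetic in Python; `max_apps` is a hypothetical helper, and the actual MaxCapacityPassing test in chonk.test.cpp may compute the bound differently.

```python
APP_ROWS = 1104       # approximate ECCVM rows added per app circuit (from this PR)
BASE_OVERHEAD = 1494  # approximate fixed ECCVM row overhead (from this PR)

def max_apps(log_n: int) -> int:
    """Largest app count whose rows fit in a 2^log_n-row ECCVM trace."""
    return ((1 << log_n) - BASE_OVERHEAD) // APP_ROWS

if __name__ == "__main__":
    for log_n in (15, 14):
        print(log_n, 1 << log_n, max_apps(log_n))
```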

Changes made

Constants & types (eccvm_builder_types.hpp):

WNAF_DIGITS_PER_ROW 4→8, ADDITIONS_PER_ROW 4→8

Flavor (eccvm_flavor.hpp):

36 new witness columns: msm_add5..8, msm_x5..8, msm_y5..8, msm_collision_x5..8, msm_lambda5..8, msm_slice5..8, precompute_s5hi..s8lo, precompute_tx2, precompute_ty2, lookup_read_counts_2, lookup_read_counts_3

Builders:

precomputed_tables_builder.hpp: 8 digits per row, 2 points per row (Tx/Ty + Tx2/Ty2), int64_t for row_chunk to avoid overflow
msm_builder.hpp: 8 additions per row with 4 doublings, dummy-point padding for unused slots, 4 read-count columns with compressed-slice-to-table mapping

Relations (all in relations/ecc_vm/):

ecc_msm_relation: extended to 8 addition constraints per row
ecc_bools_relation: boolean checks for msm_add5..8
ecc_wnaf_relation: 8 slice decompositions per row
ecc_point_table_relation: 2nd precomputed point constraint (Tx2/Ty2)
ecc_set_relation: 8 slice fingerprints, 8 add-gated tuples, eccvm_set_permutation_delta as product of 8 terms; second term uses tx2/ty2
ecc_lookup_relation: 8 reads, 4 table terms (point1 pos/neg, point2 pos/neg)

Prover/verifier (eccvm_prover.cpp, eccvm_verifier.cpp, eccvm_trace_checker.cpp):

eccvm_set_permutation_delta updated to 8-term product

Test infrastructure:

eccvm.test.cpp: updated delta computation, transcript manifest
eccvm_transcript.test.cpp: updated expected prover manifest
chonk.test.cpp: MaxCapacityPassing now computes max apps from LOG_N (28 at LOG_N=15)
Gate count / proof size constants updated

Not included (needed before merge)

Noir constant updates (constants.nr ECCVM proof length)
VK regeneration
Decision on CONST_ECCVM_LOG_N (stay at 15 for capacity, not 14)

Test plan

41/41 eccvm_tests pass
50/50 CI=1 NO_FAIL_FAST=1 ./bootstrap test pass (includes chonk, goblin, dsl, VK checks)
LOG_N=14 tested: only 13 apps fit (insufficient for 17-app workload)
E2E / IVC integration tests
Noir constant sync

This is the first step toward halving the Precomputed and MSM table heights by doubling their width. The key changes:
- WNAF_DIGITS_PER_ROW: 4 -> 8 (process 8 wNAF digits per precompute row)
- ADDITIONS_PER_ROW: 4 -> 8 (process 8 point additions per MSM row)
- DOUBLINGS_PER_ROW: new constant, always NUM_WNAF_DIGIT_BITS (= 4)

The new DOUBLINGS_PER_ROW constant decouples the doubling chain length (which must remain 4, matching the wNAF digit width w=4) from ADDITIONS_PER_ROW (which we are doubling to 8). Previously, these were conflated because ADDITIONS_PER_ROW happened to equal NUM_WNAF_DIGIT_BITS.

Key changes to MSMRow and trace computation:
- AddState array: hardcoded size 4 -> ADDITIONS_PER_ROW (now 8)
- Doubling loops: use DOUBLINGS_PER_ROW (= 4) instead of ADDITIONS_PER_ROW for the doubling phase, since we always do w=4 doublings regardless of how many additions we pack per row
- Trace sizing: (num_msm_rows - 2) * 4 -> * ADDITIONS_PER_ROW
- trace_index computation: * 4 -> * ADDITIONS_PER_ROW
- After doubling loops, advance trace_index by (ADDITIONS_PER_ROW - DOUBLINGS_PER_ROW) to skip unused slots allocated in the point trace
- Final row add_state: use ADDITIONS_PER_ROW-sized array fill

With WNAF_DIGITS_PER_ROW doubled from 4 to 8:
- num_rows_per_scalar drops from 8 to 4 (32 digits / 8 per row)
- Each row now encodes 8 wNAF digits via 16 two-bit slices (s1..s16), up from 4 digits / 8 slices (s1..s8)
- Each row stores 2 precomputed points (precompute_accumulator and precompute_accumulator2), since we have 8 points to store across 4 rows. Row i stores table[POINT_TABLE_SIZE-1-2i] and table[POINT_TABLE_SIZE-2-2i].
- Horner scalar accumulation shifts by 2^32 (was 2^16) since each row now contributes 8*4 = 32 bits of scalar data.
- row_chunk computation extended to sum all 8 wNAF digits.
- Removed static_assert(WNAF_DIGITS_PER_ROW == 4), replaced with static_assert(WNAF_DIGITS_PER_ROW == 8).
- Updated POINT_TABLE_SIZE/2 == num_rows_per_scalar*2 assert to reflect the new 2-points-per-row layout.
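
To illustrate why the Horner shift moves from 2^16 to 2^32, the sketch below accumulates the same 32 signed digits with 4 and with 8 digits per row and checks that both reconstruct the same value. It is Python rather than the builder's C++, it ignores the skew term, and the names `horner` and `table_indices` are hypothetical; only the constants and the index formula come from the commit message.

```python
import random

DIGIT_BITS = 4            # wNAF digit width w = 4
DIGITS_PER_SCALAR = 32    # digits per short scalar

def horner(digits, digits_per_row):
    """Accumulate digits (most significant first) row by row."""
    shift = 1 << (DIGIT_BITS * digits_per_row)  # 2^16 for 4 digits, 2^32 for 8
    acc = 0
    for start in range(0, len(digits), digits_per_row):
        row_chunk = 0
        for d in digits[start:start + digits_per_row]:
            row_chunk = row_chunk * (1 << DIGIT_BITS) + d
        acc = acc * shift + row_chunk
    return acc

def table_indices(row, point_table_size=16):
    """The two point-table entries stored in precompute row `row`."""
    return (point_table_size - 1 - 2 * row, point_table_size - 2 - 2 * row)

if __name__ == "__main__":
    rng = random.Random(0)
    # wNAF digits are odd values in [-15, 15]
    digits = [rng.choice(range(-15, 16, 2)) for _ in range(DIGITS_PER_SCALAR)]
    print(horner(digits, 4) == horner(digits, 8))
    print([table_indices(r) for r in range(4)])
```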

Updates ECCVMFlavor entity counts and column definitions:
- NUM_WIRES: 85 -> 121
- NUM_ALL_ENTITIES: 118 -> 156
- NUM_WITNESS_ENTITIES: 87 -> 123
- NUM_SHIFTED_ENTITIES: 26 -> 28

New WireNonShiftedEntities (+34 columns):
- precompute_s5hi..s8lo: 8 new 2-bit slice columns for digits 5-8
- msm_add5..add8: 4 new addition selector columns
- msm_x5..x8, msm_y5..y8: 8 new point coordinate columns
- msm_collision_x5..x8: 4 new collision inverse columns
- msm_lambda5..lambda8: 4 new slope columns
- msm_slice5..slice8: 4 new wNAF slice columns
- lookup_read_counts_2, _3: 2 new lookup read count columns

New WireToBeShiftedWithoutAccumulatorsEntities (+2 columns):
- precompute_tx2, precompute_ty2: 2nd precomputed point per row, needs shifting for inter-row point table constraints

Corresponding ShiftedEntities updated with precompute_tx2_shift, precompute_ty2_shift. CommitmentLabels updated for all new columns.

Extends the ProverPolynomials constructor to populate the 36 new flavor columns from the builder row data:

Precompute section:
- Wire precompute_s5hi..s8lo from point_table_rows[i].s9..s16
- Wire precompute_tx2/ty2 from point_table_rows[i].precompute_accumulator2

MSM section (all from add_state[4..7]):
- Wire msm_add5..add8 from add_state[4..7].add
- Wire msm_x5..x8, msm_y5..y8 from add_state[4..7].point
- Wire msm_collision_x5..x8 from add_state[4..7].collision_inverse
- Wire msm_lambda5..lambda8 from add_state[4..7].lambda
- Wire msm_slice5..slice8 from add_state[4..7].slice

lookup_read_counts_2/_3 columns are declared but not yet populated; they will be wired when the lookup relation is updated to support 4 table terms per precompute row.

The MSM relation now supports 8 point additions per row (was 4). The doubling chain remains 4-wide (= wNAF digit width w = 4). Key changes:
- Addition chain: first_add + 7 conditional adds (was first_add + 3)
- Skew chain: 8 conditional skew additions (was 4)
- Collision checks: 8 inverse checks (was 4)
- Slice-zero enforcement: 8 checks (was 4)
- Count update: sum of add1..add8 (was add1..add4)
- Addition continuity: add{i+1} * (-add{i} + 1) for i=1..7 (was 1..3)
- Cross-row continuity: (-add8 + 1) * add1_shift (was -add4 + 1)

Subrelation count: 47 -> 67 (20 new subrelations).

New subrelations: ADD slopes 5-8, SKEW slopes 5-8, collision 5-8, slice-zero 5-8, continuity add5-8.

MAX_PARTIAL_RELATION_LENGTH for this relation: 8 -> 12 (due to the longer addition chain increasing the degree of the accumulator output).
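
The two continuity constraints force the active addition selectors in a row to form a prefix of ones: once a slot is unused, every later slot must be unused too. The sketch below evaluates those expressions over plain 0/1 integers in Python. The helper names are hypothetical, and any extra gating selectors that the real relation multiplies in are deliberately left out.

```python
def addition_continuity_ok(adds):
    """add{i+1} * (1 - add{i}) must vanish for every adjacent pair in the row."""
    return all(adds[i + 1] * (1 - adds[i]) == 0 for i in range(len(adds) - 1))

def cross_row_continuity_ok(adds, next_row_add1):
    """(1 - add8) * add1_shift must vanish: a partially filled row ends the run."""
    return (1 - adds[-1]) * next_row_add1 == 0

if __name__ == "__main__":
    full = [1] * 8
    partial = [1, 1, 1, 0, 0, 0, 0, 0]
    gap = [1, 0, 1, 0, 0, 0, 0, 0]
    print(addition_continuity_ok(full), addition_continuity_ok(partial),
          addition_continuity_ok(gap))
    print(cross_row_continuity_ok(full, 1), cross_row_continuity_ok(partial, 1))
```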

Extend the bools relation with 4 new boolean constraints for the msm_add5 through msm_add8 columns (indices 23-26). Subrelation count: 23 -> 27.

Update ecc_wnaf_relation to process 8 wNAF digits per precompute row (was 4), halving the number of rows per scalar from 8 to 4. Key changes:
- SUBRELATION_PARTIAL_LENGTHS expanded from 23 to 35 entries
- 16 two-bit range checks (was 8) for slices s1hi..s8lo
- 8 wNAF conversions w0..w7 (was 4 w0..w3)
- Horner accumulation uses 2^32 shift (was 2^16) for 8 digits
- Round max changed from 7 to 3 (NUM_WNAF_DIGITS_PER_SCALAR/8 - 1)
- Added slice-zero checks for w4..w7 (subrelations 31-34)
- Updated header docstring to reflect 4-row layout

…ed tuples

Update ECCVMSetRelation for 8-wide precompute and MSM tables:

Numerator changes:
- 8 slice fingerprints instead of 4, with round encoding 8*round+j
- Scalar reconstruction uses 8 wNAF digits with 2^32 shift (was 4, 2^16)
- Skew tuple uses round offset 8 (was 4)
- eccvm_set_permutation_delta comment updated for 8-term product

Denominator changes:
- 8 add-gated (pc, round, slice) tuples instead of 4
- PC offsets 0..7 (was 0..3) for msm_add1..msm_add8

SUBRELATION_PARTIAL_LENGTHS updated to {29, 3} (was {22, 3}) to accommodate the higher degree from the 8-wide grand product.

Update ECCVMLookupRelation for 8-wide MSM and 2 precomputed points per precompute row:
- NUM_LOOKUP_TERMS: 4 -> 8 (msm_add1..msm_add8 gated reads)
- NUM_TABLE_TERMS: 2 -> 4 (positive/negative for each of 2 points)
- LENGTH: 9 -> 15

Table term structure (4 terms covering all 16 slice values):
- table_index 0: point 1 positive, slice = 15 - 2*round -> {15,13,11,9}
- table_index 1: point 1 negative, slice = 2*round -> {0,2,4,6}
- table_index 2: point 2 positive, slice = 14 - 2*round -> {14,12,10,8}
- table_index 3: point 2 negative, slice = 2*round + 1 -> {1,3,5,7}

Lookup read counts expanded from 2 to 4 columns (lookup_read_counts_0..3) to match the 4 table terms.
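
The four table-term formulas can be checked to cover every compressed slice value exactly once. The Python sketch below enumerates them per round and compares the result against the parity/magnitude rule this PR gives for the read-count columns. It also assumes the usual decoding of a compressed slice s into the multiple (2s - 15) of P, which is consistent with the row layout in this PR but is not stated in this commit message; the function names are hypothetical.

```python
def table_slices(round_):
    """Compressed slice served by each of the 4 table terms in a given round."""
    return {
        0: 15 - 2 * round_,  # point 1, positive
        1: 2 * round_,       # point 1, negative
        2: 14 - 2 * round_,  # point 2, positive
        3: 2 * round_ + 1,   # point 2, negative
    }

def table_for_slice(slice_):
    """Parity/magnitude rule: which table term a compressed slice reads from."""
    odd = slice_ % 2 == 1
    if slice_ >= 8:
        return 0 if odd else 2
    return 3 if odd else 1

def multiple_of_p(slice_):
    """Assumed decoding of a compressed slice into a signed odd multiple of P."""
    return 2 * slice_ - 15

if __name__ == "__main__":
    for r in range(4):
        for table, s in table_slices(r).items():
            print(r, table, s, multiple_of_p(s), table_for_slice(s) == table)
```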

…tion

Update ECCVMPointTableRelation for 2 precomputed points per row (Tx/Ty and Tx2/Ty2):

SUBRELATION_PARTIAL_LENGTHS expanded from 6 to 8 entries:
- Subrelations 0-1: Doubling constraint, now uses Tx2/Ty2 as the base point (at transition row, Tx2=P so Dx=2P)
- Subrelations 2-3: Dx/Dy continuity (unchanged)
- Subrelations 4-5: NEW intra-row addition (Tx = Tx2 + Dx), gated by precompute_select. Validates first point = second point + 2P.
- Subrelations 6-7: NEW inter-row addition (Tx2 = Tx_shift + Dx), gated by not-transition and not-first-row. Validates second point of row i equals first point of row i+1 plus 2P.

Row layout example for point P:
round 0: Tx=15P, Tx2=13P | round 1: Tx=11P, Tx2=9P
round 2: Tx=7P, Tx2=5P | round 3: Tx=3P, Tx2=P
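
The row layout and the two new addition constraints can be modelled with integers standing in for multiples of P, so that 15 means 15P and the doubled point D is 2. This is only a toy model in Python of the relationships described above, not the elliptic-curve relation itself, and the helper names are hypothetical.

```python
def point_table_rows(num_rows=4):
    """Rows for one point P; each entry is the integer k in kP."""
    return [{"round": r, "tx": 15 - 4 * r, "tx2": 13 - 4 * r} for r in range(num_rows)]

def check_layout(rows):
    """Return True if the intra-row and inter-row addition constraints hold."""
    base = rows[-1]["tx2"]  # at the transition row, Tx2 holds P itself
    d = 2 * base            # doubling constraint: D = 2P
    for i, row in enumerate(rows):
        if row["tx"] != row["tx2"] + d:              # intra-row: Tx = Tx2 + D
            return False
        if i + 1 < len(rows) and row["tx2"] != rows[i + 1]["tx"] + d:
            return False                             # inter-row: Tx2 = Tx_shift + D
    return base == 1

if __name__ == "__main__":
    rows = point_table_rows()
    print(rows)
    print(check_layout(rows))
```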

With 8 wNAF digits per precompute row, the zero-tuple fingerprint used for padding inactive rows must be the product of 8 terms (γ + j·β² + t·β⁴) for j = 0..7, rather than 4 terms for j = 0..3. Updated in all three locations:
- eccvm_prover.cpp
- eccvm_verifier.cpp
- eccvm_trace_checker.cpp
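
As a rough illustration of the 8-term product, the Python sketch below computes it over a toy prime field. The modulus `P` here is a stand-in chosen for readability and is NOT the field the ECCVM works over; `t` is treated as an opaque parameter because this commit message does not define it, and `zero_tuple_delta` is a hypothetical name.

```python
P = 2**61 - 1  # toy Mersenne prime, NOT the ECCVM field modulus (assumption)

def zero_tuple_delta(gamma, beta, t, num_terms=8):
    """Product over j = 0..num_terms-1 of (gamma + j*beta^2 + t*beta^4) mod P."""
    beta_sqr = beta * beta % P
    beta_quartic = beta_sqr * beta_sqr % P
    delta = 1
    for j in range(num_terms):
        delta = delta * ((gamma + j * beta_sqr + t * beta_quartic) % P) % P
    return delta

if __name__ == "__main__":
    gamma, beta, t = 12345, 67890, 7
    print(zero_tuple_delta(gamma, beta, t, 4))  # old 4-wide product
    print(zero_tuple_delta(gamma, beta, t, 8))  # new 8-wide product
```

The old 4-term value is a strict sub-product of the new one, which is why a prover or test helper still computing 4 terms disagrees with the 8-wide relation.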

Add benchmarks for ECCVM relation evaluation using Sumcheck univariates (prover-side work), in addition to the existing values-based benchmarks (verifier-side work).

- Set all MSM relation SUBRELATION_PARTIAL_LENGTHS to 12 (was mixed 8/12). Required because the View type is derived from the max partial length subrelation (index 0 = 12), so all intermediate Univariates are 12-wide and can only be accumulated into 12-wide accumulators.
- Fixed element count: was 68, now 67 (matching the array declaration).
- Removed unused Tx2_shift/Ty2_shift variables from point table relation (the inter-row constraint uses Tx_shift/Ty_shift, not the shifted versions of the second point).

Two bugs fixed:

1. batch_normalize crash on zero z-coordinates: With ADDITIONS_PER_ROW=8 and DOUBLINGS_PER_ROW=4, doubling rows only use 4 of 8 trace slots. The unused slots had default Element{} with z=0 (point at infinity), causing batch_normalize to fail when inverting z-coordinates. Fix: fill unused slots with valid (non-infinity) dummy points and track which slots are used via is_used vector to skip them during collision_inverse computation.
2. Signed integer overflow in precomputed_tables_builder: With 8 wNAF digits per row, row_chunk = slice0 * (1<<28) can reach ~4 billion, exceeding INT_MAX. This was undefined behavior causing incorrect scalar_sum values. Fix: use int64_t for row_chunk computation.
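
The overflow in the second bug is easy to confirm numerically. The Python sketch below assumes a 32-bit `int` and a maximum wNAF digit magnitude of 15. Signed overflow is undefined behavior in C++, so `wrap_int32` only shows what two's-complement wraparound would produce, not what the compiler is required to do.

```python
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1
MAX_WNAF_DIGIT = 15  # largest magnitude of a 4-bit wNAF digit

def top_digit_term(digits_per_row):
    """Largest value of the leading digit's contribution to row_chunk."""
    return MAX_WNAF_DIGIT * (1 << (4 * (digits_per_row - 1)))

def wrap_int32(x):
    """Two's-complement wraparound to 32 bits (illustration only)."""
    return (x + 2**31) % 2**32 - 2**31

if __name__ == "__main__":
    for n in (4, 8):
        term = top_digit_term(n)
        print(n, term, term <= INT32_MAX, wrap_int32(term))
```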

…r-row layout

Three fixes:

1. Lookup read counts: Reworked MSM builder to return 4 read count columns (was 2). With 2 precomputed points per row and 4 table terms in the lookup relation, each compressed slice value maps to one of 4 tables based on parity and magnitude:
   - Table 0: odd slices >= 8 (point 1 positive)
   - Table 1: even slices < 8 (point 1 negative)
   - Table 2: even slices >= 8 (point 2 positive)
   - Table 3: odd slices < 8 (point 2 negative)

   ProverPolynomials now wires all 4 read count columns.
2. Set relation second term: Changed base point reference from precompute_tx/ty to precompute_tx2/ty2. In the 2-point-per-row layout, the base point P is stored in tx2/ty2 at the transition row (round=3), not in tx/ty (which holds 3P).
3. Removed debug trace code from trace checker.

- eccvm.test.cpp: Fix eccvm_set_permutation_delta in complete_proving_key_for_test() to compute the product of 8 zero-tuple fingerprints (γ + j·β² + t·β⁴) for j=0..7, matching the prover/verifier/trace_checker. Previously only computed 4 terms, causing CommittedSumcheck test to fail.
- eccvm_transcript.test.cpp: Update hardcoded prover manifest in construct_eccvm_honk_manifest() with all new wire columns added for the 8-wide layout:
  - PRECOMPUTE_S5HI through PRECOMPUTE_S8LO (8 columns)
  - MSM_ADD5 through MSM_ADD8 (4 columns)
  - MSM_X5/Y5 through MSM_X8/Y8 (8 columns)
  - MSM_COLLISION_X5 through MSM_COLLISION_X8 (4 columns)
  - MSM_LAMBDA5 through MSM_LAMBDA8 (4 columns)
  - MSM_SLICE5 through MSM_SLICE8 (4 columns)
  - LOOKUP_READ_COUNTS_2, LOOKUP_READ_COUNTS_3 (2 columns)
  - PRECOMPUTE_TX2, PRECOMPUTE_TY2 (2 columns)

All 41 eccvm_tests now pass.

The ECCVM recursive verifier gate count increased from 224,657 to 269,130 due to the wider relation columns and higher-degree subrelations in the 8-wide ECCVM layout. The recursive flavor inherits all entity/relation changes automatically from the native ECCVMFlavor via templates, so no code changes were needed in the stdlib recursive verifier itself.

Update comments across msm_builder.hpp, eccvm_flavor.hpp, and ecc_lookup_relation_impl.hpp to reflect the new 8-wide layout:
- "4 point-additions per row" → 8
- "size-4 array" → size-8
- "result of four EC additions" → eight
- Document msm_x/y/add/lambda/slice/collision_x 1..8
- Document precompute_s1..s8 (8 slices per row)
- Document precompute_tx2/ty2 (2 points per row)
- Document all 4 lookup_read_counts columns

Remove ECCVM univariate benchmark additions from relations.bench.cpp to keep this PR focused on the 8-wide layout change.

- Fix ECCVM proof size in design doc: ~716 → 756 Fr (confirmed by static_assert in proof_compression.hpp)
- Correct set relation degree comments: denominator sub-products are 16 (8 add-gated tuples) + 6 (transcript z1/z2) + 4 (MSM output) = 26, not the previously claimed 28. Full GP subrelation degree = 27, partial length upper bound = 29.
- Fix duplicate comment blocks in set relation numerator/denominator third term docstrings
- Update inline cumulative degree annotations throughout compute_grand_product_numerator/denominator

Instead of hardcoding 17 apps, compute the max number of app circuits that fit in the ECCVM based on CONST_ECCVM_LOG_N. Each app adds ~1104 ECCVM rows with ~1494 base overhead. At LOG_N=15: 28 apps; LOG_N=14: 13.

Plan to pack 2 doubling rounds into 1 MSM row (DOUBLINGS_PER_ROW 4→8), cutting doubling rows from 31 to 16 per MSM. Reuses lambda5..8 on doubling rows (free since q_add/q_double are mutually exclusive). No new columns needed. MSM formula: 33*ceil(m/8)+16 (was +31).

The 31 doubling rows per MSM cannot be halved because each occurs between consecutive digit-slot ADD phases in the Straus algorithm. Combining two DBL rounds into one row would require 8-bit digits (point table size 256), which is impractical. The 8-wide change achieves ~1.65x capacity (17→28 apps), not 2x.
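
To see why the doubling rounds sit between digit slots, here is a minimal Straus-style windowed accumulation in Python. It is a sketch under simplifying assumptions: integers stand in for curve points (so "doubling" is multiplication by 2), plain unsigned base-16 digits replace the wNAF digits, and `straus_msm` is a hypothetical name rather than anything in barretenberg.

```python
def straus_msm(scalars, points, num_digits=32, w=4):
    """Compute sum(s * p) digit slot by digit slot, counting doubling rounds."""
    acc = 0
    doubling_rounds = 0
    mask = (1 << w) - 1
    for slot in range(num_digits):
        if slot > 0:
            # One doubling round = w consecutive doublings of the accumulator.
            for _ in range(w):
                acc = 2 * acc
            doubling_rounds += 1
        shift = w * (num_digits - 1 - slot)
        # ADD phase: one addition per point for this digit slot.
        for s, p in zip(scalars, points):
            acc += ((s >> shift) & mask) * p
    return acc, doubling_rounds

if __name__ == "__main__":
    scalars = [0x1234_5678_9ABC_DEF0_0FED_CBA9_8765_4321, 3, 2**128 - 1]
    points = [5, 7, 11]
    result, rounds = straus_msm(scalars, points)
    print(result == sum(s * p for s, p in zip(scalars, points)), rounds)
```

Each doubling round separates two ADD phases that consume different digit positions, so merging two rounds would skip a digit slot unless the digits themselves were 8 bits wide, which is the point made above.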

notnotraju · 2026-03-18T16:44:34Z

Related: AztecProtocol/barretenberg#1654

notnotraju added 19 commits March 17, 2026 13:41

eccvm: add boolean checks for msm_add5..8

3b393a8

Extend the bools relation with 4 new boolean constraints for the msm_add5 through msm_add8 columns (indices 23-26). Subrelation count: 23 -> 27.

chore(eccvm): add ECCVM univariate benchmarks for sumcheck prover

d4c16a4

Add benchmarks for ECCVM relation evaluation using Sumcheck univariates (prover-side work), in addition to the existing values-based benchmarks (verifier-side work).

notnotraju added the ci-barretenberg label (Run all barretenberg/cpp checks.) Mar 18, 2026

chore(eccvm): revert benchmark file changes

4b9154d

Remove ECCVM univariate benchmark additions from relations.bench.cpp to keep this PR focused on the 8-wide layout change.

notnotraju changed the title ~~feat(eccvm): widen ECCVM layout (8 additions/row, 8 wnaf digits/row)~~ feat: widen ECCVM layout (8 additions/row, 8 wnaf digits/row) Mar 18, 2026

notnotraju added 4 commits March 18, 2026 11:05

chore(eccvm): make MaxCapacityPassing test compute max apps from LOG_N

913c4eb

Instead of hardcoding 17 apps, compute the max number of app circuits that fit in the ECCVM based on CONST_ECCVM_LOG_N. Each app adds ~1104 ECCVM rows with ~1494 base overhead. At LOG_N=15: 28 apps; LOG_N=14: 13.

notnotraju marked this pull request as ready for review March 18, 2026 14:22

notnotraju self-assigned this Mar 18, 2026

notnotraju wants to merge 24 commits into merge-train/barretenberg from rk/eccvm-wide-short
