[ABA-14] test(vortex-fastlanes): repro for ZeroBytes scheme reporting nbytes()==0 #13

Open
abnobdoss wants to merge 1 commit into develop from fix/aba-14-bitpack-zerobytes-nbytes

Summary

  • Adds a single #[ignore]-d regression test reproducing the ABA-14 bug: BitPackedArray built from all-zero input reports nbytes() == 0
  • No fix included — a semantics decision is needed from upstream before the fix approach can be chosen
  • Test name: issue_aba14_bitpack_zerobytes_reports_nonzero_nbytes in encodings/fastlanes/src/bitpacking/array/bitpack_compress.rs

Linear

https://linear.app/abanoubdoss/issue/ABA-14

Root cause (confirmed on develop)

When bitpack_to_best_bit_width processes an all-zero array, it selects bit_width=0, since packing zero bits per value is cheapest. bitpack_primitive then short-circuits at lines 147–149 and returns Buffer::<T>::empty(). The VTable exposes that buffer as the sole physical buffer, so Array::nbytes(), which sums buffer lengths, returns 0 for an array that logically holds 1024 elements.
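The chain above can be illustrated with a minimal, self-contained sketch. It does not use the vortex crates: `PackedSketch`, `best_bit_width_sketch`, and `bitpack_sketch` are hypothetical stand-ins, and the real `bitpack_primitive` may differ in layout and signature. The sketch only assumes what the description states: a zero bit width yields an empty buffer, and `nbytes()` is derived from buffer length alone.

```rust
// Hypothetical stand-in for a bit-packed array; not the real vortex type.
struct PackedSketch {
    len: usize,      // logical element count
    bit_width: u8,   // bits stored per element
    packed: Vec<u8>, // the only physical buffer
}

impl PackedSketch {
    // Mirrors the described behaviour: size is the sum of buffer lengths.
    fn nbytes(&self) -> usize {
        self.packed.len()
    }
}

// Smallest width that can represent every value; 0 when all values are zero.
fn best_bit_width_sketch(values: &[u32]) -> u8 {
    values
        .iter()
        .map(|v| 32 - v.leading_zeros())
        .max()
        .unwrap_or(0) as u8
}

// Packs each value into `bit_width` bits, least significant bit first.
fn bitpack_sketch(values: &[u32], bit_width: u8) -> PackedSketch {
    if bit_width == 0 {
        // Short-circuit: nothing to store, so the buffer is empty.
        return PackedSketch { len: values.len(), bit_width, packed: Vec::new() };
    }
    let width = bit_width as usize;
    let mut packed = vec![0u8; (values.len() * width).div_ceil(8)];
    for (i, &v) in values.iter().enumerate() {
        for b in 0..width {
            if (v >> b) & 1 == 1 {
                let bit = i * width + b;
                packed[bit / 8] |= 1 << (bit % 8);
            }
        }
    }
    PackedSketch { len: values.len(), bit_width, packed }
}
```

Under these assumptions, packing 1024 zeros gives `len == 1024` but `nbytes() == 0`, which is the mismatch the ignored test captures.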

Downstream consequence

vortex-compressor/src/estimate.rs maps after_nbytes == 0 from sample compression to EstimateScore::ZeroBytes. EstimateScore::is_valid (lines 142–149) returns false for ZeroBytes, so the scheme that achieves perfect compression of all-zero data is silently excluded from the compressor's selection.
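A rough sketch of how this exclusion plays out, assuming a simplified estimator. Only the `ZeroBytes` variant and the `is_valid` name come from the description above; the `Ratio` variant, `from_sample`, `select_best_sketch`, and the scheme names are hypothetical, and the real code in estimate.rs may be structured differently.

```rust
// Hypothetical simplification of the estimator's score type.
#[derive(Debug, PartialEq)]
enum EstimateScoreSketch {
    ZeroBytes,  // sample compressed to 0 bytes
    Ratio(f64), // assumed variant: before / after
}

impl EstimateScoreSketch {
    fn from_sample(before_nbytes: usize, after_nbytes: usize) -> Self {
        if after_nbytes == 0 {
            Self::ZeroBytes
        } else {
            Self::Ratio(before_nbytes as f64 / after_nbytes as f64)
        }
    }

    // As described: ZeroBytes is never eligible for selection.
    fn is_valid(&self) -> bool {
        !matches!(self, Self::ZeroBytes)
    }
}

// Picks the valid candidate with the highest ratio.
// Each candidate is (scheme name, compressed sample size in bytes).
fn select_best_sketch(
    candidates: &[(&'static str, usize)],
    before_nbytes: usize,
) -> Option<&'static str> {
    candidates
        .iter()
        .map(|&(name, after)| (name, EstimateScoreSketch::from_sample(before_nbytes, after)))
        .filter(|(_, score)| score.is_valid())
        .filter_map(|(name, score)| match score {
            EstimateScoreSketch::Ratio(r) => Some((name, r)),
            EstimateScoreSketch::ZeroBytes => None,
        })
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
```

In this sketch, a candidate reporting 0 bytes loses to any candidate reporting a nonzero size, and if it is the only candidate, nothing is selected at all.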

Open question

Should nbytes() for a perfectly-compressible scheme like ZeroBytes (bit_width==0) report:

  • Virtual uncompressed size (e.g. len * ptype.byte_width()) — preserves comparability with uncompressed arrays in the estimator, but misrepresents actual storage
  • Actual byte size (currently: 0) — truthful about storage, but breaks the estimator's assumption that 0 bytes means "bad sample" rather than "perfect compression"

Currently, returning 0 causes vortex-compressor/src/estimate.rs:122–149 to classify the result as ZeroBytes and exclude it from selection. The TODO comment at lines 113–116 of estimate.rs already acknowledges this ambiguity.
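To make the two candidate semantics concrete, here is a small sketch of what each would report. Both function names are hypothetical and neither is proposed as the fix; `byte_width` stands in for `ptype.byte_width()`, and the packed-size formula assumes a plain dense layout with no padding or block alignment.

```rust
// Option 1 (hypothetical helper): virtual uncompressed size.
fn virtual_nbytes_sketch(len: usize, byte_width: usize) -> usize {
    len * byte_width
}

// Option 2 (hypothetical helper): bytes actually occupied by the packed bits,
// assuming dense packing with no padding.
fn actual_nbytes_sketch(len: usize, bit_width: usize) -> usize {
    (len * bit_width).div_ceil(8)
}
```

For 1024 all-zero u32 values (byte_width 4, bit_width 0), the first option reports 4096 and the second reports 0, which is the value the estimator currently treats as ZeroBytes.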

Validation

  • Test reproduces the bug: cargo test -p vortex-fastlanes -- issue_aba14 (with #[ignore] removed) panics, reporting nbytes=0 with bit_width=0
  • All existing tests continue to pass with #[ignore] restored

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…bytes()==0

Add an ignored regression test that reproduces the bug described in
ABA-14: a BitPackedArray produced from an all-zero input selects
bit_width=0 (cheapest for all-zero data), which causes
`bitpack_primitive` to short-circuit to `Buffer::<T>::empty()`. The
VTable exposes that buffer as the sole physical buffer, so
`Array::nbytes()` reports 0 for an array of 1024 logical elements.

Downstream: `vortex-compressor/src/estimate.rs` maps `after_nbytes==0`
from sample compression to `EstimateScore::ZeroBytes`, which
`EstimateScore::is_valid` treats as ineligible for scheme selection —
the scheme that achieves perfect compression of all-zero data is
silently excluded from the compressor's selection.

No fix is included pending a semantics decision: should `nbytes()` for a
zero-bit-width scheme report virtual uncompressed size or actual byte
size? See https://linear.app/abanoubdoss/issue/ABA-14

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Abanoub Doss <abanoub.doss@gmail.com>