Skip to content

[ABA-17] fix(vortex-array): bounds-check raw indices in ConstantArray::take#6

Open
abnobdoss wants to merge 2 commits into
developfrom
fix/aba-17-constant-take-oob
Open

[ABA-17] fix(vortex-array): bounds-check raw indices in ConstantArray::take#6
abnobdoss wants to merge 2 commits into
developfrom
fix/aba-17-constant-take-oob

Conversation

@abnobdoss
Copy link
Copy Markdown
Owner

Summary

ConstantArray::take (via impl TakeReduce for Constant in
vortex-array/src/arrays/constant/compute/take.rs) branched only on the
indices' validity mask (AllOr::{All, None, Some}) and constructed its result
from array.scalar() + indices.len(). It never compared the raw index values
against array.len(). A VALID out-of-bounds index — e.g. a u32::MAX slot
from a corrupted offset table or a u32-overflowed index expression — was
silently echoed back as the constant fill value, diverging from every other
take kernel (Primitive::take rejects the same input by routing through
cast(to_unsigned()) and an unchecked buffer index, which panics or errors
on OOB).

Fix

Bounds-check the min/max of the VALID indices against the source length once
at the top of Constant::take:

if let Some(mm) = min_max(indices, &mut ctx)? {
    let min_idx = usize::try_from(&mm.min)?;
    let max_idx = usize::try_from(&mm.max)?;
    vortex_ensure!(min_idx < array.len(), OutOfBounds: min_idx, 0, array.len());
    vortex_ensure!(max_idx < array.len(), OutOfBounds: max_idx, 0, array.len());
}
  • min_max skips nulls, so a null index slot carrying a garbage raw value is
    still accepted — the bounds contract applies only to VALID positions, and a
    naive "check every raw value" fix would regress that.
  • usize::try_from on the min scalar rejects negative signed indices,
    mirroring Primitive::take's cast(to_unsigned()) behavior at
    vortex-array/src/arrays/primitive/compute/take/mod.rs:88-91.
  • The error uses vortex_ensure!(.. OutOfBounds: idx, 0, len), consistent with
    the existing OOB error idiom at vortex-array/src/array/erased.rs:281.

Trade-off

This adds an O(n) min/max pass over the indices to what was previously an
O(1) constant-fold path. The cost is necessary for correctness —
Constant::take discards its indices entirely, so there is no later kernel
that could surface the OOB lazily. The alternative would be to restructure
the constant-fold to retain the indices' bounds metadata, which is out of
scope for a contained fix. Reviewers may prefer a Stat::Max-cached fast
path; happy to add one in a follow-up if desired, but min_max already
short-circuits on cached stats internally.

Linear

ABA-17 — "ConstantArray::take
never validates raw index values for OOB"

Validation

  • Added regression test
    arrays::constant::compute::take::tests::issue_aba17_take_must_reject_oob_indices
    in vortex-array/src/arrays/constant/compute/take.rs. Test covers:
    1. Single u32::MAX valid index against len=10 — must error.
    2. Mixed in-range + far-OOB (1010) valid indices — must error.
    3. Null index slot carrying u32::MAX raw value — must be accepted (only
      valid positions are checked).
  • Committed in two commits so the regression test is bisect-clean: commit 1
    adds the test #[ignore]d (would fail on develop without the fix); commit 2
    applies the fix and un-ignores the test.

Checks run:

  • cargo +nightly fmt --all
  • cargo test -p vortex-array --lib — 2753 passed, 0 failed, 0 ignored
  • cargo clippy -p vortex-array --all-targets --all-features — clean

I did not run the full workspace cargo clippy --all-targets --all-features
or ./scripts/public-api.sh because this change has no public API surface and
is fully contained to one private impl in vortex-array.

🤖 Generated with Claude Code

Abanoub Doss and others added 2 commits May 21, 2026 08:04
…ccepts OOB indices

`Constant::take` (TakeReduce) only inspects the indices' validity mask and
never compares the raw index values against the source array's length. A
VALID index past `array.len()` lands in the `AllOr::All` branch and is
silently echoed back as the constant fill value — a contract violation
relative to every other take kernel (e.g. `Primitive::take`, which rejects
such indices).

This test is `#[ignore]`d here and un-ignored by the follow-up fix commit,
so this commit is bisect-clean.

Linear: ABA-17

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Abanoub Doss <abanoub.doss@gmail.com>
`Constant::take` (TakeReduce) branched only on the indices' validity mask
(`AllOr::{All, None, Some}`) and constructed its result from
`array.scalar()` + `indices.len()`. It never compared the raw index values
against `array.len()`. A VALID out-of-bounds index — e.g. a `u32::MAX`
slot from a corrupted offset table — silently echoed back as the constant
fill value, diverging from every other take kernel (`Primitive::take`
rejects the same input).

Bounds-check the min/max of the VALID indices against the source length
once at the top of the impl. The `min_max` aggregate skips nulls, so a
null position carrying a garbage raw value is still accepted (matching
the rest of the contract). `usize::try_from` on the min scalar rejects
negative signed indices, mirroring `Primitive::take`'s
`cast(to_unsigned())` behavior. The upper bound is reported via
`vortex_ensure!(.. OutOfBounds: idx, 0, len)`, consistent with the
existing OOB error idiom in `ArrayRef::execute_scalar`.

Trade-off: this adds an O(n) min/max pass over indices to what was
previously an O(1) constant-fold path. The cost is necessary for
correctness — `Constant::take` discards its indices, so there is no
later kernel that could surface the OOB lazily.

Also un-ignores the regression test added in the previous commit.

Linear: ABA-17

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Abanoub Doss <abanoub.doss@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant