perf(merge_insert): probe every indexed key column for composite joins#6878
perf(merge_insert): probe every indexed key column for composite joins#6878ytyky wants to merge 4 commits into
Conversation
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
Benchmark numbersWall-clock for one
The pre-PR path scales linearly with target row count (full scan of |
|
@claude review |
There was a problem hiding this comment.
No bugs found, and the test coverage (composite key with partial/full indexing, NULL semantics, and post-index unindexed fragments) looks solid. Deferring because this touches the merge_insert query plan and introduces a deliberate NULL-semantics asymmetry (single-column keeps NullEqualsNull, composite uses NullEqualsNothing), plus a documented small-table regression — worth a human design-level look.
Extended reasoning...
Overview
This PR generalizes the indexed merge_insert path to handle composite (multi-column) joins. Previously the optimization was hard-gated to single-column joins; any multi-column merge fell through to a full target scan even when every key column had a scalar index. The PR:
- Adds an
IndexLookupstruct and aMapIndexExec::new_multiconstructor that AND-folds per-columnIsInprobes against each indexed column's scalar index. - Replaces
join_key_as_scalar_indexwithindexed_join_keys(returns all indexed keys) and addsunindexed_fragments_for_keyswhich unions the per-index unindexed fragment sets — a row is only reachable via the composite probe if every chosen index covers its fragment. - Restricts
MapIndexExec's fragment bitmap to the intersection of all per-lookup bitmaps so the deletion mask only filters rows the probe could actually see. - Reports a 3.27× speedup at 2M rows and a documented ~20ms regression at 200k rows (crossover around 400-500k rows).
Security risks
None apparent. This is an internal query-planning optimization with no changes to auth, serialization formats, on-disk layout, or external surface area. ScalarValue::try_from_array already handles arbitrary input types safely.
Level of scrutiny
Higher than mechanical. merge_insert is production-critical (data correctness boundary), and this PR changes the physical query plan that produces target rows. Two specific design choices warrant human review:
- NULL semantics asymmetry: the new code uses
NullEqualsNothingfor composite keys (matching the full-scan path) but preservesNullEqualsNullfor the single-column path. The comment frames this as backward-compat preservation, but it means single-column and composite paths now have different NULL semantics — worth confirming this is the intended long-term contract rather than an inconsistency to be cleaned up later. - Small-table regression: ~20ms slower on 200k-row tables. The author flags an opt-out heuristic as future work; reviewers may want to weigh whether shipping without the heuristic is acceptable now.
Other factors
- The bug-hunting pass found no issues.
- Test coverage is genuinely thoughtful — composite-key with index on each side / both / neither, NULL semantics on indexed composite key, and the unindexed-fragments-after-index case that would silently lose updates if the union logic were wrong.
- The benchmark file is gated behind
#[ignore]so it does not affect CI runtime. - The conventional-commits bot flagged the title, but the title (
perf(merge_insert): …) appears valid; the bot may have been responding to an earlier revision.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
dcf1d2f to
1f834de
Compare
The indexed merge_insert path was hard-gated to single-column joins, so composite-key merges fell through to a full target scan even when every key column had a scalar index. That scan dominates wall-clock time whenever the source batch is small relative to the target table. Generalize `MapIndexExec` to accept a list of `IndexLookup` entries instead of a single `(column, index_name)` pair. For each input batch the operator builds one `IsIn` query per lookup and combines them with `ScalarIndexExpr::And`, so the emitted candidate row addresses are the intersection of every indexed column's matches. The deletion mask is restricted to the intersection of the per-index fragment bitmaps for the same reason. The existing single-column constructor is preserved as a thin wrapper. In `merge_insert`, replace `join_key_as_scalar_index` with `indexed_join_keys`, which collects every join column that has a scalar index supporting exact equality. `create_indexed_scan_joined_stream` then builds a single `MapIndexExec` over all of them and joins on the full composite key with standard SQL NULL semantics (the single-column path keeps `NullEqualsNull` to avoid changing existing semantics). Fragments missing from any chosen index are unioned in via the existing unindexed-scan path so coverage stays complete. Refs: lance-format#3124
One-shot wall-clock timer for indexed multi-column merge_insert against
a 200k-row target with scalar indices on both join columns. Gated
behind `#[ignore]`; run with
cargo test --release -p lance --test composite_merge_insert_bench -- --ignored --nocapture
Used to validate the perf improvement in the parent commit.
200k rows was below the break-even where the indexed multi-column probe overtakes the full-scan fallback. Bump to 2M so the bench exercises the regime the parent perf commit is optimizing for.
… multi-lookup plan shape The composite merge_insert bench prints timings to stdout, which the workspace-wide `print_stdout` clippy lint denies. Allow the lint at the bench file level and reformat to satisfy `cargo fmt`. Add a unit test for `MapIndexExec` that exercises the multi-lookup branches missed by the integration tests: the `IndexedLookup [...]` Display rendering, `with_new_children` preserving every lookup, and the single-column convenience constructor's unchanged Display.
1f834de to
2314e1b
Compare
Summary
merges fell through to a full target scan even when every key column had a
scalar index.
MapIndexExecto AND-fold per-columnIsInprobes, and havemerge_insertfeed it every indexed join key.scalar indices (Composite scalar indices #3125) remain a complementary follow-up; the merge_insert
plumbing this PR adds is what such an index would plug into.
Test plan
cargo test -p lance --lib merge_insert(145 tests, includes 6 newcomposite-key cases covering single-side / both-side / no index / NULL
semantics / partial fragment coverage)
cargo fmt --all -- --checkcargo clippy -p lance --tests -- -D warningsCloses #3124