feat: support batch flat vector queries #6828

Open
LeoReeYang wants to merge 9 commits into lance-format:main from LeoReeYang:leo/batch-vector-query-6821

@LeoReeYang
Contributor

@LeoReeYang LeoReeYang commented May 18, 2026

Summary

  • Extends the existing Scanner::nearest API to accept batched query vectors for fixed-size vector columns.
  • Implements flat/refine batch KNN in KNNVectorDistanceExec, returning one stream with up to m * k rows (m query vectors, k results each) and a query_index column that identifies which query each result row belongs to.
  • Keeps batch KNN on the flat path for now; ANN batch search can be added separately.
  • Adds a Python benchmark for the binding-level batch KNN API with row count, dimensionality, batch size, rounds, and query count modeled as pytest parameters.
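
To illustrate the result shape described in the summary, here is a minimal pure-Python sketch of a shared-scan flat KNN. It is not the KNNVectorDistanceExec implementation: the function name batch_flat_knn, the squared-L2 metric, and the (query_index, row_id, distance) tuple layout are assumptions chosen for illustration.

```python
import heapq


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def batch_flat_knn(rows, queries, k):
    """Hypothetical sketch: scan `rows` once and keep a top-k heap per query.

    Returns a flat list of (query_index, row_id, distance) tuples, grouped by
    query_index and sorted by ascending distance within each query. The list
    holds at most len(queries) * k entries.
    """
    if k <= 0:
        return []
    # One bounded heap per query. Items are (-distance, -row_id), so the heap
    # root is always the current worst candidate for that query.
    heaps = [[] for _ in queries]
    for row_id, vec in enumerate(rows):  # the shared scan: each row is read once
        for qi, q in enumerate(queries):
            item = (-l2_sq(vec, q), -row_id)
            if len(heaps[qi]) < k:
                heapq.heappush(heaps[qi], item)
            elif item > heaps[qi][0]:
                heapq.heapreplace(heaps[qi], item)
    results = []
    for qi, heap in enumerate(heaps):
        for neg_dist, neg_row_id in sorted(heap, reverse=True):
            results.append((qi, -neg_row_id, -neg_dist))
    return results


rows = [[0, 0], [1, 0], [0, 2], [5, 5]]
queries = [[0, 0], [5, 4]]
for query_index, row_id, distance in batch_flat_knn(rows, queries, k=2):
    print(query_index, row_id, distance)
```

With m = 2 queries and k = 2 the sketch yields at most m * k = 4 rows, and each query's slice matches what the same function returns for that query alone. The test name test_batch_flat_query_matches_repeated_single_queries suggests the real API is checked for the same property.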

Closes #6821.

Benchmark

Python benchmark command:

uv run --extra benchmarks pytest python/benchmarks/test_search.py::test_batch_flat_knn

Dataset size, dimensionality, query count, batch size, and rounds are declared in the benchmark's @pytest.mark.parametrize values. Adjust those parameters in python/benchmarks/test_search.py to reproduce the scaling rows below.

Dataset: random float32 vectors written to a real local .lance dataset. The benchmark uses neither a memory:// dataset nor throttled/simulated object store latency. OS page cache effects are accepted.

Query Count Scaling

Fixed dataset: 1,000,000 rows, dim=512, k=10. This is about 1.9 GiB of raw vector values.

| rows | dim | query count (m) | separate queries mean | batch query mean | time saved | speedup |
|---|---|---|---|---|---|---|
| 1,000,000 | 512 | 2 | 224.19 ms | 180.68 ms | 43.51 ms | 1.24x |
| 1,000,000 | 512 | 5 | 573.84 ms | 310.23 ms | 263.61 ms | 1.85x |
| 1,000,000 | 512 | 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

This shows the expected trend that batching becomes more valuable as m increases: the shared scan/decode work is amortized over more query vectors.

Dataset Size Scaling

Fixed query count: m=10, dim=512, k=10.

| rows | raw vector size | query count (m) | separate queries mean | batch query mean | time saved | speedup |
|---|---|---|---|---|---|---|
| 100,000 | ~0.19 GiB | 10 | 121.74 ms | 50.318 ms | 71.42 ms | 2.42x |
| 500,000 | ~0.95 GiB | 10 | 579.07 ms | 261.21 ms | 317.86 ms | 2.22x |
| 1,000,000 | ~1.91 GiB | 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

On local disk with OS page cache, relative speedup does not grow with row count (it drifts from 2.42x at 100,000 rows to 2.15x at 1,000,000 rows) because both plans become increasingly dominated by the same cached vector decoding and distance-compute work. The robust trend in this setup is absolute time saved, which grows from ~71 ms to ~600 ms as dataset size grows.
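
The derived columns in the benchmark tables (raw vector size, time saved, speedup) follow from simple arithmetic on the reported means. The sketch below reproduces them; the helper names raw_vector_gib and summarize are illustrative and not part of the benchmark code, and 4 bytes per value is assumed because the vectors are float32.

```python
def raw_vector_gib(rows, dim, bytes_per_value=4):
    """Raw size of the vector values in GiB (float32 assumed: 4 bytes each)."""
    return rows * dim * bytes_per_value / 2**30


def summarize(separate_ms, batch_ms):
    """Return (time saved in ms, speedup factor) for one benchmark row."""
    return separate_ms - batch_ms, separate_ms / batch_ms


# Row with 1,000,000 rows, dim=512, m=10 (means taken from the tables above).
saved, speedup = summarize(1124.1, 524.05)
print(f"{raw_vector_gib(1_000_000, 512):.2f} GiB, saved {saved:.2f} ms, {speedup:.2f}x")
```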

Test plan

  • cargo test -p lance test_batch_knn_flat_results_include_query_index
  • cargo clippy -p lance --tests --benches -- -D warnings
  • ALL_FEATURES=... cargo clippy --profile ci --locked --features ${ALL_FEATURES} --tests -- -D warnings
  • uv run pytest python/tests/test_vector_index.py::test_batch_flat_query_matches_repeated_single_queries
  • uv run --extra benchmarks pytest --collect-only python/benchmarks/test_search.py::test_batch_flat_knn
  • cargo fmt --all -- --check
  • uv run ruff format --check --diff python/benchmarks/test_search.py python/lance/dataset.py python/tests/test_vector_index.py && uv run ruff check python/benchmarks/test_search.py python/lance/dataset.py python/tests/test_vector_index.py

Add a flat KNN batch query path so callers can submit multiple query vectors and share scan work while preserving per-query top-k results.

Co-authored-by: Cursor <cursoragent@cursor.com>

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/dataset/scanner.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 86.87616% with 71 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/io/exec/knn.rs | 79.51% | 40 Missing and 11 partials ⚠️ |
| rust/lance/src/dataset/scanner.rs | 93.15% | 10 Missing and 10 partials ⚠️ |

Fold batch flat KNN into the existing nearest and KNN execution paths so the public API and plan nodes stay consistent with reviewer feedback.

Co-authored-by: Cursor <cursoragent@cursor.com>
@LeoReeYang LeoReeYang changed the title from "Support batch flat vector queries" to "feat: support batch flat vector queries" May 18, 2026
@LeoReeYang
Contributor Author

Updated based on review feedback:

  • Removed the separate nearest_batch API; batched fixed-size vector queries now go through Scanner::nearest.
  • Removed the separate KNNBatchVectorDistanceExec type; batch flat KNN is handled inside KNNVectorDistanceExec.
  • Restored simple fast_search / use_index setters and route batch nearest queries to the flat path internally.
  • Updated the benchmark to create a real local disk .lance dataset instead of using memory:// or throttled simulated latency.

Local disk benchmark result for 8 queries, 50k rows, dim=4: separate mean 3.8834 ms vs batch mean 3.3045 ms, about 1.17x speedup. The gain is modest on local disk because the repeated reads are served from OS page cache.

@github-actions github-actions Bot added the enhancement New feature or request label May 18, 2026
Use a larger local-disk dataset and stream benchmark data generation so batch query gains are measured under a more realistic scan workload.

Co-authored-by: Cursor <cursoragent@cursor.com>
@LeoReeYang
Contributor Author

Updated the benchmark scale per feedback:

  • Dataset is now 1,000,000 rows x 512 dimensions, with 10 batched query vectors and k=10.
  • The benchmark writes a real local .lance dataset under the system temp directory, about 1.9 GiB of raw vector values.
  • Data generation now streams batches into Dataset::write instead of materializing the full benchmark dataset in memory first.

Local result with OS cache accepted:

  • separate_queries/10: mean 1.1166 s
  • batch_query/10: mean 508.05 ms
  • Speedup: about 2.20x

This is meaningfully higher than the previous small-data local run (~1.17x), which matches the expectation that larger scan workloads show more benefit from sharing read/decode work across queries.

Allow the local-disk batch KNN benchmark to vary row count, dimensionality, and query count so PR results can show scaling trends.

Co-authored-by: Cursor <cursoragent@cursor.com>
@LeoReeYang
Contributor Author

Added a controlled benchmark matrix to make the trend clearer.

Query-count scaling at 1M rows x 512d:

| m | separate mean | batch mean | saved | speedup |
|---|---|---|---|---|
| 2 | 224.19 ms | 180.68 ms | 43.51 ms | 1.24x |
| 5 | 573.84 ms | 310.23 ms | 263.61 ms | 1.85x |
| 10 | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

Dataset-size scaling at m=10, 512d:

| rows | raw vector size | separate mean | batch mean | saved | speedup |
|---|---|---|---|---|---|
| 100k | ~0.19 GiB | 121.74 ms | 50.318 ms | 71.42 ms | 2.42x |
| 500k | ~0.95 GiB | 579.07 ms | 261.21 ms | 317.86 ms | 2.22x |
| 1M | ~1.91 GiB | 1.1241 s | 524.05 ms | 600.05 ms | 2.15x |

So the relative speedup clearly increases with m. For dataset size, the absolute time saved grows from ~71 ms to ~600 ms while relative speedup stays above 2x on local disk with OS cache effects accepted. The benchmark is now parameterized with env vars so these rows can be reproduced without editing source.

Comment thread rust/lance/src/io/exec/knn.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated
Comment thread rust/lance/src/io/exec/knn.rs Outdated
let row_id = row_ids
.as_ref()
.map(|row_ids| row_ids.value(row_index))
.unwrap_or(fallback_row_id + row_index as u64);
Contributor

I don't think this would happen

Contributor Author

_rowid fallback was removed and batch mode now requires _rowid

Comment thread rust/lance/benches/vector_index.rs Outdated
);
}

fn bench_batch_flat_knn(c: &mut Criterion) {
Contributor

can we port this to be in Python?

Contributor Author

it has been ported to: python/python/benchmarks/test_search.py:227

DataType::List(_) | DataType::FixedSizeList(_, _) => {
if !matches!(vector_type, DataType::List(_)) {
return Err(Error::invalid_input(format!(
"Query is multivector but column {}({})is not multivector",
Contributor

@BubbleCal BubbleCal May 18, 2026

Can you explain more how this distinguishes between multivector query and query batch?

Contributor Author

Batch-vs-multivector is distinguished by the vector column type: list-like q + List column means one multivector query; list-like q + FixedSizeList column means a batch of single-vector queries.

Contributor Author

Batch-vs-multivector is decided by the vector column type; comments explaining this were added in Scanner::nearest (lines 1467-1475).
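
The rule stated in this thread can be sketched as a small dispatch function. This is an illustration only, not the code in Scanner::nearest: the names ColumnKind, QueryMode, and classify_list_like_query are hypothetical, and the sketch covers only list-like (2-D) queries, since the thread does not describe other cases.

```python
from enum import Enum


class ColumnKind(Enum):
    LIST = "List"                      # multivector column
    FIXED_SIZE_LIST = "FixedSizeList"  # fixed-size single-vector column


class QueryMode(Enum):
    MULTIVECTOR = "one multivector query"
    BATCH = "batch of single-vector queries"


def classify_list_like_query(column):
    """Hypothetical helper: interpret a list-like query by the column type."""
    if column is ColumnKind.LIST:
        # List-like q + List column: one multivector query.
        return QueryMode.MULTIVECTOR
    if column is ColumnKind.FIXED_SIZE_LIST:
        # List-like q + FixedSizeList column: a batch of single-vector queries.
        return QueryMode.BATCH
    raise ValueError(f"unsupported vector column kind: {column!r}")


print(classify_list_like_query(ColumnKind.FIXED_SIZE_LIST).value)
```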

LeoReeYang and others added 2 commits May 18, 2026 19:08
Use the LanceDB-compatible query_index result column and move the batch flat KNN benchmark to Python so benchmark scaling can be reproduced from the binding API.

Co-authored-by: Cursor <cursoragent@cursor.com>
Apply rustfmt output expected by CI for the batch query binding change.

Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread python/python/benchmarks/test_search.py Outdated
@LeoReeYang LeoReeYang requested a review from BubbleCal May 18, 2026 14:09
Move batch flat KNN benchmark configuration into pytest parameters so review and reproduction do not rely on environment variables.

Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

@BubbleCal BubbleCal left a comment

  1. The distance_range param is lost if it's a batch query.
  2. This forces the query to be executed by flat KNN even if there's an index. We still need to use the index if there is one (just query the index for each query vector).

Please add tests verifying that both are really fixed.

LeoReeYang and others added 2 commits May 19, 2026 00:37
Route batched queries through vector indices when available and apply distance range bounds before per-query top-k selection on the flat path.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
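
The ordering mentioned in the commit message above (distance range bounds first, then per-query top-k) matters because the two steps do not commute. The pure-Python sketch below shows the intended order; range_then_top_k is a hypothetical name, and the half-open [lower, upper) bound convention is an assumption of this sketch rather than something stated in this thread.

```python
import heapq
from collections import defaultdict


def range_then_top_k(candidates, k, lower=None, upper=None):
    """Filter (query_index, row_id, distance) tuples by range, then keep top-k.

    Assumption: lower is inclusive and upper is exclusive. Results are grouped
    by query_index and sorted by ascending distance within each query.
    """
    per_query = defaultdict(list)
    for query_index, row_id, distance in candidates:
        if lower is not None and distance < lower:
            continue
        if upper is not None and distance >= upper:
            continue
        per_query[query_index].append((distance, row_id))
    results = []
    for query_index in sorted(per_query):
        for distance, row_id in heapq.nsmallest(k, per_query[query_index]):
            results.append((query_index, row_id, distance))
    return results


candidates = [
    (0, 10, 0.1), (0, 11, 0.2), (0, 12, 0.5), (0, 13, 0.9),
    (1, 10, 0.4), (1, 11, 2.0),
]
# Range first: query 0 still gets two rows (0.5 and 0.9).
print(range_then_top_k(candidates, k=2, lower=0.3, upper=1.0))
# Top-k first would keep only 0.1 and 0.2 for query 0, and the range filter
# would then drop both, leaving query 0 with no results.
```
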
@BubbleCal
Contributor

If the query combines:

  • fast_search=True
  • a batch query
  • no index

it is expected to return an empty result, and the schema should still contain the query_index column. Currently it does not.

In that case Lance runs a flat batch KNN query, returns up to ``k`` rows
for each query vector, and adds ``query_index`` to identify the source
query for each result row. Indexed/ANN batch search is not used in this
first implementation.
Contributor

This comment looks incorrect.

q: QueryVectorLike
The query vector.
The query vector. For fixed-size vector columns, this may be a 2-D
array-like batch of query vectors. Batch queries run flat KNN, apply
Contributor

and this

Labels

enhancement New feature or request python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support batch vector query API and shared flat KNN scan

2 participants