feat: support batch flat vector queries by LeoReeYang · Pull Request #6828 · lance-format/lance

LeoReeYang · 2026-05-18T08:38:32Z

Summary

Extends the existing Scanner::nearest API to accept batched query vectors for fixed-size vector columns.
Implements flat/refine batch KNN in KNNVectorDistanceExec, returning a single stream of up to m * k rows with a query_index column that identifies which query each row belongs to.
Keeps batch KNN on the flat path for now; ANN batch search can be added separately.
Adds a Python benchmark for the binding-level batch KNN API with row count, dimensionality, batch size, rounds, and query count modeled as pytest parameters.

Closes #6821.
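
To make the result shape described in the summary concrete, here is a minimal pure-Python sketch of a flat batch KNN: one shared pass over the rows, one bounded heap per query, and output rows tagged with query_index. It is an illustration only, not the code in KNNVectorDistanceExec. The function names, the squared-L2 metric, and the tie-breaking by row id are assumptions made for this example.

```python
import heapq
from typing import List, Sequence, Tuple


def l2_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def batch_flat_knn(
    rows: Sequence[Sequence[float]],
    queries: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, int, float]]:
    """Return (query_index, row_id, distance) tuples, at most k per query."""
    if k <= 0:
        return []
    # One bounded heap per query. Values are negated so that heap[0] is the
    # worst candidate kept so far (largest distance, then largest row id).
    heaps: List[List[Tuple[float, int]]] = [[] for _ in queries]
    for row_id, vector in enumerate(rows):  # single shared pass over the data
        for query_index, query in enumerate(queries):
            item = (-l2_squared(vector, query), -row_id)
            heap = heaps[query_index]
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                heapq.heapreplace(heap, item)
    results: List[Tuple[int, int, float]] = []
    for query_index, heap in enumerate(heaps):
        # Sorting in reverse puts the smallest distance first for each query.
        for neg_distance, neg_row_id in sorted(heap, reverse=True):
            results.append((query_index, -neg_row_id, -neg_distance))
    return results
```

With m queries the output never exceeds m * k rows, and grouping by query_index recovers each query's own top-k list.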

Benchmark

Python benchmark command:

uv run --extra benchmarks pytest python/benchmarks/test_search.py::test_batch_flat_knn

Dataset size, dimensionality, query count, batch size, and rounds are declared in the benchmark's @pytest.mark.parametrize values. Adjust those parameters in python/benchmarks/test_search.py to reproduce the scaling rows below.

Dataset: random float32 vectors written to a real local .lance dataset. The benchmark does not use a memory:// dataset or throttled/simulated object-store latency. OS page cache effects are accepted.

Query Count Scaling

Fixed dataset: 1,000,000 rows, dim=512, k=10. This is about 1.9 GiB of raw vector values.

rows	dim	query count (`m`)	separate queries mean	batch query mean	time saved	speedup
1,000,000	512	2	224.19 ms	180.68 ms	43.51 ms	1.24x
1,000,000	512	5	573.84 ms	310.23 ms	263.61 ms	1.85x
1,000,000	512	10	1.1241 s	524.05 ms	600.05 ms	2.15x

This shows the expected trend that batching becomes more valuable as m increases: the shared scan/decode work is amortized over more query vectors.
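
The "time saved" and "speedup" columns follow directly from the two means. The short check below recomputes them from the figures in the table above; the helper name is made up for this example.

```python
from typing import Tuple


def summarize(separate_ms: float, batch_ms: float) -> Tuple[float, float]:
    """Return (time saved in ms, speedup factor), both rounded to 2 decimals."""
    saved = round(separate_ms - batch_ms, 2)
    speedup = round(separate_ms / batch_ms, 2)
    return saved, speedup


# (m, separate mean in ms, batch mean in ms) copied from the table above.
query_count_rows = [
    (2, 224.19, 180.68),
    (5, 573.84, 310.23),
    (10, 1124.1, 524.05),
]

for m, separate_ms, batch_ms in query_count_rows:
    saved, speedup = summarize(separate_ms, batch_ms)
    print(f"m={m}: saved {saved} ms, speedup {speedup}x")
```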

Dataset Size Scaling

Fixed query count: m=10, dim=512, k=10.

rows	raw vector size	query count (`m`)	separate queries mean	batch query mean	time saved	speedup
100,000	~0.19 GiB	10	121.74 ms	50.318 ms	71.42 ms	2.42x
500,000	~0.95 GiB	10	579.07 ms	261.21 ms	317.86 ms	2.22x
1,000,000	~1.91 GiB	10	1.1241 s	524.05 ms	600.05 ms	2.15x

On local disk with OS page cache, relative speedup does not grow with row count; it drifts down from 2.42x to 2.15x because both plans become increasingly dominated by the same cached vector decoding and distance-compute work. The robust trend in this setup is absolute time saved, which grows from ~71 ms to ~600 ms as dataset size grows.
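
The "raw vector size" column is plain arithmetic: rows times dimensions times 4 bytes per float32 value, expressed in GiB. A quick check, with a helper name invented for this example:

```python
def raw_vector_gib(rows: int, dim: int, bytes_per_value: int = 4) -> float:
    """Size of the raw vector values in GiB (float32 takes 4 bytes per value)."""
    return rows * dim * bytes_per_value / 2**30


for rows in (100_000, 500_000, 1_000_000):
    print(f"{rows:>9,} rows x 512 dims: {raw_vector_gib(rows, 512):.2f} GiB")
```

This counts only the vector values themselves and ignores file metadata and any other columns.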

Test plan

cargo test -p lance test_batch_knn_flat_results_include_query_index
cargo clippy -p lance --tests --benches -- -D warnings
ALL_FEATURES=... cargo clippy --profile ci --locked --features ${ALL_FEATURES} --tests -- -D warnings
uv run pytest python/tests/test_vector_index.py::test_batch_flat_query_matches_repeated_single_queries
uv run --extra benchmarks pytest --collect-only python/benchmarks/test_search.py::test_batch_flat_knn
cargo fmt --all -- --check
uv run ruff format --check --diff python/benchmarks/test_search.py python/lance/dataset.py python/tests/test_vector_index.py && uv run ruff check python/benchmarks/test_search.py python/lance/dataset.py python/tests/test_vector_index.py

Add a flat KNN batch query path so callers can submit multiple query vectors and share scan work while preserving per-query top-k results. Co-authored-by: Cursor <cursoragent@cursor.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

github-actions · 2026-05-18T08:38:50Z

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

codecov · 2026-05-18T09:13:06Z

Codecov Report

❌ Patch coverage is 86.87616% with 71 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
rust/lance/src/io/exec/knn.rs	79.51%	40 Missing and 11 partials ⚠️
rust/lance/src/dataset/scanner.rs	93.15%	10 Missing and 10 partials ⚠️

Fold batch flat KNN into the existing nearest and KNN execution paths so the public API and plan nodes stay consistent with reviewer feedback. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang · 2026-05-18T10:01:32Z

Updated based on review feedback:

Removed the separate nearest_batch API; batched fixed-size vector queries now go through Scanner::nearest.
Removed the separate KNNBatchVectorDistanceExec type; batch flat KNN is handled inside KNNVectorDistanceExec.
Restored simple fast_search / use_index setters and route batch nearest queries to the flat path internally.
Updated the benchmark to create a real local disk .lance dataset instead of using memory:// or throttled simulated latency.

Local disk benchmark result for 8 queries, 50k rows, dim=4: separate mean 3.8834 ms vs batch mean 3.3045 ms, about 1.17x speedup. The gain is modest on local disk because the repeated reads are served from OS page cache.

Use a larger local-disk dataset and stream benchmark data generation so batch query gains are measured under a more realistic scan workload. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang · 2026-05-18T10:16:49Z

Updated the benchmark scale per feedback:

Dataset is now 1,000,000 rows x 512 dimensions, with 10 batched query vectors and k=10.
The benchmark writes a real local .lance dataset under the system temp directory, about 1.9 GiB of raw vector values.
Data generation now streams batches into Dataset::write instead of materializing the full benchmark dataset in memory first.
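
The streaming idea in the last point can be sketched with a plain generator: produce one batch at a time so peak memory is bounded by the batch size rather than the full dataset. This is a stdlib-only illustration; the function name, the parameters, and the use of Python's random module are assumptions, and the real benchmark feeds its batches to the Lance writer rather than a Python list.

```python
import random
from typing import Iterator, List


def generate_vector_batches(
    total_rows: int, dim: int, batch_rows: int, seed: int = 0
) -> Iterator[List[List[float]]]:
    """Yield batches of random vectors so only one batch is held at a time."""
    if batch_rows <= 0:
        raise ValueError("batch_rows must be positive")
    rng = random.Random(seed)
    remaining = total_rows
    while remaining > 0:
        size = min(batch_rows, remaining)
        # Each batch is built lazily, only when the consumer asks for it.
        yield [[rng.random() for _ in range(dim)] for _ in range(size)]
        remaining -= size
```

A writer that accepts an iterator of batches can consume this directly; the final batch is simply shorter when total_rows is not a multiple of batch_rows.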

Local result with OS cache accepted:

separate_queries/10: mean 1.1166 s
batch_query/10: mean 508.05 ms
Speedup: about 2.20x

This is meaningfully higher than the previous small-data local run (~1.17x), which matches the expectation that larger scan workloads show more benefit from sharing read/decode work across queries.

Allow the local-disk batch KNN benchmark to vary row count, dimensionality, and query count so PR results can show scaling trends. Co-authored-by: Cursor <cursoragent@cursor.com>

LeoReeYang · 2026-05-18T10:32:00Z

Added a controlled benchmark matrix to make the trend clearer.

Query-count scaling at 1M rows x 512d:

m	separate mean	batch mean	saved	speedup
2	224.19 ms	180.68 ms	43.51 ms	1.24x
5	573.84 ms	310.23 ms	263.61 ms	1.85x
10	1.1241 s	524.05 ms	600.05 ms	2.15x

Dataset-size scaling at m=10, 512d:

rows	raw vector size	separate mean	batch mean	saved	speedup
100k	~0.19 GiB	121.74 ms	50.318 ms	71.42 ms	2.42x
500k	~0.95 GiB	579.07 ms	261.21 ms	317.86 ms	2.22x
1M	~1.91 GiB	1.1241 s	524.05 ms	600.05 ms	2.15x

So the relative speedup clearly increases with m. For dataset size, the absolute time saved grows from ~71 ms to ~600 ms while relative speedup stays above 2x on local disk with OS cache effects accepted. The benchmark is now parameterized with env vars so these rows can be reproduced without editing source.

BubbleCal · 2026-05-18T10:38:55Z

+                    let row_id = row_ids
+                        .as_ref()
+                        .map(|row_ids| row_ids.value(row_index))
+                        .unwrap_or(fallback_row_id + row_index as u64);


I don't think this would happen

_rowid fallback was removed and batch mode now requires _rowid

BubbleCal · 2026-05-18T10:39:58Z

    );
 }

+fn bench_batch_flat_knn(c: &mut Criterion) {


can we port this to be in Python?

it has been ported to: python/python/benchmarks/test_search.py:227

BubbleCal · 2026-05-18T10:40:30Z

            DataType::List(_) | DataType::FixedSizeList(_, _) => {
-                if !matches!(vector_type, DataType::List(_)) {
-                    return Err(Error::invalid_input(format!(
-                        "Query is multivector but column {}({})is not multivector",


Can you explain more how this distinguishes between multivector query and query batch?

Batch-vs-multivector is distinguished by the vector column type: list-like q + List column means one multivector query; list-like q + FixedSizeList column means a batch of single-vector queries.

batch-vs-multivector is decided by vector column type with comments added in Scanner::nearest lines 1467-1475
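
The rule stated in the replies above can be written as a small decision function. This mirrors only the rule as described in this thread, not the actual code in Scanner::nearest; the enum, the function name, and the use of plain strings for column types are assumptions for illustration.

```python
from enum import Enum


class QueryKind(Enum):
    SINGLE = "single"
    MULTIVECTOR = "multivector"
    BATCH = "batch"


def classify_query(query_is_list_like: bool, column_type: str) -> QueryKind:
    """Classify a query by the vector column's type, per the rule above."""
    if not query_is_list_like:
        # A flat 1-D query vector is an ordinary single-vector query.
        return QueryKind.SINGLE
    if column_type == "List":
        # List column: the list-like query is one multivector query.
        return QueryKind.MULTIVECTOR
    if column_type == "FixedSizeList":
        # FixedSizeList column: the list-like query is a batch of queries.
        return QueryKind.BATCH
    raise ValueError(f"unsupported vector column type: {column_type}")
```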

Use the LanceDB-compatible query_index result column and move the batch flat KNN benchmark to Python so benchmark scaling can be reproduced from the binding API. Co-authored-by: Cursor <cursoragent@cursor.com>

Apply rustfmt output expected by CI for the batch query binding change. Co-authored-by: Cursor <cursoragent@cursor.com>

Move batch flat KNN benchmark configuration into pytest parameters so review and reproduction do not rely on environment variables. Co-authored-by: Cursor <cursoragent@cursor.com>

BubbleCal

The distance_range param is lost if it's a batch query.
This forces the query to be executed by flat KNN even if there's an index. We still need to use the index if there is one (just query the index for each query vector).

Please add tests verifying that these are really fixed.
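
The suggestion to "just query the index for each query vector" amounts to a loop that runs a single-query search per vector and tags each hit with its query_index. A hedged sketch follows, in which search_one is a hypothetical stand-in for whatever single-vector index search is available; none of these names come from the PR.

```python
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[int, float]  # (row_id, distance)


def batch_search_via_index(
    search_one: Callable[[Sequence[float], int], List[Hit]],
    queries: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, int, float]]:
    """Run one index search per query and tag every hit with query_index."""
    results: List[Tuple[int, int, float]] = []
    for query_index, query in enumerate(queries):
        for row_id, distance in search_one(query, k):
            results.append((query_index, row_id, distance))
    return results
```

This keeps the output shape identical to the flat batch path, so callers do not need to know which path served the query.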

Route batched queries through vector indices when available and apply distance range bounds before per-query top-k selection on the flat path. Co-authored-by: Cursor <cursoragent@cursor.com>
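
Why the order matters: applying the distance range after top-k selection can discard every kept row, even though in-range rows exist further out. The sketch below contrasts the two orders. The function names are invented, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption of this example rather than something stated in this PR.

```python
import heapq
from typing import List, Optional, Sequence, Tuple

Hit = Tuple[int, float]  # (row_id, distance)


def in_range(distance: float, lower: Optional[float], upper: Optional[float]) -> bool:
    """Assumed semantics: lower bound inclusive, upper bound exclusive."""
    return (lower is None or distance >= lower) and (upper is None or distance < upper)


def range_then_top_k(
    hits: Sequence[Hit], k: int,
    lower: Optional[float] = None, upper: Optional[float] = None,
) -> List[Hit]:
    """Filter by distance range first, then keep the k nearest survivors."""
    kept = [h for h in hits if in_range(h[1], lower, upper)]
    return heapq.nsmallest(k, kept, key=lambda h: (h[1], h[0]))


def top_k_then_range(
    hits: Sequence[Hit], k: int,
    lower: Optional[float] = None, upper: Optional[float] = None,
) -> List[Hit]:
    """The problematic order, shown only for contrast."""
    nearest = heapq.nsmallest(k, hits, key=lambda h: (h[1], h[0]))
    return [h for h in nearest if in_range(h[1], lower, upper)]


hits = [(0, 0.1), (1, 0.2), (2, 0.5), (3, 0.7)]
print(range_then_top_k(hits, 2, lower=0.3))
print(top_k_then_range(hits, 2, lower=0.3))
```

In a batch query the same filter-then-select step would run independently for each query vector's candidates.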

Co-authored-by: Cursor <cursoragent@cursor.com>

BubbleCal · 2026-05-19T07:04:45Z

If the query has:

fast_search=True
batch query
no index

it's expected to return an empty result, but the schema should still contain the query_index column. Currently it doesn't.
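
One way to read this requirement: the output columns should be decided from the shape of the query, not from whether any rows were produced. A toy sketch under that reading, using a dict of empty lists in place of a real result table; the function name and the dict representation are assumptions for illustration.

```python
from typing import Dict, List, Sequence


def empty_knn_result(projected_columns: Sequence[str], is_batch: bool) -> Dict[str, List]:
    """Build an empty result whose columns depend only on the query shape."""
    columns = list(projected_columns) + ["_distance"]
    if is_batch:
        # Batch queries always expose query_index, even with zero rows.
        columns.append("query_index")
    return {name: [] for name in columns}
```

A test for the reported case could then assert that query_index is among the result's column names while the row count is zero.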

BubbleCal · 2026-05-19T07:05:32Z

+            In that case Lance runs a flat batch KNN query, returns up to ``k`` rows
+            for each query vector, and adds ``query_index`` to identify the source
+            query for each result row. Indexed/ANN batch search is not used in this
+            first implementation.


This comment doesn't look correct.

BubbleCal · 2026-05-19T07:05:51Z

    q: QueryVectorLike
-        The query vector.
+        The query vector. For fixed-size vector columns, this may be a 2-D
+        array-like batch of query vectors. Batch queries run flat KNN, apply


feat: support batch flat vector queries

d8cc15a

claude Bot reviewed May 18, 2026

github-actions Bot added the python label May 18, 2026

BubbleCal requested changes May 18, 2026

Comment thread rust/lance/src/dataset/scanner.rs Outdated

Comment thread rust/lance/src/dataset/scanner.rs Outdated

Comment thread rust/lance/src/dataset/scanner.rs Outdated

Comment thread rust/lance/src/io/exec/knn.rs Outdated

refactor: align batch vector query with nearest API

22b315d

LeoReeYang changed the title ~~Support batch flat vector queries~~ feat: support batch flat vector queries May 18, 2026

github-actions Bot added the enhancement New feature or request label May 18, 2026

bench: scale batch flat vector query benchmark

c18a29c

bench: parameterize batch vector query benchmark

e47c334

BubbleCal reviewed May 18, 2026

LeoReeYang and others added 2 commits May 18, 2026 19:08

fix: align batch query review feedback

c2fb8bf

fix: format python dataset binding

628c3d2

BubbleCal reviewed May 18, 2026

Comment thread python/python/benchmarks/test_search.py Outdated

LeoReeYang requested a review from BubbleCal May 18, 2026 14:09

bench: use pytest params for batch knn benchmark

0de0654

BubbleCal requested changes May 18, 2026

LeoReeYang and others added 2 commits May 19, 2026 00:37

fix: respect batch vector query parameters

d76f91d

test: assert indexed batch KNN matches single-query distance_range

ab781ca

BubbleCal requested changes May 19, 2026

2 participants
