[Sync] Roll-up of internal changes for development change-over. by hildebrandmw · Pull Request #714 · microsoft/DiskANN

hildebrandmw · 2026-02-04T03:50:42Z

This is a squashed roll-up of the internal changes made since the last sync. Once this PR is merged, development of the open-source portion of the code will move to GitHub.

Copilot

Pull request overview

Rolls up internal changes prior to shifting open-source development to GitHub, including new MinMax multi-vector distance support, refactors across async providers and benchmarks, and the introduction of a shared diskann-benchmark-core crate.

Changes:

Add MinMax quantized multi-vector distance implementations (MaxSim/Chamfer) and expose new MinMax APIs.
Refactor async provider + caching/BfTree initialization to return ANNResult and update call sites.
Split benchmark infrastructure into a new shared diskann-benchmark-core crate and reorganize benchmark search/streaming code.
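
For readers unfamiliar with the MaxSim/Chamfer terms in the first bullet, here is a minimal sketch of one common convention, written against plain `f32` vectors rather than MinMax-quantized data. The function names `inner_product`, `max_sim`, and `chamfer` are hypothetical and are not taken from the `diskann-quantization` API. The choice of inner product as the similarity, and of Chamfer as the sum of per-query MaxSim scores, is an assumption that this PR page does not confirm.

```rust
/// Inner product of two equal-length vectors.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// For each query vector, the maximum inner product against any document vector.
/// (Assumed convention; an empty document yields negative infinity for every query vector.)
fn max_sim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> Vec<f32> {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| inner_product(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .collect()
}

/// Sum of the per-query-vector MaxSim scores (assumed Chamfer-style aggregate).
fn chamfer(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    max_sim(query, doc).iter().sum()
}
```

With a two-vector query `[[1, 0], [0, 1]]` and a three-vector document `[[2, 0], [0, 3], [1, 1]]`, the per-query maxima are 2 and 3, so the aggregate is 5. A quantized implementation would replace `inner_product` with a kernel over compressed codes, which this sketch does not attempt.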

Reviewed changes

Copilot reviewed 150 out of 219 changed files in this pull request and generated 7 comments.

File	Description
diskann-quantization/src/minmax/multi/max_sim.rs	New MinMax multi-vector MaxSim/Chamfer implementation and tests
diskann-quantization/src/minmax/mod.rs	Expose new MinMax modules/types (multi + recompress)
diskann-quantization/src/meta/vector.rs	Refactor canonical front mut constructor with new unsafe API
diskann-quantization/src/lib.rs	Export `multi_vector` module and add compile-fail tests path
diskann-quantization/src/error.rs	Add an internal `Infallible` error type
diskann-providers/src/utils/medoid.rs	Clean up test error handling without unwrap_err
diskann-providers/src/model/graph/provider/async_/table_delete_provider.rs	Fix test attribute ordering under `cfg(feature="bf_tree")`
diskann-providers/src/model/graph/provider/async_/postprocess.rs	Switch output population API to `extend`
diskann-providers/src/model/graph/provider/async_/inmem/provider.rs	Add `DefaultAccessor` impl for in-mem async provider
diskann-providers/src/model/graph/provider/async_/inmem/full_precision.rs	Switch output population API to `extend`
diskann-providers/src/model/graph/provider/async_/common.rs	Tighten vector store bounds to `bytemuck::Pod` and change zero-init
diskann-providers/src/model/graph/provider/async_/caching/utils.rs	Update cache construction to handle fallible `Cache::new`
diskann-providers/src/model/graph/provider/async_/caching/example.rs	Update example cache initialization to handle fallible `Cache::new`
diskann-providers/src/model/graph/provider/async_/caching/bf_cache.rs	Make cache construction fallible and map bf-tree config errors
diskann-providers/src/model/graph/provider/async_/bf_tree/vector_provider.rs	Make provider construction fallible and propagate config errors
diskann-providers/src/model/graph/provider/async_/bf_tree/quant_vector_provider.rs	Make provider construction fallible and propagate config errors
diskann-providers/src/model/graph/provider/async_/bf_tree/neighbor_provider.rs	Make provider construction fallible and propagate config errors
diskann-providers/src/model/graph/provider/async_/bf_tree/mod.rs	Introduce `ConfigError` wrapper and map into `ANNError`
diskann-providers/src/index/diskann_async.rs	Update diversity-search tests to use real `AttributeValueProvider`
diskann-label-filter/src/stores/bftree_store.rs	Adjust bf-tree store limits, add ConfigError conversion, add tests
diskann-disk/src/search/provider/disk_provider.rs	Switch output population API to `extend` and update diversity-search tests
diskann-disk/Cargo.toml	Workspace dependency normalization + formatting
diskann-benchmark/src/utils/tokio.rs	Remove multi-thread runtime helper (moved to benchmark-core)
diskann-benchmark/src/utils/streaming.rs	Extend `TagSlotManager` with capacity + counters; remove dynamic config struct
diskann-benchmark/src/utils/mod.rs	Add `SmallBanner` display helper
diskann-benchmark/src/utils/filters.rs	Generalize filter strategy setup input types + add bitmap adaptor helper
diskann-benchmark/src/inputs/async_.rs	Simplify `GraphSearch` config; switch runbook checks to benchmark-core BigANN
diskann-benchmark/src/backend/index/streaming/stats.rs	New streaming stats formatting + summary table
diskann-benchmark/src/backend/index/streaming/mod.rs	New streaming module wiring
diskann-benchmark/src/backend/index/streaming/managed.rs	New managed streaming layer (tag→slot mapping + maintenance triggering)
diskann-benchmark/src/backend/index/streaming/full_precision.rs	New full-precision streaming implementation using benchmark-core stages
diskann-benchmark/src/backend/index/spherical.rs	Switch to benchmark-core search infrastructure and new filter plumbing
diskann-benchmark/src/backend/index/search/range.rs	New range-search runner wrapper around benchmark-core
diskann-benchmark/src/backend/index/search/mod.rs	New search module layout
diskann-benchmark/src/backend/index/search/knn.rs	New knn-search runner wrapper around benchmark-core (KNN + MultiHop)
diskann-benchmark/src/backend/index/scalar.rs	Update imports to new build/search module layout
diskann-benchmark/src/backend/index/range_search.rs	Remove legacy range search implementation (moved to benchmark-core wrappers)
diskann-benchmark/src/backend/index/product.rs	Update imports to new build/search module layout
diskann-benchmark/src/backend/index/multihop_filtered_search.rs	Remove legacy multihop filtered search implementation (moved to benchmark-core)
diskann-benchmark/src/backend/index/mod.rs	Reorder/introduce new `search` + `streaming` modules
diskann-benchmark/src/backend/exhaustive/spherical.rs	Update recall computation call signature
diskann-benchmark/src/backend/exhaustive/product.rs	Update recall computation call signature
diskann-benchmark/src/backend/exhaustive/minmax.rs	Update recall computation call signature + formatting
diskann-benchmark/example/spherical-filter.json	Fix query predicate file name
diskann-benchmark/example/async-multihop-filter-ground-truth-small.json	Add `start_point_strategy` to example
diskann-benchmark/example/async-filter.json	Fix query predicate file name
diskann-benchmark/example/async-filter-ground-truth.json	Add `start_point_strategy` to example
diskann-benchmark/Cargo.toml	Add dependency on `diskann-benchmark-core`; add `itertools`; drop `serde_yaml`
diskann-benchmark-runner/src/ux.rs	New UX helpers for normalizing/stabilizing test output
diskann-benchmark-runner/src/utils/microseconds.rs	Add exported `timed!` macro
diskann-benchmark-runner/src/lib.rs	Expose `ux` module behind feature for tests/tools
diskann-benchmark-runner/src/app.rs	Use shared UX normalization/backtrace stripping in tests
diskann-benchmark-runner/src/any.rs	Add public `Any::new` and `Any::raw` APIs; update docs + tests
diskann-benchmark-runner/Cargo.toml	Add `ux-tools` feature
diskann-benchmark-core/tests/bigann-ux/unrecognized-operation/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/unrecognized-operation/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/unrecognized-operation/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/unrecognized-dataset-key/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/unrecognized-dataset-key/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/unrecognized-dataset-key/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/stage-key-not-number/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/stage-key-not-number/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/stage-key-not-number/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/stage-key-not-integer/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/stage-key-not-integer/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/stage-key-not-integer/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/runbook-not-yaml-map/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/runbook-not-yaml-map/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/runbook-does-not-exist/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/runbook-does-not-exist/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/replace-tags-start-equals-end/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/replace-tags-start-equals-end/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/replace-tags-start-equals-end/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/replace-invalid-tags-range/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/replace-invalid-tags-range/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/replace-invalid-tags-range/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/replace-invalid-ids-range/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/replace-invalid-ids-range/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/replace-invalid-ids-range/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/replace-ids-start-equals-end/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/replace-ids-start-equals-end/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/replace-ids-start-equals-end/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/non-integer-max-pts/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/non-integer-max-pts/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/non-integer-max-pts/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/missing-max-pts/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/missing-max-pts/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/missing-max-pts/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/insert-start-equals-end/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/insert-start-equals-end/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/insert-start-equals-end/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/insert-invalid-range/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/insert-invalid-range/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/insert-invalid-range/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/delete-invalid-range/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/delete-invalid-range/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/delete-invalid-range/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/dataset-value-not-mapping/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/dataset-value-not-mapping/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/dataset-value-not-mapping/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/dataset-not-found/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/dataset-not-found/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/dataset-not-found/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/tests/bigann-ux/dataset-key-not-integer-or-string/runbook.yaml	New BigANN UX test input
diskann-benchmark-core/tests/bigann-ux/dataset-key-not-integer-or-string/expected.txt	New BigANN UX expected output
diskann-benchmark-core/tests/bigann-ux/dataset-key-not-integer-or-string/dataset.txt	New BigANN UX dataset selector
diskann-benchmark-core/src/utils.rs	New shared helper for averaging metrics
diskann-benchmark-core/src/tokio.rs	New shared tokio runtime factory (moved from benchmark crate)
diskann-benchmark-core/src/streaming/mod.rs	New streaming API module definition
diskann-benchmark-core/src/streaming/graph/test.rs	Test helpers for streaming graph operations
diskann-benchmark-core/src/streaming/graph/mod.rs	Streaming graph building blocks module
diskann-benchmark-core/src/streaming/graph/inplace_delete.rs	Build stage to benchmark `inplace_delete`
diskann-benchmark-core/src/streaming/graph/drop_deleted.rs	Build stage to benchmark `drop_deleted_neighbors`
diskann-benchmark-core/src/streaming/executors/mod.rs	Executor module entry point
diskann-benchmark-core/src/streaming/executors/bigann/withdata.rs	BigANN runbook adaptor that maps ranges to data slices
diskann-benchmark-core/src/streaming/executors/bigann/mod.rs	BigANN executor module exports
diskann-benchmark-core/src/streaming/api.rs	New generic streaming API traits + AnyStream
diskann-benchmark-core/src/search/mod.rs	New search API module and documentation
diskann-benchmark-core/src/search/graph/range.rs	Range search runner + aggregation support
diskann-benchmark-core/src/search/graph/multihop.rs	Multi-hop filtered search runner
diskann-benchmark-core/src/search/graph/mod.rs	Graph search module exports + test provider
diskann-benchmark-core/src/search/graph/knn.rs	KNN search runner + aggregation support
diskann-benchmark-core/src/lib.rs	New crate root and UX test helper wiring
diskann-benchmark-core/src/internal/mod.rs	Internal benchmark-core module plumbing
diskann-benchmark-core/src/internal/buffer.rs	Internal SearchOutputBuffer abstraction to reduce monomorphization
diskann-benchmark-core/src/build/mod.rs	New build API module and documentation
diskann-benchmark-core/src/build/ids.rs	New ID mapping utilities for build stages
diskann-benchmark-core/src/build/graph/single.rs	Single-insert build stage
diskann-benchmark-core/src/build/graph/multi.rs	Multi-insert build stage
diskann-benchmark-core/src/build/graph/mod.rs	Build graph module exports
diskann-benchmark-core/Cargo.toml	New crate manifest (feature-gated BigANN YAML parsing)
Cargo.toml	Add benchmark-core to workspace; bump versions; update `bf-tree` dependency
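
Several rows above describe making constructors fallible and mapping a `ConfigError` into `ANNError`. The following is a minimal sketch of that general pattern in Rust, assuming simplified stand-in types: `ConfigError`, `AnnError`, `AnnResult`, `Cache`, and `validate_capacity` are all defined locally for illustration and do not reproduce the actual DiskANN definitions or signatures.

```rust
use std::fmt;

/// Stand-in for a configuration error reported by an underlying store (hypothetical).
#[derive(Debug, PartialEq)]
struct ConfigError(String);

/// Stand-in for a crate-level error type (hypothetical, greatly simplified).
#[derive(Debug, PartialEq)]
enum AnnError {
    Config(String),
}

impl fmt::Display for AnnError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AnnError::Config(msg) => write!(f, "configuration error: {msg}"),
        }
    }
}

impl std::error::Error for AnnError {}

/// The conversion that lets `?` turn a `ConfigError` into an `AnnError`.
impl From<ConfigError> for AnnError {
    fn from(e: ConfigError) -> Self {
        AnnError::Config(e.0)
    }
}

type AnnResult<T> = Result<T, AnnError>;

/// Hypothetical cache whose construction can fail on bad configuration.
#[derive(Debug)]
struct Cache {
    capacity_bytes: usize,
}

/// Hypothetical validation step standing in for the underlying store's config check.
fn validate_capacity(capacity_bytes: usize) -> Result<usize, ConfigError> {
    if capacity_bytes == 0 {
        Err(ConfigError("capacity must be non-zero".to_string()))
    } else {
        Ok(capacity_bytes)
    }
}

impl Cache {
    /// Returns an error instead of panicking when the configuration is invalid.
    fn new(capacity_bytes: usize) -> AnnResult<Self> {
        let capacity_bytes = validate_capacity(capacity_bytes)?;
        Ok(Cache { capacity_bytes })
    }
}
```

Call sites then either propagate the error with `?` or handle it explicitly, which is presumably what the "update call sites" rows refer to.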

diskann-quantization/src/meta/vector.rs

diskann-quantization/src/minmax/multi/max_sim.rs

diskann-providers/src/model/graph/provider/async_/common.rs

diskann-benchmark-runner/src/ux.rs

diskann-benchmark-core/src/streaming/api.rs

diskann-quantization/src/error.rs

Copilot · 2026-02-04T03:59:25Z

@hildebrandmw I've opened a new pull request, #715, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot · 2026-02-04T04:03:15Z

@hildebrandmw I've opened a new pull request, #716, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot · 2026-02-04T04:03:51Z

@hildebrandmw I've opened a new pull request, #717, to work on those changes. Once the pull request is ready, I'll request review from you.

codecov-commenter · 2026-02-04T04:15:36Z

Codecov Report

❌ Patch coverage is 87.26980% with 712 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.01%. Comparing base (528fd06) to head (bb765fe).

Files with missing lines	Patch %	Lines
...ann-benchmark/src/backend/index/streaming/stats.rs	0.00%	119 Missing ⚠️
diskann-benchmark/src/backend/index/benchmarks.rs	18.25%	103 Missing ⚠️
...mark/src/backend/index/streaming/full_precision.rs	0.00%	92 Missing ⚠️
...n-benchmark/src/backend/index/streaming/managed.rs	0.00%	79 Missing ⚠️
...ark-core/src/streaming/executors/bigann/parsing.rs	89.59%	46 Missing ⚠️
...iskann-benchmark/src/backend/index/search/range.rs	0.00%	46 Missing ⚠️
...rk-core/src/streaming/executors/bigann/withdata.rs	0.00%	34 Missing ⚠️
...ark-core/src/streaming/executors/bigann/runbook.rs	94.08%	21 Missing ⚠️
diskann-benchmark/src/backend/index/result.rs	54.76%	19 Missing ⚠️
diskann-benchmark-core/src/build/api.rs	97.46%	16 Missing ⚠️
... and 26 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #714      +/-   ##
==========================================
+ Coverage   86.70%   89.01%   +2.30%     
==========================================
  Files         398      428      +30     
  Lines       74238    78151    +3913     
==========================================
+ Hits        64371    69566    +5195     
+ Misses       9867     8585    -1282

Files with missing lines	Coverage Δ
diskann-benchmark-core/src/build/graph/single.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-core/src/search/graph/mod.rs	`100.00% <100.00%> (ø)`
...nchmark-core/src/streaming/graph/inplace_delete.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-core/src/streaming/graph/test.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-core/src/tokio.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-core/src/utils.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-runner/src/any.rs	`100.00% <100.00%> (ø)`
diskann-benchmark-runner/src/app.rs	`78.03% <100.00%> (-1.55%)`	⬇️
diskann-benchmark-runner/src/utils/microseconds.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/backend/exhaustive/minmax.rs	`100.00% <ø> (ø)`
... and 71 more

... and 5 files with indirect coverage changes

Roll-up of internal changes.

3154e97

Copilot AI review requested due to automatic review settings February 4, 2026 03:50

Bump Rust version in CI.

bb765fe

Copilot AI reviewed Feb 4, 2026

suri-kumkaran approved these changes Feb 4, 2026

Copilot AI mentioned this pull request Feb 4, 2026

Fix spelling errors in streaming API documentation #715 (Open)

This was referenced Feb 4, 2026

Fix spelling errors in streaming API documentation #716 (Closed)

Fix doc comment: Windows line endings are \r\n not \n\r #717 (Open)

arkrishn94 approved these changes Feb 4, 2026

hildebrandmw requested a review from harsha-simhadri February 4, 2026 17:14

harsha-simhadri approved these changes Feb 4, 2026

arrayka approved these changes Feb 4, 2026

hildebrandmw merged commit d9d6ce2 into main Feb 4, 2026
9 checks passed

hildebrandmw deleted the mhildebr/sync-squashed branch February 4, 2026 17:26

hildebrandmw merged 2 commits into main from mhildebr/sync-squashed

7 participants
