Describe the bug
While performing filtered ANNS on the WIKI-1M dataset with uniform query labels (i.e., every query carries the same single label) to evaluate query performance at a fixed specificity level, the system returns identical neighbor lists for every query. As a result, recall collapses to nearly 0. This behavior is observed with both the IVF-Graph (high-specificity) and IVF-BFS (low-specificity) indexes.
Steps/Code to reproduce bug
- Load the WIKI-1M dataset.
- Generate a query_labels.txt where each line is identical (e.g., lbls[0] = lbls[1] = ... = lbls[n-1] = [0]).
- Run the search using the following configuration (change spec_threshold to 2000 to evaluate the IVF-BFS query for label 3079):
{
"data_dir": "/data/ann/wiki_1M/",
"data_fname": "base.fbin",
"query_fname": "query.fbin",
"data_label_fname": "base_labels.txt",
"query_label_fname": "query_labels1.txt",
"itopk_size": 32,
"spec_threshold": 1500,
"graph_degree": 32,
"topk": 10,
"num_runs": 1000,
"warmup_runs": 10,
"force_rebuild": true,
"ivf_graph_fname": "ivf_graph.bin",
"ivf_bfs_fname": "ivf_bfs.bin",
"ground_truth_fname": "ground_truth_k10.ibin"
}
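A minimal sketch of how the uniform label file from the second step could be generated. It assumes the loader expects one line per query holding a single integer label; the helper name write_uniform_labels is hypothetical, and the query count must match the number of rows in query.fbin.

```python
from pathlib import Path


def write_uniform_labels(path, num_queries, label):
    """Write `num_queries` identical lines, each holding the same label id."""
    # Assumed format: one query per line, label written as a plain integer.
    lines = [str(label)] * num_queries
    Path(path).write_text("\n".join(lines) + "\n")
    return lines
```

For example, write_uniform_labels("query_labels.txt", num_queries=1000, label=0) would produce the label-0 case, where 1000 is a placeholder for the real query count.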
Environment details
- Dataset: WIKI-1M
- CPU: AMD Ryzen 7 5700G
- GPU: NVIDIA RTX 2080 Ti
- OS: Ubuntu 22.04 LTS
- NVIDIA Driver: 12.2
- CUDA Compiler: 12.9.86
- Host Compiler: g++ 11.4.0
Observed behavior
Host-side validation confirms that the first two query vectors are distinct ($L_2$ distance $\approx 0.62$), yet the returned neighbor lists are identical for all queries (nbhs[0] = nbhs[1] = ... = nbhs[n-1]).
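The distinctness check can be reproduced on the host with a sketch like the following. It assumes the common .fbin layout (two little-endian int32 values for row count and dimension, followed by row-major float32 data), which this report does not confirm; read_fbin_rows and l2_distance are illustrative helpers, not part of the project.

```python
import math
import struct


def read_fbin_rows(path, num_rows):
    """Read the first `num_rows` vectors from an .fbin file.

    Assumed layout: int32 row count, int32 dimension, then float32 rows.
    """
    with open(path, "rb") as f:
        total, dim = struct.unpack("<ii", f.read(8))
        rows = []
        for _ in range(min(num_rows, total)):
            rows.append(struct.unpack(f"<{dim}f", f.read(4 * dim)))
    return rows


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Calling l2_distance on the two rows returned by read_fbin_rows("/data/ann/wiki_1M/query.fbin", 2) gives the distance between the first two queries; any value clearly above 0 means the inputs differ, so identical outputs cannot be explained by duplicated query vectors.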
Cross-path confirmation:
- IVF-Graph: Labels 0 and 1 (spec=15.33% and 13.05%), specificity > threshold: FAILED
- IVF-BFS: Label 3079 (spec=0.20%), specificity < threshold: FAILED
Supporting evidence
IVF-Graph (WIKI-1M label 1, spec=13.05%) query results snippet:
IVF-Graph Index Stats:
Total vectors: 980312
Number of labels: 3814
Graph size: [21768457 × 32]
Graph degree: 32
IVF-BFS Index Stats:
Number of labels: 186
Number of rows: 269772
QPS: 380420.79
Recall: 0.0000
=== Search Results Monitor (first 3 queries) ===
Query 0 (label=0):
neighbors: 541004 316361 314017 331946 604415 448044 539479 291316 344698 103825
gt: 370208 251743 484555 597579 386190 368896 860059 781968 802148 401846
recall@10: 0/10
Query 1 (label=0):
neighbors: 541004 316361 314017 331946 604415 448044 539479 291316 344698 103825
gt: 341406 160363 370208 251743 484555 874429 712954 785765 457944 517840
recall@10: 0/10
Query 2 (label=0):
neighbors: 541004 316361 314017 331946 604415 448044 539479 291316 344698 103825
gt: 549821 533776 915603 251743 484555 597579 573700 882094 276817 331536
recall@10: 0/10
IVF-BFS (WIKI-1M label 3079 spec=0.20%) query results snippet:
IVF-Graph Index Stats:
Total vectors: 980312
Number of labels: 2976
Graph size: [20314630 × 32]
Graph degree: 32
IVF-BFS Index Stats:
Number of labels: 1024
Number of rows: 1723599
QPS: 302732.20
Recall: 0.0023
=== Search Results Monitor (first 3 queries) ===
Query 0 (label=3079):
neighbors: 430417 279428 926084 439948 62295 368557 266782 439498 559943 547068
gt: 430417 279428 926084 439948 62295 368557 266782 439498 559943 547068
recall@10: 10/10
Query 1 (label=3079):
neighbors: 430417 279428 926084 439948 62295 368557 266782 439498 559943 547068
gt: 751135 773048 427405 742655 944674 345688 398253 943112 272594 565167
recall@10: 0/10
Query 2 (label=3079):
neighbors: 430417 279428 926084 439948 62295 368557 266782 439498 559943 547068
gt: 226262 778697 417094 104588 281525 47437 862004 267955 913929 809383
recall@10: 0/10
Additional context
For comparison, I also generated synthetic labels following a Zipfian distribution for the SIFT-1M dataset and ran the same uniform filtered query with label 1 (spec=75%). This yielded normal results.
The IVF-Graph results snippet:
IVF-Graph Index Stats:
Total vectors: 1000000
Number of labels: 51
Graph size: [3385661 × 32]
Graph degree: 32
IVF-BFS Index Stats:
Number of labels: 0
Number of rows: 0
QPS: 1068508.05
Recall: 0.9292
=== Search Results Monitor (first 3 queries) ===
Query 0 (label=1):
neighbors: 932085 934876 561813 695756 701258 455537 562594 908244 600499 893601
gt: 932085 934876 561813 708177 706771 695756 435345 701258 455537 562594
recall@10: 7/10
Query 1 (label=1):
neighbors: 413071 880592 249062 400194 942339 880462 941776 586780 248426 849742
gt: 413071 706838 880592 249062 400194 942339 880462 941776 420802 586780
recall@10: 8/10
Query 2 (label=1):
neighbors: 408764 408462 861882 406273 406324 551743 861530 402106 239766 823095
gt: 408764 408462 861882 406273 406324 551743 861530 402106 239766 823095
recall@10: 10/10
Interestingly, this does NOT occur on SIFT-1M (128D), even with uniform labels.