NeurIPS'23 Competition Track: Big-ANN

Supported by

New: the latest ongoing leaderboard has been released (March 1st, 2024).
Top entries:

| Rank | Filter track: Algorithm | QPS@90% recall | OOD track: Algorithm | QPS@90% recall | Sparse track: Algorithm | QPS@90% recall |
|---|---|---|---|---|---|---|
| 1 | Pinecone-filter | 85,491 | Pinecone-ood | 38,088 | Zilliz | 10,749 |
| 2 | Zilliz | 84,596 | Zilliz | 33,241 | Pinecone_smips | 10,440 |
| 3 | ParlayANN IVF2 | 37,902 | RoarANN | 22,555 | PyANNS | 8,732 |
| 4 | Puck | 19,193 | PyANNS | 22,296 | shnsw | 7,137 |
| ... | ... | ... | ... | ... | ... | ... |
| Baseline | FAISS | 3,032 | Diskann | 4,133 | Linscan | 93 |

Note: entries by Pinecone and Zilliz are not open source.

This challenge aims to encourage the development of indexing data structures and search algorithms for practical variants of the Approximate Nearest Neighbor (ANN), or vector search, problem. These variants are increasingly relevant as vector search becomes commonplace. The challenge has four tracks covering sparse, filtered, out-of-distribution and streaming variants of ANN search (ANNS). These variants require adapted search algorithms and strategies with different tradeoffs. Participants are encouraged to develop and submit new algorithms that improve on the baselines for these variants. To keep the competition accessible, the scale of the datasets is limited to about 10 million points.

Tracks: Datasets, Metrics and Baselines

The evaluation hardware is normalized to Azure Standard D8lds v5 (8 vCPUs and 16GB DRAM). The index build time on this machine is limited to 12 hours, except for the streaming index, which has stricter time limits.

The challenge consists of 4 tracks with separate leaderboards and participants can choose to submit entries to one or more tracks:

  • Filtered Search: This task will use a random 10M slice of the YFCC 100M dataset transformed with CLIP embeddings. In addition, we associate with each image a "bag" of tags: words extracted from the description, the camera model, the year the picture was taken and the country. The tags are drawn from a vocabulary of 200,386 possible tags. Each of the 100,000 queries consists of one image embedding and one or two tags, all of which must appear in a database element's bag for that element to be considered.
  • Out-Of-Distribution: This task will use the Yandex Text-to-Image 10M cross-modal dataset, in which the database and query vectors have different distributions in the shared vector space. The base set is a 10M subset of the Yandex visual search database of 200-dimensional image embeddings, which are produced with the Se-ResNext-101 model. The query embeddings correspond to user-specified textual search queries. The text embeddings are extracted with a variant of the DSSM model.
  • Sparse: This task is based on the common MSMARCO passage retrieval dataset, which has 8,841,823 text passages, encoded into sparse vectors using the SPLADE model. The vectors have a large dimension (about 30,000), but each vector in the base dataset has an average of approximately 120 nonzero elements. The query set contains 6,980 text queries, embedded by the same SPLADE model. The average number of nonzero elements in the query set is approximately 49 (since text queries are generally shorter). Given a sparse query vector, the index should return the top-k results according to the maximal inner product between the vectors.
  • Streaming Search: This task uses a 30M slice of the MS Turing dataset released in the previous challenge. The index starts with zero points and must implement the provided "runbook" -- a sequence of insertion, deletion, and search operations (in a roughly 4:4:1 ratio) -- within a time bound of 1 hour and 8GB of DRAM. The intention is for the algorithm to process the operations and maintain a compact index over the active points, rather than indexing the entire anticipated set of points and using tombstones or flags to mark active elements. More details to come. The runbook is provided in `final_runbook.yaml` which is generated with ``.
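
As a rough illustration of the streaming setting described above, the sketch below replays a runbook against a toy exact index that keeps only the active points. The step layout (a list of dicts with `operation`, `start` and `end` keys), the `ActiveSetIndex` class and the `replay` function are assumptions made for this example; they are not the format of `final_runbook.yaml` or the API of the evaluation framework.

```python
import numpy as np


class ActiveSetIndex:
    """Toy exact index (hypothetical) that stores only the currently active points."""

    def __init__(self):
        self.vectors = {}  # point id -> vector; deleted ids are removed outright

    def insert(self, ids, data):
        for i in ids:
            self.vectors[i] = data[i]

    def delete(self, ids):
        for i in ids:
            # Drop the point entirely instead of marking it with a tombstone.
            self.vectors.pop(i, None)

    def search(self, query, k):
        ids = list(self.vectors)
        if not ids:
            return []
        mat = np.stack([self.vectors[i] for i in ids]).astype(np.float32)
        dists = np.sum((mat - np.asarray(query, dtype=np.float32)) ** 2, axis=1)
        order = np.argsort(dists, kind="stable")[:k]
        return [ids[j] for j in order]


def replay(runbook, data, queries, k=10):
    """Apply insert/delete/search steps in order; collect results of each search step."""
    index = ActiveSetIndex()
    results = []
    for step in runbook:
        op = step["operation"]  # assumed step layout, for illustration only
        if op == "insert":
            index.insert(range(step["start"], step["end"]), data)
        elif op == "delete":
            index.delete(range(step["start"], step["end"]))
        elif op == "search":
            results.append([index.search(q, k) for q in queries])
        else:
            raise ValueError(f"unknown operation: {op}")
    return results, index
```

A real entry would replace the brute-force scan with an updatable ANN structure; the point of the sketch is only that memory tracks the active set, since deleted points are dropped rather than flagged.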

| Track | Dataset | Dimensions | Data type | Baseline algo | QPS @ 90% recall | Release terms |
|---|---|---|---|---|---|---|
| Filtered | YFCC-10M + CLIP | 192 | uint8 | filter-FAISS | 3200 | CC BY 4.0 |
| OOD | Text2Image-10M | 200 | float32 | diskann | 4882 | CC BY 4.0 |
| Sparse | MS MARCO / SPLADE | ~30K | float32, sparse format | Linscan | 101 | MS-MARCO: Free NC |
| Streaming | MSTuring-30M-clustered | 100 | float32 | fresh-diskann | 0.883 recall@10 (45mins) | O-UDA |

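
For readers unfamiliar with the metrics in the table, here is a minimal sketch of recall@k and of picking a "QPS at 90% recall" figure from several runs. It assumes that the reported number is the best QPS among configurations reaching at least the recall threshold, and it ignores details such as distance ties; the official computation lives in the evaluation framework and may differ. The function names are hypothetical.

```python
def recall_at_k(results, ground_truth, k=10):
    """Average fraction of the true top-k neighbors present in the returned top-k."""
    hits = 0
    for found, truth in zip(results, ground_truth):
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(ground_truth))


def best_qps_at_recall(runs, threshold=0.9):
    """Highest QPS among (recall, qps) runs meeting the threshold, else None."""
    eligible = [qps for recall, qps in runs if recall >= threshold]
    return max(eligible) if eligible else None
```

Each run here stands for one parameter setting of an algorithm, measured as a `(recall, qps)` pair; settings that miss the recall threshold do not count, however fast they are.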
We recommend using Axel for downloading non-Microsoft datasets and AzCopy for downloading Microsoft datasets.
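
The query semantics of the filtered and sparse tracks described above can be sketched with brute-force reference searches. These are illustrative only and nowhere near competitive: `filtered_search` and `sparse_mips` are hypothetical names, the use of squared Euclidean distance for the filtered track is an assumption (the text above does not state the metric), and sparse vectors are represented as plain `{dimension: weight}` dicts rather than the dataset's on-disk format.

```python
import heapq

import numpy as np


def filtered_search(query_vec, query_tags, base_vecs, base_tags, k=10):
    """Exact k nearest neighbors among items whose tag bag contains every query tag."""
    required = set(query_tags)
    candidates = [i for i, tags in enumerate(base_tags) if required <= tags]
    if not candidates:
        return []
    # Cast to float32 first so uint8 embeddings do not wrap around on subtraction.
    diffs = base_vecs[candidates].astype(np.float32) - np.asarray(query_vec, dtype=np.float32)
    dists = np.sum(diffs ** 2, axis=1)  # assumed metric: squared Euclidean
    order = np.argsort(dists, kind="stable")[:k]
    return [candidates[j] for j in order]


def sparse_mips(query, base, k=10):
    """Exact top-k by inner product over sparse {dimension: weight} vectors."""

    def dot(a, b):
        if len(a) > len(b):
            a, b = b, a  # iterate over the shorter vector
        return sum(w * b.get(d, 0.0) for d, w in a.items())

    scores = [(dot(query, vec), i) for i, vec in enumerate(base)]
    return [i for _, i in heapq.nlargest(k, scores)]
```

Both functions scan the entire base set for every query, which is why a baseline such as Linscan or an inverted index over tags is a more realistic starting point at 10M points.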

Track Winners and Presentations

Filtered Search

  • ParlayANN IVF2: Fusing Classic and Spatial Inverted Indices for Fast Filtered ANNS [slides] Authors: Ben Landrum (UMD), Magdalen Dobson Manohar (CMU), Mazin Karjikar (UMD), Laxman Dhulipala (UMD)

Out-Of-Distribution Search

  • RoarANN: Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search Authors: Meng Chen, Yue Chen, Rui Ma, Kai Zhang, Yuzheng Cai, Jiayang Shi, Yizhuo Chen, Weiguo Zheng. All authors from Fudan University.
  • PyANNS Authors: Zihao Wang, Shanghai Jiao Tong University*

Sparse Search

  • PyANNS Authors: Zihao Wang, Shanghai Jiao Tong University*
  • GrassRMA: GRAph-based Sparse Vector Search with Reducing Memory Accesses
    Authors: Meng Chen, Yue Chen, Rui Ma, Kai Zhang, Yuzheng Cai, Jiayang Shi, Yizhuo Chen, Weiguo Zheng. All authors from Fudan University.

Streaming Search

  • Puck: Efficient Multi-level Index Structure for Approximate Nearest Neighbor Search in Practice [slides] Authors: Jie Yin, Ben Huang, Baidu.

* Zihao Wang is also an employee of Zilliz. However, he declares that the PyANNS entry was created on his time off, without any involvement from Zilliz or any of the other organizers. This entry did not declare a conflict with the organizers before participating.

Organizer Presentations

Invited Talks



Participation

  • To participate, please express interest through the CMT portal.
  • To request cloud compute credits ($1,000) towards development, please select the "Requesting cloud credit" field in your CMT entry and include a brief overview of the ideas you plan to develop with these credits.
  • To get started, please see the instructions in the README file, and submit a Pull Request corresponding to your algorithm(s).
  • For questions and discussions, please use the GitHub issues or the Discord channel.

Timeline (subject to change)

  • June: Baseline results, testing infrastructure, CFP and final ranking metrics released.
  • August 30th: Suggested deadline for requesting allocation of cloud compute credits for development. Credits will be provided on an ongoing basis.
  • September 15th: Final deadline for participants to submit an expression of interest through CMT.
  • October 30th: End of competition period. Teams to release code in a containerized form, and complete a pull request to the eval framework with code to run the algorithms.
  • Mid-November: Release of preliminary results on standardized machines. Review of code by organizers and participants. Participants can raise concerns about the evaluation.
  • Early December: Final results published, and competition results archived (the competition will go on if interest continues).
  • During NeurIPS: Organizers will provide an overview of the competition and results. Organizers will also request the best entries (including leaderboard toppers, or promising new approaches) to present an overview for further discussion.

Organizers and Dataset Contributors

Organizers can be reached at

We thank Microsoft Research, Meta, Pinecone, Yandex, and Zilliz for help in preparing and organizing this competition. We thank Microsoft for cloud credits towards running the competition, and AWS and Pinecone for compute credits for participants.
