Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3527
**Context**
Design Doc: [Faiss Benchmarking](https://docs.google.com/document/d/1c7zziITa4RD6jZsbG9_yOgyRjWdyueldSPH6QdZzL98/edit)
**In this diff**
1. Codecs and indices can now be referenced from a blobstore (bucket & path) outside the experiment.
2. To support #1, naming is moved to descriptors.
3. The built index can be written out as well.
4. A benchmark run can train a codec, reference the trained codec when building an index, and then reference the built index in kNN search. Index serialization is optional, although this option is not yet exposed through the index descriptor.
5. The benchmark supports indices built on different dataset sizes.
6. Working with varying dataset sizes now supports multiple ground truths. Small fixes may be needed before this can be used.
7. Added targets for bench_fw_range, ivf, codecs and optimize.
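To illustrate items 1 and 2, here is a minimal sketch of a descriptor that owns its own name and can point at an artifact stored outside the experiment. This is not the bench_fw code: the class `ArtifactDescriptor`, its fields, and the naming scheme are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtifactDescriptor:
    """Hypothetical descriptor for a codec or index (not the bench_fw class)."""

    factory: Optional[str] = None      # factory string, e.g. "IVF1024,PQ16"
    bucket: Optional[str] = None       # blobstore bucket of a prebuilt artifact
    path: Optional[str] = None         # path of the artifact inside the bucket
    num_vectors: Optional[int] = None  # dataset size the index was built on

    def is_external(self) -> bool:
        # An artifact is external when it is fully located by bucket and path.
        return self.bucket is not None and self.path is not None

    def get_name(self) -> str:
        # The descriptor derives the name itself, so callers need no
        # experiment-level naming logic.
        if self.is_external():
            return f"{self.bucket}/{self.path}"
        if self.factory is None:
            raise ValueError("descriptor needs a factory string or bucket and path")
        name = self.factory.replace(",", "_")
        if self.num_vectors is not None:
            name += f".n{self.num_vectors}"
        return name


local = ArtifactDescriptor(factory="IVF1024,PQ16", num_vectors=1000000)
external = ArtifactDescriptor(bucket="my_bucket", path="indices/ivf.faissindex")
print(local.get_name())
print(external.get_name())
```

Because the name lives on the descriptor, a later stage (for example kNN search) can refer to an artifact produced by an earlier stage or by a different experiment using the same object.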
**Analysis of ivf result**: D58823037
Reviewed By: algoriddle
Differential Revision: D57236543
fbshipit-source-id: ad03b28bae937a35f8c20f12e0a5b0a27c34ff3b
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3154
Use the benchmark to find Pareto-optimal indices, with BigANN as an example.
Separately optimize the coarse quantizer and the vector codec and use Pareto optimal configurations to construct IVF indices, which are then retested at various scales. See `optimize()` in `optimize.py` as the main function driving the process.
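As a rough illustration of what "Pareto optimal" means here, the sketch below keeps only configurations that no other configuration beats on both search time and accuracy. It is not the logic of `optimize()` in `optimize.py`; the `Result` type, the configuration names, and the numbers are synthetic assumptions.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str       # hypothetical configuration label
    time_ms: float  # search time, lower is better
    recall: float   # accuracy, higher is better


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep results for which no other result is both faster and more accurate."""
    front = []
    best_recall = float("-inf")
    # Sort by time ascending; on equal time, put the higher recall first.
    for r in sorted(results, key=lambda r: (r.time_ms, -r.recall)):
        # A slower result is only worth keeping if it improves recall.
        if r.recall > best_recall:
            front.append(r)
            best_recall = r.recall
    return front


# Synthetic numbers for illustration only.
results = [
    Result("cfg_a", 1.0, 0.50),
    Result("cfg_b", 2.0, 0.45),  # slower and less accurate than cfg_a
    Result("cfg_c", 3.0, 0.80),
    Result("cfg_d", 3.0, 0.70),  # same time as cfg_c, lower recall
    Result("cfg_e", 5.0, 0.80),  # slower than cfg_c, same recall
]
front = pareto_front(results)
print([r.name for r in front])
```

The same selection could be applied once to coarse quantizers and once to vector codecs, and the surviving configurations combined into candidate IVF indices.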
The results can be interpreted with `bench_fw_notebook.ipynb`, which allows:
* filtering by maximum code size
* filtering by maximum time
* filtering by minimum accuracy
* selecting space or time Pareto optimal options
* visualizing the results and outputting them as a table.
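The filters listed above can be sketched in plain Python as follows. This is not the notebook's code: the row layout (`index`, `code_size`, `time_ms`, `accuracy`), the function names, and the values are assumptions made for illustration.

```python
# Synthetic rows for illustration only; the keys are hypothetical.
rows = [
    {"index": "cfg_a", "code_size": 16, "time_ms": 1.0, "accuracy": 0.50},
    {"index": "cfg_b", "code_size": 32, "time_ms": 2.5, "accuracy": 0.72},
    {"index": "cfg_c", "code_size": 64, "time_ms": 4.0, "accuracy": 0.90},
]


def select(rows, max_code_size=None, max_time_ms=None, min_accuracy=None):
    """Return the rows that satisfy every limit that is not None."""
    kept = []
    for row in rows:
        if max_code_size is not None and row["code_size"] > max_code_size:
            continue
        if max_time_ms is not None and row["time_ms"] > max_time_ms:
            continue
        if min_accuracy is not None and row["accuracy"] < min_accuracy:
            continue
        kept.append(row)
    return kept


def as_table(rows):
    """Format rows as a fixed-width text table."""
    lines = [f"{'index':<8}{'code_size':>10}{'time_ms':>9}{'accuracy':>10}"]
    for row in rows:
        lines.append(
            f"{row['index']:<8}{row['code_size']:>10d}"
            f"{row['time_ms']:>9.2f}{row['accuracy']:>10.2f}"
        )
    return "\n".join(lines)


picked = select(rows, max_code_size=32, min_accuracy=0.6)
print(as_table(picked))
```

Leaving a limit as `None` disables that filter, so the same function covers any combination of the three criteria.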
This version is intentionally limited to IVF(Flat|HNSW),PQ|SQ indices...
Reviewed By: mdouze
Differential Revision: D51781670
fbshipit-source-id: 2c0f800d374ea845255934f519cc28095c00a51f