faiss/benchs/bench_all_ivf
# Benchmark of IVF variants

This is a benchmark of IVF index variants that looks at the tradeoffs between compression, speed and accuracy. The results are reported in this wiki chapter.

The code is organized as follows:

- `datasets.py`: code to access the data files, compute the ground truth and report accuracies.
- `bench_all_ivf.py`: evaluates one type of inverted file index.
- `run_on_cluster_generic.bash`: calls `bench_all_ivf.py` for all tested index types. Since the number of experiments is quite large, the script is structured so that the benchmark can be run on a cluster.
- `parse_bench_all_ivf.py`: produces tradeoff plots from all the results.
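The accuracy side of the tradeoff is measured against the exact nearest neighbors computed by the ground-truth step. As a minimal NumPy sketch (a hypothetical helper, not the actual code in `datasets.py`), recall@1 compares each query's top result to its true nearest neighbor:

```python
import numpy as np

def recall_at_1(I, gt):
    """Fraction of queries whose top result is the true nearest neighbor.

    I:  (nq, k) int array of result ids returned by the index
    gt: (nq,)   int array of ground-truth nearest-neighbor ids
    """
    return float((I[:, 0] == gt).mean())

# toy example: 3 queries, top-2 results each
I = np.array([[4, 1], [2, 7], [9, 3]])
gt = np.array([4, 5, 9])
print(recall_at_1(I, gt))  # 2 of 3 top-1 hits
```

The same idea extends to recall@k by testing membership of the ground-truth id in the first k columns of `I`.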

The code depends on Faiss and can use 1 to 8 GPUs to do the k-means clustering for large vocabularies.

It was run in October 2018 for the results in the wiki.