Summary: `cmp_with_scann.py` saves the base vectors, the query vectors and the ground truth as npy files. However, this only happens when the lib is faiss, so running the script directly with the scann lib fails with a complaint that the file does not exist. Therefore, the code should be refactored to save the npy files from the beginning, so that the scann run finds them.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2573

Reviewed By: mdouze

Differential Revision: D42338435

Pulled By: algoriddle

fbshipit-source-id: 9227f95e1ff79f5329f6206a0cb7ca169185fdb3
Benchmark of IVF variants
This is a benchmark of IVF index variants, looking at compression vs. speed vs. accuracy. The results are reported in a chapter of the Faiss wiki.
The code is organized as:
- `datasets.py`: code to access the data files, compute the ground truth and report accuracies
- `bench_all_ivf.py`: evaluate one type of inverted file
- `run_on_cluster_generic.bash`: call `bench_all_ivf.py` for all tested types of indices. Since the number of experiments is quite large, the script is structured so that the benchmark can be run on a cluster.
- `parse_bench_all_ivf.py`: make nice tradeoff plots from all the results.
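
The fan-out performed by `run_on_cluster_generic.bash` can be pictured as one `bench_all_ivf.py` invocation per dataset and index type. The following is a minimal Python sketch of that idea, not the script's actual contents: the dataset names, the index keys, the `make_commands` helper and the `--db` / `--indexkey` flag names are assumptions chosen for illustration.

```python
import itertools
import shlex

# Hypothetical experiment grid; the real script defines its own lists.
DATASETS = ["deep1M", "bigann1M"]
INDEX_KEYS = ["IVF4096,Flat", "IVF4096,PQ16", "OPQ16_64,IVF4096,PQ16"]


def make_commands(datasets, index_keys, script="bench_all_ivf.py"):
    """Return one shell command per (dataset, index type) pair."""
    commands = []
    for db, key in itertools.product(datasets, index_keys):
        # --db and --indexkey are assumed flag names, used for illustration.
        argv = ["python", script, "--db", db, "--indexkey", key]
        # shlex.join quotes arguments where needed, so each line is shell-safe.
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # Each printed line could be submitted as an independent cluster job.
    for cmd in make_commands(DATASETS, INDEX_KEYS):
        print(cmd)
```

Because every command is independent of the others, the jobs can run in any order or in parallel, which is what makes this layout a natural fit for a cluster scheduler.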
The code depends on Faiss and can use 1 to 8 GPUs to do the k-means clustering for large vocabularies.
It was run in October 2018 for the results in the wiki.