faiss/demos
Maria Lomeli 0fc8456e1d Offline IVF powered by faiss big batch search (#3202)
Summary:
This PR introduces the offline IVF (OIVF) framework which contains some tooling to run search using IVFPQ indexes (plus OPQ pretransforms) for large batches of queries using [big_batch_search](https://github.com/mlomeli1/faiss/blob/main/contrib/big_batch_search.py) and GPU faiss. See the [README](36226f5fe8/demos/offline_ivf/README.md) for details about using this framework.

This PR includes the following unit tests, which can be run with the unittest library as follows:
````
~/faiss/demos/offline_ivf$ python3 -m unittest tests/test_iterate_input.py -k test_iterate_back
````
In test_offline_ivf:
````
test_consistency_check
test_train_index
test_index_shard_equal_file_sizes
test_index_shard_unequal_file_sizes
test_search
test_evaluate_without_margin
test_evaluate_without_margin_OPQ
````
In test_iterate_input:
````
test_iterate_input_file_larger_than_batch
test_get_vs_iterate
test_iterate_back
````

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3202

Reviewed By: algoriddle

Differential Revision: D52734222

Pulled By: mlomeli1

fbshipit-source-id: 61fd0084277c1b14bdae1189db8ae43340611e16
2024-01-16 05:05:15 -08:00
offline_ivf Offline IVF powered by faiss big batch search (#3202) 2024-01-16 05:05:15 -08:00
CMakeLists.txt
README.md
demo_auto_tune.py
demo_client_server_ivf.py
demo_imi_flat.cpp Fix some typos (#3056) 2023-09-27 03:17:41 -07:00
demo_imi_pq.cpp Fix some typos (#3056) 2023-09-27 03:17:41 -07:00
demo_ivfpq_indexing.cpp fairring, faiss, fairness (4401366386162573988) 2023-09-14 00:50:50 -07:00
demo_nndescent.cpp
demo_ondisk_ivf.py
demo_residual_quantizer.cpp OSS legal requirements (#2698) 2023-02-07 14:32:56 -08:00
demo_sift1M.cpp
demo_weighted_kmeans.cpp fairring, faiss, fairness (4401366386162573988) 2023-09-14 00:50:50 -07:00

README.md

Demos for a few Faiss functionalities

demo_auto_tune.py

Demonstrates the auto-tuning functionality of Faiss.
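As a rough illustration of the idea behind auto-tuning, the sketch below keeps only the speed/accuracy trade-offs that are not beaten by another setting. It is plain Python and does not call Faiss. The function name `pareto_front`, the parameter labels, and every number in `measurements` are made up for illustration; they are not taken from the demo script or from any benchmark.

```python
def pareto_front(points):
    """Return the operating points that no other point dominates.

    Each point is (label, recall, time_ms). A point is kept only if its
    recall is strictly higher than that of every point that is at least
    as fast.
    """
    front = []
    best_recall = -1.0
    # Visit the fastest settings first; on equal time, the most accurate first.
    for label, recall, time_ms in sorted(points, key=lambda p: (p[2], -p[1])):
        if recall > best_recall:
            front.append((label, recall, time_ms))
            best_recall = recall
    return front


# Hypothetical measurements (label, recall, time in ms): illustrative values only.
measurements = [
    ("nprobe=1", 0.40, 0.1),
    ("nprobe=4", 0.70, 0.3),
    ("nprobe=8", 0.68, 0.5),    # slower and less accurate than nprobe=4
    ("nprobe=16", 0.90, 1.0),
    ("nprobe=64", 0.90, 3.5),   # same recall as nprobe=16 but slower
]

for label, recall, time_ms in pareto_front(measurements):
    print(f"{label}: recall={recall:.2f}, time={time_ms} ms")
```

With these illustrative values, the settings that survive are `nprobe=1`, `nprobe=4`, and `nprobe=16`.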

demo_ondisk_ivf.py

Shows how to construct a Faiss index that stores the inverted file data on disk, e.g. when the data does not fit in RAM. The script works on a small dataset (sift1M) for demonstration and proceeds in stages:

0: train on the dataset

1-4: build 4 indexes, each containing 1/4 of the dataset. These stages can run in parallel on several machines.

5: merge the 4 indexes into one that is written directly to disk (the merged index does not need to fit in RAM)

6: load and test the index
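The staging above can be sketched as a mapping from a stage number to a unit of work. This is a minimal sketch in plain Python with no Faiss calls. The helper names `shard_bounds` and `describe_stage` are hypothetical, and the even split by integer division is an assumption about how the dataset is divided into quarters, not a copy of the script.

```python
def shard_bounds(n_vectors, n_shards, shard_no):
    """Return the half-open id range [i0, i1) of the vectors in one shard."""
    i0 = shard_no * n_vectors // n_shards
    i1 = (shard_no + 1) * n_vectors // n_shards
    return i0, i1


def describe_stage(stage, n_vectors=1_000_000, n_shards=4):
    """Describe the work done in a stage (text only, nothing is executed)."""
    if stage == 0:
        return "train on the dataset"
    if 1 <= stage <= n_shards:
        i0, i1 = shard_bounds(n_vectors, n_shards, stage - 1)
        return f"build index {stage - 1} from vectors [{i0}, {i1})"
    if stage == n_shards + 1:
        return "merge the shard indexes into one on-disk index"
    if stage == n_shards + 2:
        return "load and test the merged index"
    raise ValueError(f"unknown stage {stage}")


# sift1M has one million base vectors, so each of the 4 shards holds 250,000.
for stage in range(7):
    print(stage, describe_stage(stage))
```

Because stages 1 to 4 each touch a disjoint id range, they are independent of each other once stage 0 has produced the trained index, which is what allows them to run on separate machines.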