Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3383
In this diff, I am fixing minor issues in bench_fw where certain fields are not accessible when the index is built from a codec. It also requires the index to be discovered via its codec alias, since an index factory string is not always available.
A subsequent diff, internal to Meta, will add a test case that exercises this path.
Reviewed By: algoriddle
Differential Revision: D56444641
fbshipit-source-id: b7af7e7bb47b20bbb5515a66f41dd24f42459d52
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3309
Make sure that the HNSW search stats work, and remove stats for deprecated functionality.
Remove code from the "link and code" paper that is no longer supported.
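For reference, a minimal sketch of reading the global HNSW search stats from Python; it assumes the `ndis` counter survives this cleanup, so verify the field names against your faiss version:
```python
import faiss
import numpy as np

d = 64
xb = np.random.rand(10000, d).astype("float32")
xq = np.random.rand(100, d).astype("float32")

index = faiss.IndexHNSWFlat(d, 32)   # HNSW, 32 neighbors per node
index.add(xb)

stats = faiss.cvar.hnsw_stats        # global stats object
stats.reset()
index.search(xq, 10)
print("distance computations during search:", stats.ndis)
```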
Reviewed By: kuarora, junjieqi
Differential Revision: D55247802
fbshipit-source-id: 03f176be092bff6b2db359cc956905d8646ea702
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3154
Using the benchmark to find Pareto optimal indices, in this case on BigANN as an example.
Separately optimize the coarse quantizer and the vector codec and use Pareto optimal configurations to construct IVF indices, which are then retested at various scales. See `optimize()` in `optimize.py` as the main function driving the process.
The results can be interpreted with `bench_fw_notebook.ipynb`, which allows:
* filtering by maximum code size
* filtering by maximum search time
* filtering by minimum accuracy
* keeping only space- or time-Pareto-optimal options
and visualizes the results and outputs them as a table (a small sketch of the filtering idea follows below).
This version is intentionally limited to IVF(Flat|HNSW),PQ|SQ indices...
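To make the filtering concrete, here is a small illustrative sketch (not the notebook's actual code) of selecting space/time Pareto-optimal operating points from benchmark results; the result dicts and field names are made up for the example:
```python
import numpy as np

def pareto_optimal(points):
    """points: rows of (search_time, 1 - accuracy); lower is better on both axes.
    Returns a boolean mask of the Pareto-optimal rows."""
    points = np.asarray(points, dtype=float)
    keep = np.ones(len(points), dtype=bool)
    for i, p in enumerate(points):
        # p is dominated if another point is no worse on all axes and better on one
        dominated_by = np.all(points <= p, axis=1) & np.any(points < p, axis=1)
        if dominated_by.any():
            keep[i] = False
    return keep

# made-up results; filter by max code size and min accuracy, then keep the Pareto front
results = [
    {"name": "IVF1024,PQ32", "code_size": 32,  "time_ms": 0.5, "recall": 0.61},
    {"name": "IVF4096,PQ64", "code_size": 64,  "time_ms": 0.9, "recall": 0.83},
    {"name": "IVF4096,SQ8",  "code_size": 128, "time_ms": 1.4, "recall": 0.95},
]
candidates = [r for r in results if r["code_size"] <= 64 and r["recall"] >= 0.5]
mask = pareto_optimal([(r["time_ms"], 1 - r["recall"]) for r in candidates])
for r, is_opt in zip(candidates, mask):
    print(r["name"], "pareto-optimal" if is_opt else "dominated")
```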
Reviewed By: mdouze
Differential Revision: D51781670
fbshipit-source-id: 2c0f800d374ea845255934f519cc28095c00a51f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3190
This diff adds more result handlers in order to expose them externally.
This enables range search for HNSW and Fast Scan, and adds nprobe parameter support for Fast Scan.
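For context, a minimal sketch of what a range search on an HNSW index looks like from Python; the dataset and radius are arbitrary, and the call assumes an index type that implements `range_search` (IndexHNSWFlat does in recent faiss):
```python
import faiss
import numpy as np

d = 64
xb = np.random.rand(10000, d).astype("float32")
xq = np.random.rand(5, d).astype("float32")

index = faiss.IndexHNSWFlat(d, 32)      # HNSW with 32 neighbors per node
index.add(xb)

# range_search returns results in CSR-like form:
# lims[i]:lims[i + 1] delimits the matches of query i in D and I
radius = 8.0                            # squared-L2 threshold, tune for your data
lims, D, I = index.range_search(xq, radius)
for i in range(len(xq)):
    print(f"query {i}: {lims[i + 1] - lims[i]} results within radius {radius}")
```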
Reviewed By: pemazare
Differential Revision: D52547384
fbshipit-source-id: 271da5ffea6411df3d8e50641abade18bd7b774b
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3144
Visualize the results of running the benchmark with Pareto-optimal filtering:
1. per index or across indices
2. for space, time or space & time
3. knn or range search, the latter @ specific precision
Reviewed By: mdouze
Differential Revision: D51552775
fbshipit-source-id: d4f29e3d46ef044e71b54439b3972548c86af5a7
Summary:
1. Support for index construction parameters outside of the factory string (arbitrary depth of quantizers).
2. A refactor that provides an index wrapper, a prerequisite for the optimizer, which will generate indices from pre-optimized components (particularly quantizers).
Reviewed By: mdouze
Differential Revision: D51427452
fbshipit-source-id: 014d05dd798d856360f2546963e7cad64c2fcaeb
Summary:
This PR adds functionality where an IVF index can be searched and the corresponding codes returned. It also adds a few functions to compress int arrays into a bit-compact representation.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143
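To illustrate what "bit-compact representation" means (this is only the idea, not the new faiss functions themselves), here is a small numpy sketch that packs an int array whose values fit in `nbit` bits:
```python
import numpy as np

def pack_ints(values, nbit):
    # lay out the nbit low-order bits of each value, LSB first, then pack to bytes
    values = np.asarray(values, dtype=np.uint64)
    bits = ((values[:, None] >> np.arange(nbit, dtype=np.uint64)) & 1).astype(np.uint8)
    return np.packbits(bits.ravel(), bitorder="little")

def unpack_ints(buf, nbit, n):
    bits = np.unpackbits(buf, count=n * nbit, bitorder="little").reshape(n, nbit)
    return (bits.astype(np.uint64) << np.arange(nbit, dtype=np.uint64)).sum(axis=1)

codes = np.array([3, 7, 1, 6, 2], dtype=np.uint64)   # values fit in 3 bits
packed = pack_ints(codes, nbit=3)                     # 2 bytes instead of 40
assert np.array_equal(unpack_ints(packed, nbit=3, n=len(codes)), codes)
```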
Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct
buck test //faiss/tests/:test_standalone_codec -- test_arrays
```
Reviewed By: algoriddle
Differential Revision: D51544613
Pulled By: mdouze
fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3097
A framework for evaluating indices offline.
Long term objectives:
1. Generate offline similarity index performance data with test datasets both for existing indices and automatically generated alternatives. That is, given a dataset and some constraints this workflow should automatically discover optimal index types and parameter choices as well as evaluate the performance of existing production indices and their parameters.
2. Allow researchers, platform owners (Laser, Unicorn) and product teams to understand how different index types perform on their datasets and make optimal choices wrt their objectives. Longer term, the goal is to enable automatic decision-making/auto-tuning.
Constraints, design choices:
1. I want to run the same evaluation on Meta-internal (fblearner, data from hive and manifold) or the local machine + research cluster (data on local disk or NFS) via OSS Faiss. Via fblearner, I want this to work in a way that it can be turned into a service and plugged into Unicorn or Laser, while the core Faiss part can be used/referred to in our research and to update the wiki with the latest results/recommendations for public datasets.
2. It must support a range of metrics for KNN and range search, and it should be easy to add new ones. Cost metrics need to be fine-grained to allow extrapolation.
3. It should automatically sweep all query-time params (e.g. nprobe, polysemous code hamming distance, params of quantizers), using `OperatingPointsWithRanges` to cut down the optimal param search space. (For now, it sweeps nprobe only.)
4. [FUTURE] It will generate/sweep index creation hyperparams (factory strings, quantizer sizes, quantizer params), using heuristics.
5. [FUTURE] It will sweep the dataset size: start small test with e.g. 100K db vectors and go up to millions, billions potentially, while narrowing down the index+param choices at each step.
6. [FUTURE] Extrapolate perf metrics (cost and accuracy)
7. Intermediate results must be saved (to disk, to manifold) throughout, and reused as much as possible to cut down on overall runtime and enable faster iteration during development.
For range search, this diff supports the metric proposed in https://docs.google.com/document/d/1v5OOj7kfsKJ16xzaEHuKQj12Lrb-HlWLa_T2ct0LJiw/edit?usp=sharing
I also added support for the classical case where the scoring function steps from 1 to 0 at some arbitrary threshold.
For KNN, I added knn_intersection, but other metrics, particularly recall@1 will also be interesting. I also added the distance_ratio metric, which we previously discussed as an interesting alternative, since it shows how much the returned results approximate the ground-truth nearest-neighbours in terms of distances.
In the test case, I evaluated three current production indices for VCE with 1M vectors in the database and 10K queries. Each index is tested at various operating points (nprobes), which are shown on the charts. The results are not extrapolated to the true scale of these indices.
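As a simplified, standalone illustration of the query-time parameter sweep (point 3 above) and the knn_intersection metric, here is a sketch using plain faiss and numpy rather than the benchmark framework itself; the dataset sizes and factory string are arbitrary:
```python
import time

import faiss
import numpy as np

d, nb, nq, k = 64, 100000, 1000, 10
xb = np.random.rand(nb, d).astype("float32")
xq = np.random.rand(nq, d).astype("float32")

# brute-force ground truth
gt_index = faiss.IndexFlatL2(d)
gt_index.add(xb)
_, gt = gt_index.search(xq, k)

index = faiss.index_factory(d, "IVF1024,PQ16")
index.train(xb)
index.add(xb)

for nprobe in (1, 4, 16, 64):
    index.nprobe = nprobe
    t0 = time.time()
    _, I = index.search(xq, k)
    dt = time.time() - t0
    # knn_intersection: average fraction of the true top-k found at this operating point
    inter = np.mean([len(set(I[i]) & set(gt[i])) / k for i in range(nq)])
    print(f"nprobe={nprobe:3d}  knn_intersection={inter:.3f}  search time={dt:.3f}s")
```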
Reviewed By: yonglimeta
Differential Revision: D49958434
fbshipit-source-id: f7f567b299118003955dc9e2d9c5b971e0940fc5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2918
The HammingComputer class is optimized for several vector sizes. So far it has been the caller's responsibility to instantiate the relevant optimized version.
This diff introduces a `dispatch_HammingComputer` function that can be called with a template class that is instantiated for all existing optimized HammingComputers.
Reviewed By: algoriddle
Differential Revision: D46858553
fbshipit-source-id: 32c31689bba7c0b406b309fc8574c95fa24022ba
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2782
Add a separate branch for ARM Hamming distance computations. Also improves a benchmark for the Hamming computer.
Reviewed By: mdouze
Differential Revision: D44397463
fbshipit-source-id: 1e44e8e7dd1c5b92e95e8afc754170b501d0feed
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2797
This is the last code instance of setNumProbes.
Removing it because some people still seem to run into errors due to it.
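For reference, the now-preferred way to set the number of probes is the plain `nprobe` field; a minimal sketch, assuming a CUDA-enabled faiss build and a recent version where the field is functional on GPU indexes:
```python
import faiss
import numpy as np

d = 64
xb = np.random.rand(100000, d).astype("float32")

res = faiss.StandardGpuResources()
gpu_index = faiss.GpuIndexIVFFlat(res, d, 1024, faiss.METRIC_L2)
gpu_index.train(xb)
gpu_index.add(xb)

gpu_index.nprobe = 32     # instead of the removed gpu_index.setNumProbes(32)
```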
Reviewed By: algoriddle
Differential Revision: D44421600
fbshipit-source-id: fbc1a9d49a0175ddf24c32dab5c1bdb5f1bbbac6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2638
This diff is a more streamlined way of searching IVF indexes with precomputed clusters.
This will be used for experiments with hybrid CPU / GPU search.
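A minimal sketch of the intended flow, assuming the pythonic `search_preassigned` wrapper on IndexIVF (argument order may differ across faiss versions):
```python
import faiss
import numpy as np

d, nb, nq, k, nprobe = 64, 100000, 100, 10, 8
xb = np.random.rand(nb, d).astype("float32")
xq = np.random.rand(nq, d).astype("float32")

index = faiss.index_factory(d, "IVF1024,Flat")
index.train(xb)
index.add(xb)
index.nprobe = nprobe

# step 1: run the coarse quantizer separately -- this is the part that could be
# offloaded, e.g. to a GPU, in a hybrid CPU / GPU setup
Dq, Iq = index.quantizer.search(xq, nprobe)

# step 2: search only the precomputed lists, skipping coarse quantization
D, I = index.search_preassigned(xq, k, Iq, Dq)
```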
Reviewed By: algoriddle
Differential Revision: D41301032
fbshipit-source-id: a1d645fd0f2bf806454dfd04971edc0a6200d20d
Summary:
In ```cmp_with_scann.py```, we save npy files for the base vectors, the query vectors, and the ground truth. However, we only do this when the lib is faiss; if we run this script directly with the scann lib, it complains that the files do not exist.
Therefore, the code is refactored to save the npy files from the beginning so that nothing goes wrong.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2573
Reviewed By: mdouze
Differential Revision: D42338435
Pulled By: algoriddle
fbshipit-source-id: 9227f95e1ff79f5329f6206a0cb7ca169185fdb3
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2567
Intuitively, it should be easier to handle big-batch searches because all distance computations for a set of queries can be done locally within each inverted list.
This benchmark implements this in pure python (but should be close to optimal in terms of speed), on CPU for IndexIVFFlat, IndexIVFPQ and IndexIVFScalarQuantizer. GPU is also supported.
The results are not systematically better, see https://docs.google.com/document/d/1d3YuV8uN7hut6aOATCOMx8Ut-QEl_oRnJdPgDBRF1QA/edit?usp=sharing
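To make the idea concrete, here is a toy numpy-only illustration (not the benchmark's code) that loops over inverted lists instead of over queries, so all distance computations for a list are handled in one local block; the crude coarse quantizer and the sizes are made up for the example:
```python
import numpy as np

rng = np.random.default_rng(0)
d, nb, nq, k, nlist, nprobe = 32, 20000, 500, 5, 64, 4
xb = rng.random((nb, d), dtype=np.float32)
xq = rng.random((nq, d), dtype=np.float32)
centroids = xb[rng.choice(nb, nlist, replace=False)]     # crude "coarse quantizer"

def topk_lists(x, c, n):   # n closest centroids for each row of x
    dist = ((x[:, None, :] - c[None, :, :]) ** 2).sum(-1)
    return np.argsort(dist, axis=1)[:, :n]

assign_db = topk_lists(xb, centroids, 1)[:, 0]            # list of each db vector
probe = topk_lists(xq, centroids, nprobe)                 # lists probed per query

best_d = np.full((nq, k), np.inf)
best_i = np.full((nq, k), -1, dtype=np.int64)
for list_no in range(nlist):
    db_ids = np.where(assign_db == list_no)[0]            # vectors stored in this list
    q_ids = np.where((probe == list_no).any(axis=1))[0]   # queries probing this list
    if len(db_ids) == 0 or len(q_ids) == 0:
        continue
    # all distance computations for this list are done locally, in one block
    dist = ((xq[q_ids, None, :] - xb[None, db_ids, :]) ** 2).sum(-1)
    cand_d = np.concatenate([best_d[q_ids], dist], axis=1)
    cand_i = np.concatenate([best_i[q_ids], np.broadcast_to(db_ids, dist.shape)], axis=1)
    order = np.argsort(cand_d, axis=1)[:, :k]
    best_d[q_ids] = np.take_along_axis(cand_d, order, axis=1)
    best_i[q_ids] = np.take_along_axis(cand_i, order, axis=1)

print(best_i[:3])   # approximate top-k ids for the first three queries
```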
Reviewed By: algoriddle
Differential Revision: D41098338
fbshipit-source-id: 479e471b0d541f242d420f581775d57b708a61b8
Summary:
Adds:
- a sparse update function to the heaps
- bucket sort functions
- an IndexRandom index to serve as a dummy coarse quantizer for testing
Reviewed By: algoriddle
Differential Revision: D41804055
fbshipit-source-id: 9402b31c37c367aa8554271d8c88bc93cc1e2bda
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2582
A few more or less cosmetic improvements:
* Index::idx_t was scoped inside the Index object, which does not make much sense; this diff moves it to faiss::idx_t
* replace multiprocessing.dummy with multiprocessing.pool
* add Alexandr as a core contributor of Faiss in the README ;-)
```
for i in $( find . -name \*.cu -o -name \*.cuh -o -name \*.h -o -name \*.cpp ) ; do
sed -i s/Index::idx_t/idx_t/ $i
done
```
For the fbcode deps:
```
for i in $( fbgs Index::idx_t --exclude fbcode/faiss -l ) ; do
sed -i s/Index::idx_t/idx_t/ $i
done
```
Reviewed By: algoriddle
Differential Revision: D41437507
fbshipit-source-id: 8300f2a3ae97cace6172f3f14a9be3a83999fb89
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2562
Introduce a table of transposed centroids in ProductQuantizer that significantly speeds up the ProductQuantizer::compute_codes() call for certain PQ parameters, and thus speeds up search queries.
* the ::sync_transposed_centroids() call is used to fill the table
* the ::clear_transposed_centroids() call clears the table, so that the original baseline code is used for ::compute_codes()
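A minimal usage sketch from Python; the method names follow this diff, so verify them against your faiss version:
```python
import faiss
import numpy as np

d, M, nbits = 64, 8, 8
xt = np.random.rand(10000, d).astype("float32")
xb = np.random.rand(1000, d).astype("float32")

pq = faiss.ProductQuantizer(d, M, nbits)
pq.train(xt)

pq.sync_transposed_centroids()      # fill the transposed table: faster compute_codes()
codes = pq.compute_codes(xb)

pq.clear_transposed_centroids()     # back to the baseline code path
codes_baseline = pq.compute_codes(xb)
print("codes identical:", np.array_equal(codes, codes_baseline))
```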
Reviewed By: mdouze
Differential Revision: D40763338
fbshipit-source-id: 87b40e5dd2f8c3cadeb94c1cd9e8a4a5b6ffa97d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2483
This diff changes the following:
1. all search functions now take a `SearchParameters` argument that overrides the internal search parameters
2. the default implementation for most classes throws when the params argument is non-nullptr / non-None
3. the IndexIVF and IndexHNSW classes have functioning SearchParameters
4. the SearchParameters includes an IDSelector that can search only in a subset of the index based on a defined subset of ids
There is also some refactoring: the IDSelector was moved to its own .h/.cpp, and python/__init__.py is split into parts.
The diff is quite bulky because the search function prototypes need to be changed in all index classes.
Things to fix in subsequent diffs:
- support SearchParameters for more index types (Flat variants)
- better sub-object ownership for SearchParams (with std::unique_ptr?)
- special handling of IDSelectorRange to make it faster
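To illustrate points 3 and 4 above, a minimal Python sketch on an IVF index, assuming a recent faiss build where the keyword-argument constructor for SearchParameters classes is available; the sizes and the id subset are arbitrary:
```python
import faiss
import numpy as np

d, nb, nq, k = 64, 100000, 10, 5
xb = np.random.rand(nb, d).astype("float32")
xq = np.random.rand(nq, d).astype("float32")

index = faiss.index_factory(d, "IVF256,Flat")
index.train(xb)
index.add(xb)

# search only among an arbitrary subset of ids, overriding nprobe for this call only
subset = np.arange(0, nb, 2).astype("int64")               # keep even ids only
sel = faiss.IDSelectorBatch(subset)
params = faiss.SearchParametersIVF(sel=sel, nprobe=32)
D, I = index.search(xq, k, params=params)
assert np.all((I % 2 == 0) | (I == -1))                     # only even ids come back
```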
Reviewed By: alexanderguzhva
Differential Revision: D39852589
fbshipit-source-id: 4988bdb5b9bee1207cd327d3f80bf5e0e2467fe1
Summary:
For search requests with few queries or a single query, this PR adds the ability to run threads over both the queries and the different clusters of the IVF. For applications where latency is important, this can **dramatically reduce latency for single-query requests**.
A new implementation (https://github.com/facebookresearch/faiss/issues/14) is added. It could be merged into implementation 12, but for simplicity in this PR I created a separate function.
Tests are added to cover the new implementation, including tests that specifically cover the single-query case.
In my benchmarks, a very good reduction in latency is observed for single-query requests.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2380
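A minimal sketch of opting into the new implementation from Python on a fast-scan IVF index; the `implem` value follows the issue and test names above, and the factory string and sizes are arbitrary:
```python
import faiss
import numpy as np

d, nb = 64, 100000
xb = np.random.rand(nb, d).astype("float32")
xq = np.random.rand(1, d).astype("float32")       # single-query, latency-sensitive case

index = faiss.index_factory(d, "IVF1024,PQ32x4fs")
index.train(xb)
index.add(xb)
index.nprobe = 64

index.implem = 14     # thread over the probed lists as well, not only over queries
D, I = index.search(xq, 10)
```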
Test Plan:
```
buck test //faiss/tests/:test_fast_scan_ivf -- implem14
buck test //faiss/tests/:test_fast_scan_ivf -- implem15
```
Reviewed By: alexanderguzhva
Differential Revision: D38074577
Pulled By: mdouze
fbshipit-source-id: e7a20b6ea2f9216e0a045764b5d7b7f550ea89fe
Summary:
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Various readability fixes focused on `.md` files:
- Grammar
- Fix some incorrect command references to `distributed_kmeans.py`
- Style the markdown bash code snippet sections so they format correctly
Attempted to put a lot of little things into one PR and commit; let me know if any mods are needed!
Best,
Ryan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2378
Reviewed By: alexanderguzhva
Differential Revision: D37717671
Pulled By: mdouze
fbshipit-source-id: 0039192901d98a083cd992e37f6b692d0572103a
Summary:
This diff added ProductAdditiveQuantizer.
A simple description of the algorithm:
1. Divide the vector space into several orthogonal sub-spaces, just like PQ does.
2. Quantize each sub-space by an independent additive quantizer.
Usage:
Construct a ProductAdditiveQuantizer object:
- `d`: dimensionality of the input vectors
- `nsplits`: number of sub-spaces the vector is divided into
- `Msub`: `M` of each additive quantizer
- `nbits`: `nbits` of each additive quantizer
```python
d = 128
nsplits = 2
Msub = 4
nbits = 8
plsq = faiss.ProductLocalSearchQuantizer(d, nsplits, Msub, nbits)
prq = faiss.ProductResidualQuantizer(d, nsplits, Msub, nbits)
```
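Continuing the example, training and encoding work like for any other quantizer; `xt` and `xb` below are assumed float32 numpy arrays of dimension `d`:
```python
import numpy as np

xt = np.random.rand(10000, d).astype("float32")   # training vectors (assumed)
xb = np.random.rand(1000, d).astype("float32")    # database vectors (assumed)

plsq.train(xt)
codes = plsq.compute_codes(xb)     # shape (len(xb), plsq.code_size), dtype uint8
recons = plsq.decode(codes)        # approximate reconstruction of xb
print("MSE:", ((xb - recons) ** 2).sum() / len(xb))
```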
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2286
Test Plan:
```
buck test //faiss/tests/:test_local_search_quantizer -- TestProductLocalSearchQuantizer
buck test //faiss/tests/:test_residual_quantizer -- TestProductResidualQuantizer
```
Reviewed By: alexanderguzhva
Differential Revision: D35907702
Pulled By: mdouze
fbshipit-source-id: 7428a196e6bd323569caa585c57281dd70e547b1
Summary:
Start migration of existing benchmarks to Google's Benchmark library + register benchmark to servicelab.
The benchmark should be automatically registered to servicelab once this diff lands according to https://www.internalfb.com/intern/wiki/ServiceLab/Use_Cases/Benchmarks_(C++)/#servicelab-job.
Reviewed By: mdouze
Differential Revision: D35397782
fbshipit-source-id: 317db2527f12ddde0631cacc3085c634afdd0e37
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217
This diff introduces a new Faiss contrib module that contains:
- generic k-means implemented in python (was in distributed_ondisk)
- the two-level clustering code, including a simple function that runs it on a Faiss IVF index.
- sparse clustering code (new)
The main idea is that this code is often re-used, so it is better to have it in contrib.
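As a rough sketch of the two-level idea using plain `faiss.Kmeans` (the contrib module wraps this more conveniently; nothing below is its actual API), with made-up sizes:
```python
import faiss
import numpy as np

d, n, nc1, nc2 = 32, 100000, 32, 1024     # 32 coarse clusters, ~1024 final centroids
x = np.random.rand(n, d).astype("float32")

# level 1: coarse clustering of the whole dataset
km1 = faiss.Kmeans(d, nc1, niter=10)
km1.train(x)
_, assign1 = km1.index.search(x, 1)
assign1 = assign1.ravel()

# level 2: cluster each coarse bucket independently, proportionally to its size
centroids = []
for c in range(nc1):
    xc = x[assign1 == c]
    if len(xc) == 0:
        continue
    nc_local = min(len(xc), max(1, round(nc2 * len(xc) / n)))
    km2 = faiss.Kmeans(d, nc_local, niter=10)
    km2.train(xc)
    centroids.append(km2.centroids)
centroids = np.vstack(centroids)          # roughly nc2 centroids overall
print(centroids.shape)
```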
Reviewed By: beauby
Differential Revision: D34170932
fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77
Summary:
## Description
This PR added support for LSQ on GPU. Only the encoding part runs on the GPU; the rest still runs on the CPU.
Multi-GPU is also supported.
## Usage
``` python
lsq = faiss.LocalSearchQuantizer(d, M, nbits)
ngpus = faiss.get_num_gpus()
lsq.icm_encoder_factory = faiss.GpuIcmEncoderFactory(ngpus) # we use all gpus
lsq.train(xt)
codes = lsq.compute_codes(xb)
decoded = lsq.decode(codes)
```
## Performance on SIFT1M
On 1 GPU:
```
===== lsq-gpu:
mean square error = 17337.878528
training time: 40.9857234954834 s
encoding time: 27.12640070915222 s
```
On 2 GPUs:
```
===== lsq-gpu:
mean square error = 17364.658176
training time: 25.832106113433838 s
encoding time: 14.879548072814941 s
```
On CPU:
```
===== lsq:
mean square error = 17305.880576
training time: 152.57522344589233 s
encoding time: 110.01779270172119 s
```
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1978
Test Plan: buck test mode/dev-nosan //faiss/gpu/test/:test_gpu_index_py -- TestLSQIcmEncoder
Reviewed By: wickedfoo
Differential Revision: D29609763
Pulled By: mdouze
fbshipit-source-id: b6ffa2a3c02bf696a4e52348132affa0dd838870
Summary:
## Description
The process of updating the codebooks in LSQ may be unstable if the data is not zero-centered. This diff fixes it by using `double` instead of `float` during codebook updating. This does not affect performance since the update process is quite fast.
Users can switch back to `float` mode by setting `update_codebooks_with_double = False`.
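A minimal sketch of the new switch; the field name comes from this diff:
```python
import faiss

d, M, nbits = 64, 8, 8
lsq = faiss.LocalSearchQuantizer(d, M, nbits)
# codebooks are now updated in double precision by default
lsq.update_codebooks_with_double = False   # opt back into the float32 path
```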
## Changes
1. Support `double` during codebook updating.
2. Add a unit test.
3. Add `__init__.py` under `contrib/` to avoid warnings.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1975
Reviewed By: wickedfoo
Differential Revision: D29565632
Pulled By: mdouze
fbshipit-source-id: 932d7932ae9725c299cd83f87495542703ad6654
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1906
This PR implemented LSQ/LSQ++, a vector quantization technique described in the following two papers:
1. Revisiting additive quantization
2. LSQ++: Lower running time and higher recall in multi-codebook quantization
Here is a benchmark running on SIFT1M for 64 bits encoding:
```
===== lsq:
mean square error = 17335.390208
training time: 312.729779958725 s
encoding time: 244.6277096271515 s
===== pq:
mean square error = 23743.004672
training time: 1.1610801219940186 s
encoding time: 2.636141061782837 s
===== rq:
mean square error = 20999.737344
training time: 31.813055515289307 s
encoding time: 307.51959800720215 s
```
Changes:
1. Add LocalSearchQuantizer object
2. Fix an out of memory bug in ResidualQuantizer
3. Add a benchmark for evaluating quantizers
4. Add tests for LocalSearchQuantizer
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1862
Test Plan:
```
buck test //faiss/tests/:test_lsq
buck run mode/opt //faiss/benchs/:bench_quantizer -- lsq pq rq
```
Reviewed By: beauby
Differential Revision: D28376369
Pulled By: mdouze
fbshipit-source-id: 2a394d38bf75b9de0a1c2cd6faddf7dd362a6fa8
Summary:
## Description:
This diff implements the Navigating Spreading-out Graph (NSG), which accepts a KNN graph as input.
Here is the interface of building an NSG graph:
``` c++
void IndexNSG::build(idx_t n, const float *x, idx_t *knn_graph, int GK);
```
where `GK` is the nb of neighbors per node and `knn_graph[i * GK + j]` is the j-th neighbor of node i.
The `add` method is not implemented yet.
The unit tests could be found in `tests/test_nsg.cpp`.
mdouze beauby Maybe I need some advice on how to design the interface and support python.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1707
Test Plan: buck test //faiss/tests/:test_index -- TestNSG
Reviewed By: beauby
Differential Revision: D26748498
Pulled By: mdouze
fbshipit-source-id: 3280f705fb1b5f9c8cc5efeba63b904c3b832544