Summary:
mdouze Please let me know if any additional unit tests are needed
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3444
Reviewed By: algoriddle
Differential Revision: D57665641
Pulled By: mdouze
fbshipit-source-id: 9bec91306a1c31ea4f1f1d726c9d60ac6415fdfc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3452
Delete all remaining print statements in the tests to improve the readability and effectiveness of the codebase.
Reviewed By: junjieqi
Differential Revision: D57466393
fbshipit-source-id: 6ebd66ae2e769894d810d4ba7a5f69fc865b797d
Summary:
This PR adds the ability to search an IVF index and return the corresponding codes. It also adds a few functions to compress int arrays into a bit-compact representation.
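To illustrate what a bit-compact representation of an int array means, here is a minimal pure-Python sketch. The names `pack_ints` and `unpack_ints` are hypothetical and are not the functions added by this PR; the sketch only assumes that every value fits in `nbits` bits.

```python
def pack_ints(values, nbits):
    """Pack non-negative ints, each < 2**nbits, using nbits bits per value."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << nbits):
            raise ValueError(f"value {v} does not fit in {nbits} bits")
        # value i occupies bits [i * nbits, (i + 1) * nbits)
        acc |= v << (i * nbits)
    nbytes = (len(values) * nbits + 7) // 8  # round up to whole bytes
    return acc.to_bytes(nbytes, "little")


def unpack_ints(buf, nbits, n):
    """Inverse of pack_ints: recover n values of nbits bits each."""
    acc = int.from_bytes(buf, "little")
    mask = (1 << nbits) - 1
    return [(acc >> (i * nbits)) & mask for i in range(n)]


values = [5, 0, 7, 3, 6]
packed = pack_ints(values, nbits=3)   # 15 bits, stored in 2 bytes
restored = unpack_ints(packed, nbits=3, n=len(values))
```

Five 3-bit values take 2 bytes here instead of the 40 bytes of an int64 array.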
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143
Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct
buck test //faiss/tests/:test_standalone_codec -- test_arrays
```
Reviewed By: algoriddle
Differential Revision: D51544613
Pulled By: mdouze
fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
Summary: Useful info on GitHub test runs is buried in spurious logging. Avoid this.
Reviewed By: mlomeli1
Differential Revision: D47209139
fbshipit-source-id: b5111c91e2b94f0c3678d599197f8e7094993df1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2568
Add a fused kernel for the exhaustive_L2sqr_blas() call that combines the computation of the dot products with the search for the nearest centroid. As a result, no temporary dot product values are written to and read back from RAM.
Speeds up the training of PQx[1] indices for dsub = 1, 2, 4, 8, and the effect is higher for higher values of [1]. AVX512 version provides additional overloads for dsub = 12, 16.
The speedup is also beneficial for higher values of pq.cp.max_points_per_centroid (which is 256 by default).
Speeds up IVFPQ training as well.
The AVX512 kernel is not enabled by default, but I have seen it make training twice as fast as the AVX2 version. So please feel free to use it by enabling AVX512 manually.
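The idea of fusing the dot product with the nearest-centroid search can be sketched in plain Python. This is only an illustration of the principle, not the SIMD kernel from this PR; the function name `nearest_centroid_fused` is hypothetical. It relies on the identity ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2 and keeps only the running minimum instead of storing one distance per centroid.

```python
def nearest_centroid_fused(x, centroids):
    """Return (index, squared L2 distance) of the centroid nearest to x."""
    x_norm = sum(v * v for v in x)
    best_idx, best_dist = -1, float("inf")
    for j, c in enumerate(centroids):
        dot = sum(a * b for a, b in zip(x, c))
        # a real implementation would precompute the centroid norms once
        c_norm = sum(v * v for v in c)
        dist = x_norm - 2.0 * dot + c_norm
        # the distance is consumed right away, never written to a table
        if dist < best_dist:
            best_idx, best_dist = j, dist
    return best_idx, best_dist


x = [1.0, 2.0]
centroids = [[0.0, 0.0], [1.0, 2.5], [3.0, 3.0]]
idx, dist = nearest_centroid_fused(x, centroids)
```

The unfused alternative would first fill a full table of dot products and then scan it for the minimum, which is the memory traffic the fused kernel avoids.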
Reviewed By: mdouze
Differential Revision: D41166766
fbshipit-source-id: 443014e2e59396b3a90b9171fec8c8191052bcf4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2123
One of the encodings used by LCC is based on a RCQ coarse quantizer and a "payload" of ITQ. The codes are compared with Hamming distances.
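As a reminder of what comparing codes with Hamming distances involves, here is a small pure-Python sketch. The helpers `binarize` and `hamming` are hypothetical names, and the sign thresholding stands in for ITQ, which would apply a learned rotation before taking signs (omitted here).

```python
def binarize(vec):
    """One bit per dimension (1 if the component is > 0), packed into bytes."""
    acc = 0
    for i, v in enumerate(vec):
        if v > 0:
            acc |= 1 << i
    return acc.to_bytes((len(vec) + 7) // 8, "little")


def hamming(a, b):
    """Number of differing bits between two codes of equal length."""
    if len(a) != len(b):
        raise ValueError("codes must have the same size")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


query = binarize([1, -1, 1, -1, 1, 1, -1, -1])
near = binarize([1, -1, 1, -1, 1, 1, -1, 1])    # differs in one dimension
far = binarize([-1, 1, -1, 1, -1, -1, 1, 1])    # all signs flipped
d_near = hamming(query, near)
d_far = hamming(query, far)
```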
The index type `IndexIVFSpectralHash` can be repurposed to implement this type of index.
This diff contains a small Python demo script, demo_rcq_itq, that shows how:
* the RCQ + ITQ are trained
* the RCQ + ITQ index add and search work (with a very inefficient python implementation)
* they can be transferred to an `IndexIVFSpectralHash`
* the python implementation and `IndexIVFSpectralHash` give the same results
The advantage of using an `IndexIVFSpectralHash` is that in C++ it offers an `InvertedListScanner` object that can be used to compute query-to-code distances with its `distance_to_code` method. This is generic and will generalize to other types of encodings and coarse quantizers.
What is missing is index_factory support to make instantiation easier.
Reviewed By: sc268
Differential Revision: D32642900
fbshipit-source-id: 284f3029d239b7946bbca44a748def4e058489bd
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2117
This supports 2 concatenated codecs. It is based on IndexRefine, which already does this but did not have a standalone codec interface.
The main use case for now is a residual quantizer + ITQ.
The test below demonstrates how to instantiate that.
The advantage is that the index_factory parser already exists.
The IndexRefine decoder just uses the second index decoder, which is supposed to be more accurate than the first.
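The encode/decode behavior described above can be sketched with toy scalar codecs. All class names here (`CoarseCodec`, `FineCodec`, `ConcatCodec`) are hypothetical and unrelated to the faiss classes; the sketch only shows that the encoder concatenates both codes while the decoder reads just the second one.

```python
import struct


class CoarseCodec:
    """Toy codec: 1 byte, the value rounded and clamped to [0, 255]."""
    code_size = 1

    def encode(self, x):
        return bytes([max(0, min(255, round(x)))])

    def decode(self, code):
        return float(code[0])


class FineCodec:
    """Toy codec: 4 bytes, the value stored as a float32."""
    code_size = 4

    def encode(self, x):
        return struct.pack("<f", x)

    def decode(self, code):
        return struct.unpack("<f", code)[0]


class ConcatCodec:
    """Concatenates two codes; decoding uses only the second, finer one."""

    def __init__(self, first, second):
        self.first, self.second = first, second
        self.code_size = first.code_size + second.code_size

    def encode(self, x):
        return self.first.encode(x) + self.second.encode(x)

    def decode(self, code):
        return self.second.decode(code[self.first.code_size:])


codec = ConcatCodec(CoarseCodec(), FineCodec())
code = codec.encode(3.25)
decoded = codec.decode(code)
```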
Reviewed By: beauby
Differential Revision: D32569997
fbshipit-source-id: 3fe9cd02eaa7d1cfe23b0f1168cc034821f1c362
Summary:
Added a few functions in contrib to:
- run range searches by batches on the query or the database side
- emulate range search on GPU: search on GPU with k=1024, if the farthest neighbor is still within range, re-perform search on CPU
- provide reference implementations for precision-recall on range search datasets
- compute precision-recall plots with optimized code (i.e. sweep over thresholds)
The new functions are mainly in a new `evaluation.py`.
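The GPU emulation logic from the list above can be sketched as follows. This is not the contrib code: `knn_search` and `range_search` are hypothetical brute-force stand-ins for the GPU k-NN search and the CPU range search, and the example uses a tiny `k` instead of 1024.

```python
def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_search(query, database, k):
    """Stand-in for the GPU search: the k nearest (distance, id) pairs."""
    return sorted((sq_l2(query, v), i) for i, v in enumerate(database))[:k]


def range_search(query, database, radius):
    """Stand-in for the CPU range search: all pairs with distance < radius."""
    return sorted(
        (d, i)
        for i, v in enumerate(database)
        if (d := sq_l2(query, v)) < radius
    )


def range_search_via_knn(query, database, radius, k):
    top = knn_search(query, database, k)
    if len(top) == k and top[-1][0] < radius:
        # the farthest of the k neighbors is still within range, so the
        # result list may be truncated: redo the search exhaustively
        return range_search(query, database, radius)
    return [(d, i) for d, i in top if d < radius]


database = [[0.0], [1.0], [2.0], [3.0], [10.0]]
query = [0.0]
expected = range_search(query, database, radius=5.0)
small_k = range_search_via_knn(query, database, radius=5.0, k=2)  # falls back
large_k = range_search_via_knn(query, database, radius=5.0, k=4)  # no fallback
```

Both calls return the same result; only the first one needs the fallback search.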
Reviewed By: wickedfoo
Differential Revision: D25627619
fbshipit-source-id: 58f90654c32c925557d7bbf8083efbb710712e03
Summary:
`long` is 32 bits on Windows, and so is the default int type for NumPy (e.g. the one used for `np.arange`).
This diff explicitly specifies 64-bit ints for all occurrences where it matters.
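A minimal sketch of the pattern, assuming NumPy is installed; the variable names are illustrative only. Passing the dtype explicitly makes the result independent of the platform default.

```python
import numpy as np

# explicit 64-bit ids instead of relying on the platform default int
ids = np.arange(10, dtype="int64")

# converting an existing array whose dtype may be narrower
labels = np.array([3, 1, 2], dtype="int32").astype("int64")
```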
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381
Reviewed By: wickedfoo
Differential Revision: D23371232
Pulled By: mdouze
fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04