Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2123
One of the encodings used by LCC is based on an RCQ coarse quantizer and an ITQ "payload". The codes are compared using Hamming distances.
The index type `IndexIVFSpectralHash` can be re-purposed to perform this type of indexing.
This diff contains a small Python demo script, demo_rcq_itq, that shows how:
* the RCQ + ITQ are trained
* the RCQ + ITQ index add and search work (with a very inefficient python implementation)
* they can be transferred to an `IndexIVFSpectralHash`
* the python implementation and `IndexIVFSpectralHash` give the same results
The advantage of using an `IndexIVFSpectralHash` is that, in C++, it offers an `InvertedListScanner` object whose `distance_to_code` method computes query-to-code distances. This is generic and will generalize to other types of encodings and coarse quantizers.
What is missing is index_factory support to make instantiation easier.
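As a rough illustration of the Hamming comparison of binary codes described above, here is a minimal pure-Python sketch. It is not the demo script from this diff and not the faiss API: the helper names `binarize`, `hamming` and `search` are hypothetical, the RCQ coarse quantizer is left out, and the learned ITQ rotation is skipped (the input is assumed to be already rotated and centered).

```python
def binarize(vec):
    """Pack the signs of a vector into an int: bit i is set when vec[i] > 0.

    Assumption: vec has already been rotated/centered; a real ITQ
    encoder would apply its trained transform before this step.
    """
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of bits that differ between two integer codes."""
    return bin(a ^ b).count("1")


def search(query_code, db_codes, k):
    """Brute-force search: the k (distance, id) pairs with smallest Hamming distance."""
    dists = sorted((hamming(query_code, c), i) for i, c in enumerate(db_codes))
    return dists[:k]


db_codes = [binarize(v) for v in ([0.5, -1.0, 2.0], [1.0, 1.0, -3.0], [-0.1, -0.2, 0.7])]
nearest = search(binarize([0.4, -0.3, 1.5]), db_codes, 2)
```

In an IVF setting this scan would only run over the codes stored in the inverted lists selected by the coarse quantizer, rather than over the whole database.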
Reviewed By: sc268
Differential Revision: D32642900
fbshipit-source-id: 284f3029d239b7946bbca44a748def4e058489bd
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2117
This supports 2 concatenated codecs. It is based on IndexRefine, which already does this but did not have a standalone codec interface.
The main use case for now is a residual quantizer + ITQ.
The test below demonstrates how to instantiate that.
The advantage is that the index_factory parser already exists.
The IndexRefine decoder just uses the decoder of the second index, which is supposed to be more accurate than the first.
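The idea of two concatenated codecs, where decoding relies only on the second one, can be sketched with toy classes. This is only an illustration under stated assumptions: `ScalarCodec` and `TwoLevelCodec` are hypothetical names, not the faiss classes, and the toy scalar quantizer stands in for the residual quantizer and ITQ mentioned above.

```python
class ScalarCodec:
    """Hypothetical toy codec: one byte per component, quantized to multiples of `step`."""

    def __init__(self, d, step):
        self.d = d
        self.step = step
        self.code_size = d  # bytes per encoded vector

    def encode(self, x):
        # Clamp to the byte range so every component fits in one byte.
        return bytes(max(0, min(255, round(v / self.step))) for v in x)

    def decode(self, code):
        return [b * self.step for b in code]


class TwoLevelCodec:
    """Concatenates the codes of two codecs; decoding uses only the second one."""

    def __init__(self, first, second):
        self.first = first
        self.second = second
        self.code_size = first.code_size + second.code_size

    def encode(self, x):
        return self.first.encode(x) + self.second.encode(x)

    def decode(self, code):
        # The second codec is assumed to be the more accurate one,
        # so the first part of the code is ignored when decoding.
        return self.second.decode(code[self.first.code_size:])


codec = TwoLevelCodec(ScalarCodec(2, 1.0), ScalarCodec(2, 0.25))
code = codec.encode([1.3, 2.6])
decoded = codec.decode(code)
```

The first part of the code remains useful for fast, approximate comparisons, while the second part gives the better reconstruction.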
Reviewed By: beauby
Differential Revision: D32569997
fbshipit-source-id: 3fe9cd02eaa7d1cfe23b0f1168cc034821f1c362
Summary:
Added a few functions in contrib to:
- run range searches by batches on the query or the database side
- emulate range search on GPU: search on GPU with k=1024 and, if the farthest neighbor is still within range, redo the search on CPU
- serve as reference implementations for precision-recall on range search datasets
- plot precision-recall curves efficiently (i.e. sweep over thresholds)
The new functions are mainly in a new `evaluation.py`
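The GPU range search emulation described in the list above can be sketched in pure Python. This is not the code added to contrib: the function names are hypothetical, `knn_search` stands in for the GPU k-NN search, `exact_range_search` stands in for the CPU range search, and distances are assumed to be squared L2 with a strict `< radius` test.

```python
def sq_dist(a, b):
    """Squared L2 distance between two vectors given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_search(query, db, k):
    """Stand-in for the fast k-NN search: k nearest (distance, id) pairs."""
    return sorted((sq_dist(query, v), i) for i, v in enumerate(db))[:k]


def exact_range_search(query, db, radius):
    """Stand-in for the exhaustive range search: all pairs with distance < radius."""
    hits = []
    for i, v in enumerate(db):
        d = sq_dist(query, v)
        if d < radius:
            hits.append((d, i))
    return sorted(hits)


def range_search_via_knn(query, db, radius, k):
    """Return (results, used_fallback)."""
    knn = knn_search(query, db, k)
    if len(knn) == k and knn[-1][0] < radius:
        # The farthest of the k neighbors is still within range, so the
        # k-NN result may be truncated: redo the search exhaustively.
        return exact_range_search(query, db, radius), True
    # Otherwise every in-range point is among the k neighbors: just filter.
    return [(d, i) for d, i in knn if d < radius], False


db = [[0.0], [1.0], [2.0], [10.0]]
truncated, fell_back = range_search_via_knn([0.0], db, radius=5.0, k=2)
complete, no_fallback = range_search_via_knn([0.0], db, radius=5.0, k=4)
```

With a large k, the fallback is only triggered for the few queries that have more than k neighbors within the radius, so most queries are served by the fast path.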
Reviewed By: wickedfoo
Differential Revision: D25627619
fbshipit-source-id: 58f90654c32c925557d7bbf8083efbb710712e03
Summary:
`long` is 32 bits on Windows, and so is the default int type for numpy (e.g. the one used for `np.arange`).
This diff explicitly specifies 64-bit ints for all occurrences where it matters.
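A minimal sketch of the kind of change this implies, assuming numpy is available. The helper `make_ids` is hypothetical and only illustrates passing an explicit dtype instead of relying on the platform default.

```python
import numpy as np


def make_ids(n):
    """Sequential ids as explicit 64-bit ints, independent of the platform default."""
    return np.arange(n, dtype=np.int64)


ids = make_ids(5)
```

The same explicit `dtype` can be passed to other array constructors such as `np.zeros` or `np.array`.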
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381
Reviewed By: wickedfoo
Differential Revision: D23371232
Pulled By: mdouze
fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04