Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2245
This changeset makes the `heap_replace_top()` function of the FAISS heap implementation break distance ties by the element's ID, according to the heap's min/max property.
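For illustration, a minimal Python sketch of the intended ordering, not the actual C++ heap code; whether the smaller or larger ID wins a tie depends on the heap's min/max orientation, and the choice below is an assumption made for the example.
```
import heapq

# Sketch only (not the FAISS C++ implementation): keep the k smallest distances
# in a max-heap, emulated here by negating keys in Python's min-heap.
# Ties on distance are broken by the element ID (here the smaller ID is kept,
# an assumption for the illustration).
def keep_k_smallest(candidates, k):
    heap = []  # entries are (-distance, -id); heap[0] is the current worst result
    for dist, idx in candidates:
        key = (-dist, -idx)
        if len(heap) < k:
            heapq.heappush(heap, key)
        elif key > heap[0]:  # smaller distance, or equal distance and smaller id
            heapq.heapreplace(heap, key)  # analogue of heap_replace_top()
    return sorted((-d, -i) for d, i in heap)

# the ties at distance 0.5 are resolved in favor of the smallest ID
print(keep_k_smallest([(0.5, 3), (0.5, 1), (0.2, 7), (0.5, 2)], k=2))
# [(0.2, 7), (0.5, 1)]
```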
Reviewed By: mdouze
Differential Revision: D34669542
fbshipit-source-id: 0db24fd12442eedeee917fbb3e811ba4a070ce0f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2166
RQ training is done progressively from one quantizer to the next, maintaining a current set of codes and quantization centroids.
However, for RQ, as for any additive quantizer, there is a closed-form solution for the centroids that minimizes the quantization error for fixed codes.
This diff adds the option to estimate that codebook at the end of the optimization. The estimation is performed iteratively, i.e. several rounds of code computation and codebook refinement are performed.
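As an illustration of the closed-form step, here is a hypothetical numpy sketch (not the FAISS code; the function name and shapes are made up): for fixed codes, each reconstruction is the sum of one selected centroid per quantizer, so stacking the one-hot code indicators into a matrix B makes X ≈ B C linear in the stacked codebook C, which is then a least-squares solution.
```
import numpy as np

# Hypothetical sketch of one codebook-refinement round for an additive quantizer
# with M codebooks of K centroids each, in dimension d.
# X: (n, d) training vectors, codes: (n, M) selected centroid indices.
def refine_codebooks(X, codes, M, K):
    n, d = X.shape
    # one-hot indicator matrix B of shape (n, M*K): B[i, m*K + codes[i, m]] = 1
    B = np.zeros((n, M * K))
    rows = np.repeat(np.arange(n), M)
    cols = (np.arange(M) * K + codes).ravel()
    B[rows, cols] = 1
    # closed-form codebooks minimizing ||X - B C||^2 for the fixed codes
    C, *_ = np.linalg.lstsq(B, X, rcond=None)
    return C.reshape(M, K, d)
```
Alternating this update with re-encoding of the training vectors gives the iterative refinement described above.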
A pure Python implementation and results are available here:
https://github.com/fairinternal/faiss_improvements/blob/dbcc746/decoder/refine_aq_codebook.ipynb
Reviewed By: wickedfoo
Differential Revision: D33309409
fbshipit-source-id: 55c13425292e73a1b05f00e90f4dcfdc8b3549e8
Summary:
This diff implements non-uniform quantization of vector norms in additive quantizers. index_factory and I/O are supported.
index_factory: `XXX_Ncqint{nbits}` where `nbits` is the number of bits used to quantize the vector norm.
For an 8-bit code it is almost the same as 8-bit uniform quantization; it slightly improves accuracy when the code size is below 8 bits. A usage sketch follows the results below.
```
RQ4x8_Nqint8: R@1 0.1116
RQ4x8_Ncqint8: R@1 0.1117
RQ4x8_Nqint4: R@1 0.0901
RQ4x8_Ncqint4: R@1 0.0989
```
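A minimal usage sketch with the factory strings above (data, sizes and the `RQ4x8_Ncqint4` choice are arbitrary):
```
import faiss
import numpy as np

d = 64
xt = np.random.rand(10000, d).astype('float32')
xb = np.random.rand(20000, d).astype('float32')

# 4x8-bit residual quantizer; the vector norm is encoded with a
# non-uniform 4-bit quantizer via the new _Ncqint{nbits} suffix
index = faiss.index_factory(d, "RQ4x8_Ncqint4")
index.train(xt)
index.add(xb)
D, I = index.search(xb[:5], 10)
```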
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2037
Test Plan:
buck test //faiss/tests/:test_clustering -- TestClustering1D
buck test //faiss/tests/:test_lsq -- test_index_accuracy_cqint
buck test //faiss/tests/:test_residual_quantizer -- test_norm_cqint
buck test //faiss/tests/:test_residual_quantizer -- test_search_L2
Reviewed By: beauby
Differential Revision: D31083476
Pulled By: mdouze
fbshipit-source-id: f34c3dafc4eb1c6f44a63e68137158911aa4a2f4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2018
The centroid norms table was not reconstructed correctly after being stored in a ResidualCoarseQuantizer (RCQ).
Reviewed By: Sugoshnr
Differential Revision: D30484389
fbshipit-source-id: 9f618a3939c99dc987590c07eda8e76e19248b08
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1908
To search for the best combination of codebook entries, the method implemented so far is a beam search.
It is possible to make this faster for a query vector q by precomputing look-up tables of the form
LUT_m = <q, cent_m>
where cent_m is the set of centroids of quantizer m = 0..M-1.
The LUT can then be used as
inner_prod = sum_m LUT_m[c_m]
and
L2_distance = norm_q + norm_db - 2 * inner_prod
This diff implements this computation by:
- adding the LUT precomputation
- storing an exhaustive table of all centroid norms (when using L2)
This is only practical for small additive quantizers, e.g. when a residual quantizer is used as a coarse quantizer (ResidualCoarseQuantizer).
This diff is based on the AdditiveQuantizer diff because the approach applies equally to other additive quantizers (e.g. LSQ).
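A hypothetical numpy sketch of the table-based distance computation (not the FAISS implementation; variable names are made up, and all norms are squared norms):
```
import numpy as np

# q: query of shape (d,)
# centroids: list of M arrays of shape (K, d), one per quantizer
# codes: (n, M) centroid indices of the database vectors
# db_norms: (n,) squared norms of the reconstructed database vectors
def l2_distances_with_luts(q, centroids, codes, db_norms):
    M = len(centroids)
    # LUT_m = <q, cent_m> for each quantizer m
    LUTs = [cent @ q for cent in centroids]                 # each of shape (K,)
    # inner_prod = sum_m LUT_m[c_m]
    inner = sum(LUTs[m][codes[:, m]] for m in range(M))     # shape (n,)
    # L2_distance = norm_q + norm_db - 2 * inner_prod
    return np.dot(q, q) + db_norms - 2 * inner
```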
Reviewed By: sc268
Differential Revision: D28467746
fbshipit-source-id: 82611fe1e4908c290204d4de866338c622ae4148
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1865
This diff encodes the vectors in chunks, which makes encoding more memory-efficient.
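A hypothetical sketch of the chunking idea (the helper below is illustrative, not the FAISS code; it assumes a quantizer object exposing compute_codes(), such as the ResidualQuantizer):
```
import numpy as np

# encode in fixed-size batches so the intermediate encoding buffers stay
# bounded, instead of encoding all of xb at once
def compute_codes_chunked(quantizer, xb, chunk_size=65536):
    codes = []
    for i0 in range(0, xb.shape[0], chunk_size):
        codes.append(quantizer.compute_codes(xb[i0:i0 + chunk_size]))
    return np.vstack(codes)
```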
Reviewed By: sc268
Differential Revision: D28234424
fbshipit-source-id: c1afd2aaff953d4ecd339800d5951ae1cae4789a
Summary:
This diff adds the following to bring residual quantizer support on par with PQ:
- IndexResidual can be built with index factory, serialized and used as a Faiss codec.
- ResidualCoarseQuantizer can be used as a coarse quantizer for inverted files.
The factory string looks like "RQ1x16_6x8", which means a first 16-bit quantizer followed by six 8-bit ones. For IVF it is "IVF4096(RQ2x6),Flat".
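A minimal usage sketch assuming these factory strings (data and sizes are made up; the 2^16-centroid first stage of the codec needs a correspondingly large training set, so only the IVF index is trained here):
```
import faiss
import numpy as np

d = 128
xt = np.random.rand(20000, d).astype('float32')
xb = np.random.rand(50000, d).astype('float32')

# stand-alone residual quantizer codec: a first 16-bit quantizer, then six 8-bit ones
codec = faiss.index_factory(d, "RQ1x16_6x8")
print(codec.sa_code_size())  # bytes per code

# ResidualCoarseQuantizer (2x6 bits = 4096 centroids) as IVF coarse quantizer
ivf = faiss.index_factory(d, "IVF4096(RQ2x6),Flat")
ivf.train(xt)
ivf.add(xb)
D, I = ivf.search(xb[:5], 10)
faiss.write_index(ivf, "ivf_rcq.index")  # serialization works as for other indexes
```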
Reviewed By: sc268
Differential Revision: D27865612
fbshipit-source-id: f9f11d29e9f89d3b6d4cd22e9a4f9222422d5f26
Summary:
This diff includes:
- progressive dimension k-means
- the ResidualQuantizer object (a usage sketch follows this list)
- GpuProgressiveDimIndexFactory, so that it can be trained on GPU
- corresponding tests
- a reference Python implementation of the same in scripts/matthijs/LCC_encoding
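A minimal usage sketch of the ResidualQuantizer object (hypothetical data and parameters):
```
import faiss
import numpy as np

d = 32
xt = np.random.rand(10000, d).astype('float32')
xb = np.random.rand(1000, d).astype('float32')

# residual quantizer with 4 codebooks of 8 bits each -> 4-byte codes
rq = faiss.ResidualQuantizer(d, 4, 8)
rq.train(xt)
codes = rq.compute_codes(xb)          # uint8 array of shape (1000, rq.code_size)
recons = rq.decode(codes)
mse = ((xb - recons) ** 2).sum(axis=1).mean()
print(rq.code_size, mse)
```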
Reviewed By: wickedfoo
Differential Revision: D27608029
fbshipit-source-id: 9a8cf3310c8439a93641961ca8b042941f0f4249