Summary:
Exporting a few more functions to the C API
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2367
Reviewed By: alexanderguzhva
Differential Revision: D37480505
Pulled By: mdouze
fbshipit-source-id: 899baca8795e29b20e16b56ea3c0d13960e1ea37
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2327
Expose buffer sizes for:
* MultiIndexQuantizer::search
* IndexIVFPQ::add_core_o
* Index2Layer::sa_encode
* ProductQuantizer::compute_codes
These constants were introduced to handle the possible out-of-memory problem. Faiss performs certain operations in chunks. Increasing the chunk sizes reduces the OpenMP overhead and speeds up computations in certain cases at the cost of higher memory consumption.
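The trade-off described above can be illustrated with a minimal pure-Python sketch. The helper `process_in_chunks` and its parameters are hypothetical and are not part of Faiss; the sketch only shows why a larger chunk size means fewer per-chunk calls but a larger temporary buffer.

```python
def process_in_chunks(items, chunk_size, fn):
    """Apply fn to successive slices of items, at most chunk_size at a time."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = []
    n_calls = 0
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        out.extend(fn(chunk))  # temporary memory scales with len(chunk)
        n_calls += 1           # fixed per-chunk overhead is paid once per call
    return out, n_calls


def square_all(chunk):
    return [v * v for v in chunk]


data = list(range(10))
small, calls_small = process_in_chunks(data, 3, square_all)
large, calls_large = process_in_chunks(data, 8, square_all)
```

Both runs produce the same result; only the number of calls and the peak chunk length differ.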
Reviewed By: mdouze
Differential Revision: D36248391
fbshipit-source-id: 17b38f8b7f59748d5ff72c79938e66b1800983a9
Summary:
This diff added ProductAdditiveQuantizer.
A simple description of the algorithm:
1. Divide the vector space into several orthogonal sub-spaces, just like PQ does.
2. Quantize each sub-space by an independent additive quantizer.
Usage:
Construct a ProductAdditiveQuantizer object:
- `d`: dimensionality of the input vectors
- `nsplits`: number of sub-spaces the vector space is divided into
- `Msub`: `M` of each additive quantizer
- `nbits`: `nbits` of each additive quantizer
```python
d = 128
nsplits = 2
Msub = 4
nbits = 8
plsq = faiss.ProductLocalSearchQuantizer(d, nsplits, Msub, nbits)
prq = faiss.ProductResidualQuantizer(d, nsplits, Msub, nbits)
```
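The two steps of the algorithm can also be sketched in plain Python, without Faiss. Everything below is a toy written under assumptions: the function names are hypothetical, the codebooks are hand-picked rather than trained, and each sub-space uses a greedy residual encoder as its additive quantizer.

```python
def nearest(codebook, v):
    """Index of the codebook entry closest to v in squared L2 distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], v)))


def rq_encode(codebooks, v):
    """Greedy residual encoding: one code per codebook, applied to the remaining residual."""
    residual = list(v)
    codes = []
    for cb in codebooks:
        k = nearest(cb, residual)
        codes.append(k)
        residual = [r - c for r, c in zip(residual, cb[k])]
    return codes


def rq_decode(codebooks, codes):
    """An additive quantizer reconstructs by summing the selected entries."""
    out = [0.0] * len(codebooks[0][0])
    for cb, k in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[k])]
    return out


def product_encode(sub_quantizers, x):
    """Step 1: split x into equal sub-vectors. Step 2: encode each one independently."""
    dsub = len(x) // len(sub_quantizers)
    codes = []
    for s, codebooks in enumerate(sub_quantizers):
        codes.extend(rq_encode(codebooks, x[s * dsub:(s + 1) * dsub]))
    return codes


def product_decode(sub_quantizers, codes):
    """Decode each sub-space from its own slice of codes and concatenate the results."""
    out, pos = [], 0
    for codebooks in sub_quantizers:
        m = len(codebooks)
        out.extend(rq_decode(codebooks, codes[pos:pos + m]))
        pos += m
    return out


# Toy setup: d = 4, nsplits = 2, Msub = 2, two entries per codebook.
cb_coarse = [[0.0, 0.0], [4.0, 4.0]]
cb_fine = [[0.0, 0.0], [1.0, 0.0]]
sub_quantizers = [[cb_coarse, cb_fine], [cb_coarse, cb_fine]]

x = [5.0, 4.0, 0.0, 0.0]
codes = product_encode(sub_quantizers, x)
x_rec = product_decode(sub_quantizers, codes)
```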
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2286
Test Plan:
```
buck test //faiss/tests/:test_local_search_quantizer -- TestProductLocalSearchQuantizer
buck test //faiss/tests/:test_residual_quantizer -- TestProductResidualQuantizer
```
Reviewed By: alexanderguzhva
Differential Revision: D35907702
Pulled By: mdouze
fbshipit-source-id: 7428a196e6bd323569caa585c57281dd70e547b1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2320
Checks are a bit stricter in platform010, so fix new CI errors here.
The errors corrected fall into 3 classes:
- `&vector[vector.size()]` now fails because `operator []` checks for array bounds even if only the address is manipulated
- `omp schedule(dynamic)` does not run the loop in the correct order.
- several threads calling omp loop seems to cause errors in the distributed Faiss code
Reviewed By: beauby
Differential Revision: D35895550
fbshipit-source-id: e9dcf5615158610a42870e6a41c77e4db6ebeea0
Summary:
Fixed the include file for the IVFPQ demo in the GPU index. Adds a targets entry for it as well.
Fixes
https://github.com/facebookresearch/faiss/issues/2293
Reviewed By: beauby
Differential Revision: D35775928
fbshipit-source-id: 15ea837e5a67a6d692e980d90195400936dac1e1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2294
There is a weird CI failure on one of the platforms, occurring in the PR
https://github.com/facebookresearch/faiss/pull/2291
This diff makes the test a bit more robust, correcting inter_perf to compute the intersection measure. Hopefully this will make the bug go away.
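For reference, one common definition of an intersection measure between two sets of search results is sketched below. The commit does not spell out the definition used by inter_perf, so the function name and the exact normalization here are assumptions.

```python
def intersection_measure(ids_ref, ids_new):
    """Fraction of reference result ids that also appear in the new results.

    Both arguments are lists of per-query id lists; order within a row is ignored.
    """
    if len(ids_ref) != len(ids_new):
        raise ValueError("both result sets must cover the same queries")
    total = 0
    found = 0
    for ref_row, new_row in zip(ids_ref, ids_new):
        total += len(ref_row)
        found += len(set(ref_row) & set(new_row))
    return found / total


ref = [[1, 2, 3], [4, 5, 6]]
new = [[3, 2, 9], [6, 7, 8]]
score = intersection_measure(ref, new)  # 3 shared ids out of 6
```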
Reviewed By: beauby
Differential Revision: D35558855
fbshipit-source-id: f5a926d9d8ebee975e538c65ac37b15d485798aa
Summary:
When I reconstructed with by_residual turned off, the distance increased greatly.
This is because the reconstruct_from_offset function did not check whether the by_residual option was off.
I fixed this bug with a simple if statement.
(like this https://github.com/facebookresearch/faiss/blob/main/faiss/IndexIVFPQ.cpp#L365)
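The shape of the fix can be sketched in Python as follows. This is a simplified stand-in and not the actual C++ code: `reconstruct` and its arguments are hypothetical, and plain lists stand in for the decoded PQ vector and the coarse centroid.

```python
def reconstruct(decoded, centroid, by_residual):
    """Add the coarse centroid back only when the codes encode residuals."""
    if by_residual:
        return [c + r for c, r in zip(centroid, decoded)]
    # Without by_residual the codes encode the vector itself.
    return list(decoded)


decoded = [0.5, -1.0]
centroid = [10.0, 20.0]
with_residual = reconstruct(decoded, centroid, by_residual=True)
without_residual = reconstruct(decoded, centroid, by_residual=False)
```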
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2298
Reviewed By: alexanderguzhva
Differential Revision: D35746566
Pulled By: mdouze
fbshipit-source-id: 50f98c7cc97c7936507573fe41b65a79ecdbc4ca
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2295
Makes a common ancestor for quantizer classes.
As a first application, adds a clone_Quantizer function
Reviewed By: alexanderguzhva
Differential Revision: D35561960
fbshipit-source-id: 896a4f3fc4ab992511cdc0642689a440f170f683
Summary:
Start migration of existing benchmarks to Google's Benchmark library + register benchmark to servicelab.
The benchmark should be automatically registered to servicelab once this diff lands according to https://www.internalfb.com/intern/wiki/ServiceLab/Use_Cases/Benchmarks_(C++)/#servicelab-job.
Reviewed By: mdouze
Differential Revision: D35397782
fbshipit-source-id: 317db2527f12ddde0631cacc3085c634afdd0e37
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2274
All input matrices needed to be of the correct type and to be C-contiguous. This diff passes the main entry points of the api through `np.ascontiguousarray` so that the function parameters are transparently converted to the suitable format if needed.
We did not do this before because users need to be made aware of the performance impact, but usability now seems more important.
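A minimal sketch of the conversion, using only NumPy. The wrapper name `as_index_input` is hypothetical and the float32 target type is an assumption for this example; the point is that a non-contiguous or wrongly typed array is copied, while a suitable one is passed through without a copy.

```python
import numpy as np


def as_index_input(x):
    """Return x as a C-contiguous float32 array, copying only if needed."""
    return np.ascontiguousarray(x, dtype="float32")


xb = np.arange(12, dtype="float64").reshape(3, 4)
view = xb[:, ::2]               # strided float64 view, not C-contiguous
fixed = as_index_input(view)    # converted copy
again = as_index_input(fixed)   # already suitable, so no new buffer is needed
```

The copy in the first call is the performance cost mentioned above; the second call shows that well-formed inputs do not pay it.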
This diff is an alternative to
D35007365
https://github.com/facebookresearch/faiss/pull/2250
Reviewed By: beauby
Differential Revision: D35009612
fbshipit-source-id: fa0d5cfdfbff6b0916d47bd33c620e3ca9d5dd40
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2280
Add a new function, fvec_L2sqr_ny_nearest, and a demonstration of its implementation for 4 bits
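Judging by its name, the function finds the nearest of `ny` vectors to a query in squared L2 distance. The sketch below is a plain-Python reference for that behavior only; the signature and the tie-breaking rule are assumptions and do not reflect the C++ function.

```python
def l2sqr_ny_nearest(x, ys):
    """Return (index, squared L2 distance) of the vector in ys closest to x.

    On ties, the first vector encountered wins.
    """
    best_i, best_d = -1, float("inf")
    for i, y in enumerate(ys):
        d = sum((a - b) ** 2 for a, b in zip(x, y))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


query = [0.0, 0.0]
candidates = [[3.0, 4.0], [1.0, 1.0], [1.0, -1.0]]
nearest_index, nearest_dist = l2sqr_ny_nearest(query, candidates)
```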
Reviewed By: mdouze
Differential Revision: D35189945
fbshipit-source-id: d1b2ba42851df195123c7e318a8dcf26f775eaba
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2277
* extend a specialized AVX2 version for IVFPQScannerT::scan_list_with_table to cover IVFPQScannerT::scan_list_polysemous_hc as well
* lower the comparison precision in test_lowlevel_ivf tests from EXPECT_EQ to EXPECT_FLOAT_EQ because of the AVX2 change in IVFPQScannerT::scan_list_polysemous_hc, otherwise tests fail
Reviewed By: mdouze
Differential Revision: D34964138
fbshipit-source-id: 1d304a8f6eda040fa4c626676b4d492f2c12f04f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2255
The `DistanceComputer` object is derived from an Index (obtained with `get_distance_computer()`). It maintains a current query and quickly computes distances from that query to any item in the database. This is useful, e.g., for IndexHNSW and IndexNSG, which rely on query-to-point comparisons in the dataset.
This diff introduces the `FlatCodesDistanceComputer`, which inherits from `DistanceComputer` for Flat indexes. In addition to the distance-to-item function, it adds a `distance_to_code` function that computes the distance from any code to the current query, even if the code is not stored in the index.
This is implemented for all FlatCode indexes (IndexFlat, IndexPQ, IndexScalarQuantizer and IndexAdditiveQuantizer).
In the process, the two classes were extracted to their own header file `impl/DistanceComputer.h`
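The relationship between the two distance functions can be sketched with a toy class. This is not the Faiss implementation: the class name, the integer "codes" and the scale-based decoding are all invented for illustration. Only the idea is taken from the description above: the distance to a stored item is the distance to that item's code.

```python
class ToyFlatCodesDistanceComputer:
    """Toy distance computer over integer codes decoded as code / scale."""

    def __init__(self, codes, scale):
        self.codes = codes    # stored codes, one tuple of ints per item
        self.scale = scale
        self.query = None

    def set_query(self, q):
        self.query = list(q)

    def distance_to_code(self, code):
        """Squared L2 distance from the current query to any code, stored or not."""
        decoded = [c / self.scale for c in code]
        return sum((a - b) ** 2 for a, b in zip(self.query, decoded))

    def __call__(self, i):
        """Distance from the current query to stored item i."""
        return self.distance_to_code(self.codes[i])


dc = ToyFlatCodesDistanceComputer(codes=[(0, 0), (2, 4)], scale=2.0)
dc.set_query([1.0, 2.0])
d_item0 = dc(0)
d_item1 = dc(1)
d_external = dc.distance_to_code((4, 4))  # a code that is not in the index
```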
Reviewed By: beauby
Differential Revision: D34863609
fbshipit-source-id: 39d8c66475e55c3223c4a6a210827aa48bca292d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2251
The fast_scan and fast_scan_ivf tests are irregularly timing out on the FB test infra.
This diff:
- breaks down more tests into sub-tests
- makes tests cheaper by reducing the test dataset sizes
- corrects a nasty local variable binding bug that prevented all cases of `implem` from being covered.
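The commit does not say exactly what the binding bug looked like. One common form of this problem in Python is late binding of a loop variable in closures, sketched below with hypothetical helper names. In the buggy version every generated test sees the last value of `implem`.

```python
def make_tests_buggy(implems):
    tests = []
    for implem in implems:
        # The lambda looks up `implem` when it is called, after the loop has ended.
        tests.append(lambda: implem)
    return tests


def make_tests_fixed(implems):
    tests = []
    for implem in implems:
        # A default argument captures the value at definition time.
        tests.append(lambda implem=implem: implem)
    return tests


buggy = [t() for t in make_tests_buggy([10, 12, 13])]
fixed = [t() for t in make_tests_fixed([10, 12, 13])]
```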
I also tried to fix the polysemous tests, which also time out, but I could not reproduce the timeout.
https://www.internalfb.com/intern/test/562949978542309?ref_report_id=0
Reviewed By: beauby
Differential Revision: D34852254
fbshipit-source-id: b005ffb3723e7d9df75516a539540d9165249cea
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2263
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2258
GPU IVF indices that had been deserialized or copied from CPU could not be added to afterwards. Doing so resulted in the following C++ assertion:
```
Faiss assertion 'indices->numVecs == oldNumVecs' failed in int faiss::gpu::IVFBase::addVectors(faiss::gpu::Tensor<float, 2, true>&, faiss::gpu::Tensor<long int, 1, true>&) at faiss/gpu/impl/IVFBase.cu:581
```
as the count of vectors present was not updated properly everywhere, as discovered internally by vtantia.
This diff fixes this issue by properly updating the count, as well as cleaning up stream usage in the IVF code. The problem is that the code was previously using `thrust::device_vector` which does not have a means to control on which stream copies or other work is performed. This is fixed by replacing all usage of `thrust::device_vector` with our own `DeviceVector` which was already used to store IVF data but not metadata. `DeviceVector` provides sufficient control over the proper CUDA stream usage.
Reviewed By: vtantia, mdouze
Differential Revision: D34886859
fbshipit-source-id: 70577bb386ff7dc0f4443ec4562d3ee80afc24e3
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2245
This changeset makes the `heap_replace_top()` function of the FAISS heap implementation break distance ties by the element's ID, according to the heap's min/max property.
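A minimal sketch of the idea for a max-heap, in plain Python. It is not the Faiss code: the heap here is a 0-indexed list of `(distance, id)` tuples, and tuple comparison supplies the tie-break, since equal distances are then ordered by id.

```python
def heap_replace_top(heap, item):
    """Replace the root of a max-heap of (distance, id) pairs and sift down.

    Tuples compare by distance first and by id second, so ties on
    distance are broken by id.
    """
    n = len(heap)
    i = 0
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        if left >= n:
            break
        child = left
        if right < n and heap[right] > heap[left]:
            child = right           # follow the larger child
        if heap[child] <= item:
            break                   # item already dominates both children
        heap[i] = heap[child]       # move the child up
        i = child
    heap[i] = item


heap = [(5.0, 1), (3.0, 7), (3.0, 2)]
heap_replace_top(heap, (3.0, 5))
```

All three entries now have the same distance, and the root is the one with the largest id.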
Reviewed By: mdouze
Differential Revision: D34669542
fbshipit-source-id: 0db24fd12442eedeee917fbb3e811ba4a070ce0f
Summary:
If the number of dimensions per sub-quantizer is not in the specialized list, it falls back to the generalized batch GEMM implementation.
When I implemented this, I had put in a d2h (device-to-host) copy so I could look at the computed distances. I removed the debugging code but not this copy.
Prior to this, PQ16 on 1024 dims was 6x slower than PQ32. Now, it is only 1.5x slower (it is slower because there is a higher number of dims per sub-q, despite there being more sub-qs).
Reviewed By: beauby
Differential Revision: D34526043
fbshipit-source-id: de6f70f0f0b91608eb6ae2a05da2af812546e4bc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217
This diff introduces a new Faiss contrib module that contains:
- generic k-means implemented in python (was in distributed_ondisk)
- the two-level clustering code, including a simple function that runs it on a Faiss IVF index.
- sparse clustering code (new)
The main idea is that this code is often re-used, so it is better to have it in contrib.
Reviewed By: beauby
Differential Revision: D34170932
fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77
Summary:
Fix an OMP bug and a memory leak. The first one could lead to non-deterministic results or worse.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2168
Test Plan: buck test //faiss/tests/:test_lsq -- test_deterministic
Reviewed By: beauby
Differential Revision: D33975589
Pulled By: mdouze
fbshipit-source-id: c1cf2589b0e718354ccf0221c3474633bcb8c7ee
Summary:
As discussed in https://github.com/facebookresearch/faiss/issues/2072, here is a PR to use InvertedListScanner with Python. It might be slower, but it gives access to this feature in Python for those who want to avoid C++.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2200
Test Plan: check that it compiles
Reviewed By: beauby
Differential Revision: D33975686
Pulled By: mdouze
fbshipit-source-id: dd731f90ce1609d555a17551fcc8c39eadf3fbd7
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2166
RQ training is done progressively from one quantizer to the next, maintaining a current set of codes and quantization centroids.
However, for RQ as for any additive quantizer, there is a closed form solution for the centroids that minimizes the quantization error for fixed codes.
This diff offers the option to estimate that codebook at the end of the optimization. It performs this estimation iteratively, i.e., several rounds of code computation and codebook refinement are performed.
A pure python implementation + results is here:
https://github.com/fairinternal/faiss_improvements/blob/dbcc746/decoder/refine_aq_codebook.ipynb
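The closed-form step mentioned above can be sketched with NumPy as a least-squares problem: with the codes fixed, the reconstruction is linear in the codebook entries. This is a minimal illustration under assumptions (hypothetical function names, M codebooks of K entries each, stacked in one matrix), not the implementation added by this diff.

```python
import numpy as np


def decode(codebooks, codes, K):
    """Sum the selected entry of each of the M stacked codebooks."""
    M = codes.shape[1]
    return sum(codebooks[m * K + codes[:, m]] for m in range(M))


def refit_codebooks(x, codes, K):
    """Least-squares codebooks for fixed codes.

    x: (n, d) vectors; codes: (n, M) ints in [0, K).
    Returns an (M * K, d) matrix of stacked codebooks.
    """
    n, M = codes.shape
    onehot = np.zeros((n, M * K))
    for m in range(M):
        onehot[np.arange(n), m * K + codes[:, m]] = 1.0
    # Minimizes ||onehot @ codebooks - x||^2; lstsq handles rank deficiency.
    codebooks, *_ = np.linalg.lstsq(onehot, x, rcond=None)
    return codebooks


rng = np.random.default_rng(0)
K, M, d, n = 3, 2, 4, 50
true_codebooks = rng.normal(size=(M * K, d))
codes = rng.integers(0, K, size=(n, M))
x = decode(true_codebooks, codes, K)

fitted = refit_codebooks(x, codes, K)
x_rec = decode(fitted, codes, K)
```

The fitted codebooks need not equal the generating ones, because the solution is not unique, but the reconstruction they produce matches the data in this noise-free example.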
Reviewed By: wickedfoo
Differential Revision: D33309409
fbshipit-source-id: 55c13425292e73a1b05f00e90f4dcfdc8b3549e8
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2147
There was a bug in the OPQ string parsing. This diff adds a test and fixes the error.
Reviewed By: aijanai
Differential Revision: D33020167
fbshipit-source-id: 32e43653849b258a3b6d0cfdc44a6c637433f2c8
Summary:
A recent CUDA driver is required for building packages for CUDA 11.3.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2146
Reviewed By: wickedfoo
Differential Revision: D33020204
Pulled By: beauby
fbshipit-source-id: 01257b1dcb4987f4866cc058c22d1dd5977d76ce
Summary:
- Disable problematic tests on OSX.
- Ensure compiler compatibility with CUDA builds.
- Fix path for Python extension libraries.
- Use CentOS for CUDA packaging.
- Update CUDA versions in CI (10.2 and 11.3).
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2121
Reviewed By: mdouze
Differential Revision: D32921117
Pulled By: beauby
fbshipit-source-id: 588c18add8084b8228ff5abc651eaa4567919cc6
Summary:
IndexHNSW has a deadlock in the add() method, which is fixed by
temporarily releasing the lock on the current element while updating
its neighbors' adjacency lists.
This bug concerns multi-threaded insertion only, and seems to manifest
itself only with certain OpenMP configurations.
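The locking pattern behind the fix can be sketched with Python threads. This is a heavily simplified model and not the HNSW code: `Node` and `add_link_both_ways` are hypothetical. The key point is that a thread never waits for a neighbor's lock while still holding its own, so two threads linking the same pair in opposite orders cannot block each other.

```python
import threading


class Node:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.neighbors = set()


def add_link_both_ways(current, neighbor):
    # Update the current element's list under its own lock.
    with current.lock:
        current.neighbors.add(neighbor.name)
    # The lock on `current` is released here, before the neighbor's lock is taken.
    with neighbor.lock:
        neighbor.neighbors.add(current.name)


def worker(first, second, rounds=1000):
    for _ in range(rounds):
        add_link_both_ways(first, second)


a, b = Node("a"), Node("b")
threads = [
    threading.Thread(target=worker, args=(a, b)),
    threading.Thread(target=worker, args=(b, a)),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=10)
```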
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2143
Reviewed By: mdouze
Differential Revision: D32919041
Pulled By: beauby
fbshipit-source-id: e515541c1b22bfcb79d29c0bde1843e63f5175fb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2134
The old parsing was very complex and grew out of hand.
This diff just uses regex parsing.
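The approach can be illustrated with a small Python sketch. The grammar below is a toy that covers only three component kinds, and the function names are hypothetical; the real parser is written in C++ and handles many more cases.

```python
import re


def parse_component(token):
    """Parse one comma-separated component of a factory-style string."""
    m = re.fullmatch(r"IVF(\d+)", token)
    if m:
        return ("IVF", int(m.group(1)))
    m = re.fullmatch(r"PQ(\d+)(?:x(\d+))?", token)
    if m:
        # The number of bits is optional; 8 is assumed as the default here.
        return ("PQ", int(m.group(1)), int(m.group(2) or 8))
    if token == "Flat":
        return ("Flat",)
    raise ValueError(f"unrecognized component: {token!r}")


def parse_factory_string(description):
    return [parse_component(tok) for tok in description.split(",")]


parsed = parse_factory_string("IVF256,PQ16x4")
parsed_default = parse_factory_string("IVF1024,PQ32")
```

Each component is one anchored pattern, so adding a new kind means adding one more `fullmatch` rather than extending hand-written string scanning.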
Reviewed By: wickedfoo
Differential Revision: D32759110
fbshipit-source-id: 243029bba8a7fe70c71323f5edc7e2ce4e669757
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132
This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings:
- IndexPQ
- IndexFlat
- IndexAdditiveQuantizer
- IndexScalarQuantizer
- IndexLSH
- Index2Layer
The other changes are:
- for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a bit of external code, which this diff also attempts to fix.
- I/O functions needed to be adapted. This is done without changing the I/O format for any index.
- added a small contrib function to get the data from the IndexFlat
- the functionality has been made uniform, for example remove_ids and add are now in the parent class.
Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, e.g. to memmap the data, but this will again require a big change.
Reviewed By: wickedfoo
Differential Revision: D32646769
fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40