Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2533
Implements merge_from for IndexIDMap[2] and IndexPreTransform. In the process, splits IndexIDMap off into its own .h/.cpp files.
Reviewed By: alexanderguzhva
Differential Revision: D40420373
fbshipit-source-id: 1570a460706dd3fbc1447f9fcc0e2721eab869bb
Summary:
Previously, range_search on IVFFastScan crashed: it fell back to the range_search implementation in IndexIVF, which tries to obtain an InvertedListsScanner. That raises an exception, which is not propagated properly from OpenMP to Python.
This diff throws an exception right away instead.
Reviewed By: mlomeli1
Differential Revision: D40853406
fbshipit-source-id: e594a3af682b79868233e32a94aea80579378fc0
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2552
The `conda inspect` commands in the `test` section fail without `conda-build` in the `test` environment.
Reviewed By: mlomeli1
Differential Revision: D40793051
fbshipit-source-id: 184418cfa8d0efd6af6b0c806f7bddbeba176732
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2550
This diff contains two fixes:
- `GpuIndex::search_and_reconstruct` was implemented in D37777979 (the default `faiss::Index` implementation never worked on GPU if given GPU input data), but the number of vectors passed to reconstruct was wrong in that diff. This fixes that, and includes a test for `search_and_reconstruct` as well.
- `GpuFlatIndex::reserve` only worked properly if you were calling `add` afterwards. If not, then this would potentially leave the index in a bad state. This bug has existed since 2016 in GPU Faiss.
Also implemented a test for a massive `GpuIndexFlat` index (more than 4 GB of data). Proper implementation of large (>2 GB) indexes via 64 bit indexing arithmetic will be done in a followup diff which touches most of the GPU code.
Reviewed By: alexanderguzhva
Differential Revision: D40765397
fbshipit-source-id: 7eb4368e7588aea144bc5bcc53fd11b1e70f33ea
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2544
Don't use #pragma once for the include headers.
Reviewed By: rahulg
Differential Revision: D40544318
fbshipit-source-id: 129e6de27d569fd46ccc460a262de3b991f568bc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2532
Add 8x compression level, such as 'PQ64np' for 128-dim data. Prior to this change, only higher compression rates were supported.
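As a rough check of the 8x figure, the arithmetic can be sketched as follows. This assumes float32 input vectors and 8-bit sub-quantizer codes; both are assumptions, and `pq_compression_ratio` is a hypothetical helper, not part of Faiss.

```python
def pq_compression_ratio(d, M, nbits=8, bytes_per_component=4):
    """Ratio between the raw vector size and the PQ code size.

    Assumes d components of bytes_per_component bytes each (float32 by
    default) and M sub-quantizers of nbits bits each. Hypothetical helper.
    """
    raw_bytes = d * bytes_per_component
    code_bytes = (M * nbits + 7) // 8  # round up to whole bytes
    return raw_bytes / code_bytes


# 128-dim float32 vector: 512 bytes; 64 sub-quantizers x 8 bits: 64 bytes
ratio = pq_compression_ratio(128, 64)
```

Under the same assumptions, halving the number of sub-quantizers (M=32) doubles the ratio.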
Reviewed By: mdouze
Differential Revision: D40312821
fbshipit-source-id: 7dba4e9b8d432f5f7be618c0e7ef50dac2f88497
Summary:
D37777979 included a change in order to allow usage of CPU index types for the IVF coarse quantizer. For residual computation, the centroids of the coarse quantizer IVF cells needed to be on the GPU in float32. If a GPUIndexFlat is used as the coarse quantizer for an IVF index, and if that index was float16, or if a CPU index was used as a coarse quantizer, a shadow copy of the centroids was made in float32 for IVF usage.
However, this shadow copy is only needed if a GPU float16 flat index is used as an IVF coarse quantizer. Previously we were always duplicating this data whether a GpuIndexFlat was used in an IVF index or not.
This diff restricts the construction of the shadow float32 data to only cases where we are using the GpuIndexFlat in an IVF index. Otherwise, the GpuIndexFlat, if float16, will only retain float16 data.
This should prevent the problem with memory bloat with massive float16 flat indexes.
Ideally the shadow float32 values for GPU coarse indices shouldn't be needed at all, but this will require updating the IVFPQ code to allow usage of float16 IVF centroids. This is something I will pursue in a less time-limited diff.
This diff also changes the GpuIndexFlat reconstruct methods to use kernels explicitly designed for operating on float16 and float32 data as needed, rather than having access to the entire matrix of float32 values.
Also added some additional assertions in order to track down issues.
An additional problem, seen with N2630278 post-D37777979, is that calling reconstruct on a large flat index (one with more than 2^31 scalar elements) results in an int32 overflow in the reconstruct kernel that is called for a single vector or a contiguous range of vectors. Previously, this use case was handled by `cudaMemcpyAsync` with `size_t`-based offset calculations, but now, in order to handle float16 and float32 in the same manner, there is an explicit kernel that does the copy and the conversion if needed, avoiding a separate copy followed by a conversion. The error seen in that notebook was a fault in the reconstruct-by-range kernel.
This kernel has been temporarily fixed to not have the int32 indexing problems. Since Faiss GPU was written in 2016, GPU memories have become a lot larger, and it now seems time to support (u)int64 indexing everywhere. I am adding this minimal change for now to fix this fault, but early next week I will do a pass over the entire Faiss GPU code to use `Index::idx_t` as the indexing type everywhere, which should remove problems in dealing with large datasets.
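The overflow described above can be illustrated outside CUDA. The sketch below emulates 32-bit and 64-bit offset arithmetic with `ctypes`; the index size (20M vectors of dimension 128) and the helper names are made up for illustration and are not taken from the Faiss kernels.

```python
import ctypes

INT32_MAX = 2**31 - 1


def element_offset_int32(vec_id, dim):
    """Offset of a vector's first scalar, truncated to a signed 32-bit int."""
    return ctypes.c_int32(vec_id * dim).value


def element_offset_int64(vec_id, dim):
    """Same offset, held in a signed 64-bit int."""
    return ctypes.c_int64(vec_id * dim).value


# Hypothetical flat index: 20M vectors x 128 dims = 2.56e9 scalars (> 2^31)
num_vecs, dim = 20_000_000, 128
last = num_vecs - 1

bad = element_offset_int32(last, dim)   # wraps around to a negative offset
good = element_offset_int64(last, dim)  # the intended offset
```

With 32-bit arithmetic the offset of the last vector wraps to a negative number, which is the kind of out-of-range access that shows up as a kernel fault.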
Reviewed By: mdouze
Differential Revision: D40355184
fbshipit-source-id: 78f8b5d5aebcba610d3cd46f2cb2d26276e0ff15
Summary:
* Modify pq4_get_packed_element so that it does not depend on an auxiliary table
* Create pq4_set_packed_element, which sets a single element in codes in packed format
(These methods are used in merge and remove for IndexFastScan;
the get method is also used in FastScan indices for reconstruction)
* Add remove feature for IndexFastScan
* Add merge feature for IndexFastScan
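To make the idea of reading and writing a single 4-bit code concrete, here is a minimal sketch using a deliberately simple layout: two codes per byte, even positions in the low nibble. This layout and the names `get_nibble` / `set_nibble` are assumptions for illustration only; the actual FastScan packed format is block-interleaved and differs from this.

```python
def get_nibble(codes, i):
    """Read the i-th 4-bit code from a simple two-codes-per-byte layout."""
    byte = codes[i // 2]
    return byte & 0x0F if i % 2 == 0 else byte >> 4


def set_nibble(codes, i, value):
    """Overwrite the i-th 4-bit code, leaving its neighbour untouched."""
    if not 0 <= value < 16:
        raise ValueError("a 4-bit code must be in [0, 16)")
    j = i // 2
    if i % 2 == 0:
        codes[j] = (codes[j] & 0xF0) | value
    else:
        codes[j] = (codes[j] & 0x0F) | (value << 4)


codes = bytearray(4)       # room for 8 codes, all zero
set_nibble(codes, 3, 0xA)  # high nibble of byte 1
set_nibble(codes, 2, 0x5)  # low nibble of byte 1
```

Being able to set one element without rebuilding the whole block is what makes operations such as remove and merge practical on packed codes.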
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2497
Test Plan:
cd build && make -j
make test
cd faiss/python && python setup.py build && cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
Reviewed By: mdouze
Differential Revision: D39927403
Pulled By: mdouze
fbshipit-source-id: 45271b98419203dfb1cea4f4e7eaf0662523a5b5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2519
From the beginning, the CPU IVF index design allowed rather arbitrary index instances to be used as the coarse (level 1) quantizer. On the GPU, IVF indices automatically constructed a FlatIndex (internal object that a GpuIndexFlat wraps) to be the coarse quantizer, and did not allow substituting anything else.
This diff allows for the GPU to function like the CPU IndexIVF classes, namely that there is a `quantizer` instance that can be arbitrarily substituted assuming that the coarse quantizer has the same number of vectors as the IVF nlist. Also, we now support any CPU or GPU index as the coarse quantizer for a GpuIndexIVF.
We detect internally if the IVF quantizer instance is a GPU index instance, in which case we can avoid d2h/h2d data copies as needed and pass device data directly to the plugged GPU coarse quantizer. If the plugged coarse quantizer is a CPU instance, then proper d2h/h2d copies for data are inserted as needed.
As some GPU IVF indices operate on the residual with respect to the coarse quantizer, it is necessary that the IVF index has access to the coarse centroids, even if the index is a CPU index. When a CPU index is used as a coarse quantizer, a reconstruction and copy of all of the coarse centroids to the GPU is performed. If the user changes the quantizer instance, or otherwise modifies the quantizer, in order for the GPU IVF index to recognize this change, a function `GpuIndexIVF::updateQuantizer()` must be called to update this cached state. If the coarse quantizer instance is a `GpuIndexFlat` then no separate cached copy is made as we can have direct access to the `FlatIndex` centroid storage.
Additionally, the `IndexIVF::search_preassigned` interface has been added to all GPU IVF instances via `GpuIndexIVF`. Conversion as needed from CPU arrays to GPU is done based on the address space of the passed inputs.
Other additional changes:
- Removed the `storeTransposed` functionality of `GpuIndexFlat`, as specified by `GpuIndexFlatConfig::storeTransposed`. This feature was added back in 2016 to potentially accelerate coarse quantizer lookups by avoiding a transposition during matrix multiplication. It was not much used, it was an internal implementation detail, and supporting it with the new pluggable functionality wasn't worth it. The transposition functionality was therefore removed from the code, but the parameter in `GpuIndexFlatConfig` remains.
- This change also required updating index handling code to be `Index::idx_t` (64 bit) based instead of 32 bit in many instances, as any CPU and non-flat index GPU instances will be reporting IVF cells via `Index::idx_t`. 32 bit indices were used in much of the original Faiss due to the poor performance of 64 bit integers versus 32 bit integers.
- Refactored and deleted some redundant code between the `GpuIndexIVF` subclasses (`GpuIndexIVFFlat`, `GpuIndexIVFPQ`, `GpuIndexIVFScalarQuantizer`) for `search` and `search_preassigned`. This is now done by adding an interface to the IVFBase class which contains GPU-specific state (and is a CUDA file/header so is hidden behind an opaque pointer) and virtual functions to provide dispatch to the IVFFlat (which also implements IVFSQ) or IVFPQ classes in gpu/impl.
- Some of the `GpuIndexIVF` subclasses didn't have a default metric parameter (`METRIC_L2`), unlike the CPU versions. Added this default parameter to the header.
- Updated the check for the passed-in coarse quantizer in `GpuIndexIVF` to more closely correspond to the CPU version. Previously it was throwing an error if the coarse quantizer had ntotal not equal to nlist.
- Moved code that sets the proper GPU device (`DeviceScope` etc) to the very top of the functions that need it. It is critical that this is called, and nice to be able to visually verify that it is being set in these functions. Some functions buried it deep within (though not after the code that actually needed that scope to be set).
Reviewed By: mdouze
Differential Revision: D37777979
fbshipit-source-id: 517f611c2afdae87e79258bf1b3a92be406ade86
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2509
Adds support for:
- IDSelector for Flat and SQ
- search_type in SearchParametersPQ
- IDSelectors implemented in Python (slow but good for testing)
Start optimization of IDSelectorRange and IDSelectorArray for IndexFlat and IDSelectorRange for IndexIVF
Reviewed By: alexanderguzhva
Differential Revision: D40037795
fbshipit-source-id: 61e01acb43c6aa39fea2c3b67a8bba9072383b74
Summary: Code would crash when deallocating the coarse quantizer for an IVFSpectralHash.
Reviewed By: algoriddle
Differential Revision: D40053030
fbshipit-source-id: 6a2987a6983f0e5fc5c5b6296d9000354176af83
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2508
The Faiss Python module was in a monolithic `__init__.py`.
This diff splits it into several sub-modules.
The tricky part is making the inter-dependencies work.
Reviewed By: alexanderguzhva
Differential Revision: D39969794
fbshipit-source-id: 6e7f896a4b35a7c1a0a1f3a986daa32a00bfae6b
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2505
The SearchParameters made the SWIG wrapper too long. This diff attempts to work around that.
Reviewed By: alexanderguzhva
Differential Revision: D39998713
fbshipit-source-id: 6938b5ca1c64bdc748899407909f7e59f62c0de3
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2483
This diff changes the following:
1. all search functions now take a `SearchParameters` argument that overrides the internal search parameters
2. the default implementation for most classes throws when the params argument is non-nullptr / non-None
3. the IndexIVF and IndexHNSW classes have functioning SearchParameters
4. the SearchParameters includes an IDSelector that can search only in a subset of the index based on a defined subset of ids
There is also some refactoring: the IDSelector was moved to its own .h/.cpp and python/__init__.py is split into parts.
The diff is quite bulky because the search function prototypes need to be changed in all index classes.
Things to fix in subsequent diffs:
- support SearchParameters for more index types (Flat variants)
- better sub-object ownership for SearchParams (with std::unique_ptr?)
- special handling of IDSelectorRange to make it faster
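The idea in point 4 above, a selector that restricts the search to a subset of ids, can be sketched in plain Python. `RangeSelector` and `search` below are hypothetical stand-ins written for illustration (brute-force search, half-open id range assumed); they are not the Faiss classes.

```python
import heapq


class RangeSelector:
    """Hypothetical id selector: accepts ids with imin <= id < imax."""

    def __init__(self, imin, imax):
        self.imin, self.imax = imin, imax

    def is_member(self, idx):
        return self.imin <= idx < self.imax


def search(xb, ids, xq, k, sel=None):
    """Brute-force k-NN for one query, skipping ids rejected by sel."""
    candidates = (
        (sum((a - b) ** 2 for a, b in zip(x, xq)), i)
        for x, i in zip(xb, ids)
        if sel is None or sel.is_member(i)
    )
    return heapq.nsmallest(k, candidates)  # (squared distance, id) pairs


xb = [[0.0], [1.0], [2.0], [3.0]]
ids = [10, 11, 12, 13]
unfiltered = search(xb, ids, [0.1], k=2)
filtered = search(xb, ids, [0.1], k=2, sel=RangeSelector(12, 14))
```

Passing the selector per call, rather than storing it in the index, mirrors the point that search parameters override the internal state for one search only.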
Reviewed By: alexanderguzhva
Differential Revision: D39852589
fbshipit-source-id: 4988bdb5b9bee1207cd327d3f80bf5e0e2467fe1
Summary:
Fixes OSX CI by pinning pytorch version for interop tests. The "real" fix is already landed in pytorch but has not been released yet.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2482
Reviewed By: alexanderguzhva
Differential Revision: D39891113
Pulled By: beauby
fbshipit-source-id: fa79bf9de1c93e056260ea64613e37625edfecc3
Summary:
CentOS 8 being EOL, some modifications to the OS packages repositories
are needed in order to keep the build working.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2495
Reviewed By: alexanderguzhva
Differential Revision: D39857306
Pulled By: beauby
fbshipit-source-id: 175f436132c589a9a22d93727bb58e556a43a8f6
Summary:
It seems that [`xcode:12.4.0` has been retired from CircleCI](https://circleci.com/docs/en/using-macos#supported-xcode-versions), so this PR updates the image version and fixes stopping the pipeline at the spin-up stage.
This PR changes only `.circleci/config.yml` , and doesn't affect the software behavior.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2442
Reviewed By: beauby
Differential Revision: D39259341
Pulled By: mdouze
fbshipit-source-id: 8c7b0f8eb6f6f951329b4e2a2964672d0ee75ceb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2489
Currently, Faiss does not save the precomputed_table for IVFPQ when saving an index, and reconstructs one on the fly upon loading. The introduced flag allows skipping the reconstruction during faiss.read_index().
Reviewed By: mdouze
Differential Revision: D39747978
fbshipit-source-id: 47d3cfc59e791e2e0b986301764ccc5b220292f4
Summary:
The residual coarse quantizer could OOM because of the norms table that is of size ntotal.
This diff just re-uses a field that fixes a maximum amount of memory in the additive quantizers, and throws if the norms table would grow beyond that.
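The guard amounts to a size check before allocation. A minimal sketch, assuming one float32 norm per stored vector; the function name, the parameter names, and the exception type are hypothetical and are not the Faiss field or error.

```python
def check_norms_table_fits(ntotal, max_mem_bytes, bytes_per_norm=4):
    """Return the table size in bytes, or raise if it exceeds the cap.

    Assumes one norm of bytes_per_norm bytes per stored vector.
    """
    needed = ntotal * bytes_per_norm
    if needed > max_mem_bytes:
        raise RuntimeError(
            f"norms table needs {needed} bytes, limit is {max_mem_bytes}"
        )
    return needed


table_bytes = check_norms_table_fits(ntotal=1000, max_mem_bytes=5000)
```

Failing early with an explicit error is easier to diagnose than an out-of-memory kill later on.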
Reviewed By: alexanderguzhva
Differential Revision: D39771448
fbshipit-source-id: b6a071900e02a81848495e39691405b30f56e291
Summary:
Supports merge for all IndexFlatCodes children.
Makes merge_from and check_compatible_for_merge methods of Index, so that IndexIVF and IndexFlatCodes (the only supported types) inherit them from Index.
This is part 1 of 2, as merge_into is not updated yet.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2488
Test Plan:
cd build
make -j
make test
cd faiss/python && python setup.py build
cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
# fbcode
buck test //faiss/tests/:test_index_merge
buck test //faiss/tests/:test_io
Reviewed By: mdouze
Differential Revision: D39726378
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 6739477fddcad3c7a990f3aae9be07c1b2b74fef
Summary:
Makes index_factory support IDMap2, not only IDMap, and adds the required tests.
Adding IDMap2 to index_factory lets users take advantage of the features it offers beyond IDMap, such as reconstruction.
Solves [issue 1864](https://github.com/facebookresearch/faiss/issues/1864)
Also fixes the IDMap / IDMap2 order in downcast_index.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2478
Test Plan:
cd build
make -j
cd faiss/python && python setup.py build
cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
Reviewed By: mdouze
Differential Revision: D39660813
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 4881d325bb3b0eaf9637a544511d18c2084453eb
Summary:
Prevents a resource leak in OnDiskInvertedLists::do_mmap, where the file was not closed when mmap failed.
Solves [issue 2427](https://github.com/facebookresearch/faiss/issues/2427)
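The pattern behind this fix, releasing the file on every exit path including a failed mmap, can be sketched in Python as an analogy. This is not the C++ code from `OnDiskInvertedLists`; `map_readonly` is a hypothetical helper.

```python
import mmap
import os


def map_readonly(path):
    """Map a file read-only, closing the descriptor on every exit path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Raises if the mapping cannot be created (e.g. an empty file).
        # On success the mapping stays valid after fd is closed.
        return mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        # Runs on success and on failure, so the descriptor never leaks.
        os.close(fd)
```

The caller is still responsible for closing the returned mapping; only the descriptor is handled here.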
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2471
Test Plan:
cd build
make -j
make test
Reviewed By: mdouze
Differential Revision: D39539304
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 07985c5a4fc67facb67221d317c0978a4278464d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2458
Add overloads for ::accum() for the case of each code sharing coarse quantizer centroids table and fine quantizer centroids table
Reviewed By: mdouze
Differential Revision: D39314206
fbshipit-source-id: 170a0a1c434e00c95c98151e026d1e30ac017149
Summary:
According to the `InvertedLists` API conventions, pointers returned by `get_ids` must be released with `release_ids`; `get_single_id` violated this. Since all subclasses of `InvertedLists` that override `release_ids` also override `get_single_id`, the change has no runtime impact on existing code. However, anyone who implements their own `InvertedLists` subclass and chooses not to override `get_single_id` will now avoid a potential memory leak.
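The get/release pairing can be sketched with a small Python class that counts outstanding `get_ids` results. `InvertedListsSketch` is a hypothetical illustration of the convention, not the C++ `InvertedLists` class.

```python
class InvertedListsSketch:
    """Toy container that tracks unreleased get_ids() results."""

    def __init__(self, lists):
        self._lists = lists
        self.outstanding = 0  # get_ids() calls not yet matched by release_ids()

    def get_ids(self, list_no):
        self.outstanding += 1
        return self._lists[list_no]

    def release_ids(self, list_no, ids):
        self.outstanding -= 1

    def get_single_id(self, list_no, offset):
        # Default implementation: pair get_ids with release_ids so that a
        # subclass overriding only those two methods does not leak.
        ids = self.get_ids(list_no)
        try:
            return ids[offset]
        finally:
            self.release_ids(list_no, ids)


invlists = InvertedListsSketch({0: [7, 8, 9]})
single = invlists.get_single_id(0, 1)
```

After the call the counter is back to zero, which is the invariant the convention is meant to guarantee.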
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2412
Reviewed By: alexanderguzhva
Differential Revision: D39167152
Pulled By: mdouze
fbshipit-source-id: d2daef801a4c375d5e2c80ea1fdf259bf31e4b3d
Summary:
When using IndexFastScan and IndexIVFFastScan to build a database with verbose=True, the progress output is garbled because of a missing \n.
For example:
```python
import faiss
import numpy as np
d = 64
M = 32
nbits = 4
db = np.random.rand(300000, d).astype(np.float32)
index = faiss.IndexPQFastScan(d, M, nbits)
index.train(db)
index.verbose = True
index.add(db)
```
outputs:
```
IndexFastScan::add 65536/300000IndexFastScan::add 131072/300000IndexFastScan::add 196608/300000IndexFastScan::add 262144/300000IndexFastScan::add 300000/300000%
```
This pull request fixes the problem.
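The effect of the missing newline can be reproduced without Faiss. The sketch below imitates the progress messages; the block size of 65536 is inferred from the output shown above, and `add_progress` is a hypothetical helper.

```python
import io


def add_progress(ntotal, block_size, end, out):
    """Write one progress message per block, terminated by `end`."""
    for start in range(0, ntotal, block_size):
        done = min(start + block_size, ntotal)
        out.write(f"IndexFastScan::add {done}/{ntotal}{end}")


garbled, fixed = io.StringIO(), io.StringIO()
add_progress(300000, 65536, "", garbled)  # messages run together
add_progress(300000, 65536, "\n", fixed)  # one message per line
```

With the terminator in place, `fixed.getvalue()` contains one progress message per line instead of a single concatenated string.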
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2376
Reviewed By: alexanderguzhva
Differential Revision: D39166999
Pulled By: mdouze
fbshipit-source-id: 968805ca054d1841d94f47395408cf70c8736444
Summary:
I couldn't run the client-server implementation because of the logging. Indeed `LOG.info('Connected by', addr, end=' ')` raised an exception (`end` is not recognised as a valid argument).
Other warnings were also showing up. This PR cleans things up a bit and fixes the client-server.
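The failure is easy to reproduce with the standard `logging` module, whose methods take a format string plus arguments rather than `print`-style keywords. The logger name, the handler class, and the address below are made up for the demonstration.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


LOG = logging.getLogger("client_server_demo")
LOG.setLevel(logging.INFO)
LOG.propagate = False
handler = ListHandler()
LOG.addHandler(handler)

addr = "127.0.0.1"

try:
    # print-style call: `end` is not a keyword that logging accepts
    LOG.info("Connected by", addr, end=" ")
    raised = False
except TypeError:
    raised = True

# logging-style call: format string with lazily applied arguments
LOG.info("Connected by %s", addr)
```

Note that the exception is only raised when the INFO level is enabled; with the level left at the default WARNING, the faulty call returns before the keyword is inspected.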
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2433
Reviewed By: alexanderguzhva
Differential Revision: D39167576
Pulled By: mdouze
fbshipit-source-id: 6f74d582f14e353e04029e6465bd6e488a865289
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2445
Add overloads for ::accum() to process 3 vectors per call. It is faster than processing 2 vectors per call in certain cases, at least for the AVX2 code.
Reviewed By: mdouze
Differential Revision: D39176425
fbshipit-source-id: bb39bb1f7a77442d32f20cb29281ec2e2ed2600c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2444
Add IndexPQDecoder. The following codecs are supported:
* PQ[1]x8
Additionally, AVX2 and ARM versions support the following codecs:
* PQ[1]x10
* PQ[1]x16
Reviewed By: mdouze
Differential Revision: D39176423
fbshipit-source-id: b002b3d3b0533849f72f3660e8088d8dc44a66d6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2443
Add support for
* IVF[9-16 bit],PQ[1]x8 (such as IVF1024,PQ16np)
* Residual1x[9-16 bit],PQ[1]x8 (such as Residual1x9,PQ8)
Additionally, AVX2 and ARM versions support
* Residual[1]x8,PQ[2]x10
* Residual[1]x8,PQ[2]x16
* Residual1x[9-16 bit],PQ[1]x10 (such as Residual1x9,PQ16x10)
* Residual1x[9-16 bit],PQ[1]x16 (such as Residual1x9,PQ16x16)
* Residual[1]x10,PQ[2]x10
* Residual[1]x10,PQ[2]x16
* Residual[1]x16,PQ[2]x10
* Residual[1]x16,PQ[2]x16
IVF[9-16 bit],PQ[1]x10 and IVF[9-16 bit],PQ[1]x16 (such as IVF1024,PQ16x10np) are supported as well, but Faiss does not allow training such indices at this time.
Reviewed By: mdouze
Differential Revision: D39176424
fbshipit-source-id: 29b3d8d27a5fed0185df3e5484003fcc1521083a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2439
Index wrapper that performs rowwise normalization to [0,1], preserving the coefficients. This is a vector codec index only.
Basically, this index performs a rowwise scaling to [0,1] of every row in an input dataset before calling subindex::train() and subindex::sa_encode(). The sa_encode() call stores the scaling coefficients (scaler and minv) at the very beginning of every output code. The format:
[scaler][minv][subindex::sa_encode() output]
The de-scaling in sa_decode() is done using:
output_rescaled = scaler * output + minv
An additional ::train_inplace() function scales the input in place before calling subindex::train(). This avoids cloning the input dataset, at the cost of modifying it: the data is scaled and then scaled back.
Derived classes provide different data types for scaling coefficients. Currently, versions with fp16 and fp32 scaling coefficients are available.
* fp16 version adds 4 extra bytes per encoded vector
* fp32 version adds 8 extra bytes per encoded vector
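The scaling and de-scaling described above can be sketched in a few lines of plain Python. It assumes `scaler = max - min` and `minv = min` per row, which is consistent with the de-scaling formula given, and it picks an arbitrary convention for constant rows; the function names are hypothetical.

```python
def encode_row(row):
    """Scale one row to [0, 1]; returns (scaler, minv, normalized row)."""
    minv = min(row)
    scaler = max(row) - minv
    if scaler == 0:
        # Constant row: assumed convention, every component maps to 0
        normalized = [0.0 for _ in row]
    else:
        normalized = [(x - minv) / scaler for x in row]
    return scaler, minv, normalized


def decode_row(scaler, minv, output):
    """De-scale: output_rescaled = scaler * output + minv."""
    return [scaler * y + minv for y in output]


def header_bytes(bytes_per_coefficient):
    """Extra bytes per code: two coefficients (scaler and minv)."""
    return 2 * bytes_per_coefficient


scaler, minv, normalized = encode_row([1.0, 3.0, 5.0])
restored = decode_row(scaler, minv, normalized)
```

The two stored coefficients account for the overhead listed above: `header_bytes(2)` gives 4 bytes for fp16 and `header_bytes(4)` gives 8 bytes for fp32.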
Reviewed By: mdouze
Differential Revision: D38581012
fbshipit-source-id: d739878f1db62ac5ab9e0db3f84aeb2b70a1b6c0
Summary:
According to [`CMakeLists.txt`](442d9f4a2d/CMakeLists.txt (L20)), current `faiss` doesn't recognize `sse4` as `FAISS_OPT_LEVEL` .
I've read the `CMakeLists.txt` files and confirmed this (current `faiss` treats `sse4` the same as `generic`), so this PR removes the description of this outdated option from `INSTALL.md`.
This PR contains only a documentation update, so it doesn't affect the software behavior.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2391
Reviewed By: alexanderguzhva
Differential Revision: D39167022
Pulled By: mdouze
fbshipit-source-id: ff36fc5167c4d2e8d16206061624a8ba2890b4b7
Summary:
This line should search for xq, but it currently searches for the first 5 rows of xb. This looks like a bug.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2419
Reviewed By: alexanderguzhva
Differential Revision: D39167612
Pulled By: mdouze
fbshipit-source-id: fc2534fa799dcbedae1af7881f5d70026b4de675
Summary:
For search requests with few queries or a single query, this PR adds the ability to run threads over both the queries and the different clusters of the IVF. For applications where latency is important, this can **dramatically reduce latency for single query requests**.
A new implementation (#14) is added. It could be merged into implementation 12, but for simplicity in this PR I created a separate function.
Tests are added to cover the new implementation and new tests are added to specifically cover the case when a single query is used.
In my benchmarks a very good reduction of latency is observed for single query requests.
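The parallelization idea, one task per (query, probed cluster) pair rather than one task per query, can be sketched with a thread pool. Everything below (exact L2 scan, data layout, function names) is an assumption made for illustration and is unrelated to the actual FastScan kernels.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_list(query, inv_list):
    """Return (squared distance, id) for every vector in one inverted list."""
    return [
        (sum((a - b) ** 2 for a, b in zip(query, vec)), vid)
        for vid, vec in inv_list
    ]


def search(queries, inv_lists, probes, k, n_threads=4):
    """k-NN where the work is split over (query, probed list) pairs."""
    tasks = [(qi, li) for qi in range(len(queries)) for li in probes[qi]]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partial = list(
            pool.map(lambda t: scan_list(queries[t[0]], inv_lists[t[1]]), tasks)
        )
    # Merge the per-list results of each query and keep its k best hits
    merged = [[] for _ in queries]
    for (qi, _), hits in zip(tasks, partial):
        merged[qi].extend(hits)
    return [heapq.nsmallest(k, hits) for hits in merged]


# A single query probing two lists still yields two independent tasks
queries = [[0.0]]
inv_lists = [[(0, [5.0]), (1, [0.5])], [(2, [0.2]), (3, [9.0])]]
results = search(queries, inv_lists, probes=[[0, 1]], k=2)
```

With one task per query, a single-query request would keep only one thread busy; splitting by probed list is what allows the remaining threads to contribute.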
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2380
Test Plan:
```
buck test //faiss/tests/:test_fast_scan_ivf -- implem14
buck test //faiss/tests/:test_fast_scan_ivf -- implem15
```
Reviewed By: alexanderguzhva
Differential Revision: D38074577
Pulled By: mdouze
fbshipit-source-id: e7a20b6ea2f9216e0a045764b5d7b7f550ea89fe