Summary:
Since CentOS 8 is EOL, the OS package repositories need some modifications
in order to keep the build working.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2495
Reviewed By: alexanderguzhva
Differential Revision: D39857306
Pulled By: beauby
fbshipit-source-id: 175f436132c589a9a22d93727bb58e556a43a8f6
Summary:
It seems that [`xcode:12.4.0` has been retired from CircleCI](https://circleci.com/docs/en/using-macos#supported-xcode-versions), so this PR updates the image version, which fixes the pipeline stopping at the spin-up stage.
This PR changes only `.circleci/config.yml` and doesn't affect the software behavior.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2442
Reviewed By: beauby
Differential Revision: D39259341
Pulled By: mdouze
fbshipit-source-id: 8c7b0f8eb6f6f951329b4e2a2964672d0ee75ceb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2489
Currently, Faiss does not save the precomputed_table for IVFPQ when saving an index, and reconstructs one on the fly upon loading. The introduced flag allows skipping the reconstruction during faiss.read_index().
Reviewed By: mdouze
Differential Revision: D39747978
fbshipit-source-id: 47d3cfc59e791e2e0b986301764ccc5b220292f4
Summary:
The residual coarse quantizer could OOM because of the norms table that is of size ntotal.
This diff just reuses a field that caps the amount of memory in the additive quantizers and throws if the norms table would grow beyond that cap.
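A minimal sketch of this kind of guard, in plain Python rather than Faiss code. The names `check_norm_table_fits` and `max_mem_bytes`, and the assumption of one 4-byte float per stored vector, are illustrative and not taken from the diff.

```python
def check_norm_table_fits(ntotal, max_mem_bytes, bytes_per_norm=4):
    """Raise if a norms table with one entry per vector would exceed the cap."""
    # Hypothetical sizing: one norm of `bytes_per_norm` bytes per stored vector.
    needed = ntotal * bytes_per_norm
    if needed > max_mem_bytes:
        raise MemoryError(
            f"norms table needs {needed} bytes, cap is {max_mem_bytes} bytes"
        )
    return needed
```

Failing early with an explicit error is easier to diagnose than an out-of-memory kill once the table has already grown with ntotal.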
Reviewed By: alexanderguzhva
Differential Revision: D39771448
fbshipit-source-id: b6a071900e02a81848495e39691405b30f56e291
Summary:
Support merge for all IndexFlatCodes children.
Make merge_from and check_compatible_for_merge methods of Index, so that IndexIVF and IndexFlatCodes (the only supported types) inherit them from Index.
This is part 1 of 2, as merge_into is not updated yet.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2488
Test Plan:
cd build
make -j
make test
cd faiss/python && python setup.py build
cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
# fbcode
buck test //faiss/tests/:test_index_merge
buck test //faiss/tests/:test_io
Reviewed By: mdouze
Differential Revision: D39726378
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 6739477fddcad3c7a990f3aae9be07c1b2b74fef
Summary:
Makes index_factory support IDMap2 in addition to IDMap, and adds the required tests.
Adding IDMap2 to index_factory lets users take advantage of the features it offers beyond IDMap, such as reconstructing the indexed vectors.
Solves [issue 1864](https://github.com/facebookresearch/faiss/issues/1864).
Also fixes the IDMap / IDMap2 order in downcast_index.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2478
Test Plan:
cd build
make -j
cd faiss/python && python setup.py build
cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
Reviewed By: mdouze
Differential Revision: D39660813
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 4881d325bb3b0eaf9637a544511d18c2084453eb
Summary:
Prevents a resource leak in OnDiskInvertedLists::do_mmap, where the file was not closed when mmap failed.
Solves [issue 2427](https://github.com/facebookresearch/faiss/issues/2427).
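The pattern behind this fix can be sketched in Python with the standard `os` and `mmap` modules. This is an illustration of "close the descriptor when mapping fails", not the C++ code of `do_mmap`; the helper name `open_and_map` is hypothetical.

```python
import mmap
import os


def open_and_map(path):
    """Open `path` read-only and map it; never leak the descriptor on failure."""
    fd = os.open(path, os.O_RDONLY)
    try:
        mapping = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    except (ValueError, OSError):
        # Mapping failed (for example, the file is empty): release the
        # descriptor before propagating the error.
        os.close(fd)
        raise
    return fd, mapping
```

On success the caller owns both the descriptor and the mapping and is expected to close them.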
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2471
Test Plan:
cd build
make -j
make test
Reviewed By: mdouze
Differential Revision: D39539304
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 07985c5a4fc67facb67221d317c0978a4278464d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2458
Add overloads for ::accum() for the case of each code sharing coarse quantizer centroids table and fine quantizer centroids table
Reviewed By: mdouze
Differential Revision: D39314206
fbshipit-source-id: 170a0a1c434e00c95c98151e026d1e30ac017149
Summary:
According to `InvertedLists` API conventions, pointers returned from `get_ids` must be released by `release_ids`, which is violated by `get_single_id`. Note that all subclasses of `InvertedLists` which override `release_ids` also override `get_single_id`, so the code change has no actual runtime impact on existing code. However, if someone implements their own `InvertedLists` subclass and chooses not to override `get_single_id`, this code change will help them avoid a potential memory leak.
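The convention can be illustrated with a toy Python class. `InvertedListsSketch` and its `outstanding` counter are hypothetical stand-ins for illustration, not the Faiss C++ class.

```python
class InvertedListsSketch:
    """Toy stand-in for an inverted-lists container (not the Faiss class)."""

    def __init__(self, lists):
        self.lists = lists      # list_no -> list of ids
        self.outstanding = 0    # get_ids() results not yet released

    def get_ids(self, list_no):
        self.outstanding += 1
        return self.lists[list_no]

    def release_ids(self, list_no, ids):
        self.outstanding -= 1

    def get_single_id(self, list_no, offset):
        ids = self.get_ids(list_no)
        try:
            return ids[offset]
        finally:
            # Always pair get_ids() with release_ids(), even on error.
            self.release_ids(list_no, ids)
```

A subclass that allocates in `get_ids` and frees in `release_ids` then stays leak-free without having to override `get_single_id`.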
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2412
Reviewed By: alexanderguzhva
Differential Revision: D39167152
Pulled By: mdouze
fbshipit-source-id: d2daef801a4c375d5e2c80ea1fdf259bf31e4b3d
Summary:
When using IndexFastScan and IndexIVFFastScan to build a database with verbose=True, the progress output is garbled because a trailing \n is missing.
For example:
```python
import faiss
import numpy as np
d = 64
M = 32
nbits = 4
db = np.random.rand(300000, d).astype(np.float32)
index = faiss.IndexPQFastScan(d, M, nbits)
index.train(db)
index.verbose = True
index.add(db)
```
outputs:
```
IndexFastScan::add 65536/300000IndexFastScan::add 131072/300000IndexFastScan::add 196608/300000IndexFastScan::add 262144/300000IndexFastScan::add 300000/300000%
```
This pull request fixes this problem.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2376
Reviewed By: alexanderguzhva
Differential Revision: D39166999
Pulled By: mdouze
fbshipit-source-id: 968805ca054d1841d94f47395408cf70c8736444
Summary:
I couldn't run the client-server implementation because of the logging. Indeed, `LOG.info('Connected by', addr, end=' ')` raised an exception (`end` is not recognised as a valid argument).
Other warnings were also showing up. This PR cleans things up a bit and fixes the client-server.
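For reference, a small sketch of how such a call is usually written with the standard `logging` module. The logger name and the `log_connection` helper are hypothetical and not taken from the PR.

```python
import logging

LOG = logging.getLogger("client_server_sketch")


def log_connection(addr):
    # Logger methods take a format string plus arguments; unlike print(),
    # they do not accept an `end` keyword.
    LOG.info("Connected by %s", addr)
```

Passing the address as an argument also defers string formatting until the record is actually emitted.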
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2433
Reviewed By: alexanderguzhva
Differential Revision: D39167576
Pulled By: mdouze
fbshipit-source-id: 6f74d582f14e353e04029e6465bd6e488a865289
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2445
Add overloads for ::accum() to process 3 vectors per call. It is faster than processing 2 vectors per call in certain cases, at least for the AVX2 code.
Reviewed By: mdouze
Differential Revision: D39176425
fbshipit-source-id: bb39bb1f7a77442d32f20cb29281ec2e2ed2600c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2444
Add IndexPQDecoder. The following codecs are supported:
* PQ[1]x8
Additionally, AVX2 and ARM versions support the following codecs:
* PQ[1]x10
* PQ[1]x16
Reviewed By: mdouze
Differential Revision: D39176423
fbshipit-source-id: b002b3d3b0533849f72f3660e8088d8dc44a66d6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2443
Add support for
* IVF[9-16 bit],PQ[1]x8 (such as IVF1024,PQ16np)
* Residual1x[9-16 bit],PQ[1]x8 (such as Residual1x9,PQ8)
Additionally, AVX2 and ARM versions support
* Residual[1]x8,PQ[2]x10
* Residual[1]x8,PQ[2]x16
* Residual1x[9-16 bit],PQ[1]x10 (such as Residual1x9,PQ16x10)
* Residual1x[9-16 bit],PQ[1]x16 (such as Residual1x9,PQ16x16)
* Residual[1]x10,PQ[2]x10
* Residual[1]x10,PQ[2]x16
* Residual[1]x16,PQ[2]x10
* Residual[1]x16,PQ[2]x16
IVF[9-16 bit],PQ[1]x10 and IVF[9-16 bit],PQ[1]x16 (such as IVF1024,PQ16x10np) are supported as well, but Faiss does not allow training such indices at this time.
Reviewed By: mdouze
Differential Revision: D39176424
fbshipit-source-id: 29b3d8d27a5fed0185df3e5484003fcc1521083a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2439
Index wrapper that performs rowwise normalization to [0,1], preserving the coefficients. This is a vector codec index only.
Basically, this index performs a rowwise scaling to [0,1] of every row in an input dataset before calling subindex::train() and subindex::sa_encode(). sa_encode() call stores the scaling coefficients (scaler and minv) in the very beginning of every output code. The format:
[scaler][minv][subindex::sa_encode() output]
The de-scaling in sa_decode() is done using:
output_rescaled = scaler * output + minv
An additional ::train_inplace() function is provided in order to do an inplace scaling before calling subindex::train() and, thus, avoiding the cloning of the input dataset, but modifying the input dataset because of the scaling and the scaling back.
Derived classes provide different data types for scaling coefficients. Currently, versions with fp16 and fp32 scaling coefficients are available.
* fp16 version adds 4 extra bytes per encoded vector
* fp32 version adds 8 extra bytes per encoded vector
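The scaling round trip and the per-vector overhead can be sketched in plain Python. This is an illustration under assumptions: the sub-index encoding is left out (scaled values stay as floats), the function names are hypothetical, and the handling of constant rows is a guess rather than the behavior of the Faiss class.

```python
import struct

FP32_COEFFS = "<2f"   # two float32 coefficients -> 8 bytes
FP16_COEFFS = "<2e"   # two float16 coefficients -> 4 bytes


def encode_row(row, coeff_fmt=FP32_COEFFS):
    """Scale one row to [0, 1] and return ([scaler][minv] header, scaled row)."""
    minv = min(row)
    scaler = max(row) - minv
    # Assumption: a constant row maps to all zeros to avoid dividing by zero.
    scaled = [(x - minv) / scaler if scaler else 0.0 for x in row]
    return struct.pack(coeff_fmt, scaler, minv), scaled


def decode_row(header, scaled, coeff_fmt=FP32_COEFFS):
    """Undo the scaling: output_rescaled = scaler * output + minv."""
    scaler, minv = struct.unpack(coeff_fmt, header)
    return [scaler * y + minv for y in scaled]
```

The header length matches the overhead listed above: `struct.calcsize("<2e")` is 4 bytes and `struct.calcsize("<2f")` is 8 bytes.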
Reviewed By: mdouze
Differential Revision: D38581012
fbshipit-source-id: d739878f1db62ac5ab9e0db3f84aeb2b70a1b6c0
Summary:
According to [`CMakeLists.txt`](442d9f4a2d/CMakeLists.txt (L20)), current `faiss` doesn't recognize `sse4` as a `FAISS_OPT_LEVEL` value.
I've read the `CMakeLists.txt` files and confirmed that current `faiss` treats `sse4` the same as `generic`, so this PR removes the description of this outdated option from `INSTALL.md`.
This PR only updates documentation, so it doesn't affect the software behavior.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2391
Reviewed By: alexanderguzhva
Differential Revision: D39167022
Pulled By: mdouze
fbshipit-source-id: ff36fc5167c4d2e8d16206061624a8ba2890b4b7
Summary:
This line should be searching for xq, but it currently searches for the first 5 lines of xb. This looks like a bug.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2419
Reviewed By: alexanderguzhva
Differential Revision: D39167612
Pulled By: mdouze
fbshipit-source-id: fc2534fa799dcbedae1af7881f5d70026b4de675
Summary:
For search requests with few queries or a single query, this PR adds the ability to run threads over both queries and the different clusters of the IVF. For applications where latency is important this can **dramatically reduce latency for single query requests**.
A new implementation (14) is added. It could be merged into implementation 12, but for simplicity I created a separate function in this PR.
Tests are added to cover the new implementation and new tests are added to specifically cover the case when a single query is used.
In my benchmarks a very good reduction of latency is observed for single query requests.
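The idea of splitting one query across clusters can be sketched with Python's standard thread pool. This only shows the structure (scan each probed list in its own task, then merge the partial top-k results); it is not the Faiss implementation, the function names are hypothetical, and pure-Python threads will not show the speedup that the OpenMP code gets.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def scan_list(query, entries, k):
    """Return the k nearest (squared distance, id) pairs within one inverted list."""
    scored = (
        (sum((q - x) ** 2 for q, x in zip(query, vec)), vid)
        for vid, vec in entries
    )
    return heapq.nsmallest(k, scored)


def search_one_query(query, inverted_lists, k, n_threads=4):
    """Scan every probed list in its own task, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = pool.map(lambda entries: scan_list(query, entries, k),
                            inverted_lists)
        return heapq.nsmallest(k, (hit for part in partials for hit in part))
```

With a single query, parallelizing over queries alone leaves all but one thread idle, which is why distributing the probed lists matters for latency.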
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2380
Test Plan:
```
buck test //faiss/tests/:test_fast_scan_ivf -- implem14
buck test //faiss/tests/:test_fast_scan_ivf -- implem15
```
Reviewed By: alexanderguzhva
Differential Revision: D38074577
Pulled By: mdouze
fbshipit-source-id: e7a20b6ea2f9216e0a045764b5d7b7f550ea89fe
Summary:
- use `vaddvq_f32` instead of `vpaddq_f32` and `vdups_laneq_f32` in `fvec_L2sqr` , `fvec_inner_product` , and `fvec_norm_L2sqr`
- ~~implement `fvec_L1` and `fvec_Linf` for ARM SIMD (NEON)~~
- This caused a performance regression, so I've dropped it.
- implement `fvec_madd` and `fvec_madd_and_argmin` for ARM SIMD (NEON)
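As a reference for what the last two functions compute, here is a scalar Python sketch based on my understanding of their semantics (c[i] = a[i] + bf * b[i], plus the position of the smallest c[i]). The `_ref` names and the return conventions are illustrative; the C++ functions write into an output buffer instead.

```python
def fvec_madd_ref(a, bf, b):
    """Return c with c[i] = a[i] + bf * b[i]."""
    return [ai + bf * bi for ai, bi in zip(a, b)]


def fvec_madd_and_argmin_ref(a, bf, b):
    """Compute the same c and return (c, index of its smallest element)."""
    c = fvec_madd_ref(a, bf, b)
    return c, min(range(len(c)), key=c.__getitem__)
```

A scalar reference like this is handy for checking a SIMD version element by element.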
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2392
Reviewed By: patricklabatut
Differential Revision: D38198174
Pulled By: mdouze
fbshipit-source-id: 3488a0cf2db1ded458b3bf73f4bc9665413e3351
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2405
Move encode_fp16 and decode_fp16 out of impl/ScalarQuantizer.cpp into utils/fp_16.h. This is needed because fp16 functions might be needed elsewhere, not only in SQ code.
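To illustrate what such a pair of helpers does, here is a Python sketch using the half-precision `e` format of the standard `struct` module. It mirrors the names from the diff but is not the C++ code, and it assumes IEEE 754 binary16 with round-to-nearest.

```python
import struct


def encode_fp16(x):
    """Return the IEEE 754 half-precision bit pattern of x as an int."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def decode_fp16(bits):
    """Return the float value of a 16-bit half-precision bit pattern."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]
```

For example, `encode_fp16(1.0)` gives `0x3C00`, and decoding that pattern gives back `1.0`.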
Reviewed By: mdouze
Differential Revision: D38428096
fbshipit-source-id: 73c9f32919b7b450827cc2394d4d083e0fff1aea
Summary:
Work in progress.
This PR is going to implement the following search methods for ProductAdditiveQuantizer, including index factory and I/O:
- [x] IndexProductAdditiveQuantizer
- [x] IndexIVFProductAdditiveQuantizer
- [x] IndexProductAdditiveQuantizerFastScan
- [x] IndexIVFProductAdditiveQuantizerFastScan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2336
Test Plan:
buck test //faiss/tests/:test_fast_scan
buck test //faiss/tests/:test_fast_scan_ivf
buck test //faiss/tests/:test_local_search_quantizer
buck test //faiss/tests/:test_residual_quantizer
Reviewed By: alexanderguzhva
Differential Revision: D37172745
Pulled By: mdouze
fbshipit-source-id: 6ff18bfc462525478c90cd42e21805ab8605bd0f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2396
Remove 1/3 of kernel tests that are likely to be redundant. This should reduce the running time for tests.
Reviewed By: mdouze
Differential Revision: D38086992
fbshipit-source-id: 468d5e9593cb7144986f763d31087b25965d45fa
Summary:
This PR contains below changes:
- Conform to C++11
- [`faiss` is written in C++11](https://github.com/facebookresearch/faiss/blob/main/CONTRIBUTING.md#coding-style), but [`faiss/cppcontrib/SaDecodeKernels-avx2-inl.h`](442d9f4a2d/faiss/cppcontrib/SaDecodeKernels-avx2-inl.h) and [the test](442d9f4a2d/tests/test_cppcontrib_sa_decode.cpp) use some C++17 features. This PR rewrites this code so that it no longer depends on C++17.
- Enable AVX2 on `faiss_test`
- Currently `faiss_test` is compiled without `-mavx2` even if `-DFAISS_OPT_LEVEL=avx2` is set, so **`tests/test_cppcontrib_sa_decode.cpp` hasn't checked `faiss/cppcontrib/SaDecodeKernels-avx2-inl.h` at all**. This PR adds `-mavx2` to `faiss_test` when `-DFAISS_OPT_LEVEL=avx2` is set, so `tests/test_cppcontrib_sa_decode.cpp` now tests `faiss/cppcontrib/SaDecodeKernels-avx2-inl.h` in that case and `faiss/cppcontrib/SaDecodeKernels-inl.h` otherwise.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2388
Reviewed By: mdouze
Differential Revision: D38005738
Pulled By: alexanderguzhva
fbshipit-source-id: b9319c585c6849e1c7a4782770f2d7ce8c0d8660
Summary:
I found and fixed some tiny mistakes.
This PR doesn't change the software behavior.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2384
Reviewed By: beauby
Differential Revision: D37921065
Pulled By: mdouze
fbshipit-source-id: 060a969892e41b29485c5f2f358b5971ce9dfb8d
Summary:
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Various readability fixes focused on `.md` files:
- Grammar
- Fix some incorrect command references to `distributed_kmeans.py`
- Styling the markdown bash code snippet sections so they render correctly
Attempted to put a lot of little things into one PR and commit; let me know if any mods are needed!
Best,
Ryan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2378
Reviewed By: alexanderguzhva
Differential Revision: D37717671
Pulled By: mdouze
fbshipit-source-id: 0039192901d98a083cd992e37f6b692d0572103a
Summary:
Exporting a few more functions to the C API
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2367
Reviewed By: alexanderguzhva
Differential Revision: D37480505
Pulled By: mdouze
fbshipit-source-id: 899baca8795e29b20e16b56ea3c0d13960e1ea37
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2327
Expose buffer sizes for:
* MultiIndexQuantizer::search
* IndexIVFPQ::add_core_o
* Index2Layer::sa_encode
* ProductQuantizer::compute_codes
These constants were introduced to handle the possible out-of-memory problem. Faiss performs certain operations in chunks. Increasing the chunk sizes reduces the OpenMP overhead and speeds up computations in certain cases at the cost of higher memory consumption.
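The chunking pattern described here can be sketched as follows. `process_in_chunks` and its parameters are hypothetical names for illustration and do not correspond to the constants exposed by the diff.

```python
def process_in_chunks(n, chunk_size, process):
    """Call process(start, end) on consecutive [start, end) blocks covering n items."""
    for start in range(0, n, chunk_size):
        process(start, min(start + chunk_size, n))
```

A larger `chunk_size` means fewer calls (and less per-call overhead) but a larger temporary buffer per call, which is the trade-off the exposed sizes let users tune.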
Reviewed By: mdouze
Differential Revision: D36248391
fbshipit-source-id: 17b38f8b7f59748d5ff72c79938e66b1800983a9
Summary:
This diff added ProductAdditiveQuantizer.
A Simple Algo description:
1. Divide the vector space into several orthogonal sub-spaces, just like PQ does.
2. Quantize each sub-space by an independent additive quantizer.
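The first step can be sketched in a few lines of Python. The helper names are hypothetical, the requirement that `d` be a multiple of `nsplits` is an assumption, and the code-size formula assumes that each sub-quantizer stores `Msub` codes of `nbits` bits and nothing else (for example no extra norm bits).

```python
def split_into_subspaces(vec, nsplits):
    """Cut a d-dimensional vector into nsplits contiguous sub-vectors."""
    if len(vec) % nsplits != 0:
        # Assumption: d must be a multiple of nsplits.
        raise ValueError("d must be a multiple of nsplits")
    dsub = len(vec) // nsplits
    return [vec[i * dsub:(i + 1) * dsub] for i in range(nsplits)]


def total_code_bits(nsplits, Msub, nbits):
    """Bits per encoded vector when every sub-quantizer stores Msub codes of nbits."""
    return nsplits * Msub * nbits
```

Under these assumptions, `nsplits = 2`, `Msub = 4` and `nbits = 8` give 64 bits, that is 8 bytes per vector.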
Usage:
Construct a ProductAdditiveQuantizer object:
- `d`: dimensionality of the input vectors
- `nsplits`: number of sub-spaces the vector space is divided into
- `Msub`: `M` of each additive quantizer
- `nbits`: `nbits` of each additive quantizer
```python
d = 128
nsplits = 2
Msub = 4
nbits = 8
plsq = faiss.ProductLocalSearchQuantizer(d, nsplits, Msub, nbits)
prq = faiss.ProductResidualQuantizer(d, nsplits, Msub, nbits)
```
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2286
Test Plan:
```
buck test //faiss/tests/:test_local_search_quantizer -- TestProductLocalSearchQuantizer
buck test //faiss/tests/:test_residual_quantizer -- TestProductResidualQuantizer
```
Reviewed By: alexanderguzhva
Differential Revision: D35907702
Pulled By: mdouze
fbshipit-source-id: 7428a196e6bd323569caa585c57281dd70e547b1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2320
Checks are a bit stricter in platform010, so fix new CI errors here.
The errors corrected fall in 3 classes:
- `&vector[vector.size()]` now fails because `operator []` checks for array bounds even if only the address is manipulated
- `omp schedule(dynamic)` does not run the loop in the correct order.
- several threads calling an omp loop seem to cause errors in the distributed Faiss code
Reviewed By: beauby
Differential Revision: D35895550
fbshipit-source-id: e9dcf5615158610a42870e6a41c77e4db6ebeea0
Summary:
Fixed the include file for the IVFPQ demo in the GPU index. Adds a targets entry for it as well.
Fixes
https://github.com/facebookresearch/faiss/issues/2293
Reviewed By: beauby
Differential Revision: D35775928
fbshipit-source-id: 15ea837e5a67a6d692e980d90195400936dac1e1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2294
There is a weird CI failure on one of the platforms occurring in the PR
https://github.com/facebookresearch/faiss/pull/2291
This diff makes the test a bit more robust, correcting inter_perf to compute the intersection measure. Hopefully this will make the bug go away.
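For readers unfamiliar with the term, an intersection measure between two sets of search results can be sketched like this. It is a generic illustration with a hypothetical function name; the exact definition used by `inter_perf` in the test may differ.

```python
def intersection_measure(ids_ref, ids_new):
    """Fraction of reference result ids also present in the new results, row by row."""
    common = sum(len(set(r) & set(n)) for r, n in zip(ids_ref, ids_new))
    total = sum(len(r) for r in ids_ref)
    return common / total
```

Because it ignores the order of the results within a row, such a measure is less sensitive to ties and small numerical differences than an exact comparison.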
Reviewed By: beauby
Differential Revision: D35558855
fbshipit-source-id: f5a926d9d8ebee975e538c65ac37b15d485798aa