Summary:
related: https://github.com/facebookresearch/faiss/issues/1812
This PR improves the performance of `IndexPQFastScan` and `IndexIVFPQFastScan` on aarch64 devices, e.g., 60x faster on an AWS Arm instance with the SIFT1M dataset.
This PR contains the following changes:
- Add `simdlib_neon.h`
  - `simdlib_neon.h` provides a `simdlib`-compatible API implemented with Arm NEON intrinsics.
  - `simdlib.h` includes `simdlib_neon.h` if `__aarch64__` is defined.
- Move `geteven`, `getodd`, `getlow128`, and `gethigh128` from `distances_simd.cpp` to `simdlib_avx2.h`.
- Port `geteven`, `getodd`, `getlow128`, and `gethigh128` to non-AVX2 environments.
  - These functions were implemented with AVX2 intrinsics, which prevented implementing `compute_PQ_dis_tables_dsub2` for non-AVX2 environments.
  - Now `simdlib_avx2.h`, `simdlib_emulated.h`, and `simdlib_neon.h` all provide these functions.
- Enable `compute_PQ_dis_tables_dsub2` on aarch64
  - With the change above, `compute_PQ_dis_tables_dsub2` no longer depends on the AVX2-only versions of `geteven` and the related functions.
  - `compute_PQ_dis_tables_dsub2` implemented with `simdlib_neon.h` is slightly faster than the current implementation, so it is enabled.
  - In contrast, `compute_PQ_dis_tables_dsub2` implemented with `simdlib_emulated.h` is slower than the current implementation, so it is not enabled in this PR.
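For readers unfamiliar with the four helpers, here is a minimal scalar sketch of what they might compute. It assumes that `geteven` and `getodd` select even- and odd-indexed floats within each 128-bit half of two 8-float vectors (mirroring an in-lane AVX2 shuffle), and that `getlow128` and `gethigh128` return the two halves. The `Vec8`/`Vec4` types and the `*_scalar` names are hypothetical; the exact element order should be checked against `simdlib_emulated.h`.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for the simdlib vector types (not Faiss code).
using Vec8 = std::array<float, 8>;
using Vec4 = std::array<float, 4>;

// Assumed semantics: even-indexed elements of a, then of b, taken
// separately inside each 128-bit half (4 floats) of the inputs.
inline Vec8 geteven_scalar(const Vec8& a, const Vec8& b) {
    return Vec8{{a[0], a[2], b[0], b[2], a[4], a[6], b[4], b[6]}};
}

// Assumed semantics: same as above, but with odd-indexed elements.
inline Vec8 getodd_scalar(const Vec8& a, const Vec8& b) {
    return Vec8{{a[1], a[3], b[1], b[3], a[5], a[7], b[5], b[7]}};
}

// Lower 128 bits: the first four floats.
inline Vec4 getlow128_scalar(const Vec8& a) {
    return Vec4{{a[0], a[1], a[2], a[3]}};
}

// Upper 128 bits: the last four floats.
inline Vec4 gethigh128_scalar(const Vec8& a) {
    return Vec4{{a[4], a[5], a[6], a[7]}};
}
```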
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1815
Reviewed By: beauby
Differential Revision: D27760259
Pulled By: mdouze
fbshipit-source-id: 5df6168ac35ae0174bedf04508dbaf19f11fab3f
Summary:
related: https://github.com/facebookresearch/faiss/issues/1812
This PR improves the performance of the code in `simdlib_emulated.h`.
`IndexPQFastScan` and `IndexIVFPQFastScan` become faster in non-AVX2 environments, e.g., 4x faster on SIFT1M.
This PR contains the following changes:
- Use a `template` parameter instead of `std::function` for the callable argument of `unary_func` and `binary_func`
  - `std::function` hinders some optimizations, such as function inlining.
- Use `const T&` instead of `T` for function arguments of vector class types such as `simd16uint16`
  - Vector classes in `simdlib_emulated.h` store their data in an array member, so copying them is not cheap.
  - Passing by const lvalue reference avoids the copy.
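The two changes above can be illustrated with a minimal sketch. `Emu16u16`, `unary_func_sketch`, `binary_func_sketch`, and `add_sketch` are hypothetical names that only mimic the shape of the emulated vector classes; they are not the actual contents of `simdlib_emulated.h`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical emulated vector: 16 lanes stored in a plain array member.
struct Emu16u16 {
    std::uint16_t u16[16];
};

// The callable type F is a template parameter, so the compiler sees the
// concrete lambda type and can inline the call, which a type-erased
// std::function argument tends to prevent. The input vector is taken by
// const reference so the 16-element array is not copied.
template <typename F>
inline Emu16u16 unary_func_sketch(const Emu16u16& a, F&& f) {
    Emu16u16 c;
    for (int j = 0; j < 16; j++) {
        c.u16[j] = f(a.u16[j]);
    }
    return c;
}

template <typename F>
inline Emu16u16 binary_func_sketch(
        const Emu16u16& a,
        const Emu16u16& b,
        F&& f) {
    Emu16u16 c;
    for (int j = 0; j < 16; j++) {
        c.u16[j] = f(a.u16[j], b.u16[j]);
    }
    return c;
}

// Example: lane-wise addition built on the helper above.
inline Emu16u16 add_sketch(const Emu16u16& a, const Emu16u16& b) {
    return binary_func_sketch(a, b, [](std::uint16_t x, std::uint16_t y) {
        return std::uint16_t(x + y);
    });
}
```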
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1814
Reviewed By: beauby
Differential Revision: D27760072
Pulled By: mdouze
fbshipit-source-id: cbc5a14658d1960b24ce55a395e71c80998742dc
Summary:
This diff includes:
- progressive dimension k-means.
- the ResidualQuantizer object
- GpuProgressiveDimIndexFactory so that it can be trained on GPU
- corresponding tests
- reference Python implementation of the same in scripts/matthijs/LCC_encoding
Reviewed By: wickedfoo
Differential Revision: D27608029
fbshipit-source-id: 9a8cf3310c8439a93641961ca8b042941f0f4249
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1817
There were instantiations of the k-selection templates that operated on float16 data. These are no longer needed because Faiss now processes all data in float32 (input data can still be in float16), so this diff removes them to speed up compilation.
Reviewed By: beauby
Differential Revision: D27742889
fbshipit-source-id: a3cf72a10df15f335d18d1e7709ffe269024121d
Summary:
This diff implements brute-force all-pairwise distances between two different sets of vectors using any of the Faiss supported metrics on the GPU (L2, IP, L1, Lp, Linf, etc).
It is implemented using the same C++ interface as `bfKnn`, except that when `k == -1`, all pairwise distances are returned (no k-selection is performed). At present the entire output must fit on a single GPU; this restriction may be lifted later.
This interface is available in Python via `faiss.pairwise_distance_gpu(res, xq, xb, D, metric)`, with both NumPy and PyTorch support, and returns all of the distances in `D`.
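As a reading aid, here is a CPU reference sketch of what an all-pairs result contains for the L2 case. The function name is hypothetical, and both the row-major `nq x nb` layout and the use of squared L2 distances are assumptions made for illustration; this is not the GPU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference (hypothetical helper): D[i * nb + j] holds the squared L2
// distance between query i and database vector j. The row-major layout and
// the squared form are assumptions.
std::vector<float> pairwise_l2sqr_reference(
        const std::vector<float>& xq,
        const std::vector<float>& xb,
        std::size_t d) {
    assert(d > 0 && xq.size() % d == 0 && xb.size() % d == 0);
    const std::size_t nq = xq.size() / d;
    const std::size_t nb = xb.size() / d;
    std::vector<float> D(nq * nb);
    for (std::size_t i = 0; i < nq; i++) {
        for (std::size_t j = 0; j < nb; j++) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < d; k++) {
                const float diff = xq[i * d + k] - xb[j * d + k];
                acc += diff * diff;
            }
            D[i * nb + j] = acc; // no k-selection: every pair is kept
        }
    }
    return D;
}
```

A reference like this can be handy for checking GPU output on small inputs.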
Also cleaned up CUDA stream usage a little bit in Distance.cu/Distance.cuh in the C++ implementation.
Reviewed By: mdouze
Differential Revision: D27686773
fbshipit-source-id: 8de6a699cda5d7077f0ab583e9ce76e630f0f687
Summary:
After initial positive feedback to the idea in https://github.com/facebookresearch/faiss/issues/1741 from mdouze, here are the patches
I currently have as a basis for discussion.
Matthijs suggests to not bother with the deprecation warnings at all, which is fine for me
as well, though I would normally still advocate to provide users with _some_ advance notice
before removing parts of an interface.
Fixes https://github.com/facebookresearch/faiss/issues/1741
PS. The deprecation warning is only shown once per session (per class)
PPS. I have tested in https://github.com/conda-forge/faiss-split-feedstock/pull/32 that the respective
classes remain available both through `import faiss` and `from faiss import *`.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1742
Reviewed By: mdouze
Differential Revision: D26978886
Pulled By: beauby
fbshipit-source-id: b52e2b5b5b0117af7cd95ef5df3128e9914633ad
Summary:
## Description
This PR added NSG into the index factory. Here are the supported index strings:
1. `NSG{0}` or `NSG{0},Flat`: Create an IndexNSGFlat with `R = {0}`.
2. `IVF{0}_NSG{1},{2}`: Create an IndexIVF using NSG as a coarse quantizer where `ncentroids = {0}`, `R = {1}` and `{2}` is the second level quantizer.
These two types of indexes may be the most useful ones. Other composite indexes could be supported in the future.
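To make the two string shapes concrete, here is a small standalone sketch that recognizes them. `NsgSpec` and `parse_nsg_factory_string` are hypothetical and exist only for illustration; the parsing code in the Faiss index factory is different.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result of parsing a factory string (not a Faiss type).
struct NsgSpec {
    bool ok = false;     // true if the string matched one of the two forms
    bool ivf = false;    // true for the IVF{0}_NSG{1},{2} form
    int ncentroids = 0;  // {0} in the IVF form
    int R = 0;           // NSG parameter R
    std::string second;  // {2} in the IVF form, otherwise empty
};

// Recognizes "NSG{R}", "NSG{R},Flat", and "IVF{n}_NSG{R},{second}".
inline NsgSpec parse_nsg_factory_string(const std::string& s) {
    static const std::regex flat_re("NSG([0-9]{1,9})(,Flat)?");
    static const std::regex ivf_re("IVF([0-9]{1,9})_NSG([0-9]{1,9}),(.+)");
    std::smatch m;
    NsgSpec spec;
    if (std::regex_match(s, m, flat_re)) {
        spec.ok = true;
        spec.R = std::stoi(m[1].str());
    } else if (std::regex_match(s, m, ivf_re)) {
        spec.ok = true;
        spec.ivf = true;
        spec.ncentroids = std::stoi(m[1].str());
        spec.R = std::stoi(m[2].str());
        spec.second = m[3].str();
    }
    return spec;
}
```

In Faiss itself, such a string would be passed to `faiss::index_factory` together with the vector dimension.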
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1758
Test Plan: buck test //faiss/tests/:test_factory
Reviewed By: beauby
Differential Revision: D27189479
Pulled By: mdouze
fbshipit-source-id: b60000f985c490ef2e7bc561b4e209f9f61c3cc8
Summary:
As stated in https://github.com/facebookresearch/faiss/issues/1743, `_swigfaiss.so` will not exist if compiled with `-DFAISS_OPT_LEVEL=avx2`. This diff added a conditional statement before copying it.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1746
Reviewed By: mdouze
Differential Revision: D26978739
Pulled By: beauby
fbshipit-source-id: 34250c29585fca28677849c9734f6a421661f108
Summary:
## Description:
This diff implemented Navigating Spreading-out Graph (NSG) which accepts a KNN graph as input.
Here is the interface of building an NSG graph:
``` c++
void IndexNSG::build(idx_t n, const float *x, idx_t *knn_graph, int GK);
```
where `GK` is the number of neighbors per node and `knn_graph[i * GK + j]` is the j-th neighbor of node i.
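To illustrate the flat `knn_graph` layout, here is a brute-force sketch that fills such an array. The function name is hypothetical, `idx_t` is assumed to be a 64-bit integer, and excluding each node from its own neighbor list is an assumption. A real KNN graph for a large dataset would come from an approximate method instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using idx_t = std::int64_t; // assumption: stands in for Faiss's idx_t

// Brute-force KNN graph: knn_graph[i * GK + j] is the j-th nearest neighbor
// of node i by squared L2 distance, with i itself excluded.
std::vector<idx_t> build_knn_graph_bruteforce(
        idx_t n,
        int d,
        const float* x,
        int GK) {
    assert(GK >= 1 && GK < n);
    std::vector<idx_t> knn_graph(std::size_t(n) * GK);
    std::vector<float> dist(n);
    std::vector<idx_t> order;
    for (idx_t i = 0; i < n; i++) {
        order.clear();
        for (idx_t j = 0; j < n; j++) {
            if (j == i) {
                continue;
            }
            float acc = 0.0f;
            for (int k = 0; k < d; k++) {
                const float diff = x[i * d + k] - x[j * d + k];
                acc += diff * diff;
            }
            dist[j] = acc;
            order.push_back(j);
        }
        // Move the GK closest candidates to the front, sorted by distance.
        std::partial_sort(
                order.begin(),
                order.begin() + GK,
                order.end(),
                [&](idx_t a, idx_t b) { return dist[a] < dist[b]; });
        std::copy(
                order.begin(),
                order.begin() + GK,
                knn_graph.begin() + i * GK);
    }
    return knn_graph;
}
```

The resulting `knn_graph.data()` has the layout that `IndexNSG::build` expects for its `knn_graph` argument, given the same `GK`.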
The `add` method is not implemented yet.
The unit tests can be found in `tests/test_nsg.cpp`.
mdouze beauby Maybe I need some advice on how to design the interface and support python.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1707
Test Plan: buck test //faiss/tests/:test_index -- TestNSG
Reviewed By: beauby
Differential Revision: D26748498
Pulled By: mdouze
fbshipit-source-id: 3280f705fb1b5f9c8cc5efeba63b904c3b832544
Summary:
After existing for close to a year and considering the quality (IMO) of the collaboration since then,
I believe it would be reasonable to also mention the conda-forge packages in `INSTALL.md`.
WDYT mdouze beauby?
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1740
Reviewed By: mdouze
Differential Revision: D26870699
Pulled By: beauby
fbshipit-source-id: 17465cd5c9f138f041d394d61fccd086bcafc3c7
Summary: Polysemous training can OOM because it uses tables of size n^2, where n is 2**nbit of the PQ. This diff throws an exception when the table threatens to become too large. It also reduces the number of threads when doing so makes it possible to fit the computation within max_memory bytes.
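The arithmetic behind this can be sketched as follows. The assumption that each thread holds one n x n table of `double` values is made only for illustration, and both function names are hypothetical; the actual accounting in the polysemous training code may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Assumption: one n x n table of doubles per thread, with n = 2^nbits.
inline std::uint64_t table_bytes(int nbits) {
    assert(nbits >= 0 && nbits < 30); // keeps n * n * 8 within 64 bits
    const std::uint64_t n = std::uint64_t(1) << nbits;
    return n * n * sizeof(double);
}

// Returns how many threads can run concurrently within max_memory bytes.
// Throws if even a single table does not fit.
inline int threads_within_budget(
        int nbits,
        int nthreads,
        std::uint64_t max_memory) {
    assert(nthreads >= 1);
    const std::uint64_t per_thread = table_bytes(nbits);
    if (per_thread > max_memory) {
        throw std::runtime_error("table would exceed max_memory");
    }
    const std::uint64_t fit = max_memory / per_thread;
    return fit < std::uint64_t(nthreads) ? int(fit) : nthreads;
}
```

Under this assumption, `nbits = 8` gives a 512 KiB table per thread, while `nbits = 16` already needs 32 GiB.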
Reviewed By: wickedfoo
Differential Revision: D26856747
fbshipit-source-id: bd98e60293494e2f4b2b6d48eb1200efb1ce683c
Summary:
This adds docstrings for most of the replaced methods.
This will make the doc visible in notebooks.
Reviewed By: wickedfoo
Differential Revision: D26856664
fbshipit-source-id: da05cf8ac8380ee06a94a380d2547991b0c0a3be
Summary:
Apparently, this is now being supplied by CUDA libs. Without this patch, CUDA builds
on 11.1 & 11.2 give the following kind of warnings:
```
Compiling CUDA source file ..\..\faiss\gpu\GpuIndex.cu...
[...]/faiss/impl/platform_macros.h(42): warning : declaration overloads built-in function "__builtin_ctz"
[...]/faiss/impl/platform_macros.h(42): warning : declaration overloads built-in function "__builtin_ctz"
[...]/faiss/impl/platform_macros.h(42): warning : declaration overloads built-in function "__builtin_ctz"
```
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1737
Reviewed By: wickedfoo
Differential Revision: D26855669
Pulled By: mdouze
fbshipit-source-id: 9447ce20d5db76936c2fb8037560ae910f12b87f
Summary:
There's an annoying warning on every test run that I'd like to fix
```
=============================== warnings summary ===============================
tests/test_index_accuracy.py::TestRefine::test_IP
tests/test_index_accuracy.py::TestRefine::test_L2
$SRC_DIR/tests/test_index_accuracy.py:726: DeprecationWarning: Please use assertEqual instead.
self.assertEquals(recall1, recall2)
```
I've tried sneaking this into https://github.com/facebookresearch/faiss/issues/1704 & https://github.com/facebookresearch/faiss/issues/1717 already, but the first needs more time and
in the second, beauby asked me to keep this separate, so here's a new PR. :)
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1738
Reviewed By: wickedfoo
Differential Revision: D26855644
Pulled By: mdouze
fbshipit-source-id: 1198a9d9b3a79dfeb1d69513a61229fb45924f89
Summary: Check for invalid parameters (number of nearest neighbors and, where applicable, number of probes) in the indices and throw when one is found. Unit tests are included.
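A minimal sketch of this kind of check, under stated assumptions: `check_search_params` is a hypothetical helper, the limits are passed in by the caller rather than taken from Faiss, and `std::invalid_argument` stands in for whatever exception type the indices actually throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical validation helper: rejects a number of neighbors k or a
// number of probes nprobe outside [1, max]. The limits are caller-supplied
// placeholders, not values taken from Faiss.
inline void check_search_params(
        int k,
        int nprobe,
        int max_k,
        int max_nprobe) {
    if (k < 1 || k > max_k) {
        throw std::invalid_argument("k out of range: " + std::to_string(k));
    }
    if (nprobe < 1 || nprobe > max_nprobe) {
        throw std::invalid_argument(
                "nprobe out of range: " + std::to_string(nprobe));
    }
}
```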
Reviewed By: wickedfoo
Differential Revision: D26582467
fbshipit-source-id: e345635d2f0f44ddcecc3f3314b2b9113359a787
Summary: Remove the shared mutable random generator; instead, re-instantiate the RNG every time it is needed from a random_seed field. For each occurrence, the random_seed is multiplied by a prime number to generate some diversity.
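The pattern can be sketched as below. `SeededComponent`, its method names, and the specific primes are hypothetical, and `std::mt19937` is used directly here in place of the Faiss random generator class.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical object that stores only a seed instead of a shared,
// mutable generator.
struct SeededComponent {
    std::uint32_t random_seed = 1234;

    // Each call site builds a fresh generator. Multiplying the seed by a
    // site-specific prime (arbitrary values here) gives each site its own
    // stream while keeping every call reproducible.
    std::uint32_t draw_for_init() const {
        std::mt19937 rng(random_seed * 7919u);
        return std::uint32_t(rng());
    }

    std::uint32_t draw_for_split() const {
        std::mt19937 rng(random_seed * 104729u);
        return std::uint32_t(rng());
    }
};
```

Because no generator state is shared, the methods can be `const`, and repeated calls with the same seed return the same values.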
Reviewed By: beauby
Differential Revision: D26726888
fbshipit-source-id: 58ef99f522bc4adb8233b94f9b9ad9b9d0e1df0b
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1726
The diff enables clang-format for CUDA headers and applies it for fbsource
Reviewed By: zertosh
Differential Revision: D26695628
fbshipit-source-id: 30e53bfd6ad8aedd93c1b18076c5bd0a104a893f
Summary: A test was timing out but the culprit was not the functionality being tested but instead a very slow list comprehension. Also relaxed the test very slightly as it failed from time to time.
Reviewed By: wickedfoo
Differential Revision: D26727507
fbshipit-source-id: 5b3352674fbef1f0cb6155452e4a93adc631d6a7
Summary:
As discussed in https://github.com/facebookresearch/faiss/issues/685, I'm going to add an NSG index to faiss. This PR, which adds an NNDescent index, is the first step, as I commented [here](https://github.com/facebookresearch/faiss/issues/685#issuecomment-760608431).
**Changes:**
1. Add an `IndexNNDescent` and an `IndexNNDescentFlat`, which allow users to construct a KNN graph on a million-scale dataset using the CPU and run nearest-neighbor search on it. The implementation lives under `faiss/impl`.
2. Add compilation entries to `CMakeLists.txt` for C++ and `swigfaiss.swig` for Python. `IndexNNDescentFlat` can be called directly by users in C++ and Python.
3. The `VisitedTable` struct in `HNSW.h` is moved into `AuxIndexStructures.h`.
4. Add a demo, `demo_nndescent.cpp`, to demonstrate the effectiveness.
**TODO**
1. Support the index factory.
2. Implement `IndexNNDescentPQ` and `IndexNNDescentSQ`
3. More comments in the code.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1654
Test Plan:
buck test //faiss/tests/:test_index_accuracy -- TestNNDescent
buck test //faiss/tests/:test_build_blocks -- TestNNDescentKNNG
Reviewed By: wickedfoo
Differential Revision: D26309716
Pulled By: mdouze
fbshipit-source-id: 2abade9708d29023f8bccbf77143e8eea14f66c4
Summary:
Adds the preassigned add and search python wrappers to contrib.
Adds the preassigned search for the binary case (was missing before).
Also adds a real test for that functionality.
Reviewed By: beauby
Differential Revision: D26560021
fbshipit-source-id: 330b715a9ed0073cfdadbfbcb1c23b10bed963a5
Summary:
## Description
It is the same as https://github.com/facebookresearch/faiss/pull/1673 but for `IndexBinaryIVF`. Ensure that `nprobe` is no more than `nlist`.
## Changes
1. Replace `nprobe` with `min(nprobe, nlist)`
2. Replace `long` with `idx_t` in `IndexBinaryIVF.cpp`
3. Add a unit test
4. Fix a small bug in https://github.com/facebookresearch/faiss/pull/1673, `index` should be replaced by `gt_index`
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1695
Reviewed By: wickedfoo
Differential Revision: D26603278
Pulled By: mdouze
fbshipit-source-id: a4fb79bdeb975e9d8ec507177596c36da1195646