Summary:
This pull request adds a new argument with a default value, `ngpu=-1`, to the `knn_ground_truth` function in `faiss.contrib`.
## Purpose of Change
### Bug Fix
In the current implementation, running tests under the tests directory (CPU tests) in an environment with faiss-gpu installed would inadvertently use the GPU and cause unintended behavior.
This pull request prevents the GPU from being used during CPU-only tests by explicitly controlling GPU allocation via the ngpu parameter.
### API Consistency
Other functions that call `faiss.get_num_gpus` in `faiss.contrib`, such as `range_search_max_results` and `range_ground_truth`, already include the `ngpu` argument.
Adding this parameter to `knn_ground_truth` will ensure consistency across the API, reduce potential confusion, and improve ease of use.
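A minimal sketch of the `-1` convention, assuming that `ngpu=-1` means "use every available GPU" and that any other value is used as given. `resolve_ngpu` and `num_available` are hypothetical names for illustration; `num_available` stands in for the value returned by `faiss.get_num_gpus`.

```python
def resolve_ngpu(ngpu, num_available):
    """Return how many GPUs to use; -1 means 'use every available GPU'."""
    if ngpu == -1:
        return num_available
    return ngpu


# Hypothetical machine with 2 visible GPUs.
print(resolve_ngpu(-1, 2))  # default behavior: use both GPUs
print(resolve_ngpu(0, 2))   # CPU-only test: never touch the GPU
```

A CPU test can then pass `ngpu=0` explicitly and get the same behavior whether or not faiss-gpu is installed.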
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4123
Reviewed By: asadoughi
Differential Revision: D68199506
Pulled By: junjieqi
fbshipit-source-id: cb50e206d8a1a982c21b0ccb42825ea45873f3ef
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4018
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4014
This diff adds support for bfloat16 vector/query data types with the GPU brute-force k-nearest neighbor function (`bfKnn`).
The change is largely just plumbing the new data type through the template hierarchy (so distances can be computed in bfloat16).
Of note, by design, all final distance results are produced in float32 regardless of input data type (float32, float16, bfloat16). This is because the true nearest neighbors in many data sets can often differ by only ~1000 float32 ULPs in terms of distance, which would result in possible false equivalency. This seems to be one area where lossy compression/quantization throughout does not work as well (and is also why `CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION` is set in `StandardGpuResources.cpp`). However, given that there is native bf16 x bf16 = fp32 tensor core support on Ampere+ architectures, the matrix multiplication itself should use them.
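The "false equivalency" argument can be illustrated with a small sketch that emulates bfloat16 rounding in pure Python. The helper names are hypothetical, the rounding mode (round-to-nearest-even on the upper 16 bits of the float32 pattern) is an assumption about how a conversion would typically behave, and only finite positive values are handled.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float32(b):
    """Return the float32 value whose bit pattern is b."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def to_bfloat16(x):
    """Round x to bfloat16 precision (nearest-even) and return it as a float."""
    b = float32_bits(x)
    rounding_bias = 0x7FFF + ((b >> 16) & 1)
    return bits_to_float32((b + rounding_bias) & 0xFFFF0000)


# Two distances that are 1000 float32 ULPs apart.
d1 = 1.5
d2 = bits_to_float32(float32_bits(d1) + 1000)

print(d1 != d2)                            # distinct in float32
print(to_bfloat16(d1) == to_bfloat16(d2))  # indistinguishable in bfloat16
```

With only 8 bits of significand precision, both distances collapse to the same bfloat16 value, so a ranking based on bfloat16 distances could not separate these two neighbors.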
As bfloat16 support is quite lacking on AMD/ROCm (see [here](https://rocm.docs.amd.com/projects/HIPIFY/en/latest/tables/CUDA_Device_API_supported_by_HIP.html), very few bf16 functions implemented), bf16 functionality is completely disabled / not compiled for AMD ROCm.
Reviewed By: mdouze
Differential Revision: D65459723
fbshipit-source-id: 8a6aec843f7e37c205d95f2485442a26c402a3b0
Summary:
Remove the dependency on `raft::compiled` and modify GPU implementations to use cuVS backend in place of RAFT.
A deeper insight into the dependency:
FAISS gets the ANN algorithm implementations such as IVF-Flat and IVF-PQ from cuVS. RAFT is meant to be a lightweight C++ header-only template library that cuVS relies on for the more fundamental / low-level utilities. Some examples of these are RAFT's device mdarray and mdspan objects; the RAFT resource object (`raft::resource`) that takes care of the stream ordering of device functions; linear algebra functions such as mapping, reduction, BLAS routines etc. A lot of the cuVS functions take the RAFT mdspan objects as arguments (for example `raft::device_matrix_view`). Therefore FAISS relies on both cuVS and RAFT. FAISS gets RAFT headers through cuVS and uses them to create the function arguments that can be consumed by cuVS. Note that we are not explicitly linking FAISS against `raft::raft` or `raft::compiled`. Only the required headers are included and compiled rather than compiling the whole RAFT shared library. This is the reason we still see mentions of `raft` in FAISS.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3549
Reviewed By: ramilbakhshyiev
Differential Revision: D62041013
Pulled By: asadoughi
fbshipit-source-id: 7230dcc06cf47baf95873adc1dec2adca4a8f82a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3916
Adds a missing wrapper to the torch wrappers in Faiss and tests it.
Also factorized a bit of code between search functions.
Reviewed By: algoriddle
Differential Revision: D63974821
fbshipit-source-id: a0415a57a763e2d1896956c503e503615c167860
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3876
Demo script for distributed kmeans. It provides a `DatasetAssign` object and shows how to run it with torch.distributed.
Reviewed By: asadoughi, pankajsingh88
Differential Revision: D63013820
fbshipit-source-id: 22c959f3afdc04fd4aa8b9aeed309ea6290b1328
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3872
The contrib.torch subdirectory is intended to receive modules in python that are useful for similarity search and that apply to CPU or GPU pytorch tensors.
The current version includes CPU clustering on torch tensors. To be added:
* implementation of PQ
Reviewed By: asadoughi
Differential Revision: D62759207
fbshipit-source-id: 87dbaa5083e3f2f4f60526815e22ded4e83e8559
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3873
The previous version required scipy to do the accumulation, which is replaced here with a nifty piece of numpy accumulation.
This removes the need for scipy for non-sparse data.
Reviewed By: junjieqi
Differential Revision: D62884307
fbshipit-source-id: 5443634e487387a2b518fd2a7f9a3d9a40abd4b4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3848
same as title.
Dataset can be referred from blobstore
Reviewed By: satymish
Differential Revision: D62476993
fbshipit-source-id: db2b4088ab6e02278b8b91194bf916fc476b79ec
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3611
Using the new dispatcher functions, add a search function to flat codes. To test it, make IndexLattice a subclass of FlatCodes and check the reconstruction there.
Reviewed By: asadoughi
Differential Revision: D59367989
fbshipit-source-id: 405dab4358fe34b2e38ac8bcc222b19f58643229
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3423
Adding small fixes to run experiments from fbcode.
1. Added buck target
2. Full import path of faiss bench_fw modules
3. new dataset path to run tests locally as we can't use an existing directory ./data in fbcode.
Reviewed By: algoriddle, junjieqi
Differential Revision: D57235092
fbshipit-source-id: f78a23199e619b640a19ca37f8b52ff0abdd8298
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3422
Found vec_io failing when running some benchmarking.
There is no field named big_endian in sys, so this reverts to the original field, byteorder.
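A short check of the point above, using only the standard library: `sys.byteorder` is the documented way to query endianness, while `sys` has no `big_endian` attribute. The variable names are illustrative only.

```python
import sys

# sys.byteorder is the string "little" or "big".
is_big_endian = sys.byteorder == "big"

# There is no sys.big_endian attribute, so accessing it would raise AttributeError.
has_bogus_field = hasattr(sys, "big_endian")

print(sys.byteorder, is_big_endian, has_bogus_field)
```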
Reviewed By: algoriddle
Differential Revision: D56718607
fbshipit-source-id: 553f1d2d6bc967581142a92282e534f3f164e8f9
Summary:
mdouze Please let me know if any additional unit tests are needed
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3444
Reviewed By: algoriddle
Differential Revision: D57665641
Pulled By: mdouze
fbshipit-source-id: 9bec91306a1c31ea4f1f1d726c9d60ac6415fdfc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3361
Fix a few issues in the PR.
Normally all tests should pass on a little-endian machine.
Reviewed By: junjieqi
Differential Revision: D56003181
fbshipit-source-id: 405dec8c71898494f5ddcd2718c35708a1abf9cb
Summary:
This pull request is for issue https://github.com/facebookresearch/faiss/issues/3330. This patch makes sure that packed code arrays are in big endian format. Kindly let us know if we need any changes or if we can have a better approach.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3345
Reviewed By: junjieqi
Differential Revision: D55957630
Pulled By: mdouze
fbshipit-source-id: f728f9563f6b942af9d8899b54662d7ceb811206
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3327
**Context**
1. [Issue 2621](https://github.com/facebookresearch/faiss/issues/2621) discusses an inconsistency between OnDiskInvertedList and InvertedList. OnDiskInvertedList is supposed to handle multiple disk-based index shards. Thus, we should name the function differently when merging invls from index shards.
2. [Issue 2876](https://github.com/facebookresearch/faiss/issues/2876) provides usecase of shifting ids when merging invls from different shards.
**In this diff**,
1. To address #1 above, I renamed the merge_from function to merge_from_multiple without touching merge_from in the base class.
Why? So that it remains possible to merge an invl from one index into an ondiskinvl from another index.
2. To address #2 above, I have added support of shift_ids in merge_from_multiple to shift ids from different shards. This can be used when each shard has same set of ids but different data. This is not recommended if id is already unique across shards.
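The id-shifting idea in item 2 can be sketched in pure Python. This is not the Faiss implementation: `merge_shards` is a hypothetical helper, inverted lists are modeled as dictionaries, and the offset rule (each shard's ids are shifted by the total number of entries in the preceding shards) is an assumption made for illustration.

```python
def merge_shards(shards, shift_ids=False):
    """Merge per-shard inverted lists of the form {list_no: [(id, code), ...]}.

    Assumption: with shift_ids=True, the ids of each shard are offset by the
    total number of entries stored in the shards that came before it.
    """
    merged = {}
    offset = 0
    for shard in shards:
        for list_no, entries in shard.items():
            out = merged.setdefault(list_no, [])
            for vec_id, code in entries:
                out.append((vec_id + offset if shift_ids else vec_id, code))
        if shift_ids:
            offset += sum(len(entries) for entries in shard.values())
    return merged


# Both shards number their vectors from 0, so the raw ids collide.
shard_a = {0: [(0, "a0")], 1: [(1, "a1")]}
shard_b = {0: [(0, "b0"), (1, "b1")]}

merged = merge_shards([shard_a, shard_b], shift_ids=True)
print(merged)
```

Without shifting, ids 0 and 1 would each appear twice in the merged lists, which is why shifting is unnecessary when ids are already unique across shards.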
Reviewed By: mdouze
Differential Revision: D55482518
fbshipit-source-id: 95470c7449160488d2b45b024d134cbc037a2083
Summary: This diff replaces the use of pickle serialization with json to address a security vulnerability. Adding a warning message that this code is for demonstration purposes only.
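A minimal sketch of a JSON round trip of the kind that can replace pickling. The message layout (`method`, `args`) is hypothetical and not taken from the demo code; the point is that `json.loads` only builds basic types (dicts, lists, strings, numbers), whereas unpickling untrusted bytes can execute arbitrary code.

```python
import json

# Hypothetical request message; only JSON-compatible types are used.
request = {"method": "search", "args": {"k": 10, "nprobe": 4}}

# Serialize to bytes for the wire, then decode on the receiving side.
wire_bytes = json.dumps(request).encode("utf-8")
decoded = json.loads(wire_bytes.decode("utf-8"))

print(decoded == request)
```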
Reviewed By: mdouze
Differential Revision: D52777650
fbshipit-source-id: d9d6a00fd341b29ac854adcbf675d2cd303d2f29
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3190
This diff adds more result handlers in order to expose them externally.
This enables range search for HNSW and Fast Scan, and nprobe parameter support for FastScan.
Reviewed By: pemazare
Differential Revision: D52547384
fbshipit-source-id: 271da5ffea6411df3d8e50641abade18bd7b774b
Summary:
1. Support for index construction parameters outside of the factory string (arbitrary depth of quantizers).
2. Refactor that provides an index wrapper which is a prereq for the optimizer, which will generate indices from pre-optimized components (particularly quantizers)
Reviewed By: mdouze
Differential Revision: D51427452
fbshipit-source-id: 014d05dd798d856360f2546963e7cad64c2fcaeb
Summary:
This PR adds a functionality where an IVF index can be searched and the corresponding codes be returned. It also adds a few functions to compress int arrays into a bit-compact representation.
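As an illustration of what a bit-compact representation of an int array means, here is a pure-Python sketch. `pack_ints` and `unpack_ints` are hypothetical helpers, not the functions added by this PR, and the little-endian bit layout is an arbitrary choice for the example.

```python
def pack_ints(values, nbits):
    """Pack non-negative ints (each < 2**nbits) into bytes, nbits per value."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << nbits):
            raise ValueError(f"{v} does not fit in {nbits} bits")
        acc |= v << (i * nbits)
    nbytes = (len(values) * nbits + 7) // 8
    return acc.to_bytes(nbytes, "little")


def unpack_ints(data, n, nbits):
    """Recover n values of nbits each from bytes produced by pack_ints."""
    acc = int.from_bytes(data, "little")
    mask = (1 << nbits) - 1
    return [(acc >> (i * nbits)) & mask for i in range(n)]


ids = [3, 0, 7, 5, 1]
packed = pack_ints(ids, nbits=3)  # 15 bits fit in 2 bytes
print(len(packed), unpack_ints(packed, len(ids), nbits=3))
```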
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143
Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct
buck test //faiss/tests/:test_standalone_codec -- test_arrays
```
Reviewed By: algoriddle
Differential Revision: D51544613
Pulled By: mdouze
fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2984
It is not entirely trivial to access the NSG graph structure from Python (although it is a fixed size N-by-K matrix of vector ids).
This diff adds an inspect_tools function to do that.
Reviewed By: algoriddle
Differential Revision: D48026775
fbshipit-source-id: 94cd7be7f656bcd333d62586531f287ea8e052e5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2916
Overall better support for binary indexes:
- cloning (to CPU and GPU), only for BinaryFlat for now
- fix bug in reconstruct_n
- range_search_max_results
Reviewed By: algoriddle
Differential Revision: D46755778
fbshipit-source-id: 777ad90aff5c54a77f9685ed6512247a922c6ef5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2901
This diff allows each GPU to work independently, a hot centroid (eg. out-of-distribution queries that hit a centroid heavily) will only block the one GPU that is processing it, others will continue to pick up work independently.
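One way to get this behavior is a shared work queue from which each worker pulls its next task as soon as it is free. The sketch below uses threads as stand-ins for GPUs; `run_workers` is a hypothetical helper, and whether the diff uses this exact mechanism is an assumption.

```python
import queue
import threading


def run_workers(tasks, n_workers, process):
    """Let n_workers pull tasks from a shared queue until it is empty."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)
    results = []
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            out = process(worker_id, task)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


# A slow task only occupies the worker that picked it up.
done = run_workers(range(8), n_workers=3, process=lambda wid, task: task * task)
print(sorted(done))
```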
Reviewed By: mdouze
Differential Revision: D46521298
fbshipit-source-id: 171cb06cce8b2d16b7bd744799b105b3cd525be3
Summary: In the IndexIVFIndependentQuantizer, the coarse quantizer is applied on the input vectors, but the encoding is performed on a vector-transformed version of the database elements.
Reviewed By: alexanderguzhva
Differential Revision: D45950970
fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2860
Optimized range search function where the GPU computes by default and falls back on the CPU for queries where there are too many results.
Parallelize the CPU to GPU cloning, it seems to work.
Support range_search_preassigned in Python
Fix long-standing issue with SWIG exposed functions that did not release the GIL (in particular the MapLong2Long).
Adds a MapInt64ToInt64 that is more efficient than MapLong2Long.
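The range search with fallback mentioned at the top of this summary can be sketched as follows. This is a pure-Python model on 1-D points, not the Faiss code: the fast pass is modeled as a k-NN search capped at `k` results, and a query is sent to the exhaustive pass when its k-th neighbor is still inside the radius, since more results may have been cut off. The function name and this criterion are assumptions.

```python
def range_search_with_fallback(queries, database, radius, k):
    """Range search via a capped k-NN pass, with an exhaustive fallback."""
    results = []
    n_fallback = 0
    for q in queries:
        # Fast pass: only the k nearest neighbors are available.
        nearest = sorted((abs(q - x), i) for i, x in enumerate(database))[:k]
        if len(nearest) == k and nearest[-1][0] <= radius:
            # The k-th neighbor is still in range, so the list may be truncated:
            # redo this query exhaustively.
            n_fallback += 1
            hits = [i for i, x in enumerate(database) if abs(q - x) <= radius]
        else:
            hits = [i for d, i in nearest if d <= radius]
        results.append(sorted(hits))
    return results, n_fallback


database = [0.0, 0.1, 0.2, 0.3, 5.0]
results, n_fallback = range_search_with_fallback([0.0, 5.0], database, radius=0.5, k=2)
print(results, n_fallback)
```

Only the first query, which has more than `k` neighbors within the radius, takes the slow path.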
Reviewed By: algoriddle
Differential Revision: D45672301
fbshipit-source-id: 2e77397c40083818584dbafa5427149359a2abfd
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2846
Adds a function to ivf_contrib to sort the inverted lists by size without changing the results. Also moves big_batch_search to its own module.
Reviewed By: algoriddle
Differential Revision: D45565880
fbshipit-source-id: 091a1c1c074f860d6953bf20d04523292fb55e1a
Summary: GIST1M is on the fair cluster but was not added to datasets.py
Reviewed By: alexanderguzhva
Differential Revision: D45276664
fbshipit-source-id: 8db41d61b78983f5d01dedca1790618f80f6bc78
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2781
This is a benchmarking script for keypoint matching with labelled ground-truth.
Reviewed By: alexanderguzhva
Differential Revision: D44036091
fbshipit-source-id: d9d7c089c4d172b66f33dc968c00713a1b79c2d1
Summary: Big batch search can be running for hours so it's useful to have a checkpointing mechanism in case it's run on a best-effort cluster queue.
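A minimal sketch of such a checkpointing loop, assuming the work splits into numbered batches and that results are JSON-serializable. `run_with_checkpoint`, its parameters, and the file layout are hypothetical; `stop_after` only exists to simulate a preempted job.

```python
import json
import os
import tempfile


def run_with_checkpoint(n_batches, checkpoint_path, process, stop_after=None):
    """Process batches 0..n_batches-1, resuming from checkpoint_path if present."""
    state = {"next_batch": 0, "results": []}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    for b in range(state["next_batch"], n_batches):
        if stop_after is not None and b >= stop_after:
            break  # simulate the job being killed by the cluster
        state["results"].append(process(b))
        state["next_batch"] = b + 1
        # Write to a temporary file and rename it, so that a crash never
        # leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["results"]


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "ckpt.json")
    first = run_with_checkpoint(5, path, lambda b: b * 10, stop_after=2)
    resumed = run_with_checkpoint(5, path, lambda b: b * 10)

print(first, resumed)
```

The second call skips the two batches that were already completed and only computes the remaining three.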
Reviewed By: algoriddle
Differential Revision: D44059758
fbshipit-source-id: 5cb5e80800c6d2bf76d9f6cb40736009cd5d4b8e
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2638
This diff is a more streamlined way of searching IVF indexes with precomputed clusters.
This will be used for experiments with hybrid CPU / GPU search.
Reviewed By: algoriddle
Differential Revision: D41301032
fbshipit-source-id: a1d645fd0f2bf806454dfd04971edc0a6200d20d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2631
The pytorch in fbcode complains about `storage()` saying it is deprecated and we need to move to UntypedStorage `_storage()`, while github CI is using an older version of pytorch where `_storage()` doesn't exist.
As it is only a warning not an error in fbcode, revert to the old form, but we'll likely have to change to `_storage()` eventually.
Reviewed By: alexanderguzhva
Differential Revision: D42107029
fbshipit-source-id: 699c15932e6ae48cd1c60ebb7212dcd9b47626f6
Summary:
This diff fixes four separate issues:
- Using the pytorch bridge produces the following deprecation warning. We switch to `_storage()` instead.
```
torch_utils.py:51: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor._storage() instead of tensor.storage()
x.storage().data_ptr() + x.storage_offset() * 4)
```
- The `storage_offset` for certain types was wrong, but this would only affect torch tensors that were a view into a storage that didn't begin at the beginning.
- The `reconstruct_n` numpy pytorch bridge function allowed passing `-1` for `ni`, which indicated that all vectors should be reconstructed. The torch bridge didn't follow this and threw an error:
```
TypeError: torch_replacement_reconstruct_n() missing 2 required positional arguments: 'n0' and 'ni'
```
- Choosing values in the range (1024, 2048] for `k` or `nprobe` was broken in D37777979; this is now fixed again.
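For the `reconstruct_n` item in the list above, the `-1` convention can be sketched with a list-based stand-in. The default values `n0=0` and `ni=-1` are an assumption inferred from the error message, and this function is a hypothetical illustration rather than the torch bridge itself.

```python
def reconstruct_n(vectors, n0=0, ni=-1):
    """Return vectors[n0:n0 + ni]; ni == -1 means 'everything from n0 onward'."""
    ntotal = len(vectors)
    if ni == -1:
        ni = ntotal - n0
    if n0 < 0 or ni < 0 or n0 + ni > ntotal:
        raise IndexError("requested range is outside the stored vectors")
    return vectors[n0:n0 + ni]


stored = ["v0", "v1", "v2", "v3"]
print(reconstruct_n(stored))        # all vectors, no arguments required
print(reconstruct_n(stored, 1, 2))  # explicit range
```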
Reviewed By: alexanderguzhva
Differential Revision: D42041239
fbshipit-source-id: c7d9b4aba63db8ac73e271c8ef34e231002963d9
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2567
Intuitively, it should be easier to handle big-batch searches because all distance computations for a set of queries can be done locally within each inverted list.
This benchmark implements this in pure python (but should be close to optimal in terms of speed), on CPU for IndexIVFFlat, IndexIVFPQ and IndexIVFScalarQuantizer. GPU is also supported.
The results are not systematically better, see https://docs.google.com/document/d/1d3YuV8uN7hut6aOATCOMx8Ut-QEl_oRnJdPgDBRF1QA/edit?usp=sharing
Reviewed By: algoriddle
Differential Revision: D41098338
fbshipit-source-id: 479e471b0d541f242d420f581775d57b708a61b8
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2582
A few more or less cosmetic improvements
* Index::idx_t was in the Index object, which does not make much sense, this diff moves it to faiss::idx_t
* replace multiprocessing.dummy with multiprocessing.pool
* add Alexandr as a core contributor of Faiss in the README ;-)
```
for i in $( find . -name \*.cu -o -name \*.cuh -o -name \*.h -o -name \*.cpp ) ; do
sed -i s/Index::idx_t/idx_t/ $i
done
```
For the fbcode deps:
```
for i in $( fbgs Index::idx_t --exclude fbcode/faiss -l ) ; do
sed -i s/Index::idx_t/idx_t/ $i
done
```
Reviewed By: algoriddle
Differential Revision: D41437507
fbshipit-source-id: 8300f2a3ae97cace6172f3f14a9be3a83999fb89
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2584
The `bfKnn` C++ function and `knn_gpu` Python functions for running brute-force k-NN on the GPU did not have a way to specify the GPU device on which the search should run, as it simply used the current thread-local `cudaGetDevice(...)` setting in the CUDA runtime API.
This is unlike the GPU index classes, which take a device argument in the index config struct. Now, both the C++ and Python interfaces to bfKnn have an optional argument to specify the device.
Default behavior is the current behavior; if the `device` is -1 then the current CUDA thread-local device is used, otherwise we perform the work on the desired device.
Reviewed By: mdouze
Differential Revision: D41448254
fbshipit-source-id: a63c68c12edbe4d725b9fc2a749d5dc935574e12
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2483
This diff changes the following:
1. all search functions now take a `SearchParameters` argument that overrides the internal search parameters
2. the default implementation for most classes throws when the params argument is non-nullptr / non-None
3. the IndexIVF and IndexHNSW classes have functioning SearchParameters
4. the SearchParameters includes an IDSelector that can search only in a subset of the index based on a defined subset of ids
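The effect of item 4 can be illustrated with a brute-force search that skips ids rejected by a selector. This is a pure-Python model, not the Faiss API: `search_subset` and the callable `id_selector` are hypothetical stand-ins for a search call that receives an IDSelector through its search parameters.

```python
def search_subset(query, vectors, k, id_selector=None):
    """Brute-force k-NN over (id, vector) pairs, skipping ids the selector rejects."""
    candidates = []
    for vec_id, vec in vectors:
        if id_selector is not None and not id_selector(vec_id):
            continue  # this id is not part of the searchable subset
        dist = sum((a - b) ** 2 for a, b in zip(query, vec))
        candidates.append((dist, vec_id))
    candidates.sort()
    return [vec_id for _, vec_id in candidates[:k]]


db = [(10, (0.0, 0.0)), (11, (1.0, 0.0)), (12, (0.0, 2.0)), (13, (3.0, 3.0))]
allowed = {11, 13}

print(search_subset((0.0, 0.0), db, k=2))
print(search_subset((0.0, 0.0), db, k=2, id_selector=allowed.__contains__))
```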
There is also some refactoring: the IDSelector was moved to its own .h/.cpp and python/__init__.py is split into parts.
The diff is quite bulky because the search function prototypes need to be changed in all index classes.
Things to fix in subsequent diffs:
- support SearchParameters for more index types (Flat variants)
- better sub-object ownership for SearchParams (with std::unique_ptr?)
- special handling of IDSelectorRange to make it faster
Reviewed By: alexanderguzhva
Differential Revision: D39852589
fbshipit-source-id: 4988bdb5b9bee1207cd327d3f80bf5e0e2467fe1
Summary:
I couldn't run the client-server implementation because of the logging. Indeed `LOG.info('Connected by', addr, end=' ')` raised an exception (`end` is not recognised as a valid argument).
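The failure can be reproduced with the standard `logging` module alone: logger methods take a format string plus arguments, and unlike `print` they do not accept an `end` keyword. The logger name, the `ListHandler` class, and the address below are placeholders for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


LOG = logging.getLogger("client_server_demo")
LOG.setLevel(logging.INFO)
LOG.propagate = False
handler = ListHandler()
LOG.addHandler(handler)

addr = ("127.0.0.1", 12345)

# print-style call: `end` is not a valid keyword for Logger.info.
try:
    LOG.info("Connected by", addr, end=" ")
    raised = False
except TypeError:
    raised = True

# logging-style call: pass the value as a %-format argument.
LOG.info("Connected by %s", addr)

print(raised, handler.messages)
```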
Other warnings are also showing up. This PR cleans things up a bit and fixes the client-server.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2433
Reviewed By: alexanderguzhva
Differential Revision: D39167576
Pulled By: mdouze
fbshipit-source-id: 6f74d582f14e353e04029e6465bd6e488a865289
Summary:
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Various readability fixes focused on `.md` files:
- Grammar
- Fix some incorrect command references to `distributed_kmeans.py`
- Styling the markdown bash code snippets sections so they format
Attempted to put a lot of little things into one PR and commit; let me know if any mods are needed!
Best,
Ryan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2378
Reviewed By: alexanderguzhva
Differential Revision: D37717671
Pulled By: mdouze
fbshipit-source-id: 0039192901d98a083cd992e37f6b692d0572103a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217
This diff introduces a new Faiss contrib module that contains:
- generic k-means implemented in python (was in distributed_ondisk)
- the two-level clustering code, including a simple function that runs it on a Faiss IVF index.
- sparse clustering code (new)
The main idea is that this code is often re-used, so it is better to have it in contrib.
Reviewed By: beauby
Differential Revision: D34170932
fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77