Summary: Add normalize_l2 boolean to distributed training API. This only adds the field; the implementation will come in the next diff.
Reviewed By: junjieqi
Differential Revision: D72621956
fbshipit-source-id: 830794250837ff17e3adcd2f8f5c332601d2386f
Summary:
The raft/cuvs IVFPQ does not need the PQ code length check that exists for Faiss' original implementation. This check needlessly restricts IVFPQ to fewer parameter choices than raft/cuvs supports.
The check of supported PQ code length here
df6a8f6b4e/faiss/gpu/impl/IVFPQ.cu (L80-L102)
is for Faiss' original CUDA implementation. Raft/cuvs supports more choices.
The Faiss wiki also describes the limitation (https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU#limitations), which needs to be updated, too. Raft/cuvs is not limited to those choices.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4241
Reviewed By: bshethmeta, gtwang01
Differential Revision: D72200376
Pulled By: mnorris11
fbshipit-source-id: 2b81e822a397f3ab4a7c691e38be0186535d129d
Summary:
### Description
- Create custom readers and writers for index IO, which take function pointers as input
- Also expose these from the C_API
This is helpful for FFI use, where calling processes would pass upcall stubs for streamlined IO
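As a rough illustration of the callback pattern (not the actual Faiss or C_API signatures), a reader can be built around a caller-supplied function that serves requests of `size * nitems` bytes, mirroring an `fread`-style contract. `CallbackReader` and its method names are hypothetical.

```python
import io


class CallbackReader:
    """Hypothetical reader that delegates all IO to a caller-supplied function."""

    def __init__(self, read_fn):
        # read_fn(nbytes) -> bytes; it may return fewer bytes at end of stream
        self.read_fn = read_fn

    def read(self, size, nitems):
        """fread-style call: return (data, number of complete items read)."""
        data = self.read_fn(size * nitems)
        complete = len(data) // size
        return data[: complete * size], complete


# Any callable works as the callback; here it is the read method of an in-memory stream.
stream = io.BytesIO(bytes(range(10)))
reader = CallbackReader(stream.read)
first, n_first = reader.read(4, 2)    # asks for 8 bytes
second, n_second = reader.read(4, 2)  # only 2 bytes remain, so no complete item
```

The calling process only has to hand over one function, which is what makes this shape convenient for upcall stubs.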
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4180
Reviewed By: gtwang01
Differential Revision: D71208266
Pulled By: mnorris11
fbshipit-source-id: ab82397d4780a2a07c7bfdc52329968377f42af4
Summary:
This is a reference implementation of https://arxiv.org/pdf/2405.12497
> Jianyang Gao, Cheng Long, "RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search".
The goal is to correctly set up the internals using Faiss.
Some comments on the implementation:
* The code does not include the computations for the symmetric distance, because it is absent from the original article. This can be added later, though.
* The original `RaBitQ` includes random matrix rotation as a part of it, but I've decided to rely on the external `faiss::IndexPreTransform` and `faiss::RandomRotationMatrix` facilities.
* Certain features required internal changes in `faiss::IndexIVF`, but I made them as minimally invasive as possible, without breaking backward compatibility.
* Not sure about naming conventions; maybe certain classes and structures need to be renamed.
* `METRIC_INNER_PRODUCT` is supported as well.
* More unit tests are needed?
* I did not add any hardware-specific optimizations, because this is a reference implementation. Certain `simdlib` facilities may be added later, if needed.
Here's how to use IndexRaBitQ
```Python
ds = datasets.SyntheticDataset(...)
index_rbq = faiss.IndexRaBitQ(ds.d, faiss.METRIC_L2)
index_rbq.qb = 8
# wrap with random rotations
rrot = faiss.RandomRotationMatrix(ds.d, ds.d)
rrot.init(rrot_seed)
index_cand = faiss.IndexPreTransform(rrot, index_rbq)
index_cand.train(ds.get_train())
index_cand.add(ds.get_database())
```
Here's how to use IndexIVFRaBitQ
```Python
ds = datasets.SyntheticDataset(...)
index_flat = faiss.IndexFlat(ds.d, faiss.METRIC_L2)
index_rbq = faiss.IndexIVFRaBitQ(index_flat, ds.d, nlist, faiss.METRIC_L2)
index_rbq.qb = 8
# wrap with random rotations
rrot = faiss.RandomRotationMatrix(ds.d, ds.d)
rrot.init(rrot_seed)
index_cand = faiss.IndexPreTransform(rrot, index_rbq)
index_cand.train(ds.get_train())
index_cand.add(ds.get_database())
```
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4235
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
buck run 'fbcode//mode/dev' fbcode//faiss/tests:test_rabitq
Reviewed By: mdouze
Differential Revision: D71638302
Pulled By: junjieqi
fbshipit-source-id: de981a6aed91d296237d8accf337359de04a552e
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4256
Pass row filters to Hive Reader to filter rows. This is needed for filtering for is_high_priority=true for Unicorn dataset
Reviewed By: junjieqi
Differential Revision: D71874955
fbshipit-source-id: b8ab4d9fbc8493b0da44ada66fa03339aacba9f6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4250
This is an attempt to re-land the diff stack D69972250 D70982449
It was reverted because the bottom of the stack did not pass the tests.
The original code comes from Alexandr Guzhva's https://github.com/facebookresearch/faiss/pull/4199
To the adsmarket steward: the diff was already accepted by your team (see D70982449), but reverted for an independent reason. So should be easy to accept now.
Reviewed By: mengdilin
Differential Revision: D71614511
fbshipit-source-id: 94139b4a4d457afe0d37ac95342537414aa81e7a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4246
CUDA kernel variables matching the type `(thread|block|grid).(Idx|Dim).(x|y|z)` [have the data type `uint`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#built-in-variables).
Many programmers mistakenly use implicit casts to turn these data types into `int`. In fact, the [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/) itself is inconsistent and incorrect in its use of data types in programming examples.
The result of these implicit casts is that our kernels may give unexpected results when exposed to large datasets, i.e., those exceeding ~2B items.
While we now have linters in place to prevent simple mistakes (D71236150), our codebase has many problematic instances. This diff fixes some of them.
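The failure mode can be reproduced outside CUDA. The sketch below uses Python's `ctypes` to mimic 32-bit arithmetic for the common `blockIdx.x * blockDim.x + threadIdx.x` pattern; the function names are illustrative, and the wrap-around shown is what typical two's-complement targets do, not a statement about any specific kernel in this diff.

```python
import ctypes


def global_index_int32(block_idx, block_dim, thread_idx):
    """Mimic `int i = blockIdx.x * blockDim.x + threadIdx.x` with 32-bit operands."""
    # uint arithmetic wraps modulo 2**32 ...
    wrapped = (block_idx * block_dim + thread_idx) & 0xFFFFFFFF
    # ... and storing the result in a 32-bit int reinterprets the top bit as a sign.
    return ctypes.c_int32(wrapped).value


def global_index_int64(block_idx, block_dim, thread_idx):
    """Mimic widening to 64 bits before multiplying."""
    return block_idx * block_dim + thread_idx


# 4194304 blocks of 512 threads reach exactly 2**31 items.
narrow = global_index_int32(4_194_304, 512, 0)
wide = global_index_int64(4_194_304, 512, 0)
print(narrow, wide)  # -2147483648 2147483648
```

A negative index like this either fails a bounds check such as `i < n` in the wrong direction or reads out of range, which is why widening before the multiplication matters.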
Reviewed By: dtolnay
Differential Revision: D71355340
fbshipit-source-id: 77dac270e1d3415bfe7d5cc214006d5176508474
Summary:
The problem happens if `radius - normalizers[2 * q + 1]` is negative. Thus, it is possible to provide reasonable parameters to `IVFPQFastScan::RangeSearch()` and get an empty result.
I have no idea WHY (hardware implementation, it seems), but the following code
```C++
#include <cstddef>
#include <cstdint>
#include <iostream>

int main() {
    float f = -25.5f;
    uint16_t t = f;
    std::cout << t << std::endl;
    return 0;
}
```
prints `65511` on `x86` and `0` on ARM with the same compiler.
Thus, the `float` value needs to be cast to `int` first to get a consistent conversion:
```C++
uint16_t t = (int)f;
```
Who would have thought...
It would be useful to find C++ compiler command line flags that generate a compilation error for such behavior.
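One plausible explanation, offered as background rather than as a diagnosis of this specific case: in C++, converting a floating-point value to an integer type that cannot represent the truncated value is undefined behavior, so `-25.5f` to `uint16_t` may legitimately differ between targets. Going through `int` first yields -25, and the following conversion to an unsigned 16-bit type is defined as reduction modulo 2^16. The arithmetic can be checked in Python; `float_to_uint16_via_int` is an illustrative name.

```python
def float_to_uint16_via_int(f):
    """Mimic `uint16_t t = (int)f;` for values that fit in an int."""
    truncated = int(f)            # truncates toward zero, like a C++ float-to-int cast
    return truncated % (1 << 16)  # signed-to-unsigned conversion wraps modulo 2**16


print(float_to_uint16_via_int(-25.5))  # 65511
print(float_to_uint16_via_int(25.5))   # 25
```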
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4247
Reviewed By: junjieqi, satymish
Differential Revision: D71427185
Pulled By: gtwang01
fbshipit-source-id: 3ff3a9d3bb523e48bb9512c380c042bb1c2decdb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4245
When the number of clustering embeddings exceeds the int32 max, calculating the imbalance factor from the Python side fails because no matching function overload is found.
```
[0]:[rank0]: return faiss.imbalance_factor(len(assign), k, faiss.swig_ptr(assign))
[0]:[rank0]: NotImplementedError: Wrong number or type of arguments for overloaded function 'imbalance_factor'.
[0]:[rank0]: Possible C/C++ prototypes are:
[0]:[rank0]: faiss::imbalance_factor(int,int,int64_t const *)
[0]:[rank0]: faiss::imbalance_factor(int,int const *)
```
Fix it by changing the function signature on the C++ side to support int64_t.
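For reference, the quantity involved can be sketched in pure Python, where integers do not overflow. The formula below (k times the sum of squared cluster sizes, divided by n squared) is my understanding of what `imbalance_factor` returns and is an assumption rather than a copy of the Faiss code: 1.0 means perfectly balanced clusters, and k means everything landed in one cluster.

```python
def imbalance_factor(assign, k):
    """Sketch of the imbalance factor for cluster assignments in range(k)."""
    hist = [0] * k
    for cluster_id in assign:
        hist[cluster_id] += 1
    n = len(assign)
    return k * sum(count * count for count in hist) / (n * n)


balanced = imbalance_factor([0, 1, 2, 3], k=4)
skewed = imbalance_factor([0, 0, 0, 0], k=4)
print(balanced, skewed)  # 1.0 4.0
```

The only part that depends on the integer width is `n`, the length of the assignment array, which is what overflowed the old `int` parameter.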
Reviewed By: bshethmeta
Differential Revision: D71130612
fbshipit-source-id: becbf464a9d3979269cc7f0cecc6b80a6f9e1199
Summary:
Fix https://github.com/facebookresearch/faiss/issues/4224.
The issue is that `IndexHNSW`'s internal `Index* storage` doesn't inherit `metric_arg`. One solution is to set `metric_arg` in `IndexHNSW::add`, which is what I did. Not sure what the best place to do this would be.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4239
Reviewed By: mdouze
Differential Revision: D71225749
Pulled By: gtwang01
fbshipit-source-id: b27a592febadea153b575252df0c8ef14f7705d2
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4243
`-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it.
This:
```
try {
    ...
} catch (exception& e) {
    // no use of e
}
```
should instead be written as
```
} catch (exception&) {
```
If the code compiles, this is safe to land.
Reviewed By: dtolnay
Differential Revision: D71290934
fbshipit-source-id: f5e47eed369a9a024cc1e16a23acafa49f75b651
Summary:
Context issue: https://github.com/facebookresearch/faiss/issues/3503
We need search params support for binary flat index to be able to use it in RAG applications that support search with pre-filtering.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4055
Reviewed By: junjieqi
Differential Revision: D69538514
Pulled By: gtwang01
fbshipit-source-id: 4b6811fd8323b4c39e726b7fd33dfe0384dd57fc
Summary:
This PR introduces a backport of a combination of https://github.com/zilliztech/knowhere/pull/996 and https://github.com/zilliztech/knowhere/pull/1032 that allows memory-mapped and zero-copy indices.
The underlying idea is that we replace certain `std::vector<>` containers with a custom `faiss::MaybeOwnedVector<>` container, which may behave either as `std::vector<>`, or as a view of a certain pointer / descriptor. We don't replace all the instances of `std::vector<>`, only the largest ones.
This change affects `IndexFlatCodes`-based and `IndexHNSW` CPU indices.
(done) alter IVF lists as well.
(done) alter binary indices as well.
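The owned-versus-view distinction can be illustrated with a small Python analogue. This is only a sketch of the concept: `MaybeOwnedBuffer` is a hypothetical class and says nothing about how `faiss::MaybeOwnedVector<>` is actually implemented.

```python
class MaybeOwnedBuffer:
    """Hypothetical analogue: either owns a copy of the bytes or views a foreign buffer."""

    def __init__(self, data, own=True):
        self.is_owned = own
        # bytearray(data) copies; memoryview(data) only references the caller's buffer
        self._buf = bytearray(data) if own else memoryview(data)

    def __len__(self):
        return len(self._buf)

    def __getitem__(self, i):
        return self._buf[i]


source = bytearray(b"\x01\x02\x03")
owned = MaybeOwnedBuffer(source, own=True)
view = MaybeOwnedBuffer(source, own=False)

source[0] = 9  # mutate the original storage after both wrappers were created
print(owned[0], view[0])  # 1 9
```

The view stays cheap to create but is only valid while the underlying storage is alive, which is the lifetime concern the zero-copy reader leaves to the caller.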
Memory-mapped index works like this:
```C++
std::unique_ptr<faiss::Index> index_mm(
        faiss::read_index(filename.c_str(), faiss::IO_FLAG_MMAP_IFC));
```
In theory, it should be ready to be used from Python. All the descriptor management should be working.
Zero-copy index works like this:
```C++
#include <faiss/impl/zerocopy_io.h>
faiss::ZeroCopyIOReader reader(buffer.data(), buffer.size());
std::unique_ptr<faiss::Index> index_zc(faiss::read_index(&reader));
```
All the pointer management for `faiss::ZeroCopyIOReader` should be handled manually.
I'm not sure how to plug this into Python yet; maybe some ref-counting is required.
(done) some refactoring
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4199
Reviewed By: mengdilin
Differential Revision: D69972250
Pulled By: mdouze
fbshipit-source-id: 98a3f94d6884814873d3534ee25f960892ef1076
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4232
`nullptr` is preferable to `0` or `NULL`. Let's use it everywhere so we can enable `-Wzero-as-null-pointer-constant`.
- If you approve of this diff, please use the "Accept & Ship" button :-)
Reviewed By: dtolnay
Differential Revision: D70818157
fbshipit-source-id: a46d64b6d80844f5246f7df236eb6ec54ce2886f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4219
`code_distance-sve.h` references `PQDecoder8` but doesn't include the header. The issue is revealed by D68784260 which removed some includes from a header that indirectly included `ProductQuantizer.h`
```
headers/faiss/impl/code_distance/code_distance-sve.h:74:45: error: unknown type name 'PQDecoder8'; did you mean 'PQDecoderT'?
74 | std::enable_if_t<std::is_same_v<PQDecoderT, PQDecoder8>, float> inline distance_single_code_sve(
| ^~~~~~~~~~
| PQDecoderT
```
Reviewed By: ddrcoder
Differential Revision: D70433576
fbshipit-source-id: 12945b16003a3d6a995b18ffe9798179ecf826f4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4197
Ivan and I discussed 2 problems:
1. We may want to try to offload/shard PQ or SQ table data if there is a big enough win (pending)
2. IDs seem to be random after sharding.
This diff solves 2.
Root cause is that we add to quantizer without IDs.
Instead, we wrap in IndexIDMap2 (which provides reconstruction, whereas IndexIDMap does not).
Laser's quantizers are Flat and HNSW, so we can wrap like this.
Reviewed By: ivansopin
Differential Revision: D69832788
fbshipit-source-id: 331b6d1cf52666f5dac61e2b52302d46b0a83708
Summary:
If both `avx512` and `avx512_spr` are compiled, Sapphire Rapids capabilities are never loaded when using the Python bindings, as the `avx512` import always overrides the `avx512_spr` one.
This very small PR solves the issue.
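The intended selection order can be sketched as follows. This is a hypothetical illustration of "most specific variant wins", not the actual loader code; `PREFERENCE` and `pick_variant` are made-up names.

```python
# Most specific instruction set first; the names mirror the build variants.
PREFERENCE = ["avx512_spr", "avx512", "avx2", "generic"]


def pick_variant(cpu_supports, compiled):
    """Return the first variant that is both compiled and supported by the CPU."""
    for name in PREFERENCE:
        if name in compiled and (name == "generic" or name in cpu_supports):
            return name
    raise ImportError("no usable variant found")


compiled = {"avx512_spr", "avx512", "generic"}
on_spr = pick_variant({"avx2", "avx512", "avx512_spr"}, compiled)
on_older_cpu = pick_variant({"avx2", "avx512"}, compiled)
print(on_spr, on_older_cpu)  # avx512_spr avx512
```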
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4209
Reviewed By: mengdilin
Differential Revision: D70015045
Pulled By: gtwang01
fbshipit-source-id: d3553a6c9048a534c0901ee29e7e2354de96e79f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4205
Removing unused variable.
This piece of code began to be compiled after armv9a was set as the default compilation profile.
Reviewed By: andrewjcg
Differential Revision: D69946389
fbshipit-source-id: f2b5e57585506eb7cecbf76bf71bc6a2b5cc7133
Summary:
This is required to enable lazy setting of a device copy of the training dataset to a cuVS CAGRA index.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4173
Reviewed By: mnorris11
Differential Revision: D69795662
Pulled By: gtwang01
fbshipit-source-id: 68cda198ed7983800b64d3e5fac1b77ff55ecd12
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4198
1. pins lief due to `AttributeError: type object 'CLASS' has no attribute 'CLASS64'` (just set it to last passing nightly version)
2. pins mkl in gpu builds due to it trying to pull in 2024.2.2 which conflicts with 2023 in the libfaiss.
Added nightlies to make sure they pass https://github.com/facebookresearch/faiss/actions/runs/13422430425/job/37498020894. Not all passed: I'm not sure the `build-pull-request / Linux x86_64 GPU w/ cuVS nightlies (CUDA 12.4.0)` nightly is actually broken, but this unblocks the PR builds for now.
Reviewed By: junjieqi
Differential Revision: D69860604
fbshipit-source-id: 2da623c71b03c22d581b78655253a863fbafd3ed
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4150
Creates a sharding convenience function for IVF indexes.
- The __**centroids on the quantizer**__ are sharded based on the given sharding function. (not the data, as data sharding by ids is already implemented by copy_subset_to, https://github.com/facebookresearch/faiss/blob/main/faiss/IndexIVF.h#L408)
- The output is written to files based on the template filename generator param.
- The default sharding function is simply the ith vector mod the total shard count.
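The three points above can be sketched in plain Python. The helper names and the `shard_%d.index` template are hypothetical and only illustrate the default "i mod shard count" rule and a printf-style filename template; the real convenience function operates on the quantizer and writes index files.

```python
def default_sharding(i, shard_count):
    """Default rule described above: the i-th vector goes to shard i mod shard_count."""
    return i % shard_count


def shard_centroid_ids(n_centroids, shard_count, sharding_fn=default_sharding):
    """Group centroid positions by the shard chosen by sharding_fn."""
    shards = [[] for _ in range(shard_count)]
    for i in range(n_centroids):
        shards[sharding_fn(i, shard_count)].append(i)
    return shards


def shard_filenames(template, shard_count):
    """Hypothetical filename generator: fills a printf-style template per shard."""
    return [template % shard for shard in range(shard_count)]


shards = shard_centroid_ids(7, 3)
names = shard_filenames("shard_%d.index", 3)
print(shards)  # [[0, 3, 6], [1, 4], [2, 5]]
print(names)   # ['shard_0.index', 'shard_1.index', 'shard_2.index']
```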
This would be called by Laser here: https://www.internalfb.com/code/fbsource/[ce1f2e028e79]/fbcode/fblearner/flow/projects/laser/laser_sim_search/knn_trainer.py?lines=295-296. This convenience function will do the file writing, and return the created file names.
There are a few key required changes in FAISS:
1. Allow `std::vector<std::string>` to be used. Updates swigfaiss.swig and array_conversions.py to accommodate. These have to be numpy dtype of `object` instead of the more correct `unicode`, because unicode dtype is fixed length. I couldn't figure out how to create a numpy array with each of the output file names where they have different dtypes. (Say the file names are like file1, file11, file111. The dtype would need to be U5, U6, U7 respectively, as the dtype for unicode contains the length). I tried structured arrays : this does not work either, as numpy makes it into a matrix instead: the `file1 file11 file111` example with explicit setting of U5, U6, U7 turns into `[[file1 file1 file1], [file1 file11 file11], [file1 file11 file111]]`, which we do not want. If someone knows the right syntax, please yell at me
2. Create Python callbacks for sharding and template filename: `PyCallbackFilenameTemplateGenerator` and `PyCallbackShardingFunction`. Users of this function would inherit from the FilenameTemplateGenerator or ShardingFunction in C++ to pass to `shard_ivf_index_centroids`. See the other examples in python_callbacks.cpp. This is required because Python functions cannot be passed through SWIG to C++ (i.e. no std::function or function pointers), so we have to use this approach. This approach allows it to be called from both C++ and Python. test_sharding.py shows the Python calling, test_utils.cpp shows the C++ calling.
Reviewed By: asadoughi
Differential Revision: D68534991
fbshipit-source-id: b857e20c6cc4249a2ab7792db4c93dd4fb8403fd
Summary:
Add ability to search HNSW indexes using a plain [`SearchParameters`](6c046992a7/faiss/Index.h (L64-L69)) object (i.e. only an [`IDSelector`](6c046992a7/faiss/Index.h (L66)))
Issue: Currently if a plain `SearchParameters` is used to query an HNSW index, [an error is thrown](6c046992a7/faiss/IndexHNSW.cpp (L251)) -- when the user's intent was only to filter some documents, and rely on index settings for remaining parameters (like `efSearch`, `check_relative_distance`, `search_bounded_queue`)
Motivation: Faiss provides an amazing [index factory](https://github.com/facebookresearch/faiss/wiki/The-index-factory) and [parameter setter](https://github.com/facebookresearch/faiss/wiki/Index-IO,-cloning-and-hyper-parameter-tuning) to abstract away internals of the index type and settings used, like:
```cpp
Index* index = index_factory(256, "HNSW32");
ParameterSpace().set_index_parameters(index, "efConstruction=200,efSearch=150");
```
Now if a user wants to perform a filtered search on this _opaque_ index using:
```cpp
SearchParameters parameters;
parameters.sel = new IDSelectorRange(10, 20);
index->search(nq, xq, k, d, id, &parameters);
```
they are met with an error:
```
faiss/IndexHNSW.cpp:251: Error: '!(params)' failed: params type invalid
```
An easy way to reproduce this issue is to replace `Flat` -> `HNSW` [here](6c046992a7/c_api/example_c.c (L60)) and run `example_c` like:
```
make -C build example_c
./build/c_api/example_c
```
This PR allows passing a plain `SearchParameters` to HNSW indexes, and uses the index settings as a fallback.
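The fallback behavior can be sketched with stand-in classes. The class names mirror the Faiss ones for readability, but everything here is a simplified, hypothetical model (including the default `efSearch` value), not the code in `IndexHNSW.cpp`.

```python
class SearchParameters:
    """Stand-in for the plain parameters object: only carries a selector."""

    def __init__(self, sel=None):
        self.sel = sel


class SearchParametersHNSW(SearchParameters):
    """Stand-in for the HNSW-specific parameters."""

    def __init__(self, sel=None, efSearch=16):
        super().__init__(sel)
        self.efSearch = efSearch


def resolve_ef_search(index_ef_search, params=None):
    """Use the per-query value when HNSW parameters are given, else the index setting."""
    if isinstance(params, SearchParametersHNSW):
        return params.efSearch
    return index_ef_search  # plain SearchParameters or None: fall back


plain = SearchParameters(sel="ids 10..20")
print(resolve_ef_search(150, plain))                              # 150
print(resolve_ef_search(150, SearchParametersHNSW(efSearch=64)))  # 64
```

In both cases the selector on `params` would still be honored; only the HNSW-specific knobs fall back to what was configured on the index.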
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4167
Reviewed By: asadoughi
Differential Revision: D69312175
Pulled By: mnorris11
fbshipit-source-id: 63cc1deb6cb6116850cb3f8f7866eaa3a911ee48