faiss

mirror of https://github.com/facebookresearch/faiss.git synced 2025-06-03 03:08:47 +08:00

Author	SHA1	Message	Date
CodemodService Bot	b122987939	Fix CQS signal. Id] 88153895 -- readability-redundant-string-init in fbcode/faiss Reviewed By: dtolnay Differential Revision: D72700675	2025-04-09 08:14:34 -07:00
Satyendra Mishra	7eac0346f5	Add normalize_l2 boolean to distributed training API Summary: Add normalize_l2 boolean to distributed training API. This is just adding the field, implementation will come in the next diff Reviewed By: junjieqi Differential Revision: D72621956 fbshipit-source-id: 830794250837ff17e3adcd2f8f5c332601d2386f	2025-04-08 16:23:27 -07:00
Jaap Aarts	0dfb599eac	Handle insufficient driver gracefully (#4271 ) Summary: Gracefully handle insufficient drivers (ex. no driver available.) Resolves https://github.com/facebookresearch/faiss/issues/4251 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4271 Reviewed By: mnorris11 Differential Revision: D72351969 Pulled By: ramilbakhshyiev fbshipit-source-id: de2b6f741087c59665e7f9f171ee6096c7eea39b	2025-04-03 00:27:37 -07:00
Alexandr Guzhva	d4e236b500	relax input params for IndexIVFRaBitQ::get_InvertedListScanner() (#4270 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4270 Reviewed By: mnorris11 Differential Revision: D72254929 Pulled By: junjieqi fbshipit-source-id: 8354b58007d50d1daf06a3bfff4d2d05962c16af	2025-04-01 21:35:03 -07:00
Alexandr Guzhva	df9e2c48d6	Fix a placeholder for 'unimplemented' in mapped_io.cpp (#4268 ) Summary: This should fix a problem on macos compilation (just compilation), as discussed in https://github.com/facebookresearch/faiss/pull/4250#issuecomment-2767317033 mnorris11 please verify Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4268 Reviewed By: junjieqi Differential Revision: D72215145 Pulled By: mnorris11 fbshipit-source-id: ccac8aedacaef330dbdc18888d16f870d008df0f	2025-04-01 21:29:26 -07:00
wwq	0d3aff9066	fix bug: IVFPQ of raft/cuvs does not require redundant check (#4241 ) Summary: The IVFPQ of raft/cuvs does not require the PQ length check that applies to Faiss' original implementation. This check needlessly restricts IVFPQ to fewer parameter choices than raft/cuvs supports. The check of supported PQ code length here `df6a8f6b4e/faiss/gpu/impl/IVFPQ.cu (L80-L102)` is for Faiss' original CUDA implementation. Raft/cuvs supports more choices. The Faiss wiki also describes the limitation (https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU#limitations), which needs to be updated, too. Raft/cuvs is not limited to those choices. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4241 Reviewed By: bshethmeta, gtwang01 Differential Revision: D72200376 Pulled By: mnorris11 fbshipit-source-id: 2b81e822a397f3ab4a7c691e38be0186535d129d	2025-04-01 13:31:48 -07:00
Kaival Parikh	a4401c13d8	Allow using custom index readers and writers (#4180 ) Summary: ### Description - Create custom readers and writers for index IO, which take function pointers as input - Also expose these from the C_API This is helpful for FFI use, where calling processes would pass upcall stubs for streamlined IO Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4180 Reviewed By: gtwang01 Differential Revision: D71208266 Pulled By: mnorris11 fbshipit-source-id: ab82397d4780a2a07c7bfdc52329968377f42af4	2025-04-01 11:05:29 -07:00
Tarang Jain	636d95e8a4	Upgrade to libcuvs=25.04 (#4164 ) Summary: - [x] Upgrade cuVS version to 25.04 (nightly) - [x] Update install docs; deprecate faiss-gpu-raft - [x] CAGRA IVF-PQ Params as shared_ptr Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4164 Reviewed By: bshethmeta, gtwang01 Differential Revision: D72194928 Pulled By: mnorris11 fbshipit-source-id: ef5143760bebc2fcb2a3dc20ddc26b5d02a5c21d	2025-04-01 10:28:02 -07:00
Junjie Qi	7f523f0849	ignore regex (#4264 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4264 same as title Reviewed By: bshethmeta Differential Revision: D72179831 fbshipit-source-id: 9e77ef382312e843e68f388ee6360a6b26b032d4	2025-03-31 23:00:18 -07:00
Alexandr Guzhva	ccc2b33c88	fix a serialization problem in RaBitQ (#4261 ) Summary: it seems that `2937f94751` was not included in https://github.com/facebookresearch/faiss/pull/4235. This PR fixes this problem. junjieqi Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4261 Reviewed By: gtwang01 Differential Revision: D72180816 Pulled By: junjieqi fbshipit-source-id: 55e156a3499fda6f8cdbb99ed941a3cbdd721417	2025-03-31 22:00:35 -07:00
Kaival Parikh	13255a8bf0	Publish the C API to Conda (#4186 ) Summary: Addresses https://github.com/facebookresearch/faiss/issues/4181 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4186 Reviewed By: junjieqi Differential Revision: D70328657 Pulled By: mnorris11 fbshipit-source-id: a8bda4f3342af557369b625e8ace13e9a1d92d65	2025-03-30 20:11:39 -07:00
Alexandr Guzhva	3a49130cec	RaBitQ implementation (#4235 ) Summary: This is a reference implementation of the https://arxiv.org/pdf/2405.12497 > Jianyang Gao, Cheng Long, "RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search". The goal is to correctly set up the internals using Faiss. The following comments for the implementation: * The code does not include the computations for the symmetric distance, because it is absent in the original article. This can be added later, though. * The original `RaBitQ` includes random matrix rotation as a part of it, but I've decided to rely on external `faiss::IndexPreTransform` and `faiss::RandomRotationMatrix` facilities. * Certain features required internal changes in `faiss::IndexIVF`, but I kept them as minimally invasive as possible, without breaking backward compatibility. * Not sure about naming conventions, maybe certain classes and structures need to be renamed * `METRIC_INNER_PRODUCT` is supported as well * More unit tests are needed? * I did not bring any hardware-specific optimizations, because this is a reference implementation. Certain `simdlib` facilities may be added later, if needed Here's how to use IndexRaBitQ ```Python ds = datasets.SyntheticDataset(...) index_rbq = faiss.IndexRaBitQ(ds.d, faiss.METRIC_L2) index_rbq.qb = 8 # wrap with random rotations rrot = faiss.RandomRotationMatrix(ds.d, ds.d) rrot.init(rrot_seed) index_cand = faiss.IndexPreTransform(rrot, index_rbq) index_cand.train(ds.get_train()) index_cand.add(ds.get_database()) ``` Here's how to use IndexIVFRaBitQ ```Python ds = datasets.SyntheticDataset(...) 
index_flat = faiss.IndexFlat(ds.d, faiss.METRIC_L2) index_rbq = faiss.IndexIVFRaBitQ(index_flat, ds.d, nlist, faiss.METRIC_L2) index_rbq.qb = 8 # wrap with random rotations rrot = faiss.RandomRotationMatrix(ds.d, ds.d) rrot.init(rrot_seed) index_cand = faiss.IndexPreTransform(rrot, index_rbq) index_cand.train(ds.get_train()) index_cand.add(ds.get_database()) ``` Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4235 Test Plan: Imported from GitHub, without a `Test Plan:` line. buck run 'fbcode//mode/dev' fbcode//faiss/tests:test_rabitq Reviewed By: mdouze Differential Revision: D71638302 Pulled By: junjieqi fbshipit-source-id: de981a6aed91d296237d8accf337359de04a552e	2025-03-29 12:26:39 -07:00
Satyendra Mishra	c2fc549085	Pass row filters to Hive Reader to filter rows (#4256 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4256 Pass row filters to Hive Reader to filter rows. This is needed for filtering for is_high_priority=true for Unicorn dataset Reviewed By: junjieqi Differential Revision: D71874955 fbshipit-source-id: b8ab4d9fbc8493b0da44ada66fa03339aacba9f6	2025-03-27 18:32:33 -07:00
Mayank Bhatia	6116d36af7	Grammar fix in FlatIndexHNSW (#4253 ) Summary: Changed "with with" to "with" Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4253 Reviewed By: gtwang01 Differential Revision: D71857100 Pulled By: junjieqi fbshipit-source-id: 6c11e3767cb0d244707c889206de10169fccd6bf	2025-03-26 01:26:39 -07:00
Matthijs Douze	1debb7d812	re-land mmap diff (#4250 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4250 This is an attempt to re-land the diff stack D69972250 D70982449 It was reverted because the bottom of the stack did not pass the tests. The original code comes from Alexandr Guzhva's https://github.com/facebookresearch/faiss/pull/4199 To the adsmarket steward: the diff was already accepted by your team (see D70982449), but reverted for an independent reason. So should be easy to accept now. Reviewed By: mengdilin Differential Revision: D71614511 fbshipit-source-id: 94139b4a4d457afe0d37ac95342537414aa81e7a	2025-03-24 09:56:45 -07:00
Richard Barnes	0f2035cc83	Fix CUDA kernel index data type in faiss/gpu/impl/DistanceUtils.cuh +10 (#4246 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4246 CUDA kernel variables matching the type `(thread\|block\|grid).(Idx\|Dim).(x\|y\|z)` [have the data type `uint`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#built-in-variables). Many programmers mistakenly use implicit casts to turn these data types into `int`. In fact, the [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/) itself is inconsistent and incorrect in its use of data types in programming examples. The result of these implicit casts is that our kernels may give unexpected results when exposed to large datasets, i.e., those exceeding ~2B items. While we now have linters in place to prevent simple mistakes (D71236150), our codebase has many problematic instances. This diff fixes some of them. Reviewed By: dtolnay Differential Revision: D71355340 fbshipit-source-id: 77dac270e1d3415bfe7d5cc214006d5176508474	2025-03-19 13:19:34 -07:00
Alexandr Guzhva	1dcbb4af32	fix `IVFPQFastScan::RangeSearch()` on the `ARM` architecture (#4247 ) Summary: the problem happens if `radius - normalizers[2 * q + 1]` is negative. Thus, it is possible to provide reasonable parameters to `IVFPQFastScan::RangeSearch()` and get an empty result. I have no idea WHY (hardware implementation, it seems), but the following code ```C++ #include <cstddef> #include <cstdint> #include <iostream> int main() { float f = -25.5f; uint16_t t = f; std::cout << t << std::endl; return 0; } ``` prints `65511` on `x86` and `0` on ARM on the same compiler. Thus, it is needed to wrap the `float` value with `int` to preserve a correct cast: ```C++ uint16_t t = (int)f; ``` Who would have thought... It is useful to find some C++ compiler command line flags that will generate a compilation error on such a behavior. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4247 Reviewed By: junjieqi, satymish Differential Revision: D71427185 Pulled By: gtwang01 fbshipit-source-id: 3ff3a9d3bb523e48bb9512c380c042bb1c2decdb	2025-03-19 11:17:12 -07:00
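The `(int)f` workaround described in commit 1dcbb4af32 can be mimicked in Python. This is a minimal sketch, assuming the intended result comes from truncating the float toward zero and then wrapping the integer modulo 2^16; the helper name `to_uint16_via_int` is hypothetical and not part of Faiss.

```python
def to_uint16_via_int(value: float) -> int:
    """Mimic the C++ expression `(uint16_t)(int)value` for small magnitudes."""
    # Assumption: the float -> int step truncates toward zero, as int() does.
    truncated = int(value)
    # Assumption: the int -> uint16_t step keeps the low 16 bits (wraps modulo 2**16).
    return truncated & 0xFFFF


# -25.5 truncates to -25, and -25 modulo 65536 is 65511,
# the value the commit message reports on x86.
print(to_uint16_via_int(-25.5))
```

Going through a signed integer first makes the wrap-around well defined, whereas the direct `float` to `uint16_t` conversion of a negative value is what produced different results on x86 and ARM in the commit message.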
Mengdi Lin	8bce244f1f	fix integer overflow issue when calculating imbalance_factor (#4245 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4245 When the number of clustering embeddings exceeds int32 max, calculating the imbalance factor from the Python side causes a "function overload not found" error. ``` [0]:[rank0]: return faiss.imbalance_factor(len(assign), k, faiss.swig_ptr(assign)) [0]:[rank0]: NotImplementedError: Wrong number or type of arguments for overloaded function 'imbalance_factor'. [0]:[rank0]: Possible C/C++ prototypes are: [0]:[rank0]: faiss::imbalance_factor(int,int,int64_t const ) [0]:[rank0]: faiss::imbalance_factor(int,int const ) ``` Fix it by changing the function signature on the C++ side to support int64_t. Reviewed By: bshethmeta Differential Revision: D71130612 fbshipit-source-id: becbf464a9d3979269cc7f0cecc6b80a6f9e1199	2025-03-19 04:28:16 -07:00
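The overflow in commit 8bce244f1f concerns the element count passed to `imbalance_factor`, not the arithmetic itself. The pure-Python sketch below illustrates both points. It assumes the imbalance factor is `k * sum(h_i^2) / (sum(h_i))^2` over the cluster-size histogram `h`; that definition and the helper names are assumptions for illustration, not the Faiss source.

```python
from collections import Counter

INT32_MAX = 2**31 - 1


def imbalance_factor_sketch(assign, k):
    """Assumed definition: k * sum(h_i^2) / (sum(h_i))^2 over cluster sizes h_i."""
    hist = Counter(assign)                      # cluster id -> number of vectors
    total = sum(hist.values())
    sum_sq = sum(count * count for count in hist.values())
    return k * sum_sq / (total * total)


def fits_int32(n):
    """True if n can be passed through a C++ `int` parameter without overflow."""
    return -2**31 <= n <= INT32_MAX


# Perfectly balanced assignment over 4 clusters gives 1.0;
# putting everything in one of 4 clusters gives 4.0.
print(imbalance_factor_sketch([0, 1, 2, 3], 4))
print(imbalance_factor_sketch([0, 0, 0, 0], 4))
# A count of 3 billion vectors does not fit the `int` overload.
print(fits_int32(3_000_000_000))
```

Python integers have arbitrary precision, so the limit only appears at the binding layer, where the count has to match a C++ parameter type.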
Rohil Shah	5adab67efb	Fix bug with metric_arg in IndexHNSW (#4239 ) Summary: Fix https://github.com/facebookresearch/faiss/issues/4224. The issue is that `IndexHNSW`'s internal `Index* storage` doesn't inherit `metric_arg`. One solution is to set `metric_arg` in `IndexHNSW::add`, which is what I did. Not sure what the best place to do this would be. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4239 Reviewed By: mdouze Differential Revision: D71225749 Pulled By: gtwang01 fbshipit-source-id: b27a592febadea153b575252df0c8ef14f7705d2	2025-03-18 23:49:28 -07:00
Mengdi Lin	f2f7a66b50	Back out "test merge with internal repo" (#4244 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4244 Original commit changeset: cd8a17e6527d Original Phabricator Diff: D71327673 Reviewed By: junjieqi, gtwang01 Differential Revision: D71348755 fbshipit-source-id: 98058e81d3ba4e1d1614cc346d1b455d1de6e635	2025-03-17 19:16:13 -07:00
Junjie Qi	caa5f24656	test merge with internal repo (#4242 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4242 Reviewed By: bshethmeta Differential Revision: D71327673 Pulled By: mengdilin fbshipit-source-id: cd8a17e6527d245adc6689708f94e2932324adf5	2025-03-17 14:17:34 -07:00
Richard Barnes	9e808d4ea1	Remove unused exception parameter from faiss/impl/ResultHandler.h (#4243 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4243 `-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it. This: ``` try { ... } catch (exception& e) { // no use of e } ``` should instead be written as ``` } catch (exception&) { ``` If the code compiles, this is safe to land. Reviewed By: dtolnay Differential Revision: D71290934 fbshipit-source-id: f5e47eed369a9a024cc1e16a23acafa49f75b651	2025-03-17 13:32:43 -07:00
Gustav von Zitzewitz	fec7ce96fb	SearchParameters support for IndexBinaryFlat (#4055 ) Summary: Context issue: https://github.com/facebookresearch/faiss/issues/3503 We need search params support for binary flat index to be able to use it in RAG applications that support search with pre-filtering. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4055 Reviewed By: junjieqi Differential Revision: D69538514 Pulled By: gtwang01 fbshipit-source-id: 4b6811fd8323b4c39e726b7fd33dfe0384dd57fc	2025-03-17 13:32:17 -07:00
George Wang	df6a8f6b4e	Address compile errors and warnings (#4238 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4238 Nightly has been broken and PRs have been blocked: https://github.com/facebookresearch/faiss/actions/runs/13798181461/job/38595760879?pr=4055 There are compiler errors in Autotune.cpp and warnings in some other files that this diff seeks to address. Reviewed By: r-barnes Differential Revision: D71135388 fbshipit-source-id: b3daeff8c93dfb45144b266f3b0562164959710c	2025-03-13 16:20:56 -07:00
Saumya Agarwal	15491a1e4f	Revert D69972250: Memory-mapping and Zero-copy deserializers Differential Revision: D69972250 Original commit changeset: 98a3f94d6884 Original Phabricator Diff: D69972250 fbshipit-source-id: 1bea8b8a26c14061a01f8b26b66f0c4e6a75f550	2025-03-11 11:43:17 -07:00
Saumya Agarwal	fbc7db2cce	Revert D69984379: mem mapping and zero-copy python fixes Differential Revision: D69984379 Original commit changeset: 9437b4ad92ef Original Phabricator Diff: D69984379 fbshipit-source-id: 3cb921fa79b6f20b6455b17e50acc3cb96bcbe7b	2025-03-11 11:43:17 -07:00
Matthijs Douze	631b0fde4f	mem mapping and zero-copy python fixes (#4212 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4212 Add files to TARGETS fix python Reviewed By: mengdilin Differential Revision: D69984379 fbshipit-source-id: 9437b4ad92ef49333a44ea37ec194364123fe825	2025-03-11 11:11:14 -07:00
Alexandr Guzhva	55a3c2aff4	Memory-mapping and Zero-copy deserializers (#4199 ) Summary: This PR introduces a backport of a combination of https://github.com/zilliztech/knowhere/pull/996 and https://github.com/zilliztech/knowhere/pull/1032 that allow memory-mapped and zero-copy indices. The root underlying idea is that we replace certain `std::vector<>` containers with a custom `faiss::MaybeOwnedVector<>` container, which may behave either as `std::vector<>`, or as a view of a certain pointer / descriptor. We don't replace all the instances of `std::vector<>`, but the largest ones. This change affects `IndexFlatCodes`-based and `IndexHNSW` CPU indices. (done) alter IVF lists as well. (done) alter binary indices as well. Memory-mapped index works like this: ```C++ std::unique_ptr<faiss::Index> index_mm( faiss::read_index(filenamename.c_str(), faiss::IO_FLAG_MMAP_IFC)); ``` In theory, it should be ready to be used from Python. All the descriptor management should be working. Zero-copy index works like this: ```C++ #include <faiss/impl/zerocopy_io.h> faiss::ZeroCopyIOReader reader(buffer.data(), buffer.size()); std::unique_ptr<faiss::Index> index_zc(faiss::read_index(&reader)); ``` All the pointer management for `faiss::ZeroCopyIOReader` should be handled manually. I'm not sure how to plug this into Python yet, maybe, some ref-counting is required. (done) some refactoring Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4199 Reviewed By: mengdilin Differential Revision: D69972250 Pulled By: mdouze fbshipit-source-id: 98a3f94d6884814873d3534ee25f960892ef1076	2025-03-11 11:11:14 -07:00
Richard Barnes	653be59386	Use `nullptr` in faiss/gpu/StandardGpuResources.cpp (#4232 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4232 `nullptr` is preferable to `0` or `NULL`. Let's use it everywhere so we can enable `-Wzero-as-null-pointer-constant`. - If you approve of this diff, please use the "Accept & Ship" button :-) Reviewed By: dtolnay Differential Revision: D70818157 fbshipit-source-id: a46d64b6d80844f5246f7df236eb6ec54ce2886f	2025-03-09 11:24:20 -07:00
Lucian Grijincu	3d96ad56a4	faiss: fix non-templated hammings function (#4195 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4195 Non-templated `hammings` call produced incorrect values. `hammings` is called from `hamming_distance_table`, which in turn is unused so no impact. https://www.internalfb.com/code/fbsource/[85684614381d9bdfaaa0bb4a42e244296e350848]/fbcode/faiss/IndexPQ.cpp?lines=439-446 Reviewed By: gtwang01 Differential Revision: D69613329 fbshipit-source-id: 5d02a99b04492a61ebf0134f0c1719eac86fbb4f	2025-03-07 11:38:20 -08:00
Junjie Qi	4cd2f6e007	Support non-partition col and map in the embedding reader (#4229 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4229 same as title Differential Revision: D70728870 fbshipit-source-id: aeb817d80b20e5671c81ba88cdd05797cb070d23	2025-03-06 19:01:59 -08:00
Junjie Qi	a22ec32dd3	Support cosine distance for training vectors (#4227 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4227 same as title Differential Revision: D70724590 fbshipit-source-id: 943648d9002b38ba967c254c8c7014fdc7ab3de8	2025-03-06 18:07:21 -08:00
Richard Barnes	c109174198	Fix LLVM-19 compilation issue in faiss/AutoTune.cpp (#4220 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4220 LLVM-19 is incoming. This fixes an issue preventing it. Delays to previous platform upgrades cost $3M/week. Reviewed By: dtolnay Differential Revision: D70449926 fbshipit-source-id: 20e0882b9363670d6c010e1c7870cb04155a3a9d	2025-03-02 21:20:51 -08:00
Shuyao Qi	615c17ea27	Add missing #include in code_distance-sve.h (#4219 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4219 `code_distance-sve.h` references `PQDecoder8` but doesn't include the header. The issue is revealed by D68784260 which removed some includes from a header that indirectly included `ProductQuantizer.h` ``` headers/faiss/impl/code_distance/code_distance-sve.h:74:45: error: unknown type name 'PQDecoder8'; did you mean 'PQDecoderT'? 74 \| std::enable_if_t<std::is_same_v<PQDecoderT, PQDecoder8>, float> inline distance_single_code_sve( \| ^~~~~~~~~~ \| PQDecoderT ``` Reviewed By: ddrcoder Differential Revision: D70433576 fbshipit-source-id: 12945b16003a3d6a995b18ffe9798179ecf826f4	2025-02-28 22:09:17 -08:00
Tom Jackson	eab52af8ea	Fix cloning and reverse index factory for NSG indices (#4151 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4151 Reviewed By: junjieqi, asadoughi Differential Revision: D68784260 fbshipit-source-id: a715b02fd18a59c393be3ccc9aa1a7be8b196cc8	2025-02-28 15:13:56 -08:00
George Wang	1a295cd544	Remove python_abi to fix nightly (#4217 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4217 liblief seemed to be causing issues in nightly: https://github.com/facebookresearch/faiss/actions/runs/13560151536/job/37908376889 Removing the pin while pinning conda-build resolves the issue. Reviewed By: mnorris11 Differential Revision: D70344910 fbshipit-source-id: c19bfcf187714fbe36e549bfb007eb9787a011b6	2025-02-27 16:20:26 -08:00
Shuyao Qi	4cea80b41c	Make static method in header inline (#4214 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4214 Got build failure with flags `[-Werror,-Wunneeded-internal-declaration]` ``` faiss/impl/code_distance/code_distance-sve.h:199:13 error: 'static' function 'distance_four_codes_sve_for_small_m' declared in header file should be declared 'static inline' [-Werror,-Wunneeded-internal-declaration] ``` Reviewed By: vit-ka Differential Revision: D70279069 fbshipit-source-id: 28b5cc8394a9a508e25f72777f74de685d242dc4	2025-02-26 22:09:54 -08:00
Michael Norris	835b3ea1bd	Fix IVF quantizer centroid sharding so IDs are generated (#4197 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4197 Ivan and I discussed 2 problems: 1. We may want to try to offload/shard PQ or SQ table data if there is a big enough win (pending) 2. IDs seem to be random after sharding. This diff solves 2. Root cause is that we add to quantizer without IDs. Instead, we wrap in IndexIDMap2 (which provides reconstruction, whereas IndexIDMap does not). Laser's quantizers are Flat and HNSW, so we can wrap like this. Reviewed By: ivansopin Differential Revision: D69832788 fbshipit-source-id: 331b6d1cf52666f5dac61e2b52302d46b0a83708	2025-02-24 16:01:08 -08:00
Michael Norris	65222b3ed7	Pin lief to fix nightly (#4211 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4211 Reviewed By: gtwang01 Differential Revision: D70102429 fbshipit-source-id: 68e265699448a825b82467064ca95742bd4e49c3	2025-02-24 12:46:51 -08:00
lkuffo	7cb4556456	Fix Sapphire Rapids never loading in Python bindings (#4209 ) Summary: If both `avx512` and `avx512_spr` are compiled, Sapphire Rapids capabilities are never loaded when using the Python bindings, as the `avx512` import always overrides the `avx512_spr` one. This very small PR solves the issue. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4209 Reviewed By: mengdilin Differential Revision: D70015045 Pulled By: gtwang01 fbshipit-source-id: d3553a6c9048a534c0901ee29e7e2354de96e79f	2025-02-21 22:38:30 -08:00
Michael Norris	20c7ca35bb	Upgrade openblas to 0.3.29 for ARM architectures (#4203 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4203 Related to issue: https://github.com/facebookresearch/faiss/issues/4202 Reviewed By: mengdilin Differential Revision: D69933126 fbshipit-source-id: cafc5f34d0f91450c5067827756b1297684b0ce3	2025-02-21 17:46:25 -08:00
George Wang	55d022fbb0	Attempt to nightly fix (#4204 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4204 Fix for S492386 I found a slight difference between failing nightly: https://github.com/facebookresearch/faiss/actions/runs/13429138293/job/37523589618 And last succeeding nightly: https://github.com/facebookresearch/faiss/actions/runs/13301645334/job/37182266030 The mkl package in the last succeeding nightly is 2023.0.0, and it is 2023.2.0 in the failing nightly. Since mkl was recently causing trouble, I pin mkl to 2023.0.0 in this diff to match the last succeeding nightly. Reviewed By: mnorris11 Differential Revision: D69937976 fbshipit-source-id: 0c4aba4322e26aa6a03bf3ea1dbee6ed7049092c	2025-02-21 02:37:15 -08:00
Navneet Verma	00ce0e2189	Add the support for IndexIDMap with Cagra index (#4188 ) Summary: ## Description Add the support for adding vectors with ids when IndexIDMap is used with Cagra Index. Resolves issue: https://github.com/facebookresearch/faiss/issues/4107 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4188 Reviewed By: mnorris11 Differential Revision: D69812544 Pulled By: gtwang01 fbshipit-source-id: 3c12c930e5d10ce214b12e68dacd63a644011b79	2025-02-21 00:32:05 -08:00
Nicolas De Carli	1fe8b8b5f1	Remove unused variable (#4205 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4205 Removing unused variable. This piece of code began to be compiled after armv9a has been set as default compilation profile Reviewed By: andrewjcg Differential Revision: D69946389 fbshipit-source-id: f2b5e57585506eb7cecbf76bf71bc6a2b5cc7133	2025-02-20 20:54:22 -08:00
Divye Gala	6b652892ff	Pass `store_dataset` argument along to cuVS CAGRA (#4173 ) Summary: This is required to enable lazy setting of a device copy of the training dataset to a cuVS CAGRA index. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4173 Reviewed By: mnorris11 Differential Revision: D69795662 Pulled By: gtwang01 fbshipit-source-id: 68cda198ed7983800b64d3e5fac1b77ff55ecd12	2025-02-20 19:30:30 -08:00
Michael Norris	d72d0cab6b	Fix nightly by installing earlier version of lief (#4198 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4198 1. pins lief due to `AttributeError: type object 'CLASS' has no attribute 'CLASS64'` (just set it to last passing nightly version) 2. pins mkl in gpu builds due to it trying to pull in 2024.2.2 which conflicts with 2023 in the libfaiss. Added nightlies to make sure they pass https://github.com/facebookresearch/faiss/actions/runs/13422430425/job/37498020894. Not all passed: I'm not sure the `build-pull-request / Linux x86_64 GPU w/ cuVS nightlies (CUDA 12.4.0)` nightly is actually broken, but this unblocks the PR builds for now. Reviewed By: junjieqi Differential Revision: D69860604 fbshipit-source-id: 2da623c71b03c22d581b78655253a863fbafd3ed	2025-02-19 21:44:03 -08:00
Bhavik Sheth	657c563604	Add bounds checking to hnsw nb_neighbors (#4185 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4185 Based on this user's comment, it seems like we should do bounds checking: https://github.com/facebookresearch/faiss/issues/4177 Reviewed By: mnorris11 Differential Revision: D69497295 fbshipit-source-id: 97025cf29c464afb0f85aa98f4b303489b7fc989	2025-02-14 10:39:11 -08:00
George Wang	f0e3832986	Check for not completed Summary: Check for not completed rather than just in_progress, as runs can be queued, waiting, etc. Fix due to failed nightly not retrying because retry build found it was "queued" instead of "in_progress" Failed nightly: https://github.com/facebookresearch/faiss/actions/runs/13301645334/attempts/1 Retry that didn't trigger: https://github.com/facebookresearch/faiss/actions/runs/13301647044/job/37144032841 Reviewed By: mengdilin Differential Revision: D69610422 fbshipit-source-id: a7a9b998bba160e8d1ba13c7ae2426d99125a7e8	2025-02-13 15:49:45 -08:00
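The change in commit f0e3832986 can be sketched as two predicates. This assumes run statuses are plain strings such as `queued`, `waiting`, `in_progress`, and `completed`; the function names are hypothetical and this is not the actual workflow code.

```python
def still_running_old(status: str) -> bool:
    """Old check: only an `in_progress` run counts as unfinished."""
    return status == "in_progress"


def still_running_new(status: str) -> bool:
    """New check: anything other than `completed` counts as unfinished."""
    return status != "completed"


# A queued run is missed by the old check but caught by the new one.
for status in ("queued", "waiting", "in_progress", "completed"):
    print(status, still_running_old(status), still_running_new(status))
```

Testing for the single terminal state is more robust than enumerating the non-terminal ones, since any status added later is treated as unfinished by default.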
Michael Norris	aff6bfcd80	Add sharding convenience function for IVF indexes (#4150 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4150 Creates a sharding convenience function for IVF indexes. - The __centroids on the quantizer__ are sharded based on the given sharding function. (not the data, as data sharding by ids is already implemented by copy_subset_to, https://github.com/facebookresearch/faiss/blob/main/faiss/IndexIVF.h#L408) - The output is written to files based on the template filename generator param. - The default sharding function is simply the ith vector mod the total shard count. This would be called by Laser here: https://www.internalfb.com/code/fbsource/[ce1f2e028e79]/fbcode/fblearner/flow/projects/laser/laser_sim_search/knn_trainer.py?lines=295-296. This convenience function will do the file writing, and return the created file names. There are a few key required changes in FAISS: 1. Allow `std::vector<std::string>` to be used. Updates swigfaiss.swig and array_conversions.py to accommodate. These have to be numpy dtype of `object` instead of the more correct `unicode`, because unicode dtype is fixed length. I couldn't figure out how to create a numpy array with each of the output file names where they have different dtypes. (Say the file names are like file1, file11, file111. The dtype would need to be U5, U6, U7 respectively, as the dtype for unicode contains the length). I tried structured arrays : this does not work either, as numpy makes it into a matrix instead: the `file1 file11 file111` example with explicit setting of U5, U6, U7 turns into `[[file1 file1 file1], [file1 file11 file11], [file1 file11 file111]]`, which we do not want. If someone knows the right syntax, please yell at me 2. Create Python callbacks for sharding and template filename: `PyCallbackFilenameTemplateGenerator` and `PyCallbackShardingFunction`. 
Users of this function would inherit from the FilenameTemplateGenerator or ShardingFunction in C++ to pass to `shard_ivf_index_centroids`. See the other examples in python_callbacks.cpp. This is required because Python functions cannot be passed through SWIG to C++ (i.e. no std::function or function pointers), so we have to use this approach. This approach allows it to be called from both C++ and Python. test_sharding.py shows the Python calling, test_utils.cpp shows the C++ calling. Reviewed By: asadoughi Differential Revision: D68534991 fbshipit-source-id: b857e20c6cc4249a2ab7792db4c93dd4fb8403fd	2025-02-07 11:39:59 -08:00
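The default sharding rule described in commit aff6bfcd80 (the ith vector goes to shard i mod the total shard count) can be sketched in plain Python. The function names below are hypothetical and do not reproduce the Faiss `ShardingFunction` interface; the sketch only shows how centroid indices would be distributed under that rule.

```python
def default_shard(i: int, shard_count: int) -> int:
    """Assumed default rule from the commit message: i-th vector mod shard count."""
    return i % shard_count


def shard_centroid_ids(n_centroids: int, shard_count: int) -> list:
    """Group centroid indices 0..n_centroids-1 by the shard they are assigned to."""
    shards = [[] for _ in range(shard_count)]
    for i in range(n_centroids):
        shards[default_shard(i, shard_count)].append(i)
    return shards


# Seven centroids over three shards: [[0, 3, 6], [1, 4], [2, 5]]
print(shard_centroid_ids(7, 3))
```

A modulo rule keeps shard sizes within one element of each other, which is a reasonable default when there is no locality information to exploit.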
Kaival Parikh	1d8f3931a3	Handle plain SearchParameters in HNSW searches (#4167 ) Summary: Add ability to search HNSW indexes using a plain [`SearchParameters`](`6c046992a7/faiss/Index.h (L64-L69)`) object (i.e. only an [`IDSelector`](`6c046992a7/faiss/Index.h (L66)`)) Issue: Currently if a plain `SearchParameters` is used to query an HNSW index, [an error is thrown](`6c046992a7/faiss/IndexHNSW.cpp (L251)`) -- when the user's intent was only to filter some documents, and rely on index settings for remaining parameters (like `efSearch`, `check_relative_distance`, `search_bounded_queue`) Motivation: Faiss provides an amazing [index factory](https://github.com/facebookresearch/faiss/wiki/The-index-factory) and [parameter setter](https://github.com/facebookresearch/faiss/wiki/Index-IO,-cloning-and-hyper-parameter-tuning) to abstract away internals of the index type and settings used, like: ```cpp Index* index = index_factory(256, "HNSW32"); ParameterSpace().set_index_parameters(index, "efConstruction=200,efSearch=150"); ``` Now if a user wants to perform a filtered search on this _opaque_ index using: ```cpp SearchParameters parameters; parameters.sel = new IDSelectorRange(10, 20); index->search(nq, xq, k, d, id, &parameters); ``` they are met with an error: ``` faiss/IndexHNSW.cpp:251: Error: '!(params)' failed: params type invalid ``` An easy way to reproduce this issue is to replace `Flat` -> `HNSW` [here](`6c046992a7/c_api/example_c.c (L60)`) and run `example_c` like: ``` make -C build example_c ./build/c_api/example_c ``` This PR allows passing a plain `SearchParameters` to HNSW indexes, and use index settings as a fallback Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4167 Reviewed By: asadoughi Differential Revision: D69312175 Pulled By: mnorris11 fbshipit-source-id: 63cc1deb6cb6116850cb3f8f7866eaa3a911ee48	2025-02-07 11:39:49 -08:00


1432 Commits