faiss

mirror of https://github.com/facebookresearch/faiss.git synced 2025-06-03 21:54:02 +08:00

Author	SHA1	Message	Date
Maria Lomeli	c09992bc8a	Back out "Better NaN handling" (#3006 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3006 Original commit changeset: 99e7786582e9 Original Phabricator Diff: D48031390 Reviewed By: algoriddle Differential Revision: D48353221 fbshipit-source-id: fd326f2a45d20f68507ca39a33a325528651b37d	2023-08-15 09:32:01 -07:00
Fernando Gasperi	e3deb71cdb	Enable for faiss tests (#3002 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3002 title Reviewed By: jbardini Differential Revision: D48266242 fbshipit-source-id: b53e186f1954916a90dc8dbba67963f40d0aead7	2023-08-14 08:03:40 -07:00
Gergely Szilvasy	ef7e945b4d	remove avx2 from raft cmake contbuild Summary: Unnecessary for contbuild and doubles the build time. Reviewed By: mlomeli1 Differential Revision: D48148734 fbshipit-source-id: ca44a1e328ce6980c8a867a33ce311fe6eeb90e0	2023-08-08 11:44:14 -07:00
Matthijs Douze	687457b2f4	Access graph structure for NSG (#2984 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2984 It is not entirely trivial to access the NSG graph structure from Python (although it is a fixed size N-by-K matrix of vector ids). This diff adds an inspect_tools function to do that. Reviewed By: algoriddle Differential Revision: D48026775 fbshipit-source-id: 94cd7be7f656bcd333d62586531f287ea8e052e5	2023-08-04 06:55:24 -07:00
Gergely Szilvasy	da16d9d3ca	simplify raft build (#2983 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2983 Reviewed By: mdouze Differential Revision: D48063550 Pulled By: algoriddle fbshipit-source-id: c67e13cec97f4de8cc30cae47186593dbe0bdadb	2023-08-04 06:52:07 -07:00
Matthijs Douze	a3fbf2d61c	Better NaN handling (#2986 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2986 A NaN vector is a vector with at least one NaN (not-a-number) entry. After discussion in the Faiss team we decided that: - training should throw an exception on NaN vectors - added NaN vectors should be ignored (never returned) - searched NaN vectors should return only -1s This diff implements this for a few common index types + adds relevant tests. Reviewed By: algoriddle Differential Revision: D48031390 fbshipit-source-id: 99e7786582e91950e3a53c1d8bcffdd00b6afd24	2023-08-04 06:51:06 -07:00
generatedunixname89002005325676	a4ddb18605	Daily `arc lint --take CLANGFORMAT` Reviewed By: 0x1eaf Differential Revision: D47985815 fbshipit-source-id: 47bbe26ec689ac5521fe94ab52d174c60ded2ba5	2023-08-02 07:34:56 -07:00
Maria	35dac924d1	Added version to nightly install (#2982 ) Summary: The gpu nightly package install command did not install v1.7.4, see [P801820926](https://www.internalfb.com/intern/paste/P801820926) Adding the version fixes this issue, see [P801849181](https://www.internalfb.com/intern/paste/P801849181) Funnily enough, faiss-cpu nightly command works fine, see [P801848411](https://www.internalfb.com/intern/paste/P801848411) Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2982 Reviewed By: mdouze Differential Revision: D47952190 Pulled By: mlomeli1 fbshipit-source-id: 2185197e0a513c7da441d791c0b373f06f570f62	2023-08-01 12:14:35 -07:00
Alexandr Guzhva	5a95d47858	Upgrade AVX2 code for SQ8 (#2942 ) Summary: More efficient code for SQ8 for AVX2. For clang-15, improves the number of instructions per cycle (IPC) from 2.49 to 3.20 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2942 Reviewed By: algoriddle Differential Revision: D47946167 Pulled By: mdouze fbshipit-source-id: da864bac8d452f2eb111ca356e54a8a69cd03dbf	2023-08-01 06:08:44 -07:00
youcheng huang	0aae4d3eec	fix hnsw shrink_neighbor_list comment (#2980 ) Summary: This pr is to fix the issue https://github.com/facebookresearch/faiss/issues/2978 . Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2980 Reviewed By: mdouze Differential Revision: D47950592 Pulled By: mlomeli1 fbshipit-source-id: 32ef06c3775f7234a5a4bb4dab36c176edea2d1f	2023-08-01 05:01:30 -07:00
Corey J. Nolet	7bf714928c	Adding `libraft` dependency to speed up compile times with `USE_RAFT` (#2958 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2958 Reviewed By: mlomeli1, mdouze Differential Revision: D47678341 Pulled By: algoriddle fbshipit-source-id: 2ab2d0e8349498faa0fc59ac9800da29a201c766	2023-07-31 07:37:27 -07:00
Gergely Szilvasy	726143d056	install libraft for cmake build (#2968 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2968 Reviewed By: mlomeli1, mdouze Differential Revision: D47677660 Pulled By: algoriddle fbshipit-source-id: 8fad8323ea3c0a264149c76fc9519d9c63346d00	2023-07-31 07:37:27 -07:00
Gergely Szilvasy	821a401ae9	CodeSet for deduping large datasets (#2949 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2949 A more scalable alternative to `np.unique` for deduping large datasets with a quantized code. Reviewed By: mlomeli1 Differential Revision: D47443953 fbshipit-source-id: 4a1554d4d4200b5fa657e9d8b7395bba9856a8e3	2023-07-19 10:05:46 -07:00
Matthijs Douze	43d86e3073	Relax IVF AQ FastScan (#2940 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2940 This test fails on some occasions. After investigation it turns out this is due to non-reproducible behavior of IndexIVFFastScan::search_implem_14 with a parallel loop, where there are ties in the results (i.e. the resulting distances are the same but not the ids). As a workaround I relaxed the test slightly. + a fix in the checksum function. Reviewed By: algoriddle Differential Revision: D47229086 fbshipit-source-id: 55e53bcfe47cf33041cc7fd5691b5de65067ce0f	2023-07-05 21:51:12 -07:00
Maria	a757806ae9	added blas=1.0=mkl to INSTALL (#2939 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2939 Reviewed By: algoriddle Differential Revision: D47229098 Pulled By: mlomeli1 fbshipit-source-id: 91761499d9cd13ecafe12186ddbd80224c2e7410	2023-07-05 10:05:19 -07:00
Sid Jha	d48e777412	Fix import (#2936 ) Summary: Previous import does not exist. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2936 Reviewed By: mlomeli1 Differential Revision: D47221019 Pulled By: mdouze fbshipit-source-id: 9ceeba229a10dd4b66da3483cc7695b198e1a8d8	2023-07-05 06:59:05 -07:00
Matthijs Douze	1c1d5c808f	Make tests a little less verbose Summary: Useful info on github test runs is buried in spurious logging. Avoid this. Reviewed By: mlomeli1 Differential Revision: D47209139 fbshipit-source-id: b5111c91e2b94f0c3678d599197f8e7094993df1	2023-07-04 07:02:53 -07:00
Richard Barnes	4bfdd4324f	Parallelize kernel compilation in FAISS (#2922 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2922 This parallelizes kernel compilation by taking a template function from much deeper in the stack than was previously the case and generating 128 compilation units rather than the original 8. Reviewed By: mdouze Differential Revision: D46674315 fbshipit-source-id: 830eeaf43dee2c081f735be47c809b28aa3a05f6	2023-06-30 01:30:01 -07:00
Matthijs Douze	a91a2887fe	use dispatcher function to call HammingComputer (#2918 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2918 The HammingComputer class is optimized for several vector sizes. So far it's been the caller's responsibility to instantiate the relevant optimized version. This diff introduces a `dispatch_HammingComputer` function that can be called with a template class that is instantiated for all existing optimized HammingComputers. Reviewed By: algoriddle Differential Revision: D46858553 fbshipit-source-id: 32c31689bba7c0b406b309fc8574c95fa24022ba	2023-06-26 14:06:10 -07:00
Matthijs Douze	a27036aa72	add small benchmark for hamming computers Summary: to measure impact of hamming computer diff Reviewed By: algoriddle Differential Revision: D46913890 fbshipit-source-id: 7b9850205885b9b7c5f394f17a79ba222e7b1e2e	2023-06-26 14:06:10 -07:00
Gergely Szilvasy	391601dc3f	relax test_ivf_train_2level threshold (#2927 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2927 Reviewed By: mlomeli1 Differential Revision: D47017009 fbshipit-source-id: cfa1df4b9632b085d3a61b56d8617bebd7e5aad6	2023-06-26 05:02:47 -07:00
Gergely Szilvasy	1d7c05de5f	raft nightly (#2926 ) Summary: Moving the raft build to a nightly, to remove the noise from the PR contbuilds. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2926 Reviewed By: mlomeli1 Differential Revision: D47016318 Pulled By: algoriddle fbshipit-source-id: 3c60aa382b9aa68dcadb929e0e4afade13c9123e	2023-06-26 03:10:05 -07:00
Octavian Guzu	9126f863d4	Prevent snprintf vulnerability Summary: With a very long name for a `ParameterRange`, the `snprintf` call from `combination_name` can end up having a negative second parameter, causing a memory overflow, which can lead to a serious security issue. We can check that the second parameter is always >= 0 and throw an exception if not. See the new GTEST. Reviewed By: mdouze Differential Revision: D46856956 fbshipit-source-id: 91c657ec028c462d4b808b595811342034e00133	2023-06-23 08:52:20 -07:00
Richard Barnes	8ac4e41983	Switch //faiss/gpu to use templates instead of macros (#2914 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2914 The macros are part of a system to reduce compilation time via separate compilation units. Unfortunately, the parallelization is across C++ template functions instead of NVCC invocations on kernel compilation, which would be much more effective. This diff removes the preprocessor macros and expands them into templates. Compilation time after this diff is given by [this buck2 output](https://www.internalfb.com/buck2/ae9e6b28-a1bd-4d46-8af8-2895e6f182c8) with 1,043s through impl/scan/IVFInterleaved2048.cu Reviewed By: mdouze Differential Revision: D46549341 fbshipit-source-id: 5c3457876fd649e03ebeac89e4d1713f091ee9f5	2023-06-21 08:04:58 -07:00
Gergely Szilvasy	e0741ca5d7	fix for lib/jvm/languages/python/bin/conda no such file (#2917 ) Summary: environment: line 9: /opt/conda/lib/jvm/languages/python/bin/conda: No such file or directory Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2917 Reviewed By: mdouze Differential Revision: D46841321 Pulled By: algoriddle fbshipit-source-id: bdfbc16fbf422406c5195293dd4730f71a261e40	2023-06-21 00:29:51 -07:00
Gergely Szilvasy	f69b1db60a	update installation instructions with notes about mkl and the nvidia channel Reviewed By: mdouze Differential Revision: D46844223 fbshipit-source-id: 1a0862c160f2c9656db68b80475712815ee81daa	2023-06-19 11:47:31 -07:00
Matthijs Douze	07fe2b622f	Binary cloning and GPU range search (#2916 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2916 Overall better support for binary indexes: - cloning (to CPU and GPU), only for BinaryFlat for now - fix bug in reconstruct_n - range_search_max_results Reviewed By: algoriddle Differential Revision: D46755778 fbshipit-source-id: 777ad90aff5c54a77f9685ed6512247a922c6ef5	2023-06-19 06:05:14 -07:00
Gergely Szilvasy	e153cac419	fix the osx nightly build (#2896 ) Summary: Based on comments in https://github.com/conda/conda-build/issues/4498 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2896 Reviewed By: mdouze Differential Revision: D46802512 Pulled By: algoriddle fbshipit-source-id: 7449b2f0db08fdd793770a44afb659d7ac28e3cd	2023-06-16 13:01:17 -07:00
Gergely Szilvasy	092606b293	bbs producer/consumer threading (#2901 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2901 This diff allows each GPU to work independently, a hot centroid (eg. out-of-distribution queries that hit a centroid heavily) will only block the one GPU that is processing it, others will continue to pick up work independently. Reviewed By: mdouze Differential Revision: D46521298 fbshipit-source-id: 171cb06cce8b2d16b7bd744799b105b3cd525be3	2023-06-14 07:58:44 -07:00
I	d8a6350607	Update docs (C++11 -> C++17) (#2907 ) Summary: following https://github.com/facebookresearch/faiss/issues/2899 This PR doesn't affect the software behavior Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2907 Reviewed By: mdouze Differential Revision: D46720499 Pulled By: algoriddle fbshipit-source-id: 00b47baf526a94449e2b1c9ca5fcd4cf961f6f17	2023-06-14 05:06:15 -07:00
Gergely Szilvasy	6951466b43	raft enabled cmake build (#2898 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2898 Reviewed By: mdouze Differential Revision: D46561295 Pulled By: algoriddle fbshipit-source-id: b9806c0c52acf82124c3b2e0095b1c1979318dcd	2023-06-13 08:43:18 -07:00
Richard Barnes	27ffd14ae4	Use C++17 [[fallthrough]] in faiss/utils/distances_simd.cpp (#2913 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2913 Reviewed By: algoriddle Differential Revision: D46603510 fbshipit-source-id: 374d530d79176ac553b40d5ad04bf83d4920b107	2023-06-12 15:07:08 -07:00
Richard Barnes	100beb8565	Use C++17 [[fallthrough]] in faiss/utils/hamming_distance/avx2-inl.h Reviewed By: mdouze Differential Revision: D46603512 fbshipit-source-id: fa4bab4d24f5c9e2a3506f2a67d3a7db2a01512f	2023-06-12 08:19:22 -07:00
Richard Barnes	463ffd8e28	Indicate that fallthrough is intentional in faiss (#2897 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2897 Reviewed By: algoriddle Differential Revision: D46385243 fbshipit-source-id: f08b16c9db91edca53cdbf0932a990c8c1f9d0db	2023-06-08 12:22:11 -07:00
Taras Tsugrii	8ec166c9fd	Simplify non-optimal points removal. Summary: This version is more concise and doesn't need a new scope to reduce visibility of local variable `i`. Created from CodeHub with https://fburl.com/edit-in-codehub Reviewed By: mdouze Differential Revision: D46431189 fbshipit-source-id: 5bbe8df6014d8e25aeb8d5d15145b703e9651327	2023-06-08 08:50:28 -07:00
Taras Tsugrii	f82298ffe5	Remove unused unordered_map include. (#2900 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2900 This makes builds brittle and slows down builds. Reviewed By: algoriddle Differential Revision: D46445595 fbshipit-source-id: 03a02e274922dd6215e467ead148890d79b3c2f8	2023-06-07 12:39:24 -07:00
Gergely Szilvasy	451f6cdbe5	c++ 17 (#2899 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2899 Reviewed By: mlomeli1 Differential Revision: D46521588 Pulled By: algoriddle fbshipit-source-id: 6ac4b9d7590329317455d35256cab9dc820dfccf	2023-06-07 09:10:11 -07:00
I	9c884225c1	Some changes to simdlib (#2885 ) Summary: - Use elementwise operation and reduction once instead of across-vector comparing operation twice - Use already implemented supporting functions - Unify semantics of `operator==` with those of `simd16uint16` - `operator==` of `simd8uint32` and `simd8float32` was implemented in https://github.com/facebookresearch/faiss/issues/2568, but these do not have the same semantics as `simd16uint16` (which was implemented a long time ago). To get vector equality as a `bool`, we should now use the `is_same_as` member function. - Change `is_same_as` to accept any vector type as argument for `simdlib_neon` - `is_same_as` already supports any vector type on `simdlib_avx2` and `simdlib_emulated` - Remove unused function `simd16uint16::is_same` on `simdlib_avx2` - Is it a typo of `is_same_as`? Anyway, it seems unlikely to be used Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2885 Reviewed By: mdouze Differential Revision: D46330666 Pulled By: alexanderguzhva fbshipit-source-id: 0ea14f8e9a8bda78f24a655219dffe3e07fc110f	2023-06-01 07:39:02 -07:00
I	bbc95b1a6c	Fix windows CI (#2889 ) Summary: https://github.com/facebookresearch/faiss/issues/2882 added [a for loop, which has unsigned index, qualified with `#pragma omp parallel for`](https://github.com/facebookresearch/faiss/pull/2882/files#diff-5a89dcb99a1cce3f297c7f7dfc8e221306b281d4ced6dac1e0fc0fa54188195fR449-R452), but it seems that [MSVC doesn't support unsigned index with `#pragma omp parallel for`](https://app.circleci.com/pipelines/github/facebookresearch/faiss/4220/workflows/ee72de05-6ead-42d9-8ec5-44772e9fd41b/jobs/22529?invite=true#step-104-333) (I think this does not conform to the OpenMP specification, but...) I (finally) changed the loop to use a signed index. This change introduces the precondition `n <= std::numeric_limits<std::make_signed_t<std::size_t>>::max()`, but I think this usually holds, so I just documented this limitation in a comment instead of adding a `FAISS_ASSERT` or something like that. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2889 Reviewed By: wickedfoo Differential Revision: D46325322 Pulled By: alexanderguzhva fbshipit-source-id: c68f4c8be3db188ac067e053c6c716e2896f75c0	2023-05-31 13:00:30 -07:00
Matthijs Douze	90349f264b	Large two-level clustering (#2882 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2882 A two level clustering version where the training data does not need to fit in RAM. Reviewed By: algoriddle Differential Revision: D44557021 fbshipit-source-id: 892d4fec4588eb33da6e7a82c15040f39426485e	2023-05-31 00:15:03 -07:00
Alexandr Guzhva	6fd0cb60be	fix a typo (#2881 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2881 Reviewed By: algoriddle Differential Revision: D46227909 fbshipit-source-id: 9af689947f003b1f9c1dcdedcb1783b78b4bd21a	2023-05-26 11:48:19 -07:00
Alexandr Guzhva	e8b7575e93	AVX2 version of faiss::HNSW::MinimaxHeap::pop_min() (#2874 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2874 Reviewed By: mdouze Differential Revision: D46125506 fbshipit-source-id: 4099e5c95bfb168b2097a42f5308c4bea1f72ca8	2023-05-26 11:35:21 -07:00
Matthijs Douze	6800ebef83	Support independent IVF coarse quantizer Summary: In the IndexIVFIndependentQuantizer, the coarse quantizer is applied on the input vectors, but the encoding is performed on a vector-transformed version of the database elements. Reviewed By: alexanderguzhva Differential Revision: D45950970 fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88	2023-05-26 02:59:01 -07:00
Alexandr Guzhva	a3296f42ad	Use uint8_t instead of uint32_t for faiss::VisitedTable.visno (#2873 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2873 Reviewed By: mdouze Differential Revision: D46125491 fbshipit-source-id: 9c48bb55e54eb361438521494093a3f9ab823857	2023-05-24 07:56:38 -07:00
Matthijs Douze	fd09e51316	move by_residual to IndexIVF (#2870 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2870 Factor by_residual for all the IndexIVF inheritors. Some training code can be put in IndexIVF and `train_residual` is replaced with `train_encoder`. This will be used for the IndependentQuantizer work. Reviewed By: alexanderguzhva Differential Revision: D45987304 fbshipit-source-id: 7310a687b556b2faa15a76456b1d9000e21b58ce	2023-05-23 09:56:19 -07:00
Gergely Szilvasy	1c1879b17c	tiling bfKnn (#2865 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2865 Introduces a tiling version of `bfKnn` called `bfKnn_tiling`, which can break up both queries and vectors into tiles of size vectorsMemoryLimit and queriesMemoryLimit. Reviewed By: wickedfoo Differential Revision: D45944524 fbshipit-source-id: f9cd4c14dbf2d43def773124f19e92d25c86fc5a	2023-05-23 00:38:43 -07:00
generatedunixname89002005325676	5c221edf57	Daily `arc lint --take CLANGFORMAT` Reviewed By: ivanmurashko Differential Revision: D46063974 fbshipit-source-id: 949e60d45ebb0c0e4e59c1adcc41cd43a65086df	2023-05-22 02:14:50 -07:00
Matthijs Douze	a878c79db3	Support RAFT from python (#2864 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2864 Adds use_raft to the cloner options. Adds tests for the python interface. Also continue cleanup of data structures to set default arguments. Add flags GPU and NVIDIA_RAFT to get_compile_options() Reviewed By: algoriddle Differential Revision: D45943372 fbshipit-source-id: 3428b24d309e9facfb4ebcf0d2d108dccfb4ad01	2023-05-19 20:49:01 -07:00
Matthijs Douze	48d48a37ac	fix windows test (#2862 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2862 Fix windows test introduced by range search diff Reviewed By: algoriddle Differential Revision: D45901726 fbshipit-source-id: 16259b7718f1409adef814ea4c2b5707304849ca	2023-05-17 03:17:48 -07:00
generatedunixname89002005325676	615e3fca7f	Daily `arc lint --take CLANGFORMAT` Reviewed By: jhpowell Differential Revision: D45908420 fbshipit-source-id: 84dfedc4af9dea3887e27e79b53414afd0c1790d	2023-05-16 07:40:22 -07:00
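
The NaN policy described in commit a3fbf2d61c (training throws on NaN vectors, added NaN vectors are never returned, NaN queries return only -1s) can be illustrated with a toy brute-force index. This is a minimal sketch in plain Python, not Faiss code: the class `ToyFlatIndex` and its methods are hypothetical, and the choice to let a skipped NaN vector still consume an id is an assumption made only for this illustration.

```python
import math


class ToyFlatIndex:
    """Toy brute-force L2 index (hypothetical, NOT Faiss) showing the NaN policy."""

    def __init__(self, d):
        self.d = d
        self.vectors = []  # stored (non-NaN) vectors
        self.ids = []      # id of each stored vector
        self.ntotal = 0    # number of vectors passed to add()

    @staticmethod
    def has_nan(v):
        # A NaN vector is a vector with at least one NaN entry.
        return any(math.isnan(x) for x in v)

    def train(self, xs):
        # Policy 1: training throws on NaN vectors.
        if any(self.has_nan(v) for v in xs):
            raise ValueError("training set contains a NaN vector")

    def add(self, xs):
        for v in xs:
            # Policy 2: NaN vectors are ignored, so they can never be returned.
            if not self.has_nan(v):
                self.vectors.append(list(v))
                self.ids.append(self.ntotal)
            # Assumption for this sketch: a skipped vector still uses up an id.
            self.ntotal += 1

    def search(self, q, k):
        # Policy 3: a NaN query returns only -1 labels.
        if self.has_nan(q):
            return [-1] * k
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, v)), i)
            for v, i in zip(self.vectors, self.ids)
        )
        labels = [i for _, i in scored[:k]]
        # Pad with -1 when fewer than k vectors are stored.
        return labels + [-1] * (k - len(labels))


index = ToyFlatIndex(2)
index.add([[0.0, 0.0], [float("nan"), 1.0], [3.0, 4.0]])
print(index.search([0.1, 0.0], 3))           # [0, 2, -1]
print(index.search([float("nan"), 0.0], 3))  # [-1, -1, -1]
```

Note that this behavior was backed out again in commit c09992bc8a, so the sketch describes the intent of the change rather than the state of the code after the back-out.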

1065 Commits
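
The tiling idea behind commit 1c1879b17c (`bfKnn_tiling`) can be sketched without a GPU: process queries and database vectors in blocks, and merge each block's candidates into a running top-k per query. The following is a hedged illustration in plain Python, not the Faiss implementation. The function `knn_tiled` is hypothetical, and its `query_tile` and `vector_tile` parameters count rows, which is an assumption of this sketch (the commit names its parameters `queriesMemoryLimit` and `vectorsMemoryLimit`).

```python
import heapq


def l2_sq(a, b):
    """Squared L2 distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_tiled(queries, vectors, k, query_tile, vector_tile):
    """Return, per query, the k nearest (distance, id) pairs in ascending order.

    Hypothetical sketch: tile sizes are row counts, not memory limits.
    """
    results = [[] for _ in queries]
    for q0 in range(0, len(queries), query_tile):
        q_block = range(q0, min(q0 + query_tile, len(queries)))
        for v0 in range(0, len(vectors), vector_tile):
            v_block = range(v0, min(v0 + vector_tile, len(vectors)))
            for qi in q_block:
                # Distances from this query to the current tile of vectors only.
                candidates = [(l2_sq(queries[qi], vectors[vi]), vi) for vi in v_block]
                # Merge the tile's candidates into the running top-k.
                results[qi] = heapq.nsmallest(k, results[qi] + candidates)
    return results


vectors = [[0, 0], [1, 0], [0, 2], [3, 3], [5, 1]]
queries = [[0.2, 0], [4, 2.5]]
tiled = knn_tiled(queries, vectors, k=2, query_tile=1, vector_tile=2)
print([[vid for _, vid in row] for row in tiled])  # [[0, 1], [3, 4]]
```

Because only one tile of distances exists at a time, peak working memory depends on the tile sizes rather than on the full query-by-vector distance matrix, while the merged result matches an untiled search.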