faiss

mirror of https://github.com/facebookresearch/faiss.git synced 2025-06-03 21:54:02 +08:00

Author	SHA1	Message	Date
Amir Sadoughi	f04424c9f6	#3526: address numpy 2 upgrade	2025-02-07 17:35:59 -08:00
Michael Norris	32beb162f2	Migration off defaults to conda-forge channel (#4126 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4126 Good resource on overriding channels to make sure we aren't using `defaults`:https://stackoverflow.com/questions/67695893/how-do-i-completely-purge-and-disable-the-default-channel-in-anaconda-and-switch Explanation of changes: - - changed to miniforge from miniconda: this ensures we only pull in from conda-forge when creating the environment - architecture: ARM64 and aarch64 are the same thing. But there is no miniforge package for ARM64, so we need to make it check for aarch64 instead. However, mac breaks this rule, and does have macOS-arm64! So there is a conditional for mac to use arm64. https://github.com/conda-forge/miniforge/releases/ - action.yml mkl 2022.2.1 change: conda-forge and defaults have completely different dependencies. Defaults required intel-openmp, but now on conda-forge, mkl 2023.1 or higher requires llvm-openmp >=14.0.6, but this is incompatible with the pytorch build <2.5 which requires llvm-openmp<14.0. We would need to upgrade Python to 3.12 first, upgrade Pytorch build, then upgrade this mkl. (The meta.yaml changes are the ones that narrow it to 2022.2.1 during `conda build faiss`.) So, this has just been changed to 2022.2.1. - mkl now requires _openmp_mutex of type "llvm" instead of "gnu": prior non-cuVS builds all used gnu, because intel-openmp from anaconda defaults channel does not require llvm-openmp. Now we need to remove the gnu one which is automatically pulled in during miniconda setup, and only keep the llvm version of _openmp_mutex. - liblief: The above changes tried to pull in liblief 0.15. This results in an error like `AttributeError: module 'lief._lief.ELF' has no attribute 'ELF_CLASS'`. When I checked passing PR builds on defaults, they use lief 0.12, so I pinned that one for Python 3.9 3.10 3.11. For Python 3.12, we need lief 0.14 or higher. 
- gcc_linux-64 =11.2 for faiss-gpu on cudatoolkit-11.2: GPU builds kept trying to reference 11.2 when 14.2 was installed. I couldn't figure out why, or how to point it to the 14.2 installed on the host. Current nightly builds still reference 11.2, so I gave up and pinned 11.2 to keep it the same. Moving to 14.2 will take some more investigation. - meta.yaml mkl 2023.0 vs 2023.1 with python versions: 3.9, 3.10, and 3.11 pass with 2023.0, but python 3.12 needs mkl 2023.1 or higher. Otherwise we get: ``` INTEL MKL ERROR: $PREFIX/lib/python3.12/site-packages/faiss/../../.././libmkl_def.so.2: undefined symbol: mkl_sparse_optimize_bsr_trsm_i8. Intel MKL FATAL ERROR: Cannot load libmkl_def.so.2. ``` so the solution was to put a bunch of conditions in faiss/meta.yaml. We should be able to use Jinja macros to reduce duplication but it requires some investigation. It was failing: https://github.com/facebookresearch/faiss/actions/runs/12915187334/job/36016477707?pr=4126 (paste of logs here: P1716887936). This can be a future BE task. Macro example (the `-` signs remove whitespace lines before and after) ``` {% macro inclmkldevel() %} {%- if PY_VER == '3.9' or PY_VER == '3.10' or PY_VER == '3.11' -%} - mkl-devel =2023.0 # [x86_64] - liblief =0.12.3 # [not win] - python_abi <3.12 {%- elif PY_VER == '3.12' %} - mkl-devel >=2023.2.0 # [x86_64] - liblief =0.15.1 # [not win] - python_abi =3.12 {% endif -%} {% endmacro %} ``` The python_abi was required to be pinned inside these conditions because otherwise several builds got this error: ``` File "/Users/runner/miniconda3/lib/python3.12/site-packages/conda_build/utils.py", line 1919, in insert_variant_versions matches = [regex.match(pkg) for pkg in reqs] ^^^^^^^^^^^^^^^^ TypeError: expected string or bytes-like object, got 'list' ``` Unit test notes: - - test_gpu_basics.py: GPU residual quantizer: Debugged extensively with Matthijs. The problem is in the C++ -> Python conversion. 
The C++ side prints the right values, but when getting it back to Python, it is filled with junk data. It is only reproducible on CUDA 11.4.4 after switching channels. It is likely a compiler problem. We discussed, and resolved to create a C++ side unit test (so this diff creates TestGpuResidualQuantizer) to verify the functionality and disable the Python unit test, but leave it in the codebase with a comment. Matthijs made extensive notes in https://docs.google.com/document/d/1MjMdOpPgx-MArdrYJZCaQlRqlrhSj5Y1Z9lTyiab8jc/edit?usp=sharing . - test_contrib.py: this now hangs forever and times out the runner for Windows on Python 3.12. I have it skipping now. - test_mem_leak.cpp seems flaky. It sometimes fails, then passes with rerun. Unfixed issues: - - I noticed sometimes downloads will fail with the text like below. It passes on re-run. ``` libgomp-14.2.0-h77fa898_1.conda extraction failed Warning: error libmamba Error when extracting package: Could not chdir info/recipe/parent/patches/0005-Hardcode-HAVE_ALIGNED_ALLOC-1-in-libstdc-v3-configur.patch error libmamba Error when extracting package: Could not chdir info/recipe/parent/patches/0005-Hardcode-HAVE_ALIGNED_ALLOC-1-in-libstdc-v3-configur.patch Warning: Found incorrect download: libgomp. Aborting Found incorrect download: libgomp. Aborting Warning: ``` Green build and tests for both build pull request and nightlies: https://github.com/facebookresearch/faiss/actions/runs/12956402963/job/36148818361 Reviewed By: asadoughi Differential Revision: D68043874 fbshipit-source-id: b105a1e3e6272763ad9daab7fc6f05a79f01c9e2	2025-01-27 14:49:18 -08:00
Junjie Qi	3f3d18d2e7	add test to cover GPU (#4130 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4130 same as title Reviewed By: asadoughi Differential Revision: D68388863 fbshipit-source-id: 4ebb38d8454bf95733c918950d7d8d3b22e00d5d	2025-01-21 11:45:07 -08:00
Di-Is	905963f344	Add `ngpu` default argument to `knn_ground_truth` (#4123 ) Summary: This pull request introduces a new default argument, `ngpu=-1`, to the `knn_ground_truth` function in the `faiss.contrib`. ## Purpose of Change ### Bug Fix In the current implementation, running tests under the tests directory (CPU tests) in an environment with faiss-gpu installed would inadvertently use the GPU and cause unintended behavior. This pull request prevents the GPU from being used during CPU-only tests by explicitly controlling GPU allocation via the ngpu parameter. ### API Consistency Other functions that call `faiss.get_num_gpus` in `faiss.contrib`, such as `range_search_max_results` and `range_ground_truth`, already include the `ngpu` argument. Adding this parameter to `knn_ground_truth` will ensure consistency across the API, reduce potential confusion, and improve ease of use. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4123 Reviewed By: asadoughi Differential Revision: D68199506 Pulled By: junjieqi fbshipit-source-id: cb50e206d8a1a982c21b0ccb42825ea45873f3ef	2025-01-21 11:09:02 -08:00
Mulugeta Mammo	3beb07b198	Add a new architecture mode: 'avx512_spr'. (#4025 ) Summary: This PR adds a new architecture mode to support the new extensions to AVX512, namely [AVX512-FP16](https://networkbuilders.intel.com/solutionslibrary/intel-avx-512-fp16-instruction-set-for-intel-xeon-processor-based-products-technology-guide), which have been available since Intel® Sapphire Rapids. This PR is a prerequisite for [PR#4020](https://github.com/facebookresearch/faiss/pull/4020) that speeds up hamming distance evaluations. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4025 Reviewed By: pankajsingh88 Differential Revision: D67524575 Pulled By: mengdilin fbshipit-source-id: f3a09943b062d720b241f95aef2f390923ffd779	2024-12-23 08:56:26 -08:00
Junjie Qi	3c25a68e52	fix linter (#4035 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4035 same as title Reviewed By: bshethmeta Differential Revision: D66154375 fbshipit-source-id: c5d05aac2d8e502058f0403b0ec0ea243afa18e2	2024-11-19 12:09:40 -08:00
Michael Norris	eff0898a13	Enable linting: lint config changes plus arc lint command (#3966 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3966 This actually enables the linting. Manual changes: - tools/arcanist/lint/fbsource-licenselint-config.toml - tools/arcanist/lint/fbsource-lint-engine.toml Automated changes: `arc lint --apply-patches --take LICENSELINT --paths-cmd 'hg files faiss'` Reviewed By: asadoughi Differential Revision: D64484165 fbshipit-source-id: 4f2f6e953c94ef6ebfea8a5ae035ccfbea65ed04	2024-10-22 09:46:48 -07:00
Matthijs Douze	6baebe2cee	begin torch_contrib (#3872 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3872 The contrib.torch subdirectory is intended to receive modules in python that are useful for similarity search and that apply to CPU or GPU pytorch tensors. The current version includes CPU clustering on torch tensors. To be added: * implementation of PQ Reviewed By: asadoughi Differential Revision: D62759207 fbshipit-source-id: 87dbaa5083e3f2f4f60526815e22ded4e83e8559	2024-09-20 09:15:27 -07:00
Matthijs Douze	0d7817e88f	rewrite python kmeans without scipy (#3873 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3873 The previous version required scipy to do the accumulation, which is replaced here with a nifty piece of numpy accumulation. This removes the need for scipy for non-sparse data. Reviewed By: junjieqi Differential Revision: D62884307 fbshipit-source-id: 5443634e487387a2b518fd2a7f9a3d9a40abd4b4	2024-09-20 09:15:27 -07:00
Xiao Fu	bf8bd6b689	Delete all remaining print (#3452 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3452 Delete all remaining print within the Tests to improve the readability and effectiveness of the codebase. Reviewed By: junjieqi Differential Revision: D57466393 fbshipit-source-id: 6ebd66ae2e769894d810d4ba7a5f69fc865b797d	2024-05-16 19:51:07 -07:00
Kumar Saurabh Arora	da9f292a4b	Support of skip_ids in merge_from_multiple function of OnDiskInvertedLists (#3327 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3327 Context 1. [Issue 2621](https://github.com/facebookresearch/faiss/issues/2621) discusses an inconsistency between OnDiskInvertedList and InvertedList. OnDiskInvertedList is supposed to handle multiple disk-based index shards. Thus, we should name the function differently when merging invls from index shards. 2. [Issue 2876](https://github.com/facebookresearch/faiss/issues/2876) provides a use case for shifting ids when merging invls from different shards. In this diff, 1. To address #1 above, I renamed the merge_from function to merge_from_multiple without touching the merge_from of the base class. Why? To keep allowing a merge of invls from one index into the ondiskinvl of another index. 2. To address #2 above, I have added support for shift_ids in merge_from_multiple to shift ids from different shards. This can be used when each shard has the same set of ids but different data. It is not recommended if ids are already unique across shards. Reviewed By: mdouze Differential Revision: D55482518 fbshipit-source-id: 95470c7449160488d2b45b024d134cbc037a2083	2024-04-03 10:36:56 -07:00
Maria Lomeli	420d25f51c	Index pretransform support in search_preassigned (#3225 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3225 This diff fixes issue [#3113](https://github.com/facebookresearch/faiss/issues/3113), e.g. introduces support for index pretransform in `search_preassigned`. Reviewed By: mdouze Differential Revision: D53188584 fbshipit-source-id: 8189c0a59f957a2606391f22cf3fdc8874110a6e	2024-01-30 09:20:07 -08:00
Gergely Szilvasy	2768fb38b2	faiss-gpu-raft package (#2992 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2992 Reviewed By: mdouze Differential Revision: D48391366 Pulled By: algoriddle fbshipit-source-id: 94b7f62afc8a09a9feaea47bf60e5358d89fcde5	2023-08-16 09:30:41 -07:00
Matthijs Douze	687457b2f4	Access graph structure for NSG (#2984 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2984 It is not entirely trivial to access the NSG graph structure from Python (although it is a fixed size N-by-K matrix of vector ids). This diff adds an inspect_tools function to do that. Reviewed By: algoriddle Differential Revision: D48026775 fbshipit-source-id: 94cd7be7f656bcd333d62586531f287ea8e052e5	2023-08-04 06:55:24 -07:00
Gergely Szilvasy	821a401ae9	CodeSet for deduping large datasets (#2949 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2949 A more scalable alternative to `np.unique` for deduping large datasets with a quantized code. Reviewed By: mlomeli1 Differential Revision: D47443953 fbshipit-source-id: 4a1554d4d4200b5fa657e9d8b7395bba9856a8e3	2023-07-19 10:05:46 -07:00
Gergely Szilvasy	391601dc3f	relax test_ivf_train_2level threshold (#2927 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2927 Reviewed By: mlomeli1 Differential Revision: D47017009 fbshipit-source-id: cfa1df4b9632b085d3a61b56d8617bebd7e5aad6	2023-06-26 05:02:47 -07:00
Matthijs Douze	07fe2b622f	Binary cloning and GPU range search (#2916 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2916 Overall better support for binary indexes: - cloning (to CPU and GPU), only for BinaryFlat for now - fix bug in reconstruct_n - range_search_max_results Reviewed By: algoriddle Differential Revision: D46755778 fbshipit-source-id: 777ad90aff5c54a77f9685ed6512247a922c6ef5	2023-06-19 06:05:14 -07:00
Gergely Szilvasy	092606b293	bbs producer/consumer threading (#2901 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2901 This diff allows each GPU to work independently, a hot centroid (eg. out-of-distribution queries that hit a centroid heavily) will only block the one GPU that is processing it, others will continue to pick up work independently. Reviewed By: mdouze Differential Revision: D46521298 fbshipit-source-id: 171cb06cce8b2d16b7bd744799b105b3cd525be3	2023-06-14 07:58:44 -07:00
Matthijs Douze	6800ebef83	Support independent IVF coarse quantizer Summary: In the IndexIVFIndependentQuantizer, the coarse quantizer is applied on the input vectors, but the encoding is performed on a vector-transformed version of the database elements. Reviewed By: alexanderguzhva Differential Revision: D45950970 fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88	2023-05-26 02:59:01 -07:00
Matthijs Douze	b9ea339617	support range search from GPU (#2860 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2860 Optimized range search function where the GPU computes by default and falls back on the CPU for queries where there are too many results. Parallelize the CPU to GPU cloning, it seems to work. Support range_search_preassigned in Python. Fix long-standing issue with SWIG exposed functions that did not release the GIL (in particular the MapLong2Long). Adds a MapInt64ToInt64 that is more efficient than MapLong2Long. Reviewed By: algoriddle Differential Revision: D45672301 fbshipit-source-id: 2e77397c40083818584dbafa5427149359a2abfd	2023-05-16 00:27:53 -07:00
Matthijs Douze	2d8886cd4f	IVF sorting routine (#2846 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2846 Adds a function to ivf_contrib to sort the inverted lists by size without changing the results. Also moves big_batch_search to its own module. Reviewed By: algoriddle Differential Revision: D45565880 fbshipit-source-id: 091a1c1c074f860d6953bf20d04523292fb55e1a	2023-05-04 09:59:06 -07:00
Matthijs Douze	016aa04602	make balanced clusters the default (#2796 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2796 This diff makes balanced clusters the default for 2-level clustering. This seems to improve a bit over the default uniform clusters, see https://github.com/fairinternal/faiss_improvements/blob/master/better_coarse_quantizer/two_level_clustering.ipynb Warning: the nc2 argument of two_level_clustering becomes the total number of clusters. Reviewed By: algoriddle Differential Revision: D44421222 fbshipit-source-id: 951b7fc043be4a41762a7e6f7a6fcfb71e303832	2023-03-28 07:23:30 -07:00
Matthijs Douze	0200d131fc	fix windows test (#2775 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2775 Reviewed By: algoriddle Differential Revision: D44210010 fbshipit-source-id: b9b620a4b0a874e09ee2f6082ff0f9463716fdf4	2023-03-21 05:34:50 -07:00
Matthijs Douze	2d7dd5b0a6	support checkpointing in big batch search Summary: Big batch search can be running for hours so it's useful to have a checkpointing mechanism in case it's run on a best-effort cluster queue. Reviewed By: algoriddle Differential Revision: D44059758 fbshipit-source-id: 5cb5e80800c6d2bf76d9f6cb40736009cd5d4b8e	2023-03-14 11:11:50 -07:00
Matthijs Douze	fa53e2c941	Implementation of big-batch IVF search (single machine) (#2567 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2567 Intuitively, it should be easier to handle big-batch searches because all distance computations for a set of queries can be done locally within each inverted list. This benchmark implements this in pure python (but should be close to optimal in terms of speed), on CPU for IndexIVFFlat, IndexIVFPQ and IndexIVFScalarQuantizer. GPU is also supported. The results are not systematically better, see https://docs.google.com/document/d/1d3YuV8uN7hut6aOATCOMx8Ut-QEl_oRnJdPgDBRF1QA/edit?usp=sharing Reviewed By: algoriddle Differential Revision: D41098338 fbshipit-source-id: 479e471b0d541f242d420f581775d57b708a61b8	2022-12-09 08:53:13 -08:00
Matthijs Douze	f68ddd0564	fix test in test_contrib (#2294 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2294 there is a weird CI failure on one of the platforms occurring in the PR https://github.com/facebookresearch/faiss/pull/2291 This diff makes the test a bit more robust, correcting inter_perf to compute the intersection measure. Hopefully this will make the bug go away. Reviewed By: beauby Differential Revision: D35558855 fbshipit-source-id: f5a926d9d8ebee975e538c65ac37b15d485798aa	2022-04-20 03:03:38 -07:00
Matthijs Douze	b8fe92dfee	contrib clustering module (#2217 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217 This diff introduces a new Faiss contrib module that contains: - generic k-means implemented in python (was in distributed_ondisk) - the two-level clustering code, including a simple function that runs it on a Faiss IVF index. - sparse clustering code (new) The main idea is that that code is often re-used so better have it in contrib. Reviewed By: beauby Differential Revision: D34170932 fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77	2022-02-28 14:18:47 -08:00
Matthijs Douze	eb8781557f	Fix exhaustive search GT computation with IP distance (#2212 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2212 Fixes issue https://github.com/facebookresearch/faiss/issues/2205 clear bug report easy fix easy to accept ;-) Reviewed By: beauby Differential Revision: D33975281 fbshipit-source-id: 088e1f3078dc79402563be7fac3530d76b197006	2022-02-07 19:36:21 -08:00
Matthijs Douze	c0052c1533	IndexFlatCodes: a single parent for all flat codecs (#2132 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132 This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings. IndexPQ IndexFlat IndexAdditiveQuantizer IndexScalarQuantizer IndexLSH Index2Layer The other changes are: - for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a few external codes, that this diff also attempts to fix. - I/O functions needed to be adapted. This is done without changing the I/O format for any index. - added a small contrib function to get the data from the IndexFlat - the functionality has been made uniform, for example remove_ids and add are now in the parent class. Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, eg to memmap the data, but this will again require a big change. Reviewed By: wickedfoo Differential Revision: D32646769 fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40	2021-12-07 01:31:07 -08:00
Matthijs Douze	1829aa92a1	three small fixes (#1972 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1972 This fixes a few issues that I ran into + adds tests: - range_search_max_results with IP search - a few missing downcasts for VectorTransforms - ResultHeap supports max IP search Reviewed By: wickedfoo Differential Revision: D29525093 fbshipit-source-id: d4ff0aff1d83af9717ff1aaa2fe3cda7b53019a3	2021-07-01 16:08:45 -07:00
Matthijs Douze	2d380e992b	Add manifold check for size 0 (#1867 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1867 Merging code for the 1T photodna index seems to fail at https://www.internalfb.com/phabricator/paste/view/P412975011?lines=174 with ``` terminate called after throwing an instance of 'facebook::manifold::blobstore::StorageException' what(): [400] Begin offset and/or length were invalid -- Begin offset must be positive and length must be non-negative. Received: offset = 2642410612, length = 0 Aborted (core dumped) ``` traces back to https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/manifold/blobstore/BlobstoreThriftHandler.cpp?lines=671%2C700%2C732 There is a single case where we don't check if the read or write size is 0. So let's try this fix. In the process I realized that the Manifold tests were non functional due to a name collision on common.py. Also fix this in all dependent files. Differential Revision: D28231710 fbshipit-source-id: 700ffa6ca0c82c49e7d1eae9e76549ec5ff16332	2021-05-09 22:30:31 -07:00
Matthijs Douze	3f2ebf4b1c	Add preassigned functions to contrib Summary: Adds the preassigned add and search python wrappers to contrib. Adds the preassigned search for the binary case (was missing before). Also adds a real test for that functionality. Reviewed By: beauby Differential Revision: D26560021 fbshipit-source-id: 330b715a9ed0073cfdadbfbcb1c23b10bed963a5	2021-02-25 11:39:07 -08:00
Matthijs Douze	5602724979	make calling conventions uniform between faiss.knn and faiss.knn_gpu Summary: The order of xb and xq was different between `faiss.knn` and `faiss.knn_gpu`. Also the metric argument was called distance_type. This diff fixes both. Hopefully not too much external code depends on it. Reviewed By: wickedfoo Differential Revision: D26222853 fbshipit-source-id: b43e143d64d9ecbbdf541734895c13847cf2696c	2021-02-03 12:21:40 -08:00
Matthijs Douze	3dd7ba8ff9	Add range search accuracy evaluation Summary: Added a few functions in contrib to: - run range searches by batches on the query or the database side - emulate range search on GPU: search on GPU with k=1024, if the farthest neighbor is still within range, re-perform search on CPU - as reference implementations for precision-recall on range search datasets - optimized code to plot precision-recall plots (ie. sweep over thresholds) The new functions are mainly in a new `evaluation.py` Reviewed By: wickedfoo Differential Revision: D25627619 fbshipit-source-id: 58f90654c32c925557d7bbf8083efbb710712e03	2020-12-17 17:17:09 -08:00
Matthijs Douze	6d0bc58db6	Implementation of PQ4 search with SIMD instructions (#1542 ) Summary: IndexPQ and IndexIVFPQ implementations with AVX shuffle instructions. The training and computing of the codes does not change wrt. the original PQ versions but the code layout is "packed" so that it can be used efficiently by the SIMD computation kernels. The main changes are: - new IndexPQFastScan and IndexIVFPQFastScan objects - simdlib.h for an abstraction above the AVX2 intrinsics - BlockInvertedLists for invlists that are 32-byte aligned and where codes are not sequential - pq4_fast_scan.h/.cpp: for packing codes and look-up tables + optimized distance computation kernels - simd_result_handlers.h: SIMD version of result collection in heaps / reservoirs Misc changes: - added contrib.inspect_tools to access fields in C++ objects - moved .h and .cpp code for inverted lists to an invlists/ subdirectory, and made a .h/.cpp for InvertedListsIOHook - added a new inverted lists type with 32-byte aligned codes (for consumption by SIMD) - moved Windows-specific intrinsics to platform_macros.h Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1542 Test Plan: ``` buck test mode/opt -j 4 //faiss/tests/:test_fast_scan_ivf //faiss/tests/:test_fast_scan buck test mode/opt //faiss/manifold/... ``` Reviewed By: wickedfoo Differential Revision: D25175439 Pulled By: mdouze fbshipit-source-id: ad1a40c0df8c10f4b364bdec7172e43d71b56c34	2020-12-03 10:06:38 -08:00
Matthijs Douze	92306e3a69	Synthetic dataset with inner product option Summary: The synthetic dataset can now have IP groundtruth Reviewed By: wickedfoo Differential Revision: D24219860 fbshipit-source-id: 42e094479311135e932821ac0a97ed0fb237bf78	2020-10-20 03:46:26 -07:00
Lucas Hosseini	70eaa9b1a3	Add missing copyright headers. (#1460 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1460 Reviewed By: wickedfoo Differential Revision: D24278804 Pulled By: beauby fbshipit-source-id: 5ea96ceb63be76a34f1eb4da03972159342cd5b6	2020-10-13 11:15:59 -07:00
Matthijs Douze	8b05434a50	Remove useless function Summary: Removed an unused function that caused compile errors in some configurations. Added contrib function (exhaustive_search.knn) to compute the k nearest neighbors without constructing an index. Renamed the equivalent GPU function as exhaustive_search.knn_gpu (it does not make much sense to mention numpy in the name as all functions take numpy arguments by default). Reviewed By: beauby Differential Revision: D24215427 fbshipit-source-id: 6d8e1eafa7c57593304b7b76f83b3015e4d2a2bb	2020-10-09 07:57:04 -07:00
Matthijs Douze	65ee09484f	Test GPU ground-truth computation (#1432 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1432 The contrib function knn_ground_truth does not provide exactly the same results on GPU and CPU (but relative accuracy is still 1e-7). This diff relaxes the constraint on CPU and adds a test on GPU. Reviewed By: wickedfoo Differential Revision: D24012199 fbshipit-source-id: aaa20dbdf42b876b3ed7da34028646dbb20833d3	2020-09-30 11:14:18 -07:00
Matthijs Douze	f849680777	Dataset access in contrib Summary: This diff adds an object for a few useful datasets in faiss.contrib. This includes synthetic datasets and the classic ones. It is intended to work on: - the FAIR cluster - gluster - manifold Reviewed By: wickedfoo Differential Revision: D23378763 fbshipit-source-id: 2437a7be9e712fd5ad1bccbe523cc1c936f7ab35	2020-08-27 19:19:33 -07:00
Lucas Hosseini	e5d2defaae	Disable contrib tests for python2. (#1364 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1364 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23314732 Pulled By: beauby fbshipit-source-id: 788465c353bbc65947a6c766e8509f35f35e4134	2020-08-25 16:58:24 -07:00
Lucas Hosseini	a8e4c5e2d5	Move build to CMake (#1313 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1313 Reviewed By: mdouze Differential Revision: D22948267 Pulled By: beauby fbshipit-source-id: ec16fa0342f37672d46fb7886ecc55c7996011c4	2020-08-14 15:03:10 -07:00
Lucas Hosseini	cd38e82f0c	Facebook sync 2020-07-31 (#1308 )	2020-08-03 22:15:02 +02:00
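
Note on commits 3dd7ba8ff9 and b9ea339617: both describe emulating a range search with a fixed-k search (k=1024 on GPU) and redoing the query exactly on CPU when the farthest returned neighbor is still within range. The sketch below illustrates that control flow only. It is pure Python, does not use the Faiss API, and every function name in it is hypothetical; the strict `<` comparison on squared L2 distances is also an assumption.

```python
import heapq


def squared_l2(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_fixed_k(query, database, k):
    # Hypothetical stand-in for the fast fixed-k search (the GPU step).
    # Returns up to k (distance, id) pairs sorted by increasing distance.
    scored = ((squared_l2(query, v), i) for i, v in enumerate(database))
    return heapq.nsmallest(k, scored)


def exact_range_search(query, database, radius):
    # Hypothetical stand-in for the exact fallback (the CPU step).
    hits = []
    for i, v in enumerate(database):
        d = squared_l2(query, v)
        if d < radius:
            hits.append((d, i))
    return sorted(hits)


def range_search_with_fallback(query, database, radius, k=1024):
    # Returns (hits, used_fallback).
    top = knn_fixed_k(query, database, k)
    if len(top) == k and top[-1][0] < radius:
        # The k-th neighbor is still in range, so the fixed-k result may be
        # truncated: redo this query with the exact search.
        return exact_range_search(query, database, radius), True
    # Otherwise every in-range neighbor is already among the top k.
    return [(d, i) for d, i in top if d < radius], False


database = [[0.0], [1.0], [2.0], [3.0], [4.0]]
hits, used_fallback = range_search_with_fallback([0.0], database, radius=2.5, k=3)
```

With `k=3` the third neighbor lies outside the radius, so the fast path is enough; lowering `k` to 2 makes the last returned neighbor fall inside the radius and triggers the exact fallback.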

43 Commits
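
Note on commit da9f292a4b: the message says shift_ids is meant for shards that each use the same set of ids for different data. A minimal sketch of that idea follows, assuming each shard's ids are offset by the total number of vectors in the shards merged before it. That offset rule is an assumption for illustration, not confirmed by the commit message, and `merge_shards` is a hypothetical helper, not the `merge_from_multiple` API.

```python
def merge_shards(shards, shift_ids=False):
    # Each shard is a dict mapping an inverted-list number to a list of ids.
    merged = {}
    offset = 0
    for shard in shards:
        for list_no, ids in shard.items():
            # With shift_ids, move this shard's ids past all earlier shards.
            merged.setdefault(list_no, []).extend(
                i + offset if shift_ids else i for i in ids
            )
        if shift_ids:
            # Assumed rule: advance by the number of vectors in this shard.
            offset += sum(len(ids) for ids in shard.values())
    return merged


# Two shards that both number their vectors 0..2.
shard_a = {0: [0, 2], 1: [1]}
shard_b = {0: [1], 1: [0, 2]}

plain = merge_shards([shard_a, shard_b])
shifted = merge_shards([shard_a, shard_b], shift_ids=True)
```

Without shifting, the merged lists contain each id twice; with shifting, the second shard's ids become 3..5 and stay distinct. This also shows why the commit advises against shifting when ids are already unique across shards: the offset would change ids that did not need changing.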