faiss

Commit Graph

Author	SHA1	Message	Date
Michael Norris	eff0898a13	Enable linting: lint config changes plus arc lint command (#3966 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3966 This actually enables the linting. Manual changes: - tools/arcanist/lint/fbsource-licenselint-config.toml - tools/arcanist/lint/fbsource-lint-engine.toml Automated changes: `arc lint --apply-patches --take LICENSELINT --paths-cmd 'hg files faiss'` Reviewed By: asadoughi Differential Revision: D64484165 fbshipit-source-id: 4f2f6e953c94ef6ebfea8a5ae035ccfbea65ed04	2024-10-22 09:46:48 -07:00
Xiao Fu	bf8bd6b689	Delete all remaining print (#3452 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3452 Delete all remaining print within the Tests to improve the readability and effectiveness of the codebase. Reviewed By: junjieqi Differential Revision: D57466393 fbshipit-source-id: 6ebd66ae2e769894d810d4ba7a5f69fc865b797d	2024-05-16 19:51:07 -07:00
Matthijs Douze	b109d086a2	Search and return codes (#3143 ) Summary: This PR adds a functionality where an IVF index can be searched and the corresponding codes be returned. It also adds a few functions to compress int arrays into a bit-compact representation. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143 Test Plan: ``` buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct buck test //faiss/tests/:test_standalone_codec -- test_arrays ``` Reviewed By: algoriddle Differential Revision: D51544613 Pulled By: mdouze fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc	2023-11-25 13:57:25 -08:00
Matthijs Douze	9db182460c	Relax IVFFlatDedup test (#3077 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3077 This diff relaxes some IVFFlatDedup tests where distances are slightly different over runs. Should fix https://app.circleci.com/pipelines/github/facebookresearch/faiss/4709/workflows/8c8213bf-8fe0-4c4e-9a7d-991f44bf1010/jobs/25551 https://app.circleci.com/pipelines/github/facebookresearch/faiss/4709/workflows/8c8213bf-8fe0-4c4e-9a7d-991f44bf1010/jobs/25547 Reviewed By: algoriddle Differential Revision: D49732349 fbshipit-source-id: 728b9885c6b7d6ba697ccb6bacc0abd0ee2b0679	2023-09-29 01:16:59 -07:00
Matthijs Douze	6800ebef83	Support independent IVF coarse quantizer Summary: In the IndexIVFIndependentQuantizer, the coarse quantizer is applied on the input vectors, but the encoding is performed on a vector-transformed version of the database elements. Reviewed By: alexanderguzhva Differential Revision: D45950970 fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88	2023-05-26 02:59:01 -07:00
Matthijs Douze	8fc3775472	building blocks for hybrid CPU / GPU search (#2638 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2638 This diff is a more streamlined way of searching IVF indexes with precomputed clusters. This will be used for experiments with hybrid CPU / GPU search. Reviewed By: algoriddle Differential Revision: D41301032 fbshipit-source-id: a1d645fd0f2bf806454dfd04971edc0a6200d20d	2023-01-12 13:34:44 -08:00
Matthijs Douze	96c868f50b	move invertedlists splitting to InvertedLists.h (#2611 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2611 Moves the invlist splitting code so that it can be used independently from the IndexIVF. Add a simple test for the splitting code. Fix a bug in the IndexShards implementation. Reviewed By: alexanderguzhva Differential Revision: D41807025 fbshipit-source-id: 3f53afc5f81744343597bdfcfa90daa4f324a673	2022-12-08 01:58:22 -08:00
Abdelrahman Elmeniawy	47a9953a35	add remove and merge features for IndexFastScan (#2497 ) Summary: * Modify pq4_get_packed_element to make it not depend on an auxiliary table * Create pq4_set_packed_element which sets a single element in codes in packed format (These methods would be used in merge and remove for IndexFastScan; the get method is also used in FastScan indices for reconstruction) * Add remove feature for IndexFastScan * Add merge feature for IndexFastScan Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2497 Test Plan: cd build && make -j make test cd faiss/python && python setup.py build && cd ../../.. PYTHONPATH="$(ls -d ./build/faiss/python/build/lib/)" pytest tests/test_.py Reviewed By: mdouze Differential Revision: D39927403 Pulled By: mdouze fbshipit-source-id: 45271b98419203dfb1cea4f4e7eaf0662523a5b5	2022-10-11 04:14:29 -07:00
Abdelrahman Elmeniawy	f00de85645	T131797600 make index_factory support IDMap2 (#2478 ) Summary: Makes index_factory support IDMap2, not only IDMap, and adds the required tests. Adding IDMap2 to index_factory helps users take advantage of its extra features over IDMap, such as reconstruction. Solves [issue 1864](https://github.com/facebookresearch/faiss/issues/1864) + fix downcast_index IDMap / IDMap2 order Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2478 Test Plan: cd build make -j cd faiss/python && python setup.py build cd ../../.. PYTHONPATH="$(ls -d ./build/faiss/python/build/lib/)" pytest tests/test_.py Reviewed By: mdouze Differential Revision: D39660813 Pulled By: AbdelrahmanElmeniawy fbshipit-source-id: 4881d325bb3b0eaf9637a544511d18c2084453eb	2022-09-23 06:16:36 -07:00
Matthijs Douze	c0052c1533	IndexFlatCodes: a single parent for all flat codecs (#2132 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132 This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings. IndexPQ IndexFlat IndexAdditiveQuantizer IndexScalarQuantizer IndexLSH Index2Layer The other changes are: - for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a few external codes, that this diff also attempts to fix. - I/O functions needed to be adapted. This is done without changing the I/O format for any index. - added a small contrib function to get the data from the IndexFlat - the functionality has been made uniform, for example remove_ids and add are now in the parent class. Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, eg to memmap the data, but this will again require a big change. Reviewed By: wickedfoo Differential Revision: D32646769 fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40	2021-12-07 01:31:07 -08:00
Matthijs Douze	2d380e992b	Add manifold check for size 0 (#1867 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1867 Merging code for the 1T photodna index seems to fail at https://www.internalfb.com/phabricator/paste/view/P412975011?lines=174 with ``` terminate called after throwing an instance of 'facebook::manifold::blobstore::StorageException' what(): [400] Begin offset and/or length were invalid -- Begin offset must be positive and length must be non-negative. Received: offset = 2642410612, length = 0 Aborted (core dumped) ``` traces back to https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/manifold/blobstore/BlobstoreThriftHandler.cpp?lines=671%2C700%2C732 There is a single case where we don't check if the read or write size is 0. So let's try this fix. In the process I realized that the Manifold tests were non functional due to a name collision on common.py. Also fix this in all dependent files. Differential Revision: D28231710 fbshipit-source-id: 700ffa6ca0c82c49e7d1eae9e76549ec5ff16332	2021-05-09 22:30:31 -07:00
Matthijs Douze	28edc56fa8	Search in sharded invlists Summary: This diff adds a CombinedIndexSharded1T class to combined_index that uses the 30 shards from the Spark reducer. The metadata is stored in pickle files on manifold. Differential Revision: D24018824 fbshipit-source-id: be4ff8b38c3d6e1bb907e02b655d0e419b7a6fea	2020-10-19 10:39:22 -07:00
Matthijs Douze	6d73c2ff69	Fix int64 for python tests in windows (#1381 ) Summary: `long` is 32 bits on windows and so is the default int type for numpy (eg. the one used for `np.arange`). This diff explicitly specifies 64-bit ints for all occurrences where it matters. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381 Reviewed By: wickedfoo Differential Revision: D23371232 Pulled By: mdouze fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04	2020-08-27 12:40:55 -07:00
Lucas Hosseini	7c6a446bf5	Avoid building OnDiskInvertedLists on Windows. (#1374 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1374 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23314729 Pulled By: beauby fbshipit-source-id: 5ad7fa3ed830b17a5be66fb2995dd94e079d8507	2020-08-25 16:58:24 -07:00
Lucas Hosseini	24c4460dd2	Avoid leaking file descriptors in python tests. (#1353 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1353 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23292456 Pulled By: beauby fbshipit-source-id: 44458eb16d037883ff39827accf5edddb1b1bb89	2020-08-24 06:46:52 -07:00
Lucas Hosseini	22b7876ef5	Facebook sync (2020-03-10) (#1136 )	2020-03-10 14:24:07 +01:00
Lucas Hosseini	36ddba9196	Facebook sync (2019-09-10) (#943 ) * Facebook sync (2019-09-10) * Fix depends Makefile target. * Add faiss symlink for new include directives. * Fix missing header. * Fix tests. * Fix Makefile. * Update depend. * Fix include directives spacing.	2019-09-20 18:59:10 +02:00
Lucas Hosseini	3896b12c65	Facebook sync (Jun 2019) (#862 ) Bugfixes: - slow scanning of inverted lists (#836). Features: - add basic support for 6 new metrics in CPU `IndexFlat` and `IndexHNSW` (#848); - add support for `IndexIDMap`/`IndexIDMap2` with binary indexes (#780). Misc: - throw python exception for OOM (#758); - make `DistanceComputer` available for all random access indexes; - gradually moving from `long` to `int64_t` for portability.	2019-06-19 15:59:06 +02:00
Lucas Hosseini	a8118acbc5	Facebook sync (May 2019) + relicense (#838 ) Changelog: - changed license: BSD+Patents -> MIT - propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas - support for searching several inverted lists in parallel (parallel_mode != 0) - better support for PQ codes where nbit != 8 or 16 - IVFSpectralHash implementation: spectral hash codes inside an IVF - 6-bit per component scalar quantizer (4 and 8 bit were already supported) - combinations of inverted lists: HStackInvertedLists and VStackInvertedLists - configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch) - more test and demo code compatible with Python 3 (print with parentheses) - refactored benchmark code: data loading is now in a single file	2019-05-28 16:17:22 +02:00
Lucas Hosseini	afe0fdc161	Facebook sync (Mar 2019) (#756 ) Facebook sync (Mar 2019) - MatrixStats object - option to round coordinates during k-means optimization - alternative option for search in HNSW - moved stats and imbalance_factor of IndexIVF to InvertedLists object - range search for IVFScalarQuantizer - direct uint8 codec in ScalarQuantizer - renamed IndexProxy to IndexReplicas and moved to main Faiss - better support for PQ code assignment with external index - support for IMI2x16 (4B virtual centroids!) - support for k = 2048 search on GPU (instead of 1024) - most CUDA mem alloc failures throw exceptions instead of terminating on an assertion - support for renaming an ondisk invertedlists - interrupt computations with ctrl-C in python	2019-03-29 16:32:28 +01:00
Lucas Hosseini	323dbf3be3	Facebook sync (Dec 2018). (#660 ) * Add GpuIndexBinaryFlat * Add IndexBinaryHNSW	2018-12-19 17:48:35 +01:00
Lucas Hosseini	76bec0b500	Facebook sync (#573 ) Features: - automatic tracking of C++ references in Python - non-intel platforms supported -- some functions optimized for ARM - override nprobe for concurrent searches - support for floating-point quantizers in binary indexes Bug fixes: - no more segfaults in python (I know it's the same as the first feature but it's important!) - fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims - fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple	2018-08-30 19:38:50 +02:00
Lucas Hosseini	6880286ea0	Facebook sync (#504 ) * Facebook sync * Update swig wrappers. * Fix comment.	2018-07-06 14:12:11 +02:00
Lucas Hosseini	6e40d6689f	Move python tests back together with C++ tests. (#479 )	2018-06-04 12:20:44 +02:00
Lucas Hosseini	cf18101f6d	Refactor makefiles and add configure script (#466 ) * Refactors Makefiles and add configure script. * Give MKL higher priority in configure script. * Clean up Linux example makefile.inc. * Cleanup makefile.inc examples. * Fix python clean Makefile target. * Regen swig wrappers. * Remove useless CUDAFLAGS variable. * Fix python linking flags. * Separate compile and link phase in python makefile. * Add macro to look for swig. * Add CUDA check in configure script. * Cleanup make depend targets. * Cleanup CUDA flags. * Fix linking flags. * Fix python GPU linking. * Remove useless flags from python gpu module linking. * Add check for cuda libs. * Cleanup GPU targets. * Clean up test target. * Add cpu/gpu targets to python makefile. * Clean up tutorial Makefile. * Remove stale OS var from example makefiles. * Clean up cuda example flags.	2018-06-02 08:35:30 +02:00
Ailing	cd884114d0	Make tests compatible with py3 (#348 )	2018-02-24 00:38:45 +01:00
Matthijs Douze	0c482e54eb	sync with FB version 2018-02-23 (#347 ) - support on-disk IVF	2018-02-23 07:49:45 -08:00
matthijs	250a3d3f18	sync with FB version 2017-11-22 various bugfixes from github issues kmean with some frozen centroids GPU better tiling for large flat datasets default AVX for vector ops	2017-11-22 05:11:28 -08:00
matthijs	8e3dc6f2b0	changed license	2017-07-30 00:18:45 -07:00
matthijs	12f181ee44	forgotten	2017-07-18 02:55:11 -07:00

30 Commits (4bf99c317113bb09c7f451474521462b10bea2df)