faiss

mirror of https://github.com/facebookresearch/faiss.git synced 2025-06-03 21:54:02 +08:00

Author	SHA1	Message	Date
Matthijs Douze	c0052c1533	IndexFlatCodes: a single parent for all flat codecs (#2132) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132 This diff adds the class IndexFlatCodes, which becomes the parent of all "flat" encodings: IndexPQ, IndexFlat, IndexAdditiveQuantizer, IndexScalarQuantizer, IndexLSH, Index2Layer. The other changes are: - for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a bit of external code, which this diff also attempts to fix. - I/O functions needed to be adapted. This is done without changing the I/O format for any index. - added a small contrib function to get the data from the IndexFlat - the functionality has been made uniform, for example remove_ids and add are now in the parent class. Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, e.g. to memmap the data, but this will again require a big change. Reviewed By: wickedfoo Differential Revision: D32646769 fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40	2021-12-07 01:31:07 -08:00
Matthijs Douze	2d380e992b	Add manifold check for size 0 (#1867) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1867 Merging code for the 1T photodna index seems to fail at https://www.internalfb.com/phabricator/paste/view/P412975011?lines=174 with ``` terminate called after throwing an instance of 'facebook::manifold::blobstore::StorageException' what(): [400] Begin offset and/or length were invalid -- Begin offset must be positive and length must be non-negative. Received: offset = 2642410612, length = 0 Aborted (core dumped) ``` which traces back to https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/manifold/blobstore/BlobstoreThriftHandler.cpp?lines=671%2C700%2C732 There is a single case where we don't check if the read or write size is 0. So let's try this fix. In the process I realized that the Manifold tests were non-functional due to a name collision on common.py. Also fix this in all dependent files. Differential Revision: D28231710 fbshipit-source-id: 700ffa6ca0c82c49e7d1eae9e76549ec5ff16332	2021-05-09 22:30:31 -07:00
Matthijs Douze	28edc56fa8	Search in sharded invlists Summary: This diff adds a CombinedIndexSharded1T class to combined_index that uses the 30 shards from the Spark reducer. The metadata is stored in pickle files on manifold. Differential Revision: D24018824 fbshipit-source-id: be4ff8b38c3d6e1bb907e02b655d0e419b7a6fea	2020-10-19 10:39:22 -07:00
Matthijs Douze	6d73c2ff69	Fix int64 for python tests in windows (#1381) Summary: `long` is 32 bits on Windows, and so is the default int type for numpy (e.g. the one used for `np.arange`). This diff explicitly specifies 64-bit ints for all occurrences where it matters. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381 Reviewed By: wickedfoo Differential Revision: D23371232 Pulled By: mdouze fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04	2020-08-27 12:40:55 -07:00
Lucas Hosseini	7c6a446bf5	Avoid building OnDiskInvertedLists on Windows. (#1374) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1374 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23314729 Pulled By: beauby fbshipit-source-id: 5ad7fa3ed830b17a5be66fb2995dd94e079d8507	2020-08-25 16:58:24 -07:00
Lucas Hosseini	24c4460dd2	Avoid leaking file descriptors in python tests. (#1353) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1353 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23292456 Pulled By: beauby fbshipit-source-id: 44458eb16d037883ff39827accf5edddb1b1bb89	2020-08-24 06:46:52 -07:00
Lucas Hosseini	22b7876ef5	Facebook sync (2020-03-10) (#1136)	2020-03-10 14:24:07 +01:00
Lucas Hosseini	36ddba9196	Facebook sync (2019-09-10) (#943) * Facebook sync (2019-09-10) * Fix depends Makefile target. * Add faiss symlink for new include directives. * Fix missing header. * Fix tests. * Fix Makefile. * Update depend. * Fix include directives spacing.	2019-09-20 18:59:10 +02:00
Lucas Hosseini	3896b12c65	Facebook sync (Jun 2019) (#862) Bugfixes: - slow scanning of inverted lists (#836). Features: - add basic support for 6 new metrics in CPU `IndexFlat` and `IndexHNSW` (#848); - add support for `IndexIDMap`/`IndexIDMap2` with binary indexes (#780). Misc: - throw python exception for OOM (#758); - make `DistanceComputer` available for all random access indexes; - gradually moving from `long` to `int64_t` for portability.	2019-06-19 15:59:06 +02:00
Lucas Hosseini	a8118acbc5	Facebook sync (May 2019) + relicense (#838) Changelog: - changed license: BSD+Patents -> MIT - propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas - support for searching several inverted lists in parallel (parallel_mode != 0) - better support for PQ codes where nbit != 8 or 16 - IVFSpectralHash implementation: spectral hash codes inside an IVF - 6-bit per component scalar quantizer (4 and 8 bit were already supported) - combinations of inverted lists: HStackInvertedLists and VStackInvertedLists - configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch) - more test and demo code compatible with Python 3 (print with parentheses) - refactored benchmark code: data loading is now in a single file	2019-05-28 16:17:22 +02:00
Lucas Hosseini	afe0fdc161	Facebook sync (Mar 2019) (#756) - MatrixStats object - option to round coordinates during k-means optimization - alternative option for search in HNSW - moved stats and imbalance_factor of IndexIVF to InvertedLists object - range search for IVFScalarQuantizer - direct uint8 codec in ScalarQuantizer - renamed IndexProxy to IndexReplicas and moved to main Faiss - better support for PQ code assignment with external index - support for IMI2x16 (4B virtual centroids!) - support for k = 2048 search on GPU (instead of 1024) - most CUDA mem alloc failures throw exceptions instead of terminating on an assertion - support for renaming an ondisk invertedlists - interrupt computations with ctrl-C in python	2019-03-29 16:32:28 +01:00
Lucas Hosseini	323dbf3be3	Facebook sync (Dec 2018). (#660) * Add GpuIndexBinaryFlat * Add IndexBinaryHNSW	2018-12-19 17:48:35 +01:00
Lucas Hosseini	76bec0b500	Facebook sync (#573) Features: - automatic tracking of C++ references in Python - non-Intel platforms supported -- some functions optimized for ARM - override nprobe for concurrent searches - support for floating-point quantizers in binary indexes Bug fixes: - no more segfaults in python (I know it's the same as the first feature but it's important!) - fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims - fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple	2018-08-30 19:38:50 +02:00
Lucas Hosseini	6880286ea0	Facebook sync (#504) * Facebook sync * Update swig wrappers. * Fix comment.	2018-07-06 14:12:11 +02:00
Lucas Hosseini	6e40d6689f	Move python tests back together with C++ tests. (#479)	2018-06-04 12:20:44 +02:00
Lucas Hosseini	cf18101f6d	Refactor makefiles and add configure script (#466) * Refactors Makefiles and add configure script. * Give MKL higher priority in configure script. * Clean up Linux example makefile.inc. * Cleanup makefile.inc examples. * Fix python clean Makefile target. * Regen swig wrappers. * Remove useless CUDAFLAGS variable. * Fix python linking flags. * Separate compile and link phase in python makefile. * Add macro to look for swig. * Add CUDA check in configure script. * Cleanup make depend targets. * Cleanup CUDA flags. * Fix linking flags. * Fix python GPU linking. * Remove useless flags from python gpu module linking. * Add check for cuda libs. * Cleanup GPU targets. * Clean up test target. * Add cpu/gpu targets to python makefile. * Clean up tutorial Makefile. * Remove stale OS var from example makefiles. * Clean up cuda example flags.	2018-06-02 08:35:30 +02:00
Ailing	cd884114d0	Make tests compatible with py3 (#348)	2018-02-24 00:38:45 +01:00
Matthijs Douze	0c482e54eb	sync with FB version 2018-02-23 (#347) - support on-disk IVF	2018-02-23 07:49:45 -08:00
matthijs	250a3d3f18	sync with FB version 2017-11-22 - various bugfixes from github issues - kmeans with some frozen centroids - GPU: better tiling for large flat datasets - default AVX for vector ops	2017-11-22 05:11:28 -08:00
matthijs	8e3dc6f2b0	changed license	2017-07-30 00:18:45 -07:00
matthijs	12f181ee44	forgotten	2017-07-18 02:55:11 -07:00

21 Commits