faiss

Commit Graph

Author	SHA1	Message	Date
Check Deng	6c99782f7c	Fix unorder bug in NSG (#2086) Summary: The results returned by `NSG::search` are already sorted. Calling `maxheap_reorder` will make the results unorder. Fixed https://github.com/facebookresearch/faiss/issues/2081. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2086 Test Plan: buck test //faiss/tests/:test_index -- test_order Reviewed By: beauby Differential Revision: D32593924 Pulled By: mdouze fbshipit-source-id: 794b94681610657bd2f305f7e3d6cd5d25c6bdba	2021-11-22 11:41:01 -08:00
Matthijs Douze	3eb82e32dc	Range search bug Summary: This diff fixes a serious bug in the range search implementation. During range search in a flat index, (exhaustive_L2sqr_seq and exhaustive_inner_product_seq) when running in multiple threads, the per-thread results are collected into RangeSearchPartialResult structures. When the computation is finished, they are aggregated into a RangeSearchResult. In the previous version of the code, this loop was nested into a second loop that is used to check for KeyboardInterrupts. Thus, at each iteration, the results were overwritten. The fix removes the outer loop. It is most likely useless anyways because the sequential code is called only for a small number of queries, for a larger number the BLAS version is used. Reviewed By: wickedfoo Differential Revision: D28486415 fbshipit-source-id: 89a52b17f6ca1ef68fc5e758f0e5a44d0df9fe38	2021-05-17 23:10:20 -07:00
Matthijs Douze	2d380e992b	Add manifold check for size 0 (#1867 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1867 Merging code for the 1T photodna index seems to fail at https://www.internalfb.com/phabricator/paste/view/P412975011?lines=174 with ``` terminate called after throwing an instance of 'facebook::manifold::blobstore::StorageException' what(): [400] Begin offset and/or length were invalid -- Begin offset must be positive and length must be non-negative. Received: offset = 2642410612, length = 0 Aborted (core dumped) ``` traces back to https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/manifold/blobstore/BlobstoreThriftHandler.cpp?lines=671%2C700%2C732 There is a single case where we don't check if the read or write size is 0. So let's try this fix. In the process I realized that the Manifold tests were non functional due to a name collision on common.py. Also fix this in all dependent files. Differential Revision: D28231710 fbshipit-source-id: 700ffa6ca0c82c49e7d1eae9e76549ec5ff16332	2021-05-09 22:30:31 -07:00
Chengqi Deng	c62ab3a696	Use BLAS to compute sdc table (#1809 ) Summary: This PR used BLAS to compute sdc table in ProductQuantizer. Here is the time of computing sdc tables: ``` nbits=8, d=128 (this commit) M: 2, sdc: 0.0001361370086669922s M: 4, sdc: 8.273124694824219e-05s M: 8, sdc: 7.867813110351562e-05s M: 16, sdc: 0.0001227855682373047s M: 32, sdc: 0.0001697540283203125s M: 64, sdc: 0.0007395744323730469s ``` ``` nbits=8, d=128 (master) M: 2, sdc: 0.0055773258209228516s M: 4, sdc: 0.005366802215576172s M: 8, sdc: 0.0050809383392333984s M: 16, sdc: 0.005639791488647461s M: 32, sdc: 0.006036281585693359s M: 64, sdc: 0.009720802307128906s ``` Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1809 Reviewed By: beauby Differential Revision: D27706249 Pulled By: mdouze fbshipit-source-id: 102ae0c1c157e244e40557656934062f537b74d4	2021-04-16 00:17:51 -07:00
Check Deng	c37c2fa393	Support I/O and clone for NSG (#1766) Summary: This PR added IO and clone support to NSG. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1766 Test Plan: buck test //faiss/tests/:test_index -- TestNSG Reviewed By: beauby Differential Revision: D27189414 Pulled By: mdouze fbshipit-source-id: c35c253d043c711d09a675f4ba5c3317b9423b5b	2021-03-23 09:18:15 -07:00
Check Deng	b35103a138	Add NSG (#1707) Summary: ## Description: This diff implemented Navigating Spreading-out Graph (NSG) which accepts a KNN graph as input. Here is the interface of building an NSG graph: ``` c++ void IndexNSG::build(idx_t n, const float *x, idx_t *knn_graph, int GK); ``` where `GK` is the nb of neighbors per node and `knn_graph[i * GK + j]` is the j-th neighbor of node i. The `add` method is not implemented yet. The unit tests could be found in `tests/test_nsg.cpp`. mdouze beauby Maybe I need some advice on how to design the interface and support python. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1707 Test Plan: buck test //faiss/tests/:test_index -- TestNSG Reviewed By: beauby Differential Revision: D26748498 Pulled By: mdouze fbshipit-source-id: 3280f705fb1b5f9c8cc5efeba63b904c3b832544	2021-03-10 15:03:00 -08:00
Dikpal Reddy	2b1194a3fa	Ensure that invalid k/nprobe search input parameters to Faiss / Faiss GPU don't crash Summary: Checking for invalid parameters (number of nearest neighbors and number of probes where applicable) in the indices and throwing. Along with unit tests. Reviewed By: wickedfoo Differential Revision: D26582467 fbshipit-source-id: e345635d2f0f44ddcecc3f3314b2b9113359a787	2021-03-03 21:17:28 -08:00
Lucas Hosseini	6d51766607	Fix unused variables in python Reviewed By: mdouze Differential Revision: D26633983 fbshipit-source-id: 32b9f95ed9647716f65b93f2713a8d5bad6abe78	2021-02-24 11:52:18 -08:00
Matthijs Douze	c5975cda72	PQ4 fast scan benchmarks (#1555) Summary: Code + scripts for Faiss benchmarks around the Fast scan codes. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1555 Test Plan: buck test //faiss/tests/:test_refine Reviewed By: wickedfoo Differential Revision: D25546505 Pulled By: mdouze fbshipit-source-id: 902486b7f47e36221a2671d124df8c114f25db58	2020-12-16 01:18:58 -08:00
Matthijs Douze	e1adde0d84	Faster brute force search (#1502 ) Summary: This diff streamlines the code that collects results for brute force distance computations for the L2 / IP and range search / knn search combinations. It introduces a `ResultHandler` template class that abstracts what happens with the computed distances and ids. In addition to the heap result handler and the range search result handler, it introduces a reservoir result handler that improves the search speed for large k (>=100). Benchmark results (https://fb.quip.com/y0g1ACLEqJXx#OCaACA2Gm45) show that on small datasets (10k) search is 10-50% faster (improvements are larger for small k). There is room for improvement in the reservoir implementation, whose implementation is quite naive currently, but the diff is already useful in its current form. Experiments on precomputed db vector norms for L2 distance computations were not very concluding performance-wise, so the implementation is removed from IndexFlatL2. This diff also removes IndexL2BaseShift, which was never used. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1502 Test Plan: ``` buck test //faiss/tests/:test_product_quantizer buck test //faiss/tests/:test_index -- TestIndexFlat ``` Reviewed By: wickedfoo Differential Revision: D24705464 Pulled By: mdouze fbshipit-source-id: 270e10b19f3c89ed7b607ec30549aca0ac5027fe	2020-11-04 22:16:23 -08:00
Jeff Johnson	ef6e53f8ba	Cleanup flag/data propagation for IndexShards and IndexReplicas Summary: This diff fixes https://github.com/facebookresearch/faiss/issues/1412 There were various inconsistencies in how the shard and replica wrappers updated their internal state as the sub-indices were updated. This makes the two container classes work in the same way with similar synchronization functionality. Reviewed By: beauby Differential Revision: D23974186 fbshipit-source-id: c688c0c9124f823e4239aa2ff617b007b4564859	2020-09-29 10:25:46 -07:00
Matthijs Douze	6d73c2ff69	Fix int64 for python tests in windows (#1381) Summary: `long` is 32 bits on windows and so is the default int type for numpy (eg. the one used for `np.arange`). This diff explicitly specifies 64-bit ints for all occurrences where it matters. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381 Reviewed By: wickedfoo Differential Revision: D23371232 Pulled By: mdouze fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04	2020-08-27 12:40:55 -07:00
Lucas Hosseini	24c4460dd2	Avoid leaking file descriptors in python tests. (#1353) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1353 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D23292456 Pulled By: beauby fbshipit-source-id: 44458eb16d037883ff39827accf5edddb1b1bb89	2020-08-24 06:46:52 -07:00
Lucas Hosseini	a17a631dc3	Sync 20200323. (#1157) * Sync 20200323. * Bump version. * Remove warning filter.	2020-03-24 14:06:48 +01:00
Lucas Hosseini	22b7876ef5	Facebook sync (2020-03-10) (#1136)	2020-03-10 14:24:07 +01:00
Lucas Hosseini	36ddba9196	Facebook sync (2019-09-10) (#943) * Facebook sync (2019-09-10) * Fix depends Makefile target. * Add faiss symlink for new include directives. * Fix missing header. * Fix tests. * Fix Makefile. * Update depend. * Fix include directives spacing.	2019-09-20 18:59:10 +02:00
Lucas Hosseini	a8118acbc5	Facebook sync (May 2019) + relicense (#838 ) Changelog: - changed license: BSD+Patents -> MIT - propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas - support for searching several inverted lists in parallel (parallel_mode != 0) - better support for PQ codes where nbit != 8 or 16 - IVFSpectralHash implementation: spectral hash codes inside an IVF - 6-bit per component scalar quantizer (4 and 8 bit were already supported) - combinations of inverted lists: HStackInvertedLists and VStackInvertedLists - configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch) - more test and demo code compatible with Python 3 (print with parentheses) - refactored benchmark code: data loading is now in a single file	2019-05-28 16:17:22 +02:00
Lucas Hosseini	afe0fdc161	Facebook sync (Mar 2019) (#756) Facebook sync (Mar 2019) - MatrixStats object - option to round coordinates during k-means optimization - alternative option for search in HNSW - moved stats and imbalance_factor of IndexIVF to InvertedLists object - range search for IVFScalarQuantizer - direct uint8 codec in ScalarQuantizer - renamed IndexProxy to IndexReplicas and moved to main Faiss - better support for PQ code assignment with external index - support for IMI2x16 (4B virtual centroids!) - support for k = 2048 search on GPU (instead of 1024) - most CUDA mem alloc failures throw exceptions instead of terminating on an assertion - support for renaming an ondisk invertedlists - interrupt computations with ctrl-C in python	2019-03-29 16:32:28 +01:00
Lucas Hosseini	f417a53628	Fix CI tests (#687) * Fix test_transfer_invlists.cpp * Fix relative imports. * Fix test_index_accuracy.py. * Use default OSX version. * Allow osx gcc6 build to fail.	2019-01-08 17:52:36 +01:00
matthijs	daf589d9d2	add bench_all_ivf	2018-12-20 05:43:36 -08:00
Lucas Hosseini	323dbf3be3	Facebook sync (Dec 2018). (#660) * Add GpuIndexBinaryFlat * Add IndexBinaryHNSW	2018-12-19 17:48:35 +01:00
Lucas Hosseini	76bec0b500	Facebook sync (#573) Features: - automatic tracking of C++ references in Python - non-intel platforms supported -- some functions optimized for ARM - override nprobe for concurrent searches - support for floating-point quantizers in binary indexes Bug fixes: - no more segfaults in python (I know it's the same as the first feature but it's important!) - fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims - fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple	2018-08-30 19:38:50 +02:00
Lucas Hosseini	6880286ea0	Facebook sync (#504) * Facebook sync * Update swig wrappers. * Fix comment.	2018-07-06 14:12:11 +02:00
Lucas Hosseini	6e40d6689f	Move python tests back together with C++ tests. (#479)	2018-06-04 12:20:44 +02:00
Lucas Hosseini	cf18101f6d	Refactor makefiles and add configure script (#466 ) * Refactors Makefiles and add configure script. * Give MKL higher priority in configure script. * Clean up Linux example makefile.inc. * Cleanup makefile.inc examples. * Fix python clean Makefile target. * Regen swig wrappers. * Remove useless CUDAFLAGS variable. * Fix python linking flags. * Separate compile and link phase in python makefile. * Add macro to look for swig. * Add CUDA check in configure script. * Cleanup make depend targets. * Cleanup CUDA flags. * Fix linking flags. * Fix python GPU linking. * Remove useless flags from python gpu module linking. * Add check for cuda libs. * Cleanup GPU targets. * Clean up test target. * Add cpu/gpu targets to python makefile. * Clean up tutorial Makefile. * Remove stale OS var from example makefiles. * Clean up cuda example flags.	2018-06-02 08:35:30 +02:00
Matthijs Douze	0c482e54eb	sync with FB version 2018-02-23 (#347) - support on-disk IVF	2018-02-23 07:49:45 -08:00
matthijs	9933892ec9	sync with FB version 2017-01-09 - adding HNSW indexing method - simultaneous search and reconstruction for IndexIVFPQ	2018-01-09 06:42:06 -08:00
matthijs	250a3d3f18	sync with FB version 2017-11-22 various bugfixes from github issues kmean with some frozen centroids GPU better tiling for large flat datasets default AVX for vector ops	2017-11-22 05:11:28 -08:00
matthijs	a5ef16db89	sync with FB version 2017-08-09	2017-08-09 11:13:51 -07:00
matthijs	8e3dc6f2b0	changed license	2017-07-30 00:18:45 -07:00
matthijs	f7aedbdfc0	sync with FB version 2017-07-18 - implemented ScalarQuantizer (without IVF) - implemented update for IndexIVFFlat - implemented L2 normalization preproc	2017-07-18 02:51:27 -07:00
matthijs	784e2facd8	Synchronization with FB version 2017-06-21 * moved most FAISS_ASSERT calls to C++ exceptions, and adjusted memory allocation to avoid mem leaks * added an IndexIVFScalarQuantizer type that offers an intermediate compression between IVFFlat and IVFPQ * support removal of indices in IndexIDMap / IndexFlat combination * various fixes in GPU code	2017-06-21 09:01:06 -07:00

32 Commits (c0052c15336a57f7068a7d098d5ce5b6234a2d70)