faiss

Commit Graph

Author	SHA1	Message	Date
Matthijs Douze	189aecb224	Fix polysemous OOM Summary: Polysemous training can OOM because it uses tables of size n^2, where n is 2**nbit of the PQ. This throws an exception when the table threatens to become too large. It also reduces the number of threads when this would make it possible to fit the computation within max_memory bytes. Reviewed By: wickedfoo Differential Revision: D26856747 fbshipit-source-id: bd98e60293494e2f4b2b6d48eb1200efb1ce683c	2021-03-06 00:40:05 -08:00
H. Vetinari	42c6175535	fix warning about deprecated assertEquals (#1738) Summary: There's an annoying warning on every test run that I'd like to fix ``` =============================== warnings summary =============================== tests/test_index_accuracy.py::TestRefine::test_IP tests/test_index_accuracy.py::TestRefine::test_L2 $SRC_DIR/tests/test_index_accuracy.py:726: DeprecationWarning: Please use assertEqual instead. self.assertEquals(recall1, recall2) ``` I've tried sneaking this into https://github.com/facebookresearch/faiss/issues/1704 & https://github.com/facebookresearch/faiss/issues/1717 already, but the first needs more time and in the second, beauby asked me to keep this separate, so here's a new PR. :) Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1738 Reviewed By: wickedfoo Differential Revision: D26855644 Pulled By: mdouze fbshipit-source-id: 1198a9d9b3a79dfeb1d69513a61229fb45924f89	2021-03-05 13:46:35 -08:00
Check Deng	d6535a3d87	Add NNDescent to faiss (#1654) Summary: As discussed in https://github.com/facebookresearch/faiss/issues/685, I'm going to add an NSG index to faiss. This PR, which adds an NNDescent index, is the first step, as I commented [here](https://github.com/facebookresearch/faiss/issues/685#issuecomment-760608431). Changes: 1. Add an `IndexNNDescent` and an `IndexNNDescentFlat` which allow users to construct a KNN graph on a million-scale dataset using CPU and search NN on it. The implementation part is put under `faiss/impl`. 2. Add compilation entries to `CMakeLists.txt` for C++ and `swigfaiss.swig` for Python. `IndexNNDescentFlat` can be called directly by users in C++ and Python. 3. `VisitedTable` struct in `HNSW.h` is moved into `AuxIndexStructures.h`. 4. Add a demo `demo_nndescent.cpp` to demonstrate the effectiveness. TODO 1. Support index factory. 2. Implement `IndexNNDescentPQ` and `IndexNNDescentSQ` 3. More comments in the code. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1654 Test Plan: buck test //faiss/tests/:test_index_accuracy -- TestNNDescent buck test //faiss/tests/:test_build_blocks -- TestNNDescentKNNG Reviewed By: wickedfoo Differential Revision: D26309716 Pulled By: mdouze fbshipit-source-id: 2abade9708d29023f8bccbf77143e8eea14f66c4	2021-02-25 16:48:28 -08:00
shengjun.li	908812266c	Add heap_replace_top to simplify heap_pop + heap_push (#1597) Summary: Signed-off-by: shengjun.li <shengjun.li@zilliz.com> Add heap_replace_top to simplify heap_pop + heap_push Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1597 Test Plan: OMP_NUM_THREADS=1 buck run mode/opt //faiss/benchs/:bench_heap_replace OMP_NUM_THREADS=8 buck run mode/opt //faiss/benchs/:bench_heap_replace Reviewed By: beauby Differential Revision: D25943140 Pulled By: mdouze fbshipit-source-id: 66fe67779dd281a7753f597542c2e797ba0d7df5	2021-01-20 11:28:08 -08:00
Matthijs Douze	c5975cda72	PQ4 fast scan benchmarks (#1555) Summary: Code + scripts for Faiss benchmarks around the Fast scan codes. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1555 Test Plan: buck test //faiss/tests/:test_refine Reviewed By: wickedfoo Differential Revision: D25546505 Pulled By: mdouze fbshipit-source-id: 902486b7f47e36221a2671d124df8c114f25db58	2020-12-16 01:18:58 -08:00
Matthijs Douze	6d0bc58db6	Implementation of PQ4 search with SIMD instructions (#1542) Summary: IndexPQ and IndexIVFPQ implementations with AVX shuffle instructions. The training and computing of the codes does not change wrt. the original PQ versions but the code layout is "packed" so that it can be used efficiently by the SIMD computation kernels. The main changes are: - new IndexPQFastScan and IndexIVFPQFastScan objects - simdlib.h for an abstraction above the AVX2 intrinsics - BlockInvertedLists for invlists that are 32-byte aligned and where codes are not sequential - pq4_fast_scan.h/.cpp: for packing codes and look-up tables + optimized distance computation kernels - simd_result_handlers.h: SIMD version of result collection in heaps / reservoirs Misc changes: - added contrib.inspect_tools to access fields in C++ objects - moved .h and .cpp code for inverted lists to an invlists/ subdirectory, and made a .h/.cpp for InvertedListsIOHook - added a new inverted lists type with 32-byte aligned codes (for consumption by SIMD) - moved Windows-specific intrinsics to platform_macros.h Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1542 Test Plan: ``` buck test mode/opt -j 4 //faiss/tests/:test_fast_scan_ivf //faiss/tests/:test_fast_scan buck test mode/opt //faiss/manifold/... ``` Reviewed By: wickedfoo Differential Revision: D25175439 Pulled By: mdouze fbshipit-source-id: ad1a40c0df8c10f4b364bdec7172e43d71b56c34	2020-12-03 10:06:38 -08:00
Lucas Hosseini	cd38e82f0c	Facebook sync 2020-07-31 (#1308)	2020-08-03 22:15:02 +02:00
Lucas Hosseini	22b7876ef5	Facebook sync (2020-03-10) (#1136)	2020-03-10 14:24:07 +01:00
Lucas Hosseini	36ddba9196	Facebook sync (2019-09-10) (#943 ) * Facebook sync (2019-09-10) * Fix depends Makefile target. * Add faiss symlink for new include directives. * Fix missing header. * Fix tests. * Fix Makefile. * Update depend. * Fix include directives spacing.	2019-09-20 18:59:10 +02:00
Lucas Hosseini	3896b12c65	Facebook sync (Jun 2019) (#862 ) Bugfixes: - slow scanning of inverted lists (#836). Features: - add basic support for 6 new metrics in CPU `IndexFlat` and `IndexHNSW` (#848); - add support for `IndexIDMap`/`IndexIDMap2` with binary indexes (#780). Misc: - throw python exception for OOM (#758); - make `DistanceComputer` available for all random access indexes; - gradually moving from `long` to `int64_t` for portability.	2019-06-19 15:59:06 +02:00
Lucas Hosseini	a8118acbc5	Facebook sync (May 2019) + relicense (#838 ) Changelog: - changed license: BSD+Patents -> MIT - propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas - support for searching several inverted lists in parallel (parallel_mode != 0) - better support for PQ codes where nbit != 8 or 16 - IVFSpectralHash implementation: spectral hash codes inside an IVF - 6-bit per component scalar quantizer (4 and 8 bit were already supported) - combinations of inverted lists: HStackInvertedLists and VStackInvertedLists - configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch) - more test and demo code compatible with Python 3 (print with parentheses) - refactored benchmark code: data loading is now in a single file	2019-05-28 16:17:22 +02:00
Lucas Hosseini	afe0fdc161	Facebook sync (Mar 2019) (#756) Facebook sync (Mar 2019) - MatrixStats object - option to round coordinates during k-means optimization - alternative option for search in HNSW - moved stats and imbalance_factor of IndexIVF to InvertedLists object - range search for IVFScalarQuantizer - direct uint8 codec in ScalarQuantizer - renamed IndexProxy to IndexReplicas and moved to main Faiss - better support for PQ code assignment with external index - support for IMI2x16 (4B virtual centroids!) - support for k = 2048 search on GPU (instead of 1024) - most CUDA mem alloc failures throw exceptions instead of terminating on an assertion - support for renaming an ondisk invertedlists - interrupt computations with ctrl-C in python	2019-03-29 16:32:28 +01:00
Lucas Hosseini	f417a53628	Fix CI tests (#687) * Fix test_transfer_invlists.cpp * Fix relative imports. * Fix test_index_accuracy.py. * Use default OSX version. * Allow osx gcc6 build to fail.	2019-01-08 17:52:36 +01:00
matthijs	daf589d9d2	add bench_all_ivf	2018-12-20 05:43:36 -08:00
Lucas Hosseini	323dbf3be3	Facebook sync (Dec 2018). (#660) * Add GpuIndexBinaryFlat * Add IndexBinaryHNSW	2018-12-19 17:48:35 +01:00
Tom Forbes	a91a24e77a	Fix #558 - Make `M` an integer (#589) * Fix #558 - Make `d` an integer * Use int()	2018-09-18 22:08:31 +02:00
Lucas Hosseini	76bec0b500	Facebook sync (#573 ) Features: - automatic tracking of C++ references in Python - non-intel platforms supported -- some functions optimized for ARM - override nprobe for concurrent searches - support for floating-point quantizers in binary indexes Bug fixes: - no more segfaults in python (I know it's the same as the first feature but it's important!) - fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims - fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple	2018-08-30 19:38:50 +02:00

17 Commits (5e54fb57d85ebefad1ef97913e982a1767eb8e15)
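
Note on commit 908812266c: the message describes `heap_replace_top` as a simplification of `heap_pop` followed by `heap_push`. The general idea can be sketched in Python with the standard-library `heapq` module, whose `heapreplace` swaps out the root in a single step. This is only an illustration of the technique, not the Faiss C++ code; the function names `top_k_pop_push` and `top_k_replace_top` are hypothetical.

```python
import heapq


def top_k_pop_push(values, k):
    """Keep the k smallest values with a max-heap, using pop + push."""
    heap = []  # max-heap emulated by storing negated values
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            # Two separate heap operations: remove the root, then insert.
            heapq.heappop(heap)
            heapq.heappush(heap, -v)
    return sorted(-x for x in heap)


def top_k_replace_top(values, k):
    """Same result, but the root is replaced in one heap operation."""
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            # Single operation: overwrite the root and sift it down once.
            heapq.heapreplace(heap, -v)
    return sorted(-x for x in heap)


data = [5, 1, 9, 3, 7, 2, 8]
print(top_k_pop_push(data, 3))
print(top_k_replace_top(data, 3))
```

Both functions return the same k smallest values; the second does one sift per update instead of two, which is the kind of saving the commit's benchmark (`bench_heap_replace`) would be measuring.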