Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3452
Delete all remaining print within the Tests to improve the readability and effectiveness of the codebase.
Reviewed By: junjieqi
Differential Revision: D57466393
fbshipit-source-id: 6ebd66ae2e769894d810d4ba7a5f69fc865b797d
Summary:
This PR adds a functionality where an IVF index can be searched and the corresponding codes be returned. It also adds a few functions to compress int arrays into a bit-compact representation.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143
Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct
buck test //faiss/tests/:test_standalone_codec -- test_arrays
```
Reviewed By: algoriddle
Differential Revision: D51544613
Pulled By: mdouze
fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
Summary: In the IndexIVFIndepenentQuantizer, the coarse quantizer is applied on the input vectors, but the encoding is performed on a vector-transformed version of the database elements.
Reviewed By: alexanderguzhva
Differential Revision: D45950970
fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2638
This diff is a more streamlined way of searching IVF indexes with precomputed clusters.
This will be used for experiments with hybrid CPU / GPU search.
Reviewed By: algoriddle
Differential Revision: D41301032
fbshipit-source-id: a1d645fd0f2bf806454dfd04971edc0a6200d20d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2611
Moves the invlist splitting code so that it can be used independently from the IndexIVF.
Add a simple test for the splitting code.
Fix a bug in the IndexShards implementation.
Reviewed By: alexanderguzhva
Differential Revision: D41807025
fbshipit-source-id: 3f53afc5f81744343597bdfcfa90daa4f324a673
Summary:
* Modify pq4_get_paked_element to make it not depend on an auxiliary table
* Create pq4_set_packed_element which sets a single element in codes in packed format
(These methods would be used in merge and remove for IndexFastScan
get method is also used in FastScan indices for reconstruction)
* Add remove feature for IndexFastScan
* Add merge feature for indexFast Scan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2497
Test Plan:
cd build && make -j
make test
cd faiss/python && python setup.py build && cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
Reviewed By: mdouze
Differential Revision: D39927403
Pulled By: mdouze
fbshipit-source-id: 45271b98419203dfb1cea4f4e7eaf0662523a5b5
Summary:
makes index_factory support IDMap2 not only IDMap and add required tests
adding IDMap2 to index_factory would help users to take advantage of its extra features more than IDMap such as reconstruct the indices.
solves [issue 1864](https://github.com/facebookresearch/faiss/issues/1864)
+fix downcast_index IDMap / IDMap2 order
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2478
Test Plan:
cd build
make -j
cd faiss/python && python setup.py build
cd ../../..
PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
Reviewed By: mdouze
Differential Revision: D39660813
Pulled By: AbdelrahmanElmeniawy
fbshipit-source-id: 4881d325bb3b0eaf9637a544511d18c2084453eb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132
This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings.
IndexPQ
IndexFlat
IndexAdditiveQuantizer
IndexScalarQuantizer
IndexLSH
Index2Layer
The other changes are:
- for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a few external codes, that this diff also attempts to fix.
- I/O functions needed to be adapted. This is done without changing the I/O format for any index.
- added a small contrib function to get the data from the IndexFlat
- the functionality has been made uniform, for example remove_ids and add are now in the parent class.
Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, eg to memmap the data, but this will again require a big change.
Reviewed By: wickedfoo
Differential Revision: D32646769
fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40
Summary:
This diff adds a CombinedIndexSharded1T class to combined_index that uses the 30 shards from the Spark reducer.
The metadata is stored in pickle files on manifold.
Differential Revision: D24018824
fbshipit-source-id: be4ff8b38c3d6e1bb907e02b655d0e419b7a6fea
Summary:
`long` is 32 bits on windows and so is the default int type for numpy (eg. the one used for `np.arange`).
This diff explicitly specifies 64-bit ints for all occurrences where it matters.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381
Reviewed By: wickedfoo
Differential Revision: D23371232
Pulled By: mdouze
fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04
Bugfixes:
- slow scanning of inverted lists (#836).
Features:
- add basic support for 6 new metrics in CPU `IndexFlat` and `IndexHNSW` (#848);
- add support for `IndexIDMap`/`IndexIDMap2` with binary indexes (#780).
Misc:
- throw python exception for OOM (#758);
- make `DistanceComputer` available for all random access indexes;
- gradually moving from `long` to `int64_t` for portability.
Changelog:
- changed license: BSD+Patents -> MIT
- propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas
- support for searching several inverted lists in parallel (parallel_mode != 0)
- better support for PQ codes where nbit != 8 or 16
- IVFSpectralHash implementation: spectral hash codes inside an IVF
- 6-bit per component scalar quantizer (4 and 8 bit were already supported)
- combinations of inverted lists: HStackInvertedLists and VStackInvertedLists
- configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch)
- more test and demo code compatible with Python 3 (print with parentheses)
- refactored benchmark code: data loading is now in a single file
Facebook sync (Mar 2019)
- MatrixStats object
- option to round coordinates during k-means optimization
- alternative option for search in HNSW
- moved stats and imbalance_factor of IndexIVF to InvertedLists object
- range search for IVFScalarQuantizer
- direct unit8 codec in ScalarQuantizer
- renamed IndexProxy to IndexReplicas and moved to main Faiss
- better support for PQ code assignment with external index
- support for IMI2x16 (4B virtual centroids!)
- support for k = 2048 search on GPU (instead of 1024)
- most CUDA mem alloc failures throw exceptions instead of terminating on an assertion
- support for renaming an ondisk invertedlists
- interrupt computations with ctrl-C in python
Features:
- automatic tracking of C++ references in Python
- non-intel platforms supported -- some functions optimized for ARM
- override nprobe for concurrent searches
- support for floating-point quantizers in binary indexes
Bug fixes:
- no more segfaults in python (I know it's the same as the first feature but it's important!)
- fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims
- fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple
* Refactors Makefiles and add configure script.
* Give MKL higher priority in configure script.
* Clean up Linux example makefile.inc.
* Cleanup makefile.inc examples.
* Fix python clean Makefile target.
* Regen swig wrappers.
* Remove useless CUDAFLAGS variable.
* Fix python linking flags.
* Separate compile and link phase in python makefile.
* Add macro to look for swig.
* Add CUDA check in configure script.
* Cleanup make depend targets.
* Cleanup CUDA flags.
* Fix linking flags.
* Fix python GPU linking.
* Remove useless flags from python gpu module linking.
* Add check for cuda libs.
* Cleanup GPU targets.
* Clean up test target.
* Add cpu/gpu targets to python makefile.
* Clean up tutorial Makefile.
* Remove stale OS var from example makefiles.
* Clean up cuda example flags.