Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2294
there is a weird CI failure on one of the platforms occurring in the PR
https://github.com/facebookresearch/faiss/pull/2291
This diff makes the test a bit more robust, correcting inter_perf to computer the intersection measure. Hopefully this will make the bug go away.
Reviewed By: beauby
Differential Revision: D35558855
fbshipit-source-id: f5a926d9d8ebee975e538c65ac37b15d485798aa
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217
This diff introduces a new Faiss contrib module that contains:
- generic k-means implemented in python (was in distributed_ondisk)
- the two-level clustering code, including a simple function that runs it on a Faiss IVF index.
- sparse clustering code (new)
The main idea is that that code is often re-used so better have it in contrib.
Reviewed By: beauby
Differential Revision: D34170932
fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132
This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings.
IndexPQ
IndexFlat
IndexAdditiveQuantizer
IndexScalarQuantizer
IndexLSH
Index2Layer
The other changes are:
- for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a few external codes, that this diff also attempts to fix.
- I/O functions needed to be adapted. This is done without changing the I/O format for any index.
- added a small contrib function to get the data from the IndexFlat
- the functionality has been made uniform, for example remove_ids and add are now in the parent class.
Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, eg to memmap the data, but this will again require a big change.
Reviewed By: wickedfoo
Differential Revision: D32646769
fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1972
This fixes a few issues that I ran into + adds tests:
- range_search_max_results with IP search
- a few missing downcasts for VectorTRansforms
- ResultHeap supports max IP search
Reviewed By: wickedfoo
Differential Revision: D29525093
fbshipit-source-id: d4ff0aff1d83af9717ff1aaa2fe3cda7b53019a3
Summary:
Adds the preassigned add and search python wrappers to contrib.
Adds the preassigned search for the binary case (was missing before).
Also adds a real test for that functionality.
Reviewed By: beauby
Differential Revision: D26560021
fbshipit-source-id: 330b715a9ed0073cfdadbfbcb1c23b10bed963a5
Summary: The order of xb an xq was different between `faiss.knn` and `faiss.knn_gpu`. Also the metric argument was called distance_type. This diff fixes both. Hopefully not too much external code depends on it.
Reviewed By: wickedfoo
Differential Revision: D26222853
fbshipit-source-id: b43e143d64d9ecbbdf541734895c13847cf2696c
Summary:
Added a few functions in contrib to:
- run range searches by batches on the query or the database side
- emulate range search on GPU: search on GPU with k=1024, if the farthest neighbor is still within range, re-perform search on CPU
- as reference implementations for precision-recall on range search datasets
- optimized code to plot precision-recall plots (ie. sweep over thresholds)
The new functions are mainly in a new `evaluation.py`
Reviewed By: wickedfoo
Differential Revision: D25627619
fbshipit-source-id: 58f90654c32c925557d7bbf8083efbb710712e03
Summary:
IndexPQ and IndexIVFPQ implementations with AVX shuffle instructions.
The training and computing of the codes does not change wrt. the original PQ versions but the code layout is "packed" so that it can be used efficiently by the SIMD computation kernels.
The main changes are:
- new IndexPQFastScan and IndexIVFPQFastScan objects
- simdib.h for an abstraction above the AVX2 intrinsics
- BlockInvertedLists for invlists that are 32-byte aligned and where codes are not sequential
- pq4_fast_scan.h/.cpp: for packing codes and look-up tables + optmized distance comptuation kernels
- simd_result_hander.h: SIMD version of result collection in heaps / reservoirs
Misc changes:
- added contrib.inspect_tools to access fields in C++ objects
- moved .h and .cpp code for inverted lists to an invlists/ subdirectory, and made a .h/.cpp for InvertedListsIOHook
- added a new inverted lists type with 32-byte aligned codes (for consumption by SIMD)
- moved Windows-specific intrinsics to platfrom_macros.h
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1542
Test Plan:
```
buck test mode/opt -j 4 //faiss/tests/:test_fast_scan_ivf //faiss/tests/:test_fast_scan
buck test mode/opt //faiss/manifold/...
```
Reviewed By: wickedfoo
Differential Revision: D25175439
Pulled By: mdouze
fbshipit-source-id: ad1a40c0df8c10f4b364bdec7172e43d71b56c34
Summary: The synthetic dataset can now have IP groundtruth
Reviewed By: wickedfoo
Differential Revision: D24219860
fbshipit-source-id: 42e094479311135e932821ac0a97ed0fb237bf78
Summary:
Removed an unused function that caused compile errors in some configurations.
Added contrib function (exhaustive_search.knn) to compute the k nearest neighbors without constructing an index.
Renamed the equivalent GPU function as exhaustive_search.knn_gpu (it does not make much sense to mention numpy in the name as all functions take numpy arguments by default).
Reviewed By: beauby
Differential Revision: D24215427
fbshipit-source-id: 6d8e1eafa7c57593304b7b76f83b3015e4d2a2bb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1432
The contrib function knn_ground_truth does not provide exactly the same resutls on GPU and CPU (but relative accuracy is still 1e-7). This diff relaxes the constraint on CPU and added test on GPU.
Reviewed By: wickedfoo
Differential Revision: D24012199
fbshipit-source-id: aaa20dbdf42b876b3ed7da34028646dbb20833d3
Summary:
This diff adds an object for a few useful dataset in faiss.contrib.
This includes synthetic datasets and the classic ones.
It is intended to work on:
- the FAIR cluster
- gluster
- manifold
Reviewed By: wickedfoo
Differential Revision: D23378763
fbshipit-source-id: 2437a7be9e712fd5ad1bccbe523cc1c936f7ab35