Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3012
The cross-tables for codebook construction contained the dot products between codebook entries, which is not necessary (and caused OOMs in some cases). This diff computes only the off-diagonal blocks.
Reviewed By: pemazare
Differential Revision: D48448615
fbshipit-source-id: 494b54e2900754a3ff5d3c8073cb9a768e578c58
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3030
Added default arguments to the .h file (for some reason I forgot this file when migrating default args).
Also logs a hash value in MatrixStats, which is useful for checking whether two runs really operate on the same matrix.
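The idea of fingerprinting a matrix can be sketched as follows. This is only an illustration: the helper name `matrix_hash` and the use of SHA-256 are assumptions, not the hash that MatrixStats actually computes.

```python
import hashlib
import struct

def matrix_hash(rows):
    """Short hex digest of a row-major matrix, with values stored as float32.

    Hypothetical helper; SHA-256 is a stand-in for whatever MatrixStats uses.
    """
    n = len(rows)
    d = len(rows[0]) if rows else 0
    h = hashlib.sha256()
    h.update(struct.pack("<2q", n, d))  # include the shape in the digest
    for row in rows:
        h.update(struct.pack(f"<{d}f", *row))  # little-endian float32 values
    return h.hexdigest()[:16]

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[1.0, 2.0], [3.0, 4.0]]
c = [[1.0, 2.0], [3.0, 4.5]]
print(matrix_hash(a) == matrix_hash(b))  # same content, same fingerprint
print(matrix_hash(a) == matrix_hash(c))  # one entry differs
```

Comparing the logged fingerprints of two runs is then enough to tell whether they saw the same data.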
Reviewed By: pemazare
Differential Revision: D48834343
fbshipit-source-id: 7c1948464e66ada1f462f4486f7cf3159bbf9dfd
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3007
There is a complicated interaction between SWIG and the python wrappers where the ownership of ParameterSpace arguments was stolen from Python.
This diff adds a test, fixes that behavior, and fixes the referenced_objects construction.
Reviewed By: mlomeli1
Differential Revision: D48404252
fbshipit-source-id: 8afa9e6c15d11451c27864223e33ed1187817224
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2984
It is not entirely trivial to access the NSG graph structure from Python, although it is a fixed-size N-by-K matrix of vector ids.
This diff adds an inspect_tools function to do that.
Reviewed By: algoriddle
Differential Revision: D48026775
fbshipit-source-id: 94cd7be7f656bcd333d62586531f287ea8e052e5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2986
A NaN vector is a vector with at least one NaN (not-a-number) entry.
After discussion in the Faiss team we decided that:
- training should throw an exception on NaN vectors
- added NaN vectors should be ignored (never returned)
- searched NaN vectors should return only -1s
This diff implements this policy for a few common index types and adds the relevant tests.
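The three rules above can be illustrated with a toy brute-force index. The class `TinyFlatIndex` is hypothetical and written in pure Python; it only mirrors the policy and is not how the Faiss index types implement it.

```python
import math

class TinyFlatIndex:
    """Toy brute-force L2 index that follows the NaN policy (hypothetical)."""

    def __init__(self):
        self.vectors = []  # position in this list is the vector id

    @staticmethod
    def has_nan(v):
        return any(math.isnan(x) for x in v)

    def train(self, xs):
        # Rule 1: training throws on NaN vectors.
        if any(self.has_nan(v) for v in xs):
            raise ValueError("training set contains NaN vectors")

    def add(self, xs):
        # NaN vectors still take up an id, but are skipped at search time.
        self.vectors.extend(list(v) for v in xs)

    def search(self, q, k):
        # Rule 3: a NaN query returns only -1s.
        if self.has_nan(q):
            return [-1] * k
        # Rule 2: stored NaN vectors are never returned.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, v)), i)
            for i, v in enumerate(self.vectors)
            if not self.has_nan(v)
        )
        ids = [i for _, i in scored[:k]]
        return ids + [-1] * (k - len(ids))  # pad missing results with -1

nan = float("nan")
index = TinyFlatIndex()
index.add([[0.0, 0.0], [nan, 1.0], [1.0, 1.0]])
print(index.search([0.9, 0.9], 3))
print(index.search([nan, 0.0], 2))
```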
Reviewed By: algoriddle
Differential Revision: D48031390
fbshipit-source-id: 99e7786582e91950e3a53c1d8bcffdd00b6afd24
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2949
A more scalable alternative to `np.unique` for deduping large datasets with a quantized code.
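One way to read this: instead of sorting all vectors as `np.unique` does, hash each vector's quantized code and keep the first occurrence. The sketch below assumes that reading; `quantize` and `dedup_by_code` are hypothetical names and the grid quantizer is a stand-in, not the utility added by this diff.

```python
def quantize(v, step=0.5):
    """Hypothetical scalar quantizer: snap each component to a grid, one byte each."""
    return bytes(int(round(x / step)) & 0xFF for x in v)

def dedup_by_code(vectors, step=0.5):
    """Return the ids of the first vector seen for each distinct code."""
    first_seen = {}
    for i, v in enumerate(vectors):
        # setdefault keeps the earliest id for a given code
        first_seen.setdefault(quantize(v, step), i)
    return sorted(first_seen.values())

vectors = [[0.0, 1.0], [0.1, 1.1], [2.0, 2.0], [0.0, 0.9]]
kept = dedup_by_code(vectors)
print(kept)
```

This makes a single pass over the data and stores one compact code per distinct vector, rather than sorting the full-precision rows.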
Reviewed By: mlomeli1
Differential Revision: D47443953
fbshipit-source-id: 4a1554d4d4200b5fa657e9d8b7395bba9856a8e3
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2940
This test fails intermittently.
After investigation, it turns out this is due to non-reproducible behavior in IndexIVFFastScan::search_implem_14 with a parallel loop when there are ties in the results (i.e. the resulting distances are the same but the ids are not).
As a workaround, I relaxed the test slightly.
This diff also fixes the checksum function.
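A tie-tolerant comparison could look like the sketch below. It is an assumption about what "relaxed" may mean, not the actual change made to the test; the function name is hypothetical.

```python
from collections import defaultdict

def same_results_up_to_ties(dis_a, ids_a, dis_b, ids_b):
    """Compare two result lists for one query, ignoring id order among ties.

    Distances must match exactly. Ids are compared as sets per distance value.
    The last distance group is skipped because a tie may be cut off at k,
    in which case both runs can legitimately return different ids.
    """
    if dis_a != dis_b:
        return False

    def group(dis, ids):
        groups = defaultdict(set)
        for d, i in zip(dis, ids):
            groups[d].add(i)
        return groups

    ga, gb = group(dis_a, ids_a), group(dis_b, ids_b)
    last = dis_a[-1] if dis_a else None
    return all(ga[d] == gb[d] for d in ga if d != last)

dis = [1.0, 2.0, 2.0, 3.0]
print(same_results_up_to_ties(dis, [5, 7, 9, 4], dis, [5, 9, 7, 4]))
```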
Reviewed By: algoriddle
Differential Revision: D47229086
fbshipit-source-id: 55e53bcfe47cf33041cc7fd5691b5de65067ce0f
Summary: Useful info on GitHub test runs is buried in spurious logging. Avoid this.
Reviewed By: mlomeli1
Differential Revision: D47209139
fbshipit-source-id: b5111c91e2b94f0c3678d599197f8e7094993df1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2916
Overall better support for binary indexes:
- cloning (to CPU and GPU), only for BinaryFlat for now
- fix bug in reconstruct_n
- range_search_max_results
Reviewed By: algoriddle
Differential Revision: D46755778
fbshipit-source-id: 777ad90aff5c54a77f9685ed6512247a922c6ef5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2901
This diff allows each GPU to work independently: a hot centroid (e.g. out-of-distribution queries that hit a centroid heavily) will only block the one GPU that is processing it, while the others continue to pick up work.
Reviewed By: mdouze
Differential Revision: D46521298
fbshipit-source-id: 171cb06cce8b2d16b7bd744799b105b3cd525be3
Summary: In the IndexIVFIndependentQuantizer, the coarse quantizer is applied to the input vectors, but the encoding is performed on a vector-transformed version of the database elements.
Reviewed By: alexanderguzhva
Differential Revision: D45950970
fbshipit-source-id: 30f6cf46d44174b1d99a12384b7d5e2d475c1f88
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2864
Adds use_raft to the cloner options.
Adds tests for the Python interface.
Also continues the cleanup of data structures to set default arguments.
Adds the flags GPU and NVIDIA_RAFT to get_compile_options().
Reviewed By: algoriddle
Differential Revision: D45943372
fbshipit-source-id: 3428b24d309e9facfb4ebcf0d2d108dccfb4ad01
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2860
Optimized range search function where the GPU computes by default and falls back on the CPU for queries where there are too many results.
Parallelizes the CPU-to-GPU cloning; it seems to work.
Supports range_search_preassigned in Python.
Fix long-standing issue with SWIG exposed functions that did not release the GIL (in particular the MapLong2Long).
Adds a MapInt64ToInt64 that is more efficient than MapLong2Long.
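The range search strategy in the first item can be sketched as a fast fixed-size k-NN pass plus an exhaustive pass for the queries that overflow k. The code below is a pure-Python illustration with hypothetical names; both paths run on the CPU here, and it is not the function added by this diff.

```python
def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def range_search_with_fallback(xb, q, radius, k):
    """Return (sorted (distance, id) pairs with distance < radius, used_fallback)."""
    scored = sorted((sqdist(q, v), i) for i, v in enumerate(xb))
    # Fast path (stand-in for the default device): top-k, filtered by radius.
    top = scored[:k]
    within = [(d, i) for d, i in top if d < radius]
    if len(within) < k:
        # Some of the k slots were not filled by in-radius hits,
        # so no in-radius result can have been missed.
        return within, False
    # All k neighbors are within the radius: there may be more of them,
    # so redo this query exhaustively (stand-in for the fallback path).
    return [(d, i) for d, i in scored if d < radius], True

xb = [[0.0], [1.0], [2.0], [3.0], [10.0]]
hits_small_k, fell_back_small_k = range_search_with_fallback(xb, [0.0], 5.0, 2)
hits_large_k, fell_back_large_k = range_search_with_fallback(xb, [0.0], 5.0, 4)
```

Only queries whose k-NN result is entirely inside the radius pay for the exhaustive pass.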
Reviewed By: algoriddle
Differential Revision: D45672301
fbshipit-source-id: 2e77397c40083818584dbafa5427149359a2abfd
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2848
Add selector support for IDMap wrapped indices.
Caveat: this requires wrapping the IDSelector in another selector. Since the params are const, the const is cast away.
This is a problem if the same params are used from multiple execution threads with different selectors. However, this seems rare enough to take the risk.
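The wrapping can be pictured as follows: the user's selector is expressed in external ids, while the wrapped index only knows internal positions, so a translating selector sits in between. The class names below are hypothetical and the sketch is in Python rather than the C++ of the actual change.

```python
class SetSelector:
    """Hypothetical selector on external ids."""

    def __init__(self, ids):
        self.ids = set(ids)

    def is_member(self, i):
        return i in self.ids

class TranslatedSelector:
    """Applies a selector on external ids to internal positions."""

    def __init__(self, id_map, inner):
        self.id_map = id_map  # id_map[position] == external id
        self.inner = inner

    def is_member(self, position):
        return self.inner.is_member(self.id_map[position])

id_map = [100, 205, 42]  # external ids of the vectors at positions 0, 1, 2
wrapped = TranslatedSelector(id_map, SetSelector([42, 100]))
selected = [pos for pos in range(len(id_map)) if wrapped.is_member(pos)]
print(selected)
```

Because the wrapper is swapped into the shared search parameters for the duration of the call, two threads using the same parameters with different selectors could interfere, which is the risk described above.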
Reviewed By: alexanderguzhva
Differential Revision: D45598823
fbshipit-source-id: ec23465c13f1f8273a6a46f9aa869ccae2cdb79c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2846
Adds a function to ivf_contrib to sort the inverted lists by size without changing the results. Also moves big_batch_search to its own module.
Reviewed By: algoriddle
Differential Revision: D45565880
fbshipit-source-id: 091a1c1c074f860d6953bf20d04523292fb55e1a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2781
This is a benchmarking script for keypoint matching with labelled ground-truth.
Reviewed By: alexanderguzhva
Differential Revision: D44036091
fbshipit-source-id: d9d7c089c4d172b66f33dc968c00713a1b79c2d1
Summary: Big batch search can run for hours, so it is useful to have a checkpointing mechanism in case it runs on a best-effort cluster queue.
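A minimal sketch of such a mechanism, assuming the work splits into independent batches and the state is serializable as JSON. The function name, file layout, and state format are hypothetical and are not the ones used by big batch search.

```python
import json
import os
import tempfile

def run_with_checkpoint(batches, process, path):
    """Process batches in order, saving progress after each one."""
    state = {"next": 0, "results": []}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)  # resume from a previous, interrupted run
    for b in range(state["next"], len(batches)):
        state["results"].append(process(batches[b]))
        state["next"] = b + 1
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)  # atomic rename: never leaves a half-written file
    return state["results"]

with tempfile.TemporaryDirectory() as tmpdir:
    checkpoint = os.path.join(tmpdir, "checkpoint.json")
    results = run_with_checkpoint([1, 2, 3], lambda b: b * b, checkpoint)
```

If the job is preempted, rerunning the same call picks up at the first batch that was not saved.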
Reviewed By: algoriddle
Differential Revision: D44059758
fbshipit-source-id: 5cb5e80800c6d2bf76d9f6cb40736009cd5d4b8e
Summary:
Adds support for an IDSelector that takes in two IDSelectors and can perform a boolean operation on their is_member outcomes.
Current implementation is pretty naive and doesn't try to do any optimizations on the types of IDSelectors combined.
Test cases are also lacking, but more can be added once the approach is agreed upon.
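The approach amounts to the following, shown here as a Python sketch with stand-in classes rather than the C++ added by this PR. The class names and the constructor signature are illustrative assumptions.

```python
import operator

class RangeSelector:
    """Stand-in selector: accepts ids in [imin, imax)."""

    def __init__(self, imin, imax):
        self.imin, self.imax = imin, imax

    def is_member(self, i):
        return self.imin <= i < self.imax

class BinarySelector:
    """Combines the is_member outcomes of two selectors with a boolean op."""

    def __init__(self, lhs, rhs, op):
        self.lhs, self.rhs, self.op = lhs, rhs, op

    def is_member(self, i):
        # Naive: always evaluates both sides, no short-circuiting.
        return self.op(self.lhs.is_member(i), self.rhs.is_member(i))

a, b = RangeSelector(0, 10), RangeSelector(5, 15)
in_both = BinarySelector(a, b, operator.and_)
in_one = BinarySelector(a, b, operator.xor)
print([i for i in range(20) if in_both.is_member(i)])
print([i for i in range(20) if in_one.is_member(i)])
```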
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2742
Reviewed By: algoriddle
Differential Revision: D43904855
Pulled By: mdouze
fbshipit-source-id: bbe687800a19b418ca30c9257fb0334c64ab5f52
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2737
IVFPQ with more than 8 bits per subquantizer seems to be acceptable in Faiss, so the comments were updated and additional unit tests were added.
Reviewed By: mdouze
Differential Revision: D43706459
fbshipit-source-id: 45d0cc6f43ec0198aa95d025f07b75a9c33e4db7
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2721
The FAISS_PRAGMA_IMPRECISE_* macros were modified:
* Disabled them for clang on ARM, because it does not support `_Pragma("float_control(precise, off)")`.
* Added the missing pragma for the GCC compiler.
Reviewed By: alexanderguzhva
Differential Revision: D43437450
fbshipit-source-id: cec8042c3c8c7147ae7e2ffa1ac9e2232c8f1a92
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2682
IndexShards normally sees the indexes as opaque, so there is no way to factorize the coarse quantizer.
This diff introduces IndexIVFShards that handles IVF indexes with a common quantizer so that the quantization is computed only once.
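The benefit can be sketched in pure Python: the coarse assignment is computed once per query and passed to every shard, instead of each shard repeating it. All names below are hypothetical and the toy `Shard` class is not the IndexIVFShards implementation.

```python
def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroids(centroids, q, nprobe):
    """Coarse quantization: ids of the nprobe closest centroids."""
    order = sorted(range(len(centroids)), key=lambda c: sqdist(q, centroids[c]))
    return order[:nprobe]

class Shard:
    """Toy shard: inverted lists mapping list number to [(id, vector), ...]."""

    def __init__(self, invlists):
        self.invlists = invlists

    def search_preassigned(self, q, list_nos, k):
        hits = [
            (sqdist(q, v), i)
            for list_no in list_nos
            for i, v in self.invlists.get(list_no, [])
        ]
        return sorted(hits)[:k]

def search_shards(centroids, shards, q, nprobe, k):
    list_nos = nearest_centroids(centroids, q, nprobe)  # computed only once
    merged = []
    for shard in shards:
        merged.extend(shard.search_preassigned(q, list_nos, k))
    return sorted(merged)[:k]

centroids = [[0.0], [10.0]]
shards = [
    Shard({0: [(0, [1.0])], 1: [(1, [9.0])]}),
    Shard({0: [(2, [0.5])], 1: [(3, [11.0])]}),
]
top = search_shards(centroids, shards, [0.0], nprobe=1, k=2)
print([i for _, i in top])
```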
Reviewed By: alexanderguzhva
Differential Revision: D42781513
fbshipit-source-id: 441316eff4c1ba0468501c456af9194ea5f042d6