Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3608
This is a straightforward implementation of QINCo in CPU Faiss, with encoding and decoding capabilities (not training).
For this, we translate a simplified version of some torch classes:
- tensors, restricted to 2D and int32 + float32
- Linear and Embedding layers
Then QINCoStep and QINCo can just be defined as C++ objects that are copy-constructible.
Some plumbing is required in the wrapping layers to support the integration: PyTorch tensors are converted to numpy arrays for getting / setting them in C++.
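As a rough illustration of what the simplified layers compute, here is a conceptual numpy sketch (not the actual C++ classes; the names and shapes are hypothetical):
```
import numpy as np

# Hypothetical numpy equivalents of the simplified, 2D-only layers.
# The real implementation is C++; this only illustrates the computation.

def linear_forward(x, weight, bias=None):
    """x: (n, d_in) float32; weight: (d_out, d_in) float32."""
    y = x @ weight.T
    return y + bias if bias is not None else y

def embedding_forward(codes, table):
    """codes: (n,) int32 indices; table: (K, d) float32 codebook."""
    return table[codes]

# Plumbing sketch: a torch parameter would be handed over as a contiguous
# float32 numpy array, e.g. w = layer.weight.detach().cpu().numpy()
```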
Reviewed By: asadoughi
Differential Revision: D59132952
fbshipit-source-id: eea4856507a5b7c5f219efcf8d19fe56944df088
Summary:
LLVM-15 has a warning `-Wunused-but-set-variable` which we treat as an error because it's so often diagnostic of a code issue. Unused variables can compromise readability or, worse, performance.
This diff either (a) removes an unused variable and, possibly, its associated code, or (b) qualifies the variable with `[[maybe_unused]]`, mostly in cases where the variable _is_ used, but only in, e.g., an `assert` statement that isn't present in production code.
- If you approve of this diff, please use the "Accept & Ship" button :-)
Reviewed By: dmm-fb
Differential Revision: D56065763
fbshipit-source-id: b0541b8a759c4b6ca0e8753fc24b8c227047bd3d
Summary:
This PR adds support for dimensionality reduction in OIVFBBS. I tested the code with an index `OPQ64_128,IVF4096,PQ64` using the ssnpp embeddings; this index string is added to config_ssnpp.yaml to showcase this functionality.
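For reference, a minimal sketch of what this factory string builds, with a hypothetical dimension and random data (this is not the OIVFBBS pipeline itself):
```
import faiss
import numpy as np

d = 512                                           # hypothetical input dimension
xt = np.random.rand(20000, d).astype('float32')   # training vectors
xb = np.random.rand(10000, d).astype('float32')   # database vectors

# OPQ64_128 rotates and reduces the vectors to 128 dimensions
# before they reach the IVF4096,PQ64 index
index = faiss.index_factory(d, "OPQ64_128,IVF4096,PQ64")
index.train(xt)
index.add(xb)
D, I = index.search(xb[:5], 10)
```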
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3290
Reviewed By: junjieqi
Differential Revision: D54878345
Pulled By: mlomeli1
fbshipit-source-id: 98ecdeb2224ce0325e37720cc113d82f9c6c75d6
Summary:
This PR introduces the offline IVF (OIVF) framework, which contains tooling to run search with IVFPQ indexes (plus OPQ pretransforms) for large batches of queries, using [big_batch_search](https://github.com/mlomeli1/faiss/blob/main/contrib/big_batch_search.py) and GPU Faiss. See the [README](36226f5fe8/demos/offline_ivf/README.md) for details about using this framework; a minimal GPU search sketch follows the test lists below.
This PR includes the following unit tests, which can be run with the unittest library as follows:
````
~/faiss/demos/offline_ivf$ python3 -m unittest tests/test_iterate_input.py -k test_iterate_back
````
In test_offline_ivf:
````
test_consistency_check
test_train_index
test_index_shard_equal_file_sizes
test_index_shard_unequal_file_sizes
test_search
test_evaluate_without_margin
test_evaluate_without_margin_OPQ
````
In test_iterate_input:
````
test_iterate_input_file_larger_than_batch
test_get_vs_iterate
test_iterate_back
````
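The framework drives search through its config files; purely as an illustration of the underlying GPU batch-search pattern, here is a sketch with a hypothetical index file, query array, and chunk size (GPU build of Faiss assumed; this is not the OIVF API):
```
import faiss
import numpy as np

# Hypothetical pre-trained OPQ+IVFPQ index and a large query batch
index = faiss.read_index("ivfpq_opq.index")
xq = np.random.rand(1_000_000, index.d).astype('float32')

# Move the index to all available GPUs and search the queries in chunks
gpu_index = faiss.index_cpu_to_all_gpus(index)
k = 10
for i0 in range(0, len(xq), 100_000):
    D, I = gpu_index.search(xq[i0:i0 + 100_000], k)
```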
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3202
Reviewed By: algoriddle
Differential Revision: D52734222
Pulled By: mlomeli1
fbshipit-source-id: 61fd0084277c1b14bdae1189db8ae43340611e16
Summary: Revert so that the test data can be replaced with synthetic data
Reviewed By: mdouze
Differential Revision: D52388604
fbshipit-source-id: c0037635a4e66c54d42400294d13d9a80610b845
Summary:
This PR introduces the offline IVF (OIVF) framework, which contains tooling to run search with IVFPQ indexes (plus OPQ pretransforms) for large batches of queries, using [big_batch_search](https://github.com/mlomeli1/faiss/blob/main/contrib/big_batch_search.py) and GPU Faiss. See the [README](https://github.com/mlomeli1/faiss/blob/oivf/demos/offline_ivf/README.md) for details about using this framework.
This PR includes the following unit tests, which can be run with the unittest library as follows:
````
~/faiss/demos/offline_ivf$ python3 -m unittest tests/test_iterate_input.py -k test_iterate_back
````
In test_offline_ivf:
````
test_consistency_check
test_train_index
test_index_shard_equal_file_sizes
test_index_shard_unequal_file_sizes
test_search
test_evaluate_without_margin
test_evaluate_without_margin_OPQ
test_evaluate_with_margin
test_split_batch_size_bigger_than_file_sizes
test_split_batch_size_smaller_than_file_sizes
test_split_files_with_corrupted_input_file
````
In test_iterate_input:
````
test_iterate_input_file_larger_than_batch
test_get_vs_iterate
test_iterate_back
````
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3175
Reviewed By: algoriddle
Differential Revision: D52218447
Pulled By: mlomeli1
fbshipit-source-id: 78b12457c79b02eb2c9ae993560f2e295798e7e5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2582
A few more or less cosmetic improvements:
* `Index::idx_t` was in the `Index` object, which does not make much sense; this diff moves it to `faiss::idx_t`
* replace `multiprocessing.dummy` with `multiprocessing.pool`
* add Alexandr as a core contributor of Faiss in the README ;-)
```
for i in $( find . -name \*.cu -o -name \*.cuh -o -name \*.h -o -name \*.cpp ) ; do
sed -i 's/Index::idx_t/idx_t/g' "$i"
done
```
For the fbcode deps:
```
for i in $( fbgs Index::idx_t --exclude fbcode/faiss -l ) ; do
sed -i 's/Index::idx_t/idx_t/g' "$i"
done
```
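For the `multiprocessing.dummy` point, the change boils down to using the thread pool from `multiprocessing.pool` directly (a standard-library sketch; the actual Faiss call sites may differ):
```
# Before: multiprocessing.dummy.Pool, a thin thread-based wrapper
# from multiprocessing.dummy import Pool
# pool = Pool(8)

# After: use the thread pool class from multiprocessing.pool explicitly
from multiprocessing.pool import ThreadPool

with ThreadPool(8) as pool:
    results = pool.map(lambda x: x * x, range(100))
```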
Reviewed By: algoriddle
Differential Revision: D41437507
fbshipit-source-id: 8300f2a3ae97cace6172f3f14a9be3a83999fb89
Summary:
As discussed in https://github.com/facebookresearch/faiss/issues/685, I'm going to add an NSG index to faiss. This PR, which adds an NNDescent index, is the first step, as I commented [here](https://github.com/facebookresearch/faiss/issues/685#issuecomment-760608431).
**Changes:**
1. Add an `IndexNNDescent` and an `IndexNNDescentFlat`, which allow users to construct a KNN graph on a million-scale dataset on CPU and search nearest neighbors on it (see the usage sketch after the TODO list). The implementation is put under `faiss/impl`.
2. Add compilation entries to `CMakeLists.txt` for C++ and `swigfaiss.swig` for Python. `IndexNNDescentFlat` could be directly called by users in C++ and Python.
3. `VisitedTable` struct in `HNSW.h` is moved into `AuxIndexStructures.h`.
4. Add a demo `demo_nndescent.cpp` to demonstrate the effectiveness.
**TODO**
1. Support the index factory.
2. Implement `IndexNNDescentPQ` and `IndexNNDescentSQ`.
3. More comments in the code.
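A minimal usage sketch of the new index from Python (the graph degree K and the data sizes are illustrative guesses, not recommended settings):
```
import faiss
import numpy as np

d = 64
xb = np.random.rand(100_000, d).astype('float32')
xq = np.random.rand(100, d).astype('float32')

K = 32                                        # degree of the KNN graph
index = faiss.IndexNNDescentFlat(d, K, faiss.METRIC_L2)
index.add(xb)                                 # builds the NN-Descent graph
D, I = index.search(xq, 10)                   # graph-based search
```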
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1654
Test Plan:
buck test //faiss/tests/:test_index_accuracy -- TestNNDescent
buck test //faiss/tests/:test_build_blocks -- TestNNDescentKNNG
Reviewed By: wickedfoo
Differential Revision: D26309716
Pulled By: mdouze
fbshipit-source-id: 2abade9708d29023f8bccbf77143e8eea14f66c4
Summary: Fixes 2 bugs spotted by ASAN in the demo.
Reviewed By: wenjieX
Differential Revision: D25897053
fbshipit-source-id: fd2bed13faded42426cefc5ebe9d027adec78015
Changelog:
- changed license: BSD+Patents -> MIT
- propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas
- support for searching several inverted lists in parallel (parallel_mode != 0), as shown in the sketch below
- better support for PQ codes where nbit != 8 or 16
- IVFSpectralHash implementation: spectral hash codes inside an IVF
- 6-bit per component scalar quantizer (4 and 8 bit were already supported)
- combinations of inverted lists: HStackInvertedLists and VStackInvertedLists
- configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch)
- more test and demo code compatible with Python 3 (print with parentheses)
- refactored benchmark code: data loading is now in a single file
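To illustrate the parallel_mode and 6-bit scalar quantizer entries above, a sketch with hypothetical sizes:
```
import faiss
import numpy as np

d = 64
xt = np.random.rand(10_000, d).astype('float32')
xb = np.random.rand(50_000, d).astype('float32')

# 6-bit per-component scalar quantizer inside an IVF index
index = faiss.index_factory(d, "IVF256,SQ6")
index.train(xt)
index.add(xb)

# parallel_mode != 0: parallelize over the probed inverted lists
# rather than over the queries
index.parallel_mode = 1
index.nprobe = 16
D, I = index.search(xb[:10], 5)
```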
+ Add conda packages metadata (now building Faiss using conda's toolchain);
+ add Dockerfile for building conda packages (for all CUDA versions);
+ add working Dockerfile building faiss on Centos7;
+ simplify GPU build;
+ avoid falling back to CPU-only version (python);
+ simplify TravisCI config;
+ update INSTALL.md;
+ add configure flag for specifying target architectures (--with-cuda-arch);
+ fix Makefile for gpu tests;
+ fix various Makefile issues;
+ remove stale file (gpu/utils/DeviceUtils.cpp).
* Refactor Makefiles and add configure script.
* Give MKL higher priority in configure script.
* Clean up Linux example makefile.inc.
* Cleanup makefile.inc examples.
* Fix python clean Makefile target.
* Regen swig wrappers.
* Remove useless CUDAFLAGS variable.
* Fix python linking flags.
* Separate compile and link phase in python makefile.
* Add macro to look for swig.
* Add CUDA check in configure script.
* Cleanup make depend targets.
* Cleanup CUDA flags.
* Fix linking flags.
* Fix python GPU linking.
* Remove useless flags from python gpu module linking.
* Add check for cuda libs.
* Cleanup GPU targets.
* Clean up test target.
* Add cpu/gpu targets to python makefile.
* Clean up tutorial Makefile.
* Remove stale OS var from example makefiles.
* Clean up cuda example flags.