# The contrib modules

The contrib directory contains helper modules for Faiss for various tasks.

## Code structure

The contrib directory is packaged as the `faiss.contrib` Python module.
Some of the modules depend on additional packages (e.g. GPU Faiss, PyTorch, hdf5). These are not necessarily compiled in or installed with Faiss, to avoid adding dependencies. It is the user's responsibility to provide them.

In contrib, we are progressively dropping Python 2 support.

## List of contrib modules

### rpc.py

A very simple Remote Procedure Call library in which function parameters and results are pickled. It is meant for use with `client_server.py`.
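
The idea of pickling a call and its result can be sketched with the standard library alone. This is not the code of `rpc.py`: the helper names (`send_obj`, `recv_obj`, `serve_one_call`, `remote_call`), the `Adder` class and the 4-byte length prefix are assumptions made for illustration.

```python
import pickle
import socket
import struct
import threading


def send_obj(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_obj(sock):
    """Receive one length-prefixed pickled object."""
    (size,) = struct.unpack(">I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, size))


def serve_one_call(sock, target):
    """Server side: unpickle (method, args), call it on target, send the result."""
    method, args = recv_obj(sock)
    send_obj(sock, getattr(target, method)(*args))


def remote_call(sock, method, *args):
    """Client side: send the pickled call and wait for the pickled result."""
    send_obj(sock, (method, args))
    return recv_obj(sock)


class Adder:  # hypothetical stand-in for the object served remotely
    def add(self, a, b):
        return a + b


# A connected socket pair replaces a real network connection in this sketch.
client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve_one_call, args=(server_sock, Adder()))
server.start()
result = remote_call(client_sock, "add", 2, 3)
server.join()
client_sock.close()
server_sock.close()
print(result)
```

Unpickling data can execute arbitrary code, so a scheme like this is only suitable between machines that trust each other.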

### client_server.py

The server handles requests to a Faiss index. The client calls the remote index.
This is mainly used to shard datasets over several machines; see [Distributed index](https://github.com/facebookresearch/faiss/wiki/Indexes-that-do-not-fit-in-RAM#distributed-index).
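
When a dataset is sharded, each shard returns its own nearest neighbors and the per-shard results have to be combined. The sketch below shows one way a client could do this for a single query. It is an illustration, not the logic of `client_server.py`: the function name `merge_shard_results` is hypothetical, and it assumes every shard returns `(distance, id)` pairs sorted by increasing distance, with ids that are unique across shards.

```python
import heapq
import itertools


def merge_shard_results(shard_results, k):
    """Return the k best (distance, id) pairs over all shards.

    Each element of shard_results is a list of (distance, id) pairs already
    sorted by increasing distance, as a k-NN search on one shard would give.
    """
    # heapq.merge lazily merges the sorted inputs; only the first k are kept.
    return list(itertools.islice(heapq.merge(*shard_results), k))


shard_a = [(0.1, 7), (0.4, 2), (0.9, 5)]
shard_b = [(0.2, 103), (0.3, 110), (1.5, 101)]
top3 = merge_shard_results([shard_a, shard_b], k=3)
print(top3)
```

Because every shard already returns its own top `k`, the merged top `k` is exact: no global neighbor can be missing from the per-shard lists.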

### ondisk.py

Contains the main logic to merge indexes into an on-disk index.
See [On-disk storage](https://github.com/facebookresearch/faiss/wiki/Indexes-that-do-not-fit-in-RAM#on-disk-storage).

### exhaustive_search.py

Computes the ground-truth search results for a dataset that may not fit in RAM. Uses the GPU if available.
Tested in `tests/test_contrib.TestComputeGT`.
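
Exact results for a dataset that does not fit in RAM can be obtained by reading the database block by block and keeping only the best `k` candidates per query between blocks. The sketch below illustrates that idea in plain Python to stay dependency-free. It is not the implementation in `exhaustive_search.py`: the names `knn_ground_truth_blocks` and `database_blocks` are hypothetical, and real code would use vectorized or GPU distance computations.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_ground_truth_blocks(queries, blocks, k):
    """Exact k nearest neighbors of each query over a database read in blocks.

    blocks is an iterable of lists of vectors; ids are assigned in reading
    order. Only one block plus k candidates per query are held at a time.
    """
    best = [[] for _ in queries]  # per query: at most k (distance, id) pairs
    offset = 0
    for block in blocks:
        for qi, q in enumerate(queries):
            candidates = [(squared_l2(q, v), offset + j) for j, v in enumerate(block)]
            # Keep the k smallest among the previous best and the new block.
            best[qi] = heapq.nsmallest(k, best[qi] + candidates)
        offset += len(block)
    return best


def database_blocks():
    """Toy database of four 2-D vectors, yielded two at a time."""
    yield [[0.0, 0.0], [1.0, 0.0]]
    yield [[0.0, 2.0], [3.0, 3.0]]


queries = [[0.9, 0.1], [0.0, 1.8]]
gt = knn_ground_truth_blocks(queries, database_blocks(), k=2)
gt_ids = [[i for _, i in row] for row in gt]
print(gt_ids)
```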

### torch_utils.py

Interoperability functions for PyTorch and Faiss. Importing this module allows PyTorch tensors (CPU or GPU) to be used as arguments to Faiss indexes and other functions. Torch GPU tensors can only be used with Faiss GPU indexes. If this module is imported with a package that supports Faiss GPU, the necessary stream synchronization with the current PyTorch stream is performed automatically.

NumPy ndarrays can still be used in the Faiss Python interface after importing this module. Within a call, all array arguments must be uniformly either NumPy ndarrays or Torch tensors; no mixing is allowed.

Tested in `tests/test_contrib_torch.py` (CPU) and `gpu/test/test_contrib_torch_gpu.py` (GPU).

### inspect_tools.py

Functions to inspect C++ objects wrapped by SWIG. Most often this just means reading
fields and converting them to the proper Python array.

### datasets.py

(may require h5py)

Definition of how to access data for some standard datasets.

### factory_tools.py

Functions related to factory strings.