# Installing Faiss via conda
The supported way to install Faiss is through [conda](https://docs.conda.io).
Stable releases are pushed regularly to the pytorch conda channel, as well as
pre-release nightly builds.
- The CPU-only faiss-cpu conda package is currently available on Linux (x86-64 and aarch64), OSX (arm64 only), and Windows (x86-64)
- faiss-gpu, containing both CPU and GPU indices, is available on Linux (x86-64 only) for CUDA 11.4 and 12.1
- faiss-gpu-raft, containing both CPU and GPU indices provided by NVIDIA RAFT, is available on Linux (x86-64 only) for CUDA 11.8 and 12.1.
To install the latest stable release:
``` shell
# CPU-only version
$ conda install -c pytorch faiss-cpu=1.9.0
# GPU(+CPU) version
$ conda install -c pytorch -c nvidia faiss-gpu=1.9.0
# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.9.0
# GPU(+CPU) version using AMD ROCm not yet available
```
For faiss-gpu, the nvidia channel is required for CUDA, which is not
published in the main anaconda channel.
For faiss-gpu-raft, the nvidia, rapidsai and conda-forge channels are required.
Nightly pre-release packages can be installed as follows:
``` shell
# CPU-only version
$ conda install -c pytorch/label/nightly faiss-cpu
# GPU(+CPU) version
$ conda install -c pytorch/label/nightly -c nvidia faiss-gpu=1.9.0
# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.9.0 pytorch pytorch-cuda numpy
# GPU(+CPU) version using AMD ROCm not yet available
```
In the above commands, pytorch-cuda=11 or pytorch-cuda=12 selects a specific CUDA version, if it is required.
A combination of versions that installs GPU Faiss with CUDA and Pytorch (as of 2024-05-15):
``` shell
conda create --name faiss_1.8.0
conda activate faiss_1.8.0
conda install -c pytorch -c nvidia faiss-gpu=1.8.0 pytorch=*=*cuda* pytorch-cuda=11 numpy
```
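To check that the environment picked up a working GPU build, a quick smoke test can be run inside the activated environment; `faiss.get_num_gpus()` is provided by the GPU packages and reports 0 when no GPU is visible:
``` shell
# GPU packages only: prints the number of GPUs visible to Faiss
$ python -c "import faiss; print(faiss.get_num_gpus())"
```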
## Installing from conda-forge
Faiss is also being packaged by [conda-forge](https://conda-forge.org/), the
community-driven packaging ecosystem for conda. The packaging effort is
collaborating with the Faiss team to ensure high-quality package builds.
Thanks to the comprehensive infrastructure of conda-forge, certain build
combinations may be supported there that are not available through the pytorch
channel. To install, use
``` shell
# CPU version
$ conda install -c conda-forge faiss-cpu
# GPU version
$ conda install -c conda-forge faiss-gpu
# AMD ROCm version not yet available
```
You can tell which channel your conda packages come from by using `conda list`.
If you are having problems using a package built by conda-forge, please raise
an [issue](https://github.com/conda-forge/faiss-split-feedstock/issues) on the
conda-forge package "feedstock".
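For example, to check where the faiss packages in the current environment came from (the exact output format depends on the conda version):
``` shell
# the Channel column shows which channel each package was installed from
$ conda list faiss
```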
# Building from source
Faiss can be built from source using CMake.
Faiss is supported on x86-64 machines on Linux, OSX, and Windows. It has been
found to run on other platforms as well, see
[other platforms](https://github.com/facebookresearch/faiss/wiki/Related-projects#bindings-to-other-languages-and-porting-to-other-platforms).
The basic requirements are:
- a C++17 compiler (with support for OpenMP version 2 or higher),
- a BLAS implementation (on Intel machines we strongly recommend using Intel MKL for best
performance).
The optional requirements are:
- for GPU indices:
- nvcc,
- the CUDA toolkit,
- for AMD GPUs:
- AMD ROCm,
- for the python bindings:
- python 3,
- numpy,
- and swig.
Indications for specific configurations are available in the [troubleshooting
section of the wiki](https://github.com/facebookresearch/faiss/wiki/Troubleshooting).
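Before invoking CMake, it can be useful to confirm that the requirements are present; the GPU and python lines below only apply if those components are being built:
``` shell
# C++ compiler (OpenMP ships with the usual gcc/clang toolchains)
$ g++ --version
# CUDA toolkit, only needed for GPU indices
$ nvcc --version
# python bindings prerequisites
$ swig -version
$ python3 -c "import numpy; print(numpy.__version__)"
```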
## Step 1: invoking CMake
``` shell
$ cmake -B build .
```
This generates the system-dependent configuration/build files in the `build/`
subdirectory.
Several options can be passed to CMake, among which:
- general options:
- `-DFAISS_ENABLE_GPU=OFF` in order to disable building GPU indices (possible
values are `ON` and `OFF`),
- `-DFAISS_ENABLE_PYTHON=OFF` in order to disable building python bindings
(possible values are `ON` and `OFF`),
- `-DFAISS_ENABLE_RAFT=ON` in order to enable building the RAFT implementations
of the IVF-Flat and IVF-PQ GPU-accelerated indices (default is `OFF`, possible
values are `ON` and `OFF`)
- `-DBUILD_TESTING=OFF` in order to disable building C++ tests,
- `-DBUILD_SHARED_LIBS=ON` in order to build a shared library (possible values
are `ON` and `OFF`),
- `-DFAISS_ENABLE_C_API=ON` in order to enable building [C API](c_api/INSTALL.md) (possible values
are `ON` and `OFF`),
- optimization-related options:
- `-DCMAKE_BUILD_TYPE=Release` in order to enable generic compiler
optimization options (enables `-O3` on gcc for instance),
- `-DFAISS_OPT_LEVEL=avx2` in order to enable the required compiler flags to
generate code using optimized SIMD/Vector instructions. Possible values are below:
- On x86-64, `generic`, `avx2` and `avx512`, by increasing order of optimization,
- On aarch64, `generic` and `sve`, by increasing order of optimization,
- `-DFAISS_USE_LTO=ON` in order to enable [Link-Time Optimization](https://en.wikipedia.org/wiki/Link-time_optimization) (default is `OFF`, possible values are `ON` and `OFF`).
- BLAS-related options:
- `-DBLA_VENDOR=Intel10_64_dyn -DMKL_LIBRARIES=/path/to/mkl/libs` to use the
Intel MKL BLAS implementation, which is significantly faster than OpenBLAS
(more information about the values for the `BLA_VENDOR` option can be found in
the [CMake docs](https://cmake.org/cmake/help/latest/module/FindBLAS.html)),
- GPU-related options:
- `-DCUDAToolkit_ROOT=/path/to/cuda-10.1` in order to hint to the path of
the CUDA toolkit (for more information, see
[CMake docs](https://cmake.org/cmake/help/latest/module/FindCUDAToolkit.html)),
- `-DCMAKE_CUDA_ARCHITECTURES="75;72"` for specifying which GPU architectures
to build against (see [CUDA docs](https://developer.nvidia.com/cuda-gpus) to
determine which architecture(s) you should pick),
- `-DFAISS_ENABLE_ROCM=ON` in order to enable building GPU indices for AMD GPUs.
`-DFAISS_ENABLE_GPU` must be `ON` when using this option. (possible values are `ON` and `OFF`),
- python-related options:
- `-DPython_EXECUTABLE=/path/to/python3.7` in order to build a python
interface for a different python than the default one (see
[CMake docs](https://cmake.org/cmake/help/latest/module/FindPython.html)).
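Putting several of these options together, a typical CPU-only optimized configuration could look like the following (the chosen flags are examples, not requirements):
``` shell
$ cmake -B build \
    -DFAISS_ENABLE_GPU=OFF \
    -DFAISS_ENABLE_PYTHON=ON \
    -DFAISS_OPT_LEVEL=avx2 \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=ON \
    .
```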
## Step 2: Invoking Make
``` shell
$ make -C build -j faiss
```
This builds the C++ library (`libfaiss.a` by default, and `libfaiss.so` if
`-DBUILD_SHARED_LIBS=ON` was passed to CMake).
The `-j` option enables parallel compilation of multiple units, leading to a
faster build, but increasing the chances of running out of memory, in which case
it is recommended to set the `-j` option to a fixed value (such as `-j4`).
When building with an optimization option (`FAISS_OPT_LEVEL`), build the corresponding optimized target before `swigfaiss`.
For AVX2:
``` shell
$ make -C build -j faiss_avx2
```
For AVX512:
``` shell
$ make -C build -j faiss_avx512
```
This ensures that the necessary files are created when building and installing the python package.
## Step 3: Building the python bindings (optional)
``` shell
$ make -C build -j swigfaiss
$ (cd build/faiss/python && python setup.py install)
```
The first command builds the python bindings for Faiss, while the second one
generates and installs the python package.
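A quick way to verify the installed bindings is to import them from outside the build tree; `faiss.get_compile_options()`, where available, reports build options such as the SIMD level:
``` shell
$ python -c "import faiss; print(faiss.__version__)"
# where available, reports build options such as the SIMD level
$ python -c "import faiss; print(faiss.get_compile_options())"
```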
## Step 4: Installing the C++ library and headers (optional)
``` shell
$ make -C build install
```
This will make the compiled library (either `libfaiss.a` or `libfaiss.so` on
Linux) available system-wide, as well as the C++ headers. This step is not
needed to install the python package only.
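By default, CMake installs under its standard prefix (typically `/usr/local` on Linux); to install into a different location, set `CMAKE_INSTALL_PREFIX` at configuration time (the path below is just an example):
``` shell
$ cmake -B build -DCMAKE_INSTALL_PREFIX=$HOME/faiss_install .
$ make -C build install
```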
## Step 5: Testing (optional)
### Running the C++ test suite
To run the whole test suite, make sure that `cmake` was invoked with
`-DBUILD_TESTING=ON`, and run:
``` shell
$ make -C build test
```
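The `test` target drives the suite through CTest, so a subset of the C++ tests can also be selected with a name filter (the filter string below is only illustrative):
``` shell
$ (cd build && ctest -R IVF --output-on-failure)
```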
### Running the python test suite
``` shell
$ (cd build/faiss/python && python setup.py build)
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
```
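Individual test files or test cases can be selected with the usual pytest options (the file name and filter below are only examples):
``` shell
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_index.py -k Flat
```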
### Basic example
A basic usage example is available in
[`demos/demo_ivfpq_indexing.cpp`](https://github.com/facebookresearch/faiss/blob/main/demos/demo_ivfpq_indexing.cpp).
It creates a small index, stores it and performs some searches. A normal runtime
is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
It can be built with
``` shell
$ make -C build demo_ivfpq_indexing
```
and subsequently run with
``` shell
$ ./build/demos/demo_ivfpq_indexing
```
### Basic GPU example
``` shell
$ make -C build demo_ivfpq_indexing_gpu
$ ./build/demos/demo_ivfpq_indexing_gpu
```
This produces the GPU equivalent of the CPU `demo_ivfpq_indexing`. It also
shows how to translate indexes from/to a GPU.
### A real-life benchmark
A longer example runs and evaluates Faiss on the SIFT1M dataset. To run it,
please download the ANN_SIFT1M dataset from http://corpus-texmex.irisa.fr/
and unzip it to the subdirectory `sift1M` at the root of the source
directory for this repository.
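For reference, the dataset is distributed as a tarball that extracts to a `sift/` directory (names as of the time of writing; adjust if the distribution changes):
``` shell
$ tar xzf sift.tar.gz   # extracts the .fvecs/.ivecs files into sift/
$ mv sift sift1M        # the demo expects them under sift1M/
```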
Then compile and run the following (after ensuring you have installed faiss):
``` shell
$ make -C build demo_sift1M
$ ./build/demos/demo_sift1M
```
This is a demonstration of the high-level auto-tuning API. You can try
setting a different index_key to find the indexing structure that
gives the best performance.
### Real-life test
The following script extends the demo_sift1M test to several types of
indexes. This must be run from the root of the source directory for this
repository:
``` shell
$ mkdir tmp # graphs of the output will be written here
$ python demos/demo_auto_tune.py
```
It will cycle through a few types of indexes and find optimal
operating points. You can play around with the types of indexes.
### Real-life test on GPU
The example above also runs on GPU. Edit `demos/demo_auto_tune.py` at line 100
with the values
``` python
keys_to_test = keys_gpu
use_gpu = True
```
and you can run
``` shell
$ python demos/demo_auto_tune.py
```
to test the GPU code.