Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3361
Fix a few issues in the PR.
Normally all tests should pass on a litlle-endian machine
Reviewed By: junjieqi
Differential Revision: D56003181
fbshipit-source-id: 405dec8c71898494f5ddcd2718c35708a1abf9cb
Summary:
This pull request is for issue https://github.com/facebookresearch/faiss/issues/3330. This patch makes sure that packed code arrays are in big endian format. Kindly let us know if we need any changes or if we can have a better approach.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3345
Reviewed By: junjieqi
Differential Revision: D55957630
Pulled By: mdouze
fbshipit-source-id: f728f9563f6b942af9d8899b54662d7ceb811206
Summary:
The current loop goes from 0 to 31. It has an if statement to do an assignment for j < 16 and a different assignment for j >= 16. By unrolling the loop to do the j < 16 and the j >= 16 iterations in parallel the if j < 16 is eliminated and the number of loop iterations is reduced in half.
Then unroll the loop for the j < 16 and the j >=16 to a depth of 2.
This change results in approximately a 55% reduction in the execution time for the bench_ivf_fastscan.py workload on Power 10 when compiled with CMAKE_INSTALL_CONFIG_NAME=Release.
The removal of the if (j < 16) statement and the unrolling of the loop removes branch cycle stall and register dependencies on instruction issue. The result is the unrolled code is able issue instructions earlier thus reducing the total number of cycles required to execute the function.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3364
Reviewed By: kuarora
Differential Revision: D56455690
Pulled By: mdouze
fbshipit-source-id: 490a17a40d9d4439b1a8ea22e991e706d68fb2fa
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3371
This will never happen because N is fixed at compile time and the buffer is large enough. It is misleading to add error handling code for a case that will never happen.
Reviewed By: kuarora
Differential Revision: D56274458
fbshipit-source-id: ca706f1223dbc97e69d5ac9750b277afa4df80a7
Summary:
In this commit ab2b7f5093, they changed format based on clang-format-18. However, we still use clang-format-11 in our circle ci job which caused the failure. In this PR, we are going to switch to clang-format-18
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3372
Reviewed By: kuarora
Differential Revision: D56280363
Pulled By: junjieqi
fbshipit-source-id: f832ab2112f762e6000b55a155e3e43fe99071d7
Summary:
The CMakeLists.txt in faiss/gpu uses the $<LINK_LIBRARY:WHOLE_ARCHIVE expression which requires at least cmake 3.24.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3305
Reviewed By: mlomeli1
Differential Revision: D56234500
Pulled By: algoriddle
fbshipit-source-id: dfe7df3379c5250dedec7d1988cffa889fc1c393
Summary:
LLVM-15 has a warning `-Wunused-but-set-variable` which we treat as an error because it's so often diagnostic of a code issue. Unused variables can compromise readability or, worse, performance.
This diff either (a) removes an unused variable and, possibly, it's associated code, or (b) qualifies the variable with `[[maybe_unused]]`, mostly in cases where the variable _is_ used, but, eg, in an `assert` statement that isn't present in production code.
- If you approve of this diff, please use the "Accept & Ship" button :-)
Reviewed By: dmm-fb
Differential Revision: D56065763
fbshipit-source-id: b0541b8a759c4b6ca0e8753fc24b8c227047bd3d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3363
'sprintf' is deprecated: This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead
{F1484071654}
Reviewed By: kuarora
Differential Revision: D56009251
fbshipit-source-id: ec222cf589ff98b016979058d59fc20cccec8f43
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3354
**Change was previously reverted because of build failure as change D55577576 removed the definition of FAISS_THROW_IF_MSG**
**Context**
[Issue 3128](https://github.com/facebookresearch/faiss/issues/3128) is an enhancement request to support remove_ids for IVFPQFastScan.
Existing mechanism use direct map and iterate over items in selector and use scopecodes and scopeIds to replace item to be removed. Given that codes are packed, it is hard to return single code how it is packed in CodePackerPQ4. Thus, we need a custom implementation to removed_ids.
**In this diff**,
1. We have added custom implementation of remove_ids from BlockInvertedLists which unpack code as it iterate and repack in new position. DirectMap use this remove_id function in BlockInvertedLists for type NoMap in DirectMap.
2. Also, we are throwing exception for other map type in DirectMap i.e. HashTable
Reviewed By: ramilbakhshyiev
Differential Revision: D55858959
fbshipit-source-id: c8a0631495380b7dead36720e4507f4d1900d39f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3349
**Context**
[Issue 3128](https://github.com/facebookresearch/faiss/issues/3128) is an enhancement request to support remove_ids for IVFPQFastScan.
Existing mechanism use direct map and iterate over items in selector and use scopecodes and scopeIds to replace item to be removed. Given that codes are packed, it is hard to return single code how it is packed in CodePackerPQ4. Thus, we need a custom implementation to removed_ids.
**In this diff**,
1. We have added custom implementation of remove_ids from BlockInvertedLists which unpack code as it iterate and repack in new position. DirectMap use this remove_id function in BlockInvertedLists for type NoMap in DirectMap.
2. Also, we are throwing exception for other map type in DirectMap i.e. HashTable
Reviewed By: mdouze
Differential Revision: D55723390
fbshipit-source-id: 0017b556bd790765251e778ac48ed37ff3a29a45
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3327
**Context**
1. [Issue 2621](https://github.com/facebookresearch/faiss/issues/2621) discuss inconsistency between OnDiskInvertedList and InvertedList. OnDiskInvertedList is supposed to handle disk based multiple Index Shards. Thus, we should name it differently when merging invls from index shard.
2. [Issue 2876](https://github.com/facebookresearch/faiss/issues/2876) provides usecase of shifting ids when merging invls from different shards.
**In this diff**,
1. To address #1 above, I renamed the merge_from function to merge_from_multiple without touching merge_from base class.
why so? To continue to allow merge invl from one index to ondiskinvl from other index.
2. To address #2 above, I have added support of shift_ids in merge_from_multiple to shift ids from different shards. This can be used when each shard has same set of ids but different data. This is not recommended if id is already unique across shards.
Reviewed By: mdouze
Differential Revision: D55482518
fbshipit-source-id: 95470c7449160488d2b45b024d134cbc037a2083
Summary:
I noticed we have a pretty decent C API for binary indexes and please correct me if I'm wrong but we seem to be missing a couple of functions, like the ability to clone and read binary indexes. This PR provides those functions.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3318
Reviewed By: algoriddle
Differential Revision: D55469615
Pulled By: mdouze
fbshipit-source-id: 42e6f827d8b5ad6bc3efe989e47ede3aa06c1810
Summary:
In the https://github.com/facebookresearch/faiss/pull/3278, we to find_package to link to GTest. However, it needs to have googletest to build independently. Not everyone builds their googletest locally first. In this diff, we still try to build googletest from source and combine find_package together.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3319
Test Plan:
STEP 1: Install deps
```
conda install -y -q python=3.11 cmake make swig=4.0.2 mkl=2023 mkl-devel=2023 numpy scipy pytest gxx_linux-64 sysroot_linux-64
```
STEP2: Compile
```
cmake -B build \
-DBUILD_TESTING=ON \
-DBUILD_SHARED_LIBS=ON \
-DFAISS_ENABLE_GPU=OFF \
-DFAISS_ENABLE_RAFT=OFF \
-DFAISS_OPT_LEVEL=avx2 \
-DFAISS_ENABLE_C_API=ON \
-DPYTHON_EXECUTABLE=$(which python) \
-DCMAKE_BUILD_TYPE=Release \
-DBLA_VENDOR=Intel10_64_dyn \
-DCMAKE_CUDA_FLAGS="-gencode arch=compute_75,code=sm_75" \
.
```
Reviewed By: algoriddle
Differential Revision: D55358059
Pulled By: junjieqi
fbshipit-source-id: 95ad4a745238b88b438728de64173f99d3d50dbe
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3312
as the [#issue3258](https://github.com/facebookresearch/faiss/issues/3258) mentioned, the IVFPQFastScan should have same decoding result as IVFPQ. However, current result is not as expected.
In this PR/Diff, we are going to fix the decoding function
Reviewed By: mdouze
Differential Revision: D55264781
fbshipit-source-id: dfdae9eabceadfc5a3ebb851930d71ce3c1c654d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3317
libraft packages are first published in rapidsai-nightly and moved to rapidsai after release, at which point they're removed from rapidsai-nightly
In this diff we enable both channels with a preference to rapidsai (since it's before rapidsai-nightly on the command line).
Reviewed By: mlomeli1
Differential Revision: D55310143
fbshipit-source-id: b85e0fda86a442f435d985ace1d7eb37209c74e1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3310
**Context**
[Issue 2751](https://github.com/facebookresearch/faiss/issues/2751) is already fixed as class wrappers has replacement definition of reconstruct_n in handle_IndexBinary.
**In this diff**,
Writing test test_reconstruct for binary index to validate fix for above issue.
Reviewed By: junjieqi
Differential Revision: D55168600
fbshipit-source-id: b62dc5fa89d65b843c52faa7456f046142e34421
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3311
**Context**
[Issue 2948](https://github.com/facebookresearch/faiss/issues/2948) highlights potential issue of calling allocation on result handler which may throw exception but it is not handled.
**In this diff**,
I observed two calls where we may potentially call allocation in ResultHandler.h and handled FaissException.
1/ partial result when finalized in ~SingleResultHandler
2/ partial result when merged in ~RangeSearchBlockResultHandler
Reviewed By: junjieqi
Differential Revision: D55258213
fbshipit-source-id: 259be472e73619b2fcb0ea480d6d3486affeafdf
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3309
Make sure that the HNSW search stats work, remove stats for deprecated functionality.
Remove code of the link and code paper that is not supported anymore.
Reviewed By: kuarora, junjieqi
Differential Revision: D55247802
fbshipit-source-id: 03f176be092bff6b2db359cc956905d8646ea702
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3295
In the past, we had build failure due to the latest swig version in conda-forge so we had to specify the version of swig. In this diff, we are going to change it to be the latest version always because the issue has gone.
Reviewed By: algoriddle
Differential Revision: D54975271
fbshipit-source-id: 7ca59fb58390edb0cc5ed52f6fd416f633dd7938
Summary:
This PR adds support for dimensionality reduction in OIVFBBS. I tested the code with an index `OPQ64_128,IVF4096,PQ64` using the ssnpp embeddings - this index string is added to the config_ssnpp.yaml to showcase this functionality.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3290
Reviewed By: junjieqi
Differential Revision: D54878345
Pulled By: mlomeli1
fbshipit-source-id: 98ecdeb2224ce0325e37720cc113d82f9c6c75d6
Summary:
in AIX OS ,as fileno is defined as C macro, we get the compilation error during preprocessing step.
In file included from /ranjit/Faiss/faiss/faiss/invlists/InvertedListsIOHook.h:10:
/ranjit/Faiss/faiss/faiss/impl/io.h:35:17: error: expected member name or ';' after declaration specifiers
35 | virtual int fileno();
| ~~~~~~~~~~~ ^
/usr/include/stdio.h:517:30: note: expanded from macro 'fileno'
517 | #define fileno(__p) ((__p)->_file)
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3275
Reviewed By: algoriddle
Differential Revision: D54944388
Pulled By: mdouze
fbshipit-source-id: 40c4314de93547778ac274281245ff59e3a18b6d
Summary:
## Description
Related issue: https://github.com/facebookresearch/faiss/issues/3246
When reading HNSWPQ from disk, if index ~read only~ new `IO_FLAG_PQ_SKIP_SDC_TABLE` flag is set, skip initializing the sdc_table.
In addition, adds cpp test case verifying functionality and build test util header file to share creation of temporary files amongst tests.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3250
Test Plan: buck test //faiss/tests/:test_disable_pq_sdc_tables
Reviewed By: junjieqi
Differential Revision: D53844075
Pulled By: mdouze
fbshipit-source-id: e9a83c0e5243867edbca8f80e3b1242b38ef6a42
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3256
Per https://github.com/facebookresearch/faiss/issues/3251 there are two problems with DeviceVector resizing and capacity growth. The first is that if you resize a vector with enough capacity available for the new size, it will go ahead and re-allocate memory anyways.
The second is that the calculation that was supposed to produce x + 0.25 * x was actually producing x + 4 * x for determining the new size of the allocated memory for a vector. This is also fixed.
Reviewed By: mdouze
Differential Revision: D53813207
fbshipit-source-id: 5aa67bc0a87c171a070645bdcc6bc5d22ba6b36b
Summary:
Tested with both MSVC (with /openmp:llvm) and clang-cl (no particular extra flags needed). This PR is separated into two commits (three after I found out that lines need to be 80 chars or less):
1. Changes needed for clang-cl (and probably stock clang too)
2. Changes needed for MSVC
So FAISS can decide either to require using LLVM for Windows (not a hard thing to do these days since it is fully supported inside Visual Studio) and discarding the second commits, or taking them all and documenting the need to use /openmp:llvm
Closes https://github.com/facebookresearch/faiss/issues/3193
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3238
Reviewed By: mdouze
Differential Revision: D53479325
Pulled By: algoriddle
fbshipit-source-id: e8628f44626b6f49c5d9d7f259a9e3061cfe5568
Summary:
LLVM-15 has a warning `-Wunused-but-set-variable` which we treat as an error because it's so often diagnostic of a code issue. Unused variables can compromise readability or, worse, performance.
This diff either (a) removes an unused variable and, possibly, it's associated code, or (b) qualifies the variable with `[[maybe_unused]]`, mostly in cases where the variable _is_ used, but, eg, in an `assert` statement that isn't present in production code.
- If you approve of this diff, please use the "Accept & Ship" button :-)
Reviewed By: palmje
Differential Revision: D53779589
fbshipit-source-id: 2631d7b23f2bc79b0468a8d983f74547c15f9c15