Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3786
The ROCm build passes all but 2 GPU tests. We want to enable the passing tests on CI while skipping the 2 failing tests to make progress. The 2 tests fail specifically on the hardware type that we use for our runners, and the AMD team is actively working on root-causing the failures and providing a fix:
`TestGpuIndexIVFPQ.Query_L2_MMCodeDistance`
`TestGpuIndexIVFPQ.Query_IP_MMCodeDistance`
Reviewed By: asadoughi
Differential Revision: D61688657
fbshipit-source-id: 3fedfcf22a0ccf40ac8aff033e8bc09c4eb0cbd5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3785
Right now, when avx512 is turned on, we only return AVX2 in options. My understanding is that turning on avx512 sets both the `__AVX2__` and `__AVX512F__` macros: https://fburl.com/vgh7jg9p
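The ordering issue can be sketched as follows. This is an illustration in Python, not the faiss source: `compile_options` and the labels it returns are hypothetical stand-ins. The point is only that, since an AVX512 build defines both macros, the AVX512 check has to come before the AVX2 check.

```python
def compile_options(defined_macros):
    """Return a label for the widest SIMD level implied by the defined macros.

    Hypothetical helper: an AVX512 build defines both __AVX2__ and
    __AVX512F__, so the AVX512 check must come first.
    """
    macros = set(defined_macros)
    if "__AVX512F__" in macros:
        return "AVX512"
    if "__AVX2__" in macros:
        return "AVX2"
    return "GENERIC"
```

Checking `__AVX2__` first would report AVX2 for an AVX512 build, which is the behavior described above.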
Reviewed By: asadoughi
Differential Revision: D61674490
fbshipit-source-id: 47292025b4eb5ef5907c4fbb0bbf39259129f6ee
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3772
It looks like there are many failures on the retry build workflow, but these are mainly caused by retry attempts with the --failed flag, which cannot rerun workflows that don't have any failed jobs.
Reviewed By: kuarora, junjieqi, ramilbakhshyiev
Differential Revision: D61489426
fbshipit-source-id: 6dcef6ba422634bb333e44a5b12c74c5d3b3df8f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3757
In the telemetry wrapper, we need to wrap read_index to return wrapped index structs. D61049751
This read_index wrapper calls several static functions, which are not callable outside their C++ file. This diff therefore makes them non-static and declares them in the header file so that the wrapper can call them.
Reviewed By: asadoughi
Differential Revision: D61282004
fbshipit-source-id: 2c8b2ded169577aa6eecdf1edc7483b0ef5f0665
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3732
AVX512 has been running on GhA for some days without issues. Deleting the CircleCI config. Will press the "deprecate CircleCI button" in 1-2 more weeks. I want to wait a little longer just in case anything goes wrong for AVX512 on GhA.
Reviewed By: junjieqi, ramilbakhshyiev
Differential Revision: D60914370
fbshipit-source-id: 5bb09e81c3f5cd1a58525fe633d07373884207d4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3777
The openblas version was bumped from 0.3.27 to 0.3.28 in the last 3 days. This caused the below test to fail. Confirmed with algoriddle that bumping nprobe is okay to do.
Reviewed By: algoriddle
Differential Revision: D61536541
fbshipit-source-id: 1e83f75011517ba7b856520f11526e72a00494a5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3761
This fixes CUDA errors inside faiss in the test environment. If torch is loaded first (this change) then both torch and faiss see all GPUs available on the machine in the ROCm build. Without this change, torch sees the GPUs and faiss does not. AMD team is looking at finding the root cause but we wanted to fix this for now.
Reviewed By: junjieqi, mnorris11
Differential Revision: D61358018
fbshipit-source-id: ac59be99817ef13d37a1676f615585f44eabaf24
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3760
This fixes the memory leak and the warning reported after running Python tests under ROCm. No destructor was declared, so objects would remain allocated.
Reviewed By: gtwang01
Differential Revision: D61357579
fbshipit-source-id: cf73bbd7a7002565a4224c1f0af0aa6ea5edebdb
Summary:
Several changes:
1. Introduce `ClusteringParameters::check_input_data_for_NaNs`, which may suppress checks for NaN values in the input data
2. Introduce `ClusteringParameters::use_faster_subsampling`, which uses a newly added SplitMix64-based rng (`SplitMix64RandomGenerator`) and also may pick duplicate points from the original input dataset. Surprisingly, `rand_perm()` may involve noticeable non-zero costs for certain scenarios.
3. Negative values for `ClusteringParameters::seed` initialize the internal clustering rng with a high-resolution clock each time, making the clustering procedure pick different subsamples each time. I've decided not to use `std::random_device` in order to avoid possible negative effects.
Useful for future `ProductResidualQuantizer` improvements.
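A rough Python sketch of items 2 and 3, for illustration only. The SplitMix64 step is the standard published algorithm; the helper names (`make_rng`, `fast_subsample`) and the use of `time.perf_counter_ns()` as the high-resolution clock are assumptions of this sketch, not the faiss C++ implementation.

```python
import time

MASK64 = (1 << 64) - 1


class SplitMix64:
    """Standard SplitMix64 generator, with Python ints masked to 64 bits."""

    def __init__(self, seed):
        self.state = seed & MASK64

    def next(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


def make_rng(seed):
    # Assumption of this sketch: a negative seed means "reseed from a
    # high-resolution clock", so repeated runs pick different subsamples.
    if seed < 0:
        seed = time.perf_counter_ns()
    return SplitMix64(seed)


def fast_subsample(n, k, seed):
    """Pick k indices in [0, n) with replacement, so duplicates are possible.

    No permutation of all n elements is built, which is where the savings
    over a rand_perm()-style approach would come from.
    """
    rng = make_rng(seed)
    # Plain modulo has a small bias when n does not divide 2**64; that is
    # accepted here for simplicity.
    return [rng.next() % n for _ in range(k)]
```

With a non-negative seed the subsample is reproducible; with a negative seed it changes from call to call.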
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3731
Reviewed By: asadoughi
Differential Revision: D61106105
Pulled By: mnorris11
fbshipit-source-id: 072ab2f5ce4f82f9cf49d678122f65d1c08ce596
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3748
So we can dynamically change it
Reviewed By: asadoughi
Differential Revision: D61029191
fbshipit-source-id: 19a6775c1218762dac7a7805e13efab9bb43cfa5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3747
This change converts the ROCm build to run inside containers and updates it to run on AMD GPU based runners. Still working with the AMD team to resolve test failures before enabling those.
Differential Revision: D61049115
fbshipit-source-id: 28274e0bde795f99b3d78711beaf9b3ed3c5e66c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3744
gpg is needed for ROCm builds but does not come with containerized builds. This change adds installation of gpg.
Reviewed By: junjieqi
Differential Revision: D61007840
fbshipit-source-id: 6322112803866dff57637bea290dc032e2bf41ad
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3743
This fixes containerized builds that will be needed for ROCm.
Reviewed By: junjieqi
Differential Revision: D61007764
fbshipit-source-id: 11fa8dc3641a85d4c220832bedf0f6d62ae49426
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3742
Before this change, the CMake line was setting (instead of appending to) compile definitions for Python code. This replaced the GPU wrappers flag and resulted in the Python library compiling with no GPU code, which failed ROCm tests.
Reviewed By: junjieqi
Differential Revision: D61007640
fbshipit-source-id: 174aeb0a4abe0607629ddf57c882d19ea2d6c6bf
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3718
This workflow needs to be pushed first before it can be called from the build workflow.
Reviewed By: ramilbakhshyiev
Differential Revision: D60697701
fbshipit-source-id: 40cb6b7006dae8293e966cc2cbb0ebda5d606045
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3737
This moves nightly builds away from when most of the team is working to avoid exhausting limited resources like custom hardware / specialized hardware.
Reviewed By: bshethmeta
Differential Revision: D60976671
fbshipit-source-id: 1a8521379654a06a793fda0ae3f3bd1bf6fa8bf6
Summary:
The TestPartitioning.TestPartitioningBigRange test case fails on gcc version 13.2. We can avoid this by requiring gcc version 11.2 where the test case works.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3655
Test Plan: Check github workflows and test branches
Reviewed By: ramilbakhshyiev
Differential Revision: D59988036
Pulled By: gtwang01
fbshipit-source-id: ae6d7f7888c9d7a2e59f557e05dbd4f318983668
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3725
This step is necessary for both of the builds with newer gxx_linux package version. ROCm is already using this symlinking and this change expands it to RAFT as well.
Reviewed By: mengdilin
Differential Revision: D60830977
fbshipit-source-id: fe95a6580b3866e17b56d542509405e93a3ff453
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3717
They have been running in shadow mode for quite some time now.
I spot-checked the builds for the past 10 jobs and they all look good. Since `continue-on-error` will always mark a job as "green" even if it fails, I need a way to holistically verify these builds actually work reliably. Turning the builds to blocking to accomplish that.
Reviewed By: ramilbakhshyiev
Differential Revision: D60692521
fbshipit-source-id: 172a6362c672b0376c76559f12852110936756df
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3716
Renaming `USE_ROCM` to `FAISS_ENABLE_ROCM` in CMake files, `FAISS_ENABLE_ROCM` in SWIG files, and `USE_AMD_ROCM` in other source files to follow the existing naming convention.
Reviewed By: mnorris11
Differential Revision: D60673731
fbshipit-source-id: 1aaa3f2ff6836830c4eb733ee7f41554f79f9695
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3715
Cleaning up linter and shell warnings after importing the PR. This is done as a follow-up as it's easier to address these in a separate PR since the original PR was authored by an external contributor.
Reviewed By: mengdilin
Differential Revision: D60639835
fbshipit-source-id: eba00a557339873742e1caf43c6be45f4d065333
Summary:
* add hipify at configure time
* ROCm specific code paths behind USE_ROCM guards
* support for wavefront 32 (Navi) and 64 (MI)
* use builtins to match inline PTX
* support C API on ROCm
* support Python API on ROCm
---------
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3462
Reviewed By: asadoughi
Differential Revision: D60431193
Pulled By: ramilbakhshyiev
fbshipit-source-id: ac82d5ecb38f995c467e100ed583d5178ae489ee
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3703
Now that the SVE PR has been merged, we can turn on SVE opt mode in CI.
Reviewed By: ramilbakhshyiev
Differential Revision: D60457456
fbshipit-source-id: 053b1f8ac805afba9035095c5df811da05675a81
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3701
Gating ARM SVE behind the plain vanilla linux x64 build like all the other builds.
Reviewed By: mengdilin
Differential Revision: D60425535
fbshipit-source-id: f2e082fbaa6ea1e314ffe0e2e7260c8634cab989
Summary:
related: https://github.com/facebookresearch/faiss/issues/2884
This PR contains the changes below:
- Add new optlevel `sve`
- ARM SVE is an _extension_ of ARMv8, so it should be treated similar to AVX2 IMO
- Add targets for ARM SVE, `faiss_sve` and `swigfaiss_sve`
- These targets will be built when you give `-DFAISS_OPT_LEVEL=sve` at build time
- Design decision: Don't fix SVE register length.
- The python package of faiss is a "fat binary" (for example, the package for avx2 contains `_swigfaiss_avx2.so` and `_swigfaiss.so`)
- SVE is a scalable instruction set (= doesn't fix vector length), but we can actually specify the vector length at compile time.
- [with `-msve-vector-length=` option](https://developer.arm.com/documentation/101726/4-0/Coding-for-Scalable-Vector-Extension--SVE-/SVE-Vector-Length-Specific--VLS--programming)
- When this option is specified, the binary can't work correctly on a CPU whose vector length differs from the one specified at compile time
- With fixed vector lengths, an SVE-supported faiss python package would contain 7 shared libraries: `_swigfaiss.so`, `_swigfaiss_sve.so`, `_swigfaiss_sve128.so`, `_swigfaiss_sve256.so`, `_swigfaiss_sve512.so`, `_swigfaiss_sve1024.so`, and `_swigfaiss_sve2048.so`. The package size would explode.
- For these reasons, I don't specify the vector length at compile time, and `faiss_sve` detects the vector length at run time.
- Add a mechanism for detecting ARM SVE in the runtime environment and importing `swigfaiss_sve` dynamically
- Currently it only supports Linux, but as far as I know there is no SVE environment with a non-Linux OS now
NOTE: I plan to make one more PR that adds some SVE implementations after this PR is merged. This PR only adds the sve target.
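The runtime detection step could look roughly like the sketch below. It is an assumption-laden illustration, not the loader shipped in this PR: it assumes SVE shows up as an `sve` token on the `Features` line of `/proc/cpuinfo` on Linux arm64, and the helper names (`cpuinfo_has_sve`, `pick_swig_module`, `detect_module`) are hypothetical.

```python
import platform


def cpuinfo_has_sve(cpuinfo_text):
    """Return True if any 'Features' line of /proc/cpuinfo text lists 'sve'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features" and "sve" in value.split():
            return True
    return False


def pick_swig_module(machine, cpuinfo_text):
    """Choose which extension module name to import (names follow the PR text)."""
    if machine in ("aarch64", "arm64") and cpuinfo_has_sve(cpuinfo_text):
        return "swigfaiss_sve"
    return "swigfaiss"


def detect_module():
    # Linux only: if /proc/cpuinfo is unavailable, fall back to the
    # generic module.
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""
    return pick_swig_module(platform.machine(), text)
```

Keeping the parsing separate from the file read makes the decision testable on machines without SVE.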
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2886
Reviewed By: ramilbakhshyiev
Differential Revision: D60386983
Pulled By: mengdilin
fbshipit-source-id: 7e66162ee53ce88fbfb6636e7bf705b44e6c3282
Summary:
Add instructions to download arm64-specific conda dependencies and the cmake command, and run it on CI. This should prepare us to turn on CI with SVE optimization.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3653
Reviewed By: ramilbakhshyiev
Differential Revision: D60043435
Pulled By: mengdilin
fbshipit-source-id: d81bb1c1022681c3da8f98bbf080d5e1d65d6b80
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3688
Looks like our previous changes only modified the cpp API, not the c_api as the request wanted. This attempts to add faiss_get_version to c_api.
Reviewed By: ramilbakhshyiev
Differential Revision: D60207739
fbshipit-source-id: 07184aeae92a154bb3f440279595077f002851f3
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3687
We will write a warning if nbits is not specified while using the index factory with LSH. The warning lets users know that d will be used as the default nbits.
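A minimal Python sketch of that fallback, assuming the nbits portion of the factory string arrives as a possibly empty string of digits. `resolve_lsh_nbits` and the warning text are hypothetical; the actual change lives in the C++ index factory.

```python
import warnings


def resolve_lsh_nbits(d, nbits_text):
    """Return nbits for an LSH factory string; fall back to d with a warning."""
    if nbits_text:
        return int(nbits_text)
    warnings.warn(f"nbits not specified for LSH, defaulting to d={d}")
    return d
```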
Reviewed By: ramilbakhshyiev
Differential Revision: D60187935
fbshipit-source-id: 0fa960eeed615d857add77fa131a4cfa1989809d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3679
T195237796 claims we should be able to include nbits in the LSH factory string.
Their example is:
```
index = faiss.index_factory(128, 'LSH16rt')
```
Returns the following error.
```
faiss/index_factory.cpp:880: could not parse index string LSHrt_16
```
This is my first attempt at modifying the regex to accept an integer for nbits. Can an expert help me understand what the domain of accepted strings should be so I can modify the regex as necessary?
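For discussion, one possible grammar is `LSH`, then optional nbits digits, then an optional `r`, then an optional `t`. The Python sketch below assumes that grammar and assumes that `r` and `t` map to rotation and threshold training; the pattern, `parse_lsh`, and the returned field names are hypothetical and are not taken from `index_factory.cpp`.

```python
import re

# Assumed grammar: "LSH", optional nbits digits, optional "r", optional "t".
LSH_PATTERN = re.compile(r"LSH([0-9]*)(r?)(t?)")


def parse_lsh(description, d):
    """Parse an LSH factory string; nbits defaults to d when omitted."""
    m = LSH_PATTERN.fullmatch(description)
    if m is None:
        raise ValueError(f"could not parse index string {description}")
    nbits = int(m.group(1)) if m.group(1) else d
    return {
        "nbits": nbits,
        "rotate": m.group(2) == "r",
        "train_thresholds": m.group(3) == "t",
    }
```

Under this grammar `LSH16rt` parses, while a form with the number after the flags, such as `LSHrt_16`, is still rejected.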
Reviewed By: ramilbakhshyiev
Differential Revision: D60054776
fbshipit-source-id: e47074eb9986b7c1c702832fc0bf758f60f45290
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3635
Add a util function to return the version in the c api.
Reviewed By: ramilbakhshyiev, fxdawnn
Differential Revision: D59817407
fbshipit-source-id: ca805f8e04f554d0294ba9da8ec6dc7c31e91fe3