Summary:
This turns out to be very useful for testing faiss compilation in a fresh environment.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3851
Reviewed By: junjieqi
Differential Revision: D62655137
Pulled By: mengdilin
fbshipit-source-id: c1c2591e26bd20fb3783d3661a4096eaf31c4c08
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3855
Laser clients call RCQ search with a query size of 1, and the bulk of the overhead comes from IndexFlat add/search. With such a small query size, using IndexFlatL2 incurs many unnecessary copies into the index.
By default, we should fall back to the https://fburl.com/code/jpt236mz branch unless the client overrides `assign_index` with `assign_index_factory`.
Reviewed By: bshethmeta
Differential Revision: D62644305
fbshipit-source-id: 2434e2b238968304dc5b346637f8ca956e6bd548
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3849
https://github.com/facebookresearch/faiss/issues/3845
Add unit tests for the helper search utilities for HNSW. These utility functions live inside an anonymous namespace, and each has a reference version gated behind a const bool. I refactored them so the reference version is selected by a function flag that defaults to false.
If we are concerned about the performance overhead of the extra if branch (reference version or not) inside these utility functions, I'm happy to lift the reference versions out into their own functions inside the unit test.
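A hedged sketch of the flag pattern, translated to Python for brevity (the helper name, its arguments, and the data are hypothetical; the real helpers are C++ functions in an anonymous namespace in the HNSW code):
```
import numpy as np

def top_k_candidates(distances, k, use_reference=False):
    """Select the indices of the k smallest distances.

    use_reference defaults to False (optimized path); unit tests pass True
    to cross-check the optimized path against the straightforward one.
    """
    if use_reference:
        return np.argsort(distances)[:k]
    # optimized path: partial selection first, then sort only the k survivors
    top = np.argpartition(distances, k - 1)[:k]
    return top[np.argsort(distances[top])]

# unit-test style check: both paths must agree (distinct values, so no ties)
dist = np.random.permutation(1000).astype("float32")
assert np.array_equal(top_k_candidates(dist, 10),
                      top_k_candidates(dist, 10, use_reference=True))
```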
Reviewed By: junjieqi
Differential Revision: D62510014
fbshipit-source-id: b92ed4db69d125c7830da93946f1c2374fb87b08
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3848
Same as title.
The dataset can be referenced from blobstore.
Reviewed By: satymish
Differential Revision: D62476993
fbshipit-source-id: db2b4088ab6e02278b8b91194bf916fc476b79ec
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3812
Allows factory strings like `IVF3k,Flat` as a shorthand for 3072 centroids.
The main question is whether k and M should be metric (k = 1000) or powers of 2 (k = 1024):
* pro-metric: it is the standard,
* pro-power of 2: in practice we use powers of 2 most often.
Strictly speaking, the suffixes ki and Mi should be used for powers of 2, but that makes the notation heavier (which is what we wanted to avoid in the first place).
So I picked power of 2.
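For illustration, a minimal Python sketch of the shorthand (assuming a build that includes this change; d = 64 is arbitrary):
```
import faiss

d = 64
# "3k" is parsed with the power-of-2 convention: 3 * 1024 = 3072 centroids
index = faiss.index_factory(d, "IVF3k,Flat")
print(faiss.extract_index_ivf(index).nlist)  # expected: 3072
```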
Reviewed By: mnorris11
Differential Revision: D62019941
fbshipit-source-id: f547962625123ecdfaa406067781c77386017793
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3838
This fixes issue 3787 now that we do not install CUDA for ROCm builds.
Reviewed By: mengdilin
Differential Revision: D62283662
fbshipit-source-id: e5c736296b1c6d7b9a6b9f60161ffe3b5cb1c699
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3839
This is a prerequisite to fixing issue 3787 and an upgrade to a newer stable version.
Reviewed By: mengdilin
Differential Revision: D62284555
fbshipit-source-id: 946f7757eea36bdddc3f8bb7d8c16168e90fd063
Summary: ROCm does not require CUDA; this change stops installing it.
Reviewed By: mengdilin
Differential Revision: D62283602
fbshipit-source-id: 8fd0d770c5bd407b0c7bca7e92d754e05b5083da
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3836
This disables verbose output from apt-get and only outputs on errors to make the build output logs more readable.
Reviewed By: junjieqi
Differential Revision: D62278742
fbshipit-source-id: 524490ffd95fc1283f69797c0da57886e68206a6
Summary:
This change decreases the training time for a 1M x 768 dataset from 13 minutes to 10 minutes in our experiments.
Please verify and benchmark.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3822
Reviewed By: mdouze
Differential Revision: D62151489
Pulled By: kuarora
fbshipit-source-id: b29ffd0db615bd52187464b4665c31fc9d3b8d0a
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3807
This diff re-organizes the tests a bit:
* groups tests related to SWIG into a single file
* enables the doxygen test conditionally
* removes a dep on platform.Version that does not exist on some envs
* moves a few tests out of build_blocks to avoid it becoming a catch-all file for uncategorized tests
Reviewed By: asadoughi
Differential Revision: D61959750
fbshipit-source-id: fb6ada5b5a980ac07088254183d80b80ddfb1f45
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3827
There are a few fixes in this diff that allow us to execute search on an existing index without needing to compare it with ground truth. This is currently added only to knn search (not range search).
Reviewed By: satymish
Differential Revision: D61825365
fbshipit-source-id: ee1e39260ed3480ed32aeeb8d7232e975f56bbfa
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3833
Although `ksub` is a size_t (unsigned long int), the left shift that computes it overflows for shifts of more than 31 bits. So we can add a check right before it is constructed in ProductQuantizer.
Ramil talked to Matthijs and the outcome is that 16 bits isn't really practical, but we can do the check for 24 to be safe.
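A minimal sketch of what the check means for callers, assuming it surfaces as a RuntimeError in Python for nbits above 24 (the d and M values are arbitrary):
```
import faiss

d, M = 64, 8

# nbits within the allowed range constructs normally
pq = faiss.ProductQuantizer(d, M, 16)
print(pq.ksub)  # 2**16 = 65536 centroids per sub-quantizer

# with the new check, an oversized nbits is rejected up front instead of
# letting the shift overflow ksub
try:
    faiss.ProductQuantizer(d, M, 32)
except RuntimeError as e:
    print("rejected:", e)
```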
Reviewed By: kuarora, mengdilin
Differential Revision: D62153881
fbshipit-source-id: 721db6bf6ad5dd5d336b4498f4750acc4059310c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3834
Looks like ServiceLab does not handle any metric that is not an integer (https://fburl.com/code/chqi5hcr). The current experiments are erroring with this message (https://fburl.com/servicelab/99s69hbf):
```
ERROR:windtunnel.benchmarks.gbench_benchmark_runner.gbench_benchmark_runner:exception occurred while processing benchmark (this usually means a benchmark is misconfigured) faiss/perf_tests/scalar_quantizer_distance_test1/bench_scalar_quantizer_distance:QT_bf16_128d_2000n/iterations invalid literal for int() with base 10: 'OPTIMIZE AVX2 '
Traceback (most recent call last):
File "windtunnel/benchmarks/gbench_benchmark_runner/gbench_benchmark_runner.py", line 116, in parse_gbench_results
int(entry[metric_name]),
ValueError: invalid literal for int() with base 10: 'OPTIMIZE AVX2 '
```
Removing the label that's causing the failure. We can track the optimization mode via experiment names, or a dummy counter name in the future as I roll out multi-platform experiments.
Reviewed By: junjieqi
Differential Revision: D62166280
fbshipit-source-id: 6de70c945cf5058feb479e6dd501d7e84d08ef83
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3825
- previously, there was one benchmark that measured the reconstruction error and then the performance of the distance computation for the scalar quantizer. Split it up into a distance benchmark and an accuracy benchmark.
- add performance benchmarks for encode and decode as well
- refactor the benchmarks to accept `n` and `d` as command line arguments. We run the benchmarks with `n` = 2000 and `d` = 128 to start. Happy to expand to `d` = 256 and a higher `n` if we think it's better.
- refactor the targets file so we can create servicelab experiments based on different parameters
Planning to use the benchmarks here to test my simd refactor changes (and expand the benchmarks when necessary).
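The benchmarks themselves are gbench-based; purely as an illustration of what the encode/decode timings cover, here is a hedged Python sketch with the same defaults (`n` = 2000, `d` = 128) and one of the quantizer types (QT_bf16):
```
import time
import numpy as np
import faiss

n, d = 2000, 128
x = np.random.rand(n, d).astype("float32")

sq = faiss.ScalarQuantizer(d, faiss.ScalarQuantizer.QT_bf16)
sq.train(x)

t0 = time.perf_counter()
codes = sq.compute_codes(x)   # encode
t1 = time.perf_counter()
xrec = sq.decode(codes)       # decode
t2 = time.perf_counter()

print(f"encode: {(t1 - t0) * 1e3:.2f} ms, decode: {(t2 - t1) * 1e3:.2f} ms")
print("reconstruction error:", np.linalg.norm(x - xrec))
```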
Reviewed By: mnorris11
Differential Revision: D62049857
fbshipit-source-id: 7e4cbfe27af6da09616b2e7c82d77480c8ddecd6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3824
We currently have a 6bit quantizer benchmark running on servicelab during difftime.
This diff creates the `faiss/perf_tests` directory, which will host all the benchmarks that run as performance regression tests. It generalizes `bench_6bit_codec` to work for all quantizer types.
I am not deleting the existing `bench_6bit_codec` experiments yet because I want to play around with parameters: {num_trials, num_iterations, strobelight_profiling} in the new sets of experiments to see which combinations provide the least amount of noise. For now, these experiments will not run on diffs.
Reviewed By: mnorris11
Differential Revision: D62033843
fbshipit-source-id: d59a67db6c68d8830b1bc8b4d95e2fe8262e8c4d
Summary:
I noticed that, by default, `conda install openblas` installs `libopenblas-pthreads` on our SVE CI. This can be problematic as described in https://github.com/facebookresearch/faiss/wiki/Troubleshooting#surprising-faiss-openmp-and-openblas-interaction
Updating the openblas installation to be more specific and use the variant that works well with openmp.
This selects version `0.3.27-openmp_h1b0c31a_0` for openblas instead of the `pthread` variant.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3776
Reviewed By: ramilbakhshyiev
Differential Revision: D61856775
Pulled By: mengdilin
fbshipit-source-id: 950bd68ba438d221b39d25b2d6e185bc61512243
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3808
Previously, we only uploaded the results when the tests passed. It is useful to also upload the results of failing tests for debugging.
Reviewed By: ramilbakhshyiev
Differential Revision: D61972433
fbshipit-source-id: 1825e6ebea56279e4d8d8c6480841985c1626674
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3817
`nullptr` is typesafe. `0` and `NULL` are not. We are interested in enabling `-Wzero-as-null-pointer-constant` for first-party code, and only `nullptr` will be allowed.
Reviewed By: zertosh
Differential Revision: D62083883
fbshipit-source-id: 9f19434ff72faa1444dec72ec182a41398b575fe
Summary:
`torch.load` without the `weights_only` parameter is unsafe. Explicitly set `weights_only` to False only if you trust the data you load and full pickle functionality is needed; otherwise, set `weights_only=True`.
For cases where `weights_only=True` doesn't work, an explicit `weights_only=False` should be used.
Found with https://github.com/pytorch-labs/torchfix/
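A minimal sketch of the safer pattern (the checkpoint path and contents are just for illustration):
```
import torch

torch.save({"w": torch.zeros(3)}, "checkpoint.pt")

# safe default: only tensor / state-dict data is unpickled
state = torch.load("checkpoint.pt", map_location="cpu", weights_only=True)

# only if the file is fully trusted and full pickle functionality is needed:
# state = torch.load("checkpoint.pt", map_location="cpu", weights_only=False)
```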
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3796
Reviewed By: asadoughi
Differential Revision: D61824340
Pulled By: kit1980
fbshipit-source-id: bc013d06d4f368f730ffee6898e75fd0b0ff1d40
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3809
test_IndexIVFRQ is failing in the nightly GPU w/RAFT builds. These are false-positive failures for the edge cases where the distance of a query vector to two candidate vectors is identical. In such cases, search and search_and_reconstruct return these candidate vectors in a different order (on GPU w/RAFT).
The current fix is to ignore the order of such ids with identical distances by sorting on distance and using the id as a tiebreaker. The performance impact is negligible (5 ms extra), given that the test candidate size is low.
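A minimal sketch of this order-insensitive comparison idea (not the test's actual code; the toy arrays below just swap two equal-distance neighbors):
```
import numpy as np

def sort_by_distance_then_id(D, I):
    # primary key: distance, tiebreaker: id, applied row by row
    Ds, Is = np.empty_like(D), np.empty_like(I)
    for q in range(D.shape[0]):
        order = np.lexsort((I[q], D[q]))
        Ds[q], Is[q] = D[q][order], I[q][order]
    return Ds, Is

D1 = np.array([[0.1, 0.5, 0.5]]); I1 = np.array([[7, 3, 9]])
D2 = np.array([[0.1, 0.5, 0.5]]); I2 = np.array([[7, 9, 3]])

np.testing.assert_array_equal(sort_by_distance_then_id(D1, I1)[1],
                              sort_by_distance_then_id(D2, I2)[1])
```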
Reviewed By: mengdilin, ramilbakhshyiev
Differential Revision: D61977915
fbshipit-source-id: 5f7a81c51c91a967013fb5e69e2bfee59be341c7
Summary:
A collection of small tweaks that might be needed in the future.
* Adds another statistic for HNSW called `nhops`, which is the number of different nodes visited during a single query search (see the sketch after this list)
* AVX-512 code for `PQDecoder8`
* Binary heap utilities for data stored in `std::pair<dis,ids>` format
* Completely removes `HNSW::upper_beam`
* AVX-512 code for `HNSW::MinimaxHeap::pop_min`
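A hedged sketch of reading the new counter from Python, assuming `nhops` is exposed on the global `hnsw_stats` object like the existing counters:
```
import numpy as np
import faiss

d = 32
xb = np.random.rand(1000, d).astype("float32")
xq = np.random.rand(1, d).astype("float32")

index = faiss.IndexHNSWFlat(d, 16)
index.add(xb)

faiss.cvar.hnsw_stats.reset()
index.search(xq, 5)
print("nodes hopped through for this query:", faiss.cvar.hnsw_stats.nhops)
```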
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3692
Test Plan: buck test //faiss/tests/:test_graph_based
Reviewed By: mnorris11
Differential Revision: D61210752
Pulled By: mdouze
fbshipit-source-id: dea78c04a1b30db2885ecb67235de5099b5b3476
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3806
There is no explanation yet for why this works, but importing faiss prior to torch causes Python to crash on certain test cases under pytest. This is a temporary workaround; a full RCA is still required.
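The workaround is just the import order in the affected test modules; as a general pattern:
```
# temporary workaround: import torch before faiss so the tests do not crash
import torch  # must come first
import faiss
```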
Reviewed By: pankajsingh88
Differential Revision: D61956515
fbshipit-source-id: d33992d0b92e87d58ddff1f160990cba487b9927
Summary: It was not even training. So, changing the default to run training.
Reviewed By: mengdilin
Differential Revision: D61884150
fbshipit-source-id: 182fbff69f223dbf8efb8fbd056279901c311d3a
Summary:
Adds a `-DFAISS_USE_LTO=ON` option to CMake to enable LTO. LTO increases linking time but potentially provides a small speed boost to the whole library.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2943
Reviewed By: mengdilin
Differential Revision: D61868553
Pulled By: junjieqi
fbshipit-source-id: f07ade6fdaaa337876f28b9d06bdc5629cc486b0
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3800
In this diff,
1. A codec can be referenced either by its desc name or by a remote path in IndexFromCodec.
2. Expose serialization of the full index through BuildOperator.
3. Rename get_local_filename to get_local_filepath.
Reviewed By: satymish
Differential Revision: D61813717
fbshipit-source-id: ed422751a1d3712565efa87ecf615620799cb8eb
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3786
The ROCm build successfully passes all but 2 GPU tests, and we want to enable the passing tests on CI while skipping the 2 failing ones to make progress. The 2 failing tests fail specifically on the hardware type that we use for our runners; the AMD team is actively working on root-causing the issue and providing a fix:
`TestGpuIndexIVFPQ.Query_L2_MMCodeDistance`
`TestGpuIndexIVFPQ.Query_IP_MMCodeDistance`
Reviewed By: asadoughi
Differential Revision: D61688657
fbshipit-source-id: 3fedfcf22a0ccf40ac8aff033e8bc09c4eb0cbd5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3785
Right now, when avx512 is turned on, we only return AVX2 in the options. My understanding is that turning on avx512 sets both of the macros `__AVX2__` and `__AVX512F__`: https://fburl.com/vgh7jg9p
Reviewed By: asadoughi
Differential Revision: D61674490
fbshipit-source-id: 47292025b4eb5ef5907c4fbb0bbf39259129f6ee
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3772
It looks like there are many failures on the retry build workflow, but these are mainly due to retry attempts with the --failed flag being unable to rerun workflows that don't have any failed jobs.
Reviewed By: kuarora, junjieqi, ramilbakhshyiev
Differential Revision: D61489426
fbshipit-source-id: 6dcef6ba422634bb333e44a5b12c74c5d3b3df8f
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3757
In the telemetry wrapper, we need to wrap read_index to return wrapped index structs. D61049751
This read_index wrapper calls several static functions, which are not callable outside their C++ file. This diff therefore changes them to non-static and declares them in the header file, so the wrapper is able to call them.
Reviewed By: asadoughi
Differential Revision: D61282004
fbshipit-source-id: 2c8b2ded169577aa6eecdf1edc7483b0ef5f0665
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3732
AVX512 has been running on GhA for some days without issues. Deleting the CircleCI config. Will press the "deprecate CircleCI button" in 1-2 more weeks. I want to wait a little longer just in case anything goes wrong for AVX512 on GhA.
Reviewed By: junjieqi, ramilbakhshyiev
Differential Revision: D60914370
fbshipit-source-id: 5bb09e81c3f5cd1a58525fe633d07373884207d4
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3777
The openblas version was bumped from 0.3.27 to 0.3.28 in the last 3 days, which caused the test below to fail. Confirmed with algoriddle that bumping nprobe is okay to do.
Reviewed By: algoriddle
Differential Revision: D61536541
fbshipit-source-id: 1e83f75011517ba7b856520f11526e72a00494a5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3761
This fixes CUDA errors inside faiss in the test environment. If torch is loaded first (this change), then both torch and faiss see all GPUs available on the machine in the ROCm build. Without this change, torch sees the GPUs and faiss does not. The AMD team is looking into the root cause, but we wanted to fix this for now.
Reviewed By: junjieqi, mnorris11
Differential Revision: D61358018
fbshipit-source-id: ac59be99817ef13d37a1676f615585f44eabaf24
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3760
This fixes the memory leak and the warning received after running Python tests under ROCm; no destructor was declared, so objects would remain allocated.
Reviewed By: gtwang01
Differential Revision: D61357579
fbshipit-source-id: cf73bbd7a7002565a4224c1f0af0aa6ea5edebdb
Summary:
Several changes:
1. Introduce `ClusteringParameters::check_input_data_for_NaNs`, which allows suppressing the check for NaN values in the input data.
2. Introduce `ClusteringParameters::use_faster_subsampling`, which uses a newly added SplitMix64-based rng (`SplitMix64RandomGenerator`) and may also pick duplicate points from the original input dataset. Surprisingly, `rand_perm()` can have a noticeable cost in certain scenarios.
3. Negative values for `ClusteringParameters::seed` initialize the internal clustering rng from a high-resolution clock each time, making the clustering procedure pick different subsamples on every run. I decided not to use `std::random_device` in order to avoid possible negative effects.
Useful for future `ProductResidualQuantizer` improvements.
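A minimal Python sketch of how these options could be set, assuming the new fields are exposed through the SWIG bindings like the existing ClusteringParameters fields:
```
import numpy as np
import faiss

d, k, n = 32, 64, 10000
x = np.random.rand(n, d).astype("float32")

cp = faiss.ClusteringParameters()
cp.check_input_data_for_NaNs = False  # skip the NaN scan over the input
cp.use_faster_subsampling = True      # SplitMix64-based subsampling
cp.seed = -1                          # re-seed from the clock on every run

clus = faiss.Clustering(d, k, cp)
clus.train(x, faiss.IndexFlatL2(d))
print(faiss.vector_to_array(clus.centroids).reshape(k, d).shape)
```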
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3731
Reviewed By: asadoughi
Differential Revision: D61106105
Pulled By: mnorris11
fbshipit-source-id: 072ab2f5ce4f82f9cf49d678122f65d1c08ce596
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3748
So we can dynamically change it.
Reviewed By: asadoughi
Differential Revision: D61029191
fbshipit-source-id: 19a6775c1218762dac7a7805e13efab9bb43cfa5
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3747
This change converts the ROCm build to run inside containers and updates it to run on AMD GPU based runners. Still working with the AMD team to resolve test failures before enabling those tests.
Differential Revision: D61049115
fbshipit-source-id: 28274e0bde795f99b3d78711beaf9b3ed3c5e66c
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3744
gpg is needed for ROCm builds but does not come with containerized builds. This change adds installation of gpg.
Reviewed By: junjieqi
Differential Revision: D61007840
fbshipit-source-id: 6322112803866dff57637bea290dc032e2bf41ad