1432 Commits

Author SHA1 Message Date
Alexandr Guzhva
d104275a2c Add a dockerfile for development (#3851)
Summary:
It turns out to be very useful for testing faiss compilation in a fresh environment

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3851

Reviewed By: junjieqi

Differential Revision: D62655137

Pulled By: mengdilin

fbshipit-source-id: c1c2591e26bd20fb3783d3661a4096eaf31c4c08
2024-09-13 12:42:19 -07:00
Mengdi Lin
dbdd63bce5 assign_index should default to null (#3855)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3855

Laser clients call RCQ search with a query size of 1, and the bulk of the overhead comes from IndexFlat add/search. With such a small query size, going through IndexFlatL2 makes many unnecessary copies into the IndexFlatL2.

By default, we should fall back to https://fburl.com/code/jpt236mz branch unless the client overrides `assign_index` with `assign_index_factory`.

Reviewed By: bshethmeta

Differential Revision: D62644305

fbshipit-source-id: 2434e2b238968304dc5b346637f8ca956e6bd548
2024-09-13 10:53:32 -07:00
Mengdi Lin
52ce3f55ae add hnsw unit test for PR 3840 (#3849)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3849

https://github.com/facebookresearch/faiss/issues/3845

Add unit tests for helper search utilities for HNSW. These utility functions live inside an anonymous namespace and each has a reference version gated behind a const bool. I refactored them so that the reference version is selected by a function flag that defaults to false.

If we are concerned about the performance overhead of the extra if branching (whether to use the reference version or not) inside these utility functions, I'm happy to lift the reference versions out into their own functions inside the unit test.
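
A minimal Python sketch of the pattern described above, for illustration only. The real utilities are C++ functions inside faiss; `count_below` and `check_against_reference` are hypothetical names and not part of the library. The flag defaults to false, so ordinary callers take the fast path and only the unit test asks for the reference path.

```python
def count_below(values, threshold, reference_version=False):
    """Count entries strictly below threshold (hypothetical helper).

    The flag defaults to False, so normal callers never pay for the
    reference implementation beyond a single branch.
    """
    if reference_version:
        # Plain loop kept only as a correctness oracle for tests.
        count = 0
        for v in values:
            if v < threshold:
                count += 1
        return count
    # Stand-in for the optimized path.
    return sum(1 for v in values if v < threshold)


def check_against_reference(values, threshold):
    """Return True when both paths agree, as a unit test would assert."""
    fast = count_below(values, threshold)
    ref = count_below(values, threshold, reference_version=True)
    return fast == ref
```

Lifting the reference path into the test file, as the message offers, would remove the branch from the production function at the cost of keeping two copies of the signature in sync.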

Reviewed By: junjieqi

Differential Revision: D62510014

fbshipit-source-id: b92ed4db69d125c7830da93946f1c2374fb87b08
2024-09-12 13:54:54 -07:00
Kumar Saurabh Arora
a166e13a25 Adding bucket/path (blobstore) in dataset descriptor (#3848)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3848

Same as title.
The dataset can be referenced from blobstore.

Reviewed By: satymish

Differential Revision: D62476993

fbshipit-source-id: db2b4088ab6e02278b8b91194bf916fc476b79ec
2024-09-11 20:01:04 -07:00
Matthijs Douze
d85fda7fca Allow k and M suffixes in IVF indexes (#3812)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3812

Allows factory strings like `IVF3k,Flat` as a shorthand for 3072 centroids.

The main question is whether k or M should be metric (k=1000) or power of 2 (k=1024):

* pro-metric: standard,

* pro-power of 2: in practice we use powers of 2 most often

The suffixes ki and Mi should be used for powers of 2, but this makes the notation heavier (which is what we wanted to avoid in the first place).

So I picked power of 2.
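
A small Python sketch of the power-of-2 reading of the suffixes, assuming `k` means 1024 and `M` means 1024 * 1024 as chosen above. `parse_nlist` is a hypothetical helper written for illustration; it is not the parser used by the faiss index factory.

```python
import re

# Power-of-two multipliers, following the choice made in the commit message.
_SUFFIX = {"": 1, "k": 1024, "M": 1024 * 1024}


def parse_nlist(token):
    """Parse a centroid count such as '3k', '1M' or '4096' (hypothetical)."""
    match = re.fullmatch(r"(\d+)([kM]?)", token)
    if match is None:
        raise ValueError(f"not a valid count: {token!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2)]
```

With this reading, `parse_nlist("3k")` gives 3072, matching the `IVF3k,Flat` example in the message.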

Reviewed By: mnorris11

Differential Revision: D62019941

fbshipit-source-id: f547962625123ecdfaa406067781c77386017793
2024-09-10 02:57:13 -07:00
Kumar Saurabh Arora
6fe4640d5c Fixing headers as per OSS requirement (#3847)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3847

same as title.
Fixing headers as raised in task - P1558157110

Reviewed By: junjieqi

Differential Revision: D62408917

fbshipit-source-id: 652b55dd2ba9617edeb2b05172be0f42291d7035
2024-09-09 22:58:05 -07:00
Alexandr Guzhva
21dfdbaaa0 Fix an incorrectly counted number of computed distances for HNSW (#3840)
Summary:
https://github.com/facebookresearch/faiss/issues/3819

A definite bug in my code from the past.
The number of reported distances is higher than the number of distances actually computed.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3840

Reviewed By: junjieqi

Differential Revision: D62392577

Pulled By: mengdilin

fbshipit-source-id: c4d595849cc95e77eb6cd8d499b3128bbe9a6e13
2024-09-09 12:59:16 -07:00
Ramil Bakhshyiev
3f41161cae Re-enable Query_L2_MMCodeDistance and Query_IP_MMCodeDistance tests for ROCm (#3838)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3838

This fixes 3787 now that we do not install CUDA for ROCm builds.

Reviewed By: mengdilin

Differential Revision: D62283662

fbshipit-source-id: e5c736296b1c6d7b9a6b9f60161ffe3b5cb1c699
2024-09-06 14:09:10 -07:00
Ramil Bakhshyiev
753833c529 Upgrade to ROCm 6.2 (#3839)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3839

This is a prerequisite to fixing issue 3787 and an upgrade to a newer stable version.

Reviewed By: mengdilin

Differential Revision: D62284555

fbshipit-source-id: 946f7757eea36bdddc3f8bb7d8c16168e90fd063
2024-09-06 14:09:10 -07:00
Ramil Bakhshyiev
736cd4d984 Do not unnecessarily install CUDA for ROCm
Summary: ROCm does not require CUDA, this change stops installing it.

Reviewed By: mengdilin

Differential Revision: D62283602

fbshipit-source-id: 8fd0d770c5bd407b0c7bca7e92d754e05b5083da
2024-09-06 10:47:22 -07:00
Ramil Bakhshyiev
18bc38a0e4 Quiet down apt-get on ROCm builds (#3836)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3836

This disables verbose output from apt-get and only outputs on errors to make the build output logs more readable.

Reviewed By: junjieqi

Differential Revision: D62278742

fbshipit-source-id: 524490ffd95fc1283f69797c0da57886e68206a6
2024-09-06 10:42:20 -07:00
Alexandr Guzhva
e261725039 faster hnsw CPU index training (#3822)
Summary:
This change decreases the training time for a 1M x 768 dataset from 13 minutes to 10 minutes in our experiments.

Please verify and benchmark.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3822

Reviewed By: mdouze

Differential Revision: D62151489

Pulled By: kuarora

fbshipit-source-id: b29ffd0db615bd52187464b4665c31fc9d3b8d0a
2024-09-05 14:04:34 -07:00
Matthijs Douze
52cf9af18a group SWIG tests into one file (#3807)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3807

This diff re-organizes the tests a bit:

* groups tests related to SWIG into a single file

* enable doxygen test conditionally

* removes a dep on platform.Version that does not exist on some envs

* move a few tests out of build_blocks to avoid it being a catch-all file for uncategorized tests

Reviewed By: asadoughi

Differential Revision: D61959750

fbshipit-source-id: fb6ada5b5a980ac07088254183d80b80ddfb1f45
2024-09-05 09:43:31 -07:00
Kumar Saurabh Arora
202a204bd8 Allow search Index without Gt (#3827)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3827

There are a few fixes in this diff which allow us to execute search on an existing index without needing to compare it with ground truth. This has currently been added only to knn search (not range).

Reviewed By: satymish

Differential Revision: D61825365

fbshipit-source-id: ee1e39260ed3480ed32aeeb8d7232e975f56bbfa
2024-09-04 22:17:02 -07:00
Michael Norris
a4ebcb111e Add error for overflowing nbits during PQ construction (#3833)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3833

size_t expands to unsigned long int, so any left shift more than 31 means ksub overflows. So, we can add a check right before it is constructed in ProductQuantizer.

Ramil talked to Matthijs and the outcome is that 16 bits isn't really practical, but we can do the check for 24 to be safe.
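
A Python sketch of the kind of guard described above, for illustration. The limit of 24 comes from the message; treating it as inclusive is an assumption, and `compute_ksub` is a hypothetical name rather than the actual check in `ProductQuantizer`.

```python
MAX_NBITS = 24  # limit quoted in the commit message; inclusive bound is assumed


def compute_ksub(nbits):
    """Return the number of centroids per sub-quantizer, 2**nbits.

    Rejects values outside (0, MAX_NBITS] before shifting, which is where
    a fixed-width integer in C++ could overflow.
    """
    if not 0 < nbits <= MAX_NBITS:
        raise ValueError(f"nbits must be in 1..{MAX_NBITS}, got {nbits}")
    return 1 << nbits
```

Python integers do not overflow, so the sketch only shows where the check sits relative to the shift, not the overflow itself.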

Reviewed By: kuarora, mengdilin

Differential Revision: D62153881

fbshipit-source-id: 721db6bf6ad5dd5d336b4498f4750acc4059310c
2024-09-04 14:16:17 -07:00
Mengdi Lin
1cafc711e4 remove compile options label from gbench (#3834)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3834

Looks like ServiceLab does not handle any metric that is not integer: https://fburl.com/code/chqi5hcr The current experiments are erroring with the message https://fburl.com/servicelab/99s69hbf:
```
ERROR:windtunnel.benchmarks.gbench_benchmark_runner.gbench_benchmark_runner:exception occurred while processing benchmark (this usually means a benchmark is misconfigured) faiss/perf_tests/scalar_quantizer_distance_test1/bench_scalar_quantizer_distance:QT_bf16_128d_2000n/iterations invalid literal for int() with base 10: 'OPTIMIZE AVX2 '
Traceback (most recent call last):
  File "windtunnel/benchmarks/gbench_benchmark_runner/gbench_benchmark_runner.py", line 116, in parse_gbench_results
    int(entry[metric_name]),
ValueError: invalid literal for int() with base 10: 'OPTIMIZE AVX2 '
```

Removing the label that's causing the failure. We can track the optimization mode via experiment names, or a dummy counter name in the future as I roll out multi-platform experiments.
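
The failure in the traceback can be reproduced in plain Python: `int()` rejects a string label such as `'OPTIMIZE AVX2 '`. The sketch below shows one defensive way a consumer could separate integer metrics from everything else. `parse_int_metrics` is a hypothetical helper and is not the ServiceLab runner's code.

```python
def parse_int_metrics(entry):
    """Split a result entry into integer metrics and skipped field names.

    Hypothetical helper: values that int() cannot convert are reported
    in `skipped` instead of aborting the whole run.
    """
    parsed, skipped = {}, []
    for name, value in entry.items():
        try:
            parsed[name] = int(value)
        except (TypeError, ValueError):
            skipped.append(name)
    return parsed, skipped
```

For example, an entry with `"iterations": "1000"` and `"label": "OPTIMIZE AVX2 "` yields `{"iterations": 1000}` as parsed metrics and `["label"]` as skipped names.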

Reviewed By: junjieqi

Differential Revision: D62166280

fbshipit-source-id: 6de70c945cf5058feb479e6dd501d7e84d08ef83
2024-09-04 13:53:09 -07:00
Bhavik Sheth
d296b2ce01 Prevent reordering of imports by auto formatter to avoid crashes (#3826)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3826

Apparently this is the generally accepted way to do this.

https://usort.readthedocs.io/en/stable/guide.html#import-blocks

What do you think?

Reviewed By: kuarora, mengdilin

Differential Revision: D62147522

fbshipit-source-id: b9cf034ff6119595956a3c46e7094a2a8b0cb2cc
2024-09-03 18:24:48 -07:00
Mengdi Lin
501a8be55c more refactor and add encode/decode steps to benchmark (#3825)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3825

- previously, there was 1 benchmark that measured the reconstruction error and then the performance of the distance computation for the scalar quantizer. Split it up into a distance benchmark and an accuracy benchmark.
- add performance benchmarks for encode and decode as well
- refactor the benchmarks to accept `n` and `d` as command line arguments. We run the benchmarks with `n` = 2000 and `d` = 128 to start. Happy to expand it to `d` = 256 and a higher `n` if we think it's better.
- refactor the targets file so we can create servicelab experiments based on different parameters

Planning to use the benchmarks here to test my simd refactor changes (and expand the benchmarks when necessary).

Reviewed By: mnorris11

Differential Revision: D62049857

fbshipit-source-id: 7e4cbfe27af6da09616b2e7c82d77480c8ddecd6
2024-09-03 13:56:59 -07:00
Mengdi Lin
4683cc119f create perf_tests directory and onboard all scalar quantizer types to benchmark (#3824)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3824

We currently have a 6bit quantizer benchmark running on servicelab during difftime.

This diff creates `faiss/perf_tests` directory which will host all the benchmarks that run as performance regression tests. It generalizes `bench_6bit_codec` to work for all quantizer types.

I am not deleting the existing `bench_6bit_codec` experiments yet because I want to play around with parameters: {num_trials, num_iterations, strobelight_profiling} in the new sets of experiments to see which combinations provide the least amount of noise. For now, these experiments will not run on diffs.

Reviewed By: mnorris11

Differential Revision: D62033843

fbshipit-source-id: d59a67db6c68d8830b1bc8b4d95e2fe8262e8c4d
2024-09-03 13:16:33 -07:00
mengdilin
5e614503dd Build SVE CI with openblas that was compiled with USE_OPENMP=1 (#3776)
Summary:
I noticed by default, conda install openblas installs `libopenblas-pthreads` on our SVE CI. This can be problematic as described in https://github.com/facebookresearch/faiss/wiki/Troubleshooting#surprising-faiss-openmp-and-openblas-interaction

Updating installation of openblas to be more specific and use the version that works well with openmp.

The CI now sees version `0.3.27-openmp_h1b0c31a_0` for openblas instead of the `pthreads` build.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3776

Reviewed By: ramilbakhshyiev

Differential Revision: D61856775

Pulled By: mengdilin

fbshipit-source-id: 950bd68ba438d221b39d25b2d6e185bc61512243
2024-09-03 13:01:51 -07:00
Mengdi Lin
ca1ab78ea4 always upload pytest results (#3808)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3808

Previously, we only uploaded the results when the tests passed. It is useful to also upload the results of failing tests for debugging.

Reviewed By: ramilbakhshyiev

Differential Revision: D61972433

fbshipit-source-id: 1825e6ebea56279e4d8d8c6480841985c1626674
2024-09-03 10:17:17 -07:00
David Tolnay
c418b30f75 Fix deprecated use of 0/NULL (#3817)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3817

`nullptr` is typesafe. `0` and `NULL` are not. We are interested in enabling `-Wzero-as-null-pointer-constant` for first-party code, and only `nullptr` will be allowed.

Reviewed By: zertosh

Differential Revision: D62083883

fbshipit-source-id: 9f19434ff72faa1444dec72ec182a41398b575fe
2024-09-01 14:59:10 -07:00
Sergii Dymchenko
383b5d908c Use weights_only for load (#3796)
Summary:
`torch.load` without `weights_only` parameter is unsafe. Explicitly set `weights_only` to False only if you trust the data you load and full pickle functionality is needed, otherwise set `weights_only=True`.

If `weights_only=True` doesn't work for some cases, then explicit `weights_only=False` should be used.

Found with https://github.com/pytorch-labs/torchfix/

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3796

Reviewed By: asadoughi

Differential Revision: D61824340

Pulled By: kit1980

fbshipit-source-id: bc013d06d4f368f730ffee6898e75fd0b0ff1d40
2024-08-30 12:01:55 -07:00
Pankaj Singh
95e0a667ff Nightly failure fix - ignore searched vectors with identical distances (#3809)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3809

test_IndexIVFRQ is failing in nightly GPU w/RAFT builds. These are false positive failures for the edge cases when the distance of a query vector to two candidate vectors is identical. In such cases, search and search_and_reconstruct return these candidate vectors in a different order (on GPU w/RAFT).

The current fix is to ignore the order of such ids with identical distances by sorting on distance and using id as a tiebreaker. The performance implication is negligible (5ms extra), given that the test candidate size is low.
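
A minimal Python sketch of the tiebreaking idea, assuming the results for one query arrive as parallel lists of distances and ids. `canonical_order` is a hypothetical helper, not the code used in the faiss test.

```python
def canonical_order(distances, ids):
    """Sort one query's results by distance, breaking ties by id.

    Tuples compare element by element, so sorting (distance, id) pairs
    orders by distance first and by id only when distances are equal.
    """
    pairs = sorted(zip(distances, ids))
    return [d for d, _ in pairs], [i for _, i in pairs]
```

Two result lists that differ only in the order of equal-distance candidates become identical after this step, so a test can compare them directly.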

Reviewed By: mengdilin, ramilbakhshyiev

Differential Revision: D61977915

fbshipit-source-id: 5f7a81c51c91a967013fb5e69e2bfee59be341c7
2024-08-29 11:00:05 -07:00
Alexandr Guzhva
97e6f48ffd Some small improvements. (#3692)
Summary:
A collection of small tweaks that might be needed in the future.
* Adds another statistic for HNSW called `nhops`, which is the number of different nodes visited during a single query search
* AVX-512 code for `PQDecoder8`
* Binary heap utilities for data stored in `std::pair<dis,ids>` format
* Completely remove `HNSW::upper_beam`
* AVX-512 code for `HNSW::MinimaxHeap::pop_min`

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3692

Test Plan: buck test //faiss/tests/:test_graph_based

Reviewed By: mnorris11

Differential Revision: D61210752

Pulled By: mdouze

fbshipit-source-id: dea78c04a1b30db2885ecb67235de5099b5b3476
2024-08-29 03:11:14 -07:00
Ramil Bakhshyiev
a5ad7148fc Re-order imports in Python tests to avoid crashing (#3806)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3806

There is no explanation yet for why this works, but importing faiss prior to torch causes Python to crash on certain test cases under pytest. This is a temporary workaround; a full RCA is still required.

Reviewed By: pankajsingh88

Differential Revision: D61956515

fbshipit-source-id: d33992d0b92e87d58ddff1f160990cba487b9927
2024-08-29 01:06:34 -07:00
Kumar Saurabh Arora
145e93d09a Fix bench_fw_codec
Summary: It was not even training, so this changes the default to run training.

Reviewed By: mengdilin

Differential Revision: D61884150

fbshipit-source-id: 182fbff69f223dbf8efb8fbd056279901c311d3a
2024-08-27 17:37:50 -07:00
Alexandr Guzhva
4283e5bf4c Add standalone Link-Time Optimization option to CMake (#2943)
Summary:
Adds `-DFAISS_USE_LTO=ON` option to CMake to enable LTO. LTO increases the linking time, but potentially provides a small boost to the whole library.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2943

Reviewed By: mengdilin

Differential Revision: D61868553

Pulled By: junjieqi

fbshipit-source-id: f07ade6fdaaa337876f28b9d06bdc5629cc486b0
2024-08-27 13:36:53 -07:00
Kumar Saurabh Arora
37f6b76fe1 Adding support for index builder (#3800)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3800

In this diff,
1. the codec can be referenced by either desc name or remote path in IndexFromCodec
2. expose serialization of full index through BuildOperator
3. Rename get_local_filename to get_local_filepath.

Reviewed By: satymish

Differential Revision: D61813717

fbshipit-source-id: ed422751a1d3712565efa87ecf615620799cb8eb
2024-08-27 10:02:15 -07:00
Sergii Dymchenko
084496a035 Fix parameter names in docstrings (#3795)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3795

Reviewed By: bshethmeta

Differential Revision: D61817514

Pulled By: junjieqi

fbshipit-source-id: a1b06825b9e4d5a38bd3d800c1e540a8298c80eb
2024-08-26 18:08:57 -07:00
Mengdi Lin
3614cc7d47 avx512 compilation option (#3798)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3798

Alexander left a comment on the previous PR: https://github.com/facebookresearch/faiss/pull/3785#issuecomment-2305864630. The contract for the function seems to be that it will only append a single compilation option, not a list of options. Fixing it to comply with the contract.
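
A Python sketch of the "single option" contract, for illustration. It assumes that an AVX-512 build also defines `__AVX2__`, so the widest level must be checked first. `simd_option` and its labels are hypothetical; the real function is C++ and its exact strings are not reproduced here.

```python
def simd_option(defined_macros):
    """Return exactly one label for the widest SIMD level enabled.

    Hypothetical model of the contract: levels are checked from widest
    to narrowest, and the function returns on the first match.
    """
    for macro, label in (("__AVX512F__", "AVX512"), ("__AVX2__", "AVX2")):
        if macro in defined_macros:
            return label
    return "GENERIC"
```

Checking AVX2 first would report `AVX2` for an AVX-512 build, and appending on every match would return two options, which are the two behaviors this pair of fixes moves away from.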

Reviewed By: asadoughi, ramilbakhshyiev

Differential Revision: D61803839

fbshipit-source-id: 948a3d636f6dd6b5c4f975d236c19923af2bbd18
2024-08-26 13:40:00 -07:00
Mengdi Lin
4ca67340ea add AMD_ROCM as part of get_compile_options (#3790)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3790

Comment from ramilbakhshyiev https://www.internalfb.com/diff/D61674490?dst_version_fbid=359297527230868&transaction_fbid=8025001410919172

Reviewed By: ramilbakhshyiev

Differential Revision: D61680186

fbshipit-source-id: 2b6d5803e620b36878b669e617253c875562c30f
2024-08-23 11:08:39 -07:00
Ramil Bakhshyiev
58a673d938 Enable most of C++ tests on ROCm (#3786)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3786

The ROCm build successfully passes all but 2 GPU tests, and we want to enable the passing tests on CI while skipping the 2 failing tests to make progress. The 2 failing tests fail specifically on the hardware type that we use for our runners, and the AMD team is actively working on finding the root cause and providing a fix:
`TestGpuIndexIVFPQ.Query_L2_MMCodeDistance`
`TestGpuIndexIVFPQ.Query_IP_MMCodeDistance`

Reviewed By: asadoughi

Differential Revision: D61688657

fbshipit-source-id: 3fedfcf22a0ccf40ac8aff033e8bc09c4eb0cbd5
2024-08-23 10:58:14 -07:00
Mengdi Lin
6053348b2e fix get_compile_options bug (#3785)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3785

Right now when avx512 is turned on, we will only return AVX2 in options. My understanding is turning on avx512 sets both the macros `__AVX2__` and `__AVX512F__`: https://fburl.com/vgh7jg9p

Reviewed By: asadoughi

Differential Revision: D61674490

fbshipit-source-id: 47292025b4eb5ef5907c4fbb0bbf39259129f6ee
2024-08-23 08:52:53 -07:00
Kumar Saurabh Arora
5c87f132de Add sampling fields to dataset descriptor (#3782)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3782

Fields sampling_column and sampling_rate are added to dataset descriptor for supporting sampling in dataset readers.

Reviewed By: satymish

Differential Revision: D61569067

fbshipit-source-id: e5db9957538b033bbef4b7662154411b9044d1f8
2024-08-22 10:44:55 -07:00
George Wang
a43afd6a62 Specify to retry only on failed jobs (#3772)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3772

It looks like there are many failures on the retry build workflow, but these are mainly due to retry attempts with the --failed flag being unable to rerun workflows that don't have any failed jobs.

Reviewed By: kuarora, junjieqi, ramilbakhshyiev

Differential Revision: D61489426

fbshipit-source-id: 6dcef6ba422634bb333e44a5b12c74c5d3b3df8f
2024-08-20 19:27:53 -07:00
Michael Norris
a10b883584 Move static functions to header file (#3757)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3757

In the telemetry wrapper, we need to wrap read_index to return wrapped index structs. D61049751

This read_index wrapper calls several static functions. These are not callable outside a C++ file. Thus this diff changes them to non static and declares them in the header file. Then the wrapper is able to call them.

Reviewed By: asadoughi

Differential Revision: D61282004

fbshipit-source-id: 2c8b2ded169577aa6eecdf1edc7483b0ef5f0665
2024-08-20 13:43:37 -07:00
Pankaj Singh
f3c05bd21b add reconstruct support to additive quantizers (#3752)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3752

add reconstruct support to additive quantizers.

fixes github issue: 2422

Reviewed By: asadoughi

Differential Revision: D61255049

fbshipit-source-id: 09a0edae7fc24295a686d332e2d052e37372d2c0
2024-08-20 13:06:02 -07:00
Mengdi Lin
6e6685b0fe delete circle CI config (#3732)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3732

AVX512 has been running on GhA for some days without issues. Deleting the CircleCI config. Will press the "deprecate CircleCI button" in 1-2 more weeks. I want to wait a little longer just in case anything goes wrong for AVX512 on GhA.

Reviewed By: junjieqi, ramilbakhshyiev

Differential Revision: D60914370

fbshipit-source-id: 5bb09e81c3f5cd1a58525fe633d07373884207d4
2024-08-20 12:40:58 -07:00
Mengdi Lin
c0b32d2821 fix ARM64 SVE CI due to openblas version bump (#3777)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3777

The openblas version was bumped from 0.3.27 to 0.3.28 in the last 3 days. This caused the below test to fail. Confirmed with algoriddle that bumping nprobe is okay to do.

Reviewed By: algoriddle

Differential Revision: D61536541

fbshipit-source-id: 1e83f75011517ba7b856520f11526e72a00494a5
2024-08-20 06:25:40 -07:00
Ramil Bakhshyiev
924c24db23 Enable Python tests for ROCm (#3763)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3763

This change enables Python version of tests for ROCm builds.

Reviewed By: asadoughi

Differential Revision: D61366282

fbshipit-source-id: c2fd688db42d63946f1c5ca7d50f0a1c4d4a33cd
2024-08-16 12:01:50 -07:00
Ramil Bakhshyiev
772d86062d Reorder imports in torch_test_contrib_gpu (#3761)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3761

This fixes CUDA errors inside faiss in the test environment. If torch is loaded first (this change) then both torch and faiss see all GPUs available on the machine in the ROCm build. Without this change, torch sees the GPUs and faiss does not. AMD team is looking at finding the root cause but we wanted to fix this for now.

Reviewed By: junjieqi, mnorris11

Differential Revision: D61358018

fbshipit-source-id: ac59be99817ef13d37a1676f615585f44eabaf24
2024-08-15 16:13:41 -07:00
Ramil Bakhshyiev
d40adca9cb Add missing hipStream SWIG typedef to fix ROCm memleak in Python (#3760)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3760

This fixes the memleak and the warning received after running Python tests under ROCm since no destructor was declared and objects would remain allocated.

Reviewed By: gtwang01

Differential Revision: D61357579

fbshipit-source-id: cf73bbd7a7002565a4224c1f0af0aa6ea5edebdb
2024-08-15 16:13:41 -07:00
Alexandr Guzhva
afe9c40f36 introduce options for reducing the overhead for a clustering procedure (#3731)
Summary:
Several changes:
1. Introduce `ClusteringParameters::check_input_data_for_NaNs`, which may suppress checks for NaN values in the input data
2. Introduce `ClusteringParameters::use_faster_subsampling`, which uses a newly added SplitMix64-based rng (`SplitMix64RandomGenerator`) and also may pick duplicate points from the original input dataset.  Surprisingly, `rand_perm()` may involve noticeable non-zero costs for certain scenarios.
3. Negative values for `ClusteringParameters::seed` initialize the internal clustering rng from the high-resolution clock each time, so the clustering procedure picks different subsamples each time. I've decided not to use `std::random_device` in order to avoid possible negative effects.
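
A Python sketch of SplitMix64-based subsampling that may pick duplicates, to illustrate item 2. The constants are those of the commonly published SplitMix64 step; whether `SplitMix64RandomGenerator` in faiss matches this exactly is an assumption, and both names below are hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


class SplitMix64:
    """Standard SplitMix64 step, written with explicit 64-bit masking."""

    def __init__(self, seed):
        self.state = seed & MASK64

    def next(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


def subsample_with_replacement(n_points, n_samples, seed):
    """Draw indices independently, so duplicates are possible.

    This avoids building a full random permutation of n_points entries.
    """
    rng = SplitMix64(seed)
    return [rng.next() % n_points for _ in range(n_samples)]
```

Each draw costs a handful of integer operations and no memory proportional to the dataset size, which is the trade against a permutation that guarantees distinct points.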

Useful for future `ProductResidualQuantizer` improvements.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3731

Reviewed By: asadoughi

Differential Revision: D61106105

Pulled By: mnorris11

fbshipit-source-id: 072ab2f5ce4f82f9cf49d678122f65d1c08ce596
2024-08-14 17:10:13 -07:00
Pankaj Singh
b10f001185 minor refactor to avoid else block in IVFPQ reconstruct_from_offset. (#3753)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3753

minor refactor to avoid else block in IVFPQ reconstruct_from_offset. No change in logic.

Reviewed By: asadoughi

Differential Revision: D61255339

fbshipit-source-id: e0a8ac10570391eaf7ed3b35796af8b38d40a23c
2024-08-14 13:59:18 -07:00
Emy Sun
2968ab130f Add hnsw search params for bounded queue option (#3748)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3748

So we can dynamically change it

Reviewed By: asadoughi

Differential Revision: D61029191

fbshipit-source-id: 19a6775c1218762dac7a7805e13efab9bb43cfa5
2024-08-13 15:15:16 -07:00
Kumar Saurabh Arora
80a2462483 Fixing initialization of dictionary in dataclass (#3749)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3749

same as title
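
The message does not say what was wrong. One common pitfall with dictionaries in dataclasses, assumed here purely for illustration, is giving a field a mutable default: Python rejects `metadata: dict = {}` at class-definition time, and the usual remedy is `field(default_factory=dict)`. `DatasetDescriptorSketch` and its fields are hypothetical and are not the descriptor from the faiss benchmarking code.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetDescriptorSketch:
    """Hypothetical descriptor showing a per-instance dictionary field."""

    name: str = ""
    # default_factory builds a fresh dict for every instance, so two
    # descriptors never share the same dictionary object.
    metadata: dict = field(default_factory=dict)
```

With this form, mutating `metadata` on one instance leaves every other instance's dictionary empty.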

Reviewed By: satymish

Differential Revision: D61133788

fbshipit-source-id: 5761e6347365f7701ee0600a9d895b8bd1f0a6b8
2024-08-12 17:49:43 -07:00
Ramil Bakhshyiev
a56ee812a7 Containerize ROCm build and move it to AMD GPU runners (#3747)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3747

This change converts the ROCm build to run inside containers and updates it to run on AMD GPU based runners. Still working with the AMD team to resolve test failures before enabling those.

Differential Revision: D61049115

fbshipit-source-id: 28274e0bde795f99b3d78711beaf9b3ed3c5e66c
2024-08-09 17:17:37 -07:00
Kumar Saurabh Arora
290464f23b Adding embedding column to dataset descriptor (#3736)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3736

Nit - adding embedding column in dataset descriptor
Nit - initializing cached_ds as part of class instead of post_init

Reviewed By: satymish

Differential Revision: D60858496

fbshipit-source-id: 3358d866a0668424cd6895bc7a5c620ff97e72fa
2024-08-09 17:07:36 -07:00
Ramil Bakhshyiev
ac18577482 Install gpg for ROCm builds (#3744)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3744

gpg is needed for ROCm builds but does not come with containerized builds. This change adds installation of gpg.

Reviewed By: junjieqi

Differential Revision: D61007840

fbshipit-source-id: 6322112803866dff57637bea290dc032e2bf41ad
2024-08-09 02:04:53 -07:00