922 Commits

Author SHA1 Message Date
Gergely Szilvasy
4c83965d2b benchmark view results (#3144)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3144

Visualize results of running the benchmark with Pareto optima filtering:
1. per index or across indices
2. for space, time or space & time
3. knn or range search, the latter @ specific precision
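
The Pareto-optimum filtering mentioned above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code (which is Python); `OperatingPoint` and `pareto_filter` are hypothetical names, and the sketch assumes lower cost (space or time) and higher accuracy are better.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical operating point of an index: one cost and one accuracy value.
struct OperatingPoint {
    double cost;     // e.g. search time or index size (lower is better)
    double accuracy; // e.g. knn intersection (higher is better)
};

// Keep only the points that are not dominated by a cheaper-or-equal,
// more accurate point.
std::vector<OperatingPoint> pareto_filter(std::vector<OperatingPoint> pts) {
    // Sort by increasing cost; for equal cost, best accuracy first.
    std::sort(pts.begin(), pts.end(),
              [](const OperatingPoint& a, const OperatingPoint& b) {
                  if (a.cost != b.cost) {
                      return a.cost < b.cost;
                  }
                  return a.accuracy > b.accuracy;
              });
    std::vector<OperatingPoint> front;
    double best = -std::numeric_limits<double>::infinity();
    for (const OperatingPoint& p : pts) {
        // A point survives only if it beats every cheaper point.
        if (p.accuracy > best) {
            front.push_back(p);
            best = p.accuracy;
        }
    }
    return front;
}
```

Filtering "across indices" would then amount to pooling the points of all indices before calling the function, while "per index" would call it once per index.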

Reviewed By: mdouze

Differential Revision: D51552775

fbshipit-source-id: d4f29e3d46ef044e71b54439b3972548c86af5a7
2023-12-04 05:53:17 -08:00
Gergely Szilvasy
9519a19f42 benchmark refactor
Summary:
1. Support for index construction parameters outside of the factory string (arbitrary depth of quantizers).
2. Refactor that provides an index wrapper which is a prereq for the optimizer, which will generate indices from pre-optimized components (particularly quantizers)

Reviewed By: mdouze

Differential Revision: D51427452

fbshipit-source-id: 014d05dd798d856360f2546963e7cad64c2fcaeb
2023-12-04 05:53:17 -08:00
Alexandr Guzhva
a5b03cb9f6 Fix build on Alpine Linux (#3148)
Summary:
Fixes https://github.com/facebookresearch/faiss/issues/3142

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3148

Reviewed By: algoriddle

Differential Revision: D51741575

Pulled By: mdouze

fbshipit-source-id: b4ea2302802ef70b53f8afe5a7ab4ee02bf6659e
2023-12-01 02:52:52 -08:00
Yuri Vanin
4bf8f939d6 Add NegativeDistanceComputer::distances_batch_4 override (#3149)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3149

Enables vectorized distance calculation in [NegativeDistanceComputer](b109d086a2/faiss/IndexHNSW.cpp (L74)), whenever supported by the [NegativeDistanceComputer::basedis](b109d086a2/faiss/IndexHNSW.cpp (L76)).

Otherwise the default sequential calculation of [DistanceComputer::distances_batch_4](b109d086a2/faiss/impl/DistanceComputer.h (L36-L54)) is always chosen.
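
As a rough sketch of the idea, a negating wrapper can forward the batched call to the wrapped computer and flip the signs afterwards, so that a vectorized implementation in the wrapped computer is still used. The `Toy*` classes below are simplified stand-ins invented for this illustration, not the Faiss classes.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for a distance computer interface (hypothetical).
struct ToyDistanceComputer {
    virtual float operator()(int64_t i) = 0;
    // Default: sequential evaluation, one distance at a time.
    virtual void distances_batch_4(
            int64_t i0, int64_t i1, int64_t i2, int64_t i3,
            float& d0, float& d1, float& d2, float& d3) {
        d0 = (*this)(i0);
        d1 = (*this)(i1);
        d2 = (*this)(i2);
        d3 = (*this)(i3);
    }
    virtual ~ToyDistanceComputer() = default;
};

// Toy base computer that looks distances up in a table.
struct ToyTableComputer : ToyDistanceComputer {
    std::vector<float> table;
    explicit ToyTableComputer(std::vector<float> t) : table(std::move(t)) {}
    float operator()(int64_t i) override {
        return table[i];
    }
};

// Wrapper that negates distances. Overriding the batched call lets it reuse
// whatever (possibly vectorized) batch implementation the base provides.
struct ToyNegativeComputer : ToyDistanceComputer {
    ToyDistanceComputer& base;
    explicit ToyNegativeComputer(ToyDistanceComputer& b) : base(b) {}
    float operator()(int64_t i) override {
        return -base(i);
    }
    void distances_batch_4(
            int64_t i0, int64_t i1, int64_t i2, int64_t i3,
            float& d0, float& d1, float& d2, float& d3) override {
        base.distances_batch_4(i0, i1, i2, i3, d0, d1, d2, d3);
        d0 = -d0;
        d1 = -d1;
        d2 = -d2;
        d3 = -d3;
    }
};
```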

Reviewed By: algoriddle

Differential Revision: D51596177

fbshipit-source-id: fee510c0a229991ecb7d81a51bc53a20880685be
2023-11-29 16:38:08 -08:00
Gergely Szilvasy
90654d6011 benchmark core faiss prereqs
Summary:
1. Support `search_preassigned` in IVFFastScan
2. `try_extract_index_ivf` to search recursively and support `IndexRefine`
3. `get_InvertedListScanner` to fail where not available
4. Workaround an OpenMP issue with `IndexIVFSpectralHash`

Reviewed By: mdouze

Differential Revision: D51427241

fbshipit-source-id: 365e3f11d24e80f101f986fc358c28dcc00805fa
2023-11-28 11:50:03 -08:00
Alexandr Guzhva
04bb0a810c improve ScalarQuantizer performance, ESPECIALLY on old GCC (#3141)
Summary:
Introduces the `FAISS_ALWAYS_INLINE` directive and uses it to improve `ScalarQuantizer` performance.

Most of the performance-critical methods of `ScalarQuantizer` are marked with this new directive, because a compiler (especially an old one) may fail to inline them properly. In some of my GCC experiments, forcing this inlining yields +50% queries per second in a search.
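
A forced-inlining macro of this kind is commonly built on compiler-specific attributes. The sketch below is an assumption about the general technique, not a copy of the Faiss definition; `SKETCH_ALWAYS_INLINE`, `decode_component` and `sum_decoded` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macro: ask the compiler to inline regardless of its heuristics.
#if defined(_MSC_VER)
#define SKETCH_ALWAYS_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// Tiny hot-path helper: decode one 8-bit component to a float in [0, 1].
SKETCH_ALWAYS_INLINE float decode_component(const uint8_t* code, size_t i) {
    return code[i] / 255.0f;
}

// The inner loop calls the helper once per dimension, so a missed inlining
// would cost a function call per component.
float sum_decoded(const uint8_t* code, size_t d) {
    float s = 0;
    for (size_t i = 0; i < d; i++) {
        s += decode_component(code, i);
    }
    return s;
}
```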

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3141

Reviewed By: algoriddle

Differential Revision: D51615609

Pulled By: mdouze

fbshipit-source-id: 9c755c3e1a289b5d498306c1b9d6fcc21b0bec28
2023-11-28 10:34:38 -08:00
Matthijs Douze
43f8220818 fix scopedeleter diff
Summary: It seems that for some build modes, swig chokes on static_assert, so protect this with `#ifndef SWIG`. Let's see what the tests say....

Reviewed By: algoriddle

Differential Revision: D50971042

fbshipit-source-id: 83e2ccb464c0bd024cbf3a494357147d75a76ca2
2023-11-28 09:42:57 -08:00
Alexandr Guzhva
d3692d2498 Deprecate ScopeDeleter and ScopeDeleter1 in favor of std::unique_ptr<[]> (#3108)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3108
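
For illustration, the replacement pattern looks roughly like this: a scratch buffer owned by `std::unique_ptr<float[]>` is released with `delete[]` on every exit path, which is what a scope-deleter helper was used for. The function name is hypothetical.

```cpp
#include <cstddef>
#include <memory>

// Sum of squares computed through a temporary buffer. The buffer is owned by
// a std::unique_ptr<float[]>, so no manual delete[] or deleter object is needed.
float sum_of_squares_with_scratch(const float* x, size_t n) {
    std::unique_ptr<float[]> tmp(new float[n]);
    for (size_t i = 0; i < n; i++) {
        tmp[i] = x[i] * x[i];
    }
    float s = 0;
    for (size_t i = 0; i < n; i++) {
        s += tmp[i];
    }
    return s; // tmp is freed here, also if an exception had been thrown
}
```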

Reviewed By: mlomeli1

Differential Revision: D50595705

Pulled By: mdouze

fbshipit-source-id: 8555c13609747b7b61201225fcd036d80b50ae59
2023-11-28 09:42:57 -08:00
luyuncheng
eb071f8c14 Fix is_trained in IndexNSGSQ (#3145)
Summary:
Same as https://github.com/facebookresearch/faiss/issues/3034

When using IndexNSGSQ with fp16, do not require training

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3145

Reviewed By: algoriddle

Differential Revision: D51615536

Pulled By: mdouze

fbshipit-source-id: c6bfbca920be80231d5d0a7290a29f17ea271f6e
2023-11-28 08:16:57 -08:00
Ben Frederickson
d643c41c02 use precomputed norms for raft brute_force knn calls (#3089)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3089

Reviewed By: algoriddle

Differential Revision: D50933982

Pulled By: mdouze

fbshipit-source-id: dd0d00cf71ac490f75b8c2f152e7ae4cc28019ef
2023-11-28 03:11:41 -08:00
Matthijs Douze
b109d086a2 Search and return codes (#3143)
Summary:
This PR adds a functionality where an IVF index can be searched and the corresponding codes be returned. It also adds a few functions to compress int arrays into a bit-compact representation.
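
To give an idea of what a bit-compact representation of an int array means, here is a minimal sketch that stores each value on a fixed number of bits. `pack_bits` and `unpack_bits` are hypothetical helpers written for this example; they are not the functions added by the PR, and the bit layout (least significant bit first) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack each value on `nbits` bits, least significant bit first.
std::vector<uint8_t> pack_bits(const std::vector<uint32_t>& values, int nbits) {
    assert(nbits > 0 && nbits <= 32);
    std::vector<uint8_t> out((values.size() * nbits + 7) / 8, 0);
    size_t bitpos = 0;
    for (uint32_t v : values) {
        for (int b = 0; b < nbits; b++, bitpos++) {
            if ((v >> b) & 1u) {
                out[bitpos >> 3] |= uint8_t(1u << (bitpos & 7));
            }
        }
    }
    return out;
}

// Inverse of pack_bits: read back `n` values of `nbits` bits each.
std::vector<uint32_t> unpack_bits(
        const std::vector<uint8_t>& packed, size_t n, int nbits) {
    assert(nbits > 0 && nbits <= 32);
    assert(packed.size() * 8 >= n * nbits);
    std::vector<uint32_t> out(n, 0);
    size_t bitpos = 0;
    for (size_t i = 0; i < n; i++) {
        for (int b = 0; b < nbits; b++, bitpos++) {
            if ((packed[bitpos >> 3] >> (bitpos & 7)) & 1u) {
                out[i] |= uint32_t(1) << b;
            }
        }
    }
    return out;
}
```

With 3 bits per value, four values fit in 2 bytes instead of the 16 bytes taken by four 32-bit integers.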

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143

Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct

buck test //faiss/tests/:test_standalone_codec -- test_arrays
```

Reviewed By: algoriddle

Differential Revision: D51544613

Pulled By: mdouze

fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
2023-11-25 13:57:25 -08:00
Jeff Johnson
467f70edbf Consolidate GPU IVF query tile calculation + special handling for large query memory requirements
Summary:
In the GPU IVF (Flat, SQ and PQ) code, there is a requirement for using temporary memory for storing unfiltered (or partially filtered) vector distances calculated during list scanning which are k-selected by separate kernels.

While a batch query may be presented to an IVF index, the amount of temporary memory needed to store all these unfiltered distances prior to filtering may be very large depending upon IVF characteristics (such as the maximum number of vectors encoded in any of the IVF lists), in which case we cannot process the entire batch of queries at once and instead must tile over the batch of queries to reuse the temporary memory that we make available for these distances.

The old code duplicated this roughly equivalent logic in 3 different places (the IVFFlat/SQ code, IVFPQ with precomputed codes, and IVFPQ without precomputed codes). Furthermore, in the case where either little/no temporary memory was available or where what temporary memory was available was (vastly) exceeded by the amount needed to handle a particular query, the old code enforced a minimum number of queries to be processed at once of 8. In certain cases (huge IVF list imbalance), this memory request could exceed the amount of memory that can be safely allocated on a GPU.

This diff consolidates the original 3 separate places where this calculation took place to 1 place in IVFUtils. The logic proceeds roughly as before, to figure out how many queries can be processed in the available temporary memory, except we add a new heuristic in the case where the number of queries that can be concurrently processed falls below 8. This could be either due to little temporary memory being available, or due to huge memory requirements. In this case, we instead ignore the amount of temporary memory available and instead see how many queries' memory requirements would fit into a single 512 MiB memory allocation, so we reasonably cap this amount. If the query still cannot be satisfied with this allocation, we still proceed executing 1 query at a time (which note could still potentially exhaust the GPU memory, but this is an error that is unavoidable).

While a different heuristic using the amount of actual memory allocatable on the device could be used instead of this fixed 512 MiB amount, there is no guarantee to my knowledge that a single cudaMalloc up to this limit could succeed (e.g., GPU reports 3 GiB available, you attempt to allocate all of that in a single allocation), so we just pick an amount which is a reasonable balance between efficiency (parallelism) and memory consumption. Note that if not enough temporary memory is available and a single 512 MiB allocation fails, then there is likely little memory to proceed efficiently at all under any scenario, as Faiss does require some headroom in terms of memory available for scratch spaces.
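
The heuristic described above can be sketched as a small host-side function. This is an illustration under stated assumptions, not the code in IVFUtils: the function and parameter names are hypothetical, and the per-query memory requirement is taken as a given input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Number of queries to process per tile.
//   num_queries:          size of the query batch
//   bytes_per_query:      temporary memory needed per query (assumed known)
//   temp_bytes_available: temporary memory currently available
size_t query_tile_size(
        size_t num_queries,
        size_t bytes_per_query,
        size_t temp_bytes_available) {
    assert(bytes_per_query > 0);
    constexpr size_t kMinQueries = 8;                   // threshold from the summary
    constexpr size_t kFallbackBytes = size_t(512) << 20; // single 512 MiB allocation

    // First attempt: how many queries fit in the available temporary memory?
    size_t tile = temp_bytes_available / bytes_per_query;

    if (tile < kMinQueries) {
        // Ignore the available temporary memory and size the tile for a
        // single 512 MiB allocation instead.
        tile = kFallbackBytes / bytes_per_query;
    }

    // Always make progress: at least 1 query at a time, at most the batch.
    tile = std::max<size_t>(tile, 1);
    return std::min(tile, num_queries);
}
```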

Reviewed By: mdouze

Differential Revision: D45574455

fbshipit-source-id: 08f5204e3e9656627c9134d7409b9b0960f07b2d
2023-11-15 11:19:02 -08:00
Robert Maynard
411c1721da Add linker script to support large cuda cubin files (#3115)
Summary:
Starting with CUDA 11.5, nvcc offers a `-hls` option that generates host-side linker scripts to support large cubin files.
Since faiss supports CUDA 11.4, we replicate that behavior by injecting the same linker script into the link line manually.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3115

Reviewed By: mdouze

Differential Revision: D51308908

Pulled By: algoriddle

fbshipit-source-id: c6dd073cd3f44dbc99d2e2da97f79b9ebc843b59
2023-11-15 02:29:21 -08:00
Jeff Johnson
09c7aaceb6 Faiss GPU CUDA 12 fix: warp synchronous behavior
Summary:
This diff fixes the bug associated with moving Faiss GPU to CUDA 12.

The following tests were succeeding in CUDA 11.x but failed in CUDA 12:

```
  ✗ faiss/gpu/test:test_gpu_basics_py - test_input_types (faiss.gpu.test.test_gpu_basics.TestKnn)
  ✗ faiss/gpu/test:test_gpu_basics_py - test_dist (faiss.gpu.test.test_gpu_basics.TestAllPairwiseDistance)
  ✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Add_L2
  ✗ faiss/gpu/test:test_gpu_basics_py - test_input_types_tiling (faiss.gpu.test.test_gpu_basics.TestKnn)
  ✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Add_IP
  ✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Float16Coarse
  ✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.LargeBatch
```

It took a long while to track down, but the issue presented itself when an odd number of dimensions not divisible by 32 was used in cases where we needed to calculate a L2 norm for vectors, which occurred with brute-force L2 distance computation, as well as certain L2 IVFPQ operations. This issue appeared as some tests were using 33 as the dimensionality of vectors.

The issue is that the number of threads given to the L2 norm kernel was effectively `min(dims, 1024)` where 1024 is the standard maximum number of CUDA threads per CTA on all devices at present. In the case where the result was not a multiple of 32, this would result in a partial warp being passed to the kernel (with non-participating lanes having no side effects).

The change in CUDA 12 here seemed to be a change in the compiler behavior for warp-synchronous shuffle instructions (such as `__shfl_up_sync`). In the case of the partial warp, we were passing `0xffffffff` as the active lane mask, implying that all lanes were present for the warp. In the case of dims = 33, we would have 1 full warp with all lanes present, and 1 partial warp with only 1 active thread, so `0xffffffff` is a lie in this case. Prior to CUDA 12, it appeared that these shuffle instructions may have passed 0? around for lanes not present (or would it stall?), so the result was still calculated correctly. However, with the change to CUDA 12, the compiler and/or device firmware (or something) interprets this differently, where the warp lanes not present were providing garbage. The shuffle instructions were used to perform in-warp reductions (e.g., summing a bunch of floating point numbers), namely those needed to sum up the L2 vector norm value. So for dims = 32 or dims = 64 (and bizarrely, dims = 40 and some other choices) it still worked, but for dims = 33 it was adding in garbage, producing erroneous results.

This diff first removes the non-dim loop functionality for runL2Norm (where we can statically avoid a for loop over dimensions in case our threadblock is exactly sized with the number of dimensions present), so we just use the general-purpose fallback. Second, we now always provide a whole number of warps when running the L2 norm kernel, avoiding the issue with the warp synchronous instructions not having a full warp present.

This bug has been present since the code was written in 2016 and was technically wrong before, but it only surfaced as a problem with the CUDA 12 change.

tl;dr: if you use any kind of `_sync` instruction involving warp sync, always have a whole number of warps present, k thx.
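
On the host side, the fix boils down to rounding the thread count up to a multiple of the warp size before launching the kernel. The sketch below illustrates that rounding only; the function name is hypothetical and the constants (32 lanes per warp, 1024 threads per block) are the values quoted in this message.

```cpp
#include <algorithm>
#include <cassert>

constexpr int kWarpSize = 32;          // lanes per warp
constexpr int kMaxBlockThreads = 1024; // max threads per block, as stated above

// Number of threads to launch for a vector of `dims` dimensions:
// min(dims, 1024) rounded up to a whole number of warps, so that a full
// 0xffffffff lane mask is valid in every warp.
int l2_norm_block_threads(int dims) {
    assert(dims > 0);
    int threads = std::min(dims, kMaxBlockThreads);
    return ((threads + kWarpSize - 1) / kWarpSize) * kWarpSize;
}
```

With dims = 33 this gives 64 threads (two full warps) instead of 33; the extra threads must then skip out-of-range dimensions inside the kernel while still taking part in the shuffles.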

Reviewed By: mdouze

Differential Revision: D51335172

fbshipit-source-id: 97da88a8dcbe6b4d8963083abc01d5d2121478bf
2023-11-14 22:57:45 -08:00
Christopher Ponce de Leon
0c2243c5b4 Revert D51029740: Namespace doesn't need to be followed by semicolon
Differential Revision:
D51029740

Original commit changeset: 177e3f6e6b0a

Original Phabricator Diff: D51029740

fbshipit-source-id: c71ff386342902f2cfa6552d6a834ea3f2475e32
2023-11-06 08:37:37 -08:00
Richard Barnes
438b51925f Namespace doesn't need to be followed by semicolon
Summary:
Auto-generated with
```
fbgs "}; // namespace" -l | sort | uniq | sed 's/fbsource.//' | xargs -n 50 sed -i 's_}; // namespace_} // namespace_'
```

Reviewed By: dmm-fb

Differential Revision: D51029740

fbshipit-source-id: 177e3f6e6b0ab7e986b1147952cd5e2f59d4b1fc
2023-11-06 08:02:11 -08:00
Alexandr Guzhva
9a66532482 Add search parameters for IndexRefine::search() and IndexRefineFlat::search() (#3122)
Summary:
Add search params for `faiss::IndexRefine` and `faiss::IndexRefineFlat`

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3122

Test Plan: buck test //faiss/tests/:test_refine

Reviewed By: pemazare

Differential Revision: D50968413

Pulled By: mdouze

fbshipit-source-id: 9f020d7e9c9d96b9acba54d9d7fff13bcf703b9e
2023-11-05 15:07:39 -08:00
pe4eniks
df7280b5f6 Documentation fixes (#3092)
Summary:
As a follow-up to this issue https://github.com/facebookresearch/faiss/issues/3086, I've fixed some bugs in the doxygen-generated documentation.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3092

Reviewed By: pemazare

Differential Revision: D50595811

Pulled By: mdouze

fbshipit-source-id: 74797d3f2594a20597e1eb6545e91f6eac6d035d
2023-11-02 10:06:03 -07:00
chasingegg
6b761503ba Remove confusing comments in partitioning.cpp (#3104)
Summary:
Fix https://github.com/facebookresearch/faiss/issues/3095

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3104

Reviewed By: mlomeli1

Differential Revision: D50595865

Pulled By: mdouze

fbshipit-source-id: f9107bda114a77d6e1f0da32c3451b7182d32e60
2023-11-02 10:03:52 -07:00
Gergely Szilvasy
6c89c8bd4e fix faiss-gpu nightly
Summary: Forcing cudatoolkit to 11.8 to work around cudatoolkit vs cuda-cudart clobbering: https://app.circleci.com/pipelines/github/facebookresearch/faiss/4845/workflows/baee8356-31c5-44e8-ae98-44f897967557/jobs/26451?invite=true#step-105-335068_80

Reviewed By: mlomeli1

Differential Revision: D50924919

fbshipit-source-id: 590f39d1292ec64af7e179e764b8e7ac26108962
2023-11-02 03:09:40 -07:00
Gergely Szilvasy
0c07a114ad fix raft contbuild and switch to libraft 23.12 (#3116)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3116

Test Plan: https://app.circleci.com/pipelines/github/facebookresearch/faiss/4839/workflows/cfd84a94-ca60-4128-96e6-db0f5afc69c4

Reviewed By: mdouze

Differential Revision: D50897934

Pulled By: algoriddle

fbshipit-source-id: 1422be39d640a2aec3ab6b4c68d3ef54900b5ba2
2023-11-01 14:37:21 -07:00
Gergely Szilvasy
9bb6b4be0d fix test TestCrossCodebookComputations::test_precomp
Summary: To fix the nightly: https://app.circleci.com/pipelines/github/facebookresearch/faiss/4815/workflows/2027a135-72ee-459f-a092-7ada95affd41/jobs/26225

Reviewed By: mdouze

Differential Revision: D50839933

fbshipit-source-id: 311b548182a2b3966c9603f83c115fa038eb19e8
2023-10-31 09:50:05 -07:00
Gergely Szilvasy
c3b9374984 bench_fw - fixes & nits for oss (#3102)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3102

Reviewed By: pemazare

Differential Revision: D50426528

Pulled By: algoriddle

fbshipit-source-id: 886960b8b522318967fc5ec305666871b496cae8
2023-10-20 07:53:56 -07:00
Gergely Szilvasy
0a00d8137a offline index evaluation (#3097)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3097

A framework for evaluating indices offline.

Long term objectives:
1. Generate offline similarity index performance data with test datasets both for existing indices and automatically generated alternatives. That is, given a dataset and some constraints this workflow should automatically discover optimal index types and parameter choices as well as evaluate the performance of existing production indices and their parameters.
2. Allow researchers, platform owners (Laser, Unicorn) and product teams to understand how different index types perform on their datasets and make optimal choices wrt their objectives. Longer term to enable automatic decision-making/auto-tuning.

Constraints, design choices:
1. I want to run the same evaluation on Meta-internal (fblearner, data from hive and manifold) or the local machine + research cluster (data on local disk or NFS) via OSS Faiss. Via fblearner, I want this to work in a way that it can be turned into a service and plugged into Unicorn or Laser, while the core Faiss part can be used/referred to in our research and to update the wiki with the latest results/recommendations for public datasets.
2. It must support a range of metrics for KNN and range search, and it should be easy to add new ones. Cost metrics need to be fine-grained to allow extrapolation.
3. It should automatically sweep all query time params (eg. nprobe, polysemous code hamming distance, params of quantizers), using `OperatingPointsWithRanges` to cut down the optimal param search space. (For now, it sweeps nprobes only.)
4. [FUTURE] It will generate/sweep index creation hyperparams (factory strings, quantizer sizes, quantizer params), using heuristics.
5. [FUTURE] It will sweep the dataset size: start small test with e.g. 100K db vectors and go up to millions, billions potentially, while narrowing down the index+param choices at each step.
6. [FUTURE] Extrapolate perf metrics (cost and accuracy)
7. Intermediate results must be saved (to disk, to manifold) throughout, and reused as much as possible to cut down on overall runtime and enable faster iteration during development.

For range search, this diff supports the metric proposed in https://docs.google.com/document/d/1v5OOj7kfsKJ16xzaEHuKQj12Lrb-HlWLa_T2ct0LJiw/edit?usp=sharing I also added support for the classical case where the scoring function steps from 1 to 0 at some arbitrary threshold.

For KNN, I added knn_intersection, but other metrics, particularly recall@1 will also be interesting. I also added the distance_ratio metric, which we previously discussed as an interesting alternative, since it shows how much the returned results approximate the ground-truth nearest-neighbours in terms of distances.
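
As an illustration of the kind of metric meant here, a knn intersection measure can be computed as the fraction of returned ids that also appear in the ground-truth top-k of the same query. The sketch below is an assumption about that definition, not the framework's code (which is Python); the function name and the flat row-major layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// gt_ids and result_ids are nq * k arrays, row-major (one row per query).
// Negative ids (missing results) never count as hits.
double knn_intersection(
        const std::vector<int64_t>& gt_ids,
        const std::vector<int64_t>& result_ids,
        size_t nq,
        size_t k) {
    assert(nq > 0 && k > 0);
    assert(gt_ids.size() == nq * k && result_ids.size() == nq * k);
    size_t hits = 0;
    for (size_t q = 0; q < nq; q++) {
        // Ground-truth neighbors of query q.
        std::unordered_set<int64_t> gt(
                gt_ids.begin() + q * k, gt_ids.begin() + (q + 1) * k);
        for (size_t j = 0; j < k; j++) {
            int64_t id = result_ids[q * k + j];
            if (id >= 0 && gt.count(id) > 0) {
                hits++;
            }
        }
    }
    return double(hits) / double(nq * k);
}
```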

In the test case, I evaluated three current production indices for VCE with 1M vectors in the database and 10K queries. Each index is tested at various operating points (nprobes), which are shown on the charts. The results are not extrapolated to the true scale of these indices.

Reviewed By: yonglimeta

Differential Revision: D49958434

fbshipit-source-id: f7f567b299118003955dc9e2d9c5b971e0940fc5
2023-10-17 13:56:02 -07:00
Matthijs Douze
f969d7ae3b better docs
Summary: Improved comments.

Reviewed By: algoriddle

Differential Revision: D50259422

fbshipit-source-id: 92ba0840468eb8724f21d8fbe406b1bc43c64706
2023-10-13 03:37:53 -07:00
Corey J. Nolet
edcf7438bb Integrate IVF-Flat from RAFT (#2521)
Summary:
This is a design proposal that demonstrates an approach to enabling optional support for [RAFT](https://github.com/rapidsai/raft) versions of IVF PQ and IVF Flat (and brute force w/ fused k-selection when k <= 64). There are still a few open issues and design discussions needed for the new RAFT index types to support the full range of features of FAISS' current GPU index types.

Checklist for the integration todos:
- [x] Rebase on current `main` branch
- [X] The raft handle has been plugged directly into the StandardGpuResources
- [X] `FlatIndex` passing Googletests
- [x] Use `CodePacker` to support `copyFrom()` and `copyTo()`
- [X] `IVF-flat` passing Googletests
- [ ] Raise appropriate exceptions for operations which are not yet supported by RAFT

Additional features we've discussed:
- [x] Separate IVF lists into individual memory chunks
- [ ] Saving/loading

To build FAISS w/ optional RAFT support:
```
mkdir build
cd build
cmake ../ -DFAISS_ENABLE_RAFT=ON -DFAISS_ENABLE_GPU=ON
make -j
```

For development/testing, we've also supplied a bash script to make things easier: `build.sh`

Below is a benchmark comparing the training of IVF Flat indices for RAFT and FAISS:
![image](https://user-images.githubusercontent.com/1242464/194944737-8b808f11-e28e-4556-82d1-1ea4b0707283.png)

The benchmark was produced using Googlebench in [this](https://github.com/tfeher/raft/tree/raft_faiss_bench) RAFT fork. We're going to provide benchmarks for the queries as well. There are still a couple bottlenecks to be removed in the IVF-Flat training implementation and we'll update the current benchmark when ready.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2521

Test Plan: `buck test mode/debuck test mode/dev-nosan //faiss/gpu/test:test_gpu_index_ivfflat`

Reviewed By: algoriddle

Differential Revision: D49118319

Pulled By: mdouze

fbshipit-source-id: 5916108bc27154acf7c92021ba579a6ca85d730b
2023-10-04 23:42:30 -07:00
Robert Maynard
458633c203 Remove unneeded PTX code generation from libfaiss builds (#3083)
Summary:
The CMake CUDA Architecture value of `60` means to generate both PTX and SASS for that arch. We only need SASS for the architectures we support, and one PTX version for future hardware versions.

So now we build only SASS for everything (`60-real`) and use 80 as the baseline for newer archs like 90.

By removing this unneeded PTX code we can reduce the libfaiss.a binary to 305MB from the current 484MB.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3083

Reviewed By: wickedfoo

Differential Revision: D49901896

Pulled By: algoriddle

fbshipit-source-id: 15e98f81e191a565319cf855debad33b24ebf10b
2023-10-04 12:11:47 -07:00
Matthijs Douze
2b48901b51 Remove 1L and 1UL
Summary: 1L and 1UL are problematic because sizeof(long) depends on the platform
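
A short illustration of the problem: `long` is 32 bits wide on some platforms (for example 64-bit Windows), so `1L << 40` is not portable, whereas a fixed-width type behaves the same everywhere. The helper name below is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the `nbits` lowest bits set. Using uint64_t(1) instead of 1L or
// 1UL keeps the shift 64 bits wide on every platform.
inline uint64_t low_bits_mask(int nbits) {
    assert(nbits >= 0 && nbits < 64);
    return (uint64_t(1) << nbits) - 1;
}
```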

Reviewed By: mlomeli1

Differential Revision: D49911901

fbshipit-source-id: d4e4cb1f0283a33330bf1b8ca6b7f7bf41bc6ff4
2023-10-04 09:11:48 -07:00
Alexandr Guzhva
3f3321c446 Small refactoring of inverted lists (#3055)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3055

Reviewed By: mlomeli1

Differential Revision: D49859119

Pulled By: mdouze

fbshipit-source-id: 7b457ffcae6cbcd732abcf34f87f5037697c4494
2023-10-04 09:11:48 -07:00
Haijun Yu
834c543240 Fix SingleBestResultHandler bug. if IndexFlatL2 empty data then search topk = 1 return label = 0 not -1. (#3075)
Summary:
Fix a SingleBestResultHandler bug: when an empty IndexFlatL2 is searched with k = 1, the returned label is 0 instead of -1.

For example:

    int d = 64;      // dimension
    int nb = 100000; // database size
    int nq = 1;      // nb of queries

    std::mt19937 rng;
    std::uniform_real_distribution<> distrib;

    float* xb = new float[d * nb];
    float* xq = new float[d * nq];

    for (int i = 0; i < nb; i++) {
        for (int j = 0; j < d; j++)
            xb[d * i + j] = distrib(rng);
        xb[d * i] += i / 1000.;
    }

    for (int i = 0; i < nq; i++) {
        for (int j = 0; j < d; j++)
            xq[d * i + j] = distrib(rng);
        xq[d * i] += i / 1000.;
    }

    faiss::IndexFlatL2 index(d); // call constructor
    printf("is_trained = %s\n", index.is_trained ? "true" : "false");

    int k = 1;

    { // sanity check: search 1 first vectors of xb
        idx_t* I = new idx_t[k * nq];
        float* D = new float[k * nq];

        index.search(nq, xb, k, D, I); // *I = 0 not -1
    }
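
The expected behavior can be illustrated with a toy single-best result holder: if the label starts at -1 and is only overwritten when a candidate is seen, an empty index naturally reports -1. `ToySingleBest` is a hypothetical stand-in for this example, not the Faiss `SingleBestResultHandler`.

```cpp
#include <cstdint>
#include <limits>

// Keeps the single best (smallest) distance seen so far and its label.
struct ToySingleBest {
    float best_dis = std::numeric_limits<float>::infinity();
    int64_t best_id = -1; // -1 means "no result"

    void add_result(float dis, int64_t id) {
        if (dis < best_dis) {
            best_dis = dis;
            best_id = id;
        }
    }
};
```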

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3075

Reviewed By: pemazare

Differential Revision: D49749983

Pulled By: mdouze

fbshipit-source-id: 10e9784035118b9e33e109180cab425de28d4ded
2023-10-03 02:30:47 -07:00
Matthijs Douze
9db182460c Relax IVFFlatDedup test (#3077)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3077

This diff relaxes some IVFFlatDedup tests where distances are slightly different over runs.
Should fix

https://app.circleci.com/pipelines/github/facebookresearch/faiss/4709/workflows/8c8213bf-8fe0-4c4e-9a7d-991f44bf1010/jobs/25551

https://app.circleci.com/pipelines/github/facebookresearch/faiss/4709/workflows/8c8213bf-8fe0-4c4e-9a7d-991f44bf1010/jobs/25547

Reviewed By: algoriddle

Differential Revision: D49732349

fbshipit-source-id: 728b9885c6b7d6ba697ccb6bacc0abd0ee2b0679
2023-09-29 01:16:59 -07:00
generatedunixname89002005325676
0f182519a6 Daily arc lint --take CLANGFORMAT
Reviewed By: bigfootjon

Differential Revision: D49724184

fbshipit-source-id: dae2af01c245516e910801e11acda3fdd8bfaa54
2023-09-28 08:26:29 -07:00
Matthijs Douze
cf90435737 fix flaky GPU test
Summary:
Relaxed a test slightly to avoid transient failures.
Errors here:
https://www.internalfb.com/intern/testinfra/diagnostics/5066549780888824.844425016185500.1695878046/
https://www.internalfb.com/intern/testinfra/diagnostics/16044073676393084.844425016185500.1695809788/

Reviewed By: mlomeli1

Differential Revision: D49726244

fbshipit-source-id: 6c55efb8ff5b470dbd426129cb12fb9a0b40939f
2023-09-28 07:24:46 -07:00
Alexandr Guzhva
e18de2303e Fix chunk-based processing in ResidualCoarseQuantizer::search() (#3047)
Summary:
Adds a missing function argument to ResidualCoarseQuantizer() whenever the data is processed in chunks

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3047

Reviewed By: mlomeli1

Differential Revision: D49687858

Pulled By: mdouze

fbshipit-source-id: 1456138fe1ff3a033b73e97f16470ac8ceca60ab
2023-09-28 02:45:59 -07:00
Alexandr Guzhva
a1814be1b8 Simplify dependency components chain (#3058)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3058

Reviewed By: mlomeli1

Differential Revision: D49687780

Pulled By: mdouze

fbshipit-source-id: 28c5608162e814ec9c83eedde3922b2e9283bff7
2023-09-28 02:44:11 -07:00
Alexandr Guzhva
56b108703e move fvec_madd_* functions declarations to a right header (#3054)
Summary:
The implementations for `fvec_madd` and `fvec_madd_and_argmin` are in `utils/distances.cpp`, so I moved the declarations from `utils/utils.h` to `utils/distances.h`

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3054

Reviewed By: mlomeli1

Differential Revision: D49687725

Pulled By: mdouze

fbshipit-source-id: b98c13f5710f06daba479767a7aab8d62d6e6ddf
2023-09-28 02:35:22 -07:00
Alexandr Guzhva
0780a281dc Fix a couple of type mismatches (#3059)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3059

Reviewed By: mdouze

Differential Revision: D49617392

Pulled By: mlomeli1

fbshipit-source-id: 6a81fcae7be138a734edcb3996c369256de727ba
2023-09-27 07:25:11 -07:00
Alexandr Guzhva
592f3016d1 Unneeded field, exists in a baseclass (#3064)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3064

Reviewed By: pemazare

Differential Revision: D49617455

Pulled By: mlomeli1

fbshipit-source-id: 2d35cfc3c2afc2536ac0d614b3dbe1401134d03f
2023-09-27 06:15:14 -07:00
chasingegg
6218111233 Fix some typos (#3056)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3056

Reviewed By: pemazare

Differential Revision: D49617607

Pulled By: mlomeli1

fbshipit-source-id: b2d5df67e88e029882e697597af9f3fc8fe1e64c
2023-09-27 03:17:41 -07:00
generatedunixname89002005287564
d85601d972 fairring, faiss, fairness (4401366386162573988)
Reviewed By: r-barnes

Differential Revision: D49181434

fbshipit-source-id: 0554ec62155b422e4abe9cec709b69587f71dea0
2023-09-14 00:50:50 -07:00
generatedunixname89002005287564
50be4eaa1e faiss, falcon (1203443027085661913)
Reviewed By: r-barnes

Differential Revision: D49090185

fbshipit-source-id: d02bf6c1929cff1eeeb2e57cd868f31c8cc3bcec
2023-09-13 11:43:55 -07:00
Matthijs Douze
c8d6f7bb2b fix CI issues after cross-matrix diff (#3042)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3042

to fix nightly builds

Reviewed By: mlomeli1

Differential Revision: D48969974

fbshipit-source-id: b7206aac907ed65caf182a95cf22ec463bb58dc4
2023-09-06 07:55:15 -07:00
Naveen Tatikonda
4699365668 Fix is_trained in IndexHNSWSQ (#3034)
Summary:
### Description
Even though `is_trained` is set to `true` in [IndexScalarQuantizer](https://github.com/facebookresearch/faiss/blob/main/faiss/IndexScalarQuantizer.cpp#L34), it is overwritten as `false` in [IndexHNSW](https://github.com/facebookresearch/faiss/blob/main/faiss/IndexHNSW.cpp#L910), which makes this [validation check](https://github.com/facebookresearch/faiss/blob/main/faiss/IndexHNSW.cpp#L363) fail while ingesting vectors. Raising this PR with a small change to fix it.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3034

Reviewed By: pemazare

Differential Revision: D48900445

Pulled By: mdouze

fbshipit-source-id: 16b1cf17e9d8900c4a42956d466e30c76b13d064
2023-09-05 07:36:53 -07:00
generatedunixname89002005325676
1d6db933ff Daily arc lint --take CLANGFORMAT
Reviewed By: h-friederich

Differential Revision: D48948669

fbshipit-source-id: b3029b6b61a74a19b28a33cf5e9297f5b81dbe39
2023-09-04 05:41:45 -07:00
Matthijs Douze
9dc75d026d reduce cross table size (#3012)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3012

The cross-tables for codebook construction contained the dot products between codebook entries, which is not necessary (and caused OOMs in some cases). This diff computes only the off-diagonal blocks.

Reviewed By: pemazare

Differential Revision: D48448615

fbshipit-source-id: 494b54e2900754a3ff5d3c8073cb9a768e578c58
2023-09-01 07:06:14 -07:00
Matthijs Douze
039409d950 split off RQ encoding steps to another file (#3011)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3011

After Alexandr's optimizations the ResidualQuantizer code has become harder to read. Split off the quantization code to a separate .h / .cpp to make it clearer.

Reviewed By: pemazare

Differential Revision: D48448614

fbshipit-source-id: c90d572ea3afe12a7a7e5092f88710e8eceaa2d1
2023-09-01 07:06:14 -07:00
Matthijs Douze
67d87275f8 Clean up batch comments + obey IO_FLAG_SKIP_PRECOMPUTE_TABLE (#3013)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3013

To avoid OOM when loading some RCQs, don't precompute cross product tables when io_flags contains bit IO_FLAG_SKIP_PRECOMPUTE_TABLE

Reviewed By: pemazare

Differential Revision: D48448616

fbshipit-source-id: a261259f1fb583aa358d6b6c42d9b851e9729247
2023-09-01 07:06:14 -07:00
Matthijs Douze
82352dd453 make nbits configurable for graph indices based on PQ (#3031)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3031

As requested in

https://github.com/facebookresearch/faiss/issues/3027

Indeed, PQ sizes with nbits > 8 are good tradeoffs, so it is interesting to support them.

Reviewed By: pemazare

Differential Revision: D48860659

fbshipit-source-id: 6f3c642e0902e1523bef36db6be3af3688d529a5
2023-09-01 02:37:33 -07:00
Matthijs Douze
5c4bd3feb3 Cleanup clustering code (#3030)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3030

Added default arguments to the .h file (for some reason I forgot this file when migrating default args).
Logging a hash value in MatrixStats, useful to check if two runs really really run on the same matrix...

Reviewed By: pemazare

Differential Revision: D48834343

fbshipit-source-id: 7c1948464e66ada1f462f4486f7cf3159bbf9dfd
2023-08-31 01:11:45 -07:00
Corey J. Nolet
3888f9bb11 Using expanded distance forms in RaftFlatIndex.cu (#3021)
Summary:
This is a minor bug that comes with a perf impact. The classic FAISS `FlatIndex` always uses expanded form of distance computation even though an argument `exactDistances` is provided. `RaftFlatIndex` was using this argument to determine whether the computation should be exhaustive.
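
For reference, the "expanded form" of the squared L2 distance is ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y>, which allows the norms to be precomputed and the inner products to be evaluated as a matrix product. The sketch below compares it with the direct form on the CPU; the function names are hypothetical and this is not the RAFT or Faiss GPU code.

```cpp
#include <cstddef>

// Plain inner product <x, y>.
float inner_product(const float* x, const float* y, size_t d) {
    float s = 0;
    for (size_t i = 0; i < d; i++) {
        s += x[i] * y[i];
    }
    return s;
}

// Direct form: sum of squared coordinate differences.
float l2_sqr_direct(const float* x, const float* y, size_t d) {
    float s = 0;
    for (size_t i = 0; i < d; i++) {
        float diff = x[i] - y[i];
        s += diff * diff;
    }
    return s;
}

// Expanded form: reuses squared norms computed once per vector.
float l2_sqr_expanded(
        const float* x,
        const float* y,
        size_t d,
        float x_sqnorm,
        float y_sqnorm) {
    return x_sqnorm + y_sqnorm - 2.0f * inner_product(x, y, d);
}
```

The two forms agree mathematically, but the expanded form can lose precision through cancellation when the vectors are close to each other, so results may differ slightly in floating point.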

This PR includes one additional change to eagerly initialize the `cublas_handle` on the `device_resources` instance when it's created.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3021

Reviewed By: pemazare

Differential Revision: D48739660

Pulled By: mdouze

fbshipit-source-id: a361334eb243df86c169c69d24bb10fed8876ee9
2023-08-30 09:05:59 -07:00