Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3144
Visualize results of running the benchmark with Pareto optima filtering (a small sketch of the filtering idea follows the list):
1. per index or across indices
2. for space, time or space & time
3. knn or range search, the latter @ specific precision
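To make the filtering concrete, here is a minimal sketch of what Pareto-optimum filtering over (time, accuracy) pairs amounts to; this is not the bench_fw code, and the `Result` struct with its `time`/`accuracy` fields is a hypothetical stand-in.
```
#include <vector>

struct Result {
    double time;      // cost axis, lower is better (e.g. search time)
    double accuracy;  // quality axis, higher is better (e.g. knn intersection)
};

// Keep a result only if no other result is at least as fast and at least as
// accurate while being strictly better on one of the two axes.
std::vector<Result> pareto_front(const std::vector<Result>& results) {
    std::vector<Result> front;
    for (size_t i = 0; i < results.size(); i++) {
        bool dominated = false;
        for (size_t j = 0; j < results.size() && !dominated; j++) {
            dominated = j != i &&
                    results[j].time <= results[i].time &&
                    results[j].accuracy >= results[i].accuracy &&
                    (results[j].time < results[i].time ||
                     results[j].accuracy > results[i].accuracy);
        }
        if (!dominated) {
            front.push_back(results[i]);
        }
    }
    return front;
}
```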
Reviewed By: mdouze
Differential Revision: D51552775
fbshipit-source-id: d4f29e3d46ef044e71b54439b3972548c86af5a7
Summary:
1. Support for index construction parameters outside of the factory string (arbitrary depth of quantizers); see the illustration after this list.
2. Refactor that provides an index wrapper which is a prereq for the optimizer, which will generate indices from pre-optimized components (particularly quantizers)
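As an illustration of point 1 (hedged: this uses the core Faiss C++ API directly, not the benchmark framework's own wrapper), the coarse quantizer can be built separately with its own construction parameters instead of being encoded in the factory string:
```
#include <faiss/IndexHNSW.h>
#include <faiss/IndexIVFFlat.h>

int main() {
    int d = 64;
    // Quantizer constructed outside the factory string, so its own
    // construction parameters (HNSW M, efConstruction) can be set explicitly.
    faiss::IndexHNSWFlat quantizer(d, 32);
    quantizer.hnsw.efConstruction = 80;

    // The IVF index then wraps the pre-built quantizer.
    faiss::IndexIVFFlat index(&quantizer, d, 1024, faiss::METRIC_L2);
    // train / add / search as usual ...
    return 0;
}
```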
Reviewed By: mdouze
Differential Revision: D51427452
fbshipit-source-id: 014d05dd798d856360f2546963e7cad64c2fcaeb
Summary:
1. Support `search_preassigned` in IVFFastScan (usage sketch after this list)
2. `try_extract_index_ivf` to search recursively and support `IndexRefine`
3. `get_InvertedListScanner` to fail where not available
4. Workaround an OpenMP issue with `IndexIVFSpectralHash`
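A hedged usage sketch for item 1, based on my reading of the `IndexIVF::search_preassigned` signature (double-check against `IndexIVF.h`): the coarse assignment is computed once and then reused for the fine search.
```
#include <faiss/IndexIVF.h>
#include <faiss/index_factory.h>
#include <memory>
#include <vector>

void search_preassigned_example(
        const float* xb, size_t nb, const float* xq, size_t nq) {
    int d = 32, k = 5;
    std::unique_ptr<faiss::Index> index(
            faiss::index_factory(d, "IVF64,PQ16x4fs")); // an IVFFastScan index
    index->train(nb, xb);
    index->add(nb, xb);

    auto* ivf = dynamic_cast<faiss::IndexIVF*>(index.get());
    size_t nprobe = ivf->nprobe;

    // 1) Coarse assignment, done once (and reusable across searches).
    std::vector<faiss::idx_t> assign(nq * nprobe);
    std::vector<float> coarse_dis(nq * nprobe);
    ivf->quantizer->search(nq, xq, nprobe, coarse_dis.data(), assign.data());

    // 2) Fine search restricted to the preassigned lists.
    std::vector<float> D(nq * k);
    std::vector<faiss::idx_t> I(nq * k);
    ivf->search_preassigned(
            nq, xq, k, assign.data(), coarse_dis.data(),
            D.data(), I.data(), /*store_pairs=*/false);
}
```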
Reviewed By: mdouze
Differential Revision: D51427241
fbshipit-source-id: 365e3f11d24e80f101f986fc358c28dcc00805fa
Summary:
Introduces the `FAISS_ALWAYS_INLINE` directive (a macro wrapping a compiler-specific always-inline attribute) and improves `ScalarQuantizer` performance with it.
Most of the performance-critical methods of `ScalarQuantizer` are marked with this new directive, because a compiler (especially an older one) may be unable to inline them properly on its own. In some of my GCC experiments, such inlining yields +50% queries per second in search.
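For context, an always-inline macro of this kind is usually defined along the following lines; this is a generic sketch, not necessarily the exact Faiss definition (which lives in the platform macros header).
```
#include <cstdint>

// Generic always-inline macro: MSVC uses a keyword, GCC/Clang an attribute.
#if defined(_MSC_VER)
#define FAISS_ALWAYS_INLINE __forceinline
#else
#define FAISS_ALWAYS_INLINE inline __attribute__((always_inline))
#endif

// Usage on a small, hot method that the compiler might otherwise not inline.
struct ToyCodec {
    FAISS_ALWAYS_INLINE float decode_component(const uint8_t* code, int i) const {
        return (code[i] + 0.5f) / 255.0f;
    }
};
```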
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3141
Reviewed By: algoriddle
Differential Revision: D51615609
Pulled By: mdouze
fbshipit-source-id: 9c755c3e1a289b5d498306c1b9d6fcc21b0bec28
Summary: It seems that for some build modes, SWIG chokes on static_assert, so protect this with #ifndef SWIG. Let's see what the tests say....
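For reference, the guard pattern looks like this; a generic sketch, not the exact assertion being protected here.
```
// SWIG's parser can trip over static_assert in some build modes, so hide it
// from the wrapper generator; the C++ compiler still sees and checks it.
#ifndef SWIG
static_assert(sizeof(long long) == 8, "example compile-time check");
#endif
```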
Reviewed By: algoriddle
Differential Revision: D50971042
fbshipit-source-id: 83e2ccb464c0bd024cbf3a494357147d75a76ca2
Summary:
This PR adds functionality so that an IVF index can be searched and the corresponding codes returned. It also adds a few functions to compress int arrays into a bit-compact representation.
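As an illustration of the second part, a generic sketch of bit-compact packing of integer arrays; this is not the actual functions added in this PR, just the underlying idea.
```
#include <cstdint>
#include <vector>

// Pack values of `nbit` bits each (nbit <= 32) into a contiguous bit stream,
// so n values take ceil(n * nbit / 8) bytes instead of 4 or 8 bytes each.
std::vector<uint8_t> pack_ints(const std::vector<uint32_t>& vals, int nbit) {
    std::vector<uint8_t> out((vals.size() * nbit + 7) / 8, 0);
    size_t bit = 0;
    for (uint32_t v : vals) {
        for (int b = 0; b < nbit; b++, bit++) {
            if ((v >> b) & 1) {
                out[bit / 8] |= uint8_t(1) << (bit % 8);
            }
        }
    }
    return out;
}
```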
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3143
Test Plan:
```
buck test //faiss/tests/:test_index_composite -- TestSearchAndReconstruct
buck test //faiss/tests/:test_standalone_codec -- test_arrays
```
Reviewed By: algoriddle
Differential Revision: D51544613
Pulled By: mdouze
fbshipit-source-id: 875f72d0f9140096851592422570efa0f65431fc
Summary:
In the GPU IVF (Flat, SQ and PQ) code, temporary memory is needed to store the unfiltered (or partially filtered) vector distances calculated during list scanning, which are then k-selected by separate kernels.
While a batch query may be presented to an IVF index, the amount of temporary memory needed to store all these unfiltered distances prior to filtering can be very large, depending upon IVF characteristics (such as the maximum number of vectors encoded in any of the IVF lists). In that case we cannot process the entire batch of queries at once and must instead tile over the batch of queries to reuse the temporary memory that we make available for these distances.
The old code duplicated this roughly equivalent logic in 3 different places (the IVFFlat/SQ code, IVFPQ with precomputed codes, and IVFPQ without precomputed codes). Furthermore, when little or no temporary memory was available, or when the available temporary memory was (vastly) exceeded by the amount needed to handle a particular query, the old code enforced a minimum of 8 queries to be processed at once. In certain cases (huge IVF list imbalance), this memory request could exceed the amount of memory that can be safely allocated on a GPU.
This diff consolidates the original 3 separate places where this calculation took place into one place in IVFUtils. The logic proceeds roughly as before, figuring out how many queries can be processed in the available temporary memory, except we add a new heuristic for the case where the number of queries that can be concurrently processed falls below 8. This could be due to little temporary memory being available, or due to huge memory requirements. In this case, we ignore the amount of temporary memory available and instead see how many queries' memory requirements would fit into a single 512 MiB memory allocation, so we cap this amount at something reasonable. If the query still cannot be satisfied with this allocation, we proceed executing 1 query at a time (which, note, could still potentially exhaust GPU memory, but that error is unavoidable).
While a different heuristic using the amount of actual memory allocatable on the device could be used instead of this fixed 512 MiB amount, there is no guarantee to my knowledge that a single cudaMalloc up to this limit could succeed (e.g., GPU reports 3 GiB available, you attempt to allocate all of that in a single allocation), so we just pick an amount which is a reasonable balance between efficiency (parallelism) and memory consumption. Note that if not enough temporary memory is available and a single 512 MiB allocation fails, then there is likely little memory to proceed efficiently at all under any scenario, as Faiss does require some headroom in terms of memory available for scratch spaces.
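In code terms, the heuristic described above boils down to roughly the following; a simplified sketch with illustrative names, not the actual IVFUtils implementation.
```
#include <algorithm>
#include <cstddef>

// How many queries of a batch to process per tile, given the temp memory
// available and the worst-case per-query scratch requirement.
size_t queriesPerTile(
        size_t numQueries, size_t bytesPerQuery, size_t tempMemAvailable) {
    const size_t kMinQueries = 8;
    const size_t kFallbackAlloc = size_t(512) * 1024 * 1024; // 512 MiB cap

    size_t fit = bytesPerQuery ? tempMemAvailable / bytesPerQuery : numQueries;
    if (fit >= kMinQueries) {
        return std::min(fit, numQueries);
    }
    // Too little temp memory (or huge per-query needs): ignore temp memory
    // and see how many queries fit in a single capped allocation instead.
    size_t fitCap = bytesPerQuery ? kFallbackAlloc / bytesPerQuery : numQueries;
    // Even if a single query exceeds the cap, still proceed 1 query at a time.
    return std::max<size_t>(1, std::min(fitCap, numQueries));
}
```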
Reviewed By: mdouze
Differential Revision: D45574455
fbshipit-source-id: 08f5204e3e9656627c9134d7409b9b0960f07b2d
Summary:
nvcc, starting with CUDA 11.5, offers a `-hls` option to generate host-side linker scripts to support large cubin files.
Since faiss supports CUDA 11.4, we replicate that behavior by injecting the same linker script into the link line manually.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3115
Reviewed By: mdouze
Differential Revision: D51308908
Pulled By: algoriddle
fbshipit-source-id: c6dd073cd3f44dbc99d2e2da97f79b9ebc843b59
Summary:
This diff fixes the bug associated with moving Faiss GPU to CUDA 12.
The following tests were succeeding in CUDA 11.x but failed in CUDA 12:
```
✗ faiss/gpu/test:test_gpu_basics_py - test_input_types (faiss.gpu.test.test_gpu_basics.TestKnn)
✗ faiss/gpu/test:test_gpu_basics_py - test_dist (faiss.gpu.test.test_gpu_basics.TestAllPairwiseDistance)
✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Add_L2
✗ faiss/gpu/test:test_gpu_basics_py - test_input_types_tiling (faiss.gpu.test.test_gpu_basics.TestKnn)
✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Add_IP
✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.Float16Coarse
✗ faiss/gpu/test:test_gpu_index_ivfpq - TestGpuIndexIVFPQ.LargeBatch
```
It took a long while to track down, but the issue presented itself when a number of dimensions not divisible by 32 was used in cases where we needed to calculate an L2 norm for vectors, which occurs in brute-force L2 distance computation as well as certain L2 IVFPQ operations. The issue appeared because some tests were using 33 as the dimensionality of the vectors.
The issue is that the number of threads given to the L2 norm kernel was effectively `min(dims, 1024)` where 1024 is the standard maximum number of CUDA threads per CTA on all devices at present. In the case where the result was not a multiple of 32, this would result in a partial warp being passed to the kernel (with non-participating lanes having no side effects).
The change in CUDA 12 here seemed to be a change in the compiler behavior for warp-synchronous shuffle instructions (such as `__shfl_up_sync`). In the case of the partial warp, we were passing `0xffffffff` as the active lane mask, implying that all lanes were present for the warp. For dims = 33, we would have 1 full warp with all lanes present and 1 partial warp with only 1 active thread, so `0xffffffff` is a lie in this case. Prior to CUDA 12, these shuffle instructions appeared to pass 0 around for the lanes not present (or perhaps stalled?), so the result was still calculated correctly. However, with the change to CUDA 12, the compiler and/or device firmware (or something) interprets this differently, and the warp lanes not present provided garbage. The shuffle instructions were used to perform in-warp reductions (e.g., summing a bunch of floating point numbers), namely those needed to sum up the L2 vector norm value. So for dims = 32 or dims = 64 (and, bizarrely, dims = 40 and some other choices) it still worked, but for dims = 33 it was adding in garbage, producing erroneous results.
This diff removes the non-dim-loop specialization of runL2Norm (where we could statically avoid a for loop over dimensions when the threadblock is exactly sized to the number of dimensions present) and just uses the general-purpose fallback. Second, we now always provide a whole number of warps when running the L2 norm kernel, avoiding the issue of the warp-synchronous instructions not having a full warp present.
This bug has been present since the code was written in 2016 and was technically wrong before, but it only surfaced as a bug/problem with the CUDA 12 change.
tl;dr: if you use any kind of `_sync` instruction involving warp sync, always have a whole number of warps present, k thx.
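To make the failure mode concrete, here is a generic warp-reduction sketch (not the Faiss kernel) showing why the all-lanes mask is only valid with whole warps.
```
// Warp-level sum reduction. The 0xffffffff mask asserts that all 32 lanes of
// the warp participate in the shuffle. If the kernel is launched with, say,
// 33 threads, the second warp has a single active lane, the assertion is
// false, and under CUDA 12 the missing lanes contribute garbage. Launching a
// whole number of warps (blockDim.x a multiple of 32) and padding the extra
// lanes' inputs with 0 keeps the full mask valid.
__device__ float warpReduceSum(float val) {
    for (int offset = 16; offset > 0; offset /= 2) {
        val += __shfl_down_sync(0xffffffff, val, offset);
    }
    return val;
}

__global__ void squaredNormSingleWarp(const float* x, int dims, float* out) {
    // Inactive *work* is padded with 0; the lanes themselves still execute.
    float v = threadIdx.x < dims ? x[threadIdx.x] * x[threadIdx.x] : 0.0f;
    v = warpReduceSum(v);
    if (threadIdx.x == 0) {
        *out = v; // correct only when launched with exactly one full warp
    }
}
```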
Reviewed By: mdouze
Differential Revision: D51335172
fbshipit-source-id: 97da88a8dcbe6b4d8963083abc01d5d2121478bf
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3097
A framework for evaluating indices offline.
Long term objectives:
1. Generate offline similarity index performance data with test datasets both for existing indices and automatically generated alternatives. That is, given a dataset and some constraints this workflow should automatically discover optimal index types and parameter choices as well as evaluate the performance of existing production indices and their parameters.
2. Allow researchers, platform owners (Laser, Unicorn) and product teams to understand how different index types perform on their datasets and make optimal choices wrt their objectives. Longer term, this should enable automatic decision-making/auto-tuning.
Constraints, design choices:
1. I want to run the same evaluation on Meta-internal infrastructure (fblearner, data from hive and manifold) or on a local machine + research cluster (data on local disk or NFS) via OSS Faiss. Via fblearner, I want this to work in a way that it can be turned into a service and plugged into Unicorn or Laser, while the core Faiss part can be used/referred to in our research and to update the wiki with the latest results/recommendations for public datasets.
2. It must support a range of metrics for KNN and range search, and it should be easy to add new ones. Cost metrics need to be fine-grained to allow extrapolation.
3. It should automatically sweep all query time params (eg. nprobe, polysemous code hamming distance, params of quantizers), using `OperatingPointsWithRanges` to cut down the optimal param search space. (For now, it sweeps nprobes only.)
4. [FUTURE] It will generate/sweep index creation hyperparams (factory strings, quantizer sizes, quantizer params), using heuristics.
5. [FUTURE] It will sweep the dataset size: start small test with e.g. 100K db vectors and go up to millions, billions potentially, while narrowing down the index+param choices at each step.
6. [FUTURE] Extrapolate perf metrics (cost and accuracy)
7. Intermediate results must be saved (to disk, to manifold) throughout, and reused as much as possible to cut down on overall runtime and enable faster iteration during development.
For range search, this diff supports the metric proposed in https://docs.google.com/document/d/1v5OOj7kfsKJ16xzaEHuKQj12Lrb-HlWLa_T2ct0LJiw/edit?usp=sharing. I also added support for the classical case where the scoring function steps from 1 to 0 at some arbitrary threshold.
For KNN, I added knn_intersection, but other metrics, particularly recall@1 will also be interesting. I also added the distance_ratio metric, which we previously discussed as an interesting alternative, since it shows how much the returned results approximate the ground-truth nearest-neighbours in terms of distances.
In the test case, I evaluated three current production indices for VCE with 1M vectors in the database and 10K queries. Each index is tested at various operating points (nprobes), which are shown on the charts. The results are not extrapolated to the true scale of these indices.
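A stripped-down sketch of the kind of query-time sweep and knn_intersection metric described above; names and the choice of nprobe values are illustrative, the framework itself is far more general.
```
#include <faiss/IndexIVF.h>
#include <unordered_set>
#include <vector>

// Fraction of ground-truth k-NN ids recovered by the index, averaged over queries.
double knn_intersection(
        const std::vector<faiss::idx_t>& I,
        const std::vector<faiss::idx_t>& gt,
        size_t nq,
        size_t k) {
    size_t hits = 0;
    for (size_t q = 0; q < nq; q++) {
        std::unordered_set<faiss::idx_t> ref(
                gt.begin() + q * k, gt.begin() + (q + 1) * k);
        for (size_t j = 0; j < k; j++) {
            hits += ref.count(I[q * k + j]);
        }
    }
    return double(hits) / (nq * k);
}

void sweep_nprobe(
        faiss::IndexIVF& index,
        const float* xq,
        size_t nq,
        size_t k,
        const std::vector<faiss::idx_t>& gt) {
    std::vector<float> D(nq * k);
    std::vector<faiss::idx_t> I(nq * k);
    for (size_t nprobe : {1, 4, 16, 64}) {
        index.nprobe = nprobe; // the query-time parameter being swept
        index.search(nq, xq, k, D.data(), I.data());
        double ki = knn_intersection(I, gt, nq, k);
        (void)ki; // record (nprobe, measured time, ki) as one operating point
    }
}
```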
Reviewed By: yonglimeta
Differential Revision: D49958434
fbshipit-source-id: f7f567b299118003955dc9e2d9c5b971e0940fc5
Summary:
This is a design proposal that demonstrates an approach to enabling optional support for [RAFT](https://github.com/rapidsai/raft) versions of IVF PQ and IVF Flat (and brute force w/ fused k-selection when k <= 64). There are still a few open issues and design discussions needed for the new RAFT index types to support the full range of features of FAISS' current GPU index types.
Checklist for the integration todos:
- [x] Rebase on current `main` branch
- [X] The raft handle has been plugged directly into the StandardGpuResources
- [X] `FlatIndex` passing Googletests
- [x] Use `CodePacker` to support `copyFrom()` and `copyTo()`
- [X] `IVF-flat` passing Googletests
- [ ] Raise appropriate exceptions for operations which are not yet supported by RAFT
Additional features we've discussed:
- [x] Separate IVF lists into individual memory chunks
- [ ] Saving/loading
To build FAISS w/ optional RAFT support:
```
mkdir build
cd build
cmake ../ -DFAISS_ENABLE_RAFT=ON -DFAISS_ENABLE_GPU=ON
make -j
```
For development/testing, we've also supplied a bash script to make things easier: `build.sh`
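Once built with RAFT enabled, selecting the RAFT backend is expected to be a flag on the GPU index config; a hedged sketch, where the `use_raft` field name reflects my understanding of this integration and may differ in the final API.
```
#include <faiss/gpu/GpuIndexIVFFlat.h>
#include <faiss/gpu/StandardGpuResources.h>

void raft_ivf_flat_example(const float* xb, size_t nb, int d) {
    faiss::gpu::StandardGpuResources res;

    faiss::gpu::GpuIndexIVFFlatConfig config;
    config.use_raft = true; // assumption: routes to the RAFT IVF-Flat backend

    faiss::gpu::GpuIndexIVFFlat index(
            &res, d, /*nlist=*/1024, faiss::METRIC_L2, config);
    index.train(nb, xb);
    index.add(nb, xb);
}
```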
Below is a benchmark comparing the training of IVF Flat indices for RAFT and FAISS:
[benchmark chart: RAFT vs. FAISS IVF-Flat training]
The benchmark was produced using Googlebench in [this](https://github.com/tfeher/raft/tree/raft_faiss_bench) RAFT fork. We're going to provide benchmarks for the queries as well. There are still a couple bottlenecks to be removed in the IVF-Flat training implementation and we'll update the current benchmark when ready.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2521
Test Plan: `buck test mode/dev-nosan //faiss/gpu/test:test_gpu_index_ivfflat`
Reviewed By: algoriddle
Differential Revision: D49118319
Pulled By: mdouze
fbshipit-source-id: 5916108bc27154acf7c92021ba579a6ca85d730b
Summary:
The CMake CUDA architecture value `60` means to generate both PTX and SASS for that arch. We only need SASS for the architectures we support, plus one PTX version for future hardware.
So now we build SASS for everything (`60-real`) and use 80 as the baseline for newer archs like 90.
By removing this unneeded PTX code we can reduce the libfaiss.a binary to 305MB from the current 484MB.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3083
Reviewed By: wickedfoo
Differential Revision: D49901896
Pulled By: algoriddle
fbshipit-source-id: 15e98f81e191a565319cf855debad33b24ebf10b
Summary: 1L and 1UL are problematic because sizeof(long) depends on the platform
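A small illustration of the pitfall (generic example, not the code changed here): on LLP64 platforms such as 64-bit Windows, `long` is 32 bits, so the width of the literal silently changes the result.
```
#include <cstdint>

// On LP64 (Linux/macOS) long is 64-bit; on LLP64 (64-bit Windows) it is
// 32-bit, so shifting 1L by 40 is undefined behaviour there.
uint64_t bad_mask = 1L << 40;            // platform-dependent, UB on LLP64
uint64_t good_mask = uint64_t(1) << 40;  // explicit 64-bit literal, portable
```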
Reviewed By: mlomeli1
Differential Revision: D49911901
fbshipit-source-id: d4e4cb1f0283a33330bf1b8ca6b7f7bf41bc6ff4
Summary:
Adds a function argument to `ResidualCoarseQuantizer()` that was missing in the code path where the data is processed in chunks.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3047
Reviewed By: mlomeli1
Differential Revision: D49687858
Pulled By: mdouze
fbshipit-source-id: 1456138fe1ff3a033b73e97f16470ac8ceca60ab
Summary:
The implementations of `fvec_madd` and `fvec_madd_and_argmin` are in `utils/distances.cpp`, so I moved the declarations from `utils/utils.h` to `utils/distances.h`.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3054
Reviewed By: mlomeli1
Differential Revision: D49687725
Pulled By: mdouze
fbshipit-source-id: b98c13f5710f06daba479767a7aab8d62d6e6ddf
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3012
The cross-tables for codebook construction contained the dot products between codebook entries, which is not necessary (and caused OOMs in some cases). This diff computes only the off-diagonal blocks.
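For intuition, a generic sketch of the change (not the actual Faiss code): with M codebooks of K entries each, only cross terms between different codebooks are needed, so the M diagonal blocks are never materialized.
```
#include <cstddef>
#include <vector>

// codebooks: M tables of K x d floats (row-major, flattened). Returns, for
// each pair i < j, the K x K table of dot products between entries of
// codebook i and codebook j; the diagonal blocks <C_i, C_i> are skipped.
std::vector<std::vector<float>> cross_tables_offdiag(
        const std::vector<std::vector<float>>& codebooks, size_t K, size_t d) {
    const size_t M = codebooks.size();
    std::vector<std::vector<float>> tables;
    for (size_t i = 0; i < M; i++) {
        for (size_t j = i + 1; j < M; j++) {
            std::vector<float> tab(K * K, 0.0f);
            for (size_t a = 0; a < K; a++) {
                for (size_t b = 0; b < K; b++) {
                    float dp = 0.0f;
                    for (size_t t = 0; t < d; t++) {
                        dp += codebooks[i][a * d + t] * codebooks[j][b * d + t];
                    }
                    tab[a * K + b] = dp;
                }
            }
            tables.push_back(std::move(tab));
        }
    }
    return tables;
}
```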
Reviewed By: pemazare
Differential Revision: D48448615
fbshipit-source-id: 494b54e2900754a3ff5d3c8073cb9a768e578c58
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3011
After Alexandr's optimizations the ResidualQuantizer code has become harder to read. Split off the quantization code to a separate .h / .cpp to make it clearer.
Reviewed By: pemazare
Differential Revision: D48448614
fbshipit-source-id: c90d572ea3afe12a7a7e5092f88710e8eceaa2d1
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3030
Added default arguments to the .h file (for some reason I forgot this file when migrating default args).
Logging a hash value in MatrixStats, useful to check if two runs really really run on the same matrix...
Reviewed By: pemazare
Differential Revision: D48834343
fbshipit-source-id: 7c1948464e66ada1f462f4486f7cf3159bbf9dfd
Summary:
This is a minor bug that comes with a perf impact. The classic FAISS `FlatIndex` always uses the expanded form of distance computation even though an `exactDistances` argument is provided, while `RaftFlatIndex` was using this argument to determine whether the computation should be exhaustive.
This PR includes one additional change to eagerly initialize the `cublas_handle` on the `device_resources` instance when it's created.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3021
Reviewed By: pemazare
Differential Revision: D48739660
Pulled By: mdouze
fbshipit-source-id: a361334eb243df86c169c69d24bb10fed8876ee9