17 Commits

Author SHA1 Message Date
Saumya Agarwal
fbc7db2cce Revert D69984379: mem mapping and zero-copy python fixes
Differential Revision:
D69984379

Original commit changeset: 9437b4ad92ef

Original Phabricator Diff: D69984379

fbshipit-source-id: 3cb921fa79b6f20b6455b17e50acc3cb96bcbe7b
2025-03-11 11:43:17 -07:00
Matthijs Douze
631b0fde4f mem mapping and zero-copy python fixes (#4212)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4212

Add files to TARGETS
fix python

Reviewed By: mengdilin

Differential Revision: D69984379

fbshipit-source-id: 9437b4ad92ef49333a44ea37ec194364123fe825
2025-03-11 11:11:14 -07:00
Matthijs Douze
89e93e2105 more fast-scan reconstruction (#4128)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4128

Fix reconstruction code for the fast-scan and IVF fast-scan indices.

Reviewed By: asadoughi

Differential Revision: D68159014

fbshipit-source-id: fb33416eed994196b34f0f6d3014f4d4859b6039
2025-01-14 13:51:02 -08:00
Ali Safaya
86fa0db34e Fix IndexIVFFastScan reconstruct_from_offset method (#4095)
Summary:
Resolves issue https://github.com/facebookresearch/faiss/issues/4089 - IndexIVFPQFastScan crashes with certain nlist values

The `reconstruct_from_offset` method in `IndexIVFFastScan` was incorrectly reconstructing vectors, causing crashes when the `nlist` parameter was not byte-aligned (e.g. 100 instead of 256).

The root cause was that the `list_no` (Voronoi cell number) was not being properly encoded into the `code` vector before passing it to the `sa_decode` function. This resulted in invalid `list_no` values being read in `sa_decode`, triggering the assertion failure `'list_no >= 0 && list_no < nlist'` for some values of `nlist`.

This PR fixes the issue with the following changes to `reconstruct_from_offset`:

1. Encode the `list_no` into the beginning of the `code` vector using the existing `encode_listno` method
2. Start the `BitstringWriter` after the coarse code portion of `code` (shifted by `coarse_code_size()` bytes)
3. Remove the residual centroid addition logic, as it is already handled in `sa_decode`
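
The layout described in steps 1 and 2 can be sketched in plain Python. This is an illustration only, not the Faiss C++ code: the helper names mirror `coarse_code_size` and `encode_listno`, but the little-endian byte layout and the `build_code` helper are assumptions made for the example.

```python
def coarse_code_size(nlist: int) -> int:
    """Whole bytes needed to store a list number in [0, nlist)."""
    nl = nlist - 1
    nbyte = 0
    while nl > 0:
        nbyte += 1
        nl >>= 8
    return nbyte


def encode_listno(list_no: int, nlist: int) -> bytes:
    # Assumed layout: little-endian, exactly coarse_code_size(nlist) bytes.
    return list_no.to_bytes(coarse_code_size(nlist), "little")


def decode_listno(code: bytes, nlist: int) -> int:
    list_no = int.from_bytes(code[:coarse_code_size(nlist)], "little")
    # Same condition as the assertion quoted in the summary.
    assert 0 <= list_no < nlist, "list_no >= 0 && list_no < nlist"
    return list_no


def build_code(list_no: int, fine_code: bytes, nlist: int) -> bytes:
    """Hypothetical helper: coarse prefix first, fine (PQ) code after it."""
    return encode_listno(list_no, nlist) + fine_code


nlist = 100
fine = bytes([0xAB, 0xCD])
code = build_code(42, fine, nlist)
```

If the prefix is missing, the first byte of the fine code (0xAB = 171 here) is read as the list number. With `nlist = 256` that value passes the range check unnoticed, while with `nlist = 100` the assertion fires.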

After these changes:
- Crashes no longer occur for any `nlist` value
- Reconstruction is now correct, matching the output of `IndexIVFPQ`

Fixes https://github.com/facebookresearch/faiss/issues/4089

Please review and let me know if any changes are needed. Thanks!

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4095

Reviewed By: asadoughi

Differential Revision: D67937160

Pulled By: mdouze

fbshipit-source-id: 4705106ba49c01c788b3c75c39c2260615f45764
2025-01-14 13:51:02 -08:00
Michael Norris
eff0898a13 Enable linting: lint config changes plus arc lint command (#3966)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3966

This actually enables the linting.

Manual changes:
- tools/arcanist/lint/fbsource-licenselint-config.toml
- tools/arcanist/lint/fbsource-lint-engine.toml

Automated changes:
`arc lint --apply-patches --take LICENSELINT --paths-cmd 'hg files faiss'`

Reviewed By: asadoughi

Differential Revision: D64484165

fbshipit-source-id: 4f2f6e953c94ef6ebfea8a5ae035ccfbea65ed04
2024-10-22 09:46:48 -07:00
Junjie Qi
14b8af6e73 Fix IVFPQFastScan decode function (#3312)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3312

As [issue #3258](https://github.com/facebookresearch/faiss/issues/3258) mentions, IVFPQFastScan should produce the same decoding result as IVFPQ. However, the current result is not as expected.

In this PR/Diff, we are going to fix the decoding function

Reviewed By: mdouze

Differential Revision: D55264781

fbshipit-source-id: dfdae9eabceadfc5a3ebb851930d71ce3c1c654d
2024-03-25 11:19:40 -07:00
Matthijs Douze
32f0e8cf92 Generalize ResultHandler, support range search for HNSW and Fast Scan (#3190)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3190

This diff adds more result handlers in order to expose them externally.
This enables range search for HNSW and Fast Scan, and nprobe parameter support for FastScan.
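
As a rough illustration of why a shared handler interface makes range search easy to add, the sketch below runs the same scan loop with either a top-k handler or a range handler. The class names, the `add_result` method and the strict `dis < radius` test are assumptions for this example, not the Faiss C++ API.

```python
import heapq


class TopKHandler:
    """Keeps the k smallest (distance, id) pairs."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # stores (-distance, -id); _heap[0] is the worst kept

    def add_result(self, dis, vid):
        item = (-dis, -vid)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, item)
        elif item > self._heap[0]:
            heapq.heapreplace(self._heap, item)

    def results(self):
        return sorted((-d, -i) for d, i in self._heap)


class RangeHandler:
    """Keeps every (distance, id) pair strictly below a radius."""

    def __init__(self, radius):
        self.radius = radius
        self._hits = []

    def add_result(self, dis, vid):
        if dis < self.radius:
            self._hits.append((dis, vid))

    def results(self):
        return sorted(self._hits)


def scan(distances, handler):
    # The scanning loop does not know which kind of search it serves.
    for vid, dis in enumerate(distances):
        handler.add_result(dis, vid)
    return handler.results()


distances = [0.9, 0.1, 0.5, 0.3]
knn = scan(distances, TopKHandler(2))
within = scan(distances, RangeHandler(0.6))
```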

Reviewed By: pemazare

Differential Revision: D52547384

fbshipit-source-id: 271da5ffea6411df3d8e50641abade18bd7b774b
2024-01-11 11:46:30 -08:00
Matthijs Douze
43d86e3073 Relax IVF AQ FastScan (#2940)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2940

This test fails on some occasions.
After investigation, it turns out this is due to non-reproducible behavior of IndexIVFFastScan::search_implem_14 with a parallel loop when there are ties in the results (i.e. the resulting distances are the same but not the ids).
As a workaround I relaxed the test slightly.
+ a fix in the checksum function.
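
One way to relax a comparison so that ties do not cause failures is sketched below. How the test was actually relaxed is not described in this summary, so the function names and the exact rule (distances must match, ids are compared as sets per distance, and the last distance is skipped because a tie at the cut-off can select different ids) are assumptions for illustration. It expects each result list to be sorted by distance.

```python
from collections import defaultdict


def group_ids_by_distance(distances, ids):
    groups = defaultdict(set)
    for d, i in zip(distances, ids):
        groups[d].add(i)
    return groups


def same_results_up_to_ties(ref_d, ref_i, new_d, new_i):
    """True if distances match exactly and ids match wherever unambiguous."""
    if list(ref_d) != list(new_d):
        return False
    ref_groups = group_ids_by_distance(ref_d, ref_i)
    new_groups = group_ids_by_distance(new_d, new_i)
    last = ref_d[-1] if ref_d else None
    for d in ref_groups:
        if d == last:
            # Tie at the cut-off: which ids make it into the top-k is arbitrary.
            continue
        if ref_groups[d] != new_groups[d]:
            return False
    return True


ref_d, ref_i = [1.0, 2.0, 2.0, 3.0], [10, 11, 12, 13]
```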

Reviewed By: algoriddle

Differential Revision: D47229086

fbshipit-source-id: 55e53bcfe47cf33041cc7fd5691b5de65067ce0f
2023-07-05 21:51:12 -07:00
alemagnani
230a97f7cb Support for parallelization in IVFFastScan over both queries and probes (#2380)
Summary:
For search requests with few queries or a single query, this PR adds the ability to run threads over both queries and different clusters of the IVF. For applications where latency is important, this can **dramatically reduce latency for single query requests**.

A new implementation (14) is added. The new implementation could be merged into implementation 12, but for simplicity I created a separate function in this PR.

Tests are added to cover the new implementation, and new tests are added to specifically cover the case where a single query is used.

In my benchmarks a very good reduction of latency is observed for single query requests.
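
The idea of parallelizing a single query over its probed clusters can be sketched as follows. This is a plain-Python illustration of the structure only (Python threads will not reproduce the latency gain, and the real implementation is C++); `scan_list`, `search_single_query` and the dictionary-based inverted lists are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_list(query, entries, k):
    """Top-k (distance, id) pairs of one inverted list, by squared L2."""
    scored = (
        (sum((q - x) ** 2 for q, x in zip(query, vec)), vid)
        for vid, vec in entries
    )
    return heapq.nsmallest(k, scored)


def search_single_query(query, invlists, probes, k, n_threads=4):
    # One task per probed cluster, then merge the partial top-k lists.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = pool.map(lambda p: scan_list(query, invlists[p], k), probes)
        return heapq.nsmallest(k, (r for part in partials for r in part))


invlists = {
    0: [(0, (0.0, 0.0)), (1, (1.0, 0.0))],
    1: [(2, (0.0, 2.0)), (3, (0.1, 0.0))],
    2: [(4, (5.0, 5.0))],
}
result = search_single_query((0.0, 0.0), invlists, probes=[0, 1, 2], k=2)
```

Each worker only needs its own local top-k, so the merge step touches at most `k` candidates per probed cluster.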

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2380

Test Plan:
```
buck test //faiss/tests/:test_fast_scan_ivf -- implem14
buck test //faiss/tests/:test_fast_scan_ivf -- implem15
```

Reviewed By: alexanderguzhva

Differential Revision: D38074577

Pulled By: mdouze

fbshipit-source-id: e7a20b6ea2f9216e0a045764b5d7b7f550ea89fe
2022-08-31 05:37:53 -07:00
Check Deng
838f85cb52 Implement search methods for ProductAdditiveQuantizer (#2336)
Summary:
Work in progress.

This PR is going to implement the following search methods for ProductAdditiveQuantizer, including index factory and I/O:

- [x] IndexProductAdditiveQuantizer
- [x] IndexIVFProductAdditiveQuantizer
- [x] IndexProductAdditiveQuantizerFastScan
- [x] IndexIVFProductAdditiveQuantizerFastScan

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2336

Test Plan:
buck test //faiss/tests/:test_fast_scan
buck test //faiss/tests/:test_fast_scan_ivf
buck test //faiss/tests/:test_local_search_quantizer
buck test //faiss/tests/:test_residual_quantizer

Reviewed By: alexanderguzhva

Differential Revision: D37172745

Pulled By: mdouze

fbshipit-source-id: 6ff18bfc462525478c90cd42e21805ab8605bd0f
2022-07-27 05:32:15 -07:00
Matthijs Douze
f2a9324359 make tests cheaper
Summary:
Many of the additive quantizer tests are recognized as flaky because the tests time out in non-optimized stress mode.
This is probably because they don't import

https://www.internalfb.com/code/fbsource/fbcode/faiss/tests/common_faiss_tests.py

that sets the number of threads to 4. This diff fixes that and in addition declares the tests as "heavyweight" so that not too many of them are spawned in parallel in stress mode.

https://www.internalfb.com/intern/wiki/TAE/tpx/Timeouts_and_Sharded_Bundled_mode/#degree-of-parallelism

Hopefully it should fix the flaky tests

Reviewed By: alexanderguzhva

Differential Revision: D38111820

fbshipit-source-id: 7dd7c72e7e92b82384a170743cfd5c4aaf9a6960
2022-07-25 06:58:39 -07:00
Matthijs Douze
add3705c11 make fast scan tests cheaper (#2251)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2251

The fast_scan and fast_scan_ivf tests are irregularly timing out on the FB test infra.

This diff:
- breaks down more tests into sub-tests
- makes tests cheaper by reducing the test dataset sizes
- corrects a nasty local variable binding bug that prevented all cases of `implem` from being covered.
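
The exact bug is not shown in this summary. A common form of a variable binding bug in generated Python tests is the late-binding closure, sketched here with hypothetical helper names: every generated test ends up using the last value of the loop variable.

```python
def make_tests_buggy(implems):
    tests = {}
    for implem in implems:
        # Bug: the closure looks up `implem` when called, after the loop ended.
        tests[f"test_implem{implem}"] = lambda: implem
    return tests


def make_tests_fixed(implems):
    tests = {}
    for implem in implems:
        # Fix: bind the current value as a default argument.
        tests[f"test_implem{implem}"] = lambda implem=implem: implem
    return tests


implems = [12, 13, 14]
covered_buggy = sorted({t() for t in make_tests_buggy(implems).values()})
covered_fixed = sorted({t() for t in make_tests_fixed(implems).values()})
```

In the buggy variant only the last value is ever exercised, even though three distinct test names exist.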

I also tried to fix the polysemous tests, which also time out, but I could not reproduce the timeout.

https://www.internalfb.com/intern/test/562949978542309?ref_report_id=0

Reviewed By: beauby

Differential Revision: D34852254

fbshipit-source-id: b005ffb3723e7d9df75516a539540d9165249cea
2022-03-16 13:23:07 -07:00
Ivan Sopin
d50211a38f Break distance ties in heap_replace_top() by ID (#2245)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2245

This changeset makes the `heap_replace_top()` function of the FAISS heap implementation break distance ties by the element's ID, according to the heap's min/max property.
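
The effect of breaking ties by ID can be illustrated with Python's `heapq`. This is a sketch of the idea, not the Faiss heap code: pairs are negated so that the min-heap behaves as a max-heap on `(distance, id)`, and `top_k_with_id_tiebreak` is a hypothetical name.

```python
import heapq


def top_k_with_id_tiebreak(pairs, k):
    """k smallest (distance, id) pairs, ties on distance resolved by id."""
    heap = []  # stores (-distance, -id); heap[0] is the current worst candidate
    for dis, vid in pairs:
        item = (-dis, -vid)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # The new pair is lexicographically smaller than the worst one kept.
            heapq.heapreplace(heap, item)
    return sorted((-d, -i) for d, i in heap)


pairs = [(1.0, 7), (1.0, 3), (1.0, 5), (0.5, 9)]
best = top_k_with_id_tiebreak(pairs, 2)
```

With the tie-break in place, the result no longer depends on the order in which equal-distance candidates arrive.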

Reviewed By: mdouze

Differential Revision: D34669542

fbshipit-source-id: 0db24fd12442eedeee917fbb3e811ba4a070ce0f
2022-03-09 10:23:48 -08:00
Check Deng
41007232d6 AQ fastscan (#2169)
Summary:
Work in progress.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2169

Test Plan:
buck test mode/opt //faiss/tests/:test_fast_scan
buck test mode/opt //faiss/tests/:test_fast_scan_ivf

Reviewed By: beauby

Differential Revision: D34208813

Pulled By: mdouze

fbshipit-source-id: 74b72e07dc537667a7def403c4e46d3d05408c27
2022-02-22 15:24:31 -08:00
Matthijs Douze
a3fae15e66 Fix training of complex quantizer (#2035)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2035

Relates to issue

https://github.com/facebookresearch/faiss/issues/2019

This diff fixes the issue and adds a test

Reviewed By: beauby

Differential Revision: D30749000

fbshipit-source-id: 3a03fa347a40bde04162981a5e0b153b4f7b9d66
2021-09-06 08:53:29 -07:00
Matthijs Douze
04f777ead5 Re-enable fast scan on Windows tests (#1663)
Summary:
Fast-scan tests were disabled on Windows because of a heap corruption. This diff enables them because the free_aligned bug was fixed in the meantime.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1663

Reviewed By: beauby

Differential Revision: D26201040

Pulled By: mdouze

fbshipit-source-id: 8d6223b4e42ccb1ce2da6e2c51d9e0833199bde7
2021-02-03 07:48:52 -08:00
Matthijs Douze
6d0bc58db6 Implementation of PQ4 search with SIMD instructions (#1542)
Summary:
IndexPQ and IndexIVFPQ implementations with AVX shuffle instructions.

The training and computing of the codes does not change wrt. the original PQ versions but the code layout is "packed" so that it can be used efficiently by the SIMD computation kernels.
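
A simplified view of 4-bit packing and table-based distance computation is sketched below. It is an assumption-laden illustration: codes are simply stored two per byte, whereas the packed layout used by the SIMD kernels is more involved (the codes are not stored sequentially), and the function names are hypothetical.

```python
def pack_pq4(codes):
    """Pack 4-bit sub-quantizer indices (even count) two per byte."""
    assert len(codes) % 2 == 0 and all(0 <= c < 16 for c in codes)
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_pq4(packed):
    out = []
    for b in packed:
        out += [b & 0x0F, b >> 4]  # low nibble first, then high nibble
    return out


def lut_distance(packed, luts):
    """Sum of per-sub-quantizer look-up table entries luts[m][code]."""
    return sum(luts[m][c] for m, c in enumerate(unpack_pq4(packed)))


codes = [3, 15, 0, 9]
packed = pack_pq4(codes)
# Toy tables: entry for sub-quantizer m and code c is m * 16 + c.
luts = [[m * 16 + c for c in range(16)] for m in range(4)]
dist = lut_distance(packed, luts)
```

Because each code is only 4 bits, a look-up table for one sub-quantizer has 16 entries, which is what makes register-level shuffle instructions usable for the look-up.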

The main changes are:

- new IndexPQFastScan and IndexIVFPQFastScan objects

- simdlib.h for an abstraction above the AVX2 intrinsics

- BlockInvertedLists for invlists that are 32-byte aligned and where codes are not sequential

- pq4_fast_scan.h/.cpp: for packing codes and look-up tables + optimized distance computation kernels

- simd_result_handlers.h: SIMD version of result collection in heaps / reservoirs

Misc changes:

- added contrib.inspect_tools to access fields in C++ objects

- moved .h and .cpp code for inverted lists to an invlists/ subdirectory, and made a .h/.cpp for InvertedListsIOHook

- added a new inverted lists type with 32-byte aligned codes (for consumption by SIMD)

- moved Windows-specific intrinsics to platform_macros.h

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1542

Test Plan:
```
buck test mode/opt  -j 4  //faiss/tests/:test_fast_scan_ivf //faiss/tests/:test_fast_scan
buck test mode/opt  //faiss/manifold/...
```

Reviewed By: wickedfoo

Differential Revision: D25175439

Pulled By: mdouze

fbshipit-source-id: ad1a40c0df8c10f4b364bdec7172e43d71b56c34
2020-12-03 10:06:38 -08:00