Commit Graph

22 Commits (21dfdbaaa0e30f2e16ad98ae4f94c2952e7178ce)

Author SHA1 Message Date
Xiao Fu 5e452ed52a Cleaning up more unnecessary print (#3455)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3455

Code quality control by reducing the number of prints

Reviewed By: junjieqi

Differential Revision: D57502194

fbshipit-source-id: a6cd65ed4cc49590ce73d2978d41b640b5259c17
2024-05-17 16:59:36 -07:00
Xiao Fu bf8bd6b689 Delete all remaining print (#3452)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3452

Delete all remaining print within the Tests to improve the readability and effectiveness of the codebase.

Reviewed By: junjieqi

Differential Revision: D57466393

fbshipit-source-id: 6ebd66ae2e769894d810d4ba7a5f69fc865b797d
2024-05-16 19:51:07 -07:00
Gergely Szilvasy 9bb6b4be0d fix test TestCrossCodebookComputations::test_precomp
Summary: To fix the nightly: https://app.circleci.com/pipelines/github/facebookresearch/faiss/4815/workflows/2027a135-72ee-459f-a092-7ada95affd41/jobs/26225

Reviewed By: mdouze

Differential Revision: D50839933

fbshipit-source-id: 311b548182a2b3966c9603f83c115fa038eb19e8
2023-10-31 09:50:05 -07:00
Matthijs Douze c8d6f7bb2b fix CI issues after cross-matrix diff (#3042)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3042

to fix nightly builds

Reviewed By: mlomeli1

Differential Revision: D48969974

fbshipit-source-id: b7206aac907ed65caf182a95cf22ec463bb58dc4
2023-09-06 07:55:15 -07:00
Matthijs Douze 9dc75d026d reduce cross table size (#3012)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3012

The cross-tables for codebook construction contained the dot products between codebook entries, which is not necessary (and caused OOMs in some cases). This diff computes only the off-diagonal blocks.

Reviewed By: pemazare

Differential Revision: D48448615

fbshipit-source-id: 494b54e2900754a3ff5d3c8073cb9a768e578c58
2023-09-01 07:06:14 -07:00
Matthijs Douze 67d87275f8 Clean up batch comments + obey IO_FLAG_SKIP_PRECOMPUTE_TABLE (#3013)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3013

To avoid OOM when loading some RCQs, don't precompute cross product tables when io_flags contains bit IO_FLAG_SKIP_PRECOMPUTE_TABLE

Reviewed By: pemazare

Differential Revision: D48448616

fbshipit-source-id: a261259f1fb583aa358d6b6c42d9b851e9729247
2023-09-01 07:06:14 -07:00
Matthijs Douze 1c1d5c808f Make tests a little less verbose
Summary: Useful info on github test runs is buried in spurious logging. Avoid this.

Reviewed By: mlomeli1

Differential Revision: D47209139

fbshipit-source-id: b5111c91e2b94f0c3678d599197f8e7094993df1
2023-07-04 07:02:53 -07:00
Matthijs Douze 547fe78c68 Support M1 in circleCI (#2774)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2774

Reviewed By: algoriddle

Differential Revision: D44215129

Pulled By: mdouze

fbshipit-source-id: 62266b214186684eaf49ab1b9a39971b324fd52b
2023-03-23 15:09:32 -07:00
Matthijs Douze 5f7ca61957 fail early if RCQ norms table would become too large
Summary:
The residual coarse quantizer could OOM because of the norms table that is of size ntotal.
This diff just re-uses a field that fixes a max amount of mem in the additive quantizers and throws if the norms table would grow above that.
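
A minimal sketch of such a fail-early check, assuming the table holds one 4-byte float per centroid combination, i.e. 2^(total bits) entries. The function names, the `max_mem_bytes` parameter and the use of `MemoryError` are hypothetical and only illustrate the idea; they are not the Faiss fields or exception types.

```python
# Illustrative only: names and the exception type are assumptions, not Faiss API.

def norms_table_bytes(nbits_per_step, bytes_per_norm=4):
    """Size of an exhaustive norms table with one float per centroid combination."""
    ntotal = 1 << sum(nbits_per_step)  # number of distinct codes
    return ntotal * bytes_per_norm


def check_norms_table(nbits_per_step, max_mem_bytes):
    """Raise before allocating if the table would exceed the memory budget."""
    needed = norms_table_bytes(nbits_per_step)
    if needed > max_mem_bytes:
        raise MemoryError(
            f"norms table needs {needed} bytes, limit is {max_mem_bytes}"
        )
    return needed


# A 2x6-bit quantizer has 4096 codes, which easily fits in 1 MiB.
small = check_norms_table([6, 6], max_mem_bytes=1 << 20)

# A 2x16-bit quantizer has 2**32 codes, which exceeds a 1 GiB budget.
try:
    check_norms_table([16, 16], max_mem_bytes=1 << 30)
    too_large_rejected = False
except MemoryError:
    too_large_rejected = True
```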

Reviewed By: alexanderguzhva

Differential Revision: D39771448

fbshipit-source-id: b6a071900e02a81848495e39691405b30f56e291
2022-09-23 11:16:12 -07:00
Check Deng 838f85cb52 Implement search methods for ProductAdditiveQuantizer (#2336)
Summary:
Work in progress.

This PR is going to implement the following search methods for ProductAdditiveQuantizer, including index factory and I/O:

- [x] IndexProductAdditiveQuantizer
- [x] IndexIVFProductAdditiveQuantizer
- [x] IndexProductAdditiveQuantizerFastScan
- [x] IndexIVFProductAdditiveQuantizerFastScan

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2336

Test Plan:
buck test //faiss/tests/:test_fast_scan
buck test //faiss/tests/:test_fast_scan_ivf
buck test //faiss/tests/:test_local_search_quantizer
buck test //faiss/tests/:test_residual_quantizer

Reviewed By: alexanderguzhva

Differential Revision: D37172745

Pulled By: mdouze

fbshipit-source-id: 6ff18bfc462525478c90cd42e21805ab8605bd0f
2022-07-27 05:32:15 -07:00
Matthijs Douze f2a9324359 make tests cheaper
Summary:
Many of the additive quantizer tests are recognized as flaky because the tests time out in non-optimized stress mode.
This is probably because they don't import

https://www.internalfb.com/code/fbsource/fbcode/faiss/tests/common_faiss_tests.py

that sets the number of threads to 4. This diff fixes that and in addition declares the tests as "heavyweight" so that not too many of them are spawned in parallel in stress mode.

https://www.internalfb.com/intern/wiki/TAE/tpx/Timeouts_and_Sharded_Bundled_mode/#degree-of-parallelism

Hopefully it should fix the flaky tests

Reviewed By: alexanderguzhva

Differential Revision: D38111820

fbshipit-source-id: 7dd7c72e7e92b82384a170743cfd5c4aaf9a6960
2022-07-25 06:58:39 -07:00
Check Deng 9b1982262a Add ProductAdditiveQuantizer (#2286)
Summary:
This diff adds ProductAdditiveQuantizer.

A simple description of the algorithm:

1. Divide the vector space into several orthogonal sub-spaces, just like PQ does.
2. Quantize each sub-space by an independent additive quantizer.
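
The two steps can be sketched in plain Python as follows. This is a toy illustration, not the Faiss implementation: the per-sub-space quantizer is a greedy residual quantizer (beam size 1), chosen here only for brevity, and all function names and codebooks are made up for the example.

```python
# Toy product additive quantizer; every name here is hypothetical.

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_rq_encode(codebooks, x):
    """Encode x step by step: pick the nearest centroid, then quantize the residual."""
    residual = list(x)
    code = []
    for codebook in codebooks:
        k = min(range(len(codebook)), key=lambda i: sqdist(codebook[i], residual))
        code.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return code


def rq_decode(codebooks, code):
    """Additive decoding: sum the selected centroid of every codebook."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, k in zip(codebooks, code):
        out = [o + c for o, c in zip(out, codebook[k])]
    return out


def product_encode(sub_quantizers, x):
    """Step 1: split x into equal slices. Step 2: encode each slice independently."""
    dsub = len(x) // len(sub_quantizers)
    code = []
    for s, codebooks in enumerate(sub_quantizers):
        code.extend(greedy_rq_encode(codebooks, x[s * dsub:(s + 1) * dsub]))
    return code


def product_decode(sub_quantizers, code):
    """Decode each sub-space (same number of steps in each) and concatenate."""
    msub = len(code) // len(sub_quantizers)
    out = []
    for s, codebooks in enumerate(sub_quantizers):
        out.extend(rq_decode(codebooks, code[s * msub:(s + 1) * msub]))
    return out


# Two sub-spaces of dimension 2, each with two codebooks of two centroids.
sub_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, -0.5]],
]
sub_quantizers = [sub_codebooks, sub_codebooks]

x = [1.5, 0.5, 0.0, 0.0]
code = product_encode(sub_quantizers, x)
x_hat = product_decode(sub_quantizers, code)
```

The resulting code is simply the concatenation of the per-sub-space codes, which is why the sub-spaces can be trained and searched independently.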

Usage:

Construct a ProductAdditiveQuantizer object:
- `d`: dimensionality of the input vectors
- `nsplits`: number of sub-spaces the vector space is divided into
- `Msub`: `M` of each additive quantizer
- `nbits`: `nbits` of each additive quantizer

```python
d = 128
nsplits = 2
Msub = 4
nbits = 8
plsq = faiss.ProductLocalSearchQuantizer(d, nsplits, Msub, nbits)
prq = faiss.ProductResidualQuantizer(d, nsplits, Msub, nbits)
```

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2286

Test Plan:
```

buck test //faiss/tests/:test_local_search_quantizer -- TestProductLocalSearchQuantizer

buck test //faiss/tests/:test_residual_quantizer -- TestProductResidualQuantizer

```

Reviewed By: alexanderguzhva

Differential Revision: D35907702

Pulled By: mdouze

fbshipit-source-id: 7428a196e6bd323569caa585c57281dd70e547b1
2022-05-05 15:14:07 -07:00
Matthijs Douze 291353c5a9 Generalize DistanceComputer for flat indexes (#2255)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2255

The `DistanceComputer` object is derived from an Index (obtained with `get_distance_computer()`). It maintains a current query and quickly computes distances from that query to any item in the database. This is useful, e.g., for IndexHNSW and IndexNSG, which rely on query-to-point comparisons within the dataset.

This diff introduces the `FlatCodesDistanceComputer`, which inherits from `DistanceComputer`, for Flat indexes. In addition to the distance-to-item function, it adds a `distance_to_code` function that computes the distance from any code to the current query, even if that code is not stored in the index.

This is implemented for all FlatCode indexes (IndexFlat, IndexPQ, IndexScalarQuantizer and IndexAdditiveQuantizer).
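
The relationship between the two classes can be illustrated with a small Python analogue. The `Toy*` classes and `toy_decode` below are hypothetical stand-ins written for this example (the real classes are C++); only the idea of a stored query plus a `distance_to_code` method is taken from the description above.

```python
# Python analogue for illustration; not the Faiss C++ classes.

class ToyDistanceComputer:
    """Keeps a current query and returns distances to items by index."""

    def __init__(self):
        self.query = None

    def set_query(self, q):
        self.query = list(q)

    def __call__(self, i):
        raise NotImplementedError


class ToyFlatCodesDistanceComputer(ToyDistanceComputer):
    """Items are stored as codes; distances can also be taken to arbitrary codes."""

    def __init__(self, codes, decode):
        super().__init__()
        self.codes = codes
        self.decode = decode

    def distance_to_code(self, code):
        # Squared L2 distance between the query and the decoded code.
        x = self.decode(code)
        return sum((a - b) ** 2 for a, b in zip(self.query, x))

    def __call__(self, i):
        # Distance to a stored item is distance_to_code on its stored code.
        return self.distance_to_code(self.codes[i])


def toy_decode(code):
    """Hypothetical codec: each component is an integer in units of 1/4."""
    return [c / 4.0 for c in code]


stored_codes = [(2, 1), (4, 4)]
dc = ToyFlatCodesDistanceComputer(stored_codes, toy_decode)
dc.set_query([0.5, 0.25])

d_stored = dc(0)                          # item 0 decodes exactly to the query
d_unstored = dc.distance_to_code((0, 0))  # a code that is not in the index
```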

In the process, the two classes were extracted to their own header file `impl/DistanceComputer.h`

Reviewed By: beauby

Differential Revision: D34863609

fbshipit-source-id: 39d8c66475e55c3223c4a6a210827aa48bca292d
2022-03-20 23:43:33 -07:00
Ivan Sopin d50211a38f Break distance ties in `heap_replace_top()` by ID (#2245)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2245

This changeset makes the `heap_replace_top()` function of the FAISS heap implementation break distance ties by the element's ID, according to the heap's min/max property.
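
One way to picture the effect of such tie-breaking is to order heap entries by the pair (distance, ID) instead of by distance alone. The sketch below uses Python's `heapq` and is only an interpretation of the behaviour described above, not the Faiss heap code; `top_k_smallest` is a hypothetical helper.

```python
import heapq


def top_k_smallest(pairs, k):
    """Keep the k smallest (distance, id) pairs using a max-heap of size k.

    heapq is a min-heap, so keys are negated. Comparing (-distance, -id)
    tuples means that entries with equal distances are ordered by id.
    """
    heap = []
    for dist, idx in pairs:
        key = (-dist, -idx)
        if len(heap) < k:
            heapq.heappush(heap, key)
        elif key > heap[0]:
            # The candidate beats the current worst entry: replace the top.
            heapq.heapreplace(heap, key)
    return sorted((-d, -i) for d, i in heap)


# Three candidates share the distance 0.5; only their ids differ.
candidates = [(0.5, 7), (0.5, 3), (0.2, 9), (0.5, 1)]
result = top_k_smallest(candidates, 2)
result_reversed = top_k_smallest(list(reversed(candidates)), 2)
```

With distance-only comparison, which of the tied entries survives depends on the order in which they arrive; with the ID as a secondary key, both orderings above give the same result.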

Reviewed By: mdouze

Differential Revision: D34669542

fbshipit-source-id: 0db24fd12442eedeee917fbb3e811ba4a070ce0f
2022-03-09 10:23:48 -08:00
Matthijs Douze 07a874d5b1 Post-training refinement of residual quantizer codebooks (#2166)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2166

RQ training is done progressively from one quantizer to the next, maintaining a current set of codes and quantization centroids.
However, for RQ as for any additive quantizer, there is a closed form solution for the centroids that minimizes the quantization error for fixed codes.
This diff offers the option to estimate that codebook at the end of the optimization. The estimation is iterative, i.e., several rounds of code computation followed by codebook refinement are performed.
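
The closed-form solution mentioned above can be written as an ordinary least-squares problem. The notation below is introduced for this note and is not taken from the diff: the codes are kept fixed and only the codebook entries are free.

```latex
\begin{aligned}
X &\in \mathbb{R}^{n \times d} && \text{training vectors, one per row} \\
B &\in \{0,1\}^{n \times K} && \text{one-hot encoding of the fixed codes, } K = \textstyle\sum_{m=0}^{M-1} K_m \\
C &\in \mathbb{R}^{K \times d} && \text{all codebooks stacked row-wise} \\[4pt]
\hat{X} &= B\,C && \text{each row of } B \text{ selects and sums } M \text{ centroids} \\[4pt]
C^\star &= \arg\min_{C} \lVert X - B\,C \rVert_F^2
&& \Longrightarrow \quad B^\top B\, C^\star = B^\top X, \qquad C^\star = B^{+} X
\end{aligned}
```

Here $B^{+}$ is the Moore-Penrose pseudo-inverse. It is needed because $B^\top B$ is generally singular when $M > 1$: the columns belonging to each quantizer sum to the all-ones vector, so the column blocks are linearly dependent.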

A pure python implementation + results is here:
https://github.com/fairinternal/faiss_improvements/blob/dbcc746/decoder/refine_aq_codebook.ipynb

Reviewed By: wickedfoo

Differential Revision: D33309409

fbshipit-source-id: 55c13425292e73a1b05f00e90f4dcfdc8b3549e8
2022-01-05 00:59:16 -08:00
Chengqi Deng 26abede812 Non-uniform quantization of vector norms (#2037)
Summary:
This diff implements non-uniform quantization of vector norms in additive quantizers. index_factory and I/O are supported.

index_factory: `XXX_Ncqint{nbits}`, where `nbits` is the number of bits used to quantize the vector norm.

For 8-bit codes, the accuracy is almost the same as with 8-bit uniform quantization. Accuracy improves slightly when the code size is less than 8 bits.
```
RQ4x8_Nqint8:  R@1 0.1116
RQ4x8_Ncqint8: R@1 0.1117

RQ4x8_Nqint4:  R@1 0.0901
RQ4x8_Ncqint4: R@1 0.0989
```
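
To see why non-uniform levels can help at low bit counts, the following toy comparison quantizes a skewed set of scalar "norms" with 2 bits. It is only a sketch: the non-uniform levels are obtained with a few Lloyd (1D k-means) iterations, a technique swapped in here for brevity, and the data and function names are invented for the example.

```python
# Toy comparison of uniform vs. non-uniform scalar quantization levels.

def uniform_levels(values, nbits):
    """Centers of 2**nbits equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    k = 1 << nbits
    step = (hi - lo) / k
    return [lo + (i + 0.5) * step for i in range(k)]


def lloyd_levels(values, nbits, niter=20):
    """Refine the uniform levels with Lloyd iterations (1D k-means)."""
    levels = uniform_levels(values, nbits)
    for _ in range(niter):
        buckets = [[] for _ in levels]
        for v in values:
            j = min(range(len(levels)), key=lambda i: abs(v - levels[i]))
            buckets[j].append(v)
        # Empty buckets keep their previous level.
        levels = [sum(b) / len(b) if b else lv for b, lv in zip(buckets, levels)]
    return levels


def quantization_error(values, levels):
    """Mean squared error when each value is mapped to its nearest level."""
    return sum(min((v - lv) ** 2 for lv in levels) for v in values) / len(values)


# Skewed data: many small norms and two large outliers.
norms = [0.1 * i for i in range(1, 41)] + [10.0, 20.0]

err_uniform = quantization_error(norms, uniform_levels(norms, 2))
err_nonuniform = quantization_error(norms, lloyd_levels(norms, 2))
```

With many levels (for example 8 bits) both schemes cover the range densely, which is consistent with the nearly identical 8-bit results listed above.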

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2037

Test Plan:
buck test //faiss/tests/:test_clustering -- TestClustering1D
buck test //faiss/tests/:test_lsq -- test_index_accuracy_cqint
buck test //faiss/tests/:test_residual_quantizer -- test_norm_cqint
buck test //faiss/tests/:test_residual_quantizer -- test_search_L2

Reviewed By: beauby

Differential Revision: D31083476

Pulled By: mdouze

fbshipit-source-id: f34c3dafc4eb1c6f44a63e68137158911aa4a2f4
2021-10-11 14:13:16 -07:00
Matthijs Douze 151e3d7be5 fix centroids_norms storage for ResidualCoarseQuantizer (#2018)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2018

The centroids norms table was not reconstructed correctly after being stored in RCQ.

Reviewed By: Sugoshnr

Differential Revision: D30484389

fbshipit-source-id: 9f618a3939c99dc987590c07eda8e76e19248b08
2021-08-25 06:37:33 -07:00
Matthijs Douze 760cce7f3a Support for additive quantizer search (#1961)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1961

This diff implements LUT-based search for additive quantizers.
It also further merges code for LSQ and the ResidualQuantizer.

The documentation + evaluation is on github:

https://github.com/facebookresearch/faiss/wiki/Additive-quantizers

Reviewed By: wickedfoo

Differential Revision: D29395079

fbshipit-source-id: b8a24a647bbdc4cda2a699e791ffdb2a12bfa9c6
2021-08-20 01:00:10 -07:00
Matthijs Douze 8eab15eca3 LUT based search for additive quantizers (#1908)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1908

So far, the only method implemented to search for the best combination of codebook entries is a beam search.

It is possible to make this faster for a query vector q by precomputing look-up tables in the form of

LUT_m = <q, cent_m>

where cent_m is the set of centroids for quantizer m=0..M-1.

The LUT can then be used as

inner_prod = sum_m LUT_m[c_m]

and

L2_distance = norm_q + norm_db - 2 * inner_prod

This diff implements this computation by:

- adding the LUT precomputation

- storing an exhaustive table of all centroid norms (when using L2)

This is only practical for small additive quantizers, e.g., when a residual vector quantizer is used as coarse quantizer (ResidualCoarseQuantizer).

This diff is based on the AdditiveQuantizer diff because it applies equally to other quantizers (e.g., the LSQ).
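
The look-up-table formulas above can be checked on a tiny example. The sketch below is plain Python with invented codebooks and helper names, and norms are taken as squared L2 norms; it illustrates the arithmetic only and is not the Faiss implementation.

```python
# Toy LUT-based distance computation; all names and data are illustrative.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decode(codebooks, code):
    """Reconstruct a vector as the sum of the selected centroid of each codebook."""
    out = [0.0] * len(codebooks[0][0])
    for cent_m, c_m in zip(codebooks, code):
        out = [o + c for o, c in zip(out, cent_m[c_m])]
    return out


def compute_luts(q, codebooks):
    """LUT_m[k] = <q, cent_m[k]> for every quantizer m and centroid k."""
    return [[dot(q, c) for c in cent_m] for cent_m in codebooks]


def l2_from_lut(q, luts, code, norm_db):
    """L2_distance = norm_q + norm_db - 2 * sum_m LUT_m[c_m]."""
    inner_prod = sum(lut_m[c_m] for lut_m, c_m in zip(luts, code))
    return dot(q, q) + norm_db - 2.0 * inner_prod


codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],     # centroids of quantizer m=0
    [[0.5, 0.5], [-0.5, 0.25]],   # centroids of quantizer m=1
]
q = [0.2, -1.0]
code = [1, 1]

x = decode(codebooks, code)
norm_db = dot(x, x)               # in practice read from the norms table
luts = compute_luts(q, codebooks)

lut_dist = l2_from_lut(q, luts, code, norm_db)
direct_dist = sum((a - b) ** 2 for a, b in zip(q, x))
```

The tables are computed once per query, so each database code then costs M table look-ups plus one norm look-up instead of a full decode.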

Reviewed By: sc268

Differential Revision: D28467746

fbshipit-source-id: 82611fe1e4908c290204d4de866338c622ae4148
2021-05-25 01:54:53 -07:00
Matthijs Douze 441ccebbff Make Residual quantizer more memory efficient (#1865)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1865

This diff encodes vectors in chunks to make encoding more memory efficient.
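
The general pattern is to bound the temporary memory of the encoder by the chunk size rather than by the full dataset. A minimal sketch, assuming a batch encoder is passed in as a callable; `encode_in_chunks` and `toy_encode_batch` are hypothetical names, not Faiss functions.

```python
# Generic chunked-encoding pattern; names are illustrative.

def encode_in_chunks(vectors, encode_batch, chunk_size):
    """Encode vectors chunk by chunk so temporaries scale with chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    codes = []
    for start in range(0, len(vectors), chunk_size):
        chunk = vectors[start:start + chunk_size]
        codes.extend(encode_batch(chunk))
    return codes


batch_sizes = []


def toy_encode_batch(batch):
    """Stand-in encoder that records how many vectors it was given at once."""
    batch_sizes.append(len(batch))
    return [int(round(sum(v))) for v in batch]


vectors = [[float(i), 1.0] for i in range(10)]
codes = encode_in_chunks(vectors, toy_encode_batch, chunk_size=4)
```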

Reviewed By: sc268

Differential Revision: D28234424

fbshipit-source-id: c1afd2aaff953d4ecd339800d5951ae1cae4789a
2021-05-07 02:12:27 -07:00
Matthijs Douze bb3c52a057 IndexResidual codec
Summary:
This diff adds the following to bring the residual quantizer support on par with PQ:
- IndexResidual can be built with index factory, serialized and used as a Faiss codec.
- ResidualCoarseQuantizer can be used as a coarse quantizer for inverted files.

The factory string looks like "RQ1x16_6x8", which means a first 16-bit quantizer followed by six 8-bit ones. For IVF it's "IVF4096(RQ2x6),Flat".
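
To make the naming scheme concrete, the helper below expands such a string into the number of bits used at each quantization step. `parse_rq_factory` is a hypothetical function written for this note; it is not the Faiss `index_factory` parser and handles only the plain `RQ{count}x{nbits}_...` form described above.

```python
import re


def parse_rq_factory(spec):
    """Expand an 'RQ{count}x{nbits}_...' string into per-step bit counts."""
    if not spec.startswith("RQ"):
        raise ValueError(f"not an RQ spec: {spec!r}")
    nbits_per_step = []
    for part in spec[2:].split("_"):
        m = re.fullmatch(r"(\d+)x(\d+)", part)
        if m is None:
            raise ValueError(f"cannot parse component {part!r}")
        count, nbits = int(m.group(1)), int(m.group(2))
        nbits_per_step.extend([nbits] * count)
    return nbits_per_step


steps = parse_rq_factory("RQ1x16_6x8")   # one 16-bit step, then six 8-bit steps
total_bits = sum(steps)                  # size of one code in bits
coarse_steps = parse_rq_factory("RQ2x6") # the coarse quantizer in the IVF example
```

Under this reading, "RQ1x16_6x8" gives 64-bit codes, and the "RQ2x6" coarse quantizer has 12 bits, i.e. 4096 combinations, matching the "IVF4096" prefix in the example.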

Reviewed By: sc268

Differential Revision: D27865612

fbshipit-source-id: f9f11d29e9f89d3b6d4cd22e9a4f9222422d5f26
2021-04-26 20:26:43 -07:00
Matthijs Douze 7559cf5c5b add ResidualQuantizer
Summary:
This diff includes:
- progressive dimension k-means.
- the ResidualQuantizer object
- GpuProgressiveDimIndexFactory so that it can be trained on GPU
- corresponding tests
- reference Python implementation of the same in scripts/matthijs/LCC_encoding

Reviewed By: wickedfoo

Differential Revision: D27608029

fbshipit-source-id: 9a8cf3310c8439a93641961ca8b042941f0f4249
2021-04-14 13:11:54 -07:00