#include <IndexIVFPQ.h>
Public Member Functions | |
IndexIVFPQ (Index *quantizer, size_t d, size_t nlist, size_t M, size_t nbits_per_idx) | |
void | add_with_ids (idx_t n, const float *x, const long *xids=nullptr) override |
default implementation that calls encode_vectors | |
void | encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const override |
void | add_core_o (idx_t n, const float *x, const long *xids, float *residuals_2, const long *precomputed_idx=nullptr) |
void | train_residual (idx_t n, const float *x) override |
trains the product quantizer | |
void | train_residual_o (idx_t n, const float *x, float *residuals_2) |
same as train_residual, also output 2nd level residuals | |
void | reconstruct_from_offset (long list_no, long offset, float *recons) const override |
size_t | find_duplicates (idx_t *ids, size_t *lims) const |
void | encode (long key, const float *x, uint8_t *code) const |
void | encode_multiple (size_t n, long *keys, const float *x, uint8_t *codes, bool compute_keys=false) const |
void | decode_multiple (size_t n, const long *keys, const uint8_t *xcodes, float *x) const |
inverse of encode_multiple | |
InvertedListScanner * | get_InvertedListScanner (bool store_pairs) const override |
get a scanner for this index (store_pairs means ignore labels) | |
void | precompute_table () |
build precomputed table | |
Public Member Functions inherited from faiss::IndexIVF | |
IndexIVF (Index *quantizer, size_t d, size_t nlist, size_t code_size, MetricType metric=METRIC_L2) | |
void | reset () override |
removes all elements from the database. | |
void | train (idx_t n, const float *x) override |
Trains the quantizer and calls train_residual to train sub-quantizers. | |
void | add (idx_t n, const float *x) override |
Calls add_with_ids with NULL ids. | |
virtual void | search_preassigned (idx_t n, const float *x, idx_t k, const idx_t *assign, const float *centroid_dis, float *distances, idx_t *labels, bool store_pairs, const IVFSearchParameters *params=nullptr) const |
void | search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels) const override |
void | range_search (idx_t n, const float *x, float radius, RangeSearchResult *result) const override |
void | range_search_preassigned (idx_t nx, const float *x, float radius, const idx_t *keys, const float *coarse_dis, RangeSearchResult *result) const |
void | reconstruct (idx_t key, float *recons) const override |
void | reconstruct_n (idx_t i0, idx_t ni, float *recons) const override |
void | search_and_reconstruct (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const override |
idx_t | remove_ids (const IDSelector &sel) override |
Dataset manipulation functions. | |
void | check_compatible_for_merge (const IndexIVF &other) const |
virtual void | merge_from (IndexIVF &other, idx_t add_id) |
virtual void | copy_subset_to (IndexIVF &other, int subset_type, idx_t a1, idx_t a2) const |
size_t | get_list_size (size_t list_no) const |
void | make_direct_map (bool new_maintain_direct_map=true) |
void | replace_invlists (InvertedLists *il, bool own=false) |
replace the inverted lists, old one is deallocated if own_invlists | |
Public Member Functions inherited from faiss::Index | |
Index (idx_t d=0, MetricType metric=METRIC_L2) | |
void | assign (idx_t n, const float *x, idx_t *labels, idx_t k=1) |
void | compute_residual (const float *x, float *residual, idx_t key) const |
void | display () const |
Public Member Functions inherited from faiss::Level1Quantizer | |
void | train_q1 (size_t n, const float *x, bool verbose, MetricType metric_type) |
Trains the quantizer and calls train_residual to train sub-quantizers. | |
Level1Quantizer (Index *quantizer, size_t nlist) | |
Public Attributes | |
bool | by_residual |
Encode residual or plain vector? | |
ProductQuantizer | pq |
produces the codes | |
bool | do_polysemous_training |
reorder PQ centroids after training? | |
PolysemousTraining * | polysemous_training |
if NULL, use default | |
size_t | scan_table_threshold |
use table computation or on-the-fly? | |
int | polysemous_ht |
Hamming thresh for polysemous filtering. | |
int | use_precomputed_table |
if by_residual, build precomputed tables | |
std::vector< float > | precomputed_table |
Public Attributes inherited from faiss::IndexIVF | |
InvertedLists * | invlists |
Access to the actual data. | |
bool | own_invlists |
size_t | code_size |
code size per vector in bytes | |
size_t | nprobe |
number of probes at query time | |
size_t | max_codes |
max nb of codes to visit to do a query | |
int | parallel_mode |
bool | maintain_direct_map |
map for direct access to the elements. Enables reconstruct(). | |
std::vector< idx_t > | direct_map |
Public Attributes inherited from faiss::Index | |
int | d |
vector dimension | |
idx_t | ntotal |
total nb of indexed vectors | |
bool | verbose |
verbosity level | |
bool | is_trained |
set if the Index does not require training, or if training is done already | |
MetricType | metric_type |
type of metric this index uses for search | |
Public Attributes inherited from faiss::Level1Quantizer | |
Index * | quantizer |
quantizer that maps vectors to inverted lists | |
size_t | nlist |
number of possible key values | |
char | quantizer_trains_alone |
bool | own_fields |
whether object owns the quantizer | |
ClusteringParameters | cp |
to override default clustering params | |
Index * | clustering_index |
to override index used during clustering | |
Static Public Attributes | |
static size_t | precomputed_table_max_bytes = ((size_t)1) << 31 |
2G by default, accommodates tables up to PQ32 w/ 65536 centroids | |
Additional Inherited Members | |
Public Types inherited from faiss::Index | |
using | idx_t = long |
all indices are this type | |
using | component_t = float |
using | distance_t = float |
Inverted file with Product Quantizer encoding. Each residual vector is encoded as a product quantizer code.
Definition at line 34 of file IndexIVFPQ.h.
void faiss::IndexIVFPQ::add_core_o | ( | idx_t | n, |
const float * | x, | ||
const long * | xids, | ||
float * | residuals_2, | ||
const long * | precomputed_idx = nullptr |
||
) |
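same as add_core, also: outputs the 2nd level residuals if residuals_2 != NULL, and uses precomputed list numbers if precomputed_idx != NULL.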
Definition at line 220 of file IndexIVFPQ.cpp.
void faiss::IndexIVFPQ::encode_multiple | ( | size_t | n, |
long * | keys, | ||
const float * | x, | ||
uint8_t * | codes, | ||
bool | compute_keys = false |
||
) | const |
Encode multiple vectors
n | nb vectors to encode |
keys | posting list ids for those vectors (size n) |
x | vectors (size n * d) |
codes | output codes (size n * code_size) |
compute_keys | if false, assume keys are precomputed, otherwise compute them |
Definition at line 149 of file IndexIVFPQ.cpp.
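The array sizes listed for encode_multiple can be sketched as a small allocation helper. This is only an illustration under stated assumptions: `pq_code_size`, `EncodeBuffers` and `make_encode_buffers` are hypothetical names that are not part of Faiss, and the code size formula (M sub-quantizer indices of nbits each, rounded up to whole bytes) is assumed to match the index's code_size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Faiss function): bytes per PQ code for M
// sub-quantizers of nbits each, rounded up to a whole number of bytes.
inline std::size_t pq_code_size(std::size_t M, std::size_t nbits) {
    return (M * nbits + 7) / 8;
}

// Hypothetical container for the arrays encode_multiple reads and writes.
struct EncodeBuffers {
    std::vector<long> keys;           // posting list ids, size n
    std::vector<float> x;             // input vectors, size n * d
    std::vector<std::uint8_t> codes;  // output codes, size n * code_size
};

// Allocate buffers with the sizes given in the parameter table.
inline EncodeBuffers make_encode_buffers(std::size_t n, std::size_t d,
                                         std::size_t code_size) {
    EncodeBuffers b;
    b.keys.assign(n, -1);              // -1 marks "not assigned yet"
    b.x.assign(n * d, 0.0f);
    b.codes.assign(n * code_size, 0);
    return b;
}
```

With such buffers, the raw pointers `keys.data()`, `x.data()` and `codes.data()` would be what gets passed to encode_multiple; if the keys have not been filled in by the coarse quantizer, compute_keys should be set to true.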
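void faiss::IndexIVFPQ::encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const
override, virtual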
Encodes a set of vectors as they would appear in the inverted lists
list_nos | inverted list ids as returned by the quantizer (size n). -1s are ignored. |
codes | output codes, size n * code_size |
Implements faiss::IndexIVF.
Definition at line 206 of file IndexIVFPQ.cpp.
size_t faiss::IndexIVFPQ::find_duplicates | ( | idx_t * | ids, |
size_t * | lims | ||
) | const |
Find exact duplicates in the dataset.
The duplicates are returned in pre-allocated arrays (see the max sizes).
lims | limits between groups of duplicates (max size ntotal / 2 + 1) |
ids | ids[lims[i]] : ids[lims[i+1]-1] is a group of duplicates (max size ntotal) |
Definition at line 1141 of file IndexIVFPQ.cpp.
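The flat (ids, lims) layout described above can be unpacked into one list per group. The sketch below is an illustration only: `split_duplicate_groups` is a hypothetical helper, and it assumes that the caller knows the number of groups (taken here to be the return value of find_duplicates, which this page does not state explicitly) and that lims holds one more entry than there are groups.

```cpp
#include <cstddef>
#include <vector>

using idx_t = long;  // same typedef as listed on this page

// Hypothetical helper (not part of Faiss): split the flat output arrays into
// one vector per group. Group i spans ids[lims[i]] .. ids[lims[i+1]-1].
inline std::vector<std::vector<idx_t>> split_duplicate_groups(
        const idx_t* ids, const std::size_t* lims, std::size_t ngroups) {
    std::vector<std::vector<idx_t>> groups;
    groups.reserve(ngroups);
    for (std::size_t i = 0; i < ngroups; ++i) {
        // Range constructor copies the half-open range [lims[i], lims[i+1]).
        groups.emplace_back(ids + lims[i], ids + lims[i + 1]);
    }
    return groups;
}
```

Before the call, ids would be allocated with ntotal entries and lims with ntotal / 2 + 1 entries, following the maximum sizes given above.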
void faiss::IndexIVFPQ::precompute_table | ( | ) |
build precomputed table
Precomputed tables for residuals
During IVFPQ search with by_residual, we compute
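d = || x - y_C - y_R ||^2
where x is the query vector, y_C the coarse centroid, and y_R the refined PQ centroid. The expression can be decomposed as:
d = || x - y_C ||^2 + || y_R ||^2 + 2 * (y_C|y_R) - 2 * (x|y_R)
term 1 = || x - y_C ||^2
term 2 = || y_R ||^2 + 2 * (y_C|y_R)
term 3 = - 2 * (x|y_R)
When using multiprobe, the three terms are handled separately. Term 1 is the distance to the coarse centroid, which is already computed during the coarse (first-stage) search. Term 2 does not involve x, so it can be precomputed, at the cost of nlist * M * ksub table entries. Term 3 is the classical non-residual distance table, computed once per query.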
Since y_R is defined by a product quantizer, it is split across subvectors and stored separately for each subvector. If the coarse quantizer is a MultiIndexQuantizer then the table can be stored more compactly.
At search time, the tables for term 2 and term 3 are added up. This is faster when the length of the lists is > ksub * M.
Definition at line 363 of file IndexIVFPQ.cpp.
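The decomposition above can be checked numerically on a toy example. The following is a minimal sketch, not Faiss code: `dot`, `direct_distance`, `Terms` and `decompose` are hypothetical names, and full vectors are used instead of per-subvector PQ tables to keep the identity easy to read.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Inner product (a|b) of two equally sized vectors.
inline float dot(const std::vector<float>& a, const std::vector<float>& b) {
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Direct evaluation of || x - y_C - y_R ||^2.
inline float direct_distance(const std::vector<float>& x,
                             const std::vector<float>& y_C,
                             const std::vector<float>& y_R) {
    float s = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        float diff = x[i] - y_C[i] - y_R[i];
        s += diff * diff;
    }
    return s;
}

struct Terms {
    float term1;  // || x - y_C ||^2, known from the coarse search
    float term2;  // || y_R ||^2 + 2 * (y_C|y_R), independent of x
    float term3;  // -2 * (x|y_R), depends on the query
};

// Evaluate the three terms of the decomposition separately.
inline Terms decompose(const std::vector<float>& x,
                       const std::vector<float>& y_C,
                       const std::vector<float>& y_R) {
    Terms t;
    t.term1 = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        float diff = x[i] - y_C[i];
        t.term1 += diff * diff;
    }
    t.term2 = dot(y_R, y_R) + 2.0f * dot(y_C, y_R);
    t.term3 = -2.0f * dot(x, y_R);
    return t;
}
```

For any inputs, `term1 + term2 + term3` should agree with `direct_distance` up to floating-point rounding. Only term 2 is a candidate for precomputation, since it is the only one that depends on neither the query nor the coarse search result.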
void faiss::IndexIVFPQ::reconstruct_from_offset (long list_no, long offset, float *recons) const
override, virtual
Reconstruct a vector given the location in terms of (inv list index + inv list offset) instead of the id.
Useful for reconstructing when the direct_map is not maintained and the inv list offset is computed by search_preassigned() with store_pairs set.
Reimplemented from faiss::IndexIVF.
Reimplemented in faiss::IndexIVFPQR.
Definition at line 310 of file IndexIVFPQ.cpp.
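To call reconstruct_from_offset on results of a store_pairs search, each returned label has to be split back into a list number and an offset. The sketch below assumes that the pair is packed with the inverted list index in the upper 32 bits and the offset in the lower 32 bits of a 64-bit label; that layout is an assumption of this example and should be checked against the search_preassigned documentation for the Faiss version in use. `pack_pair`, `unpack_pair` and `ListLocation` are hypothetical names, not Faiss functions.

```cpp
#include <cstdint>

// Hypothetical struct: location of a vector inside the inverted file.
struct ListLocation {
    std::int64_t list_no;  // inverted list index
    std::int64_t offset;   // position inside that list
};

// Assumed layout: list_no in the upper 32 bits, offset in the lower 32 bits.
inline std::int64_t pack_pair(std::int64_t list_no, std::int64_t offset) {
    std::uint64_t hi = static_cast<std::uint64_t>(list_no) << 32;
    std::uint64_t lo = static_cast<std::uint64_t>(offset) & 0xffffffffULL;
    return static_cast<std::int64_t>(hi | lo);
}

// Inverse of pack_pair for non-negative list numbers and offsets.
inline ListLocation unpack_pair(std::int64_t label) {
    std::uint64_t u = static_cast<std::uint64_t>(label);
    ListLocation loc;
    loc.list_no = static_cast<std::int64_t>(u >> 32);
    loc.offset = static_cast<std::int64_t>(u & 0xffffffffULL);
    return loc;
}
```

The two fields of the unpacked location are then the list_no and offset arguments of reconstruct_from_offset. Labels of -1 (no result) would need to be filtered out before unpacking.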
std::vector<float> faiss::IndexIVFPQ::precomputed_table |
if use_precomputed_table is enabled, size nlist * pq.M * pq.ksub
Definition at line 59 of file IndexIVFPQ.h.
int faiss::IndexIVFPQ::use_precomputed_table |
if by_residual, build precomputed tables
Precomputed tables that speed up query preprocessing at some memory cost:
=-1: force disable
=0: decide heuristically (default: use tables only if they are < precomputed_table_max_bytes)
=1: tables that work for all quantizers (size 256 * nlist * M)
=2: specific version for MultiIndexQuantizer (much more compact)
Definition at line 54 of file IndexIVFPQ.h.
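The memory cost behind the heuristic can be estimated from the table size nlist * M * ksub. The following sketch is an illustration, not Faiss code: `precomputed_table_bytes` is a hypothetical helper, and it assumes 4-byte float entries and, for the worked example, ksub = 256 (8-bit PQ codes).

```cpp
#include <cstdint>

// Hypothetical helper (not part of Faiss): memory needed for a table of
// nlist * M * ksub float entries, in bytes.
inline std::uint64_t precomputed_table_bytes(std::uint64_t nlist,
                                             std::uint64_t M,
                                             std::uint64_t ksub) {
    return nlist * M * ksub * sizeof(float);
}
```

Under these assumptions, a PQ32 index with 65536 coarse centroids needs 65536 * 32 * 256 * 4 bytes, which is exactly 1 << 31 and matches the default value of precomputed_table_max_bytes. Larger configurations would exceed the default limit, so use_precomputed_table = 0 would be expected to leave the tables disabled for them.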