Faiss
faiss::IndexIVFPQ Struct Reference

#include <IndexIVFPQ.h>

Inheritance diagram for faiss::IndexIVFPQ:
faiss::IndexIVFPQ derives from faiss::IndexIVF, which derives from faiss::Index and faiss::Level1Quantizer; faiss::IndexIVFPQR derives from faiss::IndexIVFPQ.

Public Member Functions

 IndexIVFPQ (Index *quantizer, size_t d, size_t nlist, size_t M, size_t nbits_per_idx)
 
void add_with_ids (idx_t n, const float *x, const long *xids=nullptr) override
 default implementation that calls encode_vectors
 
void encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const override
 
void add_core_o (idx_t n, const float *x, const long *xids, float *residuals_2, const long *precomputed_idx=nullptr)
 
void train_residual (idx_t n, const float *x) override
 trains the product quantizer
 
void train_residual_o (idx_t n, const float *x, float *residuals_2)
 same as train_residual, also output 2nd level residuals
 
void reconstruct_from_offset (long list_no, long offset, float *recons) const override
 
size_t find_duplicates (idx_t *ids, size_t *lims) const
 
void encode (long key, const float *x, uint8_t *code) const
 
void encode_multiple (size_t n, long *keys, const float *x, uint8_t *codes, bool compute_keys=false) const
 
void decode_multiple (size_t n, const long *keys, const uint8_t *xcodes, float *x) const
 inverse of encode_multiple
 
InvertedListScanner * get_InvertedListScanner (bool store_pairs) const override
 get a scanner for this index (store_pairs means ignore labels)
 
void precompute_table ()
 build precomputed table
 
- Public Member Functions inherited from faiss::IndexIVF
 IndexIVF (Index *quantizer, size_t d, size_t nlist, size_t code_size, MetricType metric=METRIC_L2)
 
void reset () override
 removes all elements from the database.
 
void train (idx_t n, const float *x) override
 Trains the quantizer and calls train_residual to train sub-quantizers.
 
void add (idx_t n, const float *x) override
 Calls add_with_ids with NULL ids.
 
virtual void search_preassigned (idx_t n, const float *x, idx_t k, const idx_t *assign, const float *centroid_dis, float *distances, idx_t *labels, bool store_pairs, const IVFSearchParameters *params=nullptr) const
 
void search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels) const override
 
void range_search (idx_t n, const float *x, float radius, RangeSearchResult *result) const override
 
void range_search_preassigned (idx_t nx, const float *x, float radius, const idx_t *keys, const float *coarse_dis, RangeSearchResult *result) const
 
void reconstruct (idx_t key, float *recons) const override
 
void reconstruct_n (idx_t i0, idx_t ni, float *recons) const override
 
void search_and_reconstruct (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const override
 
idx_t remove_ids (const IDSelector &sel) override
 Dataset manipulation functions.
 
void check_compatible_for_merge (const IndexIVF &other) const
 
virtual void merge_from (IndexIVF &other, idx_t add_id)
 
virtual void copy_subset_to (IndexIVF &other, int subset_type, idx_t a1, idx_t a2) const
 
size_t get_list_size (size_t list_no) const
 
void make_direct_map (bool new_maintain_direct_map=true)
 
void replace_invlists (InvertedLists *il, bool own=false)
 replace the inverted lists, old one is deallocated if own_invlists
 
- Public Member Functions inherited from faiss::Index
 Index (idx_t d=0, MetricType metric=METRIC_L2)
 
void assign (idx_t n, const float *x, idx_t *labels, idx_t k=1)
 
void compute_residual (const float *x, float *residual, idx_t key) const
 
void display () const
 
- Public Member Functions inherited from faiss::Level1Quantizer
void train_q1 (size_t n, const float *x, bool verbose, MetricType metric_type)
 Trains the quantizer and calls train_residual to train sub-quantizers.
 
 Level1Quantizer (Index *quantizer, size_t nlist)
 

Public Attributes

bool by_residual
 Encode residual or plain vector?
 
ProductQuantizer pq
 produces the codes
 
bool do_polysemous_training
 reorder PQ centroids after training?
 
PolysemousTraining * polysemous_training
 if NULL, use default
 
size_t scan_table_threshold
 use table computation or on-the-fly?
 
int polysemous_ht
 Hamming thresh for polysemous filtering.
 
int use_precomputed_table
 if by_residual, build precomputed tables
 
std::vector< float > precomputed_table
 
- Public Attributes inherited from faiss::IndexIVF
InvertedLists * invlists
 Access to the actual data.
 
bool own_invlists
 
size_t code_size
 code size per vector in bytes
 
size_t nprobe
 number of probes at query time
 
size_t max_codes
 max nb of codes to visit to do a query
 
int parallel_mode
 
bool maintain_direct_map
 map for direct access to the elements. Enables reconstruct().
 
std::vector< idx_t > direct_map
 
- Public Attributes inherited from faiss::Index
int d
 vector dimension
 
idx_t ntotal
 total nb of indexed vectors
 
bool verbose
 verbosity level
 
bool is_trained
 set if the Index does not require training, or if training is done already
 
MetricType metric_type
 type of metric this index uses for search
 
- Public Attributes inherited from faiss::Level1Quantizer
Index * quantizer
 quantizer that maps vectors to inverted lists
 
size_t nlist
 number of possible key values
 
char quantizer_trains_alone
 
bool own_fields
 whether object owns the quantizer
 
ClusteringParameters cp
 to override default clustering params
 
Index * clustering_index
 to override index used during clustering
 

Static Public Attributes

static size_t precomputed_table_max_bytes = ((size_t)1) << 31
 2G by default, accommodates tables up to PQ32 w/ 65536 centroids
 

Additional Inherited Members

- Public Types inherited from faiss::Index
using idx_t = long
 all indices are this type
 
using component_t = float
 
using distance_t = float
 

Detailed Description

Inverted file with Product Quantizer encoding. Each residual vector is encoded as a product quantizer code.

Definition at line 34 of file IndexIVFPQ.h.
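
The residual encoding described above can be illustrated with a minimal, self-contained sketch. It does not use the Faiss API: ToyPQ and encode_residual are hypothetical names introduced here, the centroid layout is an assumption, and each sub-quantizer index is assumed to fit in one byte (ksub <= 256).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Toy product quantizer (hypothetical, not faiss::ProductQuantizer):
// M sub-quantizers, each holding ksub centroids of dimension dsub.
struct ToyPQ {
    size_t M, ksub, dsub;
    std::vector<float> centroids; // assumed layout: [M][ksub][dsub]
};

// Encode the residual (x - coarse_centroid) as M one-byte indices,
// one per sub-vector, by picking the nearest sub-centroid in L2.
std::vector<uint8_t> encode_residual(const ToyPQ& pq,
                                     const std::vector<float>& x,
                                     const std::vector<float>& coarse_centroid) {
    assert(pq.ksub <= 256);
    assert(x.size() == pq.M * pq.dsub);
    assert(coarse_centroid.size() == x.size());
    assert(pq.centroids.size() == pq.M * pq.ksub * pq.dsub);

    std::vector<uint8_t> code(pq.M, 0);
    for (size_t m = 0; m < pq.M; ++m) {
        float best = std::numeric_limits<float>::max();
        for (size_t k = 0; k < pq.ksub; ++k) {
            const float* c = &pq.centroids[(m * pq.ksub + k) * pq.dsub];
            float dis = 0;
            for (size_t j = 0; j < pq.dsub; ++j) {
                // residual component for sub-vector m, coordinate j
                float r = x[m * pq.dsub + j] - coarse_centroid[m * pq.dsub + j];
                float diff = r - c[j];
                dis += diff * diff;
            }
            if (dis < best) {
                best = dis;
                code[m] = static_cast<uint8_t>(k);
            }
        }
    }
    return code;
}
```

With M = 2, ksub = 2, dsub = 1, calling encode_residual on a 2-dimensional vector returns a 2-byte code, one index per sub-vector of the residual.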

Member Function Documentation

void faiss::IndexIVFPQ::add_core_o ( idx_t  n,
const float *  x,
const long *  xids,
float *  residuals_2,
const long *  precomputed_idx = nullptr 
)

same as add_core, also:

  • output 2nd level residuals if residuals_2 != NULL
  • use precomputed list numbers if precomputed_idx != NULL

Definition at line 220 of file IndexIVFPQ.cpp.

void faiss::IndexIVFPQ::encode_multiple ( size_t  n,
long *  keys,
const float *  x,
uint8_t *  codes,
bool  compute_keys = false 
) const

Encode multiple vectors

Parameters
n             number of vectors to encode
keys          posting list ids for those vectors (size n)
x             vectors (size n * d)
codes         output codes (size n * code_size)
compute_keys  if false, assume keys are precomputed; otherwise compute them

Definition at line 149 of file IndexIVFPQ.cpp.

void faiss::IndexIVFPQ::encode_vectors ( idx_t  n,
const float *  x,
const idx_t *  list_nos,
uint8_t *  codes 
) const
override virtual

Encodes a set of vectors as they would appear in the inverted lists

Parameters
list_nos  inverted list ids as returned by the quantizer (size n). Entries equal to -1 are ignored.
codes     output codes (size n * code_size)

Implements faiss::IndexIVF.

Definition at line 206 of file IndexIVFPQ.cpp.

size_t faiss::IndexIVFPQ::find_duplicates ( idx_t *  ids,
size_t *  lims 
) const

Find exact duplicates in the dataset.

the duplicates are returned in pre-allocated arrays (see the max sizes).

lims  limits between groups of duplicates (max size ntotal / 2 + 1)
ids   ids[lims[i]] : ids[lims[i+1]-1] is a group of duplicates (max size ntotal)

Returns
n number of groups found

Definition at line 1141 of file IndexIVFPQ.cpp.
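
The (ids, lims) output layout described above can be unpacked as in the following sketch. split_duplicate_groups is a hypothetical helper, not part of Faiss; it assumes lims holds at least ngroups + 1 entries, where ngroups is the value returned by find_duplicates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using idx_t = long; // matches the idx_t typedef listed on this page

// Split the flat (ids, lims) arrays into one vector of ids per group.
// Group i spans ids[lims[i]] .. ids[lims[i+1]-1], both ends included.
std::vector<std::vector<idx_t>> split_duplicate_groups(
        const std::vector<idx_t>& ids,
        const std::vector<size_t>& lims,
        size_t ngroups) {
    assert(lims.size() >= ngroups + 1);
    std::vector<std::vector<idx_t>> groups;
    groups.reserve(ngroups);
    for (size_t i = 0; i < ngroups; ++i) {
        assert(lims[i] <= lims[i + 1] && lims[i + 1] <= ids.size());
        auto first = ids.begin() + static_cast<std::ptrdiff_t>(lims[i]);
        auto last = ids.begin() + static_cast<std::ptrdiff_t>(lims[i + 1]);
        groups.emplace_back(first, last);
    }
    return groups;
}
```

In real use, ids would be pre-allocated with ntotal entries and lims with ntotal / 2 + 1 entries before calling find_duplicates, per the maximum sizes stated above.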

void faiss::IndexIVFPQ::precompute_table ( )

build precomputed table

Precomputed tables for residuals

During IVFPQ search with by_residual, we compute

d = || x - y_C - y_R ||^2

where x is the query vector, y_C the coarse centroid, y_R the refined PQ centroid. The expression can be decomposed as:

d = || x - y_C ||^2 + || y_R ||^2 + 2 * (y_C|y_R) - 2 * (x|y_R)

term 1 = || x - y_C ||^2
term 2 = || y_R ||^2 + 2 * (y_C|y_R)
term 3 = - 2 * (x|y_R)

When using multiprobe, we use the following decomposition:

  • term 1 is the distance to the coarse centroid, that is computed during the 1st stage search.
  • term 2 can be precomputed, as it does not involve x. However, because of the PQ, it needs nlist * M * ksub storage. This is why use_precomputed_table is off by default.
  • term 3 is the classical non-residual distance table.

Since y_R is defined by a product quantizer, it is split across subvectors and stored separately for each subvector. If the coarse quantizer is a MultiIndexQuantizer, the table can be stored more compactly.

At search time, the tables for term 2 and term 3 are added up. This is faster when the length of the lists is > ksub * M.

Definition at line 363 of file IndexIVFPQ.cpp.
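
The three-term decomposition above can be checked numerically with a small sketch. The names inner, direct_distance, Terms, and decompose are hypothetical and introduced only for illustration; plain vectors stand in for the query x, the coarse centroid y_C, and the PQ centroid y_R.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inner product (a|b) of two vectors of equal length.
double inner(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double s = 0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Direct evaluation of || x - y_C - y_R ||^2.
double direct_distance(const std::vector<double>& x,
                       const std::vector<double>& yC,
                       const std::vector<double>& yR) {
    double s = 0;
    for (size_t i = 0; i < x.size(); ++i) {
        double diff = x[i] - yC[i] - yR[i];
        s += diff * diff;
    }
    return s;
}

struct Terms { double term1, term2, term3; };

// The same distance split into the three terms of the decomposition.
Terms decompose(const std::vector<double>& x,
                const std::vector<double>& yC,
                const std::vector<double>& yR) {
    std::vector<double> xc(x.size()); // x - y_C
    for (size_t i = 0; i < x.size(); ++i) xc[i] = x[i] - yC[i];
    Terms t;
    t.term1 = inner(xc, xc);                          // || x - y_C ||^2
    t.term2 = inner(yR, yR) + 2.0 * inner(yC, yR);    // independent of x
    t.term3 = -2.0 * inner(x, yR);                    // depends on x and y_R only
    return t;
}
```

For any inputs, term1 + term2 + term3 should equal direct_distance up to floating-point rounding. Only term 2 is independent of the query, which is why it is the part that can be tabulated ahead of time.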

void faiss::IndexIVFPQ::reconstruct_from_offset ( long  list_no,
long  offset,
float *  recons 
) const
override virtual

Reconstruct a vector given the location in terms of (inv list index + inv list offset) instead of the id.

Useful for reconstructing when the direct_map is not maintained and the inv list offset is computed by search_preassigned() with store_pairs set.

Reimplemented from faiss::IndexIVF.

Reimplemented in faiss::IndexIVFPQR.

Definition at line 310 of file IndexIVFPQ.cpp.
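
A label produced with store_pairs set has to be split back into (list_no, offset) before calling reconstruct_from_offset. This page does not specify how the pair is packed. The sketch below assumes list_no sits in the upper 32 bits and offset in the lower 32 bits of a 64-bit label; verify this against the Faiss sources for your version. ListLocation, pack_location, and unpack_location are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct ListLocation {
    std::int64_t list_no; // inverted list index
    std::int64_t offset;  // position inside that inverted list
};

// ASSUMED packing (not stated on this page): list_no << 32 | offset.
std::int64_t pack_location(std::int64_t list_no, std::int64_t offset) {
    assert(list_no >= 0 && list_no < (std::int64_t(1) << 31));
    assert(offset >= 0 && offset < (std::int64_t(1) << 32));
    return (list_no << 32) | offset;
}

// Inverse of pack_location under the same assumption.
ListLocation unpack_location(std::int64_t label) {
    assert(label >= 0);
    return {label >> 32, label & 0xffffffff};
}
```

Under that assumption, the two fields of unpack_location(label) would be passed as the list_no and offset arguments of reconstruct_from_offset.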

Member Data Documentation

std::vector<float> faiss::IndexIVFPQ::precomputed_table

If use_precomputed_table is enabled, size is nlist * pq.M * pq.ksub.

Definition at line 59 of file IndexIVFPQ.h.

int faiss::IndexIVFPQ::use_precomputed_table

if by_residual, build precomputed tables

Precomputed tables speed up query preprocessing at some memory cost:

  • =-1: force disable
  • =0: decide heuristically (default: use tables only if they are < precomputed_table_max_bytes)
  • =1: tables that work for all quantizers (size 256 * nlist * M)
  • =2: specific version for MultiIndexQuantizer (much more compact)

Definition at line 54 of file IndexIVFPQ.h.
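
The memory cost behind the heuristic mode can be estimated with a short sketch. It assumes 4-byte floats and a 64-bit size_t, and it treats a table exactly at the limit as fitting, which is consistent with the "accommodates tables up to PQ32 w/ 65536 centroids" note but is an assumption about the boundary. The function names are hypothetical.

```cpp
#include <cstddef>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");

// Bytes needed by the generic precomputed table: nlist * M * ksub floats.
size_t precomputed_table_bytes(size_t nlist, size_t M, size_t ksub) {
    return nlist * M * ksub * sizeof(float);
}

// True if the table fits in the budget. The default mirrors
// precomputed_table_max_bytes = ((size_t)1) << 31 from this page.
// ASSUMPTION: a table exactly equal to the budget is accepted.
bool table_fits(size_t nlist, size_t M, size_t ksub,
                size_t max_bytes = static_cast<size_t>(1) << 31) {
    return precomputed_table_bytes(nlist, M, ksub) <= max_bytes;
}
```

For nlist = 65536, M = 32, and ksub = 256, the table takes 65536 * 32 * 256 * 4 = 2^31 bytes, which is exactly the default budget.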


The documentation for this struct was generated from the following files: