#include <IndexIVF.h>
Public Member Functions | |
IndexIVF (Index *quantizer, size_t d, size_t nlist, size_t code_size, MetricType metric=METRIC_L2) | |
void | reset () override |
removes all elements from the database. | |
void | train (idx_t n, const float *x) override |
Trains the quantizer and calls train_residual to train sub-quantizers. | |
void | add (idx_t n, const float *x) override |
Calls add_with_ids with NULL ids. | |
void | add_with_ids (idx_t n, const float *x, const idx_t *xids) override |
default implementation that calls encode_vectors | |
virtual void | encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const =0 |
virtual void | train_residual (idx_t n, const float *x) |
virtual void | search_preassigned (idx_t n, const float *x, idx_t k, const idx_t *assign, const float *centroid_dis, float *distances, idx_t *labels, bool store_pairs, const IVFSearchParameters *params=nullptr) const |
void | search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels) const override |
void | range_search (idx_t n, const float *x, float radius, RangeSearchResult *result) const override |
void | range_search_preassigned (idx_t nx, const float *x, float radius, const idx_t *keys, const float *coarse_dis, RangeSearchResult *result) const |
virtual InvertedListScanner * | get_InvertedListScanner (bool store_pairs=false) const |
get a scanner for this index (store_pairs means ignore labels) | |
void | reconstruct (idx_t key, float *recons) const override |
void | reconstruct_n (idx_t i0, idx_t ni, float *recons) const override |
void | search_and_reconstruct (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const override |
virtual void | reconstruct_from_offset (idx_t list_no, idx_t offset, float *recons) const |
idx_t | remove_ids (const IDSelector &sel) override |
Dataset manipulation functions. | |
void | check_compatible_for_merge (const IndexIVF &other) const |
virtual void | merge_from (IndexIVF &other, idx_t add_id) |
virtual void | copy_subset_to (IndexIVF &other, int subset_type, idx_t a1, idx_t a2) const |
size_t | get_list_size (size_t list_no) const |
void | make_direct_map (bool new_maintain_direct_map=true) |
void | replace_invlists (InvertedLists *il, bool own=false) |
replace the inverted lists, old one is deallocated if own_invlists | |
Public Member Functions inherited from faiss::Index | |
Index (idx_t d=0, MetricType metric=METRIC_L2) | |
void | assign (idx_t n, const float *x, idx_t *labels, idx_t k=1) |
void | compute_residual (const float *x, float *residual, idx_t key) const |
void | display () const |
Public Member Functions inherited from faiss::Level1Quantizer | |
void | train_q1 (size_t n, const float *x, bool verbose, MetricType metric_type) |
Trains the quantizer and calls train_residual to train sub-quantizers. | |
Level1Quantizer (Index *quantizer, size_t nlist) | |
Public Attributes | |
InvertedLists * | invlists |
Access to the actual data. | |
bool | own_invlists |
size_t | code_size |
code size per vector in bytes | |
size_t | nprobe |
number of probes at query time | |
size_t | max_codes |
max nb of codes to visit to do a query | |
int | parallel_mode |
bool | maintain_direct_map |
map for direct access to the elements. Enables reconstruct(). | |
std::vector< idx_t > | direct_map |
Public Attributes inherited from faiss::Index | |
int | d |
vector dimension | |
idx_t | ntotal |
total nb of indexed vectors | |
bool | verbose |
verbosity level | |
bool | is_trained |
set if the Index does not require training, or if training is done already | |
MetricType | metric_type |
type of metric this index uses for search | |
Public Attributes inherited from faiss::Level1Quantizer | |
Index * | quantizer |
quantizer that maps vectors to inverted lists | |
size_t | nlist |
number of possible key values | |
char | quantizer_trains_alone |
bool | own_fields |
whether object owns the quantizer | |
ClusteringParameters | cp |
to override default clustering params | |
Index * | clustering_index |
to override index used during clustering | |
Additional Inherited Members | |
Public Types inherited from faiss::Index | |
using | idx_t = long |
all indices are this type | |
using | component_t = float |
using | distance_t = float |
Index based on an inverted file (IVF)
In the inverted file, the quantizer (an Index instance) provides a quantization index for each vector to be added. The quantization index maps to a list (aka inverted list or posting list), where the id of the vector is stored.
The inverted list object is required only after training. If none is set externally, an ArrayInvertedLists is used automatically.
At search time, the vector to be searched is also quantized, and only the list corresponding to the quantization index is searched. This speeds up the search by making it non-exhaustive. This can be relaxed using multi-probe search: a few (nprobe) quantization indices are selected and several inverted lists are visited.
Sub-classes implement a post-filtering of the index that refines the distance estimation from the query to database vectors.
Definition at line 90 of file IndexIVF.h.
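The effect of nprobe described above can be illustrated with a toy model. The following is a minimal sketch, not the faiss implementation: `ToyIVF`, its members, and the restriction to 1-D vectors are all assumptions made for illustration. Each vector is stored in the list of its nearest centroid, and a query scans only the `nprobe` nearest lists.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Toy inverted file over 1-D vectors (hypothetical, for illustration only).
struct ToyIVF {
    std::vector<float> centroids;                // coarse quantizer: one centroid per list
    std::vector<std::vector<std::int64_t>> ids;  // per-list stored ids
    std::vector<std::vector<float>> values;      // per-list stored vectors

    explicit ToyIVF(std::vector<float> c)
        : centroids(std::move(c)), ids(centroids.size()), values(centroids.size()) {}

    // Indices of the nprobe closest centroids, nearest first.
    std::vector<std::size_t> probe(float x, std::size_t nprobe) const {
        std::vector<std::size_t> order(centroids.size());
        for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
        auto dist = [&](std::size_t i) {
            float diff = centroids[i] - x;
            return diff * diff;
        };
        std::sort(order.begin(), order.end(),
                  [&](std::size_t a, std::size_t b) { return dist(a) < dist(b); });
        order.resize(std::min(nprobe, order.size()));
        return order;
    }

    // Store the vector in the list of its nearest centroid.
    void add(std::int64_t id, float x) {
        assert(!centroids.empty());
        std::size_t list_no = probe(x, 1)[0];
        ids[list_no].push_back(id);
        values[list_no].push_back(x);
    }

    // Id of the nearest stored vector among the probed lists, or -1 if none.
    std::int64_t search(float q, std::size_t nprobe) const {
        std::int64_t best = -1;
        float best_dist = std::numeric_limits<float>::max();
        for (std::size_t list_no : probe(q, nprobe)) {
            for (std::size_t j = 0; j < ids[list_no].size(); ++j) {
                float diff = values[list_no][j] - q;
                if (diff * diff < best_dist) {
                    best_dist = diff * diff;
                    best = ids[list_no][j];
                }
            }
        }
        return best;
    }
};
```

With centroids at 0 and 10, a query that falls just on one side of the boundary can miss a closer vector stored in the other list when `nprobe` is 1; raising `nprobe` to 2 visits both lists and recovers it, at the cost of scanning more codes.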
faiss::IndexIVF::IndexIVF | ( | Index * | quantizer, |
size_t | d, | ||
size_t | nlist, | ||
size_t | code_size, | ||
MetricType | metric = METRIC_L2 |
||
) |
The Inverted file takes a quantizer (an Index) on input, which implements the function mapping a vector to a list identifier. The pointer is borrowed: the quantizer should not be deleted while the IndexIVF is in use.
Definition at line 114 of file IndexIVF.cpp.
void faiss::IndexIVF::check_compatible_for_merge | ( | const IndexIVF & | other | ) | const |
Checks that the two indexes are compatible (i.e., they are trained in the same way and have the same parameters). Throws otherwise.
Definition at line 710 of file IndexIVF.cpp.
void faiss::IndexIVF::copy_subset_to (IndexIVF &other, int subset_type, idx_t a1, idx_t a2) const | virtual |
Copies a subset of the entries of this index to the other index.
if subset_type == 0: copies ids in [a1, a2)
if subset_type == 1: copies ids if id % a1 == a2
if subset_type == 2: copies inverted lists such that a1 elements are left before and a2 elements are after
Definition at line 748 of file IndexIVF.cpp.
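The id-based selection rules for subset_type 0 and 1 can be written as a small predicate. This is a sketch under stated assumptions: `in_subset` and `select_ids` are hypothetical helpers, not part of faiss, `idx_t` is taken to be a 64-bit signed integer, and subset_type 2 is left out because it selects by position within the inverted lists rather than by id.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using idx_t = std::int64_t;  // assumption: 64-bit signed ids

// Hypothetical helper: true if `id` is selected for subset_type 0 or 1.
bool in_subset(int subset_type, idx_t id, idx_t a1, idx_t a2) {
    if (subset_type == 0) {
        return a1 <= id && id < a2;  // ids in the half-open range [a1, a2)
    }
    assert(subset_type == 1 && a1 > 0);  // only types 0 and 1 are sketched here
    return id % a1 == a2;                // ids whose remainder modulo a1 is a2
}

// Hypothetical helper: filters a list of ids with the predicate above.
std::vector<idx_t> select_ids(const std::vector<idx_t>& ids,
                              int subset_type, idx_t a1, idx_t a2) {
    std::vector<idx_t> out;
    for (idx_t id : ids) {
        if (in_subset(subset_type, id, a1, a2)) out.push_back(id);
    }
    return out;
}
```

Type 0 is convenient for cutting a contiguous id range, while type 1 spreads the ids evenly over a1 shards, with a2 choosing the shard.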
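void faiss::IndexIVF::encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const | pure virtual |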
Encodes a set of vectors as they would appear in the inverted lists
list_nos | inverted list ids as returned by the quantizer (size n). -1s are ignored. |
codes | output codes, size n * code_size |
Implemented in faiss::IndexIVFScalarQuantizer, faiss::IndexIVFPQ, faiss::IndexIVFSpectralHash, and faiss::IndexIVFFlat.
void faiss::IndexIVF::make_direct_map | ( | bool | new_maintain_direct_map = true | ) |
initialize a direct map
new_maintain_direct_map | if true, create a direct map, else clear it |
Definition at line 202 of file IndexIVF.cpp.
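Conceptually, a direct map records where each id lives in the inverted lists so that a vector can be located without scanning. The sketch below is an assumption-laden illustration, not the faiss code: `Location` and `build_direct_map` are hypothetical names, and ids are assumed to be exactly 0..ntotal-1 with each id stored once. How faiss encodes a location inside the `direct_map` entries is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using idx_t = std::int64_t;  // assumption: 64-bit signed ids

// Hypothetical record of where a vector is stored.
struct Location {
    std::size_t list_no;  // inverted list index
    std::size_t offset;   // position inside that list
};

// list_ids[l][o] is the id stored at offset o of inverted list l.
// Assumes the ids are exactly 0..ntotal-1, each appearing once.
std::vector<Location> build_direct_map(
        const std::vector<std::vector<idx_t>>& list_ids) {
    std::size_t ntotal = 0;
    for (const auto& list : list_ids) ntotal += list.size();

    std::vector<Location> direct_map(ntotal);
    for (std::size_t l = 0; l < list_ids.size(); ++l) {
        for (std::size_t o = 0; o < list_ids[l].size(); ++o) {
            idx_t id = list_ids[l][o];
            assert(id >= 0 && static_cast<std::size_t>(id) < ntotal);
            direct_map[static_cast<std::size_t>(id)] = Location{l, o};
        }
    }
    return direct_map;
}
```

Indexing the result by id gives the list and offset in constant time, which is the kind of lookup an id-based reconstruct() needs.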
void faiss::IndexIVF::merge_from (IndexIVF &other, idx_t add_id) | virtual |
Moves the entries from another dataset to self. On output, other is empty. add_id is added to all moved ids (for sequential ids, this would be this->ntotal).
Reimplemented in faiss::IndexIVFPQR.
Definition at line 721 of file IndexIVF.cpp.
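The role of add_id can be shown on bare id lists. This is a minimal sketch assuming a hypothetical `ToyLists` container that holds only ids (no codes); it mirrors the documented behaviour (entries move, `other` ends up empty, ids are shifted) but is not the faiss implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using idx_t = std::int64_t;  // assumption: 64-bit signed ids

// Hypothetical container: one vector of ids per inverted list.
struct ToyLists {
    std::vector<std::vector<idx_t>> ids;
};

// Moves every id of `other` into the matching list of `self`, shifted by add_id.
void merge_ids(ToyLists& self, ToyLists& other, idx_t add_id) {
    assert(self.ids.size() == other.ids.size());  // both must have the same nlist
    for (std::size_t l = 0; l < self.ids.size(); ++l) {
        for (idx_t id : other.ids[l]) {
            self.ids[l].push_back(id + add_id);
        }
        other.ids[l].clear();  // on output, other is empty
    }
}
```

If both indexes use sequential ids starting at 0, passing the number of entries already in `self` as `add_id` keeps the merged ids unique.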
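void faiss::IndexIVF::range_search (idx_t n, const float *x, float radius, RangeSearchResult *result) const | override virtual |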
Queries the index with n vectors of dimension d.
Returns all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
x | input vectors to search, size n * d |
radius | search radius |
result | result table |
Reimplemented from faiss::Index.
Reimplemented in faiss::IndexIVFFlatDedup.
Definition at line 434 of file IndexIVF.cpp.
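The "distance < radius" criterion can be sketched as an exhaustive scan for a single query. The function below is a hypothetical helper, not faiss code; it assumes the distance is the squared L2 distance and returns plain ids instead of filling a RangeSearchResult.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using idx_t = std::int64_t;  // assumption: 64-bit signed ids

// Hypothetical brute-force range search for one query vector.
// db holds the database vectors back to back (size ntotal * d).
// Assumption: "distance" means squared L2 distance.
std::vector<idx_t> range_search_brute(const std::vector<float>& db, std::size_t d,
                                      const std::vector<float>& q, float radius) {
    assert(d > 0 && q.size() == d && db.size() % d == 0);
    std::vector<idx_t> hits;
    for (std::size_t i = 0; i < db.size() / d; ++i) {
        float dist = 0.0f;
        for (std::size_t j = 0; j < d; ++j) {
            float diff = db[i * d + j] - q[j];
            dist += diff * diff;
        }
        if (dist < radius) hits.push_back(static_cast<idx_t>(i));  // strict inequality
    }
    return hits;
}
```

Unlike a k-NN search, the number of results per query is not known in advance, which is why the real API writes into a dedicated result structure rather than fixed-size output arrays.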
void faiss::IndexIVF::reconstruct (idx_t key, float *recons) const | override virtual |
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
key | id of the vector to reconstruct |
recons | reconstructed vector (size d) |
Reimplemented from faiss::Index.
Definition at line 562 of file IndexIVF.cpp.
void faiss::IndexIVF::reconstruct_from_offset (idx_t list_no, idx_t offset, float *recons) const | virtual |
Reconstruct a vector given the location in terms of (inv list index + inv list offset) instead of the id.
Useful for reconstructing when the direct_map is not maintained and the inv list offset is computed by search_preassigned() with store_pairs set.
Reimplemented in faiss::IndexIVFPQR, faiss::IndexIVFScalarQuantizer, faiss::IndexIVFFlatDedup, faiss::IndexIVFPQ, and faiss::IndexIVFFlat.
Definition at line 633 of file IndexIVF.cpp.
void faiss::IndexIVF::reconstruct_n (idx_t i0, idx_t ni, float *recons) const | override virtual |
Reconstruct a subset of the indexed vectors.
Overrides default implementation to bypass reconstruct() which requires direct_map to be maintained.
i0 | first vector to reconstruct |
ni | nb of vectors to reconstruct |
recons | output array of reconstructed vectors, size ni * d |
Reimplemented from faiss::Index.
Definition at line 574 of file IndexIVF.cpp.
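void faiss::IndexIVF::search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels) const | override virtual |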
Assigns the vectors, then calls search_preassigned.
Implements faiss::Index.
Definition at line 228 of file IndexIVF.cpp.
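void faiss::IndexIVF::search_and_reconstruct (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const | override virtual |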
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
Overrides default implementation to avoid having to maintain direct_map and instead fetch the code offsets through the store_pairs flag in search_preassigned().
recons | reconstructed vectors size (n, k, d) |
Reimplemented from faiss::Index.
Definition at line 595 of file IndexIVF.cpp.
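void faiss::IndexIVF::search_preassigned (idx_t n, const float *x, idx_t k, const idx_t *assign, const float *centroid_dis, float *distances, idx_t *labels, bool store_pairs, const IVFSearchParameters *params=nullptr) const | virtual |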
search a set of vectors, that are pre-quantized by the IVF quantizer. Fill in the corresponding heaps with the query results. The default implementation uses InvertedListScanners to do the search.
n | nb of vectors to query |
x | query vectors, size n * d |
assign | coarse quantization indices, size n * nprobe |
centroid_dis | distances to coarse centroids, size n * nprobe |
distances | output distances, size n * k |
labels | output labels, size n * k |
store_pairs | store the inv list index and the inv list offset in the upper/lower 32 bits of the result instead of ids (used for reranking). |
params | used to override the object's search parameters |
Reimplemented in faiss::IndexIVFPQR, and faiss::IndexIVFFlatDedup.
Definition at line 250 of file IndexIVF.cpp.
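The store_pairs layout (list index in the upper 32 bits, list offset in the lower 32 bits of a label) can be sketched with three small helpers. The names `pack_pair`, `pair_list_no`, and `pair_offset` are hypothetical and chosen for this example; `idx_t` is assumed to be a 64-bit signed integer.

```cpp
#include <cassert>
#include <cstdint>

using idx_t = std::int64_t;  // assumption: 64-bit signed labels

// Hypothetical helper: list index in the upper 32 bits, offset in the lower 32 bits.
inline idx_t pack_pair(std::uint32_t list_no, std::uint32_t offset) {
    return static_cast<idx_t>((static_cast<std::uint64_t>(list_no) << 32) | offset);
}

// Hypothetical helper: recovers the inverted list index from a packed label.
inline std::uint32_t pair_list_no(idx_t label) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(label) >> 32);
}

// Hypothetical helper: recovers the offset inside the list from a packed label.
inline std::uint32_t pair_offset(idx_t label) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(label) & 0xffffffffu);
}
```

A label unpacked this way yields the (list_no, offset) pair that reconstruct_from_offset() takes, so search results can be reconstructed without a direct map.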
void faiss::IndexIVF::train_residual (idx_t n, const float *x) | virtual |
Sub-classes that encode the residuals can train their encoders here. Does nothing by default.
Reimplemented in faiss::IndexIVFPQR, faiss::IndexIVFScalarQuantizer, faiss::IndexIVFPQ, and faiss::IndexIVFSpectralHash.
Definition at line 703 of file IndexIVF.cpp.
int faiss::IndexIVF::parallel_mode |
Parallel mode determines how queries are parallelized with OpenMP
0 (default): parallelize over queries
1: parallelize over inverted lists
2: parallelize over both
Definition at line 106 of file IndexIVF.h.