Faiss
faiss::Clustering Struct Reference

#include <Clustering.h>

Inheritance diagram for faiss::Clustering:
faiss::ClusteringParameters

Public Types

typedef Index::idx_t idx_t
 

Public Member Functions

 Clustering (int d, int k)
 the only mandatory parameters are d (vector dimension) and k (number of centroids)
 
 Clustering (int d, int k, const ClusteringParameters &cp)
 
virtual void train (idx_t n, const float *x, faiss::Index &index)
 Index is used during the assignment stage.
 
- Public Member Functions inherited from faiss::ClusteringParameters
 ClusteringParameters ()
 sets reasonable defaults
 

Public Attributes

size_t d
 dimension of the vectors
 
size_t k
 number of centroids
 
std::vector< float > centroids
 centroids (k * d)
 
std::vector< float > obj
 
- Public Attributes inherited from faiss::ClusteringParameters
int niter
 clustering iterations
 
int nredo
 redo clustering this many times and keep best
 
bool verbose
 
bool spherical
 do we want normalized centroids?
 
bool update_index
 update index after each iteration?
 
int min_points_per_centroid
 minimum number of training points per centroid; with fewer points you get a warning
 
int max_points_per_centroid
 maximum number of training points per centroid, to limit the size of the dataset
 
int seed
 seed for the random number generator
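
As an illustration of what two of the inherited parameters express, here is a minimal standalone sketch. It is not Faiss code: the names normalize_centroids, check_training_set and TrainingSetSize are hypothetical, and the library's actual handling of these parameters may differ. The first helper shows what "normalized centroids" means for spherical (each centroid scaled to unit L2 norm); the second compares a training set size against the bounds implied by min_points_per_centroid and max_points_per_centroid.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: scale each centroid (a row of d floats in a k * d
// table) to unit L2 norm. All-zero rows are left unchanged.
void normalize_centroids(std::size_t d, std::vector<float>& centroids) {
    assert(d > 0 && centroids.size() % d == 0);
    const std::size_t k = centroids.size() / d;
    for (std::size_t c = 0; c < k; ++c) {
        float norm2 = 0.0f;
        for (std::size_t j = 0; j < d; ++j)
            norm2 += centroids[c * d + j] * centroids[c * d + j];
        if (norm2 == 0.0f) continue;
        const float inv = 1.0f / std::sqrt(norm2);
        for (std::size_t j = 0; j < d; ++j)
            centroids[c * d + j] *= inv;
    }
}

// Hypothetical classification of a training set of n points for k centroids.
enum class TrainingSetSize { TooFew, Ok, TooMany };

TrainingSetSize check_training_set(std::size_t n, std::size_t k,
                                   std::size_t min_points_per_centroid,
                                   std::size_t max_points_per_centroid) {
    if (n < k * min_points_per_centroid) return TrainingSetSize::TooFew;   // warning case
    if (n > k * max_points_per_centroid) return TrainingSetSize::TooMany;  // dataset would be limited
    return TrainingSetSize::Ok;
}
```

In this sketch, TooFew corresponds to the situation in which a warning is issued, and TooMany to the situation in which the size of the dataset is limited.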
 

Detailed Description

Clustering based on iterations that alternate an assignment step and a centroid update step.

The clustering is based on an Index object that assigns training points to the centroids. Therefore, at each iteration the centroids are added to the index.

On output, the centroids table is set to the latest version of the centroids, and they are also added to the index. If the centroids table is not empty on input, it is also used for initialization.

To do several clusterings, just call train() several times on different training sets, clearing the centroid table in between.
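
The assignment and centroid update iteration described above can be sketched in plain C++ as follows. This is a simplified illustration under stated assumptions, not the Faiss implementation: assign_points and update_centroids are hypothetical names, a brute-force nearest-centroid scan with squared L2 distance stands in for the search that the Index performs, and centroids that receive no points are simply left unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assignment step: for each point (row-major, d floats each) find the
// nearest centroid by squared L2 distance. The brute-force scan is a
// stand-in for the Index search. Returns the sum of nearest distances.
float assign_points(std::size_t d, const std::vector<float>& x,
                    const std::vector<float>& centroids,
                    std::vector<std::size_t>& assign) {
    assert(d > 0 && x.size() % d == 0 && centroids.size() % d == 0);
    const std::size_t n = x.size() / d, k = centroids.size() / d;
    assert(k > 0);
    assign.assign(n, 0);
    float objective = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        float best = std::numeric_limits<float>::max();
        for (std::size_t c = 0; c < k; ++c) {
            float dist = 0.0f;
            for (std::size_t j = 0; j < d; ++j) {
                const float diff = x[i * d + j] - centroids[c * d + j];
                dist += diff * diff;
            }
            if (dist < best) {
                best = dist;
                assign[i] = c;
            }
        }
        objective += best;
    }
    return objective;
}

// Update step: replace each centroid by the mean of its assigned points.
// Centroids with no assigned points are left unchanged in this sketch.
void update_centroids(std::size_t d, const std::vector<float>& x,
                      const std::vector<std::size_t>& assign,
                      std::vector<float>& centroids) {
    assert(d > 0 && centroids.size() % d == 0);
    const std::size_t n = assign.size(), k = centroids.size() / d;
    std::vector<float> sums(k * d, 0.0f);
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t i = 0; i < n; ++i) {
        ++counts[assign[i]];
        for (std::size_t j = 0; j < d; ++j)
            sums[assign[i] * d + j] += x[i * d + j];
    }
    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue;
        for (std::size_t j = 0; j < d; ++j)
            centroids[c * d + j] = sums[c * d + j] / static_cast<float>(counts[c]);
    }
}
```

Calling assign_points and then update_centroids repeatedly on the same centroids vector mimics successive iterations; starting from a non-empty vector corresponds to initializing from an existing centroids table.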

Definition at line 58 of file Clustering.h.

Member Data Documentation

std::vector<float> faiss::Clustering::obj

objective values (sum of distances reported by index) over iterations
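
One possible way to read such a sequence of objective values is to look for the point where it stops decreasing noticeably. The helper below is a hypothetical sketch, not part of Faiss: first_flat_iteration and rel_tol are illustrative names, and it assumes the vector holds one value per iteration of a single clustering run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the first iteration index i >= 1 at which the objective improved
// by less than rel_tol relative to iteration i - 1, or obj.size() if every
// iteration improved by at least rel_tol.
std::size_t first_flat_iteration(const std::vector<float>& obj, float rel_tol) {
    assert(rel_tol >= 0.0f);
    for (std::size_t i = 1; i < obj.size(); ++i) {
        const float prev = obj[i - 1];
        if (prev <= 0.0f) return i;  // nothing left to improve
        if ((prev - obj[i]) / prev < rel_tol) return i;
    }
    return obj.size();
}
```

A result equal to obj.size() suggests the objective was still improving by at least rel_tol at the last recorded iteration.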

Definition at line 68 of file Clustering.h.

