#include <Clustering.h>
Public Types | |
typedef Index::idx_t | idx_t |
Public Member Functions | |
Clustering (int d, int k) | |
the only mandatory parameters are k and d | |
Clustering (int d, int k, const ClusteringParameters &cp) | |
virtual void | train (idx_t n, const float *x, faiss::Index &index) |
Index is used during the assignment stage. | |
ClusteringParameters () | |
sets reasonable defaults | |
Public Attributes | |
size_t | d |
dimension of the vectors | |
size_t | k |
number of centroids | |
std::vector< float > | centroids |
centroids (k * d) | |
std::vector< float > | obj |
int | niter |
clustering iterations | |
int | nredo |
redo clustering this many times and keep best | |
bool | verbose |
bool | spherical |
do we want normalized centroids? | |
bool | update_index |
update index after each iteration? | |
int | min_points_per_centroid |
minimum number of training points per centroid; otherwise you get a warning | |
int | max_points_per_centroid |
maximum number of training points per centroid, to limit the size of the dataset | |
int | seed |
seed for the random number generator | |
Clustering based on alternating assignment and centroid-update iterations.
The clustering is based on an Index object that assigns training points to the centroids. Therefore, at each iteration the centroids are added to the index.
On output, the centroids table is set to the latest version of the centroids, and they are also added to the index. If the centroids table is not empty on input, it is also used for initialization.
To do several clusterings, just call train() several times on different training sets, clearing the centroid table in between.
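The assignment and centroid-update loop described above can be sketched in plain C++. This is a toy illustration, not the Faiss implementation: `ToyClustering` is a hypothetical stand-in that borrows the documented attribute names (`d`, `k`, `niter`, `centroids`, `obj`), a brute-force squared-L2 search stands in for the `Index`, and seeding from the first `k` training points is an assumption made for brevity.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in mirroring the documented attributes; not faiss::Clustering.
struct ToyClustering {
    size_t d, k;
    int niter = 10;
    std::vector<float> centroids; // k * d values, row-major
    std::vector<float> obj;       // one objective value per iteration

    ToyClustering(size_t d_, size_t k_) : d(d_), k(k_) {}

    // x holds n vectors of dimension d, row-major. Requires n >= k.
    void train(size_t n, const float* x) {
        // A centroid table of the right size is reused as the initialization;
        // otherwise seed with the first k training points (an assumption).
        if (centroids.size() != k * d) {
            centroids.assign(x, x + k * d);
        }
        obj.clear();
        std::vector<size_t> assign(n, 0);
        for (int it = 0; it < niter; ++it) {
            // Assignment stage: brute-force nearest centroid (squared L2).
            float total = 0;
            for (size_t i = 0; i < n; ++i) {
                float best = std::numeric_limits<float>::max();
                for (size_t c = 0; c < k; ++c) {
                    float dist = 0;
                    for (size_t j = 0; j < d; ++j) {
                        float diff = x[i * d + j] - centroids[c * d + j];
                        dist += diff * diff;
                    }
                    if (dist < best) { best = dist; assign[i] = c; }
                }
                total += best;
            }
            obj.push_back(total); // sum of distances for this iteration

            // Update stage: each centroid becomes the mean of its points.
            std::vector<float> sum(k * d, 0.0f);
            std::vector<size_t> count(k, 0);
            for (size_t i = 0; i < n; ++i) {
                ++count[assign[i]];
                for (size_t j = 0; j < d; ++j) {
                    sum[assign[i] * d + j] += x[i * d + j];
                }
            }
            for (size_t c = 0; c < k; ++c) {
                if (count[c] == 0) continue; // leave an empty cluster's centroid as is
                for (size_t j = 0; j < d; ++j) {
                    centroids[c * d + j] = sum[c * d + j] / static_cast<float>(count[c]);
                }
            }
        }
    }
};
```

In this sketch, calling `train()` a second time without clearing `centroids` resumes from the previous result, which mirrors the initialization behavior described above; clearing the table first starts a fresh clustering.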
Definition at line 58 of file Clustering.h.
std::vector<float> faiss::Clustering::obj |
objective values (sum of distances reported by index) over iterations
Definition at line 68 of file Clustering.h.
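Since `obj` holds one objective value per iteration, it can be inspected after training to see where the objective stopped improving. The helper below is a hypothetical illustration, not part of the library: `first_flat_iteration` and its `tol` parameter are made-up names, and the relative-improvement criterion is only one possible choice.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first iteration whose relative improvement over
// the previous iteration is below tol, or obj.size() if there is none.
size_t first_flat_iteration(const std::vector<float>& obj, float tol) {
    for (size_t i = 1; i < obj.size(); ++i) {
        float prev = obj[i - 1];
        // A non-positive previous objective cannot improve further.
        if (prev <= 0 || (prev - obj[i]) / prev < tol) {
            return i;
        }
    }
    return obj.size();
}
```

A result well below `niter` suggests that fewer iterations would have sufficed for that training set.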