write distributed_kmeans centroids and assignments to hive tables (#4017)

Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4017

Expose an option to write k-means centroids and assignments to Hive tables, which brings us closer to parity with Digraph's Kmeans API. This is needed for cluster-balance data quality checks on large-scale centroid sets.
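
As an illustration of the kind of cluster-balance check the written assignments enable, here is a minimal sketch. It assumes the assignment column has already been read from the Hive table into an in-memory array; the function name `cluster_balance` is hypothetical and not part of this change. The imbalance factor used here is k * sum(count_i^2) / (sum(count_i))^2, which equals 1.0 for perfectly balanced clusters and grows as clusters become more skewed.

```python
import numpy as np


def cluster_balance(assignments, k):
    """Return per-cluster counts and an imbalance factor.

    assignments: iterable of cluster ids in [0, k), one per vector
                 (assumed to be loaded from the assignments table).
    k: number of centroids.
    """
    assignments = np.asarray(assignments, dtype=np.int64)
    # Count how many vectors were assigned to each centroid.
    counts = np.bincount(assignments, minlength=k)
    total = float(counts.sum())
    # 1.0 means perfectly balanced; k means everything fell in one cluster.
    imbalance = k * float(np.sum(counts.astype(np.float64) ** 2)) / (total * total)
    return counts, imbalance


balanced_counts, balanced_factor = cluster_balance([0, 0, 1, 1, 2, 2], k=3)
skewed_counts, skewed_factor = cluster_balance([0, 0, 0, 0], k=2)
```

Empty clusters show up as zeros in the returned counts, which is often the first thing worth checking on a large centroid set.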

Reviewed By: kuarora

Differential Revision: D64835789

fbshipit-source-id: 95cbea00bb6b4733c03836049bc379be813bf9e5
Mengdi Lin 2024-11-05 18:49:44 -08:00 committed by Facebook GitHub Bot
parent a11c1dbab6
commit cfd4804ff8


@@ -83,6 +83,8 @@ class DatasetDescriptor:
embedding_column: Optional[str] = None
embedding_id_column: Optional[str] = None
sampling_rate: Optional[float] = None
# sampling column for xdb