6 Commits (9590ad27460f65fe3dae3ba61d8b5e3e8d03265f)

Author SHA1 Message Date
Michael Norris eff0898a13 Enable linting: lint config changes plus arc lint command (#3966)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3966

This actually enables the linting.

Manual changes:
- tools/arcanist/lint/fbsource-licenselint-config.toml
- tools/arcanist/lint/fbsource-lint-engine.toml

Automated changes:
`arc lint --apply-patches --take LICENSELINT --paths-cmd 'hg files faiss'`

Reviewed By: asadoughi

Differential Revision: D64484165

fbshipit-source-id: 4f2f6e953c94ef6ebfea8a5ae035ccfbea65ed04
2024-10-22 09:46:48 -07:00
Matthijs Douze 838612c9d7 torch.distributed kmeans (#3876)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3876

Demo script for distributed kmeans. It provides a `DatasetAssign` object and shows how to run it with torch.distributed.

Reviewed By: asadoughi, pankajsingh88

Differential Revision: D63013820

fbshipit-source-id: 22c959f3afdc04fd4aa8b9aeed309ea6290b1328
2024-09-20 09:15:27 -07:00
Matthijs Douze 6baebe2cee begin torch_contrib (#3872)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3872

The contrib.torch subdirectory is intended to hold Python modules that are useful for similarity search and that operate on CPU or GPU PyTorch tensors.

The current version includes CPU clustering on torch tensors. To be added:
* implementation of PQ

Reviewed By: asadoughi

Differential Revision: D62759207

fbshipit-source-id: 87dbaa5083e3f2f4f60526815e22ded4e83e8559
2024-09-20 09:15:27 -07:00
Matthijs Douze 0d7817e88f rewrite python kmeans without scipy (#3873)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3873

The previous version required scipy for the accumulation step; this is replaced here with a pure numpy accumulation.
As a result, scipy is no longer needed for non-sparse data.
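
A minimal sketch of one scipy-free way to do this accumulation, assuming the goal is to sum the vectors assigned to each centroid. The helper name `accumulate_per_centroid` and the use of `np.add.at` are illustrative assumptions, not necessarily what this commit does:

```python
import numpy as np

def accumulate_per_centroid(x, assign, nc, weights=None):
    """Sum the rows of x that fall into each of the nc clusters (hypothetical helper)."""
    sums = np.zeros((nc, x.shape[1]), dtype=x.dtype)
    if weights is not None:
        # scale each vector by its weight before accumulating
        x = x * weights[:, np.newaxis]
    # unbuffered in-place add: every repeated index in `assign` contributes
    np.add.at(sums, assign, x)
    return sums

x = np.arange(8, dtype="float32").reshape(4, 2)
assign = np.array([0, 1, 0, 1])
sums = accumulate_per_centroid(x, assign, nc=2)
```

A plain `sums[assign] += x` would keep only one contribution per repeated index, which is why the unbuffered `np.add.at` is used in this sketch.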

Reviewed By: junjieqi

Differential Revision: D62884307

fbshipit-source-id: 5443634e487387a2b518fd2a7f9a3d9a40abd4b4
2024-09-20 09:15:27 -07:00
Matthijs Douze 016aa04602 make balanced clusters the default (#2796)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2796

This diff makes balanced clusters the default for 2-level clustering. This seems to improve slightly over the previous default of uniform clusters; see

https://github.com/fairinternal/faiss_improvements/blob/master/better_coarse_quantizer/two_level_clustering.ipynb

Warning: the nc2 argument of two_level_clustering becomes the *total* number of clusters.
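
To illustrate what a *total* budget means, here is a rough sketch that splits `nc2` sub-clusters across first-level clusters in proportion to their sizes. The helper `split_budget` is hypothetical, and the proportional rule is an assumption about what "balanced" means here, not the code from this diff:

```python
import numpy as np

def split_budget(sizes, nc2):
    """Split a total of nc2 sub-clusters across first-level clusters,
    proportionally to the number of points in each (hypothetical helper)."""
    csum = np.cumsum(sizes)
    # cumulative boundaries of the allocation; the last one is exactly nc2
    bounds = np.hstack(([0], csum * nc2 // csum[-1]))
    return bounds[1:] - bounds[:-1]

sizes = np.array([100, 300, 600])   # points per first-level cluster
alloc = split_budget(sizes, nc2=10)
```

Under this rule the allocations always sum to `nc2`, so a caller that previously passed a per-cluster count would need to pass the overall count instead.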

Reviewed By: algoriddle

Differential Revision: D44421222

fbshipit-source-id: 951b7fc043be4a41762a7e6f7a6fcfb71e303832
2023-03-28 07:23:30 -07:00
Matthijs Douze b8fe92dfee contrib clustering module (#2217)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2217

This diff introduces a new Faiss contrib module that contains:
- generic k-means implemented in Python (previously in distributed_ondisk)
- the two-level clustering code, including a simple function that runs it on a Faiss IVF index.
- sparse clustering code (new)

The main idea is that this code is often reused, so it is better to keep it in contrib.

Reviewed By: beauby

Differential Revision: D34170932

fbshipit-source-id: cc297cc56d241b5ef421500ed410d8e2be0f1b77
2022-02-28 14:18:47 -08:00