Commit Graph

6 Commits (asadoughi-patch-1)

Author SHA1 Message Date
Nicholas Ormrod eb5e7341a6 facebook-unused-include-check in fbcode/faiss (#4029)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/4029

Remove headers flagged by facebook-unused-include-check over fbcode.faiss.

+ format and autodeps

This is a codemod. It was automatically generated and will be landed once it is approved and tests are passing in sandcastle.
You have been added as a reviewer by Sentinel or Butterfly.

Autodiff project: uif
Autodiff partition: fbcode.faiss
Autodiff bookmark: ad.uif.fbcode.faiss

Reviewed By: dtolnay

Differential Revision: D65957849

fbshipit-source-id: f6199250db595defd56f5e7b2828f838702e9a16
2024-11-15 13:53:46 -08:00
Michael Norris eff0898a13 Enable linting: lint config changes plus arc lint command (#3966)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3966

This actually enables the linting.

Manual changes:
- tools/arcanist/lint/fbsource-licenselint-config.toml
- tools/arcanist/lint/fbsource-lint-engine.toml

Automated changes:
`arc lint --apply-patches --take LICENSELINT --paths-cmd 'hg files faiss'`

Reviewed By: asadoughi

Differential Revision: D64484165

fbshipit-source-id: 4f2f6e953c94ef6ebfea8a5ae035ccfbea65ed04
2024-10-22 09:46:48 -07:00
Alexandr Guzhva 4d78137565 Place a useful cmake function 'link_to_faiss_lib' into a separate file (#3939)
Summary:
Add `cmake/link_to_faiss_lib.cmake`, which exposes a useful and reusable CMake `link_to_faiss_lib()` function

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3939

Reviewed By: mnorris11

Differential Revision: D64250261

Pulled By: mengdilin

fbshipit-source-id: bab5b7fab8effb33cb73024eb7eefd2319998e5b
2024-10-14 14:00:13 -07:00
Mengdi Lin 0df5d24a90 clean up hnsw benchmark (#3901)
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3901

1) remove system time from benchmark as this metric has extremely high jitter (50-100%) and is not useful for us

2) clean up command-line arguments and define a main function the external world can call

3) tweak defaults so the microbenchmark runs fast by default (this does not change the parameters we pass to microbenchmarks for servicelab)

Reviewed By: mnorris11

Differential Revision: D63650110

fbshipit-source-id: efc81563291f00701a0d1df1d27172adeb3ef231
2024-09-30 19:35:31 -07:00
Facebook Community Bot c8d1474fc5 Re-sync with internal repository (#3885)
The internal and external repositories are out of sync. This Pull Request attempts to bring them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.
2024-09-24 07:49:31 -07:00
mengdilin 149c1f4b3c Add performance regression tests (#3793)
Summary:
Add `CMakeList` to compile the `faiss/perf_tests` benchmarks. We will run the Google benchmarks as part of CI so people can see benchmarking results (there is no diff-to-diff regression detection in open-sourced CI).

==== Test Plan =====

See logs in CI that look like
```
Run on (4 X 3184.9 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.69, 2.84, 1.56
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                 53646755 ns     53643729 ns           20 code_size=1k
QT_4bit_uniform/iterations:20         52248603 ns     52246874 ns           20 code_size=1k
QT_6bit/iterations:20                 63697930 ns     63693459 ns           20 code_size=1.5k
QT_8bit/iterations:20                 43305175 ns     43303946 ns           20 code_size=2k
QT_8bit_direct/iterations:20          30771920 ns     30770261 ns           20 code_size=2k
QT_8bit_direct_signed/iterations:20   30744625 ns     30742891 ns           20 code_size=2k
QT_8bit_uniform/iterations:20         44227773 ns     44224242 ns           20 code_size=2k
QT_bf16/iterations:20                 32758794 ns     32758717 ns           20 code_size=4k
QT_fp16/iterations:20                 41068848 ns     41066492 ns           20 code_size=4k
2024-09-20T23:15:01+00:00
Running ./build/perf_tests/bench_scalar_quantizer_decode
Run on (4 X 3244.56 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.43, 2.78, 1.56
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                   338300 ns       338284 ns           20 code_size=64
QT_4bit_uniform/iterations:20           332928 ns       332914 ns           20 code_size=64
QT_6bit/iterations:20                   415683 ns       415674 ns           20 code_size=96
QT_8bit/iterations:20                   266034 ns       266026 ns           20 code_size=128
QT_8bit_direct/iterations:20             37552 ns        37553 ns           20 code_size=128
QT_8bit_direct_signed/iterations:20      39701 ns        39696 ns           20 code_size=128
QT_8bit_uniform/iterations:20           261535 ns       261529 ns           20 code_size=128
QT_bf16/iterations:20                    45518 ns        45506 ns           20 code_size=256
QT_fp16/iterations:20                   334602 ns       334584 ns           20 code_size=256
2024-09-20T23:15:02+00:00
Running ./build/perf_tests/bench_no_multithreading_rcq_search
Run on (4 X 3243.03 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.43, 2.78, 1.56
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
WARNING clustering 65536 points to 65536 centroids: please provide at least 2555904 training points
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
search/iterations:20   12763792 ns     10367188 ns           20
2024-09-20T23:15:51+00:00
Running ./build/perf_tests/bench_scalar_quantizer_accuracy
Run on (4 X 3231.04 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.85, 2.84, 1.65
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=64 code_size_two=128k ndiff_for_idempotence=0 sql2_recons_error=0.047396
QT_4bit_uniform/iterations:20            0.000 ns        0.000 ns            0 code_size=64 code_size_two=128k ndiff_for_idempotence=0 sql2_recons_error=0.0473931
QT_6bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=96 code_size_two=192k ndiff_for_idempotence=0 sql2_recons_error=2.6899m
QT_8bit/iterations:20                    0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=164.317u
QT_8bit_direct/iterations:20             0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=42.5514
QT_8bit_direct_signed/iterations:20      0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=42.5494
QT_8bit_uniform/iterations:20            0.000 ns        0.000 ns            0 code_size=128 code_size_two=256k ndiff_for_idempotence=0 sql2_recons_error=164.152u
QT_bf16/iterations:20                    0.000 ns        0.000 ns            0 code_size=256 code_size_two=512k ndiff_for_idempotence=0 sql2_recons_error=92.8328u
QT_fp16/iterations:20                    0.000 ns        0.000 ns            0 code_size=256 code_size_two=512k ndiff_for_idempotence=0 sql2_recons_error=1.44838u
2024-09-20T23:15:51+00:00
Running ./build/perf_tests/bench_scalar_quantizer_encode
Run on (4 X 3243.72 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 512 KiB (x2)
  L3 Unified 32768 KiB (x1)
Load Average: 2.85, 2.84, 1.65
----------------------------------------------------------------------------------------------
Benchmark                                    Time             CPU   Iterations UserCounters...
----------------------------------------------------------------------------------------------
QT_4bit/iterations:20                   702046 ns       701319 ns           20 code_size=64
QT_4bit_uniform/iterations:20           595889 ns       595880 ns           20 code_size=64
QT_6bit/iterations:20                  1287503 ns      1287542 ns           20 code_size=96
QT_8bit/iterations:20                   511811 ns       511804 ns           20 code_size=128
QT_8bit_direct/iterations:20            152977 ns       152970 ns           20 code_size=128
QT_8bit_direct_signed/iterations:20     185578 ns       185572 ns           20 code_size=128
QT_8bit_uniform/iterations:20           454412 ns       454408 ns           20 code_size=128
QT_bf16/iterations:20                    51331 ns        51324 ns           20 code_size=256
QT_fp16/iterations:20                   390658 ns       390649 ns           20 code_size=256
```
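
Since the open-sourced CI has no diff-to-diff regression detection, a reader may want to compare two such logs by hand. The following is a minimal sketch of one way to do that, assuming the console row format shown above. The names `parse_rows` and `slowdowns` and the 10% threshold are hypothetical and are not part of faiss or of Google Benchmark.

```python
import re

# Matches one result row of the console output shown above, e.g.
# "QT_4bit/iterations:20   338300 ns   338284 ns   20 code_size=64"
ROW = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?P<time>[\d.]+) ns\s+"
    r"(?P<cpu>[\d.]+) ns\s+"
    r"(?P<iters>\d+)"
    r"(?P<counters>(?:\s+\S+=\S+)*)\s*$"
)


def parse_rows(log_text):
    """Return {benchmark name: row dict} for every result row in log_text."""
    rows = {}
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, warning, or timestamp line
        # User counters are kept as raw strings (e.g. "1.5k", "164.317u").
        counters = dict(kv.split("=", 1) for kv in m.group("counters").split())
        rows[m.group("name")] = {
            "time_ns": float(m.group("time")),
            "cpu_ns": float(m.group("cpu")),
            "iterations": int(m.group("iters")),
            "counters": counters,
        }
    return rows


def slowdowns(baseline, candidate, threshold=0.10):
    """Names whose CPU time grew by more than `threshold` (hypothetical default)."""
    return sorted(
        name
        for name, row in candidate.items()
        if name in baseline
        and baseline[name]["cpu_ns"] > 0  # skip rows that report 0.000 ns
        and row["cpu_ns"] / baseline[name]["cpu_ns"] - 1.0 > threshold
    )
```

Calling `slowdowns(parse_rows(old_log), parse_rows(new_log))` would list the benchmarks whose CPU time regressed beyond the threshold. Given the jitter on shared CI machines, any such threshold is only a rough signal.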

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/3793

Reviewed By: junjieqi

Differential Revision: D63147599

Pulled By: mengdilin

fbshipit-source-id: 03165b5acb3b0647a69f7db144ab76efda2fee11
2024-09-23 15:58:31 -07:00