Commit Graph

23 Commits (c66ffe8736159f9ac000cd1248f521f5eef246cc)

Author SHA1 Message Date
Matthijs Douze e1adde0d84 Faster brute force search (#1502)
Summary:
This diff streamlines the code that collects results for brute force distance computations for the L2 / IP and range search / knn search combinations.

It introduces a `ResultHandler` template class that abstracts what happens with the computed distances and ids. In addition to the heap result handler and the range search result handler, it introduces a reservoir result handler that improves the search speed for  large k (>=100).

Benchmark results (https://fb.quip.com/y0g1ACLEqJXx#OCaACA2Gm45) show that on small datasets (10k) search is 10-50% faster (improvements are larger for small k). There is room for improvement in the reservoir implementation, whose implementation is quite naive currently, but the diff is already useful in its current form.

Experiments on precomputed db vector norms for L2 distance computations were not very concluding performance-wise, so the implementation is removed from IndexFlatL2.

This diff also removes IndexL2BaseShift, which was never used.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1502

Test Plan:
```
buck test //faiss/tests/:test_product_quantizer
buck test //faiss/tests/:test_index -- TestIndexFlat
```

Reviewed By: wickedfoo

Differential Revision: D24705464

Pulled By: mdouze

fbshipit-source-id: 270e10b19f3c89ed7b607ec30549aca0ac5027fe
2020-11-04 22:16:23 -08:00
Jeff Johnson ef6e53f8ba Cleanup flag/data propagation for IndexShards and IndexReplicas
Summary:
This diff fixes https://github.com/facebookresearch/faiss/issues/1412

There were various inconsistencies in how the shard and replica wrappers updated their internal state as the sub-indices were updated. This makes the two container classes work in the same way with similar synchronization functionality.

Reviewed By: beauby

Differential Revision: D23974186

fbshipit-source-id: c688c0c9124f823e4239aa2ff617b007b4564859
2020-09-29 10:25:46 -07:00
Matthijs Douze 6d73c2ff69 Fix int64 for python tests in windows (#1381)
Summary:
`long` is 32 bits on windows and so is the default int type for numpy (eg. the one used for `np.arange`).
This diff explicitly specifies 64-bit ints for all occurrences where it matters.

Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1381

Reviewed By: wickedfoo

Differential Revision: D23371232

Pulled By: mdouze

fbshipit-source-id: 220262cd70ee70379f83de93561a4eae71c94b04
2020-08-27 12:40:55 -07:00
Lucas Hosseini 24c4460dd2 Avoid leaking file descriptors in python tests. (#1353)
Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1353

Test Plan: Imported from OSS

Reviewed By: mdouze

Differential Revision: D23292456

Pulled By: beauby

fbshipit-source-id: 44458eb16d037883ff39827accf5edddb1b1bb89
2020-08-24 06:46:52 -07:00
Lucas Hosseini a17a631dc3
Sync 20200323. (#1157)
* Sync 20200323.

* Bump version.

* Remove warning filter.
2020-03-24 14:06:48 +01:00
Lucas Hosseini 22b7876ef5
Facebook sync (2020-03-10) (#1136) 2020-03-10 14:24:07 +01:00
Lucas Hosseini 36ddba9196
Facebook sync (2019-09-10) (#943)
* Facebook sync (2019-09-10)

* Fix depends Makefile target.

* Add faiss symlink for new include directives.

* Fix missing header.

* Fix tests.

* Fix Makefile.

* Update depend.

* Fix include directives spacing.
2019-09-20 18:59:10 +02:00
Lucas Hosseini a8118acbc5
Facebook sync (May 2019) + relicense (#838)
Changelog:

- changed license: BSD+Patents -> MIT
- propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas
- support for searching several inverted lists in parallel (parallel_mode != 0)
- better support for PQ codes where nbit != 8 or 16
- IVFSpectralHash implementation: spectral hash codes inside an IVF
- 6-bit per component scalar quantizer (4 and 8 bit were already supported)
- combinations of inverted lists: HStackInvertedLists and VStackInvertedLists
- configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch)
- more test and demo code compatible with Python 3 (print with parentheses)
- refactored benchmark code: data loading is now in a single file
2019-05-28 16:17:22 +02:00
Lucas Hosseini afe0fdc161
Facebook sync (Mar 2019) (#756)
Facebook sync (Mar 2019)

- MatrixStats object
- option to round coordinates during k-means optimization
- alternative option for search in HNSW
- moved stats and imbalance_factor of IndexIVF to InvertedLists object
- range search for IVFScalarQuantizer
- direct unit8 codec in ScalarQuantizer
- renamed IndexProxy to IndexReplicas and moved to main Faiss
- better support for PQ code assignment with external index
- support for IMI2x16 (4B virtual centroids!)
- support for k = 2048 search on GPU (instead of 1024)
- most CUDA mem alloc failures throw exceptions instead of terminating on an assertion
- support for renaming an ondisk invertedlists
- interrupt computations with ctrl-C in python
2019-03-29 16:32:28 +01:00
Lucas Hosseini f417a53628
Fix CI tests (#687)
* Fix test_transfer_invlists.cpp

* Fix relative imports.

* Fix test_index_accuracy.py.

* Use default OSX version.

* Allow osx gcc6 build to fail.
2019-01-08 17:52:36 +01:00
matthijs daf589d9d2 add bench_all_ivf 2018-12-20 05:43:36 -08:00
Lucas Hosseini 323dbf3be3
Facebook sync (Dec 2018). (#660)
* Add GpuIndexBinaryFlat
* Add IndexBinaryHNSW
2018-12-19 17:48:35 +01:00
Lucas Hosseini 76bec0b500
Facebook sync (#573)
Features:

- automatic tracking of C++ references in Python
- non-intel platforms supported -- some functions optimized for ARM
- override nprobe for concurrent searches
- support for floating-point quantizers in binary indexes
Bug fixes:

- no more segfaults in python (I know it's the same as the first feature but it's important!)
- fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims
- fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple
2018-08-30 19:38:50 +02:00
Lucas Hosseini 6880286ea0
Facebook sync (#504)
* Facebook sync

* Update swig wrappers.

* Fix comment.
2018-07-06 14:12:11 +02:00
Lucas Hosseini 6e40d6689f
Move python tests back together with C++ tests. (#479) 2018-06-04 12:20:44 +02:00
Lucas Hosseini cf18101f6d Refactor makefiles and add configure script (#466)
* Refactors Makefiles and add configure script.

* Give MKL higher priority in configure script.

* Clean up Linux example makefile.inc.

* Cleanup makefile.inc examples.

* Fix python clean Makefile target.

* Regen swig wrappers.

* Remove useless CUDAFLAGS variable.

* Fix python linking flags.

* Separate compile and link phase in python makefile.

* Add macro to look for swig.

* Add CUDA check in configure script.

* Cleanup make depend targets.

* Cleanup CUDA flags.

* Fix linking flags.

* Fix python GPU linking.

* Remove useless flags from python gpu module linking.

* Add check for cuda libs.

* Cleanup GPU targets.

* Clean up test target.

* Add cpu/gpu targets to python makefile.

* Clean up tutorial Makefile.

* Remove stale OS var from example makefiles.

* Clean up cuda example flags.
2018-06-02 08:35:30 +02:00
Matthijs Douze 0c482e54eb sync with FB version 2018-02-23 (#347)
- support on-disk IVF
2018-02-23 07:49:45 -08:00
matthijs 9933892ec9 sync with FB version 2017-01-09
- adding HNSW indexing method

- simultaneous search and reconstruction for IndexIVFPQ
2018-01-09 06:42:06 -08:00
matthijs 250a3d3f18 sync with FB version 2017-11-22
various bugfixes from github issues
kmean with some frozen centroids
GPU better tiling for large flat datasets
default AVX for vector ops
2017-11-22 05:11:28 -08:00
matthijs a5ef16db89 sync with FB version 2017-08-09 2017-08-09 11:13:51 -07:00
matthijs 8e3dc6f2b0 changed license 2017-07-30 00:18:45 -07:00
matthijs f7aedbdfc0 sync with FB version 2017-07-18
- implemented ScalarQuantizer (without IVF)
- implemented update for IndexIVFFlat
- implemented L2 normalization preproc
2017-07-18 02:51:27 -07:00
matthijs 784e2facd8 Synchronization with FB version 2017-06-21
* moved most FAISS_ASSERT calls to C++ exceptions, and adjusted
  memory allocation to avoid mem leaks

* added an IndexIVFScalarQuantizer type that offers an
  intermediate compression between IVFFlat and IVFPQ

* support removal of indices in IndexIDMap / IndexFlat combination

* various fixes in GPU code
2017-06-21 09:01:06 -07:00