Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2870
Factor by_residual for all the IndexIVF inheritors.
Some training code can be put in IndexIVF and `train_residual` is replaced with `train_encoder`.
This will be used for the IndependentQuantizer work.
Reviewed By: alexanderguzhva
Differential Revision: D45987304
fbshipit-source-id: 7310a687b556b2faa15a76456b1d9000e21b58ce
Summary:
https://github.com/facebookresearch/faiss/issues/2727
Implements search_with_params function on c_api for index.
Implemented c_api equivalents of SearchParameters and SearchParametersIVF.
My C/C++ is pretty rusty so I imagine my arguments to the new functions for each search parameters could be refined. Happy to take suggestions :)
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2732
Reviewed By: alexanderguzhva
Differential Revision: D43917264
Pulled By: mdouze
fbshipit-source-id: c9eaf0d96ec0fad4862528aac9b5946294f5e444
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2682
IndexShards normally sees the indexes as opaque, so there is no way to factrorize the coarse quantizer.
This diff introduces IndexIVFShards that handles IVF indexes with a common quantizer so that the quantization is computed only once.
Reviewed By: alexanderguzhva
Differential Revision: D42781513
fbshipit-source-id: 441316eff4c1ba0468501c456af9194ea5f042d6
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2676
This is a cosmetic diff where default values for scalar fields are moved to the .h file where the object is declared rather than in the object constructor. The advantage is that it is easier to read the default value of a parameter from just the class.
Reviewed By: alexanderguzhva
Differential Revision: D42722205
fbshipit-source-id: 9d9bc4641068f6d6233f60f0a3a16ab793c94bb8
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2483
This diff changes the following:
1. all search functions now take a `SearchParameters` argument that overrides the internal search parameters
2. the default implementation for most classes throws when the params argument is non-nullptr / non-None
3. the IndexIVF and IndexHNSW classes have functioning SearchPArameters
4. the SearchParameters includes an IDSelector that can search only in a subset of the index based on a defined subset of ids
There is also some refactoring: the IDSelector was moved to its own .h/.cpp and python/__init__.py is spit in parts.
The diff is quite bulky because the search function prototypes need to be changed in all index classes.
Things to fix in subsequent diffs:
- support SearchParameters for more index types (Flat variants)
- better sub-object ownership for SearchParams (with std::unique_ptr?)
- special handling of IDSelectorRange to make it faster
Reviewed By: alexanderguzhva
Differential Revision: D39852589
fbshipit-source-id: 4988bdb5b9bee1207cd327d3f80bf5e0e2467fe1
Summary:
This line should be searching for xq, but the current line is searching the first 5 line in xb. It should be a bug.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2419
Reviewed By: alexanderguzhva
Differential Revision: D39167612
Pulled By: mdouze
fbshipit-source-id: fc2534fa799dcbedae1af7881f5d70026b4de675
Summary:
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Various readability fixes focused on `.md` files:
- Grammar
- Fix some incorrect command references to `distributed_kmeans.py`
- Styling the markdown bash code snippets sections so they format
Attempted to put a lot of little things into one PR and commit; let me know if any mods are needed!
Best,
Ryan
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2378
Reviewed By: alexanderguzhva
Differential Revision: D37717671
Pulled By: mdouze
fbshipit-source-id: 0039192901d98a083cd992e37f6b692d0572103a
Summary:
Exporting a few more functions to the C API
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2367
Reviewed By: alexanderguzhva
Differential Revision: D37480505
Pulled By: mdouze
fbshipit-source-id: 899baca8795e29b20e16b56ea3c0d13960e1ea37
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2255
The `DistanceComputer` object is derived from an Index (obtained with `get_distance_computer()`). It maintains a current query and quickly computes distances from that query to any item in the database. This is useful, eg. for the IndexHNSW and IndexNSG that rely on query-to-point comparisons in the datasets.
This diff introduces the `FlatCodesDistanceComputer`, that inherits from `DistanceComputer` for Flat indexes. In addition to the distance-to-item function, it adds a `distance_to_code` that computes the distance from any code to the current query, even if it is not stored in the index.
This is implemented for all FlatCode indexes (IndexFlat, IndexPQ, IndexScalarQuantizer and IndexAdditiveQuantizer).
In the process, the two classes were extracted to their own header file `impl/DistanceComputer.h`
Reviewed By: beauby
Differential Revision: D34863609
fbshipit-source-id: 39d8c66475e55c3223c4a6a210827aa48bca292d
Summary:
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2132
This diff adds the class IndexFlatCodes that becomes the parent of all "flat" encodings.
IndexPQ
IndexFlat
IndexAdditiveQuantizer
IndexScalarQuantizer
IndexLSH
Index2Layer
The other changes are:
- for IndexFlat, there is no vector<float> with the data anymore. It is replaced with a `get_xb()` function. This broke quite a few external codes, that this diff also attempts to fix.
- I/O functions needed to be adapted. This is done without changing the I/O format for any index.
- added a small contrib function to get the data from the IndexFlat
- the functionality has been made uniform, for example remove_ids and add are now in the parent class.
Eventually, we may support generic storage for flat indexes, similar to `InvertedLists`, eg to memmap the data, but this will again require a big change.
Reviewed By: wickedfoo
Differential Revision: D32646769
fbshipit-source-id: 04a1659173fd51b130ae45d345176b72183cae40
Summary:
I want to invoke norm computations by using CGO, but I find some functions which have been implemented in cpp are not exported in c api, so I commit the PR to solve the problem.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/2036
Reviewed By: beauby
Differential Revision: D30762172
Pulled By: mdouze
fbshipit-source-id: 097b32f29658c1864bd794734daaef0dd75d17ef
Summary:
This adds some more functions to the C API, under a new DeviceUtils_c.h module. Resolves https://github.com/facebookresearch/faiss/issues/1414.
- `faiss_get_num_gpus`
- `faiss_gpu_profiler_start`
- `faiss_gpu_profiler_stop`
- `faiss_gpu_sync_all_devices`
The only minor issue right now is that building this requires basing it against an older version of Faiss until the building system is updated to use CMake (https://github.com/facebookresearch/faiss/issues/1390). I have provided a separate branch with the same contribution which is based against a version that works and builds OK: [`imp/c_api/add_gpu_device_utils`](https://github.com/Enet4/faiss/tree/imp/c_api/add_gpu_device_utils)
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1613
Reviewed By: wickedfoo
Differential Revision: D25942933
Pulled By: mdouze
fbshipit-source-id: 5b73a86b0c1702dfb7b9e56bd741f72495aac2fd
Summary:
This diff streamlines the code that collects results for brute force distance computations for the L2 / IP and range search / knn search combinations.
It introduces a `ResultHandler` template class that abstracts what happens with the computed distances and ids. In addition to the heap result handler and the range search result handler, it introduces a reservoir result handler that improves the search speed for large k (>=100).
Benchmark results (https://fb.quip.com/y0g1ACLEqJXx#OCaACA2Gm45) show that on small datasets (10k) search is 10-50% faster (improvements are larger for small k). There is room for improvement in the reservoir implementation, whose implementation is quite naive currently, but the diff is already useful in its current form.
Experiments on precomputed db vector norms for L2 distance computations were not very concluding performance-wise, so the implementation is removed from IndexFlatL2.
This diff also removes IndexL2BaseShift, which was never used.
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1502
Test Plan:
```
buck test //faiss/tests/:test_product_quantizer
buck test //faiss/tests/:test_index -- TestIndexFlat
```
Reviewed By: wickedfoo
Differential Revision: D24705464
Pulled By: mdouze
fbshipit-source-id: 270e10b19f3c89ed7b607ec30549aca0ac5027fe
Summary:
This diff fixes https://github.com/facebookresearch/faiss/issues/1412
There were various inconsistencies in how the shard and replica wrappers updated their internal state as the sub-indices were updated. This makes the two container classes work in the same way with similar synchronization functionality.
Reviewed By: beauby
Differential Revision: D23974186
fbshipit-source-id: c688c0c9124f823e4239aa2ff617b007b4564859
* Pass compilation in current state
* Fix formatting and add missing parts
* Define DistanceComputer
* Add documentation to name changed on DistanceComputer::operator()()
* Apply suggestions from code review
Changes in docs
Co-Authored-By: Eduardo Pinho <enet4mikeenet@gmail.com>
* Updated MetricType
* [c_api] use all relevant flags in compilation
* [c_api] Remove redundant IndexIVFFlat declarations
- From IndexIVF_c.h, already declared in IndexIVFFlat_c.h
* [c_api] type changes
- replace `long` with a more suitable type
- provide definitions for `faiss_component_t` and `faiss_distance_t`
* [c_api] Define CFLAGS and CUDACFLAGS
* [c_api] Update impl and interface for v1.5
- move IndexShards to dedicated module IndexShards_c.{h|cpp}
- remove getter/setters to unreachable fields
- reimplement faiss_IndexIVF_imbalance_factor (to use invlists)
- minor IndexIVF documentation tweaks
- Remove QueryResult, provide RangeQueryResult
* [c_api] Document FaissErrorCode
* [c_api] Update GPU impl and interface for v1.5
- Remove unavailable method setTempMemoryFraction
* [c_api] Relicense to MIT
In accordance to the rest of the project