faiss

mirror of https://github.com/facebookresearch/faiss.git synced 2025-06-03 21:54:02 +08:00

Author	SHA1	Message	Date
Lucas Hosseini	7212261a86	Fix docker build for GPU nightlies. Reviewed By: wickedfoo Differential Revision: D24670301 fbshipit-source-id: 5b19a9a88a880c20e51f6db1ce663224cf8d212c	2020-11-02 07:49:33 -08:00
Lucas Hosseini	04fde4032a	Fix nightly build for GPU. Reviewed By: wickedfoo Differential Revision: D24559296 fbshipit-source-id: a9fbf51c5153b8b2dff4b2dd684cd84f5aaabc49	2020-10-26 22:33:52 -07:00
Jeff Johnson	8d776e6453	PyTorch tensor / Faiss index interoperability (#1484 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1484 This diff allows for native usage of PyTorch tensors for Faiss indexes on both CPU and GPU. It is currently only implemented in this diff for things that inherit from `faiss.Index`, which covers the non-binary indices, and it patches the same functions on `faiss.Index` that were also covered by `__init__.py` for numpy interoperability. There must be uniformity among the inputs: if any array input is a Torch tensor, then all array inputs must be Torch tensors. Similarly, if any array input is a numpy ndarray, then all array inputs must be numpy ndarrays. If `faiss.contrib.torch_utils` is imported, it ensures that `import faiss` has already been performed to patch all of the functions using the base `__init__.py` numpy wrappers, and then patches the following functions again: ``` add add_with_ids assign train search remove_ids reconstruct reconstruct_n range_search update_vectors search_and_reconstruct sa_encode sa_decode ``` to allow usage of PyTorch CPU tensors, and additionally PyTorch GPU tensors if the index being used is on the GPU. numpy functionality is still available when `faiss.contrib.torch_utils` is imported; we pass through to the original patched numpy function when we detect numpy inputs. In addition, to allow for better (asynchronous) GPU usage without requiring the CPU to be involved, all of these functions which construct tensors/arrays for output now take optional arguments for storage (numpy or torch.Tensor) to be provided that will contain the output data. `range_search` is the only exception to this, as the size of the output data is indeterminate. The eventual GPU implementation will likely require the user to provide a maximum cap on the output size, and allow that to be passed instead. If the optional pre-allocated output values are presented by the user, they are used; otherwise, new return ndarray / Tensors are constructed as before and used for the return. If this feature were not provided on the GPU, then every execution would be completely serial as we would depend upon the CPU to allocate GPU memory before every operation. Instead, now this can function much like NN graph execution on the GPU, assuming that all of the data requirements are pre-allocated, so the execution will run at the full speed of the GPU and not be stalled sequentially launching kernels. This diff also exposes the `GpuResources` shared_ptr object owned by a GPU index. This is required for pytorch GPU so that we can perform proper stream ordering in Faiss with respect to the current pytorch stream. So, Faiss indices now perform more or less as any NN operation in Torch does. Note, however, that a Faiss index has its own setting on current device, and if the pytorch GPU tensor inputs are resident on a different device than what the Faiss index expects, a cross-device copy will be initiated. I may choose to make this an error in the future and require matching device to device. This diff also found a bug when passing GPU data directly to `train()` for `GpuIndexIVFFlat` and `GpuIndexIVFScalarQuantizer`, as I guess we never tested passing GPU data directly to these functions before. `GpuIndexIVFPQ` was doing the right thing however. The assign function is now also implemented on the GPU as well, and is now marked `const` to be in line with the `search` function. Also added better checking of non-contiguous inputs for both Torch tensors and numpy ndarrays. Updated the `knn_gpu` function with a base implementation always present that allows for usage of numpy arrays, which is overridden when `torch_utils` is imported to allow torch usage. This supports row/column major layout, float32/float16 data and int64/int32 indices for both numpy and torch. Reviewed By: mdouze Differential Revision: D24299400 fbshipit-source-id: b4f117b9c120bd1ad83e7702087051ab7b303b29	2020-10-23 22:24:22 -07:00
Lucas Hosseini	bab6db84e0	Make pytorch available in CircleCI. (#1486 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1486 Reviewed By: wickedfoo Differential Revision: D24493486 Pulled By: beauby fbshipit-source-id: 00156213061503ff593b2e9ede062850b23527a9	2020-10-22 21:10:32 -07:00
Lucas Hosseini	7891094da6	Add nightly packages for GPU. (#1485 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1485 Test Plan: Imported from OSS Reviewed By: wickedfoo Differential Revision: D24492171 Pulled By: beauby fbshipit-source-id: 20fbcbdd50ab30e110e41b34e0c07d88432b1422	2020-10-22 19:47:13 -07:00
Lucas Hosseini	0b365fa6d8	Add docker image for CUDA 10.1. (#1477 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1477 Test Plan: Imported from OSS Reviewed By: wickedfoo Differential Revision: D24492173 Pulled By: beauby fbshipit-source-id: 5247accb2dc31bb125f9b06fb2275346b2e6465f	2020-10-22 19:47:13 -07:00
Lucas Hosseini	616dc44e1e	Update conda package for faiss-gpu. (#1476 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1476 Test Plan: Imported from OSS Reviewed By: wickedfoo Differential Revision: D24492172 Pulled By: beauby fbshipit-source-id: 63497b54d8aed10d45ebc4ed7659dd1d18b36edf	2020-10-22 19:47:13 -07:00
Lucas Hosseini	64c13cdda3	Run python gpu tests. (#1479 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1479 Reviewed By: mdouze Differential Revision: D24413984 Pulled By: beauby fbshipit-source-id: 006343c996a278df1d9fc70e11283d31d63a0330	2020-10-22 09:56:30 -07:00
Matthijs Douze	f2369fcc82	benchmark SSD IndexIVF Summary: This is some code for benchmakring the SSD reads. Reviewed By: MDSilber Differential Revision: D24457715 fbshipit-source-id: 475668e4dc450dc4652ef8828111335c236bfa44	2020-10-21 18:21:39 -07:00
Eyal Trabelsi	33e319f8b3	Skip TestShardedFlat for 1 GPU fixes #982 (#1466 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1466 Reviewed By: wickedfoo, mdouze Differential Revision: D24448344 Pulled By: beauby fbshipit-source-id: 7e8c563f1a5a1d745a1073319365c485fcbe1698	2020-10-21 09:15:12 -07:00
Jeff Johnson	e9dda0590c	Deeper tests of GPU IVF list equality + test that would have caught bug fixed by D24405231 (#1480 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1480 There was no test that captured the bug fixed in D24405231, namely that the GPU's version of the IVF lists for IVFSQ both contained garbage and the lists were longer than the CPU version. This diff contains 4 changes: - provides an API for accessing both the list indices and encoded vectors for all IVF GPU index types, as well as the number of lists, on par with the CPU's InvertedLists structure. The encoded vectors are returned in the expected, canonical CPU format (even when the GPU version may differ) - Updates the inverted list vector encoding from `unsigned char` to `uint8_t` to match the CPU's InvertedLists datatype - Adds tests for IVFFlat, IVFPQ and IVFSQ to explicitly assert CPU and GPU IVF list equality when copying both to and from GPU. - Removed usage of `long` that represented indices in Faiss GPU and replaced with `Index::idx_t` everywhere Reviewed By: beauby Differential Revision: D24411004 fbshipit-source-id: b3335e559102008d805122f3b4594db6738c3ae9	2020-10-20 17:17:00 -07:00
Matthijs Douze	cf6593bdca	Update ISSUE_TEMPLATE.md (#1462 ) Summary: add a field to ask people how they installed Faiss Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1462 Reviewed By: LowikC Differential Revision: D24279581 Pulled By: mdouze fbshipit-source-id: 2492c73e31d22f3b7f37de6bcfcac90eae0ccd07	2020-10-20 04:37:50 -07:00
Matthijs Douze	9c51bbb977	Fix faiss_contrib (#1478 ) Summary: Fixes the path issue mentioned in https://github.com/facebookresearch/faiss/issues/1472 Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1478 Reviewed By: LowikC Differential Revision: D24394529 Pulled By: mdouze fbshipit-source-id: 5e4261a0f271751c736c562514f2ee8604c50702	2020-10-20 04:35:19 -07:00
Matthijs Douze	92306e3a69	Synthetic dataset with inner product option Summary: The synthetic dataset can now have IP groundtruth Reviewed By: wickedfoo Differential Revision: D24219860 fbshipit-source-id: 42e094479311135e932821ac0a97ed0fb237bf78	2020-10-20 03:46:26 -07:00
Jeff Johnson	fbb6789f0e	cuda 11 fix Summary: Fix compilation of a CUDA 11 API to disable tensor core usage. Reviewed By: ip4368 Differential Revision: D24404288 fbshipit-source-id: 5cc9fdcf3c86669bc85d5c13a7a523daf7fee62d	2020-10-19 18:53:43 -07:00
Lucas Hosseini	89187fee3c	Fix CMake build on GPU. (#1475 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1475 Test Plan: Imported from OSS Reviewed By: mdouze Differential Revision: D24393544 Pulled By: beauby fbshipit-source-id: 4776cb75dc11e5bb4bb4417ee44f0c794084c301	2020-10-19 16:56:17 -07:00
Jeff Johnson	6be85b0554	GPU IVFSQ code_size fix Summary: This bug was introduced in D24064745, which broke the code distance for GPU IVFSQ. The `code_size` is size in bytes per encoded vector, not per scalar. This diff updates the expression for computing GPU and CPU vector sizes. The bug was not seen on FB machines since it appears that the memory allocator (jemalloc?) was more forgiving in terms of mapped page sizes, and garbage tends to be far away in N-dimensional space from real queries. Reviewed By: beauby Differential Revision: D24405231 fbshipit-source-id: f9ad0d3f326afe412ea864537a24efbd74d97f1f	2020-10-19 16:54:00 -07:00
Matthijs Douze	28edc56fa8	Search in sharded invlists Summary: This diff adds a CombinedIndexSharded1T class to combined_index that uses the 30 shards from the Spark reducer. The metadata is stored in pickle files on manifold. Differential Revision: D24018824 fbshipit-source-id: be4ff8b38c3d6e1bb907e02b655d0e419b7a6fea	2020-10-19 10:39:22 -07:00
Jeff Johnson	8fd753e7e6	Disable tensor core usage on V100 GPUs / CUDA 10 Summary: Tensor core usage on V100 + CUDA 10 for sgemm f32 x f32 = f32 seems to allow demotion of the inputs to f16 (wtf?!), resulting in an unacceptable loss of precision. All accumulation in Faiss is f32, and the use cases for f16 x f16 = f32 are opt-in and relatively small in practice I believe. For A100 or CUDA 11, there is an option to only allow tensor core usage if we guarantee preservation of precision, which we instead prefer. Reviewed By: beauby Differential Revision: D24348944 fbshipit-source-id: d22cfaa233d21ee9c20974914ad155dab8c901fd	2020-10-16 11:50:27 -07:00
Lucas Hosseini	d1f72c5922	Remove guard for OSX packages. Reviewed By: wickedfoo Differential Revision: D24295557 fbshipit-source-id: 87608edba1c67f10ea11a4ceb73234ada7663bab	2020-10-13 19:44:26 -07:00
Lucas Hosseini	4a63f77fde	Remove redundant jobs for releases. Reviewed By: mdouze Differential Revision: D24286656 fbshipit-source-id: 93c176e5c063f845114f76e2ac01dbad69b70fb5	2020-10-13 14:03:04 -07:00
Jeff Johnson	92f2391f41	Remove unused nvidia host fp16 headers/functions Summary: Removes unused host fp16 code, the dependency upon which was removed a while ago. Reviewed By: beauby Differential Revision: D24279982 fbshipit-source-id: 5f6820c41eb387f766b2bed7e70203f5e01f49e9	2020-10-13 11:47:00 -07:00
Lucas Hosseini	70eaa9b1a3	Add missing copyright headers. (#1460 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1460 Reviewed By: wickedfoo Differential Revision: D24278804 Pulled By: beauby fbshipit-source-id: 5ea96ceb63be76a34f1eb4da03972159342cd5b6	2020-10-13 11:15:59 -07:00
Jeff Johnson	e796f4f9df	Improve Faiss / PyTorch GPU interoperability Summary: PyTorch GPU in general is free to use whatever stream it currently wants, based on `torch.cuda.current_stream()`. Due to C++/python language barrier issues, we couldn't previously pass the actual `cudaStream_t` that is currently in use on a given device from PyTorch C++ to Faiss C++ via python. This diff adds conversion functions to convert a Python integer representing a pointer to a `cudaStream_t` (which is itself a `CUstream_st*`), so we can pass the stream specified in `torch.cuda.current_stream()` to `StandardGpuResources::setDefaultStream`. We thus guarantee that all Faiss work is ordered on the same stream that is in use in PyTorch. For use in Python, there is now the `faiss.contrib.pytorch_tensors.using_stream` context object which automatically sets and unsets the current PyTorch stream within Faiss. This takes a `StandardGpuResources` object in Python, and an optional `torch.cuda.Stream` if one wants to use a different stream, otherwise it uses the current one. This is how it is used: ``` # Create a non-default stream s = torch.cuda.Stream() # Have Torch use it with torch.cuda.stream(s): # Have Faiss use the same stream as the above with faiss.contrib.pytorch_tensors.using_stream(res): # Do some work on the GPU faiss.bfKnn(res, args) ``` `using_stream` uses the same pattern as the Pytorch `torch.cuda.stream` object. This replaces any brute-force GPU/CPU synchronization work that was necessary before. Other changes in this diff: - cleans up the config objects in the GpuIndex subclasses, to distinguish between read-only parameters that can only be set upon index construction, versus those that can be changed at runtime. - StandardGpuResources now more properly distinguishes between user-supplied streams (like the PyTorch one) which will not be destroyed upon resources destruction, versus internal streams. - `search_index_pytorch` now needs to take a `StandardGpuResources` object as well, there is no way to get this from an index instance otherwise (or at least, I would have to return a `shared_ptr`, in which case we should just update the Python SWIG stuff to use `shared_ptr` for GpuResources or something. Reviewed By: mdouze Differential Revision: D24260026 fbshipit-source-id: b18bb0eb34eb012584b1c923088228776c10b720	2020-10-13 09:11:19 -07:00
Lucas Hosseini	b459931ae4	Mark ~BufferedIOWriter() noexcept(false) (#1459 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1459 In C++11, destructors default to `noexcept(true)`. This destructor can throw (through `FAISS_THROW_IF_NOT()`), so marking it accordingly. Reviewed By: mdouze Differential Revision: D24253879 fbshipit-source-id: 7ba40387ed214dc2a03a495bc0d31ac9601c4c15	2020-10-13 06:55:27 -07:00
Lucas Hosseini	882d4f1051	Fix conda build cmd for nightly packages. Reviewed By: mdouze Differential Revision: D24269549 fbshipit-source-id: 8e6800781ae54a67c1d2424611a761f838d12026	2020-10-12 22:17:11 -07:00
Lucas Hosseini	52c6465a5e	Test reporting in CircleCI (#1452 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1452 Reviewed By: mdouze Differential Revision: D24269693 Pulled By: beauby fbshipit-source-id: a1a98d263ef4b20107c845421615b1f35b52c6e2	2020-10-12 22:13:31 -07:00
Lucas Hosseini	8e44bff055	Nightly builds (#1451 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1451 Reviewed By: mdouze Differential Revision: D24245777 Pulled By: beauby fbshipit-source-id: f2ce92b28e3d7ffdc2e85bcd78d321da15fec87e	2020-10-12 14:40:52 -07:00
Lucas Hosseini	0aaf0a6357	Enable tests by default. (#1458 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1458 Reviewed By: mdouze Differential Revision: D24252321 Pulled By: beauby fbshipit-source-id: 38dc1f710c63ff1a292e962c636c380d82281b7f	2020-10-12 10:39:35 -07:00
Lucas Hosseini	38bf9f64e9	Get rid of stale generated docs. Summary: Those docs are not very useful as is, and having to re-generate the html manually leads to them being stale most of the time. Should we decide that we want to have them, we can bring them back with some automated generation. Reviewed By: mdouze Differential Revision: D24246072 fbshipit-source-id: 39798b2861ff25ee3fa1f95abdbc3e7ddf3469ed	2020-10-12 00:33:37 -07:00
Lucas Hosseini	0fb6c00cfa	Bump version to 1.6.4 (#1453 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1453 Reviewed By: mdouze Differential Revision: D24243546 Pulled By: beauby fbshipit-source-id: 190e0601bad3f08c5fac37923170a68ba1e83f16 v1.6.4	2020-10-12 00:01:06 -07:00
Lucas Hosseini	81b1aeea5e	Fix warnings. Reviewed By: wickedfoo Differential Revision: D24168429 fbshipit-source-id: 0f5fe5eee0f313224a4681dc84ba05169ceb482d	2020-10-09 16:49:18 -07:00
Matthijs Douze	8b05434a50	Remove useless function Summary: Removed an unused function that caused compile errors in some configurations. Added contrib function (exhaustive_search.knn) to compute the k nearest neighbors without constructing an index. Renamed the equivalent GPU function as exhaustive_search.knn_gpu (it does not make much sense to mention numpy in the name as all functions take numpy arguments by default). Reviewed By: beauby Differential Revision: D24215427 fbshipit-source-id: 6d8e1eafa7c57593304b7b76f83b3015e4d2a2bb	2020-10-09 07:57:04 -07:00
Jeff Johnson	0412d761e5	GPU brute-force kNN can take int32 indices (#1445 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1445 As requested in https://github.com/facebookresearch/faiss/issues/1304, `bfKnn` can now produce int32 indices for output. The native kernels themselves for brute-force kNN only operate on int32 indices in any case, so this is faster. Also added a SWIG definition for float16 numpy arrays. As there is not a native half type, the reverse definition is undefined, so this is only really used for taking float16 data (e.g., from numpy) as input when in Python. Added a `knn_numpy_gpu` wrapper as well that handles calling the `bfKnn` GPU implementation using CPU numpy arrays. This handles transposition and f32/f16/i32 data types as needed. Reviewed By: mdouze Differential Revision: D24152296 fbshipit-source-id: caa7daea23438cf26aa248e380f0dab2b6b907fd	2020-10-08 17:50:42 -07:00
cclauss	efa1e3f64f	Use print() function in both Python 2 and Python 3 (#1443 ) Summary: Legacy __print__ statements are syntax errors in Python 3 but __print()__ function works as expected in both Python 2 and Python 3. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1443 Reviewed By: LowikC Differential Revision: D24157415 Pulled By: mdouze fbshipit-source-id: 4ec637aa26b61272e5337d47b7796a330ce25bad	2020-10-08 00:27:29 -07:00
Jeff Johnson	9b007c7418	GPU supports arbitrary dimensions per PQ sub-quantizer Summary: This diff removes a long-standing limitation with GpuIndexIVFPQ, in that only a limited number of dimensions per sub-quantizer were supported when not using precomputed codes. This is part of the general cleanup and extension/optimization that I am performing of the GPU PQ code. Now, we keep the same old specialized distance computations, but if we attempt to use a number of dimensions per sub-Q that are not specialized, we fall back to a general implementation based on batch matrix multiplication for computing PQ distances per code. The batch MM PQ distance computation is enabled automatically if you use an odd number of dimensions per sub-quantizer (say, 7, 11, 53, ...). It can also be manually enabled via the `useMMCodeDistance` option in `GpuIndexIVFPQConfig` for testing purposes, though the result should be within some epsilon of the other implementation. This diff also removes the iterated GEMM wrapper. I don't honestly know why I was using this instead of `cublasGemmStridedBatchedEx`, maybe I couldn't find that or this was originally implemented in a much older version of CUDA. The iterated GEMM call was used in a few other places (e.g., precomputed code computation). Now, this (and the PQ distance computation) use batch MM which is a single CUDA call. This diff also adds stream synchronization to the temporary memory manager, as the fallback PQ distance computation needs to use temporary memory, and there were too many buffers for these to pre-allocate. It also fixes the bug in https://github.com/facebookresearch/faiss/issues/1421. Reviewed By: mdouze Differential Revision: D24130629 fbshipit-source-id: 1c8bc53c86d0523832ad89c8bd4fa4b5fc187cae	2020-10-06 11:06:03 -07:00
Matthijs Douze	5ad630635c	expose threat-safe stats (#1438 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1438 This diff changes Faiss and the `combined_index.py` to propagate thread-safe stats to handler.py Reviewed By: MDSilber Differential Revision: D24082543 fbshipit-source-id: 944e6b7630daeede5eb9501b81557a6fe5afec44	2020-10-03 23:26:36 -07:00
Jeff Johnson	6918f4ee48	Alternative GPU IVFPQ layout + unify IVF handling code Summary: This diff contains the following changes: - adds support for an alternative IVFPQ memory layout, where the codes are interleaved by vector in groups of 32 rather than sub-quantizer, in order to support a variety of SIMD-like optimizations for PQ lookup kernels (eg like SCANN or other techniques that use in-register storage). This internal GPU-only format is transparent to the rest of the code, and attempts to copy an index to/from CPU deal with the difference in layout. The feature is enabled using `GpuIndexIVFPQConfig::alternativeLayout` upon index construction, which is not intended for general use yet, though it is functional. This is the difference in layout explained: ``` /// The default memory layout is [vector][PQ component]: /// (v0 d0) (v0 d1) ... (v0 dD-1) (v1 d0) (v1 d1) ... /// /// An alternative memory layout (layoutBy32) is /// [vector / 32][PQ component][vector % 32] with padding: /// (v0 d0) (v1 d0) ... (v31 d0) (v0 d1) (v1 d1) ... (v31 dD-1) (v31 dD-1) (v33 /// d0) ... /// so the list length is always a multiple of numSubQuantizers * 32 ``` - adds kernels to support IVFPQ query using this format. These new kernels are naive implementations that do not use register or shared memory at all to store the code distance information, however unlike the prior GPU IVFPQ code, they support arbitrary-sized PQ encodings (arbitrary dimensions per sub-quantizer and arbitrary number of sub-quantizers per vector). This is enabled for both precomputed and normal codes. It is my intention that I will eventually remove the restriction on dimensions per sub-Q / number of sub-Qs in the GPU code so that it functions more like the CPU code, though due to the necessity of implementation specialization, it will be likely that a small number of choices will be optimized, leaving the rest to use slower fallback implementations. It is likely that this will eventually become the default / only format supported by the GPU, but the optimized kernels have not yet been developed using this layout. This diff is being checked in first in order to checkpoint the development. The existing lookup kernels and storage have not been affected. Furthermore, it may be likely that IVFFlat and IVFSQ eventually change to this interleaved format as they offer some advantages in implementation. - Unifies the IVF handling and copy code more between IVFFlat, IVFPQ and IVFScalarQuantizer on the GPU. There was a lot of copy pasta/duplicated code between the three implementations which was also divergent. This code is now all handled by the IVFBase and GpuIndexIVF classes. - Adds a logging feature to StandardGpuResources which allows for printing all memory allocation/deallocation requests to the console as they happen in real time. This is useful for debugging. Reviewed By: mdouze Differential Revision: D24064745 fbshipit-source-id: 434fb4ec39aaba32271742ba7a40460847386141	2020-10-02 09:37:28 -07:00
Matthijs Douze	65ee09484f	Test GPU ground-truth computation (#1432 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1432 The contrib function knn_ground_truth does not provide exactly the same resutls on GPU and CPU (but relative accuracy is still 1e-7). This diff relaxes the constraint on CPU and added test on GPU. Reviewed By: wickedfoo Differential Revision: D24012199 fbshipit-source-id: aaa20dbdf42b876b3ed7da34028646dbb20833d3	2020-09-30 11:14:18 -07:00
Jeff Johnson	ef6e53f8ba	Cleanup flag/data propagation for IndexShards and IndexReplicas Summary: This diff fixes https://github.com/facebookresearch/faiss/issues/1412 There were various inconsistencies in how the shard and replica wrappers updated their internal state as the sub-indices were updated. This makes the two container classes work in the same way with similar synchronization functionality. Reviewed By: beauby Differential Revision: D23974186 fbshipit-source-id: c688c0c9124f823e4239aa2ff617b007b4564859	2020-09-29 10:25:46 -07:00
adream307	c69c18ef32	fix ivflib (#1410 ) Summary: if `ils = dynamic_cast<ArrayInvertedLists *> (index_ivf->invlists)` failed, `ils` would be `nullptr`. so check if `ils` is `nullptr` before use it. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1410 Reviewed By: beauby Differential Revision: D23985814 Pulled By: mdouze fbshipit-source-id: f62a3988e74b4de1f1c9a127475368302a35d4a5	2020-09-29 03:40:05 -07:00
Matthijs Douze	27393c436c	remove Lua support Summary: For some obscure reason Lua support depends on python2, which is going to be removed. https://fb.workplace.com/groups/311767668871855/permalink/4219593711422545/ Since the Lua interface is not used in any active code it seems, it's easier to just remove the Lua interface. It also removes the Faiss Lua dep in the few occurrences where it is used (omry see recog-eval). Reviewed By: wickedfoo Differential Revision: D23865458 fbshipit-source-id: 4149517af18acce29179d04152c7364c2548efa0	2020-09-23 22:51:59 -07:00
Lucas Hosseini	3ac3ca0fab	Update conda packages build. (#1422 ) Summary: This PR paves the way for nightly builds. + Get rid of cmake 3.17 manual install as cmake 3.18 is now available in conda. + Update docker files for conda packages. + Specify CUDA architectures via CMake's `CMAKE_CUDA_ARCHITECTURES`. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1422 Reviewed By: mdouze Differential Revision: D23870447 Pulled By: beauby fbshipit-source-id: 40ae7517e83356443a007a43261713e7e3a140d4	2020-09-23 10:18:25 -07:00
Matthijs Douze	d8af5138cd	fix double free on std::string Summary: there was a dynamic allocation on a std::string from multiple threads. Instead of adding a mutex in perfromance sensitive code, I use a statically allocated string instead. The stress test crashed before, now it runs fine. Reviewed By: wickedfoo Differential Revision: D23702154 fbshipit-source-id: 5dd37f1c151d8ce7f756f54a059235d8673cdabc	2020-09-18 00:57:50 -07:00
Matthijs Douze	15ffe976f0	fix tutorial files Summary: Added TARGETS files and fixed GPU tutorial code. Related to issue https://github.com/facebookresearch/faiss/issues/1408 Reviewed By: beauby Differential Revision: D23728732 fbshipit-source-id: 22bec7d439ffc187c144575e1e984111bfbfcbb2	2020-09-17 07:32:56 -07:00
Matthijs Douze	d2d32af6c5	properly propagate exceptions in inverted list reading (#1407 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1407 When Inverted list reading throws an exception, it is propagated until it reaches the openmp loop, which crashes the caller. This diff catches the exception and properly propagates it to the caller in Python. It should be possible to test it with an ondisk instead of relying on Manifold. Reviewed By: MDSilber Differential Revision: D23688968 fbshipit-source-id: 0943fac41d4e9b8b86535439e3fdee18ce96d4a5	2020-09-14 17:40:13 -07:00
Jeff Johnson	8f6a019ecd	fix float16 precomputed code issues (#1406 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1406 I appear to have broken this with the rework of float16 support in Faiss GPU, though I cannot figure out why the tests only started failing recently. cuBLAS does not support a f16 x f32 = f32 matrix multiplication. With a f16 coarse quantizer and IVFPQ precomputed codes, we were attempting to perform such a multiplication. Now, in the precomputed code calculation, we intercept this and change it to a f32 x f32 = f32 computation. The test when run by itself was also failing separately, though when run in series with the other test_gpu_index_ivfpq tests it was succeeding, due to the fact that the seed is only initialized once. The epsilon needed to change a slight bit. Reviewed By: mdouze Differential Revision: D23687070 fbshipit-source-id: 14a535407ed433eeaef3bc77cb0d6f5909c55b9f	2020-09-14 16:17:57 -07:00
Matthijs Douze	b4b576b139	Replace lower() numeric_limits (#1402 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1402 in C++03 there was no way to get the min value that is portable between int and float types (for float min returns the lowests strictly positive value). For C++11 this is lowest, so let's use it in the heap funcs. Reviewed By: beauby Differential Revision: D23622612 fbshipit-source-id: d3e3b2b7f695d971866f7b45bfc41986cd6b9bf4	2020-09-10 13:29:18 -07:00
Lucas Hosseini	9873376d8c	Add GPU CI. (#1378 ) Summary: Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1378 Reviewed By: mdouze Differential Revision: D23370605 Pulled By: beauby fbshipit-source-id: 6a9de47b6167ba8be12cce2735b1132991a190ab	2020-09-01 06:51:56 -07:00
Jeff Johnson	630a6f9702	reduced precision fix (#1388 ) Summary: Fix to https://github.com/facebookresearch/faiss/issues/1385, set the value during cuBLAS handle construction. Also the tensor core option is deprecated for CUDA 11+. Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1388 Test Plan: Unable to test numerical results, but builds on GCP A100 instance with CUDA 11. Reviewed By: mdouze Differential Revision: D23427285 Pulled By: wickedfoo fbshipit-source-id: d9487559035175ec7e06600dcd8f6a307f50abad	2020-09-01 00:26:50 -07:00

... 2 3 4 5 6 ...

547 Commits