Summary: The copy constructor of AlignedTable was incorrect, which crashed cloning of IVFPQ indexes.
Reviewed By: wickedfoo
Differential Revision: D26426400
fbshipit-source-id: 1d43ea6309d0a56eb592f9d6c5b52282f494e653
Summary:
IndexPQ and IndexIVFPQ implementations with AVX shuffle instructions.
Training and code computation are unchanged from the original PQ versions, but the code layout is "packed" so that the SIMD computation kernels can use it efficiently.
The main changes are:
- new IndexPQFastScan and IndexIVFPQFastScan objects
- simdlib.h for an abstraction above the AVX2 intrinsics
- BlockInvertedLists for invlists that are 32-byte aligned and where codes are not sequential
- pq4_fast_scan.h/.cpp: for packing codes and look-up tables + optimized distance computation kernels
- simd_result_handlers.h: SIMD version of result collection in heaps / reservoirs
Misc changes:
- added contrib.inspect_tools to access fields in C++ objects
- moved .h and .cpp code for inverted lists to an invlists/ subdirectory, and made a .h/.cpp for InvertedListsIOHook
- added a new inverted lists type with 32-byte aligned codes (for consumption by SIMD)
- moved Windows-specific intrinsics to platform_macros.h
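
To illustrate the idea behind the packed 4-bit codes and look-up tables mentioned above, here is a minimal pure-Python sketch. It is not the Faiss layout or kernel: the function names (`pack_codes`, `lut_distance`), the low-nibble-first byte order, and the toy table values are assumptions made for illustration. The real kernels work on blocks of codes with AVX2 shuffles.

```python
# Illustrative sketch only: 4-bit PQ codes, two per byte, scored with
# 16-entry look-up tables. Names and byte layout are hypothetical and do
# not reproduce the packed block layout used by the SIMD kernels.

def pack_codes(codes):
    """Pack an even-length list of 4-bit indices into bytes, low nibble first."""
    assert len(codes) % 2 == 0
    assert all(0 <= c < 16 for c in codes)
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))

def lut_distance(packed, luts):
    """Sum the table entries selected by each nibble.

    luts[m][k] is the distance contribution of centroid k of sub-quantizer m.
    """
    total = 0
    for j, byte in enumerate(packed):
        total += luts[2 * j][byte & 15]      # even sub-quantizer: low nibble
        total += luts[2 * j + 1][byte >> 4]  # odd sub-quantizer: high nibble
    return total

# Toy setup: M = 4 sub-quantizers, 16 centroids each.
luts = [[(m + 1) * k for k in range(16)] for m in range(4)]
codes = [3, 15, 0, 7]
packed = pack_codes(codes)
dist = lut_distance(packed, luts)
```

With 4-bit codes, each table has only 16 entries, so it can fit in a SIMD register. That is what makes a shuffle-based lookup possible.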
Pull Request resolved: https://github.com/facebookresearch/faiss/pull/1542
Test Plan:
```
buck test mode/opt -j 4 //faiss/tests/:test_fast_scan_ivf //faiss/tests/:test_fast_scan
buck test mode/opt //faiss/manifold/...
```
Reviewed By: wickedfoo
Differential Revision: D25175439
Pulled By: mdouze
fbshipit-source-id: ad1a40c0df8c10f4b364bdec7172e43d71b56c34
Changelog:
- changed license: BSD+Patents -> MIT
- propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas
- support for searching several inverted lists in parallel (parallel_mode != 0)
- better support for PQ codes where nbit != 8 or 16
- IVFSpectralHash implementation: spectral hash codes inside an IVF
- 6-bit per component scalar quantizer (4 and 8 bit were already supported)
- combinations of inverted lists: HStackInvertedLists and VStackInvertedLists
- configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch)
- more test and demo code compatible with Python 3 (print with parentheses)
- refactored benchmark code: data loading is now in a single file
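
As a rough illustration of what a 6-bit per component scalar quantizer involves, the sketch below quantizes each component uniformly to 64 levels and packs four codes into three bytes. It is a minimal sketch under assumptions: the helper names, the shared `vmin`/`vmax` range, and the little-endian bit order are hypothetical and are not taken from the Faiss implementation.

```python
# Illustrative sketch only: uniform 6-bit scalar quantization with bit packing.
# Helper names, the shared [vmin, vmax] range and the bit order are assumptions.

def encode_6bit(x, vmin, vmax):
    """Map each component to an integer in [0, 63], rounding to nearest."""
    codes = []
    for v in x:
        t = (v - vmin) / (vmax - vmin)
        t = min(max(t, 0.0), 1.0)  # clamp out-of-range values
        codes.append(int(t * 63 + 0.5))
    return codes

def pack_6bit(codes):
    """Pack 6-bit codes into a little-endian bit stream (4 codes -> 3 bytes)."""
    bits, nbits, out = 0, 0, bytearray()
    for c in codes:
        bits |= c << nbits
        nbits += 6
        while nbits >= 8:
            out.append(bits & 0xFF)
            bits >>= 8
            nbits -= 8
    if nbits:
        out.append(bits & 0xFF)  # flush the last partial byte
    return bytes(out)

def unpack_6bit(packed, n):
    """Recover n 6-bit codes from the packed bytes."""
    bits = int.from_bytes(packed, "little")
    return [(bits >> (6 * i)) & 63 for i in range(n)]

def decode_6bit(codes, vmin, vmax):
    """Map codes back to approximate component values."""
    return [vmin + c / 63 * (vmax - vmin) for c in codes]

x = [0.0, 0.25, 0.5, 1.0]
codes = encode_6bit(x, 0.0, 1.0)
packed = pack_6bit(codes)
restored = decode_6bit(unpack_6bit(packed, len(x)), 0.0, 1.0)
```

At 6 bits per component, a vector of dimension d takes 3d/4 bytes, which sits between the 4-bit and 8-bit variants that were already supported.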
* Refactor Makefiles and add configure script.
* Give MKL higher priority in configure script.
* Clean up Linux example makefile.inc.
* Clean up makefile.inc examples.
* Fix python clean Makefile target.
* Regen swig wrappers.
* Remove useless CUDAFLAGS variable.
* Fix python linking flags.
* Separate compile and link phase in python makefile.
* Add macro to look for swig.
* Add CUDA check in configure script.
* Clean up make depend targets.
* Clean up CUDA flags.
* Fix linking flags.
* Fix python GPU linking.
* Remove useless flags from python gpu module linking.
* Add check for cuda libs.
* Cleanup GPU targets.
* Clean up test target.
* Add cpu/gpu targets to python makefile.
* Clean up tutorial Makefile.
* Remove stale OS var from example makefiles.
* Clean up cuda example flags.