Library Features
Here we present the main library features, including the supported index types, distance functions, and data types.
Index Types
SVS supports the following index types:
Graphs for static datasets
The Vamana graph (in Python, in C++) enables fast in-memory graph-based similarity search with high accuracy for static databases, where the database is fixed and never updated.
Graphs for streaming data
The DynamicVamana graph (in Python, in C++) enables fast in-memory graph-based similarity search with high accuracy for streaming data, where the database is built dynamically by adding and removing vectors.
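For illustration, below is a minimal Python sketch of the streaming workflow. The exact build and update signatures (in particular how external ids are passed to add and delete) are assumptions here; refer to the linked API documentation for the authoritative interface.

import numpy as np
import svs

# Build an initial index over a first batch of vectors, each tagged with an external id.
# NOTE: signatures below are assumptions; see the DynamicVamana API reference.
data = np.random.rand(1000, 128).astype(np.float32)
ids = np.arange(1000, dtype=np.uint64)
parameters = svs.VamanaBuildParameters(graph_max_degree = 64, window_size = 128)
index = svs.DynamicVamana.build(parameters, data, ids, svs.DistanceType.L2, num_threads = 4)

# Stream updates: insert new vectors and remove stale ones by id.
new_data = np.random.rand(100, 128).astype(np.float32)
new_ids = np.arange(1000, 1100, dtype=np.uint64)
index.add(new_data, new_ids)
index.delete(ids[:100])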
Flat Index
The flat index (in Python, in C++) can be used to run an exhaustive search, which is useful, for example, to compute the ground-truth nearest neighbors for a dataset.
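For example, a short Python sketch of computing exact nearest neighbors with the flat index (the constructor and search signatures follow the patterns used later in this document; treat them as assumptions and check the linked API reference):

import numpy as np
import svs

# Exhaustive (exact) search over an uncompressed dataset.
index = svs.Flat(
    svs.VectorDataLoader("example_data", svs.DataType.float32),
    distance = svs.DistanceType.L2,
    num_threads = 4,
)
queries = np.random.rand(100, 128).astype(np.float32)
# Ids and distances of the 10 exact nearest neighbors for each query.
I, D = index.search(queries, 10)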
Distance functions
SVS supports the distance functions listed in Built-In Distance Functors (see svs.DistanceType for the corresponding Python classes). The distance function is specified when the index is created by the corresponding index constructors. In the case of the Vamana index, it must also be specified when the graph is built (see svs.Vamana.build and svs::Vamana::build() for details).
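For example, selecting maximum inner product instead of Euclidean distance only requires passing a different svs.DistanceType value at build time. The sketch below reuses the build signature shown later in this section; svs.DistanceType.MIP is assumed to be the inner-product functor:

import svs

parameters = svs.VamanaBuildParameters(graph_max_degree = 64, window_size = 128)
data_loader = svs.VectorDataLoader("example_data", svs.DataType.float32)

# The distance function is fixed when the graph is built.
index = svs.Vamana.build(parameters, data_loader, svs.DistanceType.MIP, num_threads = 4)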
Data types
The supported data types are float32, float16, int8, and uint8. Other data types might work, but their performance has not been tested.
The data type can be set independently for the database vectors and the query vectors. For example, one could compress the database vectors to float16, which allows for a 2x storage reduction, often with negligible accuracy loss, and keep the queries in float32.
In Python
The data type for the database vectors is specified by the data_type argument when the vectors are loaded with svs.VectorDataLoader. The data type for the query vectors is specified by the query_type argument of the corresponding index constructors (svs.Vamana, svs.Flat).
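Putting the two together, here is a sketch of the float16 database / float32 queries combination mentioned above. The file name data_f16.svs is hypothetical, and the svs.Vamana constructor arguments are assumptions based on the linked API reference:

import svs

# Load a dataset that was saved to disk as float16 (no conversion is performed).
data_loader = svs.VectorDataLoader("data_f16.svs", svs.DataType.float16)

# Load an existing Vamana index over the float16 data, keeping queries in float32.
index = svs.Vamana(
    "example_config",                  # hypothetical saved index configuration
    svs.GraphLoader("example_graph"),  # hypothetical saved graph
    data_loader,
    svs.DistanceType.L2,
    query_type = svs.DataType.float32,
)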
In C++
The data type of the database vectors is specified by the template argument of svs::VectorDataLoader.
svs::VectorDataLoader<float>("data_f32.svs")
For details on setting the query vectors data type, see Indexes.
Warning
This will not perform any dataset conversion. If a dataset was saved to disk as float16 data, for example, then it must be loaded with data_type = svs.DataType.float16 in Python or svs::Float16 in C++.
The supported data type combinations for (queries, database vectors) are: (float32, float32), (float32, float16), (uint8, uint8), (int8, int8), among others.
Vector compression
The memory footprint can be reduced, and the search performance improved, by combining graph search with the Locally-adaptive Vector Quantization (LVQ) [ABHT23] approach. The library supports performing online vector compression with LVQ, as well as loading an index with a previously compressed dataset. See How to Choose Compression Parameters for more details about LVQ and how to set its parameters.
Search with compressed vectors
See Search using vector compression for details on how to use LVQ for search.
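As a brief sketch (the loaders match the graph-building example below; the svs.Vamana constructor arguments are assumptions based on the API reference), switching search to LVQ-compressed vectors only changes the loader passed when the index is loaded:

import numpy as np
import svs

# Compress the dataset on the fly with one-level LVQ using 8 bits per component.
compressed_loader = svs.LVQLoader(
    svs.VectorDataLoader("example_data", svs.DataType.float32),
    primary = 8,
)

# Load a previously built index over the compressed data and search as usual.
index = svs.Vamana(
    "example_config",
    svs.GraphLoader("example_graph"),
    compressed_loader,
    svs.DistanceType.L2,
)
queries = np.random.rand(100, 128).astype(np.float32)
I, D = index.search(queries, 10)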
Graph building with compressed vectors
LVQ-compressed vectors can be used to build the graph, thus reducing the memory footprint not only for search but also for indexing. Depending on the dataset, the search accuracy can be almost unchanged even when the graph is built with highly compressed vectors using LVQ-4 or LVQ-8, that is, using only 4 or 8 bits per vector component [ABHT23].
To build the graph using compressed vectors, we follow the same procedure as for uncompressed graph building. First, define the build parameters:
In Python
# Now, we can build a graph index over the data set.
parameters = svs.VamanaBuildParameters(
    graph_max_degree = 64,
    window_size = 128,
)
In C++
auto parameters = svs::index::vamana::VamanaBuildParameters{
    1.2,  // alpha
    64,   // graph max degree
    128,  // search window size
    1024, // max candidate pool size
    60,   // prune to degree
    true, // full search history
};
then load the dataset using the loader corresponding to the chosen compression type (details for Python, details for C++):
In Python
data_loader = svs.VectorDataLoader(
    os.path.join(test_data_dir, "example_data"),  # Uncompressed data
    svs.DataType.float32,
    dims = 128  # Passing dimensionality is optional
)
B1 = 4  # Number of bits for the first level LVQ quantization
B2 = 8  # Number of bits for the residuals quantization
padding = 32
strategy = svs.LVQStrategy.Turbo
compressed_loader = svs.LVQLoader(
    data_loader,
    primary = B1,
    residual = B2,
    strategy = strategy,  # Passing the strategy is optional.
    padding = padding,    # Passing padding is optional.
)
In C++
// Quantization
size_t padding = 32;
namespace lvq = svs::quantization::lvq;

// Wrap the compressor object in a lazy functor.
// This will defer loading and compression of the LVQ dataset until the threadpool
// used in the index has been created.
auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
    auto data = svs::VectorDataLoader<float, 128>("example_data").load();
    return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
});
index = svs::Vamana::assemble<float>(
    "example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
);
and finally call the build function with the chosen compression loader:
In Python
# Build the index.
index = svs.Vamana.build(
    parameters,
    compressed_loader,
    svs.DistanceType.L2,
    num_threads = 4
)
In C++
// Compressed building
index = svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
check(0.8212, recall);
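After building with compressed vectors, searching proceeds exactly as with an uncompressed index. Continuing the Python example above (the search_window_size property is an assumption based on the API reference; 30 and 10 mirror the window size and number of neighbors used in the C++ recall check):

import numpy as np

queries = np.random.rand(100, 128).astype(np.float32)
index.search_window_size = 30  # larger windows trade speed for accuracy
I, D = index.search(queries, 10)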
References
Subramanya, S. J.; Devvrit, F.; Simhadri, H. V.; Krishnaswamy, R.; Kadekodi, R.: DiskANN: Fast accurate billion-point nearest neighbor search on a single node. In: Advances in Neural Information Processing Systems 32 (2019).
[ABHT23] Aguerrebere, C.; Bhati, I.; Hildebrand, M.; Tepper, M.; Willke, T.: Similarity search in the blink of an eye with compressed indices. In: Proceedings of the VLDB Endowment 16, 11 (2023), 3433–3446.
Aguerrebere, C.; Hildebrand, M.; Bhati, I.; Willke, T.; Tepper, M.: Locally-adaptive Quantization for Streaming Vector Search. In: arXiv preprint arXiv:2402.02044 (2024).
Malkov, Y. A.; Yashunin, D. A.: Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 4 (2018), 824–836.
Johnson, J.; Douze, M.; Jégou, H.: Billion-scale similarity search with GPUs. In: IEEE Transactions on Big Data 7, 3 (2019), 535–547.
Guo, R.; Sun, P.; Lindgren, E.; Geng, Q.; Simcha, D.; Chern, F.; Kumar, S.: Accelerating large-scale inference with anisotropic vector quantization. In: International Conference on Machine Learning. PMLR, 3887–3896 (2020).
Iwasaki, M.; Miyazaki, D.: Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data. https://github.com/yahoojapan/NGT (2018).
Aumüller, M.; Bernhardsson, E.; Faithfull, A.: ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. In: Information Systems 87 (2020), 101374. https://doi.org/10.1016/j.is.2019.02.006
Qu, Y.; Ding, Y.; Liu, J.; Liu, K.; Ren, R.; Zhao, W. X.; Dong, D.; Wu, H.; Wang, H.: RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering. In: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 5835–5847 (2021).
Singh, A.; Subramanya, S. J.; Krishnaswamy, R.; Simhadri, H. V.: FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search. In: arXiv preprint arXiv:2105.09613 (2021).
Douze, M.; Guzhva, A.; Deng, C.; Johnson, J.; Szilvasy, G.; Mazaré, P. E.; Lomeli, M.; Hosseini, L.; Jégou, H.: The Faiss library. In: arXiv preprint arXiv:2401.08281 (2024).
Tepper, M.; Bhati, I.; Aguerrebere, C.; Hildebrand, M.; Willke, T.: LeanVec: Search your vectors faster by making them fit. In: arXiv preprint arXiv:2312.16335 (2024).