Vamana Index
Documentation for the type-erased version of the svs::index::vamana::VamanaIndex
.
-
class Vamana : public svs::manager::IndexManager<VamanaInterface>
Type erased container for the Vamana index.
Vamana Manager
Public Types
-
using search_parameters_type = typename IFace::search_parameters_type
Require the presence of the
search_parameters_type
alias in the instantiating interface.
Public Functions
-
inline float get_alpha() const
Return the value of
alpha
used during graph construction.
-
inline size_t get_max_candidates() const
Return the max candidate pool size that was used for graph construction.
-
inline void save(const std::filesystem::path &config_directory, const std::filesystem::path &graph_directory, const std::filesystem::path &data_directory)
Save the whole index to disk to enable reloading in the future.
In general, the save locations must be directories since each data structure (config, graph, and data) may require an arbitrary number of auxiliary files in order to by completely saved to disk. The given directories must be different.
Each directory may be created as a side-effect of this method call provided that the parent directory exists. Data in existing directories will be overwritten without warning.
The choice to use a multi-directory structure is to enable design-space exploration with different data quantization techniques or graph implementations. That is, the save format for the different components is designed to be orthogonal to allow mixing and matching of different types upon reloading.
- Parameters:
config_directory – Directory where the index metadata will be saved. The contents of the metadata include the configured search window size, entry point etc.
graph_directory – Directory where the graph will be saved.
data_directory – Directory where the vector dataset will be saved.
-
inline size_t size() const
Return the number of elements in the indexed dataset.
-
inline size_t dimensions() const
Return the logical number of dimensions of each vector in the indexed dataset.
-
inline std::vector<svs::DataType> query_types() const
Return query-types that this index is specialized to work with.
-
inline bool can_change_threads() const
Return whether the back-end implementation can change the number of threads.
-
inline size_t get_num_threads() const
Return the current number of worker threads used by this index for searches.
-
inline void set_num_threads(size_t num_threads)
Set the number of threads to use for searching.
Only effective if
can_change_threads()
returnstrue
.- Parameters:
num_threads – The number of threads to use. If set to
0
, will implicitly default to1
.
Public Static Functions
-
template<manager::QueryTypeDefinition QueryTypes, typename GraphLoaderType, typename DataLoader, typename Distance>
static inline Vamana assemble(const std::filesystem::path &config_path, const GraphLoaderType &graph_loader, DataLoader &&data_loader, const Distance &distance, size_t num_threads = 1) Load a Vamana Index from a previously saved index.
The data loader should be any object loadable via
svs::detail::dispatch_load
returning a Vamana compatible dataset. Concrete examples include:An instance of
VectorDataLoader
.An implementation of
svs::data::ImmutableMemoryDataset
(passed by value).
- Template Parameters:
QueryType – The element type of queries that will be used when requesting searches over the index.
- Parameters:
config_path – Path to the directory where the index configuration was saved. This corresponds to the
config_dir
argument ofsvs::Vamana::save
.graph_loader – The loader for the graph to use. See
svs::GraphLoader
. The file path corresponds to the directory given as thegraph_dir
argument ofsvs::Vamana::save
.data_loader – An acceptable data loader. See the documentation below for details.
distance – The distance functor or
svs::DistanceType
enum to use for similarity search computations.num_threads – The number of threads to use to process queries. May be changed at run-time.
-
template<manager::QueryTypeDefinition QueryTypes, typename DataLoader, typename Distance, typename Allocator = HugepageAllocator<uint32_t>>
static inline Vamana build(const index::vamana::VamanaBuildParameters ¶meters, DataLoader &&data_loader, Distance distance, size_t num_threads = 1, const Allocator &graph_allocator = {}) Construct the a Vamana Index for the given dataset.
The data loader should be any object loadable via
svs::detail::dispatch_load
returning a Vamana compatible dataset. Concrete examples include:An instance of
VectorDataLoader
.An implementation of
svs::data::ImmutableMemoryDataset
(passed by value).
- Template Parameters:
QueryType – The element type of the queries that will be given to this index.
- Parameters:
parameters – The build parameters for the search graph constructed over the data.
data_loader – Either a data loader from disk or a dataset by value. See detailed notes below.
distance – The distance functor to use or a
svs::DistanceType
enum.num_threads – The number of threads to use for query processing (may be changed after class construction).
graph_allocator – The allocator to use for the backing graph.
-
using search_parameters_type = typename IFace::search_parameters_type
Helper Classes
-
struct VamanaBuildParameters
Parameters controlling graph construction for the Vamana graph index.
Public Members
-
float alpha
The pruning parameter.
-
size_t graph_max_degree
The maximum degree in the graph. A higher max degree may yield a higher quality graph in terms of recall for performance, but the memory footprint of the graph is directly proportional to the maximum degree.
-
size_t window_size
The search window size to use during graph construction. A higher search window size will yield a higher quality graph since more overall vertices are considered, but will increase construction time.
-
size_t max_candidate_pool_size
Set a limit on the number of neighbors considered during pruning. In practice, set this to a high number (at least 5 times greater than the window_size) and forget about it.
-
size_t prune_to
This is the amount that candidates will be pruned to after certain pruning procedures. Setting this to less than
graph_max_degree
can result in significant speedups in index building.
-
bool use_full_search_history = true
When building, either the contents of the search buffer can be used or the entire search history can be used.
The latter case may yield a slightly better graph as the cost of more search time.
-
float alpha
-
struct VamanaSearchParameters
Runtime parameters controlling the accuracy and performance of index search.
Public Members
-
SearchBufferConfig buffer_config_ = {}
Configuration of the search buffer.
Increasing the search window size and capacity generally yields more accurate but slower search results.
-
bool search_buffer_visited_set_ = false
Enabling of the visited set for search.
The visited set tracks whether candidates the distance between a query and a candidate has already been computed. Enabling this feature generally improves performance in the high-recall or high-neighbor regime.
-
size_t prefetch_lookahead_ = 4
The number of iterations ahead to prefetch candidates.
-
size_t prefetch_step_ = 1
Parameter controlling the ramp phase of prefetching.
-
SearchBufferConfig buffer_config_ = {}
Example
This example will cover the following topic:
Building a
svs::Vamana
index orchestrator.Performing queries to retrieve neighbors from the
svs::Vamana
index.Saving the index to disk.
Loading a
svs::Vamana
index orchestrator from disk.Compressing an on-disk dataset and loading/searching with a compressed
svs::Vamana
.
The complete example is included at the end of this file.
Preamble
First, We need to include some headers.
// SVS Dependencies
#include "svs/orchestrators/vamana.h" // bulk of the dependencies required.
#include "svs/core/recall.h" // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.
// Alternative main definition
#include "svsmain.h"
// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>
Then, we need to include some helpful utilities. @snippet vamana.cpp Helper Utilities
double run_recall(
svs::Vamana& index,
const svs::data::SimpleData<float>& queries,
const svs::data::SimpleData<uint32_t>& groundtruth,
size_t search_window_size,
size_t num_neighbors,
std::string_view message = ""
) {
index.set_search_window_size(search_window_size);
auto results = index.search(queries, num_neighbors);
double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
if (!message.empty()) {
fmt::print("[{}] ", message);
}
fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
return recall;
}
const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
double diff = std::abs(expected - got);
if constexpr (DEBUG) {
fmt::print("Expected {}. Got {}\n", expected, got);
} else {
if (diff > eps) {
throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
}
}
}
The function run_recall
sets the search window size of the svs::Vamana
index performs a search over a given set of queries, and computes the k
recall at k
where k
is the number of returned neighbors.
The function check
compares the recall with an expected recall and throws an
exception if they differ by too much (this is mainly to allow automated testing of this
example).
Finally, we do some argument decoding. @snippet vamana.cpp Argument Extraction
const size_t nargs = args.size();
if (nargs != 4) {
throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
}
const std::string& data_vecs = args.at(1);
const std::string& query_vecs = args.at(2);
const std::string& groundtruth_vecs = args.at(3);
It is expected that the svs function svs.generate_test_dataset()
was used
to generate the data, graph, and metdata files.
Building and Index
The first step is to construct an instance of svs::index::vamana::VamanaBuildParameters
to describe the hyper-parameters of the graph we wish to construct.
Don’t worry too much about selecting the correct values for these hyper-parameters right now.
This usually involves a bit of experimentation and is dataset dependent.
Now that we’ve established our parameters, it is time to construct the index.
size_t num_threads = 4;
svs::Vamana index = svs::Vamana::build<float>(
parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
);
There are several things to note about about this construction.
The type parameter
float
passed tosvs::Vamana::build()
indicates the element types of the queries that will be supported by the resulting index.Using queries with a different element type will result in a run-time error. This is due to limits in type-erasure and dynamic function calls.
The data path is wrapped in a
svs::VectorDataLoader
with the correct element type. In this case, we’re explicitly using the dynamic sizing for the data. If static dimensionality was desired, than the second value parameter forsvs::VectorDataLoader
could be used.An instance of the
svs::distance::DistanceL2
functor is passed directly.
Searching the Index
The graph is now built and we can perform queries over the graph.
First, we load the queries and the computed ground truth for our example dataset using svs::io::auto_load()
.
// Load the queries and ground truth.
auto queries = svs::load_data<float>(query_vecs);
auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);
Performing queries is easy.
First establish a base-line search window size (svs::Vamana::set_search_window_size()
).
This provides a parameter by which performance and accuracy can be traded.
The larger search_window_size
is, the higher the accuracy but the lower the performance.
Note that search_window_size
must be at least as large as the desired number of neighbors.
index.set_search_window_size(30);
svs::QueryResult<size_t> results = index.search(queries, 10);
double recall = svs::k_recall_at_n(groundtruth, results);
check(0.8215, recall);
We use svs::Vamana::search()
to find the 10 approximate nearest neighbors to each query in the form of a svs::QueryResult
.
Then, we use svs::k_recall_at_n()
to compute the 10-recall at 10 of the returned neighbors, checking to confirm the accuracy.
The code snippet below demonstrates how to vary the search window size to change the achieved recall.
auto expected_recall =
std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
for (auto windowsize : {10, 20, 30, 40}) {
recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
check(expected_recall.at(windowsize), recall);
}
Saving the Index
If you are satisfied with the performance of the generated index, you can save it to disk to avoid rebuilding it in the future.
This is done with svs::Vamana::save()
.
index.save("example_config", "example_graph", "example_data");
Reloading a Saved Index
To reload the index from file, use svs::Vamana::load()
.
This informs the dispatch mechanisms that we’re loading an uncompressed data file from disk.
// We can reload an index from a previously saved set of files.
index = svs::Vamana::assemble<float>(
"example_config",
svs::GraphLoader("example_graph"),
svs::VectorDataLoader<float>("example_data"),
svs::DistanceType::L2,
4 // num_threads
);
recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
check(0.8215, recall);
Performing queries is identical to before.
Using Vector Compression
The library has experimental support for performing online vector compression.
The second argument to svs::Vamana::load()
can be one of the compressed loaders, which will compress an uncompressed dataset on the fly.
Specifying the loader is all that is required to use vector compression.
Note that vector compression is usually accompanied by an accuracy loss for the same search window size and may require increasing the window size to compensate.
The example below shows a svs::quantization::lvq::OneLevelWithBias
quantization scheme with a static dimensionality of 128.
// Quantization
size_t padding = 32;
namespace lvq = svs::quantization::lvq;
// Wrap the compressor object in a lazy functor.
// This will defer loading and compression of the LVQ dataset until the threadpool
// used in the index has been created.
auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
auto data = svs::VectorDataLoader<float, 128>("example_data").load();
return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
});
index = svs::Vamana::assemble<float>(
"example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
);
recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
check(0.8215, recall);
Building using Vector Compression
Index building using LVQ is very similar to index building using standard uncompressed vectors, though it may not be supported by all compression techniques.
// Compressed building
index =
svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
check(0.8212, recall);
Entire Example
This ends the example demonstrating the features of the svs::Vamana
index.
The entire executable code is shown below.
Please reach out with any questions.
//! [Includes]
// SVS Dependencies
#include "svs/orchestrators/vamana.h" // bulk of the dependencies required.
#include "svs/core/recall.h" // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.
// Alternative main definition
#include "svsmain.h"
// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>
//! [Includes]
//! [Helper Utilities]
double run_recall(
svs::Vamana& index,
const svs::data::SimpleData<float>& queries,
const svs::data::SimpleData<uint32_t>& groundtruth,
size_t search_window_size,
size_t num_neighbors,
std::string_view message = ""
) {
index.set_search_window_size(search_window_size);
auto results = index.search(queries, num_neighbors);
double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
if (!message.empty()) {
fmt::print("[{}] ", message);
}
fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
return recall;
}
const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
double diff = std::abs(expected - got);
if constexpr (DEBUG) {
fmt::print("Expected {}. Got {}\n", expected, got);
} else {
if (diff > eps) {
throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
}
}
}
//! [Helper Utilities]
// Alternative main definition
int svs_main(std::vector<std::string> args) {
//! [Argument Extraction]
const size_t nargs = args.size();
if (nargs != 4) {
throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
}
const std::string& data_vecs = args.at(1);
const std::string& query_vecs = args.at(2);
const std::string& groundtruth_vecs = args.at(3);
//! [Argument Extraction]
// Building the index
//! [Build Parameters]
auto parameters = svs::index::vamana::VamanaBuildParameters{
1.2, // alpha
64, // graph max degree
128, // search window size
1024, // max candidate pool size
60, // prune to degree
true, // full search history
};
//! [Build Parameters]
//! [Index Build]
size_t num_threads = 4;
svs::Vamana index = svs::Vamana::build<float>(
parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
);
//! [Index Build]
// Searching the index
//! [Load Aux]
// Load the queries and ground truth.
auto queries = svs::load_data<float>(query_vecs);
auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);
//! [Load Aux]
//! [Perform Queries]
index.set_search_window_size(30);
svs::QueryResult<size_t> results = index.search(queries, 10);
double recall = svs::k_recall_at_n(groundtruth, results);
check(0.8215, recall);
//! [Perform Queries]
//! [Search Window Size]
auto expected_recall =
std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
for (auto windowsize : {10, 20, 30, 40}) {
recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
check(expected_recall.at(windowsize), recall);
}
//! [Search Window Size]
// Saving the index
//! [Saving]
index.save("example_config", "example_graph", "example_data");
//! [Saving]
// Reloading a saved index
//! [Loading]
// We can reload an index from a previously saved set of files.
index = svs::Vamana::assemble<float>(
"example_config",
svs::GraphLoader("example_graph"),
svs::VectorDataLoader<float>("example_data"),
svs::DistanceType::L2,
4 // num_threads
);
recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
check(0.8215, recall);
//! [Loading]
// Search using vector compression
//! [Compressed Loader]
// Quantization
size_t padding = 32;
namespace lvq = svs::quantization::lvq;
// Wrap the compressor object in a lazy functor.
// This will defer loading and compression of the LVQ dataset until the threadpool
// used in the index has been created.
auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
auto data = svs::VectorDataLoader<float, 128>("example_data").load();
return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
});
index = svs::Vamana::assemble<float>(
"example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
);
//! [Compressed Loader]
//! [Search Compressed]
recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
check(0.8215, recall);
//! [Search Compressed]
//! [Build Index Compressed]
// Compressed building
index =
svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
check(0.8212, recall);
//! [Build Index Compressed]
//! [Only Loading]
// We can reload an index from a previously saved set of files.
index = svs::Vamana::assemble<float>(
"example_config",
svs::GraphLoader("example_graph"),
svs::VectorDataLoader<float>("example_data"),
svs::DistanceType::L2,
4 // num_threads
);
//! [Only Loading]
//! [Set n-threads]
index.set_num_threads(4);
//! [Set n-threads]
return 0;
}
// Special main providing some helpful utilties.
SVS_DEFINE_MAIN();