Vamana Index

Documentation for the type-erased version of the svs::index::vamana::VamanaIndex.

class Vamana : public svs::manager::IndexManager<VamanaInterface>

Type erased container for the Vamana index.

Vamana Manager

Public Types

using search_parameters_type = typename IFace::search_parameters_type

Require the presence of the search_parameters_type alias in the instantiating interface.

Public Functions

inline float get_alpha() const

Return the value of alpha used during graph construction.

inline size_t get_max_candidates() const

Return the max candidate pool size that was used for graph construction.

inline void save(const std::filesystem::path &config_directory, const std::filesystem::path &graph_directory, const std::filesystem::path &data_directory)

Save the whole index to disk to enable reloading in the future.

In general, the save locations must be directories since each data structure (config, graph, and data) may require an arbitrary number of auxiliary files in order to by completely saved to disk. The given directories must be different.

Each directory may be created as a side-effect of this method call provided that the parent directory exists. Data in existing directories will be overwritten without warning.

The choice to use a multi-directory structure is to enable design-space exploration with different data quantization techniques or graph implementations. That is, the save format for the different components is designed to be orthogonal to allow mixing and matching of different types upon reloading.

See also

assemble, build

Parameters:
  • config_directory – Directory where the index metadata will be saved. The contents of the metadata include the configured search window size, entry point etc.

  • graph_directory – Directory where the graph will be saved.

  • data_directory – Directory where the vector dataset will be saved.

inline size_t size() const

Return the number of elements in the indexed dataset.

inline size_t dimensions() const

Return the logical number of dimensions of each vector in the indexed dataset.

inline std::vector<svs::DataType> query_types() const

Return query-types that this index is specialized to work with.

inline bool can_change_threads() const

Return whether the back-end implementation can change the number of threads.

inline size_t get_num_threads() const

Return the current number of worker threads used by this index for searches.

inline void set_num_threads(size_t num_threads)

Set the number of threads to use for searching.

Only effective if can_change_threads() returns true.

Parameters:

num_threads – The number of threads to use. If set to 0, will implicitly default to 1.

Public Static Functions

template<manager::QueryTypeDefinition QueryTypes, typename GraphLoaderType, typename DataLoader, typename Distance>
static inline Vamana assemble(const std::filesystem::path &config_path, const GraphLoaderType &graph_loader, DataLoader &&data_loader, const Distance &distance, size_t num_threads = 1)

Load a Vamana Index from a previously saved index.

The data loader should be any object loadable via svs::detail::dispatch_load returning a Vamana compatible dataset. Concrete examples include:

See also

save, build

Template Parameters:

QueryType – The element type of queries that will be used when requesting searches over the index.

Parameters:
  • config_path – Path to the directory where the index configuration was saved. This corresponds to the config_dir argument of svs::Vamana::save.

  • graph_loader – The loader for the graph to use. See svs::GraphLoader. The file path corresponds to the directory given as the graph_dir argument of svs::Vamana::save.

  • data_loader – An acceptable data loader. See the documentation below for details.

  • distance – The distance functor or svs::DistanceType enum to use for similarity search computations.

  • num_threads – The number of threads to use to process queries. May be changed at run-time.

template<manager::QueryTypeDefinition QueryTypes, typename DataLoader, typename Distance, typename Allocator = HugepageAllocator<uint32_t>>
static inline Vamana build(const index::vamana::VamanaBuildParameters &parameters, DataLoader &&data_loader, Distance distance, size_t num_threads = 1, const Allocator &graph_allocator = {})

Construct the a Vamana Index for the given dataset.

The data loader should be any object loadable via svs::detail::dispatch_load returning a Vamana compatible dataset. Concrete examples include:

See also

assemble, save

Template Parameters:

QueryType – The element type of the queries that will be given to this index.

Parameters:
  • parameters – The build parameters for the search graph constructed over the data.

  • data_loader – Either a data loader from disk or a dataset by value. See detailed notes below.

  • distance – The distance functor to use or a svs::DistanceType enum.

  • num_threads – The number of threads to use for query processing (may be changed after class construction).

  • graph_allocator – The allocator to use for the backing graph.

Helper Classes

struct VamanaBuildParameters

Parameters controlling graph construction for the Vamana graph index.

Public Members

float alpha

The pruning parameter.

size_t graph_max_degree

The maximum degree in the graph. A higher max degree may yield a higher quality graph in terms of recall for performance, but the memory footprint of the graph is directly proportional to the maximum degree.

size_t window_size

The search window size to use during graph construction. A higher search window size will yield a higher quality graph since more overall vertices are considered, but will increase construction time.

size_t max_candidate_pool_size

Set a limit on the number of neighbors considered during pruning. In practice, set this to a high number (at least 5 times greater than the window_size) and forget about it.

size_t prune_to

This is the amount that candidates will be pruned to after certain pruning procedures. Setting this to less than graph_max_degree can result in significant speedups in index building.

bool use_full_search_history = true

When building, either the contents of the search buffer can be used or the entire search history can be used.

The latter case may yield a slightly better graph as the cost of more search time.

struct VamanaSearchParameters

Runtime parameters controlling the accuracy and performance of index search.

Public Members

SearchBufferConfig buffer_config_ = {}

Configuration of the search buffer.

Increasing the search window size and capacity generally yields more accurate but slower search results.

bool search_buffer_visited_set_ = false

Enabling of the visited set for search.

The visited set tracks whether candidates the distance between a query and a candidate has already been computed. Enabling this feature generally improves performance in the high-recall or high-neighbor regime.

size_t prefetch_lookahead_ = 4

The number of iterations ahead to prefetch candidates.

size_t prefetch_step_ = 1

Parameter controlling the ramp phase of prefetching.

Example

This example will cover the following topic:

  • Building a svs::Vamana index orchestrator.

  • Performing queries to retrieve neighbors from the svs::Vamana index.

  • Saving the index to disk.

  • Loading a svs::Vamana index orchestrator from disk.

  • Compressing an on-disk dataset and loading/searching with a compressed svs::Vamana.

The complete example is included at the end of this file.

Preamble

First, We need to include some headers.

// SVS Dependencies
#include "svs/orchestrators/vamana.h"  // bulk of the dependencies required.
#include "svs/core/recall.h"           // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.

// Alternative main definition
#include "svsmain.h"

// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>

Then, we need to include some helpful utilities. @snippet vamana.cpp Helper Utilities

double run_recall(
    svs::Vamana& index,
    const svs::data::SimpleData<float>& queries,
    const svs::data::SimpleData<uint32_t>& groundtruth,
    size_t search_window_size,
    size_t num_neighbors,
    std::string_view message = ""
) {
    index.set_search_window_size(search_window_size);
    auto results = index.search(queries, num_neighbors);
    double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
    if (!message.empty()) {
        fmt::print("[{}] ", message);
    }
    fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
    return recall;
}

const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
    double diff = std::abs(expected - got);
    if constexpr (DEBUG) {
        fmt::print("Expected {}. Got {}\n", expected, got);
    } else {
        if (diff > eps) {
            throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
        }
    }
}

The function run_recall sets the search window size of the svs::Vamana index performs a search over a given set of queries, and computes the k recall at k where k is the number of returned neighbors.

The function check compares the recall with an expected recall and throws an exception if they differ by too much (this is mainly to allow automated testing of this example).

Finally, we do some argument decoding. @snippet vamana.cpp Argument Extraction

const size_t nargs = args.size();
if (nargs != 4) {
    throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
}
const std::string& data_vecs = args.at(1);
const std::string& query_vecs = args.at(2);
const std::string& groundtruth_vecs = args.at(3);

It is expected that the svs function svs.generate_test_dataset() was used to generate the data, graph, and metdata files.

Building and Index

The first step is to construct an instance of svs::index::vamana::VamanaBuildParameters to describe the hyper-parameters of the graph we wish to construct. Don’t worry too much about selecting the correct values for these hyper-parameters right now. This usually involves a bit of experimentation and is dataset dependent.

Now that we’ve established our parameters, it is time to construct the index.

size_t num_threads = 4;
svs::Vamana index = svs::Vamana::build<float>(
    parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
);

There are several things to note about about this construction.

  • The type parameter float passed to svs::Vamana::build() indicates the element types of the queries that will be supported by the resulting index.

    Using queries with a different element type will result in a run-time error. This is due to limits in type-erasure and dynamic function calls.

  • The data path is wrapped in a svs::VectorDataLoader with the correct element type. In this case, we’re explicitly using the dynamic sizing for the data. If static dimensionality was desired, than the second value parameter for svs::VectorDataLoader could be used.

  • An instance of the svs::distance::DistanceL2 functor is passed directly.

Searching the Index

The graph is now built and we can perform queries over the graph. First, we load the queries and the computed ground truth for our example dataset using svs::io::auto_load().

// Load the queries and ground truth.
auto queries = svs::load_data<float>(query_vecs);
auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);

Performing queries is easy. First establish a base-line search window size (svs::Vamana::set_search_window_size()). This provides a parameter by which performance and accuracy can be traded. The larger search_window_size is, the higher the accuracy but the lower the performance. Note that search_window_size must be at least as large as the desired number of neighbors.

index.set_search_window_size(30);
svs::QueryResult<size_t> results = index.search(queries, 10);
double recall = svs::k_recall_at_n(groundtruth, results);
check(0.8215, recall);

We use svs::Vamana::search() to find the 10 approximate nearest neighbors to each query in the form of a svs::QueryResult. Then, we use svs::k_recall_at_n() to compute the 10-recall at 10 of the returned neighbors, checking to confirm the accuracy.

The code snippet below demonstrates how to vary the search window size to change the achieved recall.

auto expected_recall =
    std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
for (auto windowsize : {10, 20, 30, 40}) {
    recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
    check(expected_recall.at(windowsize), recall);
}

Saving the Index

If you are satisfied with the performance of the generated index, you can save it to disk to avoid rebuilding it in the future. This is done with svs::Vamana::save().

index.save("example_config", "example_graph", "example_data");

Reloading a Saved Index

To reload the index from file, use svs::Vamana::load(). This informs the dispatch mechanisms that we’re loading an uncompressed data file from disk.

// We can reload an index from a previously saved set of files.
index = svs::Vamana::assemble<float>(
    "example_config",
    svs::GraphLoader("example_graph"),
    svs::VectorDataLoader<float>("example_data"),
    svs::DistanceType::L2,
    4 // num_threads
);

recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
check(0.8215, recall);

Performing queries is identical to before.

Using Vector Compression

The library has experimental support for performing online vector compression. The second argument to svs::Vamana::load() can be one of the compressed loaders, which will compress an uncompressed dataset on the fly.

Specifying the loader is all that is required to use vector compression. Note that vector compression is usually accompanied by an accuracy loss for the same search window size and may require increasing the window size to compensate. The example below shows a svs::quantization::lvq::OneLevelWithBias quantization scheme with a static dimensionality of 128.

    // Quantization
    size_t padding = 32;
    namespace lvq = svs::quantization::lvq;

    // Wrap the compressor object in a lazy functor.
    // This will defer loading and compression of the LVQ dataset until the threadpool
    // used in the index has been created.
    auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
        auto data = svs::VectorDataLoader<float, 128>("example_data").load();
        return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
    });
    index = svs::Vamana::assemble<float>(
        "example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
    );
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
    check(0.8215, recall);

Building using Vector Compression

Index building using LVQ is very similar to index building using standard uncompressed vectors, though it may not be supported by all compression techniques.

    // Compressed building
    index =
        svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
    check(0.8212, recall);

Entire Example

This ends the example demonstrating the features of the svs::Vamana index. The entire executable code is shown below. Please reach out with any questions.

//! [Includes]
// SVS Dependencies
#include "svs/orchestrators/vamana.h"  // bulk of the dependencies required.
#include "svs/core/recall.h"           // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.

// Alternative main definition
#include "svsmain.h"

// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>
//! [Includes]

//! [Helper Utilities]
double run_recall(
    svs::Vamana& index,
    const svs::data::SimpleData<float>& queries,
    const svs::data::SimpleData<uint32_t>& groundtruth,
    size_t search_window_size,
    size_t num_neighbors,
    std::string_view message = ""
) {
    index.set_search_window_size(search_window_size);
    auto results = index.search(queries, num_neighbors);
    double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
    if (!message.empty()) {
        fmt::print("[{}] ", message);
    }
    fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
    return recall;
}

const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
    double diff = std::abs(expected - got);
    if constexpr (DEBUG) {
        fmt::print("Expected {}. Got {}\n", expected, got);
    } else {
        if (diff > eps) {
            throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
        }
    }
}
//! [Helper Utilities]

// Alternative main definition
int svs_main(std::vector<std::string> args) {
    //! [Argument Extraction]
    const size_t nargs = args.size();
    if (nargs != 4) {
        throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
    }
    const std::string& data_vecs = args.at(1);
    const std::string& query_vecs = args.at(2);
    const std::string& groundtruth_vecs = args.at(3);
    //! [Argument Extraction]

    // Building the index

    //! [Build Parameters]
    auto parameters = svs::index::vamana::VamanaBuildParameters{
        1.2,  // alpha
        64,   // graph max degree
        128,  // search window size
        1024, // max candidate pool size
        60,   // prune to degree
        true, // full search history
    };
    //! [Build Parameters]

    //! [Index Build]
    size_t num_threads = 4;
    svs::Vamana index = svs::Vamana::build<float>(
        parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
    );
    //! [Index Build]

    // Searching the index

    //! [Load Aux]
    // Load the queries and ground truth.
    auto queries = svs::load_data<float>(query_vecs);
    auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);
    //! [Load Aux]

    //! [Perform Queries]
    index.set_search_window_size(30);
    svs::QueryResult<size_t> results = index.search(queries, 10);
    double recall = svs::k_recall_at_n(groundtruth, results);
    check(0.8215, recall);
    //! [Perform Queries]

    //! [Search Window Size]
    auto expected_recall =
        std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
    for (auto windowsize : {10, 20, 30, 40}) {
        recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
        check(expected_recall.at(windowsize), recall);
    }
    //! [Search Window Size]

    // Saving the index

    //! [Saving]
    index.save("example_config", "example_graph", "example_data");
    //! [Saving]

    // Reloading a saved index

    //! [Loading]
    // We can reload an index from a previously saved set of files.
    index = svs::Vamana::assemble<float>(
        "example_config",
        svs::GraphLoader("example_graph"),
        svs::VectorDataLoader<float>("example_data"),
        svs::DistanceType::L2,
        4 // num_threads
    );

    recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
    check(0.8215, recall);
    //! [Loading]

    // Search using vector compression

    //! [Compressed Loader]
    // Quantization
    size_t padding = 32;
    namespace lvq = svs::quantization::lvq;

    // Wrap the compressor object in a lazy functor.
    // This will defer loading and compression of the LVQ dataset until the threadpool
    // used in the index has been created.
    auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
        auto data = svs::VectorDataLoader<float, 128>("example_data").load();
        return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
    });
    index = svs::Vamana::assemble<float>(
        "example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
    );
    //! [Compressed Loader]

    //! [Search Compressed]
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
    check(0.8215, recall);
    //! [Search Compressed]

    //! [Build Index Compressed]
    // Compressed building
    index =
        svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
    check(0.8212, recall);
    //! [Build Index Compressed]

    //! [Only Loading]
    // We can reload an index from a previously saved set of files.
    index = svs::Vamana::assemble<float>(
        "example_config",
        svs::GraphLoader("example_graph"),
        svs::VectorDataLoader<float>("example_data"),
        svs::DistanceType::L2,
        4 // num_threads
    );
    //! [Only Loading]

    //! [Set n-threads]
    index.set_num_threads(4);
    //! [Set n-threads]

    return 0;
}

// Special main providing some helpful utilties.
SVS_DEFINE_MAIN();