Vamana Index

Documentation for the type-erased version of the svs::index::vamana::VamanaIndex.

class Vamana : public svs::manager::IndexManager<VamanaInterface>

Type erased container for the Vamana index.

Vamana Manager

Public Types

using search_parameters_type = typename IFace::search_parameters_type: Require the presence of the search_parameters_type alias in the instantiating interface.

Public Functions

inline float get_alpha() const: Return the value of alpha used during graph construction.

inline size_t get_max_candidates() const: Return the max candidate pool size that was used for graph construction.

inline void save(const std::filesystem::path &config_directory, const std::filesystem::path &graph_directory, const std::filesystem::path &data_directory)

Save the whole index to disk to enable reloading in the future.

In general, the save locations must be directories since each data structure (config, graph, and data) may require an arbitrary number of auxiliary files in order to by completely saved to disk. The given directories must be different.

Each directory may be created as a side-effect of this method call provided that the parent directory exists. Data in existing directories will be overwritten without warning.

The choice to use a multi-directory structure is to enable design-space exploration with different data quantization techniques or graph implementations. That is, the save format for the different components is designed to be orthogonal to allow mixing and matching of different types upon reloading.

Helper Classes

struct VamanaBuildParameters

Parameters controlling graph construction for the Vamana graph index.

Public Members

float alpha: The pruning parameter.

size_t graph_max_degree: The maximum degree in the graph. A higher max degree may yield a higher quality graph in terms of recall for performance, but the memory footprint of the graph is directly proportional to the maximum degree.

size_t window_size: The search window size to use during graph construction. A higher search window size will yield a higher quality graph since more overall vertices are considered, but will increase construction time.

size_t max_candidate_pool_size: Set a limit on the number of neighbors considered during pruning. In practice, set this to a high number (at least 5 times greater than the window_size) and forget about it.

size_t prune_to: This is the amount that candidates will be pruned to after certain pruning procedures. Setting this to less than graph_max_degree can result in significant speedups in index building.

bool use_full_search_history = true

When building, either the contents of the search buffer can be used or the entire search history can be used.

The latter case may yield a slightly better graph as the cost of more search time.

struct VamanaSearchParameters

Runtime parameters controlling the accuracy and performance of index search.

Public Members

SearchBufferConfig buffer_config_ = {}

Configuration of the search buffer.

Increasing the search window size and capacity generally yields more accurate but slower search results.

bool search_buffer_visited_set_ = false

Enabling of the visited set for search.

The visited set tracks whether candidates the distance between a query and a candidate has already been computed. Enabling this feature generally improves performance in the high-recall or high-neighbor regime.

size_t prefetch_lookahead_ = 4: The number of iterations ahead to prefetch candidates.

size_t prefetch_step_ = 1: Parameter controlling the ramp phase of prefetching.

Example

This example will cover the following topic:

Building a svs::Vamana index orchestrator.
Performing queries to retrieve neighbors from the svs::Vamana index.
Saving the index to disk.
Loading a svs::Vamana index orchestrator from disk.
Compressing an on-disk dataset and loading/searching with a compressed svs::Vamana.

The complete example is included at the end of this file.

Preamble

First, We need to include some headers.

// SVS Dependencies
#include "svs/orchestrators/vamana.h"  // bulk of the dependencies required.
#include "svs/core/recall.h"           // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.

// Alternative main definition
#include "svsmain.h"

// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>

Then, we need to include some helpful utilities. @snippet vamana.cpp Helper Utilities

double run_recall(
    svs::Vamana& index,
    const svs::data::SimpleData<float>& queries,
    const svs::data::SimpleData<uint32_t>& groundtruth,
    size_t search_window_size,
    size_t num_neighbors,
    std::string_view message = ""
) {
    index.set_search_window_size(search_window_size);
    auto results = index.search(queries, num_neighbors);
    double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
    if (!message.empty()) {
        fmt::print("[{}] ", message);
    }
    fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
    return recall;
}

const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
    double diff = std::abs(expected - got);
    if constexpr (DEBUG) {
        fmt::print("Expected {}. Got {}\n", expected, got);
    } else {
        if (diff > eps) {
            throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
        }
    }
}

The function run_recall sets the search window size of the svs::Vamana index performs a search over a given set of queries, and computes the k recall at k where k is the number of returned neighbors.

The function check compares the recall with an expected recall and throws an exception if they differ by too much (this is mainly to allow automated testing of this example).

Finally, we do some argument decoding. @snippet vamana.cpp Argument Extraction

const size_t nargs = args.size();
if (nargs != 4) {
    throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
}
const std::string& data_vecs = args.at(1);
const std::string& query_vecs = args.at(2);
const std::string& groundtruth_vecs = args.at(3);

It is expected that the svs function svs.generate_test_dataset() was used to generate the data, graph, and metdata files.

Building and Index

The first step is to construct an instance of svs::index::vamana::VamanaBuildParameters to describe the hyper-parameters of the graph we wish to construct. Don’t worry too much about selecting the correct values for these hyper-parameters right now. This usually involves a bit of experimentation and is dataset dependent.

Now that we’ve established our parameters, it is time to construct the index.

size_t num_threads = 4;
svs::Vamana index = svs::Vamana::build<float>(
    parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
);

There are several things to note about about this construction.

The type parameter float passed to svs::Vamana::build() indicates the element types of the queries that will be supported by the resulting index.

Using queries with a different element type will result in a run-time error. This is due to limits in type-erasure and dynamic function calls.
The data path is wrapped in a svs::VectorDataLoader with the correct element type. In this case, we’re explicitly using the dynamic sizing for the data. If static dimensionality was desired, than the second value parameter for svs::VectorDataLoader could be used.
An instance of the svs::distance::DistanceL2 functor is passed directly.

Searching the Index

The graph is now built and we can perform queries over the graph. First, we load the queries and the computed ground truth for our example dataset using svs::io::auto_load().

// Load the queries and ground truth.
auto queries = svs::load_data<float>(query_vecs);
auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);

Performing queries is easy. First establish a base-line search window size (svs::Vamana::set_search_window_size()). This provides a parameter by which performance and accuracy can be traded. The larger search_window_size is, the higher the accuracy but the lower the performance. Note that search_window_size must be at least as large as the desired number of neighbors.

index.set_search_window_size(30);
svs::QueryResult<size_t> results = index.search(queries, 10);
double recall = svs::k_recall_at_n(groundtruth, results);
check(0.8215, recall);

We use svs::Vamana::search() to find the 10 approximate nearest neighbors to each query in the form of a svs::QueryResult. Then, we use svs::k_recall_at_n() to compute the 10-recall at 10 of the returned neighbors, checking to confirm the accuracy.

The code snippet below demonstrates how to vary the search window size to change the achieved recall.

auto expected_recall =
    std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
for (auto windowsize : {10, 20, 30, 40}) {
    recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
    check(expected_recall.at(windowsize), recall);
}

Saving the Index

If you are satisfied with the performance of the generated index, you can save it to disk to avoid rebuilding it in the future. This is done with svs::Vamana::save().

index.save("example_config", "example_graph", "example_data");

Reloading a Saved Index

To reload the index from file, use svs::Vamana::load(). This informs the dispatch mechanisms that we’re loading an uncompressed data file from disk.

// We can reload an index from a previously saved set of files.
index = svs::Vamana::assemble<float>(
    "example_config",
    svs::GraphLoader("example_graph"),
    svs::VectorDataLoader<float>("example_data"),
    svs::DistanceType::L2,
    4 // num_threads
);

recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
check(0.8215, recall);

Performing queries is identical to before.

Using Vector Compression

The library has experimental support for performing online vector compression. The second argument to svs::Vamana::load() can be one of the compressed loaders, which will compress an uncompressed dataset on the fly.

Specifying the loader is all that is required to use vector compression. Note that vector compression is usually accompanied by an accuracy loss for the same search window size and may require increasing the window size to compensate. The example below shows a svs::quantization::lvq::OneLevelWithBias quantization scheme with a static dimensionality of 128.

    // Quantization
    size_t padding = 32;
    namespace lvq = svs::quantization::lvq;

    // Wrap the compressor object in a lazy functor.
    // This will defer loading and compression of the LVQ dataset until the threadpool
    // used in the index has been created.
    auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
        auto data = svs::VectorDataLoader<float, 128>("example_data").load();
        return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
    });
    index = svs::Vamana::assemble<float>(
        "example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
    );

    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
    check(0.8215, recall);

Building using Vector Compression

Index building using LVQ is very similar to index building using standard uncompressed vectors, though it may not be supported by all compression techniques.

    // Compressed building
    index =
        svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
    check(0.8212, recall);

Entire Example

This ends the example demonstrating the features of the svs::Vamana index. The entire executable code is shown below. Please reach out with any questions.

//! [Includes]
// SVS Dependencies
#include "svs/orchestrators/vamana.h"  // bulk of the dependencies required.
#include "svs/core/recall.h"           // Convenient k-recall@n computation.
#include "svs/extensions/vamana/lvq.h" // LVQ extensions for the Vamana index.
#include "svs/quantization/lvq/impl/lvq_impl.h" // LVQ implementation.

// Alternative main definition
#include "svsmain.h"

// stl
#include <map>
#include <string>
#include <string_view>
#include <vector>
//! [Includes]

//! [Helper Utilities]
double run_recall(
    svs::Vamana& index,
    const svs::data::SimpleData<float>& queries,
    const svs::data::SimpleData<uint32_t>& groundtruth,
    size_t search_window_size,
    size_t num_neighbors,
    std::string_view message = ""
) {
    index.set_search_window_size(search_window_size);
    auto results = index.search(queries, num_neighbors);
    double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
    if (!message.empty()) {
        fmt::print("[{}] ", message);
    }
    fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
    return recall;
}

const bool DEBUG = false;
void check(double expected, double got, double eps = 0.001) {
    double diff = std::abs(expected - got);
    if constexpr (DEBUG) {
        fmt::print("Expected {}. Got {}\n", expected, got);
    } else {
        if (diff > eps) {
            throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
        }
    }
}
//! [Helper Utilities]

// Alternative main definition
int svs_main(std::vector<std::string> args) {
    //! [Argument Extraction]
    const size_t nargs = args.size();
    if (nargs != 4) {
        throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
    }
    const std::string& data_vecs = args.at(1);
    const std::string& query_vecs = args.at(2);
    const std::string& groundtruth_vecs = args.at(3);
    //! [Argument Extraction]

    // Building the index

    //! [Build Parameters]
    auto parameters = svs::index::vamana::VamanaBuildParameters{
        1.2,  // alpha
        64,   // graph max degree
        128,  // search window size
        1024, // max candidate pool size
        60,   // prune to degree
        true, // full search history
    };
    //! [Build Parameters]

    //! [Index Build]
    size_t num_threads = 4;
    svs::Vamana index = svs::Vamana::build<float>(
        parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
    );
    //! [Index Build]

    // Searching the index

    //! [Load Aux]
    // Load the queries and ground truth.
    auto queries = svs::load_data<float>(query_vecs);
    auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);
    //! [Load Aux]

    //! [Perform Queries]
    index.set_search_window_size(30);
    svs::QueryResult<size_t> results = index.search(queries, 10);
    double recall = svs::k_recall_at_n(groundtruth, results);
    check(0.8215, recall);
    //! [Perform Queries]

    //! [Search Window Size]
    auto expected_recall =
        std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
    for (auto windowsize : {10, 20, 30, 40}) {
        recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
        check(expected_recall.at(windowsize), recall);
    }
    //! [Search Window Size]

    // Saving the index

    //! [Saving]
    index.save("example_config", "example_graph", "example_data");
    //! [Saving]

    // Reloading a saved index

    //! [Loading]
    // We can reload an index from a previously saved set of files.
    index = svs::Vamana::assemble<float>(
        "example_config",
        svs::GraphLoader("example_graph"),
        svs::VectorDataLoader<float>("example_data"),
        svs::DistanceType::L2,
        4 // num_threads
    );

    recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
    check(0.8215, recall);
    //! [Loading]

    // Search using vector compression

    //! [Compressed Loader]
    // Quantization
    size_t padding = 32;
    namespace lvq = svs::quantization::lvq;

    // Wrap the compressor object in a lazy functor.
    // This will defer loading and compression of the LVQ dataset until the threadpool
    // used in the index has been created.
    auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
        auto data = svs::VectorDataLoader<float, 128>("example_data").load();
        return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
    });
    index = svs::Vamana::assemble<float>(
        "example_config", svs::GraphLoader("example_graph"), compressor, svs::DistanceL2()
    );
    //! [Compressed Loader]

    //! [Search Compressed]
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Load");
    check(0.8215, recall);
    //! [Search Compressed]

    //! [Build Index Compressed]
    // Compressed building
    index =
        svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
    recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
    check(0.8212, recall);
    //! [Build Index Compressed]

    //! [Only Loading]
    // We can reload an index from a previously saved set of files.
    index = svs::Vamana::assemble<float>(
        "example_config",
        svs::GraphLoader("example_graph"),
        svs::VectorDataLoader<float>("example_data"),
        svs::DistanceType::L2,
        4 // num_threads
    );
    //! [Only Loading]

    //! [Set n-threads]
    index.set_num_threads(4);
    //! [Set n-threads]

    return 0;
}

// Special main providing some helpful utilties.
SVS_DEFINE_MAIN();