Getting started with Python

This tutorial shows how to install SVS and run your first search with it. Separate tutorials cover dynamic indexing, setting index and search parameters, using vector compression, and more advanced installation options.

Installation

Building and installing SVS should be relatively straightforward. We test on Ubuntu 22.04 LTS, but any Linux distribution should work.

Prerequisites

  • Python >= 3.9

  • A C++20 capable compiler:

    • GCC >= 11.0

    • Clang >= 13.0

  • OneMKL

    • Make sure you set the OneMKL environment variables, e.g., source /opt/intel/oneapi/setvars.sh.

    OneMKL can be installed as part of the Intel oneAPI Base Toolkit by following one of the methods indicated in the oneAPI docs.

    For example, the following commands show how to install the OneMKL component of the Intel oneAPI Base Toolkit on a Linux system using the offline installer:

    wget [link to the offline installer]
    sudo sh [downloaded installer script] -a --components intel.oneapi.lin.mkl.devel --action install --eula accept -s
    source /opt/intel/oneapi/setvars.sh
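
Before building, it can help to check the prerequisites listed above from Python. The following is a minimal sketch, not part of SVS: the helper name check_prerequisites is hypothetical, it only checks that a compiler executable is on the PATH (not its version), and it assumes that sourcing setvars.sh exports the MKLROOT environment variable.

```python
import os
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        # The tutorial requires Python >= 3.9.
        "python>=3.9": sys.version_info >= (3, 9),
        # Only checks that a C++ compiler is on PATH, not that it supports C++20.
        "compiler on PATH": any(
            shutil.which(name) is not None for name in ("g++", "clang++")
        ),
        # Assumption: setvars.sh exports MKLROOT when OneMKL is installed.
        "MKLROOT set": "MKLROOT" in os.environ,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```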
    

Building and installing

To build and install the SVS Python module, clone the repo and run the following pip install command.

# Clone the repository
git clone https://github.com/intel/ScalableVectorSearch
cd ScalableVectorSearch

# Install svs using pip
pip install bindings/python

If you encounter any issues with the pip install command, please follow the advanced installation instructions.

Verifying the installation

Run the following command to verify that SVS was successfully installed. It should print ['native'].

python3 -c "import svs; print(svs.available_backends())"
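
The same check can be done from a script without raising an ImportError when the module is missing. This is a small sketch using only the standard library and the svs.available_backends() call shown above; the helper name svs_backends is hypothetical.

```python
import importlib
import importlib.util


def svs_backends():
    """Return svs.available_backends(), or None if svs is not installed."""
    if importlib.util.find_spec("svs") is None:
        return None
    svs = importlib.import_module("svs")
    return svs.available_backends()


if __name__ == "__main__":
    backends = svs_backends()
    if backends is None:
        print("svs is not installed in this environment")
    else:
        print(f"svs backends: {backends}")
```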

SVS search example

In this tutorial we will showcase the most important features of SVS. The full example is available at the end of this tutorial. You can run it with the following commands:

cd examples/python
python3 example_vamana.py

Generating test data

We generate a sample dataset using the svs.generate_test_dataset() function. It generates a data file, a query file, and the ground truth. Note that the data is randomly generated, so its elements carry no semantic meaning.

We first import svs and the os module, both of which are required for this example.

import os
import svs

Then proceed to generate the test dataset.

# Create a test dataset.
# This will create a directory "example_data_vamana" and populate it with three
# entries:
# - data.fvecs: The test dataset.
# - queries.fvecs: The test queries.
# - groundtruth.ivecs: The groundtruth.
test_data_dir = "./example_data_vamana"
svs.generate_test_dataset(
    10000,                      # Create 10000 vectors in the dataset.
    1000,                       # Generate 1000 query vectors.
    128,                        # Set the vector dimensionality to 128.
    test_data_dir,              # The directory where results will be generated.
    data_seed = 1234,           # Random number seed for reproducibility.
    query_seed = 5678,          # Random number seed for reproducibility.
    num_threads = 4,            # Number of threads to use.
    distance = svs.DistanceType.L2,   # The distance type to use.
)
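
For readers unfamiliar with the fvecs extension, the sketch below shows the layout commonly used for such files: each vector is stored as a little-endian 32-bit integer giving its dimensionality, followed by that many 32-bit floats. We assume here that the generated files follow this convention; the helpers write_fvecs and read_fvecs are hypothetical and only for illustration. In practice, use svs.read_vecs as shown later in this tutorial.

```python
import os
import struct
import tempfile


def write_fvecs(path, vectors):
    """Write vectors as [int32 dim][dim x float32] records (little-endian)."""
    with open(path, "wb") as f:
        for vector in vectors:
            f.write(struct.pack("<i", len(vector)))
            f.write(struct.pack(f"<{len(vector)}f", *vector))


def read_fvecs(path):
    """Read all vectors from a file written in the layout above."""
    vectors = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # End of file.
            (dim,) = struct.unpack("<i", header)
            vectors.append(list(struct.unpack(f"<{dim}f", f.read(4 * dim))))
    return vectors


# Round-trip a tiny dataset; the values are exactly representable in float32.
sample = [[0.5, 1.0, -2.25], [3.0, 4.5, 8.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.fvecs")
    write_fvecs(path, sample)
    restored = read_fvecs(path)
```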

Building the index

Now that data has been generated, we need to construct an index over that data. The index is a graph connecting related data vectors in such a way that searching for nearest neighbors yields good results. The first step is to define the hyper-parameters of the graph we wish to construct. Don’t worry too much about selecting the correct values for these hyper-parameters right now. This usually involves a bit of experimentation and is dataset dependent. See How to Choose Graph Building Hyper-parameters for details.

This is done by creating an instance of svs.VamanaBuildParameters.

# Now, we can build a graph index over the data set.
parameters = svs.VamanaBuildParameters(
    graph_max_degree = 64,
    window_size = 128,
)

Now that we’ve established our hyper-parameters, it is time to construct the index. Passing dims to svs.VectorDataLoader is optional, but may yield performance benefits if given.

We can build the index directly from the dataset file on disk:

# Build the index.
index = svs.Vamana.build(
    parameters,
    svs.VectorDataLoader(
        os.path.join(test_data_dir, "data.fvecs"), svs.DataType.float32
    ),
    svs.DistanceType.L2,
    num_threads = 4,
)

or from a NumPy array:

# Build the index.
data = svs.read_vecs(os.path.join(test_data_dir, "data.fvecs"))
index = svs.Vamana.build(
    parameters,
    data,
    svs.DistanceType.L2,
    num_threads = 4,
)

Note the use of svs.VectorDataLoader to indicate both the file path and the data type of the fvecs file on disk (see I/O and Conversion Tools for supported file formats). See svs.Vamana.build for details about the build function.

Note

svs.Vamana.build supports building from NumPy arrays with dtypes float32, float16, int8 and uint8.

Searching the index

The graph is now built and we can perform queries over the graph. First, we load the queries for our example dataset. After searching, we compare the search results with ground truth results which we also load from the dataset.

# Load the queries and ground truth.
queries = svs.read_vecs(os.path.join(test_data_dir, "queries.fvecs"))
groundtruth = svs.read_vecs(os.path.join(test_data_dir, "groundtruth.ivecs"))

Performing queries is easy. First, establish a baseline search window size. This parameter trades search performance against accuracy: the larger search_window_size is, the higher the accuracy but the lower the performance. Note that search_window_size must be at least as large as the desired number of neighbors. See How to Set the Search Window Size for details.
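
To give an intuition for why a larger window improves accuracy, here is a toy best-first graph search that keeps at most window_size candidates. It is a simplified sketch and not the SVS implementation: the function greedy_search, the random graph, and the data are all made up for illustration.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(graph, vectors, query, entry, window_size, k):
    """Best-first search keeping at most `window_size` candidates."""
    if window_size < k:
        raise ValueError("window_size must be at least k")
    visited = {entry}
    expanded = set()
    window = [(squared_l2(vectors[entry], query), entry)]
    while True:
        # Pick the closest candidate whose neighbors were not explored yet.
        current = next((i for _, i in window if i not in expanded), None)
        if current is None:
            break
        expanded.add(current)
        for neighbor in graph[current]:
            if neighbor not in visited:
                visited.add(neighbor)
                window.append((squared_l2(vectors[neighbor], query), neighbor))
        # Keep only the `window_size` closest candidates.
        window.sort()
        del window[window_size:]
    return [i for _, i in window[:k]]


# Toy data: 200 random vectors and a random graph. The ring edge (i -> i + 1)
# keeps every node reachable from the entry point.
rng = random.Random(0)
n, dim = 200, 8
vectors = [[rng.random() for _ in range(dim)] for _ in range(n)]
graph = {i: [(i + 1) % n] + rng.sample(range(n), 3) for i in range(n)}
query = [rng.random() for _ in range(dim)]

exact = sorted(range(n), key=lambda i: (squared_l2(vectors[i], query), i))[:10]
narrow = greedy_search(graph, vectors, query, entry=0, window_size=10, k=10)
full_window = greedy_search(graph, vectors, query, entry=0, window_size=n, k=10)
```

With a window as large as the dataset, nothing is ever discarded and the search degenerates into an exhaustive scan, so full_window matches the exact neighbors. A narrow window explores far fewer nodes and may miss some of them.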

We use the search function to find the 10 approximate nearest neighbors to each query. Then, we compute the 10-recall at 10 of the returned neighbors to confirm the accuracy.

# Set the search window size of the index and perform queries.
index.search_window_size = 30
I, D = index.search(queries, 10)

# Compare with the groundtruth.
recall = svs.k_recall_at(groundtruth, I, 10, 10)
print(f"Recall = {recall}")
assert(recall == 0.8288)

See svs.Vamana.search for details about the search function.
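
As a rough illustration of the metric, k-recall at n can be read as the fraction of the k true nearest neighbors that appear among the first n returned neighbors, averaged over all queries. The pure-Python sketch below follows that reading; the function k_recall_at_sketch is hypothetical, and its argument order is only assumed to mirror the svs.k_recall_at call above.

```python
def k_recall_at_sketch(groundtruth, results, k, n):
    """Average fraction of the k true neighbors found in the first n results."""
    total = 0.0
    for truth_row, result_row in zip(groundtruth, results):
        truth = set(truth_row[:k])
        found = truth.intersection(result_row[:n])
        total += len(found) / k
    return total / len(groundtruth)


# Two queries with k = n = 2: the first finds 1 of 2 true neighbors,
# the second finds both, so the average is (0.5 + 1.0) / 2.
example_groundtruth = [[1, 2], [3, 4]]
example_results = [[1, 9], [3, 4]]
example_recall = k_recall_at_sketch(example_groundtruth, example_results, 2, 2)
```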

Saving the index

If you are satisfied with the performance of the generated index, you can save it to disk to avoid rebuilding it in the future.

# Finally, we can save the results.
index.save(
    os.path.join(test_data_dir, "example_config"),
    os.path.join(test_data_dir, "example_graph"),
    os.path.join(test_data_dir, "example_data"),
)

See svs.Vamana.save() for details about the save function.

Note

The save index function currently uses three folders for saving. All three are needed to be able to reload the index.

  • One folder for the graph.

  • One folder for the data.

  • One folder for metadata.

This is subject to change in the future.
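
Since all three folders are required, a quick check before reloading can give a clearer error than a failed load. The helper missing_index_dirs below is a hypothetical sketch using only the standard library; the folder names match the ones used in this tutorial.

```python
import os
import tempfile


def missing_index_dirs(config_dir, graph_dir, data_dir):
    """Return those of the three index folders that do not exist."""
    return [d for d in (config_dir, graph_dir, data_dir) if not os.path.isdir(d)]


# Demonstration in a temporary directory where only two folders were created.
with tempfile.TemporaryDirectory() as base:
    paths = [
        os.path.join(base, name)
        for name in ("example_config", "example_graph", "example_data")
    ]
    os.mkdir(paths[0])
    os.mkdir(paths[1])
    missing = [os.path.basename(d) for d in missing_index_dirs(*paths)]
```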

Reloading a saved index

To reload the index from file, use the corresponding constructor with the three folder names used to save the index. Performing queries is identical to before.

# We can reload an index from a previously saved set of files.
index = svs.Vamana(
    os.path.join(test_data_dir, "example_config"),
    svs.GraphLoader(os.path.join(test_data_dir, "example_graph")),
    svs.VectorDataLoader(
        os.path.join(test_data_dir, "example_data"), svs.DataType.float32
    ),
    svs.DistanceType.L2,
    num_threads = 4,
)

# We can rerun the queries to ensure everything works properly.
index.search_window_size = 30
I, D = index.search(queries, 10)

# Compare with the groundtruth.
recall = svs.k_recall_at(groundtruth, I, 10, 10)
print(f"Recall = {recall}")
assert(recall == 0.8288)

Note that the third argument, the one corresponding to the saved data, requires an svs.VectorDataLoader and the corresponding data type.

Search using vector compression

Note

The open-source SVS library supports all functionalities and features described in this documentation, except for our proprietary vector compression techniques, specifically LVQ [ABHT23] and LeanVec [TBAH24]. These techniques are closed-source and supported exclusively on Intel hardware. We provide a shared library and a PyPI package to enable these vector compression techniques in C++ and Python, respectively.

Vector compression can be used to speed up the search. It can be done on the fly by loading the index with an LVQLoader (details for Python) or by loading an index with a previously compressed dataset.

See How to Choose Compression Parameters for details on setting the compression parameters.

First, specify the compression loader. Specifying dims in svs.VectorDataLoader is optional and can boost performance considerably (see for details on how to enable this functionality).

data_loader = svs.VectorDataLoader(
    os.path.join(test_data_dir, "example_data"),  # Uncompressed data
    svs.DataType.float32,
    dims = 128    # Passing dimensionality is optional
)
B1 = 4    # Number of bits for the first level LVQ quantization
B2 = 8    # Number of bits for the residuals quantization
padding = 32
strategy = svs.LVQStrategy.Turbo
compressed_loader = svs.LVQLoader(data_loader,
    primary=B1,
    residual=B2,
    strategy=strategy, # Passing the strategy is optional.
    padding=padding # Passing padding is optional.
)
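
To illustrate the roles of the two bit widths, the sketch below applies a generic two-level uniform scalar quantization to a single vector: a coarse first level, then a finer quantization of the residual error. This is not LVQ itself, which is closed-source as noted above; the functions quantize and two_level are hypothetical and only show why adding residual bits reduces the reconstruction error.

```python
import random


def quantize(values, bits):
    """Uniformly quantize values to 2**bits levels between their min and max."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    # Return the reconstructed (decoded) values.
    return [lo + c * step for c in codes]


def two_level(values, primary_bits, residual_bits):
    """Return the first-level reconstruction and the residual-refined one."""
    first = quantize(values, primary_bits)
    residuals = [v - d for v, d in zip(values, first)]
    decoded_residuals = quantize(residuals, residual_bits)
    refined = [d + r for d, r in zip(first, decoded_residuals)]
    return first, refined


def max_error(original, reconstructed):
    return max(abs(x - y) for x, y in zip(original, reconstructed))


rng = random.Random(0)
vector = [rng.uniform(-1.0, 1.0) for _ in range(128)]
first, refined = two_level(vector, primary_bits=4, residual_bits=8)
error_primary = max_error(vector, first)
error_refined = max_error(vector, refined)
```

The first level alone already gives a compact approximation; the residual level shrinks the remaining error at the cost of additional bits per dimension.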

Then load the index and run the search as usual.

index = svs.Vamana(
    os.path.join(test_data_dir, "example_config"),
    svs.GraphLoader(os.path.join(test_data_dir, "example_graph")),
    compressed_loader,
    # Optional keyword arguments
    distance = svs.DistanceType.L2,
    num_threads = 4
)

# Compare with the groundtruth.
index.search_window_size = 30
I, D = index.search(queries, 10)
recall = svs.k_recall_at(groundtruth, I, 10, 10)
print(f"Compressed recall: {recall}")
assert(recall == 0.8223)

Note

Vector compression is usually accompanied by an accuracy loss for the same search window size and may require increasing the window size to compensate.
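
One way to compensate is to pick the smallest window size that reaches a target recall. The sketch below uses a hypothetical helper, smallest_window_for_recall, that accepts any callable mapping a window size to a recall. Here the callable is a lookup into the recall values listed in the full example of this tutorial; with a real index it could instead set index.search_window_size, run index.search, and return svs.k_recall_at.

```python
def smallest_window_for_recall(evaluate, target, window_sizes):
    """Return the smallest window size whose recall reaches target, or None."""
    for window_size in sorted(window_sizes):
        if evaluate(window_size) >= target:
            return window_size
    return None


# Recall values from the full example for window sizes 10 to 40.
float_recall = {10: 0.5664, 20: 0.7397, 30: 0.8288, 40: 0.8837}
compressed_recall = {10: 0.5482, 20: 0.7294, 30: 0.8223, 40: 0.8756}

target = 0.825
window_float = smallest_window_for_recall(float_recall.get, target, float_recall)
window_compressed = smallest_window_for_recall(
    compressed_recall.get, target, compressed_recall
)
```

For this target, the uncompressed index needs a window size of 30, while the compressed one needs 40.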

Saving an index with compressed vectors

SVS supports saving and loading indices with a previously compressed dataset. The saving and loading procedures are the same as with uncompressed vectors.

Entire example

This ends the example demonstrating the features of the Vamana index. The entire executable code is shown below. Please reach out with any questions.

# Import `unittest` to allow for automated testing.
import unittest

# [imports]
import os
import svs
# [imports]

DEBUG_MODE = False
def assert_equal(lhs, rhs, message: str = ""):
    if DEBUG_MODE:
        print(f"{message}: {lhs} == {rhs}")
    else:
        assert lhs == rhs, message

def run_test_float(index, queries, groundtruth):
    expected = {
        10: 0.5664,
        20: 0.7397,
        30: 0.8288,
        40: 0.8837,
    }

    for window_size in range(10, 50, 10):
        index.search_window_size = window_size
        I, D = index.search(queries, 10)
        recall = svs.k_recall_at(groundtruth, I, 10, 10)
        assert_equal(
            recall, expected[window_size], f"Standard Search Check ({window_size})"
        )

def run_test_two_level4_8(index, queries, groundtruth):
    expected = {
        10: 0.5482,
        20: 0.7294,
        30: 0.8223,
        40: 0.8756,
    }

    for window_size in range(10, 50, 10):
        index.search_window_size = window_size
        I, D = index.search(queries, 10)
        recall = svs.k_recall_at(groundtruth, I, 10, 10)
        assert_equal(
            recall, expected[window_size], f"Compressed Search Check ({window_size})"
        )

def run_test_build_two_level4_8(index, queries, groundtruth):
    expected = {
        10: 0.5484,
        20: 0.7295,
        30: 0.8221,
        40: 0.8758,
    }

    for window_size in range(10, 50, 10):
        index.search_window_size = window_size
        I, D = index.search(queries, 10)
        recall = svs.k_recall_at(groundtruth, I, 10, 10)
        assert_equal(
            recall, expected[window_size], f"Compressed Search Check ({window_size})"
        )

# Shadow this as a global to make it available to the test-case clean-up.
test_data_dir = None

def run():

    # ###
    # Generating test data
    # ###

    # [generate-dataset]
    # Create a test dataset.
    # This will create a directory "example_data_vamana" and populate it with three
    # entries:
    # - data.fvecs: The test dataset.
    # - queries.fvecs: The test queries.
    # - groundtruth.ivecs: The groundtruth.
    test_data_dir = "./example_data_vamana"
    svs.generate_test_dataset(
        10000,                      # Create 10000 vectors in the dataset.
        1000,                       # Generate 1000 query vectors.
        128,                        # Set the vector dimensionality to 128.
        test_data_dir,              # The directory where results will be generated.
        data_seed = 1234,           # Random number seed for reproducibility.
        query_seed = 5678,          # Random number seed for reproducibility.
        num_threads = 4,            # Number of threads to use.
        distance = svs.DistanceType.L2,   # The distance type to use.
    )
    # [generate-dataset]


    # ###
    # Building the index
    # ###

    # [build-parameters]
    # Now, we can build a graph index over the data set.
    parameters = svs.VamanaBuildParameters(
        graph_max_degree = 64,
        window_size = 128,
    )
    # [build-parameters]

    # [build-index]
    # Build the index.
    index = svs.Vamana.build(
        parameters,
        svs.VectorDataLoader(
            os.path.join(test_data_dir, "data.fvecs"), svs.DataType.float32
        ),
        svs.DistanceType.L2,
        num_threads = 4,
    )
    # [build-index]

    # [build-index-fromNumpyArray]
    # Build the index.
    data = svs.read_vecs(os.path.join(test_data_dir, "data.fvecs"))
    index = svs.Vamana.build(
        parameters,
        data,
        svs.DistanceType.L2,
        num_threads = 4,
    )
    # [build-index-fromNumpyArray]


    # ###
    # Searching the index
    # ###

    # [load-aux]
    # Load the queries and ground truth.
    queries = svs.read_vecs(os.path.join(test_data_dir, "queries.fvecs"))
    groundtruth = svs.read_vecs(os.path.join(test_data_dir, "groundtruth.ivecs"))
    # [load-aux]

    # [perform-queries]
    # Set the search window size of the index and perform queries.
    index.search_window_size = 30
    I, D = index.search(queries, 10)

    # Compare with the groundtruth.
    recall = svs.k_recall_at(groundtruth, I, 10, 10)
    print(f"Recall = {recall}")
    assert(recall == 0.8288)
    # [perform-queries]

    # [search-window-size]
    # We can vary the search window size to demonstrate the trade off in accuracy.
    for window_size in range(10, 50, 10):
        index.search_window_size = window_size
        I, D = index.search(queries, 10)
        recall = svs.k_recall_at(groundtruth, I, 10, 10)
        print(f"Window size = {window_size}, Recall = {recall}")
    # [search-window-size]

    ##### Begin Test
    run_test_float(index, queries, groundtruth)
    ##### End Test


    # ###
    # Saving the index
    # ###

    # [saving-results]
    # Finally, we can save the results.
    index.save(
        os.path.join(test_data_dir, "example_config"),
        os.path.join(test_data_dir, "example_graph"),
        os.path.join(test_data_dir, "example_data"),
    )
    # [saving-results]


    # ###
    # Reloading a saved index
    # ###

    # [loading]
    # We can reload an index from a previously saved set of files.
    index = svs.Vamana(
        os.path.join(test_data_dir, "example_config"),
        svs.GraphLoader(os.path.join(test_data_dir, "example_graph")),
        svs.VectorDataLoader(
            os.path.join(test_data_dir, "example_data"), svs.DataType.float32
        ),
        svs.DistanceType.L2,
        num_threads = 4,
    )

    # We can rerun the queries to ensure everything works properly.
    index.search_window_size = 30
    I, D = index.search(queries, 10)

    # Compare with the groundtruth.
    recall = svs.k_recall_at(groundtruth, I, 10, 10)
    print(f"Recall = {recall}")
    assert(recall == 0.8288)
    # [loading]

    ##### Begin Test
    run_test_float(index, queries, groundtruth)
    ##### End Test

    # [only-loading]
    # We can reload an index from a previously saved set of files.
    index = svs.Vamana(
        os.path.join(test_data_dir, "example_config"),
        svs.GraphLoader(os.path.join(test_data_dir, "example_graph")),
        svs.VectorDataLoader(
            os.path.join(test_data_dir, "example_data"), svs.DataType.float32
        ),
        svs.DistanceType.L2,
        num_threads = 4,
    )
    # [only-loading]

    # [runtime-nthreads]
    index.num_threads = 4
    # [runtime-nthreads]


    # ###
    # Search using vector compression
    # ###

    # [search-compressed-loader]
    data_loader = svs.VectorDataLoader(
        os.path.join(test_data_dir, "example_data"),  # Uncompressed data
        svs.DataType.float32,
        dims = 128    # Passing dimensionality is optional
    )
    B1 = 4    # Number of bits for the first level LVQ quantization
    B2 = 8    # Number of bits for the residuals quantization
    padding = 32
    strategy = svs.LVQStrategy.Turbo
    compressed_loader = svs.LVQLoader(data_loader,
        primary=B1,
        residual=B2,
        strategy=strategy, # Passing the strategy is optional.
        padding=padding # Passing padding is optional.
    )
    # [search-compressed-loader]

    # [search-compressed]
    index = svs.Vamana(
        os.path.join(test_data_dir, "example_config"),
        svs.GraphLoader(os.path.join(test_data_dir, "example_graph")),
        compressed_loader,
        # Optional keyword arguments
        distance = svs.DistanceType.L2,
        num_threads = 4
    )

    # Compare with the groundtruth.
    index.search_window_size = 30
    I, D = index.search(queries, 10)
    recall = svs.k_recall_at(groundtruth, I, 10, 10)
    print(f"Compressed recall: {recall}")
    assert(recall == 0.8223)
    # [search-compressed]

    ##### Begin Test
    run_test_two_level4_8(index, queries, groundtruth)
    ##### End Test

    # [build-index-compressed]
    # Build the index.
    index = svs.Vamana.build(
        parameters,
        compressed_loader,
        svs.DistanceType.L2,
        num_threads = 4
    )
    # [build-index-compressed]

    # 1. Building Uncompressed
    # 2. Loading Uncompressed
    # 3. Loading with a recompressor

    # We can rerun the queries to ensure everything works properly.
    index.search_window_size = 30
    I, D = index.search(queries, 10)

    # Compare with the groundtruth.
    recall = svs.k_recall_at(groundtruth, I, 10, 10)
    print(f"Recall = {recall}")
    assert(recall == 0.8221)
    # [loading]

    ##### Begin Test
    run_test_build_two_level4_8(index, queries, groundtruth)
    ##### End Test

#####
##### Main Executable
#####

if __name__ == "__main__":
    run()

#####
##### As a unit test.
#####

class VamanaExampleTestCase(unittest.TestCase):
    def tearDown(self):
        if test_data_dir is not None:
            print(f"Removing temporary directory {test_data_dir}")
            # The directory is not empty, so `os.rmdir` would fail here.
            import shutil
            shutil.rmtree(test_data_dir)

    def test_all(self):
        run()