Static Dimensionality

SVS lets you set the dimensionality either at compile time (static) or at runtime (dynamic). When the dimensionality of a dataset is fixed, setting it statically has no drawbacks and improves performance, because the compiler can unroll the loops in the similarity function kernel more extensively. For example, on the standard 96-dimensional dataset Deep, we observe up to a 32% speedup when searching 100 million points with static rather than dynamic dimensionality [ABHT23].
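To make the unrolling argument concrete, here is a minimal, self-contained sketch of a distance kernel whose dimensionality can be fixed through a template parameter. It is not SVS's kernel: the names kDynamic and squared_l2 are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative sentinel meaning "dimensionality is only known at runtime".
inline constexpr std::size_t kDynamic = std::numeric_limits<std::size_t>::max();

// Squared Euclidean distance between two vectors.
// With a real N, the loop bound is a compile-time constant, so the compiler
// may fully unroll and vectorize the loop. With N == kDynamic, the bound is
// only known at runtime and the loop stays generic.
template <std::size_t N = kDynamic>
float squared_l2(const float* a, const float* b, std::size_t runtime_dim) {
    // A static extent must agree with the dimensionality seen at runtime.
    assert(N == kDynamic || runtime_dim == N);
    const std::size_t dim = (N == kDynamic) ? runtime_dim : N;
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}
```

Calling squared_l2<96>(x, y, 96) selects the statically sized loop, while squared_l2<>(x, y, dim) falls back to the runtime bound. Both return the same value; only the code the compiler is able to generate differs.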

Uncompressed data

In Python, to add support to the svs module for static dimensionality for the Vamana graph index:

  1. Define the desired dimensionality specialization in the vamana.h file by adding the corresponding line to the for_standard_specializations template indicating the desired query data type, vector data type and dimensionality (see supported data types).

    For example, to add static dimensionality support for the 96-dimensional dataset Deep, for float32-valued queries and base vectors, add the following line:

XN(float,   float, 96);

Or use the following if you also want to enable graph building directly from a NumPy array:

X (float,   float, 96, EnableBuild::FromFileAndArray);
  2. Install svs.

For the Dynamic and Flat indices, follow the same procedure with the dynamic_vamana.h and flat.cpp files, respectively.
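One way to read the procedure above: Python only learns a dataset's dimensionality at runtime, so the module can use a static code path only for dimensionalities that were compiled in ahead of time. The sketch below illustrates that idea with a generic X-macro list and a runtime dispatch that falls back to the dynamic path. It is not SVS's dispatch code; FOR_STATIC_DIMS, build_index and dispatch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Illustrative sentinel meaning "dimensionality is only known at runtime".
inline constexpr std::size_t kDynamic = std::numeric_limits<std::size_t>::max();

// Stand-in for an index implementation compiled for one extent.
// It only reports which code path was selected.
template <std::size_t Extent>
std::string build_index(std::size_t runtime_dim) {
    if constexpr (Extent == kDynamic) {
        return "dynamic(" + std::to_string(runtime_dim) + ")";
    } else {
        assert(runtime_dim == Extent);
        return "static<" + std::to_string(Extent) + ">";
    }
}

// Hypothetical specialization list. Adding a line here plays the role of
// adding a specialization line to the header and rebuilding the module.
#define FOR_STATIC_DIMS(X) \
    X(96)                  \
    X(128)

// Pick the precompiled static specialization when one matches the runtime
// dimensionality; otherwise use the dynamic implementation.
std::string dispatch(std::size_t runtime_dim) {
    switch (runtime_dim) {
#define X(N) case N: return build_index<N>(runtime_dim);
        FOR_STATIC_DIMS(X)
#undef X
        default: return build_index<kDynamic>(runtime_dim);
    }
}
```

In this sketch, a 96-dimensional dataset is routed to the static specialization, while a dimensionality missing from the list, such as 100, still works but takes the dynamic path. That is why a new specialization only takes effect after the module is reinstalled.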

In C++

When building or loading an index, set the Extent template argument of svs::VectorDataLoader to the desired dimensionality. For example, for 96-dimensional float32 data:

svs::VectorDataLoader<float, 96>("data_f32.svs")

LVQ compressed data

In Python, to add support for static dimensionality for the Vamana graph index when using LVQ compression:

  1. Define the desired dimensionality specialization in the vamana.h file by adding the corresponding line to the lvq_specialize_B1xB2 template, where B1 and B2 are the number of bits in the primary and secondary LVQ levels. Indicate the desired distance type (Euclidean distance and inner product are currently supported), dimensionality, implementation strategy (Turbo or Sequential), and whether graph building with compressed vectors is to be enabled for that setting.

    For example, to add static dimensionality support for a 512-dimensional dataset, with LVQ using 4 and 8 bits in the primary and secondary levels respectively, using Turbo, for inner product, with graph building enabled, add the following line to the lvq_specialize_4x8 template:

X(DistanceIP, 4, 8, 512, Turbo, true);
  2. Add the corresponding template to the compressed_specializations template in the same file.

  3. Install svs.

For the DynamicVamana graph index:

  1. Define the desired dimensionality specialization in the dynamic_vamana.h file by adding the corresponding line to the for_compressed_specializations template, indicating the desired distance type (Euclidean distance and inner product are currently supported), the number of bits in the primary and secondary LVQ levels, the implementation strategy (Turbo or Sequential), and the dimensionality.

    For example, to add static dimensionality support for a 512-dimensional dataset, with LVQ using 4 and 8 bits in the primary and secondary levels respectively, using Turbo, for inner product, add the following line:

X(DistanceIP, 4, 8, Turbo, 512);
  2. Install svs.

For the Flat index, follow the same procedure with the flat.cpp file.