Static Dimensionality
SVS lets you set the dimensionality either at compile time (static) or at runtime (dynamic). Because the dimensionality of a dataset is fixed, setting it statically has no drawbacks, and it improves performance by letting the compiler unroll the loops in the similarity function kernel more extensively. For example, for the standard 96-dimensional dataset Deep, we observe up to a 32% speedup when searching 100 million points with static rather than dynamic dimensionality [ABHT23].
Uncompressed data
In Python, to add support to the svs module for static dimensionality for the Vamana graph index:
Define the desired dimensionality specialization in the vamana.h file by adding the corresponding line to the for_standard_specializations template, indicating the desired query data type, vector data type, and dimensionality (see supported data types). For example, to add static dimensionality support for the 96-dimensional dataset Deep, for float32-valued queries and base vectors, add the following line:
XN(float, float, 96);
Or, if you also want to enable graph building directly from a NumPy array, use the following line instead:
X(float, float, 96, EnableBuild::FromFileAndArray);
For the Dynamic and Flat indices, follow the same procedure with the dynamic_vamana.h and flat.cpp files, respectively.
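To show how a list of specialization lines can turn into runtime dispatch, here is a simplified, self-contained sketch of the general pattern. It is an assumption about the technique, not a copy of the SVS bindings: FOR_SPECIALIZATIONS, describe_specialization, and dispatch are hypothetical names. Each X(...) line instantiates one combination of query type, data type, and dimensionality, and a runtime dimensionality is matched against the compiled-in values, with a dynamic fallback when none matches.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for code specialized on a compile-time dimensionality.
template <typename Q, typename T, std::size_t N>
std::string describe_specialization() {
    return "static dims = " + std::to_string(N);
}

// Hypothetical specialization list: adding a line here compiles in support
// for one more (query type, data type, dimensionality) combination.
#define FOR_SPECIALIZATIONS(X) \
    X(float, float, 96)        \
    X(float, float, 128)

// Pick the static specialization whose dimensionality matches `dims`,
// or fall back to the dynamic path when no specialization was compiled in.
inline std::string dispatch(std::size_t dims) {
#define X(Q, T, N) \
    if (dims == (N)) { return describe_specialization<Q, T, N>(); }
    FOR_SPECIALIZATIONS(X)
#undef X
    return "dynamic dims = " + std::to_string(dims);
}
```

In this sketch, dispatch(96) takes the static path, while dispatch(100) falls back to the dynamic one because no matching line exists in the list.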
In C++
When building or loading an index, the Extent template argument of svs::VectorDataLoader needs to be set to the specified dimensionality.
svs::VectorDataLoader<float, 96>("data_f32.svs")
LVQ compressed data
In Python, to add support for static dimensionality for the Vamana graph index when using LVQ compression:
Define the desired dimensionality specialization in the vamana.h file by adding the corresponding line to the lvq_specialize_B1xB2 template, where B1 and B2 are the number of bits in the primary and secondary LVQ levels. Indicate the desired distance type (Euclidean distance and inner product are currently supported), dimensionality, implementation strategy (Turbo or Sequential), and whether graph building with compressed vectors is to be enabled for that setting. For example, to add static dimensionality support for a 512-dimensional dataset, with LVQ using 4 and 8 bits in the primary and secondary levels respectively, using Turbo, for inner product, with graph building enabled, add the following line to the lvq_specialize_4x8 template:
X(DistanceIP, 4, 8, 512, Turbo, true);
Add the corresponding template to the compressed_specializations template in the same file.
For the DynamicVamana graph index:
Define the desired dimensionality specialization in the dynamic_vamana.h file by adding the corresponding line to the for_compressed_specializations template, indicating the desired distance type (Euclidean distance and inner product are currently supported), the number of bits in the primary and secondary LVQ levels, the implementation strategy (Turbo or Sequential), and the dimensionality. For example, to add static dimensionality support for a 512-dimensional dataset, with LVQ using 4 and 8 bits in the primary and secondary levels respectively, using Turbo, for inner product, add the following line:
X(DistanceIP, 4, 8, Turbo, 512);
For the Flat index, follow the same procedure with the flat.cpp file.