Common Python API
Memory Allocators
- class svs.DRAM
Small class for an allocator capable of using huge pages. Prioritizes page use in the order: 1 GiB, 2 MiB, 4 KiB. See Huge Pages for more information on what huge pages are and how to allocate them on your system.
Enums
- class svs.DistanceType
Select which distance function to use
Members:
L2 : Euclidean Distance (minimize)
MIP : Maximum Inner Product (maximize)
Cosine : Cosine similarity (maximize)
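The "minimize" and "maximize" annotations above indicate whether a smaller or a larger score means a better match. The following is a minimal NumPy sketch of that idea. It does not use svs, and the names `SCORES` and `best_match` are hypothetical, introduced only for illustration. The library itself may compute these quantities differently (for example, using squared distances).

```python
import numpy as np

# Illustrative score functions; each takes a (n, d) matrix and a (d,) query.
SCORES = {
    "L2": lambda data, q: np.linalg.norm(data - q, axis=1),
    "MIP": lambda data, q: data @ q,
    "Cosine": lambda data, q: (data @ q)
    / (np.linalg.norm(data, axis=1) * np.linalg.norm(q)),
}

def best_match(data, query, metric):
    """Hypothetical helper: index of the best row of `data` for `query`."""
    scores = SCORES[metric](data, query)
    # L2 is a distance (smaller is better); MIP and Cosine are similarities.
    return int(np.argmin(scores)) if metric == "L2" else int(np.argmax(scores))

data = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
query = np.array([1.0, 1.0])
print({m: best_match(data, query, m) for m in SCORES})
```

Note that the same query can have a different best match depending on the metric, which is why the distance type must be chosen consistently for index construction and groundtruth.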
- class svs.DataType
Datatype Selector
Members:
uint8 : 8-bit unsigned integer.
uint16 : 16-bit unsigned integer.
uint32 : 32-bit unsigned integer.
uint64 : 64-bit unsigned integer.
int8 : 8-bit signed integer.
int16 : 16-bit signed integer.
int32 : 32-bit signed integer.
int64 : 64-bit signed integer.
float16 : 16-bit IEEE floating point.
float32 : 32-bit IEEE floating point.
float64 : 64-bit IEEE floating point.
Helper Functions
- svs.read_vecs(filename)
Read a file in the bvecs/fvecs/ivecs format and return a NumPy array with the results.
The data type of the returned array is determined by the file extension with the following mapping:
bvecs: 8-bit unsigned integers.
fvecs: 32-bit floating point numbers.
ivecs: 32-bit signed integers.
- Parameters:
filename (str) – The file to read.
- Returns:
Numpy array with the results.
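For readers unfamiliar with the vecs layout, here is a hedged pure-NumPy sketch of a reader. It assumes the commonly used layout in which every vector is stored as a 4-byte little-endian dimension followed by that many elements, and that all vectors in a file share the same dimension. `read_vecs_sketch` and `VECS_DTYPES` are hypothetical names; this is not the svs implementation.

```python
import numpy as np

# Element type implied by each extension, following the mapping above.
VECS_DTYPES = {"bvecs": np.uint8, "fvecs": np.float32, "ivecs": np.int32}

def read_vecs_sketch(filename):
    """Hypothetical reader for bvecs/fvecs/ivecs files (illustration only)."""
    ext = str(filename).rsplit(".", 1)[-1]
    dtype = np.dtype(VECS_DTYPES[ext]).newbyteorder("<")
    raw = np.fromfile(filename, dtype=np.uint8)
    if raw.size == 0:
        return np.empty((0, 0), dtype=dtype)
    # Assumed record layout: int32 dimension header, then `dim` elements.
    dim = int(raw[:4].view("<i4")[0])
    record_bytes = 4 + dim * dtype.itemsize
    if raw.size % record_bytes != 0:
        raise ValueError("file size is not a multiple of the record size")
    records = raw.reshape(-1, record_bytes)
    # Drop the per-record header and reinterpret the payload bytes.
    payload = np.ascontiguousarray(records[:, 4:])
    return payload.view(dtype).reshape(-1, dim)
```

Reading the whole file as bytes and reshaping avoids a Python-level loop over vectors, at the cost of holding one extra copy of the payload in memory.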
- svs.write_vecs(array, filename, skip_check=False)
- Parameters:
array (array) – The raw array to save.
filename (str) – The file where the results will be saved.
skip_check (bool) –
By default, this function will check if the file extension for the vecs file is appropriate for the given array (see list below).
Passing skip_check = True overrides this logic and forces creation of the file.
- Result:
The array is saved to the requested file.
File extension to array element type:
fvecs: np.float32
hvecs: np.float16
ivecs: np.uint32
bvecs: np.uint8
Warning
The user must specify the file extension corresponding to the desired file format in the filename argument of svs.write_vecs().
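The extension check described for skip_check can be pictured with the following sketch. It only mirrors the extension table listed above. `EXTENSION_TO_DTYPE` and `extension_matches` are hypothetical names, and the actual check inside svs may differ.

```python
import numpy as np

# Extension-to-element-type table copied from the list above.
EXTENSION_TO_DTYPE = {
    "fvecs": np.float32,
    "hvecs": np.float16,
    "ivecs": np.uint32,
    "bvecs": np.uint8,
}

def extension_matches(array, filename):
    """Hypothetical helper: True if the extension suits the array's dtype."""
    ext = filename.rsplit(".", 1)[-1] if "." in filename else ""
    expected = EXTENSION_TO_DTYPE.get(ext)
    return expected is not None and array.dtype == np.dtype(expected)

vectors = np.zeros((2, 4), dtype=np.float32)
print(extension_matches(vectors, "data.fvecs"))
print(extension_matches(vectors, "data.hvecs"))
```

A caller could run a check like this before saving and convert the array explicitly (for example with `array.astype(np.float16)`) instead of bypassing the check.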
- svs.read_svs(filename, dtype=<class 'numpy.float32'>)
Read the svs native data file as a NumPy array. Note: As of now, no type checking is performed. Make sure the requested type actually matches the contents of the file.
- Parameters:
filename (str) – The file to read.
dtype – The data type of the encoded vectors in the file.
- Result:
A numpy matrix with the results.
- svs.convert_fvecs_to_float16(source_file: str, destination_file: str) None
Convert the fvecs file on disk with 32-bit floating point entries to a fvecs file with 16-bit floating point entries.
- Parameters:
source_file – The source file path to convert.
destination_file – The destination file to generate.
- svs.generate_test_dataset(nvectors, nqueries, ndims, directory, data_seed=None, query_seed=None, num_threads=1, num_neighbors=100, distance=<DistanceType.L2: 0>)
Generate a sample dataset consisting of the base data, queries, and groundtruth, all in the standard *vecs form.
- Parameters:
nvectors (int) – The number of base vectors in the generated dataset.
nqueries (int) – The number of query vectors in the generated dataset.
ndims (int) – The number of dimensions per vector in the dataset.
directory (str) – The directory in which to generate the dataset.
data_seed (optional) – The seed to use for random number generation in the dataset.
query_seed (optional) – The seed to use for random number generation for the queries.
num_threads (optional) – Number of threads to use to generate the groundtruth.
num_neighbors (int) – The number of neighbors to compute for the groundtruth.
distance (optional) – The distance metric to use for groundtruth generation.
Creates directory if it does not already exist. The following files are generated:
$(directory)/data.fvecs : The dataset encoded using float32 as fvecs.
$(directory)/queries.fvecs : The queries encoded using float32 as fvecs.
$(directory)/groundtruth.ivecs : The computed num_neighbors nearest neighbors of the queries in the dataset with respect to the provided distance.
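To make the meaning of the groundtruth file concrete, here is a hedged brute-force sketch in NumPy for the L2 case. It does not call svs, and `brute_force_groundtruth` is a hypothetical name. The library's own generator may differ in random data distribution, tie-breaking, and index type.

```python
import numpy as np

def brute_force_groundtruth(data, queries, num_neighbors):
    """Exact L2 neighbors: row i holds the indices in `data` nearest to query i."""
    # Pairwise squared distances, shape (nqueries, nvectors).
    diffs = queries[:, None, :] - data[None, :, :]
    dists = np.einsum("qnd,qnd->qn", diffs, diffs)
    # Sort each row by increasing distance and keep the closest entries.
    order = np.argsort(dists, axis=1, kind="stable")
    return order[:, :num_neighbors]

# Illustrative sizes only; real datasets are far larger.
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 8)).astype(np.float32)
queries = data[:5].copy()
groundtruth = brute_force_groundtruth(data, queries, num_neighbors=3)
print(groundtruth.shape)
```

The broadcasted difference uses memory proportional to nqueries × nvectors × ndims, so this form is suitable only for small test sets.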
- svs.convert_vecs_to_svs(vecs_file: str, svs_file: str, dtype: svs::python.DataType = <DataType.float32: 9>) None
Convert the vecs file (containing the specified element types) to the svs native format.
- Parameters:
vecs_file – The source [f/h/i/b]vecs file.
svs_file – The destination native file.
dtype – The svs.DataType of the vecs file. Supported types: float32, float16, uint32, and uint8.
File extension type map:
fvecs = svs.DataType.float32
hvecs = svs.DataType.float16
ivecs = svs.DataType.uint32
bvecs = svs.DataType.uint8