.. _testing:

Testing
=======

This file describes the testing methodology used in SVS and documents how to update the
reference results for integration tests.

The test suite defined in the ``tests`` directory is divided into two categories: *unit tests*,
which test individual components, and *integration tests*, which exercise functionality that
spans multiple components. The latter includes index searching and index construction.

Integration tests cannot be made exhaustive, for two reasons.
First, they should cover the many different combinations of indexes, distance functions,
compile-time optimizations, etc., and each combination takes time to run.
Second, all integration tests must reference the same binary dataset to minimize the
repository size, and this dataset may not be appropriate for all distance functions.

Hence, integration tests compare against reference behavior (i.e., the recall for a given
configuration, generated at some point in the past) and fail if the observed behavior deviates
too far from it. These reference recalls are generated by executables in the benchmarking
framework.

See the section on :ref:`building the library <build>` for the CMake options that build
integration tests into the test executable (``SVS_FORCE_INTEGRATION_TESTS``) and reference
result test generators into the benchmarking executable
(``SVS_BUILD_BENCHMARK_TEST_GENERATORS``).

Testing-Related Manifest
------------------------

A summary of the testing-related files in the SVS repository and their roles is given below.

::

    data/
    +-- test_dataset/ : Directory containing the reference dataset used for testing as well
        |               as reference recall values.
        |
        |-- reference/ : Directory containing reference configuration/recall values for
        |   |            index implementations.
        |   |
        |   |-- vamana_reference.toml : Reference configurations and recall for the Vamana
        |   |       index. Auto-generated by the benchmarking framework.
        |   |
        |   +-- inverted_reference.toml : Reference configurations and recall for the
        |           Inverted index. Auto-generated by the benchmarking framework.
        |
        |-- data_f32.fvecs : The dataset upon which reference results are generated.
        |
        |-- data_f32.svs : A copy of `data_f32.fvecs` but encoded in the SVS format.
        |
        |-- graph_f32.svs : A reference Vamana index constructed for the test dataset.
        |
        |-- groundtruth_cosine.ivecs : The groundtruth for the queries over the test dataset
        |       for the CosineSimilarity distance measure.
        |
        |-- groundtruth_euclidean.ivecs : The groundtruth for the queries for the L2
        |       (Euclidean) distance measure.
        |
        |-- groundtruth_mip.ivecs : The groundtruth for the queries for the MIP
        |       (maximum inner product) similarity measure.
        |
        |-- known_f32.fvecs : A small "fvecs" file with known contents. The expected
        |       contents are returned by ``test_dataset::reference_file_contents()``.
        |
        |-- known_f32.svs : A copy of `known_f32.fvecs` but encoded in the SVS format.
        |
        |-- queries_f32.svs : The queries to use for the test dataset.
        |
        |-- leanvec_data_matrix.fvecs : LeanVec OOD matrix to transform data.
        |       Shape (128, 64).
        |
        |-- leanvec_query_matrix.fvecs : LeanVec OOD matrix to transform queries.
        |       Shape (128, 64).
        |
        +-- vamana_config.toml : The serialized ``VamanaIndexParameters`` of the reference
                Vamana graph.

    tools/
    +-- benchmark_inputs/
        |
        |-- vamana/test-generator.toml : Input file used to generate the reference
        |       results (data/test_dataset/reference/vamana_reference.toml) from the
        |       benchmarking framework.
        |
        +-- inverted/test-generator.toml : Input file used to generate the reference
                results (data/test_dataset/reference/inverted_reference.toml) from the
                benchmarking framework.

Generating Test Inputs
----------------------

Complete scripts for compiling, executing, and updating reference results are collected here.

Vamana
******

::

    SVS_NUM_THREADS=10
    # CMake reads the compilers from the CC and CXX environment variables.
    export CC=gcc-11
    export CXX=g++-11
    mkdir build && cd build
    cmake .. \
        -DCMAKE_BUILD_TYPE=Release \
        -DSVS_BUILD_BENCHMARK_TEST_GENERATORS=YES
    make -j

    # Generate the expected results.
    ./benchmark/svs_benchmark vamana_test_generator \
        ../tools/benchmark_inputs/vamana/test-generator.toml \
        ./vamana_reference.toml \
        ${SVS_NUM_THREADS} \
        ../data/test_dataset

    # After checking that the results look good, update the reference file.
    cp ./vamana_reference.toml ../data/test_dataset/reference/vamana_reference.toml
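
The script leaves the "checking that the results look good" step to the reader. One possible
way to do it is to diff the freshly generated file against the checked-in reference before
copying. The sketch below assumes a POSIX shell with ``diff`` available; ``review_reference``
is a hypothetical helper written for illustration and is not part of SVS.

```shell
#!/bin/sh
# Hypothetical helper, not part of SVS: compare the checked-in reference file
# with a freshly generated one. Returns 0 if they are identical, 1 if they
# differ, and 2 if either file is missing.
review_reference() {
    old_reference="$1"
    new_reference="$2"

    # Refuse to continue if either file is missing.
    for f in "$old_reference" "$new_reference"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 2
        fi
    done

    # diff exits with 0 when the files match and 1 when they differ.
    if diff -u "$old_reference" "$new_reference"; then
        echo "reference results unchanged"
        return 0
    fi
    echo "reference results changed; review the diff above before copying" >&2
    return 1
}
```

From the ``build`` directory, it could be called with the checked-in reference file listed in
the manifest (``data/test_dataset/reference/vamana_reference.toml``) as the first argument and
``./vamana_reference.toml`` as the second. A non-zero exit status only signals that the files
differ or are missing. Whether the new recall values are acceptable remains a manual judgment.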