Similarity Search Optimization Guides

This section contains optimization guides for vector similarity search workloads on Intel hardware. These guides help users of popular vector search solutions achieve optimal performance on Intel Xeon processors.

Overview

Vector similarity search is a core component of modern AI applications including:

Intel Scalable Vector Search (SVS)

Intel Scalable Vector Search (SVS) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and is integrated into popular solutions to bring these optimizations to a wider audience.

SVS features:

Understanding LVQ and LeanVec Compression

Traditional vector compression methods face limitations in graph-based search. Product Quantization (PQ) requires keeping full-precision vectors for re-ranking, defeating compression benefits. Standard scalar quantization with global bounds fails to efficiently utilize available quantization levels.

LVQ (Locally-adaptive Vector Quantization)

LVQ addresses these limitations by applying per-vector normalization and scalar quantization, adapting the quantization bounds individually for each vector. This local adaptation ensures efficient use of the available bit range, resulting in high-quality compressed representations.
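The per-vector adaptation described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the SVS implementation: real LVQ also subtracts the dataset mean before quantizing and bit-packs the codes, but the core idea — quantization bounds computed from each vector individually — is the same.

```python
import numpy as np

def lvq_encode(x, bits=8):
    """Per-vector scalar quantization in the spirit of LVQ:
    the bounds are taken from this vector alone, so every
    quantization level in the bit range is actually usable."""
    lo, hi = float(x.min()), float(x.max())  # locally adaptive bounds
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def lvq_decode(codes, lo, scale):
    """Reconstruct an approximate vector from codes and per-vector params."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
v = rng.normal(size=768).astype(np.float32)
codes, lo, scale = lvq_encode(v, bits=8)
v_hat = lvq_decode(codes, lo, scale)
# Reconstruction error is bounded by one quantization step.
print(float(np.abs(v - v_hat).max()) <= scale)
```

With global (dataset-wide) bounds, a vector occupying only a small slice of the global range would waste most of the quantization levels; the per-vector bounds above avoid that.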

Key benefits:

LVQ achieves a four-fold reduction of vector size at 8 bits per dimension while maintaining search accuracy. A typical 768-dimensional float32 vector requiring 3072 bytes can be reduced to about 768 bytes with 8-bit codes, or 384 bytes with 4-bit codes, plus a few bytes of per-vector metadata.

LeanVec (LVQ with Dimensionality Reduction)

LeanVec builds on LVQ by first applying linear dimensionality reduction, then compressing the reduced vectors with LVQ. This two-step approach significantly cuts memory and compute costs, enabling faster similarity search and index construction with minimal accuracy loss—especially effective for high-dimensional deep learning embeddings.
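The two-step pipeline can be sketched as follows. This is an illustration under simplifying assumptions: it uses PCA (via SVD) as the linear projection and an assumed target dimensionality of 160, whereas LeanVec learns its projection differently and chooses dimensions per dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768)).astype(np.float32)  # toy embedding matrix

# Step 1: linear dimensionality reduction (PCA here, purely illustrative).
target_dim = 160                                     # assumed reduced dimensionality
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:target_dim].T                                # 768 x 160 projection matrix
X_reduced = (X - mean) @ P

# Step 2: LVQ-style per-vector scalar quantization of the reduced vectors.
lo = X_reduced.min(axis=1, keepdims=True)
hi = X_reduced.max(axis=1, keepdims=True)
scale = (hi - lo) / 255.0                            # 8-bit codes
codes = np.round((X_reduced - lo) / scale).astype(np.uint8)

print(codes.shape)  # (1000, 160): 160 bytes of codes per vector instead of 3072
```

Both steps shrink the data and speed up distance computations: fewer dimensions means fewer multiply-adds per comparison, and narrower codes mean less memory bandwidth per vector.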

Best suited for:

Two-Level Compression

Both LVQ and LeanVec support two-level compression schemes:

  1. Level 1: Fast candidate retrieval using compressed vectors
  2. Level 2: Re-ranking for accuracy (LVQ encodes the quantization residuals; LeanVec encodes the full-dimensionality vectors)

The naming convention reflects the bits per dimension used at each level: for example, LVQ4x8 stores 4 bits per dimension at level one and 8 bits per dimension of residual at level two, while a single-number name such as LVQ8 denotes one-level compression with 8 bits per dimension.
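The two levels can be sketched with the same per-vector quantizer used twice, mirroring an LVQ4x8-style layout: coarse 4-bit codes for candidate retrieval, then 8-bit codes of the residual for re-ranking. This is a simplified illustration of the residual idea, not the SVS encoding.

```python
import numpy as np

def quantize(x, bits):
    """Per-vector uniform scalar quantization; returns codes and params."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / ((1 << bits) - 1)
    codes = np.round((x - lo) / scale)
    return codes, lo, scale

rng = np.random.default_rng(1)
v = rng.normal(size=768).astype(np.float32)

# Level 1: coarse 4-bit codes, used during graph traversal.
c1, lo1, s1 = quantize(v, bits=4)
v1 = c1 * s1 + lo1

# Level 2: 8-bit codes of the residual, fetched only to re-rank the
# final candidates (LeanVec would instead re-rank with full-dimension data).
c2, lo2, s2 = quantize(v - v1, bits=8)
v2 = v1 + (c2 * s2 + lo2)

# The residual level tightens the approximation considerably.
print(float(np.abs(v - v2).max()) < float(np.abs(v - v1).max()))
```

Because level-2 codes are touched only for a handful of re-ranking candidates per query, the traversal bandwidth stays at the level-1 rate while final accuracy approaches full precision.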

Vector Compression Selection

| Compression | Best for | Observations |
|-------------|----------|--------------|
| LVQ4x4 | Fast search and low memory use | Consider LeanVec for even faster search |
| LeanVec4x8 | Fastest search and ingestion | LeanVec dimensionality reduction might reduce recall |
| LVQ4 | Maximum memory saving | Recall might be insufficient |
| LVQ8 | Faster ingestion than LVQ4x4 | Search likely slower than LVQ4x4 |
| LeanVec8x8 | Improved recall when LeanVec4x8 is insufficient | LeanVec dimensionality reduction might reduce recall |
| LVQ4x8 | Improved recall when LVQ4x4 is insufficient | Slightly worse memory savings |
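To compare the memory cost of the schemes in the table, the naming convention alone is enough for a rough estimate. The helper below is a hypothetical utility, not part of SVS: it ignores per-vector metadata and the fact that LeanVec's level-1 codes cover a reduced dimensionality, so real footprints are smaller for LeanVec variants.

```python
import re

def approx_bytes_per_vector(name, dim):
    """Rough code payload implied by a scheme name such as 'LVQ4x8':
    each number is bits per dimension at one level. Ignores per-vector
    metadata and LeanVec's reduced level-1 dimensionality."""
    bits_per_level = [int(b) for b in re.findall(r"\d+", name)]
    return sum(b * dim for b in bits_per_level) // 8

dim = 768  # example dimensionality; 3072 bytes uncompressed as float32
for scheme in ["LVQ4", "LVQ8", "LVQ4x4", "LVQ4x8", "LeanVec4x8"]:
    print(scheme, approx_bytes_per_vector(scheme, dim))
# LVQ4 -> 384, LVQ8 -> 768, LVQ4x4 -> 768, LVQ4x8 -> 1152, LeanVec4x8 -> 1152
```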

Rule of thumb:

Available Guides

| Software | Description | Guide |
|----------|-------------|-------|
| Redis | Redis Query Engine with SVS-VAMANA | Redis Guide |

References