IVF-PQ Integration
IVF and PQ combine hierarchically to achieve sub-linear search time and memory efficiency. Let’s formalize their synergy using our toy dataset and mathematical analysis.
1. Workflow Overview
Indexing Phase (Preprocessing):
- IVF Clustering: Partition the data into `num_partitions` (K) clusters.
- PQ Compression: Encode the vectors in each cluster into PQ codes.
Query Phase (Search):
- IVF: Find the `nprobe` closest clusters to the query q.
- PQ: Compute approximate distances within those clusters using lookup tables (LUTs).
2. Indexing Phase (Step-by-Step)
Step 1: IVF Clustering
- Input: Dataset X = {x_1, ..., x_N}.
- K-means: Partition X into K clusters with centroids c_1, ..., c_K.
- Output: Inverted lists mapping each centroid to its vectors.
Example:
- Clusters: 3 clusters (red, green, blue) in the toy dataset.
- Inverted Lists: each centroid maps to the IDs of its assigned vectors.
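The clustering step can be sketched in a few lines of NumPy. This is a minimal Lloyd's k-means on synthetic 2D blobs; the data and K = 3 are illustrative stand-ins for the toy dataset, not the document's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2D dataset: three visible blobs (stand-ins for the red/green/blue clusters).
X = np.vstack([
    rng.normal([0, 0], 0.3, (50, 2)),
    rng.normal([5, 5], 0.3, (50, 2)),
    rng.normal([0, 5], 0.3, (50, 2)),
])

def kmeans(X, K, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members (keep old if empty).
        centroids = np.array([X[assign == k].mean(axis=0) if (assign == k).any()
                              else centroids[k] for k in range(K)])
    return centroids, assign

centroids, assign = kmeans(X, K=3)

# Inverted lists: centroid id -> ids of the vectors assigned to it.
inverted_lists = {k: np.flatnonzero(assign == k) for k in range(3)}
```

Production systems train k-means on a sample and use optimized libraries, but the structure — centroids plus inverted lists — is exactly what IVF stores.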
Step 2: PQ Compression
- Subspace Decomposition: Split each cluster's vectors into m subspaces.
- Codebook Training: Learn k centroids per subspace via k-means.
- PQ Encoding: Replace each subvector with its nearest centroid's index.
Example:
- Cluster 2 (green) PQ Codes:
| Original Vector | PQ Code | Reconstructed Vector |
|---|---|---|
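A minimal PQ training-and-encoding sketch. The sizes here (m = 2 subspaces, k = 4 centroids, 4D vectors) are toy values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))  # toy cluster vectors, D = 4

m, k = 2, 4                    # m subspaces, k centroids per subspace
d_sub = X.shape[1] // m        # each subspace is 2D here

codebooks = []                 # one (k, d_sub) codebook per subspace
codes = np.empty((len(X), m), dtype=np.uint8)

for j in range(m):
    sub = X[:, j * d_sub:(j + 1) * d_sub]
    # Tiny k-means for the codebook (a few Lloyd iterations suffice for a sketch).
    cb = sub[rng.choice(len(sub), k, replace=False)]
    for _ in range(10):
        assign = np.linalg.norm(sub[:, None] - cb[None], axis=2).argmin(axis=1)
        cb = np.array([sub[assign == c].mean(axis=0) if (assign == c).any()
                       else cb[c] for c in range(k)])
    codebooks.append(cb)
    # Encoding: replace each subvector with its nearest centroid index.
    codes[:, j] = np.linalg.norm(sub[:, None] - cb[None], axis=2).argmin(axis=1)

# Reconstruction: concatenate the centroids the code points to.
recon = np.hstack([codebooks[j][codes[:, j]] for j in range(m)])
```

Each vector shrinks to m small integers; the reconstructed vector is lossy, which is the source of PQ's approximation error discussed below.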
3. Query Phase (Step-by-Step)
Step 1: IVF Cluster Selection
- Compute Distances to Centroids: d(q, c_i) = ||q − c_i||² for every centroid c_i.
- Select the `nprobe` Closest Clusters: keep the `nprobe` centroids with the smallest distances.
- Example: For the query q, probe Cluster 2 (green).
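The selection step, sketched with NumPy (the centroids, query, and nprobe = 3 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
centroids = rng.normal(size=(8, 2))   # K = 8 IVF centroids (illustrative)
q = rng.normal(size=2)                # query vector

nprobe = 3
# Squared L2 distance from the query to every centroid.
dists = ((centroids - q) ** 2).sum(axis=1)
# Indices of the nprobe closest clusters; argpartition avoids a full sort.
probed = np.argpartition(dists, nprobe)[:nprobe]
```

Only the inverted lists of the `probed` clusters are scanned in the next step, which is where the sub-linear search time comes from.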
Step 2: PQ Distance Approximation
- Precompute LUTs: For each subspace j, compute the distances from the query subvector q_j to that subspace's k codebook centroids, giving a lookup table LUT_j.
- Search in Probed Clusters: For each PQ code in the cluster, sum the table entries its indices select: d̂(q, x) = LUT_1[code_1] + ... + LUT_m[code_m].
Example:
- Query: q.
- LUTs: LUT_1 and LUT_2 (one table per subspace).
- PQ Codes in Cluster 2: each code is scored by summing its LUT entries.

| PQ Code | Approximate Distance |
|---|---|
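The LUT trick can be sketched as follows (toy codebooks and codes). Note that the summed LUT entries equal the exact squared distance to the *reconstructed* vector, which is what makes the approximation principled:

```python
import numpy as np

rng = np.random.default_rng(3)
m, k, d_sub = 2, 4, 2                       # 2 subspaces, 4 centroids each, 2D subvectors
codebooks = rng.normal(size=(m, k, d_sub))  # illustrative trained codebooks
codes = rng.integers(0, k, size=(10, m))    # PQ codes of 10 vectors in the probed cluster
q = rng.normal(size=m * d_sub)              # query

# Precompute LUTs: squared distance from each query subvector to each centroid.
lut = np.empty((m, k))
for j in range(m):
    q_sub = q[j * d_sub:(j + 1) * d_sub]
    lut[j] = ((codebooks[j] - q_sub) ** 2).sum(axis=1)

# Approximate distance of each coded vector: sum the LUT entries its code selects.
approx = lut[np.arange(m), codes].sum(axis=1)
best = approx.argmin()                      # nearest candidate in this cluster
```

The per-vector cost drops from D multiply-adds to m table lookups and additions, since the expensive subspace distances were computed once per query.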
4. Error Sources & Mitigation
Error 1: IVF Cluster Miss
- Cause: True nearest neighbor is outside probed clusters.
- Mitigation: Increase `nprobe` (e.g., from 1 to 5).
Error 2: PQ Quantization
- Cause: Approximating subvectors with centroids.
- Mitigation:
  - Increase m (more subspaces → finer quantization).
  - Increase k (more centroids per subspace).
Example:
- True Distance: d(q, x).
- PQ Approximation: d̂(q, x), with error |d − d̂|.
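The quantization error can be seen directly by comparing the exact distance with the distance to the PQ reconstruction. All values here are illustrative, and the codebooks are random rather than trained, so the error is exaggerated:

```python
import numpy as np

rng = np.random.default_rng(4)
m, k, d_sub = 2, 4, 2
x = rng.normal(size=m * d_sub)              # a database vector
q = rng.normal(size=m * d_sub)              # query
codebooks = rng.normal(size=(m, k, d_sub))  # untrained, illustrative codebooks

# Quantize each subvector of x to its nearest centroid, then concatenate.
recon = np.hstack([
    codebooks[j][((codebooks[j] - x[j * d_sub:(j + 1) * d_sub]) ** 2)
                 .sum(axis=1).argmin()]
    for j in range(m)
])

true_d = np.sqrt(((q - x) ** 2).sum())        # exact d(q, x)
approx_d = np.sqrt(((q - recon) ** 2).sum())  # PQ approximation d-hat(q, x)
err = abs(true_d - approx_d)
```

Training the codebooks on real data (and raising m or k) shrinks ||x − recon||, which in turn shrinks `err`.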
5. Parameter Tuning
| Parameter | IVF Impact | PQ Impact |
|---|---|---|
| `num_partitions` (K) | Higher K → smaller clusters, less data scanned per probe. | No direct impact. |
| `nprobe` | Higher `nprobe` → slower search, better recall. | No direct impact. |
| `num_sub_vectors` (m) | No direct impact. | Higher m → finer quantization, better accuracy. |
| `accelerator` | Faster k-means clustering. | Faster codebook training. |
| Parameter | Effect on Efficiency | Effect on Accuracy |
|---|---|---|
| `num_partitions` (K) | Higher K → smaller lists to scan, but slower index build. | Too high K → sparse clusters and more missed neighbors. |
| `nprobe` | Higher `nprobe` → slower search. | Higher recall. |
| `num_sub_vectors` (m) | Higher m → more LUT additions per candidate. | Lower m → coarser quantization, lower accuracy. |
Example:
For a 1536D vector:
- Optimal `num_sub_vectors` (m): 16 (splits 1536D into 96D subspaces).
- Optimal `num_partitions` (K): 256 (balances cluster granularity and training cost).
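The arithmetic behind this example, assuming float32 originals and one byte per subspace code (i.e., k = 256 centroids per subspace):

```python
D, m, k = 1536, 16, 256

sub_dim = D // m              # width of each subspace
raw_bytes = D * 4             # float32 vector: 6144 bytes
pq_bytes = m * 1              # one byte per subspace when k = 256

print(sub_dim)                # 96
print(raw_bytes // pq_bytes)  # 384x compression per vector
```

A 384× reduction in vector storage is what lets billion-scale indexes fit in memory.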
6. Real-World Scaling
For 1B vectors in 1536D:
- IVF: Reduces the search scope from 10⁹ to roughly 78M vectors (nprobe/K = 20/256 ≈ 7.8%).
- PQ: Reduces the per-vector distance cost from 1536 operations to 16 LUT additions.
- Total Speedup: (K/nprobe) × (D/m) = 12.8 × 96 ≈ 1,200× over brute force.
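These figures follow directly from the parameters used in the code below (K = 256, nprobe = 20, D = 1536, m = 16):

```python
N = 1_000_000_000             # dataset size
K, nprobe = 256, 20           # IVF parameters
D, m = 1536, 16               # vector dimension, PQ subspaces

scanned = N * nprobe // K     # vectors actually scored per query
scope_speedup = K / nprobe    # 12.8x fewer candidates (IVF)
dist_speedup = D / m          # 96x cheaper per-candidate distance (PQ)
total = scope_speedup * dist_speedup

print(scanned)                # 78125000
print(round(total))           # 1229
```

The two factors multiply because IVF and PQ cut cost along independent axes: how many vectors are scored, and how much each score costs.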
7. Summary
| Aspect | IVF-PQ Integration |
|---|---|
| Indexing | Hierarchical: clusters (IVF) + compressed codes (PQ). |
| Querying | Two-stage: coarse search (IVF) + fine approximation (PQ). |
| Efficiency | Combines IVF's search-space reduction with PQ's compression for a multiplicative speedup. |
| Accuracy Control | Tunable via `nprobe`, `num_partitions`, and `num_sub_vectors`. |
8. Back to Your Code
```python
def create_indexed_table(...):
    # IVF-PQ parameters
    target_table.create_index(
        num_partitions=256,   # 256 clusters (IVF)
        num_sub_vectors=16,   # split 1536D into 16 subspaces (PQ)
        accelerator="cuda",   # GPU for faster training
    )
    # nprobe (e.g., probing 20/256 clusters = 7.8%) is typically a
    # query-time setting rather than an index-build parameter.
```
Why This Works:
- Balances speed (GPU + 7.8% search scope) and accuracy (16 subspaces).
- Scales to billion-scale datasets with minimal memory footprint.
Final Takeaway
IVF-PQ is like a two-layered sieve:
- IVF filters out irrelevant data (coarse layer).
- PQ refines the search within the filtered subset (fine layer).
Together, they make high-dimensional ANN tractable!