Resolution Capacity: Beyond Reconstruction MSE for Progressive Approximate Nearest-Neighbor Search
Production RAG systems degrade as corpora grow because the quantization objective is wrong for progressive search. Reconstruction MSE preserves vector fidelity; progressive ANN needs early rank separation, balanced bit-plane partitions, and low conditional occupancy.
bitplane-ann introduces Resolution Capacity as that objective. The artifact shows that Hadamard rotation preserves prefix capacity on isotropic and anisotropic embeddings, while PCA and reconstruction-optimized rotations can collapse early-bit partitions even when reconstruction error is low.
This is the English repo+PDF research artifact for the June 2026 paper. It contains the canonical paper, compact public result summaries, Python reference experiments, official SAQ comparison notes, and R6 end-to-end beam+rerank evidence.
DOI: 10.5281/zenodo.20517605
The core failure mode is visible in the first few bits: PCA concentrates variance, overloads early prefix bins, and wastes Resolution Capacity; Hadamard spreads signal across dimensions, keeping prefix bins balanced enough for high-recall beam filtering.
paper/- canonical LaTeX source, compiled PDF, generated figures, and figure source CSVs.experiments/python_reference/- original Python reference experiments for Hadamard/PCA, early exit, Voronoi-local negative results, CAQ-style baseline, and Resolution Capacity diagnostics.experiments/EXPERIMENT_REPORT_PYTHON.md- English journal for the Python research phase.experiments/RUST_ACCELERATION_REPORT.md- English journal for Rust-backed 25k/1M validation, official SAQ, and R6 beam+rerank.experiments/results/- compact public CSV/JSON summaries supporting the paper and README claims.docs/provenance.md- release provenance and attribution notes.
| Claim | Measured setting | Result | Scope |
|---|---|---|---|
| Resolution Capacity | BGE-small and E5-small, N=25,000, b=8 |
Hadamard keeps eta_b > 0.78; PCA falls to eta_b=0.256 on E5-small |
Rotation/distribution diagnostic |
| Hadamard vs PCA | BGE-small and E5-small | Hadamard improves Recall@8 by 19-35 percentage points | Progressive retrieval objective |
| OPQ/PQ comparison | E5-small, equal 8-bit-per-dimension budget | Hadamard 99.6% vs OPQ/PQ 51.6% Recall@10 | Equal-budget diagnostic |
| Isotropy Paradox | Voronoi-local Hadamard experiments | Local k-means cells do not improve bit-1 rank signal after Hadamard | Negative finding |
| Oracle Gap | Early-exit experiments | Optimal exit is 1.00-1.07 bits; practical exit is 3.70-5.08 bits | Stopping-confidence finding |
| Rust SIMD | 25k parity checks and 1M-scale scans | 41-45x faster than dense prefix-score materialization | Kernel implementation result |
| R6 beam+rerank | SIFT1M, BGE@1M, GIST1M | 99.80-99.94% Recall@10 at 2.9-15.1 ms/query | Full end-to-end pipeline |
| SAQ artifact | BGE/E5 25k, AVX-512, SAQ commit 2163ebc |
SAQ reaches higher direct recall; bitplane direct prefix and beam+rerank occupy different regimes | Closest-prior benchmark |
| Storage | 8-bit scalar codes vs float32 vectors | One uint8 per dimension, 4x smaller than float32, no graph overhead |
Memory tradeoff |
HNSW remains much faster end-to-end in the current measurements. The paper positions bitplane-ann as a compact, graph-free progressive candidate-generation and capacity-control method whose practical differentiator is memory, not current latency.
- Normalize vectors.
- Pad to the next power-of-2 dimension
d' >= d. - Apply randomized Walsh-Hadamard rotation
H_d' * diag(s) * x, wheres in {-1,+1}^d'. - Lloyd-Max quantize each coordinate to an 8-bit scalar code using data-oblivious per-dimension codebooks.
- Store one contiguous
uint8array of shapeN x d. - At query time, compute prefix-L1 distance from the top
bbits:
prefix_dist_b(q, c) = sum_j |(q_j >> (8-b)) - (c_j >> (8-b))|
For beam mode, keep floor(N / 2^b) candidates after each depth and rerank final survivors with float32 cosine/IP. The same uint8 array serves every bit depth: no graph, no centroid routing, no duplicated index layout.
The paper uses the current variable names and definitions below.
Let B_{j,b}(X) be the set of non-empty b-bit scalar quantization bins for dimension j of corpus X, with |B_{j,b}(X)| <= 2^b.
Marginal Bin Load
R_b(X) = (1/d) * sum_j N / |B_{j,b}(X)|
R_b is the average number of corpus vectors per occupied bin per dimension. Under a perfectly uniform distribution, R_uniform ~= N / 2^b, and R_b >= R_uniform.
Marginal Resolution Capacity
eta_b(X) = R_uniform / R_b(X) = (N / 2^b) / R_b(X), in (0, 1]
eta_b=1 means perfect uniform partition utilization. eta_b << 1 means capacity collapse: a few bins overflow with imposters while many bins sit empty. Higher eta_b is better.
For direct prefix retrieval on BGE-small and E5-small text embeddings at N=25,000, b=8, the observed thresholds are:
| Regime | Condition | Direct-mode behavior |
|---|---|---|
| Safe | R_b < 130 and eta_b > 0.75 |
Recall@8 >= 99% |
| Caution | 130 <= R_b <= 200 and 0.49 <= eta_b <= 0.75 |
Recall@8 90-99% |
| Danger | R_b > 200 and eta_b < 0.49 |
Recall@8 < 80% |
These thresholds are empirical direct-mode diagnostics, not corpus-size limits. At 1M scale, R_b=3,906 is far beyond the direct-mode danger region, so direct prefix recall drops to 71-85%. Beam+rerank recovers to 99.80-99.94% Recall@10 because the beam schedule intentionally keeps N / 2^b candidates and reranks them with float32 vectors.
Resolution Capacity at 25k, b=8
| Dataset | Rotation | R_b |
eta_b |
Recall@8 |
|---|---|---|---|---|
| BGE-small | PCA | 190 | 0.514 | 80.85% |
| BGE-small | Hadamard | 100 | 0.976 | 99.8% |
| E5-small | PCA | 382 | 0.256 | 64.35% |
| E5-small | Hadamard | 124 | 0.787 | 99.6% |
1M beam retention at depth 8
| Dataset | Dimensions | Beam | Beam retention |
|---|---|---|---|
| SIFT1M | 128 | 3,906 / 1M | 100.00% |
| BGE@1M | 384 | 3,906 / 1M | 100.00% |
| GIST1M | 960 | 3,906 / 1M | 100.00% |
This retention table is a prior R0-R3 diagnostic: it asks whether the exact top-10 neighbors appear inside the depth-8 prefix beam. R6 is the authoritative end-to-end pipeline measurement.
R6 end-to-end beam+rerank
| Dataset | Recall@10 | Total ms/query | Bytes/vector | Notes |
|---|---|---|---|---|
| SIFT1M | 99.94% | 2.87 | 128 | depth=8, beam=3,906 |
| BGE@1M | 99.80% | 5.60 | 384 | depth=8, beam=3,906 |
| GIST1M | 99.84% | 15.10 | 960 | depth=8, beam=3,906 |
Diagnostic latency comparison at 1M
| Dataset | Method | Recall@10 | ms/query | Bytes/vector |
|---|---|---|---|---|
| SIFT1M | HNSW | 99.76% | 0.15 | 512 + graph |
| SIFT1M | bitplane beam+rerank | 99.94% | 2.87 | 128 |
| BGE@1M | HNSW | 99.64% | 0.28 | 1,536 + graph |
| BGE@1M | bitplane beam+rerank | 99.80% | 5.60 | 384 |
| GIST1M | HNSW | 97.00% | 0.98 | 3,840 + graph |
| GIST1M | bitplane beam+rerank | 99.84% | 15.10 | 960 |
The paper's caveat is explicit: current Rust byte-SAD prefix-L1 kernels are 4-8x slower than FAISS exact at d >= 384, and HNSW is 15-20x faster end-to-end. The practical differentiator is compact storage and no graph overhead.
Official SAQ artifact vs bitplane direct-prefix diagnostic
SAQ was run from the official C++ artifact at commit 2163ebc on AVX-512 hardware with K=4,096, B in {1,8}, nprobe in {8,32,128,512}, and 4 threads. These are direct-retrieval operating points, not R6 beam+rerank.
| Dataset | Method | Config | Recall@10 | ms/query |
|---|---|---|---|---|
| BGE-small | SAQ | B=1, nprobe=512 |
75.4% | 92.2 |
| BGE-small | SAQ | B=8, nprobe=512 |
96.3% | 95.8 |
| BGE-small | bitplane direct prefix | depth=8 | 76.1% | 13.4 |
| E5-small | SAQ | B=1, nprobe=512 |
68.2% | 91.8 |
| E5-small | SAQ | B=8, nprobe=512 |
93.3% | 96.3 |
| E5-small | bitplane direct prefix | depth=8 | 70.9% | 13.4 |
- No dedicated SIMD bitplane kernels yet; production speed requires AVX-512 bitplane kernels.
eta_b/R_banalysis is primarily validated at 25k; systematic sweeps across modalities and scales are future work.- R6 validates end-to-end beam+rerank latency, but HNSW remains much faster.
- Bimodal adaptive exit is not confirmed; compute savings are consistent across queries, not query-dependent.
- Official SAQ was benchmarked as-is with
K=4,096; results may differ atK ~= sqrt(N).
Install the lightweight Python dependencies and rebuild the figures from tracked CSV/JSON summaries:
uv sync
uv run python paper/make_figures.pyThe script writes:
paper/figures/fig_recall_curves.pdfpaper/figures/fig_scale_rust.pdf- matching
*_source.csvfiles
cd paper
tectonic main.texIf tectonic is unavailable, use a standard LaTeX installation:
cd paper
pdflatex main.tex
pdflatex main.texThe canonical paper files are:
paper/main.texpaper/main.pdf
The Python reference scripts live under experiments/python_reference/.
Smoke example:
uv run --extra experiments python experiments/python_reference/experiment0_bitplanes.py \
--corpus-size 1000 \
--queries 50 \
--sample-queries 5 \
--prune-queries 10 \
--out-dir experiments/results/python_reference_smokeFull 25k runs may download Hugging Face datasets/models and may create local .npy caches. Those large caches are intentionally not tracked.
The R6 large-scale beam+rerank result is archived as compact output files under experiments/results/rust_acceleration/. Re-running R6 requires the external implementation workspace bitplane-infer, its Rust/Python backend, and local 1M-scale datasets; those large implementation and data directories are intentionally not vendored here.
paper/ Canonical LaTeX paper and generated figures
experiments/python_reference/
Python reference experiments
experiments/results/ Compact public CSV/JSON result summaries
docs/assets/ README and explanatory visual assets
docs/provenance.md Artifact provenance and attribution
@misc{bitplane-ann2026,
author = {Anokhin, Oleksandr},
title = {Resolution Capacity: Beyond Reconstruction MSE for Progressive Approximate Nearest-Neighbor Search},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.20517605},
url = {https://doi.org/10.5281/zenodo.20517605}
}- Software, scripts, configuration, and compact reproducibility artifacts: Apache-2.0. See
LICENSE. - Paper source, compiled paper PDF, and paper figures under
paper/: CC BY 4.0. SeeLICENSE-PAPER.md.
