GitHub - olanokhin/bitplane-ann: Research artifact for Resolution Capacity in progressive approximate nearest-neighbor search

Resolution Capacity: Beyond Reconstruction MSE for Progressive Approximate Nearest-Neighbor Search

Production RAG systems degrade as corpora grow because the quantization objective is wrong for progressive search. Reconstruction MSE preserves vector fidelity; progressive ANN needs early rank separation, balanced bit-plane partitions, and low conditional occupancy.

bitplane-ann introduces Resolution Capacity as that objective. The artifact shows that Hadamard rotation preserves prefix capacity on isotropic and anisotropic embeddings, while PCA and reconstruction-optimized rotations can collapse early-bit partitions even when reconstruction error is low.

This is the English repo+PDF research artifact for the June 2026 paper. It contains the canonical paper, compact public result summaries, Python reference experiments, official SAQ comparison notes, and R6 end-to-end beam+rerank evidence.

DOI: 10.5281/zenodo.20517605

The core failure mode is visible in the first few bits: PCA concentrates variance, overloads early prefix bins, and wastes Resolution Capacity; Hadamard spreads signal across dimensions, keeping prefix bins balanced enough for high-recall beam filtering.

What Is In This Artifact

paper/ - canonical LaTeX source, compiled PDF, generated figures, and figure source CSVs.
experiments/python_reference/ - original Python reference experiments for Hadamard/PCA, early exit, Voronoi-local negative results, CAQ-style baseline, and Resolution Capacity diagnostics.
experiments/EXPERIMENT_REPORT_PYTHON.md - English journal for the Python research phase.
experiments/RUST_ACCELERATION_REPORT.md - English journal for Rust-backed 25k/1M validation, official SAQ, and R6 beam+rerank.
experiments/results/ - compact public CSV/JSON summaries supporting the paper and README claims.
docs/provenance.md - release provenance and attribution notes.

Paper Claims

Claim	Measured setting	Result	Scope
Resolution Capacity	BGE-small and E5-small, `N=25,000`, `b=8`	Hadamard keeps `eta_b > 0.78`; PCA falls to `eta_b=0.256` on E5-small	Rotation/distribution diagnostic
Hadamard vs PCA	BGE-small and E5-small	Hadamard improves Recall@8 by 19-35 percentage points	Progressive retrieval objective
OPQ/PQ comparison	E5-small, equal 8-bit-per-dimension budget	Hadamard 99.6% vs OPQ/PQ 51.6% Recall@10	Equal-budget diagnostic
Isotropy Paradox	Voronoi-local Hadamard experiments	Local k-means cells do not improve bit-1 rank signal after Hadamard	Negative finding
Oracle Gap	Early-exit experiments	Optimal exit is 1.00-1.07 bits; practical exit is 3.70-5.08 bits	Stopping-confidence finding
Rust SIMD	25k parity checks and 1M-scale scans	41-45x faster than dense prefix-score materialization	Kernel implementation result
R6 beam+rerank	SIFT1M, BGE@1M, GIST1M	99.80-99.94% Recall@10 at 2.9-15.1 ms/query	Full end-to-end pipeline
SAQ artifact	BGE/E5 25k, AVX-512, SAQ commit `2163ebc`	SAQ reaches higher direct recall; bitplane direct prefix and beam+rerank occupy different regimes	Closest-prior benchmark
Storage	8-bit scalar codes vs float32 vectors	One `uint8` per dimension, 4x smaller than float32, no graph overhead	Memory tradeoff

HNSW remains much faster end-to-end in the current measurements. The paper positions bitplane-ann as a compact, graph-free progressive candidate-generation and capacity-control method whose practical differentiator is memory, not current latency.

Method

Normalize vectors.
Pad to the next power-of-2 dimension d' >= d.
Apply randomized Walsh-Hadamard rotation H_d' * diag(s) * x, where s in {-1,+1}^d'.
Lloyd-Max quantize each coordinate to an 8-bit scalar code using data-oblivious per-dimension codebooks.
Store one contiguous uint8 array of shape N x d.
At query time, compute prefix-L1 distance from the top b bits:

prefix_dist_b(q, c) = sum_j |(q_j >> (8-b)) - (c_j >> (8-b))|

For beam mode, keep floor(N / 2^b) candidates after each depth and rerank final survivors with float32 cosine/IP. The same uint8 array serves every bit depth: no graph, no centroid routing, no duplicated index layout.

Resolution Capacity

The paper uses the current variable names and definitions below.

Let B_{j,b}(X) be the set of non-empty b-bit scalar quantization bins for dimension j of corpus X, with |B_{j,b}(X)| <= 2^b.

Marginal Bin Load

R_b(X) = (1/d) * sum_j N / |B_{j,b}(X)|

R_b is the average number of corpus vectors per occupied bin per dimension. Under a perfectly uniform distribution, R_uniform ~= N / 2^b, and R_b >= R_uniform.

Marginal Resolution Capacity

eta_b(X) = R_uniform / R_b(X) = (N / 2^b) / R_b(X), in (0, 1]

eta_b=1 means perfect uniform partition utilization. eta_b << 1 means capacity collapse: a few bins overflow with imposters while many bins sit empty. Higher eta_b is better.

For direct prefix retrieval on BGE-small and E5-small text embeddings at N=25,000, b=8, the observed thresholds are:

Regime	Condition	Direct-mode behavior
Safe	`R_b < 130` and `eta_b > 0.75`	Recall@8 >= 99%
Caution	`130 <= R_b <= 200` and `0.49 <= eta_b <= 0.75`	Recall@8 90-99%
Danger	`R_b > 200` and `eta_b < 0.49`	Recall@8 < 80%

These thresholds are empirical direct-mode diagnostics, not corpus-size limits. At 1M scale, R_b=3,906 is far beyond the direct-mode danger region, so direct prefix recall drops to 71-85%. Beam+rerank recovers to 99.80-99.94% Recall@10 because the beam schedule intentionally keeps N / 2^b candidates and reranks them with float32 vectors.

Core Results

Resolution Capacity at 25k, b=8

Dataset	Rotation	`R_b`	`eta_b`	Recall@8
BGE-small	PCA	190	0.514	80.85%
BGE-small	Hadamard	100	0.976	99.8%
E5-small	PCA	382	0.256	64.35%
E5-small	Hadamard	124	0.787	99.6%

1M beam retention at depth 8

Dataset	Dimensions	Beam	Beam retention
SIFT1M	128	3,906 / 1M	100.00%
BGE@1M	384	3,906 / 1M	100.00%
GIST1M	960	3,906 / 1M	100.00%

This retention table is a prior R0-R3 diagnostic: it asks whether the exact top-10 neighbors appear inside the depth-8 prefix beam. R6 is the authoritative end-to-end pipeline measurement.

R6 end-to-end beam+rerank

Dataset	Recall@10	Total ms/query	Bytes/vector	Notes
SIFT1M	99.94%	2.87	128	depth=8, beam=3,906
BGE@1M	99.80%	5.60	384	depth=8, beam=3,906
GIST1M	99.84%	15.10	960	depth=8, beam=3,906

Diagnostic latency comparison at 1M

Dataset	Method	Recall@10	ms/query	Bytes/vector
SIFT1M	HNSW	99.76%	0.15	512 + graph
SIFT1M	bitplane beam+rerank	99.94%	2.87	128
BGE@1M	HNSW	99.64%	0.28	1,536 + graph
BGE@1M	bitplane beam+rerank	99.80%	5.60	384
GIST1M	HNSW	97.00%	0.98	3,840 + graph
GIST1M	bitplane beam+rerank	99.84%	15.10	960

The paper's caveat is explicit: current Rust byte-SAD prefix-L1 kernels are 4-8x slower than FAISS exact at d >= 384, and HNSW is 15-20x faster end-to-end. The practical differentiator is compact storage and no graph overhead.

Official SAQ artifact vs bitplane direct-prefix diagnostic

SAQ was run from the official C++ artifact at commit 2163ebc on AVX-512 hardware with K=4,096, B in {1,8}, nprobe in {8,32,128,512}, and 4 threads. These are direct-retrieval operating points, not R6 beam+rerank.

Dataset	Method	Config	Recall@10	ms/query
BGE-small	SAQ	`B=1`, `nprobe=512`	75.4%	92.2
BGE-small	SAQ	`B=8`, `nprobe=512`	96.3%	95.8
BGE-small	bitplane direct prefix	depth=8	76.1%	13.4
E5-small	SAQ	`B=1`, `nprobe=512`	68.2%	91.8
E5-small	SAQ	`B=8`, `nprobe=512`	93.3%	96.3
E5-small	bitplane direct prefix	depth=8	70.9%	13.4

Honest Caveats

No dedicated SIMD bitplane kernels yet; production speed requires AVX-512 bitplane kernels.
eta_b/R_b analysis is primarily validated at 25k; systematic sweeps across modalities and scales are future work.
R6 validates end-to-end beam+rerank latency, but HNSW remains much faster.
Bimodal adaptive exit is not confirmed; compute savings are consistent across queries, not query-dependent.
Official SAQ was benchmarked as-is with K=4,096; results may differ at K ~= sqrt(N).

Reproduce The Public Figures

Install the lightweight Python dependencies and rebuild the figures from tracked CSV/JSON summaries:

uv sync
uv run python paper/make_figures.py

The script writes:

paper/figures/fig_recall_curves.pdf
paper/figures/fig_scale_rust.pdf
matching *_source.csv files

Rebuild The Paper

cd paper
tectonic main.tex

If tectonic is unavailable, use a standard LaTeX installation:

cd paper
pdflatex main.tex
pdflatex main.tex

The canonical paper files are:

paper/main.tex
paper/main.pdf

Reproduce The Research Scripts

The Python reference scripts live under experiments/python_reference/.

Smoke example:

uv run --extra experiments python experiments/python_reference/experiment0_bitplanes.py \
  --corpus-size 1000 \
  --queries 50 \
  --sample-queries 5 \
  --prune-queries 10 \
  --out-dir experiments/results/python_reference_smoke

Full 25k runs may download Hugging Face datasets/models and may create local .npy caches. Those large caches are intentionally not tracked.

The R6 large-scale beam+rerank result is archived as compact output files under experiments/results/rust_acceleration/. Re-running R6 requires the external implementation workspace bitplane-infer, its Rust/Python backend, and local 1M-scale datasets; those large implementation and data directories are intentionally not vendored here.

Results To Read First

Repository Layout

paper/                    Canonical LaTeX paper and generated figures
experiments/python_reference/
                          Python reference experiments
experiments/results/       Compact public CSV/JSON result summaries
docs/assets/               README and explanatory visual assets
docs/provenance.md         Artifact provenance and attribution

How to Cite

@misc{bitplane-ann2026,
  author       = {Anokhin, Oleksandr},
  title        = {Resolution Capacity: Beyond Reconstruction MSE for Progressive Approximate Nearest-Neighbor Search},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.20517605},
  url          = {https://doi.org/10.5281/zenodo.20517605}
}

License

Software, scripts, configuration, and compact reproducibility artifacts: Apache-2.0. See LICENSE.
Paper source, compiled paper PDF, and paper figures under paper/: CC BY 4.0. See LICENSE-PAPER.md.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
docs		docs
experiments		experiments
paper		paper
.gitignore		.gitignore
LICENSE		LICENSE
LICENSE-PAPER.md		LICENSE-PAPER.md
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What Is In This Artifact

Paper Claims

Method

Resolution Capacity

Core Results

Honest Caveats

Reproduce The Public Figures

Rebuild The Paper

Reproduce The Research Scripts

Results To Read First

Repository Layout

How to Cite

License

About

Licenses found

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

What Is In This Artifact

Paper Claims

Method

Resolution Capacity

Core Results

Honest Caveats

Reproduce The Public Figures

Rebuild The Paper

Reproduce The Research Scripts

Results To Read First

Repository Layout

How to Cite

License

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages