Skip to content

Filyus/packed_spatial_index

Repository files navigation

Packed Spatial Index

crates.io docs.rs Rust CI MSRV License

A fast, packed static spatial index for 2D and 3D axis-aligned bounding boxes (AABBs). Pack the boxes into a Hilbert R-tree once, then run millions of queries of every kind:

  • range / intersection search
  • nearest neighbors (kNN) from a point or a box, under Euclidean or any custom metric — including great-circle distance for lon/lat data
  • ray casts (all hits or the closest)
  • spatial joins between two indexes
  • region / culling — 2D triangle / convex-polygon and 3D view-frustum queries that prune to the true shape: ~1.5–7× fewer hits and ~2–14× faster than the bounding-box workaround (synthetic 200k-box bench)

Queries run on runtime-dispatched SIMD — the widest kernel your CPU offers is chosen at load time (AVX-512 → AVX2 → SSE2), no special build flags. Range search runs ~1.6–1.9× over the scalar index on AVX-512 and ~1.3–1.65× on AVX2 (it emulates the missing compress instruction so the win holds on older CPUs too). Builds beat comparable Rust indexes too (benchmarks). The same bytes load back as zero-copy, mmap-friendly views; a file can carry an optional per-item payload and file-level metadata; and a streaming reader answers a windowed query over a 100 MB index on object storage in a handful of range reads, without loading the whole file.

Live WASM demo

use packed_spatial_index::{Index2DBuilder, Box2D};

let mut builder = Index2DBuilder::new(2);
builder.add(Box2D::new(0.0, 0.0, 1.0, 1.0));
builder.add(Box2D::new(5.0, 5.0, 6.0, 6.0));
let index = builder.finish()?;

let hits = index.search(Box2D::new(0.0, 0.0, 2.0, 2.0));
assert_eq!(hits, vec![0]);
# Ok::<(), packed_spatial_index::BuildError>(())

Installation

Requires Rust 1.89 or newer.

[dependencies]
packed_spatial_index = "0.18"

When to use it

Use this crate when your geometry is static or rebuilt in batches, you can key results by insertion-order index into your own payload array, and you want a compact in-memory (or mmap'd) index with reusable buffers for high query throughput. It is not a dynamic R-tree — there are no insert/delete operations after finish().

Performance

Built for throughput on static geometry: fast builds, SIMD range / kNN / raycast (AVX2 / AVX-512), and reusable query buffers for tight loops. The Performance page has the full benchmarks against static_aabb2d_index, FlatGeobuf and the bvh crate, showing where it leads and where it doesn't — plus build flags for AVX2 / AVX-512 codegen (-C target-cpu=native).

Queries at a glance

Every in-memory f64 query (range, kNN, raycast, join) exists on Index2D / Index3D, the simd-feature SimdIndex2D / SimdIndex3D, and the zero-copy views. The compact f32 indexes and the streaming reader cover a subset (see the coverage matrix). Range/ray results are item indices in insertion order; result order is unspecified. For a boolean "any overlap?" reach for any (no allocation, stops at the first hit) rather than search(..).is_empty(); search returns an owned Vec, so in hot loops reuse a buffer (search_into / search_with) or fold with visit. See the guide and docs.rs for full per-method docs.

Query Methods
Range / intersection search, search_iter, search_into, search_with, any, first, visit
Nearest neighbors (point) neighbors, neighbors_within, neighbors_into, neighbors_with, visit_neighbors
Nearest neighbors (box) neighbors_of_box, neighbors_of_box_within, neighbors_of_box_into, neighbors_of_box_with, visit_neighbors_of_box
Geographic / custom-metric kNN neighbors_metric, neighbors_metric_into, visit_neighbors_metric — pass a |box| -> f64 distance (e.g. haversine_distance_2d for lon/lat)
Ray segment raycast, raycast_into, raycast_with, raycast_closest, raycast_closest_with, visit_raycast
Spatial join join, join_with, self_join, self_join_with
Region / culling 2D on Index2D / Index2DView: triangle search_triangle / any_triangle / visit_triangle, convex polygon search_polygon / any_polygon / visit_polygon (+_into). 3D frustum on Index3D / Index3DView: search_frustum / any_frustum / visit_frustum (+_into)
Extent / exact extent, and search_exact / neighbors_exact on the f32 indexes
# use packed_spatial_index::{Index2DBuilder, Box2D, Point2D, Ray2D};
# let mut b = Index2DBuilder::new(2);
# b.add(Box2D::new(0.0, 0.0, 1.0, 1.0));
# b.add(Box2D::new(5.0, 5.0, 6.0, 6.0));
# let index = b.finish()?;
let overlaps = index.search(Box2D::new(0.0, 0.0, 2.0, 2.0)); // range query
let nearest = index.neighbors(Point2D::new(5.5, 5.5), 1);    // kNN
let hit = index.raycast_closest(Ray2D::new(Point2D::new(-1.0, 0.5), 1.0, 0.0, 10.0));
assert_eq!(overlaps, vec![0]);
assert_eq!(nearest, vec![1]);
assert_eq!(hit, Some((0, 1.0)));
# Ok::<(), packed_spatial_index::BuildError>(())

Types at a glance

A full coverage matrix (which index type answers which query, and why some cells are empty by design) is in the guide.

Serialization & metadata

to_bytes / from_bytes round-trip an index; the serialize() builder adds the optional pieces — one opaque payload blob per item and descriptive metadata (coordinate reference system, payload content type, attribution):

# use packed_spatial_index::{Box2D, Index2DBuilder, read_metadata};
# let mut b = Index2DBuilder::new(1);
# b.add(Box2D::new(0.0, 0.0, 1.0, 1.0));
# let index = b.finish()?;
let bytes = index
    .serialize()
    .crs("EPSG:4326")
    .payloads(&[b"feature-0".as_slice()])
    .to_bytes()?;

// Read the metadata back without loading the index.
assert_eq!(read_metadata(&bytes)?.crs.as_deref(), Some("EPSG:4326"));
# Ok::<(), Box<dyn std::error::Error>>(())

The metadata is opaque (the crate stores the strings you give it, verbatim). Pair query results with their payloads via the zero-copy views or the streaming reader — see Persistence and the binary format.

When every record is the same size, .records(stride, ..) (or .triangles(..) for Triangle2D / Triangle3D, and the compact Triangle2DF32 / Triangle3DF32) stores a fixed-width payload: no offset table, so the file is smaller, a streamed query reads one fewer time, and a view can borrow the records as a zero-copy typed slice. A triangle payload plus the index over each triangle's bounding box (Index3D::from_triangles) is a streamable mesh BVH; raycast finds candidates and Ray3D::closest_triangle does the exact hit (the f32 records test 8 at a time with simd). See the raycast_mesh example.

For half the box bytes in memory and on the wire, build the same thing on the compact f32 index: Index3DF32::from_triangles(..).serialize().triangles(..) then stream it with StreamIndex3DF32f32-storage alone, no simd needed. The stored f32 boxes are rounded outward, so range and ray results are a conservative superset; search_exact refines them against your f64 boxes.

Features

Feature Pulls in Adds
parallel (default) rayon adaptive parallel index builds
simd (default) wide SoA indexes + SIMD search / raycast (AVX2 / AVX-512)
f32-storage compact f32-box indexes (scalar Index2DF32 / Index3DF32; the SimdIndex*F32 variants also need simd)
stream query a serialized index over a RangeReader (local file or remote object) without loading it whole
async futures-util (implies stream) query over an AsyncRangeReader (browser / edge worker, HTTP range or object storage)
bench-internals hidden support API for this crate's benchmarks

Serialization, metadata, and the scalar indexes are always available — no feature required.

cargo build --no-default-features                      # minimal: scalar + serialize + metadata
cargo build --no-default-features --features simd      # SIMD only

Documentation

  • Guide — recipes, choosing a query method, builder configuration, examples, WASM demo.
  • Persistence — serialize / load / zero-copy views, querying large or on-disk indexes via mmap, and streaming queries over a RangeReader (local file or remote object).
  • Performance — benchmarks vs static_aabb2d_index, FlatGeobuf, and the bvh crate.
  • Internals — technique deep-dives: SIMD kernels, two-queue kNN, traversal prefetch.
  • Binary format — the PSINDEX on-disk layout.
  • API referencedocs.rs/packed_spatial_index.

Limitations

  • Static: rebuild when the dataset changes; no insert/delete.
  • Results are item indices, not stored payloads; result order is unspecified.
  • f32-storage indexes store outward-rounded boxes — plain range search may return extra near-boundary hits; use search_exact / neighbors_exact (with your source f64 boxes) for exact results, and prefer f64 indexes for exact queries with many hits.

Safety

The public API is safe Rust; unsafe is confined to narrow, audited paths (validated unaligned reads, repr(C) bulk copies, gated x86-64 SIMD). Serialized input is treated as untrusted: the in-memory loaders validate the whole buffer before use, and the streaming reader validates pointers and payload offsets as it follows them, with per-query cost limits to bound broad queries. See SAFETY.md for the memory-safety and untrusted-input hardening details.

Status

Pre-1.0: the API and on-disk format may still change between minor releases. The crate is covered by unit, property and fuzz tests across the feature matrix, but it has not yet been proven in production. Validate it for your workload before you depend on it.

Feedback

Built something with it? I'd love to hear about it! Start a discussion with your use case, your numbers or any rough edges, and file an issue for bugs. Real-world reports are what push it toward 1.0.

Development

AI assistance is part of building this project. The architecture is human-directed and the generated output is reviewed carefully before it ships, also a broad test suite catches mistakes.

License

Licensed under the Apache License, Version 2.0.

About

Packed Spatial Index

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages