Cache min/max indices in patches by palaska · Pull Request #7753 · vortex-data/vortex

palaska · 2026-05-01T15:37:25Z

Cache min/max patch indices in `Patches` and short-circuit `search_index` when the query falls outside that range. This speeds up ALP/BitPacked `scalar_at`, and it should also help when slicing any patched array into a region that doesn't overlap the patch range.

Signed-off-by: Baris Palaska <[email protected]>
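For readers skimming the thread, the idea in the description can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `Patches` struct here is a hypothetical stand-in that stores sorted `u64` indices in a `Vec`, and `range_is_unpatched` is an invented helper name used only to show the slicing case.

```rust
/// Hypothetical, simplified stand-in for a patches structure: the sorted
/// positions that carry a patch, plus the cached smallest and largest one.
struct Patches {
    indices: Vec<u64>,          // sorted ascending, no duplicates
    bounds: Option<(u64, u64)>, // (min, max); None when there are no patches
}

impl Patches {
    fn new(indices: Vec<u64>) -> Self {
        debug_assert!(indices.windows(2).all(|w| w[0] < w[1]));
        // The indices are sorted, so the bounds are the first and last entries.
        let bounds = match (indices.first(), indices.last()) {
            (Some(&min), Some(&max)) => Some((min, max)),
            _ => None,
        };
        Self { indices, bounds }
    }

    /// Position of `index` within the patch indices, if that index is patched.
    fn search_index(&self, index: u64) -> Option<usize> {
        let (min, max) = self.bounds?;
        if index < min || index > max {
            // Outside the patch range: skip the binary search entirely.
            return None;
        }
        self.indices.binary_search(&index).ok()
    }

    /// True when the half-open range `start..end` cannot contain any patch.
    /// A `false` result only means the range overlaps the cached bounds.
    fn range_is_unpatched(&self, start: u64, end: u64) -> bool {
        match self.bounds {
            None => true,
            Some((min, max)) => end <= min || start > max,
        }
    }
}
```

With patches at positions 100, 105 and 230, a lookup for index 7 returns `None` after two comparisons, and a slice covering `0..100` can be recognized as patch-free without touching the indices at all.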

codspeed-hq · 2026-05-01T15:44:10Z

Merging this PR will degrade performance by 31.73%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 16 improved benchmarks
❌ 71 regressed benchmarks
✅ 1119 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ⚡ | WallTime | `cuda/bitpacked_u8/unpack/3bw[100M]` | 352.3 µs | 298.6 µs | +17.96% |
| ❌ | Simulation | `chunked_dict_primitive_canonical_into[f32, (1000, 100, 100)]` | 686.1 µs | 766.3 µs | -10.46% |
| ❌ | Simulation | `chunked_dict_primitive_canonical_into[u32, (1000, 100, 100)]` | 685.8 µs | 764.6 µs | -10.31% |
| ❌ | Simulation | `chunked_dict_primitive_canonical_into[u32, (1000, 10, 100)]` | 670.5 µs | 749.3 µs | -10.52% |
| ❌ | Simulation | `chunked_dict_primitive_canonical_into[u32, (1000, 100, 10)]` | 87.4 µs | 97.2 µs | -10.08% |
| ❌ | Simulation | `chunked_dict_primitive_canonical_into[f32, (1000, 10, 100)]` | 670.4 µs | 750.6 µs | -10.69% |
| ❌ | Simulation | `chunked_dict_primitive_into_canonical[f32, (1000, 10, 100)]` | 727.7 µs | 808.6 µs | -10% |
| ❌ | Simulation | `chunked_dict_primitive_into_canonical[u32, (1000, 100, 10)]` | 97.6 µs | 108.7 µs | -10.22% |
| ❌ | Simulation | `decode_primitives[f32, (1000, 2)]` | 17 µs | 19.8 µs | -14.19% |
| ❌ | Simulation | `decode_primitives[f32, (1000, 32)]` | 17.1 µs | 19.3 µs | -11.85% |
| ❌ | Simulation | `decode_primitives[f32, (1000, 4)]` | 17 µs | 19.9 µs | -14.71% |
| ❌ | Simulation | `decode_primitives[f32, (1000, 512)]` | 18.3 µs | 20.9 µs | -12.34% |
| ❌ | Simulation | `decode_primitives[f32, (1000, 8)]` | 17 µs | 19.3 µs | -11.75% |
| ❌ | Simulation | `decode_primitives[i64, (1000, 2)]` | 19.3 µs | 22.3 µs | -13.65% |
| ❌ | Simulation | `decode_primitives[i64, (1000, 4)]` | 19.3 µs | 22 µs | -12.43% |
| ❌ | Simulation | `decode_primitives[i64, (1000, 512)]` | 21.4 µs | 24.3 µs | -11.91% |
| ❌ | Simulation | `decode_primitives[i64, (1000, 8)]` | 19.3 µs | 22 µs | -11.94% |
| ❌ | Simulation | `decode_primitives[u8, (1000, 2)]` | 16.4 µs | 18.8 µs | -12.67% |
| ❌ | Simulation | `decode_primitives[u8, (1000, 32)]` | 16 µs | 18.4 µs | -13.09% |
| ❌ | Simulation | `decode_primitives[u8, (1000, 4)]` | 16 µs | 18.4 µs | -13.25% |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Comparing bp/minmax-index (8bc3d8a) with develop (f307edc)


joseph-isaacs

Indices arrays have min/max stats; these should be used instead.

We might not currently save or use these stats, but going by these benchmarks we should.

If these benchmarks are cleaned up, I am happy to merge those first.

joseph-isaacs · 2026-05-05T09:40:16Z

See: https://github.com/vortex-data/vortex/blob/e03bd1dcc63a34f260a16c05395146e26d1c9df3/docs/developer-guide/benchmarking.md


robert3005

Meta point: this is why you want special logic that's shared across all arrays. If we had a Patched array, that could be the one place where all of this logic lives without hacks. Alas, we need Patched array to get over the line.


palaska · 2026-05-06T12:50:00Z

Stats lookups (RwLock + hashmap + scalar conversion) were too expensive to be in the hot path (`search_index` is called in tight loops), so I still went with a `OnceLock` approach, combined with stats.
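A rough sketch of what "OnceLock combined with stats" could look like, for illustration only. The `Indices` type, its string-keyed stats map, and the method names below are assumptions made up for this example; they are not Vortex's actual stats API. The point is only that the lock and hash lookup happen once, and later calls read a plain cached value.

```rust
use std::collections::HashMap;
use std::sync::{OnceLock, RwLock};

/// Hypothetical stand-in for an indices array that carries a stats map.
struct Indices {
    values: Vec<u64>,                          // sorted ascending
    stats: RwLock<HashMap<&'static str, u64>>, // "min" / "max"; may be empty
}

impl Indices {
    /// Slow path: take the lock, do the hash lookups, and fall back to
    /// computing (and storing) the stats when they are missing.
    fn min_max(&self) -> Option<(u64, u64)> {
        {
            let stats = self.stats.read().unwrap();
            if let (Some(&min), Some(&max)) = (stats.get("min"), stats.get("max")) {
                return Some((min, max));
            }
        } // read guard dropped here, before taking the write lock
        let (min, max) = (*self.values.first()?, *self.values.last()?);
        let mut stats = self.stats.write().unwrap();
        stats.insert("min", min);
        stats.insert("max", max);
        Some((min, max))
    }
}

struct Patches {
    indices: Indices,
    // Filled on first use; afterwards a read is an atomic check plus a copy.
    bounds: OnceLock<Option<(u64, u64)>>,
}

impl Patches {
    fn new(values: Vec<u64>) -> Self {
        Self {
            indices: Indices { values, stats: RwLock::new(HashMap::new()) },
            bounds: OnceLock::new(),
        }
    }

    /// Cached (min, max) of the patch indices, resolved through stats once.
    fn bounds(&self) -> Option<(u64, u64)> {
        *self.bounds.get_or_init(|| self.indices.min_max())
    }

    fn search_index(&self, index: u64) -> Option<usize> {
        let (min, max) = self.bounds()?;
        if index < min || index > max {
            return None; // hot path: no lock, no hash lookup, no search
        }
        self.indices.values.binary_search(&index).ok()
    }
}
```

The trade-off in this sketch is one extra field per `Patches` value in exchange for keeping the stats machinery out of the tight loop.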


palaska · 2026-05-06T16:30:18Z

I think the RwLock -> ArcSwap change made setting stats more expensive.

joseph-isaacs · 2026-05-06T17:14:39Z

It does look that way. I wonder if it made reading them faster?

robert3005 · 2026-05-06T17:22:32Z

GitHub is failing me - ~~remove extra Arc around arcswap~~ this is not going to work, ArcSwap isn't Clone

cache (03723d2)

palaska added the changelog/performance label (A performance improvement) May 1, 2026

1k (705b9a5)

joseph-isaacs requested changes May 5, 2026

rm bench (ecc9ec1)

palaska mentioned this pull request May 5, 2026: Add Patches lookup benchmarks #7795 (Merged)

use stats (78b52ff)

palaska added a commit that referenced this pull request May 5, 2026: Add Patches lookup benchmarks (#7795) (1718eb3). Isolating benchmarks before this optimization: #7753

Merge branch 'develop' into bp/minmax-index (6efcd3e)

robert3005 reviewed May 5, 2026

palaska added 5 commits May 6, 2026 11:08:

still cache bounds (ed818d1)

Merge branch 'develop' into bp/minmax-index (8ad26a9)

min and max index use cached bounds (c4d5195)

Merge branch 'bp/minmax-index' of github.com:vortex-data/vortex into bp/minmax-index (0cfe98b)

Merge branch 'develop' into bp/minmax-index (e590c61)

palaska requested review from joseph-isaacs and robert3005 May 6, 2026 12:47

palaska added 2 commits May 6, 2026 16:23:

arcswap + scalarref (f99eabf)

Merge branch 'bp/minmax-index' of github.com:vortex-data/vortex into bp/minmax-index (8bc3d8a)

palaska wants to merge 12 commits into develop from bp/minmax-index
