feat(reader): IS NULL / IS NOT NULL chunk pruning via null_count #128

Merged: dfa1 merged 3 commits into main from feat/null-count-pruning on Jun 21, 2026

dfa1 (Owner) commented Jun 21, 2026

Writer now records per-segment null_count in the ArrayNode ArrayStats
(the canonical fbs field Rust already writes), forcing defaults during the
build so null_count == 0 persists (flatbuffers omits a field equal to its
default of 0 otherwise; Rust writes 0 too). Reader parses it in ArrayStats.fromFbs.
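
Why a persisted zero matters can be shown with a toy model. The sketch below does not use the flatbuffers library or the project's classes: `ForceDefaultSketch`, `encode`, and `decodeNullCount` are hypothetical names, and the assumption that the reader treats an absent field as "unknown" rather than zero is mine, not stated in this PR. In the real Java library the corresponding switch is, to my knowledge, `FlatBufferBuilder.forceDefaults(boolean)`; whether the writer uses exactly that call is not confirmed here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

final class ForceDefaultSketch {
    /**
     * Toy encoder (hypothetical): like a default-omitting serializer, it skips
     * a field whose value equals the default (0) unless defaults are forced.
     */
    static Map<String, Long> encode(long nullCount, boolean forceDefaults) {
        Map<String, Long> fields = new HashMap<>();
        if (forceDefaults || nullCount != 0) {
            fields.put("null_count", nullCount);
        }
        return fields;
    }

    /**
     * Toy reader (assumption): an absent field is reported as unknown,
     * not as zero, so it cannot be used for pruning.
     */
    static OptionalLong decodeNullCount(Map<String, Long> fields) {
        Long value = fields.get("null_count");
        return value == null ? OptionalLong.empty() : OptionalLong.of(value);
    }
}
```

Under these assumptions, `encode(0, false)` decodes to an empty value while `encode(0, true)` decodes to a known zero, which is the case an IS NULL prune depends on.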

Adds RowFilter.isNull / isNotNull and the two canPruneChunk arms:
IS NULL skips chunks with zero nulls, IS NOT NULL skips all-null chunks.
Same chunk-level zone-map pruning machinery as min/max (not a row filter).
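
The two pruning arms described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ChunkStats`, `NullFilter`, and `NullCountPruningSketch` are hypothetical stand-ins for the project's `ArrayStats`, `RowFilter`, and `canPruneChunk`, and the rule that a chunk with no recorded null_count is never pruned is an assumption about the conservative default.

```java
import java.util.OptionalLong;

/** Hypothetical per-chunk statistics; not the project's ArrayStats. */
record ChunkStats(long rowCount, OptionalLong nullCount) {}

/** Hypothetical filter kinds standing in for RowFilter.isNull / isNotNull. */
enum NullFilter { IS_NULL, IS_NOT_NULL }

final class NullCountPruningSketch {
    /** Returns true only when the stats prove that no row in the chunk can match. */
    static boolean canPruneChunk(NullFilter filter, ChunkStats stats) {
        if (stats.nullCount().isEmpty()) {
            return false; // assumption: unknown null_count means the chunk must be read
        }
        long nulls = stats.nullCount().getAsLong();
        return switch (filter) {
            // IS NULL cannot match anything in a chunk with zero nulls.
            case IS_NULL -> nulls == 0;
            // IS NOT NULL cannot match anything in an all-null chunk.
            case IS_NOT_NULL -> nulls == stats.rowCount();
        };
    }
}
```

For example, `canPruneChunk(NullFilter.IS_NULL, new ChunkStats(100, OptionalLong.of(0)))` returns `true`, while the same filter on a chunk with 3 nulls returns `false`. The decision is made once per chunk from its statistics; rows inside a surviving chunk are not filtered by it.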

NullCountPruningTest proves the right chunks are pruned; full
writer/reader/cli/inspector suites + Rust JNI interop green.

Co-Authored-By: Claude Opus 4.8 [email protected]

dfa1 and others added 3 commits June 21, 2026 17:04
Per-chunk null_count now flows into columnStats(): aggregateStats sums
counts across flats, reporting the total only when every chunk carries
one (a single missing count makes the column total unknown -> null).
Also fixes the early-return guard, which dropped null_count for all-null
columns that have no min/max.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
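
The aggregation rule in the commit message above (sum the per-chunk counts, but report a total only when every chunk carries one) can be sketched like this. The class and method names are hypothetical, and `OptionalLong.empty()` stands in for the `null` the commit message mentions; the actual `aggregateStats` signature is not shown in this PR.

```java
import java.util.List;
import java.util.OptionalLong;

final class NullCountAggregationSketch {
    /**
     * Sums per-chunk null counts. A single chunk without a recorded count
     * makes the column total unknown, so the result is empty in that case.
     */
    static OptionalLong totalNullCount(List<OptionalLong> perChunk) {
        long total = 0;
        for (OptionalLong count : perChunk) {
            if (count.isEmpty()) {
                return OptionalLong.empty(); // one missing count -> total unknown
            }
            total += count.getAsLong();
        }
        return OptionalLong.of(total);
    }
}
```

With counts `[2, 0, 5]` the total is 7; with `[2, unknown, 5]` the result is empty. Note that this sketch returns 0 for an empty chunk list, which is a choice made here for simplicity rather than something the PR specifies.
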
dfa1 merged commit cb844f2 into main Jun 21, 2026
0 of 6 checks passed
dfa1 deleted the feat/null-count-pruning branch June 21, 2026 15:16