feat(reader): IS NULL / IS NOT NULL chunk pruning via null_count #128
Merged
Writer now records per-segment null_count in the ArrayNode ArrayStats (the canonical fbs field Rust already writes), force-defaulting the build so null_count == 0 persists (flatbuffers omits default-0 otherwise; Rust writes 0 too). Reader parses it in ArrayStats.fromFbs.

Adds RowFilter.isNull / isNotNull and the two canPruneChunk arms: IS NULL skips chunks with zero nulls, IS NOT NULL skips all-null chunks. Same chunk-level zone-map pruning machinery as min/max (not a row filter).

NullCountPruningTest proves the right chunks are pruned; full writer/reader/cli/inspector suites + Rust JNI interop green.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
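For readers skimming the description, the two pruning arms can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name, the `NullPredicate` enum, and the parameter list are hypothetical, and it assumes a chunk without a recorded null_count must always be scanned.

```java
/** Illustrative sketch only; every name except canPruneChunk is hypothetical. */
class NullCountPruningSketch {
    enum NullPredicate { IS_NULL, IS_NOT_NULL }

    /**
     * Returns true when the chunk can be skipped without reading it.
     * nullCount is null when the chunk carries no recorded count (assumption).
     */
    static boolean canPruneChunk(NullPredicate predicate, long rowCount, Long nullCount) {
        if (nullCount == null) {
            return false; // unknown stats: the chunk must be scanned
        }
        switch (predicate) {
            case IS_NULL:
                return nullCount == 0L;       // no nulls, so no row can match IS NULL
            case IS_NOT_NULL:
                return nullCount == rowCount; // all nulls, so no row can match IS NOT NULL
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canPruneChunk(NullPredicate.IS_NULL, 100, 0L));       // true
        System.out.println(canPruneChunk(NullPredicate.IS_NOT_NULL, 100, 100L)); // true
        System.out.println(canPruneChunk(NullPredicate.IS_NULL, 100, 7L));       // false
        System.out.println(canPruneChunk(NullPredicate.IS_NOT_NULL, 100, null)); // false
    }
}
```

The sketch also shows why a persisted null_count == 0 matters: if the zero were omitted and read back as "unknown", the IS NULL arm could never prune.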
Per-chunk null_count now flows into columnStats(): aggregateStats sums counts across flats, reporting the total only when every chunk carries one (a single missing count makes the column total unknown -> null). Also fixes the early-return guard, which dropped null_count for all-null columns that have no min/max. Co-Authored-By: Claude Opus 4.8 <[email protected]>
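The all-or-nothing aggregation rule described above might look something like the sketch below. The class and method names are hypothetical stand-ins for whatever aggregateStats does internally, and returning 0 for an empty chunk list is an assumption of this sketch rather than something the commit states.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not the aggregateStats implementation. */
class NullCountAggregationSketch {
    /**
     * Sums per-chunk null counts. A null entry means that chunk has no
     * recorded count, which makes the column total unknown (null).
     */
    static Long sumNullCounts(List<Long> perChunkNullCounts) {
        long total = 0L;
        for (Long count : perChunkNullCounts) {
            if (count == null) {
                return null; // one missing count poisons the whole total
            }
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumNullCounts(Arrays.asList(3L, 0L, 2L)));   // 5
        System.out.println(sumNullCounts(Arrays.asList(3L, null, 2L))); // null
    }
}
```

Reporting null instead of a partial sum keeps consumers from mistaking a lower bound for the exact column total.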