You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+50Lines changed: 50 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -5,6 +5,55 @@ All notable changes to SQL4Json are documented in this file.
5
5
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
8
+
## [1.2.0] - 2026-05-02
9
+
10
+
### Added — Grammar Introspection API
11
+
A public IDE-tooling surface in the new JPMS-exported `io.github.mnesimiyilmaz.sql4json.grammar` package — for syntax highlighters, completion popups, and lightweight static analysers that need to reason about SQL4Json text without depending on ANTLR.
12
+
-`SQL4JsonGrammar` with `keywords()`, `functions()`, and `tokenize(String)` static views; the tokenizer never throws — unrecognised spans surface as `BAD_TOKEN`.
-`FunctionInfo` / `Category` (record + enum) — function catalog entries carrying category, arity, signature, and description.
15
+
-`FunctionRegistry.scalarFunctionNames()` / `valueFunctionNames()` / `aggregateFunctionNames()` — unmodifiable views over registered names, used by tooling consumers and drift tests.
16
+
- Drift tests guard the hand-maintained `KEYWORDS`, `FUNCTIONS`, and `TOKEN_KIND_BY_TYPE` tables — adding a lexer rule or function without updating the catalog now fails CI.
17
+
18
+
### Added — Array Predicates
19
+
- Five new condition operators for querying values inside JSON array fields, modelled on PostgreSQL's native-array operators.
20
+
-`tags CONTAINS 'admin'` — keyword operator for scalar membership.
- Parameter binding follows the JDBC / Hibernate / jOOQ pattern: `tags @> :myList` (or `?`) binds a whole `Collection` to one slot; `ARRAY[?, ?]` is element-by-element with one scalar bind per slot. No collection-expansion inside `ARRAY[…]`. Bind-time validation raises `SQL4JsonBindException` for type mismatches (collection in `ARRAY[?]` slot, scalar in bare-array-RHS, collection in `CONTAINS`).
27
+
-`<`, `>`, `<=`, `>=` against an `ARRAY[…]` literal raise `SQL4JsonParseException` at parse time with a clear message.
28
+
- Field state on the LHS: missing, JSON null, scalar, or object → all five operators return `false` for that row (no exception).
29
+
-`JOIN` aliases work — array predicates resolve through the same alias-aware path as the rest of the engine; flat-key reassembly fallback handles post-JOIN merged rows.
30
+
- Catalog: `SQL4JsonGrammar.keywords()` now includes `ARRAY` and `CONTAINS`; `tokenize(...)` surfaces `@>`, `<@`, `&&` as `TokenKind.OPERATOR` and `[`, `]` as `TokenKind.PUNCTUATION`.
31
+
32
+
### Added — Command-line Interface
33
+
- Command-line entrypoint shipping as a separate shaded jar with classifier `cli` (`sql4json-1.2.0-cli.jar`). Flags: `-q/--query` (literal or `@path`), `-f/--file`, `-o/--output`, `--data name=path` (repeatable, multi-source JOIN), `-p/--param name=<json>` (repeatable, named-parameter bind), `--pretty`, `-h/--help`, `-v/--version`. Stable exit codes: `0` success / `--help` / `--version`, `1` runtime failure (SQL4Json error, IO error), `2` usage error. `SQL4JSON_DEBUG=1` attaches a full stack trace to failure messages on stderr.
34
+
- Library jar (`sql4json-1.2.0.jar`) unchanged — pure library, no `Main-Class`. The `io.github.mnesimiyilmaz.sql4json.cli` package is intentionally non-exported from `module-info.java`; the *flag set* and *exit codes* are the stable surface, not the implementation classes.
35
+
-`JsonSerializer.prettySerialize(JsonValue)` — public sibling of `serialize(JsonValue)` for two-space pretty-printing. Empty objects and arrays stay compact; output has no trailing newline. Drives the CLI `--pretty` flag.
36
+
37
+
### Added — Performance Profiling
38
+
- New `docs/performance.md` reference document with the full sweep across seven sizes (8 MB → 512 MB) and ~50 scenarios, the reference environment, dataset details, and the regen recipe. `README.md` gains a concise headline table that links to the full doc.
39
+
-`ProfilingTest` now runs each scenario `N` times (default `3`, configurable via `-Dprofiling.runs=N`) and reports the median wall-clock time per `(label, size)` cell. Report header now records total RAM, initial heap, profiling-runs count, OS arch, and the data seed read from `src/test/resources/data-files/SEED`. Data files are byte-reproducible via `generate_json.py --seed`.
40
+
41
+
### Changed
42
+
-`NOW()` is no longer a dedicated lexer literal; the `VALUE_FUNCTION` rule was retired and `NOW()` lexes as a regular function call, dispatched at parse time to `Expression.NowRef` (lazy / per-row) or an eager `SqlDateTime`. `containsNonDeterministic` still fires in every path, so cache-bypass behaviour is preserved.
43
+
-`IN` / `BETWEEN` non-literal operands generalised: any non-literal element/bound (not just `ParameterRef`) now flows through `ConditionContext.valueExpressions` / `lowerBoundExpr` / `upperBoundExpr` and is evaluated per-row. `ParameterSubstitutor` snapshots `NowRef` to a literal `SqlDateTime` at substitute time, so all rows in a parameterized execution see the same timestamp (JDBC-style "bind once, execute").
44
+
- String functions auto-coerce non-string inputs via `rawValue().toString()` (matching `CONCAT`): `LOWER`, `UPPER`, `SUBSTRING`, `TRIM`, `LENGTH`, `LEFT`, `RIGHT`, `LPAD`, `RPAD`, `REVERSE`, `REPLACE`, `POSITION`. String-typed argument positions coerce too; numeric positions (e.g. `SUBSTRING` start/length) are unchanged. Null input still short-circuits to `NULL`.
45
+
-`TO_DATE` consolidates non-string inputs through the same coerce-then-parse path; `SqlDate` / `SqlDateTime` pass through unchanged.
46
+
- Whitespace lexer channel: the `ESC` rule emits to `channel(HIDDEN)` instead of `-> skip`, so `tokenize()` can surface whitespace runs as `TokenKind.WHITESPACE`. Query parsing is unaffected (parser still filters by `DEFAULT_CHANNEL`).
47
+
- Sealed `JsonNumberValue` and `SqlNumber` types: split into `JsonLongValue` / `JsonDoubleValue` / `JsonDecimalValue` and `SqlLong` / `SqlDouble` / `SqlDecimal`. Primitives stored unboxed in the long/double variants — per-instance footprint roughly halves on row-materializing workloads. `JsonNumberValue` and `SqlNumber` become sealed interfaces; pattern-destructure call sites switch over the typed variants.
48
+
-`FlatRow` materialization: GROUP BY, HAVING, WINDOW, ORDER BY, JOIN, DISTINCT, SELECT and the engine pre-flatten path now emit `Object[]`-backed rows keyed by ordinal via a shared `RowSchema`. Lazy `Row` is unchanged for streaming WHERE / lazy SELECT. A `null` slot decodes as `SqlNull.INSTANCE` on read.
49
+
-`RowAccessor` sealed interface bridges lazy `Row` and `FlatRow` across `ExpressionEvaluator`, condition handlers (`InConditionHandler`, `BetweenConditionHandler`, `LikeConditionHandler`, `NotLikeConditionHandler`, `ComparisonConditionHandler`, `NullCheckConditionHandler`, `ArrayPredicateConditionHandler`), `ArrayPathNavigator`, `CriteriaNode`, the pipeline (`Stream<RowAccessor>`), `JsonUnflattener`, and `GroupAggregator`.
50
+
-`WindowStage` writes window results into per-row `Object[]` buffers indexed by `RowSchema.windowSlot`. The legacy `Row.windowResults` / `windowResultsByAlias` maps are gone; alias mirroring uses `RowSchema.withWindowSlots(calls, aliases)` so the SELECT alias becomes the canonical column key for the slot. CASE-buried windows resolve through the same schema-slot lookup via `Row.getWindowResult`.
51
+
-`SqlValueComparator` adds typed pattern-matched fast paths for `(SqlLong, SqlLong)` / `(SqlLong, SqlDouble)` / `(SqlDouble, SqlLong)` / `(SqlDouble, SqlDouble)` — avoids the `Number.doubleValue()` boxing on the WHERE / ORDER BY hot path. `SqlDecimal` involvement still routes through the generic `doubleValue()` compare.
52
+
53
+
### Fixed
54
+
- Window-only functions without `OVER` (`ROW_NUMBER`, `RANK`, `DENSE_RANK`, `NTILE`, `LAG`, `LEAD`) now raise `SQL4JsonParseException` at parse time with a clear message, instead of the misleading runtime *"Scalar function 'X' requires at least one argument"*. Aggregate functions remain valid without `OVER`.
55
+
- Whole-number literals serialize as integers: `SELECT 42 AS x FROM $r` now produces `"x":42` instead of `"x":42.0`, matching column-from-JSON values.
0 commit comments