v1.2.0
Added — Grammar Introspection API
A public IDE-tooling surface in the new JPMS-exported io.github.mnesimiyilmaz.sql4json.grammar package — for syntax highlighters, completion popups, and lightweight static analysers that need to reason about SQL4Json text without depending on ANTLR.
SQL4JsonGrammarwithkeywords(),functions(), andtokenize(String)static views; the tokenizer never throws — unrecognised spans surface asBAD_TOKEN.Token/TokenKind(record + enum) — tokenizer output with absolute, exclusive-end offsets.FunctionInfo/Category(record + enum) — function catalog entries carrying category, arity, signature, and description.FunctionRegistry.scalarFunctionNames()/valueFunctionNames()/aggregateFunctionNames()— unmodifiable views over registered names, used by tooling consumers and drift tests.- Drift tests guard the hand-maintained
KEYWORDS,FUNCTIONS, andTOKEN_KIND_BY_TYPEtables — adding a lexer rule or function without updating the catalog now fails CI.
Added — Array Predicates
- Five new condition operators for querying values inside JSON array fields, modelled on PostgreSQL's native-array operators.
tags CONTAINS 'admin'— keyword operator for scalar membership.tags @> ARRAY['admin','editor']— contains-all.tags <@ ARRAY['admin','editor','viewer']— contained-by.tags && ARRAY['blocked','flagged']— overlap.tags = ARRAY['admin','editor']/tags != ARRAY[…]— structural equality (order-sensitive, length-sensitive).
ARRAY[expr, expr, …]array-literal syntax inrhsValue— emptyARRAY[]allowed.- Parameter binding follows the JDBC / Hibernate / jOOQ pattern:
tags @> :myList(or?) binds a wholeCollectionto one slot;ARRAY[?, ?]is element-by-element with one scalar bind per slot. No collection-expansion insideARRAY[…]. Bind-time validation raisesSQL4JsonBindExceptionfor type mismatches (collection inARRAY[?]slot, scalar in bare-array-RHS, collection inCONTAINS). <,>,<=,>=against anARRAY[…]literal raiseSQL4JsonParseExceptionat parse time with a clear message.- Field state on the LHS: missing, JSON null, scalar, or object → all five operators return
falsefor that row (no exception). JOINaliases work — array predicates resolve through the same alias-aware path as the rest of the engine; flat-key reassembly fallback handles post-JOIN merged rows.- Catalog:
SQL4JsonGrammar.keywords()now includesARRAYandCONTAINS;tokenize(...)surfaces@>,<@,&&asTokenKind.OPERATORand[,]asTokenKind.PUNCTUATION.
Added — Command-line Interface
- Command-line entrypoint shipping as a separate shaded jar with classifier
cli(sql4json-1.2.0-cli.jar). Flags:-q/--query(literal or@path),-f/--file,-o/--output,--data name=path(repeatable, multi-source JOIN),-p/--param name=<json>(repeatable, named-parameter bind),--pretty,-h/--help,-v/--version. Stable exit codes:0success /--help/--version,1runtime failure (SQL4Json error, IO error),2usage error.SQL4JSON_DEBUG=1attaches a full stack trace to failure messages on stderr. - Library jar (
sql4json-1.2.0.jar) unchanged — pure library, noMain-Class. Theio.github.mnesimiyilmaz.sql4json.clipackage is intentionally non-exported frommodule-info.java; the flag set and exit codes are the stable surface, not the implementation classes. JsonSerializer.prettySerialize(JsonValue)— public sibling ofserialize(JsonValue)for two-space pretty-printing. Empty objects and arrays stay compact; output has no trailing newline. Drives the CLI--prettyflag.
Added — Performance Profiling
- New
docs/performance.mdreference document with the full sweep across seven sizes (8 MB → 512 MB) and ~50 scenarios, the reference environment, dataset details, and the regen recipe.README.mdgains a concise headline table that links to the full doc. ProfilingTestnow runs each scenarioNtimes (default3, configurable via-Dprofiling.runs=N) and reports the median wall-clock time per(label, size)cell. Report header now records total RAM, initial heap, profiling-runs count, OS arch, and the data seed read fromsrc/test/resources/data-files/SEED. Data files are byte-reproducible viagenerate_json.py --seed.
Changed
NOW()is no longer a dedicated lexer literal; theVALUE_FUNCTIONrule was retired andNOW()lexes as a regular function call, dispatched at parse time toExpression.NowRef(lazy / per-row) or an eagerSqlDateTime.containsNonDeterministicstill fires in every path, so cache-bypass behaviour is preserved.IN/BETWEENnon-literal operands generalised: any non-literal element/bound (not justParameterRef) now flows throughConditionContext.valueExpressions/lowerBoundExpr/upperBoundExprand is evaluated per-row.ParameterSubstitutorsnapshotsNowRefto a literalSqlDateTimeat substitute time, so all rows in a parameterized execution see the same timestamp (JDBC-style "bind once, execute").- String functions auto-coerce non-string inputs via
rawValue().toString()(matchingCONCAT):LOWER,UPPER,SUBSTRING,TRIM,LENGTH,LEFT,RIGHT,LPAD,RPAD,REVERSE,REPLACE,POSITION. String-typed argument positions coerce too; numeric positions (e.g.SUBSTRINGstart/length) are unchanged. Null input still short-circuits toNULL. TO_DATEconsolidates non-string inputs through the same coerce-then-parse path;SqlDate/SqlDateTimepass through unchanged.- Whitespace lexer channel: the
ESCrule emits tochannel(HIDDEN)instead of-> skip, sotokenize()can surface whitespace runs asTokenKind.WHITESPACE. Query parsing is unaffected (parser still filters byDEFAULT_CHANNEL). - Sealed
JsonNumberValueandSqlNumbertypes: split intoJsonLongValue/JsonDoubleValue/JsonDecimalValueandSqlLong/SqlDouble/SqlDecimal. Primitives stored unboxed in the long/double variants — per-instance footprint roughly halves on row-materializing workloads.JsonNumberValueandSqlNumberbecome sealed interfaces; pattern-destructure call sites switch over the typed variants. FlatRowmaterialization: GROUP BY, HAVING, WINDOW, ORDER BY, JOIN, DISTINCT, SELECT and the engine pre-flatten path now emitObject[]-backed rows keyed by ordinal via a sharedRowSchema. LazyRowis unchanged for streaming WHERE / lazy SELECT. Anullslot decodes asSqlNull.INSTANCEon read.RowAccessorsealed interface bridges lazyRowandFlatRowacrossExpressionEvaluator, condition handlers (InConditionHandler,BetweenConditionHandler,LikeConditionHandler,NotLikeConditionHandler,ComparisonConditionHandler,NullCheckConditionHandler,ArrayPredicateConditionHandler),ArrayPathNavigator,CriteriaNode, the pipeline (Stream<RowAccessor>),JsonUnflattener, andGroupAggregator.WindowStagewrites window results into per-rowObject[]buffers indexed byRowSchema.windowSlot. The legacyRow.windowResults/windowResultsByAliasmaps are gone; alias mirroring usesRowSchema.withWindowSlots(calls, aliases)so the SELECT alias becomes the canonical column key for the slot. CASE-buried windows resolve through the same schema-slot lookup viaRow.getWindowResult.SqlValueComparatoradds typed pattern-matched fast paths for(SqlLong, SqlLong)/(SqlLong, SqlDouble)/(SqlDouble, SqlLong)/(SqlDouble, SqlDouble)— avoids theNumber.doubleValue()boxing on the WHERE / ORDER BY hot path.SqlDecimalinvolvement still routes through the genericdoubleValue()compare.
Fixed
- Window-only functions without
OVER(ROW_NUMBER,RANK,DENSE_RANK,NTILE,LAG,LEAD) now raiseSQL4JsonParseExceptionat parse time with a clear message, instead of the misleading runtime "Scalar function 'X' requires at least one argument". Aggregate functions remain valid withoutOVER. - Whole-number literals serialize as integers:
SELECT 42 AS x FROM $rnow produces"x":42instead of"x":42.0, matching column-from-JSON values.