Releases: mnesimiyilmaz/sql4json
Releases · mnesimiyilmaz/sql4json
v1.3.0
Changed
- BREAKING:
redactErrorDetailsdefault changed fromtruetofalse. Error messages now expose underlying
details by default. To preserve the previous behavior, setredactErrorDetails=trueexplicitly when configuring the
engine. - Codebase reformatted with Palantir Java Format via Spotless. No functional changes; subsequent contributions are
enforced byspotless:checkin thevalidatephase.
Build
- Reproducible builds enabled via
project.build.outputTimestampand pinned manifest entries (Built-By,Build-Jdk).
The CI release pipeline overrides the timestamp with the tag's commit timestamp. - Switched Maven Central publishing to
central-publishing-maven-pluginwithautoPublish=trueand
waitUntil=published. - GPG signing now CI-friendly:
--pinentry-mode loopbackandbestPractices=true. - Enforcer rules expanded:
dependencyConvergence,requireUpperBoundDeps,banDuplicatePomDependencyVersions,
reactorModuleConvergence. maven-shade-pluginfilters and transformers tightened to remove duplicateMETA-INF/MANIFEST.MFwarnings and
preserve license/notice files from shaded dependencies.maven-jar-pluginconfigured withaddDefaultEntries=falseto avoid non-deterministic manifest entries.
CI
- New: matrix build across Ubuntu and Windows, JDK 21 and 25 (
fail-fast: false). - New: concurrency control cancels in-progress runs on PR updates.
- New: Surefire reports uploaded as artifacts on failure (
retention-days: 14). - New: tag-driven release workflow validates
pom.xmlversion against the pushed tag and captures the commit timestamp
for reproducible deploys. - New: branch protection on
main; tag protection onv*.
Security
- OWASP Dependency-Check wired into the
releaseprofile (verifyphase), failing the release on CVSS ≥ 7. Reports
uploaded as workflow artifacts.
v1.2.0
Added — Grammar Introspection API
A public IDE-tooling surface in the new JPMS-exported io.github.mnesimiyilmaz.sql4json.grammar package — for syntax highlighters, completion popups, and lightweight static analysers that need to reason about SQL4Json text without depending on ANTLR.
SQL4JsonGrammarwithkeywords(),functions(), andtokenize(String)static views; the tokenizer never throws — unrecognised spans surface asBAD_TOKEN.Token/TokenKind(record + enum) — tokenizer output with absolute, exclusive-end offsets.FunctionInfo/Category(record + enum) — function catalog entries carrying category, arity, signature, and description.FunctionRegistry.scalarFunctionNames()/valueFunctionNames()/aggregateFunctionNames()— unmodifiable views over registered names, used by tooling consumers and drift tests.- Drift tests guard the hand-maintained
KEYWORDS,FUNCTIONS, andTOKEN_KIND_BY_TYPEtables — adding a lexer rule or function without updating the catalog now fails CI.
Added — Array Predicates
- Five new condition operators for querying values inside JSON array fields, modelled on PostgreSQL's native-array operators.
tags CONTAINS 'admin'— keyword operator for scalar membership.tags @> ARRAY['admin','editor']— contains-all.tags <@ ARRAY['admin','editor','viewer']— contained-by.tags && ARRAY['blocked','flagged']— overlap.tags = ARRAY['admin','editor']/tags != ARRAY[…]— structural equality (order-sensitive, length-sensitive).
ARRAY[expr, expr, …]array-literal syntax inrhsValue— emptyARRAY[]allowed.- Parameter binding follows the JDBC / Hibernate / jOOQ pattern:
tags @> :myList(or?) binds a wholeCollectionto one slot;ARRAY[?, ?]is element-by-element with one scalar bind per slot. No collection-expansion insideARRAY[…]. Bind-time validation raisesSQL4JsonBindExceptionfor type mismatches (collection inARRAY[?]slot, scalar in bare-array-RHS, collection inCONTAINS). <,>,<=,>=against anARRAY[…]literal raiseSQL4JsonParseExceptionat parse time with a clear message.- Field state on the LHS: missing, JSON null, scalar, or object → all five operators return
falsefor that row (no exception). JOINaliases work — array predicates resolve through the same alias-aware path as the rest of the engine; flat-key reassembly fallback handles post-JOIN merged rows.- Catalog:
SQL4JsonGrammar.keywords()now includesARRAYandCONTAINS;tokenize(...)surfaces@>,<@,&&asTokenKind.OPERATORand[,]asTokenKind.PUNCTUATION.
Added — Command-line Interface
- Command-line entrypoint shipping as a separate shaded jar with classifier
cli(sql4json-1.2.0-cli.jar). Flags:-q/--query(literal or@path),-f/--file,-o/--output,--data name=path(repeatable, multi-source JOIN),-p/--param name=<json>(repeatable, named-parameter bind),--pretty,-h/--help,-v/--version. Stable exit codes:0success /--help/--version,1runtime failure (SQL4Json error, IO error),2usage error.SQL4JSON_DEBUG=1attaches a full stack trace to failure messages on stderr. - Library jar (
sql4json-1.2.0.jar) unchanged — pure library, noMain-Class. Theio.github.mnesimiyilmaz.sql4json.clipackage is intentionally non-exported frommodule-info.java; the flag set and exit codes are the stable surface, not the implementation classes. JsonSerializer.prettySerialize(JsonValue)— public sibling ofserialize(JsonValue)for two-space pretty-printing. Empty objects and arrays stay compact; output has no trailing newline. Drives the CLI--prettyflag.
Added — Performance Profiling
- New
docs/performance.mdreference document with the full sweep across seven sizes (8 MB → 512 MB) and ~50 scenarios, the reference environment, dataset details, and the regen recipe.README.mdgains a concise headline table that links to the full doc. ProfilingTestnow runs each scenarioNtimes (default3, configurable via-Dprofiling.runs=N) and reports the median wall-clock time per(label, size)cell. Report header now records total RAM, initial heap, profiling-runs count, OS arch, and the data seed read fromsrc/test/resources/data-files/SEED. Data files are byte-reproducible viagenerate_json.py --seed.
Changed
NOW()is no longer a dedicated lexer literal; theVALUE_FUNCTIONrule was retired andNOW()lexes as a regular function call, dispatched at parse time toExpression.NowRef(lazy / per-row) or an eagerSqlDateTime.containsNonDeterministicstill fires in every path, so cache-bypass behaviour is preserved.IN/BETWEENnon-literal operands generalised: any non-literal element/bound (not justParameterRef) now flows throughConditionContext.valueExpressions/lowerBoundExpr/upperBoundExprand is evaluated per-row.ParameterSubstitutorsnapshotsNowRefto a literalSqlDateTimeat substitute time, so all rows in a parameterized execution see the same timestamp (JDBC-style "bind once, execute").- String functions auto-coerce non-string inputs via
rawValue().toString()(matchingCONCAT):LOWER,UPPER,SUBSTRING,TRIM,LENGTH,LEFT,RIGHT,LPAD,RPAD,REVERSE,REPLACE,POSITION. String-typed argument positions coerce too; numeric positions (e.g.SUBSTRINGstart/length) are unchanged. Null input still short-circuits toNULL. TO_DATEconsolidates non-string inputs through the same coerce-then-parse path;SqlDate/SqlDateTimepass through unchanged.- Whitespace lexer channel: the
ESCrule emits tochannel(HIDDEN)instead of-> skip, sotokenize()can surface whitespace runs asTokenKind.WHITESPACE. Query parsing is unaffected (parser still filters byDEFAULT_CHANNEL). - Sealed
JsonNumberValueandSqlNumbertypes: split intoJsonLongValue/JsonDoubleValue/JsonDecimalValueandSqlLong/SqlDouble/SqlDecimal. Primitives stored unboxed in the long/double variants — per-instance footprint roughly halves on row-materializing workloads.JsonNumberValueandSqlNumberbecome sealed interfaces; pattern-destructure call sites switch over the typed variants. FlatRowmaterialization: GROUP BY, HAVING, WINDOW, ORDER BY, JOIN, DISTINCT, SELECT and the engine pre-flatten path now emitObject[]-backed rows keyed by ordinal via a sharedRowSchema. LazyRowis unchanged for streaming WHERE / lazy SELECT. Anullslot decodes asSqlNull.INSTANCEon read.RowAccessorsealed interface bridges lazyRowandFlatRowacrossExpressionEvaluator, condition handlers (InConditionHandler,BetweenConditionHandler,LikeConditionHandler,NotLikeConditionHandler,ComparisonConditionHandler,NullCheckConditionHandler,ArrayPredicateConditionHandler),ArrayPathNavigator,CriteriaNode, the pipeline (Stream<RowAccessor>),JsonUnflattener, andGroupAggregator.WindowStagewrites window results into per-rowObject[]buffers indexed byRowSchema.windowSlot. The legacyRow.windowResults/windowResultsByAliasmaps are gone; alias mirroring usesRowSchema.withWindowSlots(calls, aliases)so the SELECT alias becomes the canonical column key for the slot. CASE-buried windows resolve through the same schema-slot lookup viaRow.getWindowResult.SqlValueComparatoradds typed pattern-matched fast paths for(SqlLong, SqlLong)/(SqlLong, SqlDouble)/(SqlDouble, SqlLong)/(SqlDouble, SqlDouble)— avoids theNumber.doubleValue()boxing on the WHERE / ORDER BY hot path.SqlDecimalinvolvement still routes through the genericdoubleValue()compare.
Fixed
- Window-only functions without
OVER(ROW_NUMBER,RANK,DENSE_RANK,NTILE,LAG,LEAD) now raiseSQL4JsonParseExceptionat parse time with a clear message, instead of the misleading runtime "Scalar function 'X' requires at least one argument". Aggregate functions remain valid withoutOVER. - Whole-number literals serialize as integers:
SELECT 42 AS x FROM $rnow produces"x":42instead of"x":42.0, matching column-from-JSON values.
v1.1.0
Added — Object Mapping
SQL4Json.queryAsandqueryAsList— map query results directly to Java records, POJOs, or basic types (both single-source and JOIN variants).PreparedQuery.executeAsandexecuteAsList(StringandJsonValueoverloads).SQL4JsonEngine.queryAsandqueryAsList.JsonValue.as(Class)andJsonValue.as(Class, Sql4jsonSettings)default methods on the sealed interface.MappingSettingssubsection ofSql4jsonSettingswithMissingFieldPolicyenum (IGNORE/FAIL).SQL4JsonMappingException(new sealed subclass ofSQL4JsonException).
Added — Parameter Binding
PreparedQuery.execute(String json, BoundParameters params)andexecute(JsonValue, BoundParameters).PreparedQuery.execute(String json, Object... positionalParams)shortcut.PreparedQuery.execute(String json, Map<String, ?> namedParams)shortcut.PreparedQuery.executeAs/executeAsListwithBoundParametersoverloads.SQL4JsonEngine.query/queryAsJsonValue/queryAs/queryAsListwithBoundParametersoverloads.BoundParametersimmutable carrier (named & positional modes,named()/positional()/of(Object...)/of(Map)factories,bind/bindAll,EMPTYsingleton).- Grammar support for
?(positional) and:name(named) placeholders throughout WHERE / SELECT / GROUP BY / HAVING / JOIN / function arguments. - Dynamic
LIMIT/OFFSETbinding (placeholders accepted in both positions). - IN-list expansion:
IN (?)+ collection bound → expanded to N literals; empty collection → zero-row predicate. LimitsSettings.maxParameters(default1024) — DoS guard against placeholder flooding.SQL4JsonBindException(new sealed subclass ofSQL4JsonException) — surfaced at substitute time for missing / extra / type-mismatch bindings, IN-list overflow, and LIMIT/OFFSET validation failures.
Changed
Sql4jsonSettingsgains amappingcomponent (all existing construction paths preserved via builder).- Grammar now recognizes
?and:nameas placeholders in value positions. LimitsSettingscanonical-constructor signature extended (maxParameters) — public callers using.builder()unaffected.SQL4JsonEnginebypassesQueryResultCachefor parameterized queries.- Internal: ISO date/datetime/instant parsing consolidated into
json.IsoTemporals;registry.DateCoerciondelegates.
v1.0.0
Core Query Engine
- SQL SELECT parsing via ANTLR4 grammar (case-insensitive keywords)
- Pipeline-based query execution: WHERE → GROUP BY → HAVING → WINDOW → ORDER BY → LIMIT → SELECT → DISTINCT
- Lazy streaming evaluation for WHERE and SELECT stages; materializing stages for GROUP BY, WINDOW, and ORDER BY
- Hash join execution for multi-source JOIN queries
- Nested JSON flattening with path-based field keys and query-scoped string interning
- Streaming JSON parser for efficient processing of large root arrays
- Zero external runtime dependencies beyond the ANTLR4 runtime
SQL Syntax
- SELECT with
*, specific columns, aliases (AS), and dot-notation aliases for structured nested output - FROM with root reference (
$r), nested path drilling ($r.response.data), table names for JOINs, and subqueries - JOIN / INNER JOIN, LEFT JOIN, RIGHT JOIN with equality
ONconditions (single or multi-column viaAND); chained JOINs supported - WHERE with comparison (
=,!=,>,<,>=,<=), pattern matching (LIKE,NOT LIKE), range (BETWEEN,NOT BETWEEN), set membership (IN,NOT IN), and null checks (IS NULL,IS NOT NULL) - GROUP BY with multiple columns and
HAVINGfilter on aggregated aliases - ORDER BY with
ASC/DESCon multiple columns and expressions - LIMIT and OFFSET for pagination
- DISTINCT for duplicate row elimination
- Subqueries in FROM clause (nesting depth bounded by a configurable limit)
- Logical connectives
AND/ORwith parenthesized grouping - Arbitrary nesting of function calls in SELECT, WHERE, GROUP BY, ORDER BY, and HAVING
- CASE expressions: Simple (
CASE expr WHEN val THEN result END) and searched (CASE WHEN condition THEN result END) forms, usable in SELECT, WHERE, ORDER BY, GROUP BY, and HAVING with full nesting support
Scalar & Aggregate Functions
- String (13): LOWER, UPPER (with locale support), CONCAT, SUBSTRING, TRIM, LENGTH, REPLACE, LEFT, RIGHT, LPAD, RPAD, REVERSE, POSITION
- Math (8): ABS, ROUND, CEIL, FLOOR, MOD, POWER, SQRT, SIGN
- Date/Time (10): TO_DATE (with optional format), NOW, YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, DATE_ADD, DATE_DIFF
- Conversion (3): CAST (7 target types: STRING, NUMBER, INTEGER, DECIMAL, BOOLEAN, DATE, DATETIME), NULLIF, COALESCE
- Aggregate (5): COUNT (including
COUNT(*)), SUM, AVG, MIN, MAX
Window Functions
- Ranking: ROW_NUMBER, RANK, DENSE_RANK, NTILE
- Offset: LAG, LEAD (with optional offset argument)
- Aggregate windows: SUM, AVG, COUNT, MIN, MAX usable with
OVER - OVER clause with
PARTITION BYand/orORDER BY(full-partition scope — window frames not yet supported)
Public API
SQL4Json— static facade:query(),queryAsJsonValue(),prepare(),engine()PreparedQuery— parse-once-execute-many, thread-safe and immutableSQL4JsonEngine— long-lived engine with bound data and optional result cache; thread-safeSQL4JsonEngineBuilder— fluent builder supporting both unnamed ($r) and named (JOIN) data sourcesQueryResultCache— SPI for custom cache implementations (default LRU provided); non-deterministic queries (e.g.NOW()) automatically bypass cachingJsonCodec— SPI for plugging in external JSON libraries (Jackson, Gson, etc.)JsonValue— sealed interface providing a library-native, dependency-free JSON abstraction
Configuration & Security Defaults
Sql4jsonSettings— immutable top-level settings record composed of four subsection records; customize any subsection viaSql4jsonSettings.builder()SecuritySettings:maxLikeWildcards(soft-ReDoS guard onLIKEpatterns),redactErrorDetails(multi-tenant information-disclosure guard)LimitsSettings:maxSqlLength,maxSubqueryDepth(default 16),maxInListSize,maxRowsPerQuery(enforced at GROUP BY, ORDER BY, WINDOW, JOIN, DISTINCT, and the final pipeline/streaming sink)CacheSettings: bounded LIKE-pattern cache (likePatternCacheSize), optional LRU query-result cache (queryResultCacheEnabled,queryResultCacheSize), plus a pluggableQueryResultCacheSPIDefaultJsonCodecSettings:maxInputLength,maxNestingDepth,maxStringLength,maxNumberLength,maxPropertyNameLength,maxArrayElements,duplicateKeyPolicyDuplicateKeyPolicyenum —REJECT(default),FIRST_WINS,LAST_WINS— resolves RFC 8259 ambiguity on duplicate object keys- Conservative defaults out of the box — protect against denial-of-service and information-disclosure risks in multi-tenant deployments without requiring any configuration
Type System
JsonValuesealed interface: object, array, string, number, boolean, nullSqlValuesealed interface for typed query processing: string, number, boolean, date, datetime, null- Sealed exception hierarchy:
SQL4JsonException→SQL4JsonParseException(with line/column),SQL4JsonExecutionException
Build & Infrastructure
- Java 21+ required (JPMS module with explicit exports)
- Published to Maven Central (
io.github.mnesimiyilmaz:sql4json) - GitHub Actions CI (formatting check + build + test)
- Spotless code formatting, CycloneDX SBOM generation
- OWASP dependency-check plugin
- Dependabot for automated dependency updates