SQL4Json Performance

What changed in 1.2.0: Memory-compaction work — the sealed JsonNumberValue / SqlNumber numeric variants with unboxed primitive storage, plus FlatRow Object[]-backed materialization for GROUP BY / HAVING / WINDOW / ORDER BY / JOIN / DISTINCT / SELECT and the engine pre-flatten path. The numbers below are the 1.2.0 release measurement, captured on 2026-04-30 against the reference machine. Re-run ./mvnw test -Plarge-tests -Dtest=ProfilingTest -Dprofiling.runs=3 on your hardware to reproduce.

How to read these numbers

Median of 3 runs; warmup excluded; wall-clock time captured via JMX. Single reference machine. Numbers will differ on other hardware — re-run locally to compare.

Single-threaded. Each query runs end-to-end on the calling thread — SQL4Json does not parallelize a single query internally. Multi-core scaling, if any, comes from running independent queries concurrently in your application.

Source: ProfilingTest.java. Regen: ./mvnw test -Plarge-tests -Dtest=ProfilingTest.

Reference environment

JVM: OpenJDK 64-Bit Server VM 21.0.1+12-LTS (Amazon.com Inc.)
OS: Windows 11 10.0 (amd64)
Cores: 32
Total RAM: 64,671 MB
Max heap (-Xmx): 8,192 MB
Initial heap (-Xms): 2,640 MB (approx, post-warmup)
Profiling runs: 3
Data seed: 20260428

Dataset

Synthetic. Generated by generate_json.py with --seed 20260428. Each record has up to 13 top-level fields (some randomly dropped) and up to depth-3 nesting (address.geo, metadata.audit, preferences.notifications).

Record counts per size:

Size	Bytes	Records
8 MB	8,389,001	12,239
16 MB	16,778,023	24,431
32 MB	33,554,658	48,831
64 MB	67,109,367	97,598
128 MB	134,218,195	195,334
256 MB	268,435,588	390,657
512 MB	536,871,390	781,553

Headline results

Median wall-clock time per query at four representative sizes. The Rows out column shows how many records each query returns at 512 MB — context for the time numbers.

Query	8 MB	32 MB	128 MB	512 MB	Rows out (at 512 MB)
`SELECT * FROM $r`	68	183	821	3,392	781,553
`SELECT id, is_active, score FROM $r`	55	217	986	3,936	781,553
`WHERE score >= 80 AND score <= 90`	47	183	752	2,993	58,520
`GROUP BY country` (multi-aggregate)	69	270	1,276	5,460	9
`ORDER BY score DESC LIMIT 50`	47	179	728	2,883	50
`users JOIN countries ON country=code`	97	349	1,436	5,445	808,446
`ROW_NUMBER() OVER (PARTITION BY country ORDER BY score DESC)`	79	300	1,377	6,118	781,553
Filter + GROUP BY + HAVING + ORDER BY (kitchen sink)	51	198	815	3,304	9

All times in milliseconds.

Full sweep — wall time (ms, median of 3 runs)

A — cell means the scenario was not run at that size. Only HEADLINE and CURATED-tier scenarios run at 128 / 256 / 512 MB; FULL-only scenarios are 8 / 16 / 32 / 64 MB.

Label	8 MB	16 MB	32 MB	64 MB	128 MB	256 MB	512 MB
p01 SELECT-ALL	68	93	183	391	821	1,680	3,392
p02 SELECT-ALL-LIMIT-100	0	0	0	0	0	0	1
p03 LIMIT-1000-OFFSET-5000	22	24	23	22	—	—	—
p04 PROJECT-TOP-LEVEL	55	109	217	469	986	1,959	3,936
p05 PROJECT-NESTED	66	130	258	562	—	—	—
p06 PROJECT-ALIAS	58	116	229	498	—	—	—
p07 GROUP-BY-CITY-COUNT	66	127	253	591	1,184	2,482	4,962
p08 GROUP-BY-COUNTRY-MULTI-AGG	69	135	270	652	1,276	2,669	5,460
p09 GROUP-BY-THEME-HAVING	69	134	266	602	—	—	—
p10 GROUP-BY-REVIEW-STATUS	68	131	261	610	—	—	—
p11 WHERE-BOOL-EQ	49	97	192	416	—	—	—
p12 WHERE-RANGE-SCORE	47	92	183	373	752	1,509	2,993
p13 WHERE-STRING-EQ	47	94	186	379	—	—	—
p14 WHERE-LIKE	54	106	211	407	821	1,634	3,243
p15 WHERE-DEEP-NESTED	50	99	196	393	—	—	—
p16 WHERE-OR-COMPOUND	48	96	191	386	—	—	—
p17 WHERE-IN	51	100	199	407	—	—	—
p18 WHERE-BETWEEN	47	92	184	371	—	—	—
p19 WHERE-IS-NULL	49	94	188	379	—	—	—
p20 WHERE-NOT-LIKE	58	115	229	471	—	—	—
p21 ORDER-BY-SCORE-ASC	69	135	279	639	1,401	2,855	6,229
p22 ORDER-BY-CITY-DESC	71	136	275	608	—	—	—
p23 ORDER-BY-MULTI	87	168	358	775	—	—	—
p24 ORDER-BY-SCORE-TOP50	47	91	179	367	728	1,494	2,883
p25 DISTINCT-CITY	48	94	187	383	753	1,541	3,011
p26 DISTINCT-COUNTRY-THEME	50	98	195	390	—	—	—
p27 FUNC-UPPER	47	47	46	47	—	—	—
p28 FUNC-CONCAT	47	47	47	47	—	—	—
p29 FUNC-LENGTH-FILTER	52	103	205	432	—	—	—
p30 FUNC-ROUND	45	45	45	45	—	—	—
p31 FUNC-ABS-FILTER	51	102	203	413	—	—	—
p32 FUNC-COALESCE	48	48	47	47	—	—	—
p33 FUNC-NESTED-ROUND-AVG	67	130	266	596	—	—	—
p34 FUNC-SUBSTRING	45	43	45	44	—	—	—
p35 FUNC-TRIM-UPPER	48	47	47	48	—	—	—
p36 COMPLEX-FILTER-GROUP-SORT	51	98	198	403	815	1,672	3,304
p37 COMPLEX-NESTED-GROUP-HAVING	64	126	255	564	—	—	—
p38 COMPLEX-MULTI-COND-PROJ	47	92	186	373	—	—	—
p39 SUBQUERY-SIMPLE	49	95	192	399	—	—	—
p40 SUBQUERY-GROUP-AFTER-FILTER	51	101	207	420	—	—	—
p41 PREPARED-GROUPBY	63	126	255	575	—	—	—
p42 PREPARED-REUSE	6	6	6	6	—	—	—
p43 ENGINE-NO-CACHE	64	127	254	601	—	—	—
p44 ENGINE-CACHE	63	125	251	568	—	—	—
p45 ENGINE-MIXED	45	90	181	364	—	—	—
p46 STRESS-FULL-SORT	64	134	291	636	—	—	—
p47 STRESS-HIGH-CARDINALITY-GROUP	81	151	310	728	—	—	—
p48 STRESS-DISTINCT-ID	52	106	216	478	—	—	—
p49 STRESS-WIDE-PROJECTION	113	218	499	482	—	—	—
p50 STRESS-COMPLEX-SUBQUERY	51	100	204	415	846	1,736	3,485
p51 JOIN-INNER	97	164	349	683	1,436	2,688	5,445
p52 JOIN-LEFT	92	168	354	687	—	—	—
p53 JOIN-INNER-FILTER	79	155	328	658	—	—	—
p54 JOIN-INNER-GROUPBY	80	158	331	660	1,351	2,659	5,489
p55 JOIN-INNER-ORDERBY-LIMIT	79	154	323	646	—	—	—
p56 JOIN-LEFT-ANTI	80	161	342	666	—	—	—
p57 WINDOW-ROW-NUMBER-GLOBAL	63	122	249	579	—	—	—
p58 WINDOW-ROW-NUMBER-PART	79	142	300	644	1,377	2,863	6,118
p59 WINDOW-RANK-PART	71	148	315	674	—	—	—
p60 WINDOW-DENSE-RANK-PART	71	148	315	675	—	—	—
p61 WINDOW-NTILE-4	62	126	276	622	—	—	—
p62 WINDOW-LAG	75	152	314	666	—	—	—
p63 WINDOW-LEAD	74	153	310	667	—	—	—
p64 WINDOW-AVG-OVER-PART	64	131	262	556	1,148	2,393	4,796
p65 FUNC-NESTED-LPAD-TRIM-COALESCE	50	49	49	49	—	—	—
p66 FUNC-YEAR-GROUPBY	112	201	482	925	—	—	—
p67 FUNC-DATE-DIFF-FILTER	71	115	229	487	—	—	—
p68 FUNC-CAST-INT	45	44	44	44	—	—	—

Full sweep — peak heap (MB, single-run measurement)

Label	8 MB	16 MB	32 MB	64 MB	128 MB	256 MB	512 MB
p01 SELECT-ALL	93	178	350	711	1,204	2,428	4,436
p02 SELECT-ALL-LIMIT-100	13	22	38	70	134	262	518
p03 LIMIT-1000-OFFSET-5000	53	58	74	106	—	—	—
p04 PROJECT-TOP-LEVEL	101	210	414	844	1,227	2,906	5,170
p05 PROJECT-NESTED	129	254	490	847	—	—	—
p06 PROJECT-ALIAS	109	218	430	874	—	—	—
p07 GROUP-BY-CITY-COUNT	129	254	498	895	1,511	3,342	5,642
p08 GROUP-BY-COUNTRY-MULTI-AGG	129	254	502	1,006	1,526	3,245	5,643
p09 GROUP-BY-THEME-HAVING	129	254	502	811	—	—	—
p10 GROUP-BY-REVIEW-STATUS	129	254	498	879	—	—	—
p11 WHERE-BOOL-EQ	97	194	382	571	—	—	—
p12 WHERE-RANGE-SCORE	93	186	366	608	693	966	1,962
p13 WHERE-STRING-EQ	97	190	374	716	—	—	—
p14 WHERE-LIKE	93	186	366	683	746	898	1,674
p15 WHERE-DEEP-NESTED	101	194	386	740	—	—	—
p16 WHERE-OR-COMPOUND	97	190	374	708	—	—	—
p17 WHERE-IN	101	198	390	774	—	—	—
p18 WHERE-BETWEEN	97	186	366	735	—	—	—
p19 WHERE-IS-NULL	97	194	378	745	—	—	—
p20 WHERE-NOT-LIKE	101	202	398	861	—	—	—
p21 ORDER-BY-SCORE-ASC	109	218	430	898	1,571	2,745	4,971
p22 ORDER-BY-CITY-DESC	113	222	434	855	—	—	—
p23 ORDER-BY-MULTI	121	242	486	854	—	—	—
p24 ORDER-BY-SCORE-TOP50	93	182	362	496	742	1,359	1,723
p25 DISTINCT-CITY	101	202	394	642	746	910	1,654
p26 DISTINCT-COUNTRY-THEME	105	210	410	682	—	—	—
p27 FUNC-UPPER	93	102	118	150	—	—	—
p28 FUNC-CONCAT	93	102	118	150	—	—	—
p29 FUNC-LENGTH-FILTER	101	198	394	866	—	—	—
p30 FUNC-ROUND	89	98	114	146	—	—	—
p31 FUNC-ABS-FILTER	101	202	398	715	—	—	—
p32 FUNC-COALESCE	93	102	118	150	—	—	—
p33 FUNC-NESTED-ROUND-AVG	129	254	502	905	—	—	—
p34 FUNC-SUBSTRING	89	102	114	146	—	—	—
p35 FUNC-TRIM-UPPER	93	102	118	150	—	—	—
p36 COMPLEX-FILTER-GROUP-SORT	97	194	382	556	872	1,188	2,813
p37 COMPLEX-NESTED-GROUP-HAVING	121	242	474	883	—	—	—
p38 COMPLEX-MULTI-COND-PROJ	93	186	366	565	—	—	—
p39 SUBQUERY-SIMPLE	93	186	370	769	—	—	—
p40 SUBQUERY-GROUP-AFTER-FILTER	101	198	390	745	—	—	—
p41 PREPARED-GROUPBY	129	254	498	1,001	—	—	—
p42 PREPARED-REUSE	21	30	46	78	—	—	—
p43 ENGINE-NO-CACHE	125	250	498	992	—	—	—
p44 ENGINE-CACHE	129	254	498	812	—	—	—
p45 ENGINE-MIXED	93	186	362	504	—	—	—
p46 STRESS-FULL-SORT	109	218	430	882	—	—	—
p47 STRESS-HIGH-CARDINALITY-GROUP	157	306	606	1,065	—	—	—
p48 STRESS-DISTINCT-ID	105	210	418	838	—	—	—
p49 STRESS-WIDE-PROJECTION	297	594	796	720	—	—	—
p50 STRESS-COMPLEX-SUBQUERY	101	198	390	678	916	1,449	3,457
p51 JOIN-INNER	121	238	474	891	1,568	2,872	5,120
p52 JOIN-LEFT	121	242	482	857	—	—	—
p53 JOIN-INNER-FILTER	117	226	450	853	—	—	—
p54 JOIN-INNER-GROUPBY	117	230	454	861	1,343	2,823	5,217
p55 JOIN-INNER-ORDERBY-LIMIT	117	226	446	856	—	—	—
p56 JOIN-LEFT-ANTI	117	230	454	858	—	—	—
p57 WINDOW-ROW-NUMBER-GLOBAL	101	202	398	815	—	—	—
p58 WINDOW-ROW-NUMBER-PART	113	226	450	858	1,390	3,008	5,214
p59 WINDOW-RANK-PART	113	226	454	863	—	—	—
p60 WINDOW-DENSE-RANK-PART	113	230	454	867	—	—	—
p61 WINDOW-NTILE-4	105	210	418	866	—	—	—
p62 WINDOW-LAG	117	230	454	860	—	—	—
p63 WINDOW-LEAD	117	230	454	868	—	—	—
p64 WINDOW-AVG-OVER-PART	113	218	430	868	1,275	2,979	4,859
p65 FUNC-NESTED-LPAD-TRIM-COALESCE	97	106	122	150	—	—	—
p66 FUNC-YEAR-GROUPBY	181	346	934	1,087	—	—	—
p67 FUNC-DATE-DIFF-FILTER	118	230	458	667	—	—	—
p68 FUNC-CAST-INT	94	98	114	146	—	—	—

Reproducing these numbers

Generate the data files (one-time, ~minutes per size):

cd src/test/resources/data-files
for s in 8 16 32 64 128 256 512; do
  python generate_json.py -s "$s" --seed 20260428 -o "data_${s}mb.json"
done

Run the bench (~10 minutes on the reference machine):
```
./mvnw test -Plarge-tests -Dtest=ProfilingTest
```
Open the fresh report at target/profiling/profiling-report-<timestamp>.md and compare its tables to the ones above. Wall-time medians should match within ±5 % on similar hardware.

Why these numbers will differ on your machine: absolute milliseconds depend on CPU clock speed, memory bandwidth, GC tuning, and concurrent load. Use the published numbers as a shape (which queries are fast, which scale linearly with input, which super-linearly) rather than as a hard target.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQL4Json Performance

How to read these numbers

Reference environment

Dataset

Headline results

Full sweep — wall time (ms, median of 3 runs)

Full sweep — peak heap (MB, single-run measurement)

Reproducing these numbers

FilesExpand file tree

performance.md

Latest commit

History

performance.md

File metadata and controls

SQL4Json Performance

How to read these numbers

Reference environment

Dataset

Headline results

Full sweep — wall time (ms, median of 3 runs)

Full sweep — peak heap (MB, single-run measurement)

Reproducing these numbers