feat: Implement `date_part` scalar function by devanbenz · Pull Request #27005 · influxdata/influxdb

devanbenz · 2025-12-03T20:34:20Z

Implements date_part(part, expression), which extracts a component from a timestamp.

Signature:

Exactly 2 args, in order: date_part('<part>', time)
- arg 1: string literal naming the part (case-insensitive)
- arg 2: the time VarRef, nothing else
Returns int64, evaluated in the query timezone (tz(...)), default UTC
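
As a rough illustration of why the query timezone matters, the sketch below reads the hour and day of month of one instant in UTC and in a fixed UTC-5 zone. The helpers `unixUTC`, `fixedOffset`, and `hourAndDay` are hypothetical names for this example, not functions from this PR, and the code uses only Go's standard `time` package.

```go
package main

import (
	"fmt"
	"time"
)

// unixUTC is a hypothetical helper that returns the Unix time (seconds)
// of a UTC wall-clock instant.
func unixUTC(year, month, day, hour int) int64 {
	return time.Date(year, time.Month(month), day, hour, 0, 0, 0, time.UTC).Unix()
}

// fixedOffset is a hypothetical helper returning a zone at a fixed offset
// from UTC, in hours. A real query would use the location named in tz(...).
func fixedOffset(hours int) *time.Location {
	return time.FixedZone(fmt.Sprintf("UTC%+d", hours), hours*3600)
}

// hourAndDay returns the hour and day of month of a Unix timestamp as seen
// in loc. A nil loc falls back to UTC, mirroring the documented default.
func hourAndDay(unixSec int64, loc *time.Location) (hour, day int) {
	if loc == nil {
		loc = time.UTC
	}
	t := time.Unix(unixSec, 0).In(loc)
	return t.Hour(), t.Day()
}

func main() {
	ts := unixUTC(2024, 1, 1, 2) // 2024-01-01T02:00:00Z

	h, d := hourAndDay(ts, nil)
	fmt.Println("UTC:  ", h, d)

	h, d = hourAndDay(ts, fixedOffset(-5))
	fmt.Println("UTC-5:", h, d)
}
```

The same instant lands on a different hour and a different calendar day depending on the zone, which is why the part is resolved in the query timezone rather than always in UTC.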

Parts:

| part | value |
| --- | --- |
| `year` | calendar year |
| `quarter` | quarter of year, `[1, 4]` |
| `month` | month of year, `[1, 12]` |
| `week` | ISO-8601 week of year, `[1, 53]` |
| `day` | day of month, `[1, 31]` |
| `hour` / `minute` / `second` | `[0, 23]` / `[0, 59]` / `[0, 59]` |
| `millisecond` / `microsecond` / `nanosecond` | sub-second component only |
| `dow` | day of week, Sunday = 0 to Saturday = 6 |
| `isodow` | day of week, Monday = 0 to Sunday = 6 |
| `doy` | day of year, `[1, 366]` |
| `epoch` | seconds since Unix epoch (whole seconds) |
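
A few of the parts above can be sketched with Go's standard `time` package. This is a minimal illustration of the documented ranges, not the code in `query/date_part.go`; the helper names (`utcTime`, `quarter`, `dow`, `isodow`, `millisecond`) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// utcTime builds a UTC timestamp from plain integers (example helper).
func utcTime(year, month, day, hour, minute, sec, nsec int) time.Time {
	return time.Date(year, time.Month(month), day, hour, minute, sec, nsec, time.UTC)
}

// quarter returns the quarter of the year in [1, 4].
func quarter(t time.Time) int64 {
	return int64((int(t.Month())-1)/3 + 1)
}

// dow returns the day of week with Sunday = 0 through Saturday = 6.
func dow(t time.Time) int64 {
	return int64(t.Weekday())
}

// isodow returns the day of week with Monday = 0 through Sunday = 6,
// following the table above rather than the SQL 1-7 convention.
func isodow(t time.Time) int64 {
	return int64((int(t.Weekday()) + 6) % 7)
}

// millisecond returns only the sub-second component, in milliseconds.
func millisecond(t time.Time) int64 {
	return int64(t.Nanosecond() / 1_000_000)
}

func main() {
	t := utcTime(2023, 1, 1, 13, 45, 30, 250_000_000) // a Sunday
	fmt.Println("quarter:", quarter(t))
	fmt.Println("dow:", dow(t))
	fmt.Println("isodow:", isodow(t))
	fmt.Println("doy:", t.YearDay())
	fmt.Println("millisecond:", millisecond(t))
	fmt.Println("epoch:", t.Unix())
}
```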

`week` is the ISO week and `year` is the calendar year, so the two can disagree at
year boundaries. For example, 2023-01-01 returns year 2023 but week 52, because that Sunday belongs to ISO week 52 of 2022.
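
The disagreement can be reproduced with `time.Time.ISOWeek` from Go's standard library, which returns the ISO year together with the ISO week. The helper `calendarAndISO` is a hypothetical name used only for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// calendarAndISO returns the calendar year alongside the ISO-8601 year and
// week for a UTC date, to show where the two numbering schemes diverge.
func calendarAndISO(year, month, day int) (calYear, isoYear, isoWeek int) {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	isoYear, isoWeek = t.ISOWeek()
	return t.Year(), isoYear, isoWeek
}

func main() {
	cal, isoY, isoW := calendarAndISO(2023, 1, 1)
	fmt.Println(cal, isoY, isoW) // prints "2023 2022 52"
}
```

Pairing `year` with `week` in a single query can therefore label the first days of January with the previous year's last week number.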

Examples:

-- weekdays only
SELECT * FROM some_measurement
WHERE time >= now() - 10d AND time <= now()
  AND date_part('dow', time) != 0 AND date_part('dow', time) != 6

SELECT value, date_part('hour', time) FROM some_measurement

SELECT rules

Must be paired with an anchor, meaning a stored field or a non-date_part
aggregate or selector. date_part-only selects are rejected.
Multiple date_part fields and aliases are allowed, and may nest in expressions
such as date_part('hour', time) + 1.

GROUP BY date_part rules

Allowed alongside time(). Other calls in GROUP BY are rejected.
Duplicate parts are deduplicated.
A SELECTed date_part('part', time) must match a grouped part. A non-grouped
part is rejected because it is undefined for the bucket. A non-active grouped
part yields null in that series.
Output column is named after the canonical part such as year. A field or alias
colliding with it is rejected.
Resolved from the bucket value, not the row timestamp.
fill(null) and fill(none) are supported. fill(previous), fill(linear),
and fill(<value>) are not.

See #27001 for 1.x limitations.

gwossum · 2025-12-09T22:19:31Z

+
+const (
+	DatePartString     = "date_part"
+	DatePartTimeString = "date_part_time"


What is date_part_time? I don't see any tests for it.

It's used to create a reference to time, since time is an auxiliary field: https://github.com/influxdata/influxdb/pull/27005/files#diff-609a7e16be956ed6386e1a4a4efadf600b7d4de7dcfea27330dc692d1e901dc8R930-R944. I'm going to create some ValueMapper tests for this.

@gwossum I can add tests for this but it would likely require exporting

influxdb/query/select.go

Line 881 in 362217b

type valueMapper struct {

and testing it. We don't currently have any valueMapper-specific tests. It's basically just a struct filled with maps, so we would mostly be testing Go's map functionality, which may not be worth the effort.

davidby-influx

Some changes from the first pass. Will review again after changes.

Copilot

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.

Files not reviewed (1)

query/internal/internal.pb.go: Generated file

Copilot

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated no new comments.

Files not reviewed (1)

query/internal/internal.pb.go: Generated file

devanbenz · 2026-06-24T19:17:52Z

@copilot resolve the merge conflicts in this pull request

Copilot · 2026-06-24T19:25:18Z

> @copilot resolve the merge conflicts in this pull request

Done — I merged origin/master-1.x into this branch and resolved the conflict. Addressed in commit b786751.

Copilot

Pull request overview

Copilot reviewed 23 out of 24 changed files in this pull request and generated no new comments.

Files not reviewed (1)

query/internal/internal.pb.go: Generated file

Comments suppressed due to low confidence (1)

query/iterator.gen.go.tmpl:568

In ScanAt, when an aux value falls into the default branch and defaultValue == SkipDefault (e.g. fill(none)), the map entry for that key is left untouched. With the new date_part GROUP BY behavior, non-active dimension aux slots are explicitly set to nil, so this path will leave stale values from a previous row in m[k.Val] and can incorrectly populate non-active date_part dimension columns. Clearing the key when no fill default is configured avoids that leakage.

		default:
			// Insert the fill value if one was specified.
			if s.defaultValue != SkipDefault {
				m[k.Val] = castToType(s.defaultValue, k.Type)
			}
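
The leak described in this comment can be modeled with a reused map. The sketch below is a simplified stand-in for `ScanAt`, not the generated iterator code: `scanAux`, its `skipDefault` flag (modeling `defaultValue == SkipDefault`), and the `clearOnSkip` toggle are all assumptions for illustration.

```go
package main

import "fmt"

// scanAux copies one row's aux values into the reused output map m.
// A nil aux value takes the default branch, as a non-active date_part
// dimension slot would. clearOnSkip toggles the suggested fix of clearing
// the key when no fill default is configured.
func scanAux(m map[string]interface{}, keys []string, aux []interface{}, fill interface{}, skipDefault, clearOnSkip bool) {
	for i, k := range keys {
		switch v := aux[i].(type) {
		case int64:
			m[k] = v
		default:
			if !skipDefault {
				m[k] = fill
			} else if clearOnSkip {
				delete(m, k)
			}
		}
	}
}

// yearAfterTwoRows scans a row with year=2020, then a row whose year slot
// is nil, reusing the same map, and reports what the map holds afterwards.
func yearAfterTwoRows(clearOnSkip bool) (interface{}, bool) {
	m := map[string]interface{}{}
	keys := []string{"year"}
	scanAux(m, keys, []interface{}{int64(2020)}, nil, true, clearOnSkip)
	scanAux(m, keys, []interface{}{nil}, nil, true, clearOnSkip)
	v, ok := m["year"]
	return v, ok
}

func main() {
	fmt.Println(yearAfterTwoRows(false)) // stale value survives
	fmt.Println(yearAfterTwoRows(true))  // key cleared
}
```

Without the clearing step, the second row inherits `2020` from the first; with it, the key is absent and the column can be reported as null.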

devanbenz · 2026-06-24T20:48:19Z

+				if err := ValidateDatePart(expr.Args); err != nil {
+					return err
 				}
+				// GROUP BY date_part over a subquery source is not supported: the


It is much easier to just not support GROUP BY for date_part in sub-queries. I could adjust the code to support it but it will add additional complexity that I'm unsure we want.

davidby-influx · 2026-06-25T22:03:58Z

Code Review — `date_part` builtin + `GROUP BY date_part(...)`

Branch: db/76/date_part vs origin/master-1.x
Diff scope: merge-base 66b0dd767660 … HEAD 474ae657fa (24 files, +5,579 / −607)
Review: xhigh, workflow-backed (54 agents; 41 candidates verified → 19 refuted, 15 reported)

Findings are ranked most-severe first. Verdicts are from independent verifier agents.

🔴 Correctness — silently wrong results / data loss (single-node OSS, all CONFIRMED)

1. `SELECT … INTO` with `GROUP BY date_part` silently drops all but the last group

coordinator/statement_executor.go:1415 — convertRowToPoints treats the injected
date_part column (e.g. year) as a regular field, and every group row shares the same
tag set and the same representative bucket timestamp (single window, Interval=0).
Nothing blocks the INTO path.

SELECT count(v) INTO target FROM cpu GROUP BY date_part('year',time) writes
{year:2020,count:N1}@t0 and {year:2021,count:N2}@t0 with identical
measurement/tags/timestamp → they collide, last-write-wins, silent data loss,
plus a stray int field named after the part.

This is the worst one — silent data loss.
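
The collision can be pictured with a toy store keyed by measurement, tag set, and timestamp. This is a sketch under the assumption that points with an identical key merge their fields with the newer value winning; `seriesPoint`, `groupRow`, and `writeInto` are hypothetical names and do not correspond to `convertRowToPoints`.

```go
package main

import "fmt"

// seriesPoint identifies a stored point: measurement, tag set, timestamp.
type seriesPoint struct {
	measurement string
	tags        string
	timestamp   int64
}

// groupRow is one GROUP BY date_part result row, with the part value
// carried as an ordinary field.
type groupRow struct {
	year  int64
	count int64
}

// writeInto models last-write-wins storage: fields of points that share
// measurement, tags, and timestamp are merged, newer values replacing older.
func writeInto(measurement, tags string, ts int64, rows []groupRow) map[seriesPoint]map[string]int64 {
	store := make(map[seriesPoint]map[string]int64)
	for _, r := range rows {
		key := seriesPoint{measurement: measurement, tags: tags, timestamp: ts}
		if store[key] == nil {
			store[key] = make(map[string]int64)
		}
		store[key]["year"] = r.year
		store[key]["count"] = r.count
	}
	return store
}

func main() {
	rows := []groupRow{{year: 2020, count: 10}, {year: 2021, count: 7}}
	store := writeInto("target", "", 0, rows)
	key := seriesPoint{measurement: "target", tags: "", timestamp: 0}
	fmt.Println(len(store), store[key])
}
```

Two group rows go in and a single point comes out, holding only the last group's values. Giving each group a distinct tag or timestamp would keep the keys apart.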

2. Multi-call `GROUP BY date_part` merges aggregates across groups, mislabeled

query/cursor.go:405 — multiScannerCursor.scan aligns per-call scanners on
(ts,name,tags) only and writes into one shared map keyed by the single
DatePartDimensionsString; each scanner's date_part value overwrites the others.

SELECT count(v), count(w) FROM cpu GROUP BY date_part('year',time) pairs one field's
count from one year with the other's count from a different year, stamped with
whichever scanner ran last. No compile guard rejects 2+ calls; no test covers it.

3. Raw (non-aggregate) `SELECT … GROUP BY date_part` does no grouping at all

query/select.go:710 — accepted but takes the aux-cursor branch with no
reduce/DimensionGrouper, so ScanAt hits the plain-int64 arm,
DatePartDimensionsString is never set, and GroupingKeys stays nil.

SELECT value FROM cpu GROUP BY date_part('year',time) returns one flat ungrouped
series with an extra year column — silently diverging from GROUP BY <tag>
semantics, no error.

4. `fill(null)` fragments date_part series under `GROUP BY time(), date_part(...)`

query/select.go:571 — filled points use a fixed auxFields slice that never
carries a DecodedDatePartKey, so empty-window rows lose the grouping value.
validateDatePartSelectFields rejects fill(previous/linear/number) but not the
default fill(null).

Empty windows emit null rows with empty GroupingKeys; the emitter's
sameGroupingKeys check then splits them into spurious extra series and fragments the
real ones.

5. Subquery validation bypass

query/compile.go:1411 — validateDatePartAnchor and the wildcard-collision
re-check run only on the outer statement in Prepare, never recursing into subquery
sources.

SELECT max(yr) FROM (SELECT host, date_part('year',time) AS yr FROM cpu) compiles
cleanly even though the equivalent top-level query is rejected — inner query plans as a
tag-only iterator emitting no points, so max(yr) silently returns nothing instead of
erroring. Inner SELECT * colliding with a stored field named year also escapes the
re-check.

🟠 Semantics diverge from SQL — but locked in by the new tests, so possibly intentional (CONFIRMED)

| # | Location | Divergence |
| --- | --- | --- |
| 6 | `query/date_part.go:152` | `millisecond`/`microsecond` return only the sub-second part; Postgres returns `seconds*1000 + frac` (off by up to 59,000 ms). |
| 7 | `query/date_part.go:162` | `epoch` uses `t.Unix()`, truncating fractional seconds that SQL `epoch` keeps. |
| 8 | `query/date_part.go:163` | `isodow` returns 0–6 (Sun=6) instead of SQL 1–7 (Sun=7), which is off by one. `date_part_test.go` asserts the 0–6 values. |

Worth a deliberate decision: if the intent is SQL compatibility, fix the code and the
tests; if InfluxDB-specific semantics are intended, document it.
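
To make the three divergences concrete, the sketch below computes each value both ways for one timestamp. The function names are hypothetical, and the "SQL" variants follow the Postgres behavior as described in the findings above rather than any code in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// at builds a UTC timestamp from plain integers (example helper).
func at(year, month, day, hour, minute, sec, nsec int) time.Time {
	return time.Date(year, time.Month(month), day, hour, minute, sec, nsec, time.UTC)
}

// subSecondMillis returns only the sub-second part, as the PR does.
func subSecondMillis(t time.Time) int64 {
	return int64(t.Nanosecond() / 1_000_000)
}

// sqlMillis returns the seconds field including its fraction, times 1000.
func sqlMillis(t time.Time) float64 {
	return float64(t.Second())*1000 + float64(t.Nanosecond())/1e6
}

// wholeEpoch truncates to whole seconds, like t.Unix().
func wholeEpoch(t time.Time) int64 {
	return t.Unix()
}

// sqlEpoch keeps the fractional seconds.
func sqlEpoch(t time.Time) float64 {
	return float64(t.Unix()) + float64(t.Nanosecond())/1e9
}

// isodowZeroBased maps Monday..Sunday to 0..6.
func isodowZeroBased(t time.Time) int64 {
	return int64((int(t.Weekday()) + 6) % 7)
}

// isodowSQL maps Monday..Sunday to 1..7.
func isodowSQL(t time.Time) int64 {
	return isodowZeroBased(t) + 1
}

func main() {
	t := at(2023, 1, 1, 13, 45, 30, 250_000_000) // a Sunday
	fmt.Println("millisecond:", subSecondMillis(t), "vs", sqlMillis(t))
	fmt.Println("epoch:", wholeEpoch(t), "vs", sqlEpoch(t))
	fmt.Println("isodow:", isodowZeroBased(t), "vs", isodowSQL(t))
}
```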

🟡 Clustered/distributed only — not reachable in single-node OSS (PLAUSIBLE)

9. query/point.go:266 — encodeAux/decodeAux can't serialize the
DecodedDatePartKey struct; over the data-node wire codec the grouped value comes back
null and all buckets collapse into one group.
10. query/iterator.gen.go.tmpl:1486 — the generic FilterIterator.Next uses
EvalBool with a non-CallValuer map, so a date_part(...) WHERE predicate reaching
it filters out every point (zero rows). No in-repo callers of NewFilterIterator;
reachability uncertain.
11. tsdb/engine/tsm1/iterator.gen.go.tmpl:289 — itr.m is allocated only when
Condition != nil, but written whenever NeedTimeRef. Safe locally (encoder keeps the
invariant), but a wire-decoded options struct with NeedTimeRef=true, Condition=nil
panics on a nil-map write → crashes the query/node.
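
Item 11 rests on a Go language rule: reading from a nil map is safe, but writing to one panics. The sketch below reproduces that with a hypothetical `options` struct and shows one possible guard, allocating whenever a time reference is needed. None of these names come from the tsm1 template.

```go
package main

import "fmt"

// options is a hypothetical stand-in for the iterator options discussed above.
type options struct {
	NeedTimeRef  bool
	HasCondition bool // models Condition != nil
}

// newValueMap allocates the scratch map. With allocForTimeRef false it
// mirrors the reported behavior (allocate only when a condition exists);
// with true it also allocates whenever a time reference is needed.
func newValueMap(opt options, allocForTimeRef bool) map[string]interface{} {
	if opt.HasCondition || (allocForTimeRef && opt.NeedTimeRef) {
		return make(map[string]interface{})
	}
	return nil
}

// setTimeRef writes the timestamp into m, converting the panic raised by a
// write to a nil map into an error so the example can report it.
func setTimeRef(m map[string]interface{}, ts int64) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	m["time"] = ts
	return nil
}

func main() {
	opt := options{NeedTimeRef: true, HasCondition: false}
	fmt.Println("ungated:", setTimeRef(newValueMap(opt, false), 1))
	fmt.Println("gated:  ", setTimeRef(newValueMap(opt, true), 1))
}
```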

⚡ Efficiency (CONFIRMED)

12. ⭐ `DatePartValuer` wired into every tsm1 query, date_part or not

tsdb/engine/tsm1/iterator.gen.go.tmpl:235 — unconditionally adds an extra valuer
indirection (an interface call per VarRef/Call lookup, per scanned point) to all
WHERE-filtered queries — the common path. The scanner cursor already gates the
identical wiring behind needDatePart; opt.NeedTimeRef could gate it here too.

This is a per-point CPU regression across the whole engine, not just date_part queries
— arguably higher priority than its category suggests.

13. `ResolveKeys` allocates per-point garbage discarded on bucket hit

query/date_part.go:381 — allocates a fresh entries slice + a 9-byte EncodedKey
string per point, but the reduce loop consumes EncodedKey only on bucket creation (map
miss) → K−1 of every K allocations are GC garbage. Return only DimKey, compute
EncodedKey lazily, reuse a scratch buffer.
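
The suggested shape, a comparable key for lookups with the encoded string built only on a map miss, might look like the sketch below. The `dimKey` layout, the 1-byte part plus 8-byte big-endian value encoding, and the `grouper` type are assumptions made for illustration; only the 9-byte length is taken from the finding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// dimKey is a comparable grouping key, so it can index a map directly.
type dimKey struct {
	part  uint8
	value int64
}

// encodeKey builds a 9-byte string form of the key. It allocates, so the
// grouper below calls it only when a new bucket is created.
func encodeKey(k dimKey) string {
	var buf [9]byte
	buf[0] = k.part
	binary.BigEndian.PutUint64(buf[1:], uint64(k.value))
	return string(buf[:])
}

type bucket struct {
	encodedKey string
	count      int
}

// grouper counts points per bucket and tracks how often encodeKey ran.
type grouper struct {
	buckets map[dimKey]*bucket
	encodes int
}

func newGrouper() *grouper {
	return &grouper{buckets: make(map[dimKey]*bucket)}
}

// add looks the bucket up by dimKey and encodes only on a map miss.
func (g *grouper) add(k dimKey) {
	b, ok := g.buckets[k]
	if !ok {
		b = &bucket{encodedKey: encodeKey(k)}
		g.encodes++
		g.buckets[k] = b
	}
	b.count++
}

func main() {
	g := newGrouper()
	for _, year := range []int64{2020, 2020, 2021, 2020, 2021} {
		g.add(dimKey{part: 0, value: year})
	}
	fmt.Println("buckets:", len(g.buckets), "encodes:", g.encodes)
}
```

Five points produce two encodings instead of five, so the per-point cost on a bucket hit is a single map lookup.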

14. Per-field redundant work in `Scan`

query/cursor.go:277 — the DatePartDimensionsString lookup, type assertion,
dpd.Expr.String(), and GroupingKeys insert are recomputed per field though invariant
across the field loop; hoist above it.

📋 Convention (CONFIRMED)

15. New date_part tests use raw `t.Fatal`/`t.Error` instead of testify

tests/server_test.go:8085 (also 8790, 8863, 9242, 9297) — violates the project's
testify rule. The same functions already use require.NoError(t, err, "init error") for
setup, so it's internally inconsistent too.

Refuted (19, not reported)

Mostly DRY/maintainability "keep-in-sync" observations (parallel DatePartExpr switches,
duplicated AST walkers, LocationOrUTC not reused) where verifiers found no current
observable defect, plus the millisecond/isodow duplicates and a
DatePartValuer{}-in-compileFields "dead code" claim that was refuted (it can fire
with a literal second arg).

Suggested fix order

1 (data loss) and 12 (engine-wide perf regression) — highest impact.
Correctness cluster 2–5.
Decision on SQL semantics 6–8 (fix code+tests, or document).
Mechanical cleanups 13–15.

davidby-influx · 2026-06-25T22:06:29Z

AI review not verified by a human. Take with a grain of salt.

devanbenz added 5 commits December 1, 2025 11:53

feat: Implementing date_part

8148077

feat: Add date_part constants and validation

017d816

feat: Adds date_part built in function

4e4ecad

feat: Implement date_part builtin function

fe5551b

chore: checkfmt

2da8c6d

devanbenz self-assigned this Dec 4, 2025

devanbenz added 6 commits December 4, 2025 10:43

fix: fixes TestDatePartValuer test

70314d7

feat: Use type cast instead of switch, we only expect StringLiteral

5ce1eb5

feat: Update ExtractDatePartExpr to return int64 instead of interface

a18ae51

feat: Working on it

7b2fb86

feat: Trying ti implement SELECT date_part semantics

fc6ccf0

feat: modify selector for date_part so it allows multiple calls

47c37f5

devanbenz linked an issue Dec 8, 2025 that may be closed by this pull request

[1.x] Add date_part scalar function to influxdb #27001

Open

devanbenz added 9 commits December 9, 2025 10:16

feat: Remove nil reference, create global config for IsDatePart

5015287

Remove a lot of code that wasn't needed for date_part including iterator creation. We can just map values similar to simple math functions.

feat: Use constants for date_part and date_part_time

9dd9f89

feat: Add tests for multiple function calls

5f007cf

feat: Adds sub-query tests

993837e

chore: prune some bad tests

d38787b

feat: Simplify code

b1b1d0b

feat: Update Validation for date_part to not return other members

7ad056d

feat: re-add validatedatepart

7c1591a

chore: simplify code

4591a19

devanbenz marked this pull request as ready for review December 9, 2025 21:50

gwossum reviewed Dec 9, 2025

Comment thread query/cursor.go Outdated

gwossum reviewed Dec 9, 2025

Comment thread query/date_part.go Outdated

gwossum reviewed Dec 9, 2025

davidby-influx requested changes Dec 9, 2025

devanbenz added 3 commits December 10, 2025 10:37

feat: Update cursor map for time ref

48f2235

feat: Merge branch 'master-1.x' into db/76/date_part

c9d8d4c

feat: Update constants, add invalid enum, check if using date_part wh…

cb92353

…ile setting time map

Copilot started reviewing on behalf of devanbenz June 23, 2026 21:26

Copilot AI reviewed Jun 23, 2026

Comment thread query/iterator.go

fix: defensive nil check

84c46dc

Co-authored-by: Copilot Autofix powered by AI <[email protected]>

devanbenz requested a review from Copilot June 24, 2026 00:14

Copilot started reviewing on behalf of devanbenz June 24, 2026 00:14

Copilot AI reviewed Jun 24, 2026

Comment thread query/compile.go

devanbenz added 2 commits June 24, 2026 14:00

feat: Add anchor validation

11946b5

feat: Merge branch 'db/76/date_part' of github.com:influxdata/influxd…

0272a34

…b into db/76/date_part

devanbenz requested a review from Copilot June 24, 2026 19:01

Copilot started reviewing on behalf of devanbenz June 24, 2026 19:01

Copilot AI reviewed Jun 24, 2026

Copilot started work on behalf of devanbenz June 24, 2026 19:18

Copilot finished work on behalf of devanbenz June 24, 2026 19:25

Copilot AI and others added 2 commits June 24, 2026 14:48

chore: Merge origin/master-1.x into db/76/date_part

c368482

feat: Update protos, dang you copilot for borking it!

d39380f

devanbenz force-pushed the db/76/date_part branch from a51d744 to d39380f Compare June 24, 2026 19:49

devanbenz requested a review from Copilot June 24, 2026 19:49

Copilot started reviewing on behalf of devanbenz June 24, 2026 19:50

Copilot AI reviewed Jun 24, 2026

Comment thread query/compile.go

feat: Ensure that validation works after rewriting

474ae65

devanbenz requested a review from Copilot June 24, 2026 20:38

Copilot started reviewing on behalf of devanbenz June 24, 2026 20:39

Copilot AI reviewed Jun 24, 2026

devanbenz marked this pull request as ready for review June 24, 2026 20:46

devanbenz commented Jun 24, 2026

feat: Support SELECT INTO, load date_part valuer only for date_part

0c8ace0

- cleanup tests to use testify only
