schedulertest: scheduler invariants and deterministic tables#10727

Draft
davidporter-id-au wants to merge 4 commits into
temporalio:mainfrom
davidporter-id-au:feature/chasm-scheduler-invariants

@davidporter-id-au

Contributor

What

Adds CheckInvariants, evaluated after every settled transition via an AfterStep hook, encoding the legitimate-quiescence rules from Scheduler.isHeldOpen and getIdleExpiration:

  • no-stuck (primary): a non-closed schedule may have no pending task only when held open (paused or draining a backfill); any other taskless, non-closed state is the stuck-open bug class.
  • idle-not-while-held-open: a held-open schedule must not arm an idle close.
  • closed-is-terminal: a closed schedule never reopens.
  • high-water-mark monotonicity for the generator and invoker.

Deterministic table tests run several schedules (running, paused, exhausted, short-interval) forward with the hook wired in, plus direct unit tests proving the predicate flags a stuck state and an HWM regression while allowing a legitimate held-open-without-task state.
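
The four rules above can be sketched as a pure predicate over two consecutive snapshots. This is only an illustration under stated assumptions: the `Snapshot` fields, the `heldOpen` helper, and the `checkInvariants` signature are hypothetical and are not taken from invariants.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Snapshot is a hypothetical, simplified view of a schedule's observable
// state. Every field name here is an assumption made for this sketch.
type Snapshot struct {
	Closed       bool
	Paused       bool
	Backfilling  bool  // draining a backfill
	PendingTasks int   // tasks the schedule has queued for itself
	IdleArmed    bool  // an idle-close timer is armed
	GeneratorHWM int64 // high-water marks, e.g. Unix nanoseconds
	InvokerHWM   int64
}

// heldOpen mirrors the description's "paused or draining a backfill".
func (s Snapshot) heldOpen() bool { return s.Paused || s.Backfilling }

// checkInvariants compares the previous and current snapshots and returns
// all violated rules joined into one error, or nil when every rule holds.
func checkInvariants(prev, cur Snapshot) error {
	var errs []error
	if !cur.Closed && cur.PendingTasks == 0 && !cur.heldOpen() {
		errs = append(errs, errors.New("no-stuck: open schedule has no pending task and is not held open"))
	}
	if !cur.Closed && cur.heldOpen() && cur.IdleArmed {
		errs = append(errs, errors.New("idle-not-while-held-open: idle close armed while held open"))
	}
	if prev.Closed && !cur.Closed {
		errs = append(errs, errors.New("closed-is-terminal: closed schedule reopened"))
	}
	if cur.GeneratorHWM < prev.GeneratorHWM {
		errs = append(errs, fmt.Errorf("generator high-water mark regressed: %d -> %d", prev.GeneratorHWM, cur.GeneratorHWM))
	}
	if cur.InvokerHWM < prev.InvokerHWM {
		errs = append(errs, fmt.Errorf("invoker high-water mark regressed: %d -> %d", prev.InvokerHWM, cur.InvokerHWM))
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	paused := Snapshot{Paused: true} // held open, no task: legitimate
	stuck := Snapshot{}              // open, no task, not held open: the bug class
	fmt.Println("paused:", checkInvariants(paused, paused))
	fmt.Println("stuck: ", checkInvariants(stuck, stuck))
}
```

Keeping the predicate free of side effects means the same function can back both the per-step hook and the direct unit tests that feed it hand-built states.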

Stack

Stacked on #10726 (harness) → #10725 (bugfix). Diff against main is cumulative; net-new here is invariants.go + invariants_test.go.

Test plan

  • go test ./chasm/lib/scheduler/schedulertest/ -run 'TestInvariants|TestCheckInvariants'

🤖 Generated with Claude Code

davidporter-id-au and others added 4 commits June 16, 2026 00:06
CreateSchedulerFromMigration stored whatever LastCompletionResult the migrated
V1 state carried, which is nil when the schedule had no prior completion (e.g.
migrated before its first action) — convertLastCompletionLegacyToCHASM returns
nil in that case. InvokerExecuteTaskHandler.startWorkflow then dereferences
lastCompletionState.Success/.Failure unconditionally, panicking on the first
workflow start after migration.

The normal CreateScheduler/NewScheduler path defaults to a non-nil empty
&schedulerpb.LastCompletionResult{}, so it never hit this. Default to the same
non-nil empty value in CreateSchedulerFromMigration.
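
A minimal sketch of the defaulting pattern described above, using stand-in types. `lastCompletionResult`, `orEmpty`, and `describe` are hypothetical names for illustration, not the real schedulerpb message or the actual change in CreateSchedulerFromMigration.

```go
package main

import "fmt"

// lastCompletionResult is a hypothetical stand-in for the real message type.
type lastCompletionResult struct {
	Success []byte
	Failure string
}

// orEmpty returns r unchanged when it is non-nil, and a non-nil empty value
// otherwise, so later code can read its fields without a nil check.
func orEmpty(r *lastCompletionResult) *lastCompletionResult {
	if r == nil {
		return &lastCompletionResult{}
	}
	return r
}

// describe reads fields directly, the way an unconditional dereference would.
// Passing a nil pointer here would panic.
func describe(r *lastCompletionResult) string {
	return fmt.Sprintf("success=%d bytes, failure=%q", len(r.Success), r.Failure)
}

func main() {
	var migrated *lastCompletionResult // nil: no completion before migration
	fmt.Println(describe(orEmpty(migrated)))
}
```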

Found by the V2->V1->V2 migration round-trip test.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 23ad378b19a638df2603c47c0e3a806e9877bf2e)

Test drivers that step a component forward need to invoke node-level task
dispatch (EachPureTask/ExecuteSideEffectTask) and inspect accumulated physical
tasks, which the Engine API did not expose. Add:

- NodeForRef / BackendForRef accessors on the in-memory Engine.
- WithNodeBackendDecorator option, applied to each execution's MockNodeBackend
  at creation time, so callers can supply handlers the default leaves unset
  (e.g. HandleGetNamespaceEntry, or a constant transition counter).
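
The decorator option can be pictured as an ordinary functional option that runs against each backend as it is created. The sketch below uses hypothetical stand-ins (`nodeBackend`, `engine`, `withBackendDecorator`); it does not reproduce the chasmtest types or signatures.

```go
package main

import "fmt"

// nodeBackend is a hypothetical stand-in for a mock backend whose handlers
// stay nil unless a caller sets them.
type nodeBackend struct {
	HandleGetNamespace func() string
}

// namespace reports the handler's result, or a placeholder when it is unset.
func (b *nodeBackend) namespace() string {
	if b.HandleGetNamespace == nil {
		return "<unset>"
	}
	return fmt.Sprintf("namespace %q", b.HandleGetNamespace())
}

// engine holds an optional decorator; nil means "leave backends untouched".
type engine struct {
	decorate func(*nodeBackend)
}

type option func(*engine)

// withBackendDecorator registers fn to run on every backend at creation time.
func withBackendDecorator(fn func(*nodeBackend)) option {
	return func(e *engine) { e.decorate = fn }
}

func newEngine(opts ...option) *engine {
	e := &engine{}
	for _, opt := range opts {
		opt(e)
	}
	return e
}

// newBackend builds a default backend and applies the decorator, if any.
func (e *engine) newBackend() *nodeBackend {
	b := &nodeBackend{}
	if e.decorate != nil {
		e.decorate(b)
	}
	return b
}

func main() {
	plain := newEngine().newBackend()
	decorated := newEngine(withBackendDecorator(func(b *nodeBackend) {
		b.HandleGetNamespace = func() string { return "test-ns" }
	})).newBackend()
	fmt.Println(plain.namespace())
	fmt.Println(decorated.namespace())
}
```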

All additive and default-nil; existing chasmtest users are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit aeddc4227acc0ec1c8a8b6b0034efa0d4c6bc42b)

Add a harness that builds a real CHASM scheduler on a chasmtest.Engine and steps
it forward through every task it schedules for itself, advancing a controllable
clock. Pure tasks dispatch via Node.EachPureTask and side-effect tasks via
Node.ExecuteSideEffectTask (the production dispatch paths). The driver exposes
Step/RunToQuiescence, Pause/Unpause/TriggerImmediately/MigrateToWorkflow
operations, a Snapshot of observable state, and an AfterStep hook.

This lives in a non-test package so property and migration tests can import it.
library.go mirrors the in-package fixtures (NewTestLibrary, DefaultSchedule, ...)
but exported and wired with the mock clients the side-effect handlers need.

Smoke tests cover the three settled behaviors: a running interval schedule ticks
forever, a paused schedule is held open and keeps ticking, and a schedule with
no remaining actions idles then closes.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 77342fdbbdf5ef7fadaa16553a87c496edb2db81)

Add CheckInvariants, evaluated after every settled transition via an AfterStep
hook, encoding the legitimate-quiescence rules from Scheduler.isHeldOpen and
getIdleExpiration:

- no-stuck (primary): a non-closed schedule may have no pending task only when
  held open (paused or draining a backfill); any other taskless, non-closed
  state is the stuck-open bug class.
- idle-not-while-held-open: a held-open schedule must not arm an idle close.
- closed-is-terminal: a closed schedule never reopens.
- high-water-mark monotonicity for the generator and invoker.

Deterministic table tests run several schedules (running, paused, exhausted,
short-interval) forward with the hook wired in, plus direct unit tests proving
the predicate flags a stuck state and an HWM regression while allowing a
legitimate held-open-without-task state.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 1c2ac604e4638e60b8f2ca180ba25e671f0c936b)