schedulertest: V2->V1->V2 scheduler migration round-trip test #10729

Draft
davidporter-id-au wants to merge 5 commits into temporalio:main from davidporter-id-au:feature/chasm-scheduler-migration-roundtrip

@davidporter-id-au

What

Exercises a full migration round trip and asserts that state is preserved end to end. Both hops run real code. On the way out, the real SchedulerMigrateToWorkflowTask exports state via CHASMToLegacyStartScheduleArgs, and the test captures the resulting StartScheduleArgs from the history client. On the way back, the captured StartScheduleArgs is re-imported via LegacyToCreateFromMigrationStateRequest (the same function the V1 workflow's activity calls) and CreateSchedulerFromMigration into a fresh V2 engine, which is then run forward with invariants enforced. The test asserts that the conflict token and high-water mark survive the round trip.

The V1 workflow's own runtime ticking between hops is not executed here: it needs the V1 SDK testsuite plus unexported package scheduler fixtures, which would create an import cycle with this package; that behavior is covered by service/worker/scheduler/workflow_test.go.
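The shape of the round-trip assertion can be sketched in a few lines. This is only an illustration under stated assumptions: `v2State`, `v1Args`, `exportToV1`, `importFromV1`, and `roundTripPreserves` are hypothetical stand-ins, not the real CHASM scheduler types or the conversion functions named above.

```go
package main

import (
	"fmt"
	"time"
)

// v2State is a hypothetical stand-in for the V2 (CHASM) scheduler state.
type v2State struct {
	ConflictToken int64
	HighWaterMark time.Time
	Paused        bool
}

// v1Args is a hypothetical stand-in for the V1 StartScheduleArgs payload.
type v1Args struct {
	ConflictToken     int64
	LastProcessedTime time.Time
	Paused            bool
}

// exportToV1 models the V2 -> V1 hop (hypothetical conversion).
func exportToV1(s v2State) v1Args {
	return v1Args{ConflictToken: s.ConflictToken, LastProcessedTime: s.HighWaterMark, Paused: s.Paused}
}

// importFromV1 models the V1 -> V2 hop (hypothetical conversion).
func importFromV1(a v1Args) v2State {
	return v2State{ConflictToken: a.ConflictToken, HighWaterMark: a.LastProcessedTime, Paused: a.Paused}
}

// roundTripPreserves reports whether the fields of interest survive both hops.
func roundTripPreserves(s v2State) bool {
	back := importFromV1(exportToV1(s))
	return back.ConflictToken == s.ConflictToken &&
		back.HighWaterMark.Equal(s.HighWaterMark) &&
		back.Paused == s.Paused
}

func main() {
	s := v2State{
		ConflictToken: 7,
		HighWaterMark: time.Date(2026, 1, 1, 0, 5, 0, 0, time.UTC),
		Paused:        true,
	}
	fmt.Println("round trip preserved state:", roundTripPreserves(s))
}
```

Comparing only a named set of fields keeps the check focused on what the migration is meant to preserve, rather than on incidental representation differences between the two formats.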

This test surfaced the nil LastCompletionResult panic fixed in #10725.

Stack

Stacked on #10727 (invariants) → #10726 (harness) → #10725 (bugfix; required, or this test panics). Sibling of the property-test PR. Diff against main is cumulative; net-new here is migration_roundtrip_test.go.

Test plan

  • go test ./chasm/lib/scheduler/schedulertest/ -run TestMigrationRoundTrip

🤖 Generated with Claude Code

davidporter-id-au and others added 5 commits June 16, 2026 00:06
CreateSchedulerFromMigration stored whatever LastCompletionResult the migrated
V1 state carried, which is nil when the schedule had no prior completion (e.g.
migrated before its first action) — convertLastCompletionLegacyToCHASM returns
nil in that case. InvokerExecuteTaskHandler.startWorkflow then dereferences
lastCompletionState.Success/.Failure unconditionally, panicking on the first
workflow start after migration.

The normal CreateScheduler/NewScheduler path defaults to a non-nil empty
&schedulerpb.LastCompletionResult{}, so it never hit this. Default to the same
non-nil empty value in CreateSchedulerFromMigration.
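The failure mode and the fix can be illustrated with a minimal, self-contained sketch. The types and functions below (`lastCompletionResult`, `migratedState`, `scheduler`, `newFromMigration`, `describeStart`) are hypothetical stand-ins, not the actual schedulerpb or CHASM scheduler code.

```go
package main

import "fmt"

// lastCompletionResult is a hypothetical stand-in for schedulerpb.LastCompletionResult.
type lastCompletionResult struct {
	Success string
	Failure string
}

// migratedState is a hypothetical stand-in for the state carried over from V1.
type migratedState struct {
	// LastCompletion is nil when the schedule never completed an action.
	LastCompletion *lastCompletionResult
}

// scheduler is a hypothetical stand-in for the V2 scheduler component.
type scheduler struct {
	lastCompletion *lastCompletionResult
}

// newFromMigration defaults a nil result to a non-nil empty value, so that
// later unconditional field reads cannot dereference a nil pointer.
func newFromMigration(s migratedState) *scheduler {
	lc := s.LastCompletion
	if lc == nil {
		lc = &lastCompletionResult{}
	}
	return &scheduler{lastCompletion: lc}
}

// describeStart reads the fields unconditionally, as a workflow-start path might.
func describeStart(s *scheduler) string {
	return fmt.Sprintf("success=%q failure=%q", s.lastCompletion.Success, s.lastCompletion.Failure)
}

func main() {
	// A schedule migrated before its first action carries no completion result.
	s := newFromMigration(migratedState{})
	fmt.Println(describeStart(s))
}
```

Defaulting at construction time keeps the non-nil guarantee in one place, so readers of the field do not each need their own nil check.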

Found by the V2->V1->V2 migration round-trip test.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 23ad378b19a638df2603c47c0e3a806e9877bf2e)
Test drivers that step a component forward need to invoke node-level task
dispatch (EachPureTask/ExecuteSideEffectTask) and inspect accumulated physical
tasks, which the Engine API did not expose. Add:

- NodeForRef / BackendForRef accessors on the in-memory Engine.
- WithNodeBackendDecorator option, applied to each execution's MockNodeBackend
  at creation time, so callers can supply handlers the default leaves unset
  (e.g. HandleGetNamespaceEntry, or a constant transition counter).

All additive and default-nil; existing chasmtest users are unaffected.
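A decorator option of this kind is commonly built with Go's functional-options pattern. The sketch below shows the general idea only; `backend`, `engine`, `withBackendDecorator`, `newEngine`, and `backendFor` are hypothetical names and do not reflect the real chasmtest signatures.

```go
package main

import "fmt"

// backend is a hypothetical stand-in for a mock node backend.
type backend struct {
	// HandleGetNamespace is unset (nil) by default.
	HandleGetNamespace func() string
}

// engine is a hypothetical stand-in for the in-memory engine.
type engine struct {
	decorate func(*backend) // nil unless an option sets it
	backends map[string]*backend
}

// option configures an engine at construction time.
type option func(*engine)

// withBackendDecorator registers a function applied to each backend when it is created.
func withBackendDecorator(fn func(*backend)) option {
	return func(e *engine) { e.decorate = fn }
}

// newEngine builds an engine and applies the given options in order.
func newEngine(opts ...option) *engine {
	e := &engine{backends: map[string]*backend{}}
	for _, o := range opts {
		o(e)
	}
	return e
}

// backendFor returns the backend for ref, creating and decorating it on first use.
func (e *engine) backendFor(ref string) *backend {
	if b, ok := e.backends[ref]; ok {
		return b
	}
	b := &backend{}
	if e.decorate != nil {
		e.decorate(b)
	}
	e.backends[ref] = b
	return b
}

func main() {
	e := newEngine(withBackendDecorator(func(b *backend) {
		b.HandleGetNamespace = func() string { return "test-namespace" }
	}))
	fmt.Println(e.backendFor("schedule-1").HandleGetNamespace())
}
```

Because the decorator defaults to nil and is only consulted when a backend is created, callers that pass no option see the same behavior as before.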

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit aeddc4227acc0ec1c8a8b6b0034efa0d4c6bc42b)
Add a harness that builds a real CHASM scheduler on a chasmtest.Engine and steps
it forward through every task it schedules for itself, advancing a controllable
clock. Pure tasks dispatch via Node.EachPureTask and side-effect tasks via
Node.ExecuteSideEffectTask (the production dispatch paths). The driver exposes
Step/RunToQuiescence, Pause/Unpause/TriggerImmediately/MigrateToWorkflow
operations, a Snapshot of observable state, and an AfterStep hook.
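The stepping model described above can be sketched with a small fake-clock driver. This is a simplified illustration, not the harness itself: `task`, `driver`, `schedule`, `step`, `runToQuiescence`, and `runLimitedSchedule` are hypothetical, and real CHASM task dispatch is considerably more involved.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// task is a hypothetical scheduled unit of work; run may schedule follow-up tasks.
type task struct {
	at  time.Time
	run func(d *driver)
}

// driver is a hypothetical stand-in for the harness driver with a controllable clock.
type driver struct {
	now       time.Time
	pending   []task
	afterStep func(d *driver) // optional hook, invoked after each step
}

func (d *driver) schedule(at time.Time, run func(*driver)) {
	d.pending = append(d.pending, task{at: at, run: run})
}

// step advances the clock to the earliest pending task and runs it.
// It reports false when nothing is pending.
func (d *driver) step() bool {
	if len(d.pending) == 0 {
		return false
	}
	sort.SliceStable(d.pending, func(i, j int) bool { return d.pending[i].at.Before(d.pending[j].at) })
	t := d.pending[0]
	d.pending = d.pending[1:]
	if t.at.After(d.now) {
		d.now = t.at
	}
	t.run(d)
	if d.afterStep != nil {
		d.afterStep(d)
	}
	return true
}

// runToQuiescence steps until no task is pending or maxSteps is reached.
func (d *driver) runToQuiescence(maxSteps int) int {
	n := 0
	for n < maxSteps && d.step() {
		n++
	}
	return n
}

// runLimitedSchedule drives a one-minute interval schedule limited to the given
// number of actions and returns the steps taken, hook calls, and simulated time elapsed.
func runLimitedSchedule(actions int) (steps, hookCalls int, elapsed time.Duration) {
	start := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	d := &driver{now: start}
	d.afterStep = func(*driver) { hookCalls++ }
	remaining := actions
	var tick func(*driver)
	tick = func(d *driver) {
		remaining--
		if remaining > 0 {
			d.schedule(d.now.Add(time.Minute), tick)
		}
	}
	d.schedule(start.Add(time.Minute), tick)
	steps = d.runToQuiescence(100)
	return steps, hookCalls, d.now.Sub(start)
}

func main() {
	steps, hooks, elapsed := runLimitedSchedule(3)
	fmt.Println(steps, hooks, elapsed)
}
```

The step cap in `runToQuiescence` matters for schedules that never go quiet: a running interval schedule would otherwise loop forever in a test.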

This lives in a non-test package so property and migration tests can import it.
library.go mirrors the in-package fixtures (NewTestLibrary, DefaultSchedule, ...)
but exported and wired with the mock clients the side-effect handlers need.

Smoke tests cover the three settled behaviors: a running interval schedule ticks
forever, a paused schedule is held open and keeps ticking, and a schedule with
no remaining actions idles then closes.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 77342fdbbdf5ef7fadaa16553a87c496edb2db81)
Add CheckInvariants, evaluated after every settled transition via an AfterStep
hook, encoding the legitimate-quiescence rules from Scheduler.isHeldOpen and
getIdleExpiration:

- no-stuck (primary): a non-closed schedule may have no pending task only when
  held open (paused or draining a backfill); any other taskless, non-closed
  state is the stuck-open bug class.
- idle-not-while-held-open: a held-open schedule must not arm an idle close.
- closed-is-terminal: a closed schedule never reopens.
- high-water-mark monotonicity for the generator and invoker.

Deterministic table tests run several schedules (running, paused, exhausted,
short-interval) forward with the hook wired in, plus direct unit tests proving
the predicate flags a stuck state and an HWM regression while allowing a
legitimate held-open-without-task state.
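The rules listed above can be expressed as a predicate over consecutive snapshots. The following is a simplified sketch under stated assumptions: `snapshot`, its fields, `heldOpen`, and `checkInvariants` are hypothetical and only approximate what the real CheckInvariants, Scheduler.isHeldOpen, and getIdleExpiration logic do.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot is a hypothetical view of observable scheduler state after a step.
type snapshot struct {
	Closed         bool
	Paused         bool
	Backfilling    bool
	PendingTasks   int
	IdleCloseArmed bool
	HighWaterMark  int64 // hypothetical encoding, e.g. Unix seconds
}

// heldOpen mirrors the "paused or draining a backfill" rule in simplified form.
func (s snapshot) heldOpen() bool { return s.Paused || s.Backfilling }

// checkInvariants compares the previous and current snapshots and returns the
// first violation found, or nil when all rules hold.
func checkInvariants(prev, cur snapshot) error {
	switch {
	case prev.Closed && !cur.Closed:
		return errors.New("closed-is-terminal: schedule reopened")
	case !cur.Closed && cur.PendingTasks == 0 && !cur.heldOpen():
		return errors.New("no-stuck: open schedule has no pending task and is not held open")
	case !cur.Closed && cur.heldOpen() && cur.IdleCloseArmed:
		return errors.New("idle-not-while-held-open: held-open schedule armed an idle close")
	case cur.HighWaterMark < prev.HighWaterMark:
		return errors.New("high-water-mark: moved backwards")
	}
	return nil
}

func main() {
	running := snapshot{PendingTasks: 1, HighWaterMark: 100}
	stuck := snapshot{PendingTasks: 0, HighWaterMark: 100}
	paused := snapshot{Paused: true, PendingTasks: 0, HighWaterMark: 100}

	fmt.Println("stuck:", checkInvariants(running, stuck))
	fmt.Println("paused:", checkInvariants(running, paused))
}
```

Taking both the previous and current snapshot lets the same function cover the state-local rules (no-stuck, idle-not-while-held-open) and the transition rules (closed-is-terminal, monotonic high-water mark).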

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 1c2ac604e4638e60b8f2ca180ba25e671f0c936b)
Exercise a full migration round trip and assert state is preserved end to end.
Both hops run real code: the real SchedulerMigrateToWorkflowTask exports state
via CHASMToLegacyStartScheduleArgs (captured from the history client), and the
captured StartScheduleArgs is re-imported via LegacyToCreateFromMigrationStateRequest
(the same function the V1 workflow's activity calls) and CreateSchedulerFromMigration
into a fresh V2 engine, then run forward with invariants enforced. Asserts the
conflict token and high-water mark survive the round trip.

The V1 workflow's own runtime ticking between hops is not executed here: it
needs the V1 SDK testsuite plus unexported package-scheduler fixtures, which
would create an import cycle with this package; that behavior is covered by
service/worker/scheduler/workflow_test.go. This test surfaced the nil
LastCompletionResult panic fixed earlier in this branch.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
(cherry picked from commit 36e82b09684c455fc9741dd03b9ccef3f6e64fef)