feat(recovery): start the World at server boot to recover in-flight runs #2544
Open
pranaygp wants to merge 10 commits into main from pgp/world-start-recovery
a817e90 feat(recovery): start the World at server boot to recover in-flight runs (pranaygp)
a2c635b fix(recovery): unbreak Nitro/Vite builds + bundled-version world startup (pranaygp)
a4f4c26 fix(nitro): drop auto startup plugin; document manual Nitro wiring (pranaygp)
44c0577 docs: import defineNitroPlugin in Nitro sample so code-samples typecheck (pranaygp)
6116307 docs: address review — optional startup step in getting-started + nav… (pranaygp)
54b5b2e docs: move startup-recovery note to the worlds that need it (pranaygp)
2ae642b feat(nitro): auto-start the World at server boot (pranaygp)
6e24235 test(recovery): cover runs killed mid-step (locked job), not just mid… (pranaygp)
345b17e docs: note Nitro/Nuxt auto-start in postgres-world startup section (pranaygp)
aadabdd fix(nuxt): drop redundant manual world-start plugin (auto-start cover… (pranaygp)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| '@workflow/nitro': minor | ||
| --- | ||
|
|
||
| Start the workflow World automatically at server boot via a generated Nitro plugin, so self-hosted Nitro apps (Nitro v2/v3, Nuxt, Express/Hono/Fastify on Nitro) recover in-flight runs after a restart with no manual wiring. Skipped on Vercel deploys. |
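
As a rough illustration of the "skipped on Vercel deploys" behavior described in this changeset, the sketch below decides whether a world-start plugin is appended to a plugin list. The `withWorldStartPlugin` helper, the plugin path, and detection through the `VERCEL` environment variable are all assumptions made for illustration; the generated plugin in this PR may be registered and gated differently.

```ts
// Hypothetical plugin path; the real generated plugin location is not shown in this excerpt.
const WORLD_START_PLUGIN = './plugins/workflow-world-start';

type Env = Record<string, string | undefined>;

// Returns a new plugin list, appending the world-start plugin unless running on Vercel.
// Assumption: Vercel is detected through its VERCEL=1 system environment variable.
function withWorldStartPlugin(plugins: readonly string[], env: Env): string[] {
  const onVercel = env.VERCEL === '1';
  if (onVercel || plugins.includes(WORLD_START_PLUGIN)) {
    // Nothing to add: either boot recovery is unnecessary or the plugin is already present.
    return [...plugins];
  }
  return [...plugins, WORLD_START_PLUGIN];
}

const selfHostedPlugins = withWorldStartPlugin(['./plugins/auth'], {});
const vercelPlugins = withWorldStartPlugin(['./plugins/auth'], { VERCEL: '1' });
console.log(selfHostedPlugins.length, vercelPlugins.length); // prints "2 1"
```

The `includes` check keeps the helper safe to call more than once, which mirrors the idempotency the PR asks of `start()` itself.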
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| '@workflow/world-local': patch | ||
| --- | ||
|
|
||
| Skip data-dir version-compat enforcement when the package version is the `bundled` sentinel (framework server bundles), so `world.start()` at server startup no longer throws `Invalid version string: "bundled"`. |
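
The guard this changeset describes can be pictured as follows. This is a minimal sketch, not the `@workflow/world-local` implementation: `parseVersion`, `checkDataDirCompat`, and the "same major version" rule are hypothetical stand-ins; only the `bundled` sentinel and the error message come from the changeset text.

```ts
// Sentinel that framework server bundles report instead of a real version (from the changeset).
const BUNDLED_SENTINEL = 'bundled';

// Hypothetical parser: extracts major.minor.patch or throws on anything else.
function parseVersion(version: string): [number, number, number] {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) {
    throw new Error(`Invalid version string: "${version}"`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Returns true when compatibility was enforced, false when the check was skipped.
// Assumption: "compatible" means the major versions match.
function checkDataDirCompat(packageVersion: string, dataDirVersion: string): boolean {
  if (packageVersion === BUNDLED_SENTINEL) {
    return false; // no real version to compare against, so skip instead of throwing
  }
  const [pkgMajor] = parseVersion(packageVersion);
  const [dataMajor] = parseVersion(dataDirVersion);
  if (pkgMajor !== dataMajor) {
    throw new Error(`Data dir version ${dataDirVersion} is incompatible with ${packageVersion}`);
  }
  return true;
}

console.log(checkDataDirCompat('bundled', '4.1.0')); // prints "false"
console.log(checkDataDirCompat('4.2.0', '4.1.0')); // prints "true"
```

Without the early return, the `bundled` value would reach `parseVersion` and fail with the error quoted in the changeset.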
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| 'workflow': minor | ||
| '@workflow/core': minor | ||
| --- | ||
|
|
||
| Add `ensureWorldStarted()` (exported from `workflow/runtime`) which starts the World once per process at server startup, running boot-time recovery of in-flight runs for self-hosted worlds. Call it from your framework's startup hook (e.g. a Next.js `instrumentation.ts`). |
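
One common way to get "starts the World once per process" semantics is to cache the start promise. The sketch below shows that pattern under stated assumptions: `StartableWorld` and `makeEnsureStarted` are hypothetical names, and clearing the cache on failure is a design choice of this sketch, not something the PR confirms about `ensureWorldStarted()`.

```ts
// Hypothetical minimal World shape; the real interface lives in @workflow/world.
interface StartableWorld {
  start(): Promise<void>;
}

// Builds a "start at most once" function for the given world.
function makeEnsureStarted(world: StartableWorld): () => Promise<void> {
  let started: Promise<void> | undefined;
  return () => {
    // Cache the promise rather than a boolean so concurrent callers share one start().
    if (started === undefined) {
      started = world.start().catch((err) => {
        // Assumption: drop the cached promise on failure so a later call can retry.
        started = undefined;
        throw err;
      });
    }
    return started;
  };
}

// Demo world that counts how often start() actually runs.
let startCalls = 0;
const demoWorld: StartableWorld = {
  async start() {
    startCalls += 1;
  },
};

const ensureStarted = makeEnsureStarted(demoWorld);
const firstCall = ensureStarted();
const secondCall = ensureStarted();
console.log(firstCall === secondCall, startCalls); // prints "true 1"
```

Because both callers receive the same promise, a startup hook that fires twice still triggers a single `start()`.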
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| '@workflow/world-vercel': minor | ||
| --- | ||
|
|
||
| Add a no-op `start()` for World-interface compliance. The Vercel World is push-based (VQS redelivery), so it needs no boot-time recovery. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| '@workflow/world': patch | ||
| --- | ||
|
|
||
| Document the `start()` contract: it must be idempotent and may be a no-op for push-based/serverless worlds, and is where queue-backed worlds run boot-time recovery. |
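
The contract in this changeset (idempotent, possibly a no-op, and the place where queue-backed worlds recover) can be sketched with two toy implementations. The `World` interface here is reduced to `start()` only, and both classes are hypothetical; they are not the actual Vercel, local, or Postgres worlds.

```ts
// Reduced, hypothetical view of the World interface: only start() is modeled.
interface World {
  start(): Promise<void>;
}

// Push-based world: the queue redelivers on its own, so start() has nothing to do.
class PushBasedWorld implements World {
  async start(): Promise<void> {
    // Intentionally empty: a no-op satisfies the contract.
  }
}

// Queue-backed world: start() runs boot-time recovery, but only on the first call.
class QueueBackedWorld implements World {
  recoveryRuns = 0;
  private started = false;

  async start(): Promise<void> {
    if (this.started) {
      return; // idempotent: repeated calls do no further work
    }
    this.started = true;
    this.recoveryRuns += 1; // stand-in for re-enqueuing in-flight runs
  }
}

const pushWorld = new PushBasedWorld();
const queueWorld = new QueueBackedWorld();
void pushWorld.start();
void queueWorld.start();
void queueWorld.start();
console.log(queueWorld.recoveryRuns); // prints "1"
```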
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| { | ||
| "title": "Deploying", | ||
| - "pages": ["...deploying", "building-a-world"] | ||
| + "pages": ["...deploying", "recovering-in-flight-runs", "building-a-world"] | ||
| } |
docs/content/docs/v4/deploying/recovering-in-flight-runs.mdx (88 changes: 88 additions & 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| title: Recovering in-flight runs | ||
| description: Start the World at server boot so runs that were in flight when the process stopped resume after a restart. | ||
| type: guide | ||
| summary: Call the World's start() at server boot to recover in-flight runs after a restart. | ||
| prerequisites: | ||
| - /docs/deploying | ||
| related: | ||
| - /docs/deploying/world/local-world | ||
| - /docs/deploying/world/postgres-world | ||
| - /docs/deploying/world/vercel-world | ||
| --- | ||
|
|
||
| When you self-host a workflow app on a long-lived server (the [local](/docs/deploying/world/local-world) and [Postgres](/docs/deploying/world/postgres-world) worlds), a run can be mid-flight — sleeping, waiting on a hook, or between steps — when the process stops or crashes. To resume those runs, the World's `start()` method runs **boot-time recovery**: it re-enqueues every `pending`/`running` run so execution continues. | ||
|
|
||
| Recovery only happens if `start()` is actually called, and it must be called **once at server startup** — not in response to a request. Otherwise an idle server that restarted with in-flight runs would never pick them back up. | ||
|
|
||
| ## `ensureWorldStarted()` | ||
|
|
||
| Call `ensureWorldStarted()` from `workflow/runtime` in your framework's server-startup hook: | ||
|
|
||
| ```ts | ||
| import { ensureWorldStarted } from 'workflow/runtime'; | ||
|
|
||
| await ensureWorldStarted(); | ||
| ``` | ||
|
|
||
| It is **idempotent** — it starts the World at most once per process, so it is safe to call from a hook that may run more than once. Re-enqueuing a run that is already progressing is harmless: the workflow handler is replay-idempotent, so duplicate enqueues converge rather than double-execute. | ||
|
|
||
| You can call this regardless of which World you target. On the [Vercel World](/docs/deploying/world/vercel-world) it is a no-op — delivery is push-based and the queue redelivers in-flight messages on its own, so there is no long-lived process to recover. | ||
|
|
||
| ## Wiring it per framework | ||
|
|
||
| ### Next.js | ||
|
|
||
| Add an `instrumentation.ts` at your project root. Guard on the Node.js runtime — `instrumentation.ts` also runs in the Edge runtime, which can't load the world modules: | ||
|
|
||
| ```ts title="instrumentation.ts" | ||
| export async function register() { | ||
| if (process.env.NEXT_RUNTIME === 'nodejs') { | ||
| const { ensureWorldStarted } = await import('workflow/runtime'); | ||
| await ensureWorldStarted(); | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ### Nitro, Nuxt, Express, Hono, Fastify (Nitro) | ||
|
|
||
| No action required — the `@workflow/nitro` integration registers a Nitro server plugin that starts the World at boot for you. (Not on Vercel deploys, where the push-based Vercel World needs no boot recovery.) | ||
|
|
||
| ### SvelteKit | ||
|
|
||
| Use the [`init`](https://svelte.dev/docs/kit/hooks#Shared-hooks-init) server hook: | ||
|
|
||
| ```ts title="src/hooks.server.ts" | ||
| import type { ServerInit } from '@sveltejs/kit'; | ||
|
|
||
| export const init: ServerInit = async () => { | ||
| const { ensureWorldStarted } = await import('workflow/runtime'); | ||
| await ensureWorldStarted(); | ||
| }; | ||
| ``` | ||
|
|
||
| ### NestJS | ||
|
|
||
| Call it in your `bootstrap()` before listening: | ||
|
|
||
| ```ts title="src/main.ts" | ||
| async function bootstrap() { | ||
| const { ensureWorldStarted } = await import('workflow/runtime'); | ||
| await ensureWorldStarted(); | ||
| // ...create and listen | ||
| } | ||
| ``` | ||
|
|
||
| ### Astro | ||
|
|
||
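| Astro has no startup hook that works across all adapters, so start the World from middleware. `ensureWorldStarted()` is idempotent, so only the first request does real work. The trade-off is that recovery is deferred until that first request arrives, so an idle server does not resume runs until it receives traffic: | ||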
|
|
||
| ```ts title="src/middleware.ts" | ||
| import { defineMiddleware } from 'astro:middleware'; | ||
|
|
||
| export const onRequest = defineMiddleware(async (_context, next) => { | ||
| const { ensureWorldStarted } = await import('workflow/runtime'); | ||
| await ensureWorldStarted(); | ||
| return next(); | ||
| }); | ||
| ``` | ||
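
The boot-time recovery step the page describes ("re-enqueues every `pending`/`running` run") can be sketched as a filter plus enqueue. This is an illustration under assumptions, not the world implementations in this PR: the `Run` shape, the `completed` and `failed` statuses, and `recoverInFlightRuns` are hypothetical.

```ts
// Hypothetical run record; only `pending` and `running` are named in the docs page.
type RunStatus = 'pending' | 'running' | 'completed' | 'failed';

interface Run {
  id: string;
  status: RunStatus;
}

// Re-enqueues every in-flight run and returns the ids that were recovered.
function recoverInFlightRuns(runs: readonly Run[], enqueue: (runId: string) => void): string[] {
  const recovered: string[] = [];
  for (const run of runs) {
    if (run.status === 'pending' || run.status === 'running') {
      enqueue(run.id);
      recovered.push(run.id);
    }
  }
  return recovered;
}

const queue: string[] = [];
const recoveredIds = recoverInFlightRuns(
  [
    { id: 'run_1', status: 'pending' },
    { id: 'run_2', status: 'completed' },
    { id: 'run_3', status: 'running' },
  ],
  (runId) => {
    queue.push(runId);
  },
);
console.log(recoveredIds.join(',')); // prints "run_1,run_3"
```

Finished runs are left alone; per the page, enqueuing an already-progressing run a second time is harmless because the workflow handler is replay-idempotent.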
We should also have these be an optional accordion/collapsed setup step in each framework's getting-started guide. The step should state that this is not required for Vercel deployments (serverless/push-based queue worlds) but is required for pull-based/worker-based Workflow SDK deployments, and it can link to this docs page for details.
same for v4 and v5 docs
Done in 6116307 — added an optional, collapsed accordion ("Recover in-flight runs after a restart") to each framework's getting-started guide (Next/Nitro/Express/Hono/Fastify/Nuxt/Vite/TanStack Start/SvelteKit/Nest/Astro, v4 + v5). Each shows the framework's startup snippet, notes it is not required for Vercel deployments, and links to the full Recovering in-flight runs page.