feat: schedule ingestion engine #31
Pull request overview
This PR replaces the legacy client-side CSV schedule import with a server-driven ingestion workflow backed by Supabase Edge Functions and a transactional Postgres RPC, plus a new admin import wizard UI (upload → diff/conflicts → commit).
Changes:
- Added `diff-schedule` and `commit-schedule` Edge Functions, plus a `commit_schedule` Postgres RPC to perform atomic schedule writes.
- Implemented a new admin schedule import wizard UI with stage-mismatch and orphan-set resolution flows.
- Added unit tests for diffing logic and integration tests for the commit RPC.
Reviewed changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| vite.config.ts | Adds test runner config exclusions. |
| supabase/migrations/20260509142022_commit_schedule_rpc.sql | Adds constraints and the transactional commit_schedule RPC used by ingestion. |
| supabase/functions/_shared/auth.ts | Shared admin auth + CORS helpers for Edge Functions. |
| supabase/functions/diff-schedule/index.ts | Edge Function endpoint to compute a diff from CSV rows vs DB. |
| supabase/functions/diff-schedule/diff.ts | Core diff/matching logic (artists, stages, sets, orphan detection). |
| supabase/functions/diff-schedule/diff.test.ts | Unit tests covering slugging, time conversion, matching rules, and conflicts. |
| supabase/functions/commit-schedule/index.ts | Edge Function endpoint that calls the commit_schedule RPC. |
| supabase/functions/commit-schedule/commit-schedule.test.ts | Integration tests targeting the RPC behavior against local Supabase. |
| src/services/scheduleImportService.ts | Frontend service layer for parsing CSV + invoking diff/commit + building commit payloads. |
| src/pages/admin/FestivalScheduleImport.tsx | New admin page wrapper for the import wizard route. |
| src/pages/admin/FestivalEdition.tsx | Adds an “Import” tab and routing to the new import page. |
| src/components/router/GlobalRoutes.tsx | Wires the /import sub-route under festival edition admin routes. |
| src/components/Admin/ScheduleImport/ScheduleImportWizard.tsx | Wizard state machine: upload → review → commit result, plus cache invalidation. |
| src/components/Admin/ScheduleImport/CsvUploadStep.tsx | CSV upload + timezone selection + invokes diff. |
| src/components/Admin/ScheduleImport/DiffReviewStep.tsx | Review UI container including conflicts and commit action. |
| src/components/Admin/ScheduleImport/DiffSummaryBanner.tsx | Summary banner for diff results. |
| src/components/Admin/ScheduleImport/StageMismatchResolver.tsx | UI to map mismatched stage names or create new stages. |
| src/components/Admin/ScheduleImport/OrphanedSetsPanel.tsx | UI to archive/keep orphaned sets not present in CSV. |
| src/components/Admin/ScheduleImport/CommitResultCard.tsx | Success UI and “import another file” reset action. |
4f1b288 to c8110dd
✅ DB Migrate succeeded for
The supabase CLI's edge-runtime container bundles with a Deno old enough that it rejects v5 lockfiles. --use-api uploads each function source via the Management API and skips local container bundling, sidestepping the lockfile version mismatch.
The artists.added_by column is NOT NULL, but the RPC's artist upsert omitted it. Thread p_user_id through so new artists created during a schedule import get the importing user attributed. Also fix two test fixtures that inserted artists without added_by.
- DiffReviewStep: swap hand-rolled error div for shared Alert component
- OrphanedSetsPanel: move toggleAll below return, delete local formatTime and use shared formatDateTime with a new optional timezone arg
- CsvUploadStep: move readFile helper below the component
- unit-tests.yml: drop now-redundant inline comment
- timeUtils: formatDateTime accepts an optional timezone (formatInTimeZone)
Replaces the hand-rolled parseCSV with PapaParse. The original toggled
inQuotes on every quote character, so RFC 4180 escaped quotes ("") and
quoted fields containing commas/newlines were mis-parsed — common in
description columns. PapaParse handles all of that, plus header
normalization via transformHeader.
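To make the RFC 4180 cases concrete, here is a minimal, dependency-free sketch of a reader that handles escaped quotes and quoted commas/newlines. It is not the PR's parser (the PR delegates this to PapaParse); `parseCsv` is a hypothetical name used only for illustration.

```typescript
// Illustrative RFC 4180 reader. Assumption: this is a teaching sketch,
// not the code in the PR, which uses PapaParse instead.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (inQuotes) {
      if (ch === '"' && next === '"') {
        field += '"'; // "" inside a quoted field is one literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and newlines are literal while quoted
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && next === "\n") i++; // treat CRLF as one line break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input has no trailing newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

The key difference from a toggle-on-every-quote approach is the look-ahead for `""`, plus treating line breaks as record separators only outside quotes.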
TimezonePicker.tsx is down from ~300 to ~90 lines. The catalog logic (useTimezoneCatalog + IANA-zone helpers) moves to timezoneCatalog.ts, and each CommandItem row is now a TimezoneItem component.
diff.ts is down from 339 to 185 lines and only holds shared types plus the computeDiff orchestrator. Pure utilities (toSlug, artistKey, date math, timezone math) move to diffHelpers.ts. The per-row resolvers (buildIndexes, resolveArtists, resolveStage, computeTimes, findMatchingSet) move to diffResolvers.ts. diff.test.ts imports the helpers directly.
pnpm types:generate now pipes the supabase output to both src/integrations/supabase/types.ts and supabase/functions/_shared/database.types.ts, so the frontend and edge functions read the same definitions. diff-schedule's DbStage/DbArtist/DbSet are derived from the generated Row types, and the manual 'as DbX[]' casts at the query boundary are gone.
Folds the 142025 added_by fix into the original 142022 definition so the RPC has a single canonical source (staging will be reverted, so squashing is safe). Extracts commit_schedule__upsert_artists and commit_schedule__upsert_stages helpers to match the existing commit_schedule__ helper pattern; the main RPC body reads as a sequence of PERFORM calls plus the two set loops.
Mirrors the per-function config we have for diff-schedule/commit-schedule so all Edge Functions share the same Deno layout.
Adds happy-path unit tests for the schedule import pure functions: parseScheduleCsv (column presence, pipe-split artists, header case-insensitivity, skip empty-artist rows) and buildCommitPayload (stage mismatch map vs create, orphan archive filter, untouched pass-through). PapaParse covers the RFC 4180 edge cases.
The wizard's 5 useState calls + commit mutation collapse to a single discriminated-union state. Review-stage concerns (resolution maps, commit mutation, dbStages query) move into a new ReviewStage component that owns its own state; the wizard just routes between upload/review/result stages. ScheduleImportWizard: 141 → 56 lines.
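A discriminated-union wizard state could look roughly like the sketch below. The type shapes, the event names, and the `nextState`/`describe` helpers are assumptions for illustration; the actual `DiffResult`/`CommitResult` types and the wizard's transitions in the PR may differ.

```typescript
// Hypothetical shapes; the PR's real DiffResult/CommitResult carry more fields.
type DiffResult = { newSets: number; orphanedSetIds: string[] };
type CommitResult = { setsCreated: number; setsArchived: number };

// One state value: each stage carries only the data that exists at that stage.
type WizardState =
  | { stage: "upload" }
  | { stage: "review"; diff: DiffResult }
  | { stage: "result"; result: CommitResult };

// Hypothetical events that move the wizard between stages.
type WizardEvent =
  | { type: "diffReady"; diff: DiffResult }
  | { type: "committed"; result: CommitResult }
  | { type: "reset" };

function nextState(state: WizardState, event: WizardEvent): WizardState {
  switch (event.type) {
    case "diffReady":
      // Only valid from the upload stage; ignore otherwise.
      return state.stage === "upload" ? { stage: "review", diff: event.diff } : state;
    case "committed":
      // Only valid from the review stage; ignore otherwise.
      return state.stage === "review" ? { stage: "result", result: event.result } : state;
    case "reset":
      return { stage: "upload" };
  }
}

// The compiler narrows `state` per case, so `diff` is only reachable in "review".
function describe(state: WizardState): string {
  switch (state.stage) {
    case "upload":
      return "Upload a CSV";
    case "review":
      return `Review ${state.diff.newSets} new sets`;
    case "result":
      return `Created ${state.result.setsCreated} sets`;
  }
}
```

Compared with several independent `useState` values, this makes combinations such as "result stage without a commit result" unrepresentable.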
Prod data can pre-date the slug-dedupe migration, so restoring it into a target that already has the artists_slug_unique constraint fails. Drop the constraint before restore, then dedupe and re-add it afterwards. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
stages.slug is NOT NULL, so commit_schedule__upsert_stages' INSERT failed and rolled back the whole import whenever a new stage was created. Generate the slug via commit_schedule__slugify, matching useCreateStage. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
Drop --no-check so the Deno CI job type-checks edge function code instead of letting broken imports/types ship. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
- Extract handleRow as a reducer: computeDiff now folds rows via state = handleRow(state, row), keeping accumulation in one place.
- Make resolveArtists pure: it returns new artists instead of mutating a caller-owned array; cross-row de-dup moved into handleRow.
- Move StageResolution next to resolveStage.
- Hoist the loop-invariant strip() call out of resolveStage's find.
- Narrow computeTimes' parameter to the fields it uses.
- Simplify findMatchingSet with early returns.
- Split the diffHelpers unit tests into diffHelpers.test.ts.

https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
- Add Access-Control-Allow-Methods to the edge-function CORS headers so browsers don't reject the POST preflight.
- Coerce empty-string timeStart/timeEnd to null in the commit schema; the RPC's ::timestamptz cast errors on "".
- Suffix imported stage slugs with an id chunk so two names that slugify to the same value can't violate the (edition, slug) unique constraint.
- Throw a user-facing error in parseScheduleCsv on quote/field-count parse errors instead of importing silently corrupted rows.

https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
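The first two fixes in this commit can be sketched as follows. The header values and the `handlePreflight`/`emptyToNull` names are hypothetical; the PR applies the coercion inside its commit schema, whereas this sketch uses a plain function to stay dependency-free.

```typescript
// Hypothetical header values; the shared auth.ts in the PR may use different ones.
const corsHeaders: Record<string, string> = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Headers": "authorization, content-type",
  // Without this header a browser may reject the POST preflight.
  "Access-Control-Allow-Methods": "POST, OPTIONS",
};

// Answer the OPTIONS preflight before any auth or body parsing happens.
function handlePreflight(
  method: string,
): { status: number; headers: Record<string, string> } | null {
  return method === "OPTIONS" ? { status: 204, headers: corsHeaders } : null;
}

// "" cannot be cast to timestamptz, so treat an empty string as a missing value.
function emptyToNull(value: string | null | undefined): string | null {
  if (value === null || value === undefined) return null;
  return value === "" ? null : value;
}
```

For example, `emptyToNull("")` yields `null`, while a non-empty timestamp string passes through unchanged.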
- Drop the redundant "diff" prefix from every file in the folder (diff.ts -> computeDiff.ts, diffResolvers.ts -> resolvers.ts, diffHelpers.ts -> helpers.ts, and matching test files).
- Extract the shared type definitions into types.ts.
- Replace handleRow with collectNewArtists/applyStageResolution helpers inlined directly in the computeDiff loop.
- Add resolvers.test.ts covering buildIndexes, resolveArtists, resolveStage, computeTimes and findMatchingSet.
- Move test data builders below the tests; drop the "computeDiff:" prefix from test titles; drop an unnecessary comment in resolvers.ts.

https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
…back

- Break the oversized scheduleImportService.ts into a scheduleImport/ folder: types.ts, parseCsv.ts, buildCommitPayload.ts, api.ts.
- Validate diff-schedule / commit-schedule responses with zod schemas (the DiffResult/CommitResult types are now inferred from them) instead of unchecked `as` casts.
- parseScheduleCsv now throws on any PapaParse error, including delimiter detection failures.
- Narrow resolveSetStageName to take just the stage name and move it below buildCommitPayload's return.
- Update component imports to the new paths.

https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
Postgres grants EXECUTE on new functions to PUBLIC by default, which would let any authenticated PostgREST client call commit_schedule directly and bypass the commit-schedule Edge Function's admin-only gate. Revoke EXECUTE from PUBLIC and grant it only to service_role for the RPC and its helpers. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
A B2B cell like "Carl Cox | Carl Cox" produced a duplicated artist list, which changes the diff's roster key (breaking matches against existing sets) and sends duplicate slugs downstream. Normalize each row's artist list to a case-insensitive unique set. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
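A case-insensitive de-duplication of a pipe-separated artist cell might look like this. `parseArtistCell` is a hypothetical name, and keeping the first spelling encountered is an assumption; the PR only states that the list is normalized to a case-insensitive unique set.

```typescript
// Split a pipe-separated cell and drop case-insensitive repeats.
// Assumption: the first spelling encountered is the one that is kept.
function parseArtistCell(cell: string): string[] {
  const seen = new Set<string>();
  const artists: string[] = [];
  for (const raw of cell.split("|")) {
    const name = raw.trim();
    if (name === "") continue; // ignore empty segments such as "A || B"
    const key = name.toLowerCase();
    if (seen.has(key)) continue; // duplicate of an earlier entry
    seen.add(key);
    artists.push(name);
  }
  return artists;
}
```

With this, `parseArtistCell("Carl Cox | Carl Cox")` returns a single-entry list, so a roster key built from it is the same as for a plain "Carl Cox" row.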
The schedule-ingestion branch added an artists.slug dedupe + constraint re-add step to sync-from-prod.sh. Drop it — that belongs in a migration, and re-adding the constraint unconditionally can abort the sync. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
commit_schedule__upsert_stages previously suffixed every imported stage slug with a uuid chunk to dodge the (edition, slug) unique constraint. Per review, a stage matching an existing one by name OR slug should be treated as the same stage: unarchive it instead of creating a duplicate. Replaced the single ON CONFLICT upsert with a per-row match-or-insert loop, so the slug stays clean (slugify(name), no suffix). https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
… (name)

The per-row name-or-slug loop guarded a path that can't occur: slugify and the diff's strip() both collapse non-alphanumerics, so any two names that would collide on (edition, slug) also strip-collide and are flagged by the diff as a mismatch; they never reach upsert_stages as a plain new stage. Back to a single ON CONFLICT (festival_edition_id, name) upsert with a plain slugify(name) slug.

https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
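The argument that a slug collision implies a strip collision can be checked with a small sketch. Both function bodies below are assumptions about how `commit_schedule__slugify` and the diff's `strip()` behave (lower-casing and collapsing non-alphanumerics); the real implementations are not shown in this conversation.

```typescript
// Assumed behaviour: lower-case, turn each run of non-alphanumerics into one
// hyphen, and trim leading/trailing hyphens.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Assumed behaviour: lower-case and drop every non-alphanumeric character.
function strip(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Under these assumptions strip(x) equals slugify(x) with hyphens removed,
// so two names with equal slugs necessarily have equal stripped forms.
function slugCollisionImpliesStripCollision(a: string, b: string): boolean {
  return slugify(a) !== slugify(b) || strip(a) === strip(b);
}
```

The converse does not hold ("Main Stage" and "MainStage" strip-collide but have different slugs), which is consistent with the diff flagging a superset of the slug collisions.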
commit_schedule had the set update, set create and orphan archive logic inline. Pull each into its own commit_schedule__ helper that returns its row count, matching the upsert_artists/upsert_stages pattern, so the RPC body reads as the workflow. Helpers run in the same explicit order (update, create, archive) and are revoked from PUBLIC like the rest. https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
    const available = candidates.filter((s) => !alreadyMatched.has(s.id));
    if (available.length <= 1) return available[0] ?? null;

    if (resolvedStageId) {
      const byStage = available.find((s) => s.stage_id === resolvedStageId);
The available[0] fallback is kept deliberately. The alternative, treating same-roster sets whose stage/time don't match as new, would orphan the existing set and its vote history, which is strictly worse than a wrong-slot assignment. The worst case here is a bounded misassignment (the row lands on the same artist's other slot), which is recoverable via re-import.
The non-determinism concern is addressed in the sibling comment: f9a47bf adds .order("time_start", { nullsFirst: false }).order("id") to the sets query so the fallback always picks the same row for the same input regardless of Postgres storage order.
Generated by Claude Code
      .from("sets")
      .select(
        "id, name, description, stage_id, time_start, time_end, set_artists(artist_id, artists(id, name, slug))",
      )
      .eq("festival_edition_id", festivalEditionId)
      .eq("archived", false),
    db.from("artists").select("id, name, slug").eq("archived", false),
Fixed in f9a47bf. Added .order("time_start", { nullsFirst: false }).order("id") to the sets query so nulls sort last and ties break on a stable uuid — available[0] now always picks the same row for the same input.
Generated by Claude Code
Add .order("time_start", { nullsFirst: false }).order("id") to the sets
query so the available[0] fallback in findMatchingSet always picks the
same row for the same input regardless of Postgres storage order.
https://claude.ai/code/session_01T7UYCNxqTRMB4HJ6pk1nEm
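The effect of a stable ordering on the fallback can be sketched in memory. This is a simplified stand-in: the PR orders rows in the database query, while `compareSets` and `pickMatch` below are hypothetical helpers that mimic "time_start ascending, nulls last, then id" and omit any time-based matching the real findMatchingSet may perform. Comparing timestamps as strings assumes they share one ISO 8601 format and offset.

```typescript
type SetRow = { id: string; stage_id: string | null; time_start: string | null };

// Mimics ORDER BY time_start ASC NULLS LAST, id ASC.
function compareSets(a: SetRow, b: SetRow): number {
  if (a.time_start !== b.time_start) {
    if (a.time_start === null) return 1; // nulls sort last
    if (b.time_start === null) return -1;
    return a.time_start < b.time_start ? -1 : 1;
  }
  // Tie-break on id so equal start times still have a fixed order.
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
}

function pickMatch(
  candidates: SetRow[],
  alreadyMatched: Set<string>,
  resolvedStageId: string | null,
): SetRow | null {
  // filter() returns a fresh array, so sorting it does not mutate the input.
  const available = candidates
    .filter((s) => !alreadyMatched.has(s.id))
    .sort(compareSets);
  if (available.length <= 1) return available[0] ?? null;

  if (resolvedStageId) {
    const byStage = available.find((s) => s.stage_id === resolvedStageId);
    if (byStage) return byStage;
  }
  // Deterministic fallback: same input rows always yield the same first row.
  return available[0] ?? null;
}
```

Because the ordering is total (ids are unique), the fallback no longer depends on the order in which rows happen to arrive.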
Replaces the old client-side CSV import with a server-side ingestion system.
Two Supabase Edge Functions (`diff-schedule`, `commit-schedule`) handle the diff and atomic commit via a Postgres RPC. The frontend wizard walks admins through upload → conflict resolution → commit.

Key design decisions: sets matched by artist roster + stage (preserving votes), orphaned sets surfaced as explicit archive/keep conflicts, stage name mismatches resolved via map-to-existing or create-new, all writes wrapped in a single transaction with full rollback on failure.