feat(plugins): scheduled sync, metadata enrichment, and detailed progress #33
Merged
Lay the groundwork for admin-scheduled, per-user-opt-in plugin syncs: an admin sets a cron cadence per plugin, and each user chooses whether their connection syncs automatically on that cadence.
- Add nullable `plugins.sync_cron_schedule` column (migration + entity). NULL means no scheduled sync, so existing installs are unchanged.
- Add `PluginsRepository::list_sync_scheduled()` returning enabled plugins with a cron set, for the scheduler to load at boot/reload.
- Add `user_plugins::Model::auto_sync_enabled()` reading the host-side `config._codex.autoSync` opt-in (default false, manual-only). The `_codex` namespace constant now lives in codex-db and is re-used by codex-tasks so the parser and accessor share one source.
No behavior change yet: the column is inert until the scheduler and API surface are wired up. Includes unit tests for the repo query and the auto-sync accessor.
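The opt-in accessor described above can be sketched as follows. This is a minimal TypeScript illustration, not the Rust accessor from the commit: only the `_codex` namespace, the `autoSync` key, and the default of false come from the message; the function name, the helper, and the config shape are assumptions.

```typescript
// Hypothetical config shape: only `_codex.autoSync` is taken from the commit.
type ConnectionConfig = Record<string, unknown>;

// Host-only namespace named in the commit message.
const CODEX_NAMESPACE = "_codex";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True only when `_codex.autoSync` is literally `true`; a missing config,
// a missing namespace, or a non-boolean value all fall back to manual-only.
function autoSyncEnabled(config: ConnectionConfig | null | undefined): boolean {
  if (!isRecord(config)) return false;
  const codex = config[CODEX_NAMESPACE];
  return isRecord(codex) && codex["autoSync"] === true;
}
```

Treating anything other than a literal `true` as off keeps existing connections manual-only, which matches the "default false" behaviour the commit states.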
When an enabled, sync-capable plugin has an admin-configured cron, the scheduler now registers one cron entry per plugin and, on each firing, enqueues a UserPluginSync task for every connected user whose connection is enabled, authenticated, and opted into auto sync (config._codex.autoSync).
- Add Scheduler::load_plugin_sync_schedules() + add_plugin_sync_schedule(), wired into start() so reload_schedules() applies cron changes without a restart. Plugins lacking the user_read_sync capability are skipped.
- Add fan_out_plugin_sync(): selects eligible connections, skips any with a pending/processing sync (reusing the existing per-user dedup), and logs a one-line summary of the outcome.
Also fix a dedup bug this surfaced: enqueue coalesced UserPluginSync by task_type alone (the type has no FK columns and no dedup_params), so every user's fan-out task would have collapsed into one (and the manual path was latently affected too). Add TaskType::plugin_user_dedup() and dedup UserPluginSync on (plugin_id, user_id) in find_existing_task, so different users keep independent tasks while a repeat for the same user coalesces.
Includes scheduler and task-queue tests.
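The dedup bug can be illustrated with a small sketch. It is a TypeScript simplification under stated assumptions: the `Task` shape, the key functions, and the in-memory queue are hypothetical stand-ins for the task table and `find_existing_task`, and the string `user_plugin_sync` is used only as an illustrative type name.

```typescript
// Hypothetical in-memory model of an enqueued sync task.
interface Task {
  taskType: string;
  pluginId: string;
  userId: string;
}

// Old behaviour as described: coalesce on the task type alone.
const typeOnlyKey = (t: Task): string => t.taskType;

// Fixed behaviour as described: coalesce on (plugin_id, user_id).
const pluginUserKey = (t: Task): string =>
  `${t.taskType}:${t.pluginId}:${t.userId}`;

// Fan out one task per user, skipping any whose dedup key is already active.
function fanOut(
  pluginId: string,
  userIds: string[],
  keyOf: (t: Task) => string,
): Task[] {
  const active = new Set<string>();
  const queue: Task[] = [];
  for (const userId of userIds) {
    const task: Task = { taskType: "user_plugin_sync", pluginId, userId };
    const key = keyOf(task);
    if (active.has(key)) continue; // coalesced into the existing task
    active.add(key);
    queue.push(task);
  }
  return queue;
}
```

With `typeOnlyKey`, a fan-out over several users leaves a single task in the queue; with `pluginUserKey`, each user keeps an independent task and only a repeat for the same user is coalesced.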
Expose the scheduled-sync controls over the API:
- Admin: PATCH /admin/plugins/{id} accepts syncCronSchedule (omit = no
change, null = clear, string = set). The value is validated and
normalized, and rejected unless the plugin's manifest declares the
user_read_sync capability. PluginDto now returns the stored schedule.
- User: new PATCH /user/plugins/{id}/sync-mode { auto } toggles a
connection between automatic and manual sync. It is capability-gated and
does a read-modify-write of the host-only config._codex.autoSync flag,
preserving all other config keys. UserPluginDto exposes a derived
auto_sync field.
Reload the cron scheduler when a schedule changes so it takes effect
without a server restart: on plugin update when the schedule field is
present, and on enable/disable for plugins that carry a schedule (since
registration is gated on the enabled state).
Includes API tests for both endpoints (validation, capability gating,
clearing, and config-merge preservation).
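The two update semantics above (the tri-state `syncCronSchedule` field and the read-modify-write of `config._codex.autoSync`) can be sketched like this. It is a minimal TypeScript illustration, not the handler code: the function names are hypothetical, and cron validation, normalization, and capability gating are deliberately left out.

```typescript
type Json = Record<string, unknown>;

// Tri-state update: key omitted = no change, null = clear, string = set.
// Validation and normalization of the cron string are omitted in this sketch.
function applySchedulePatch(
  current: string | null,
  patch: { syncCronSchedule?: string | null },
): string | null {
  if (patch.syncCronSchedule === undefined) return current;
  return patch.syncCronSchedule;
}

function asObject(value: unknown): Json {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Json)
    : {};
}

// Read-modify-write: only `_codex.autoSync` changes; every other key,
// inside and outside the `_codex` namespace, is carried over untouched.
function setAutoSync(config: Json, auto: boolean): Json {
  return { ...config, _codex: { ...asObject(config["_codex"]), autoSync: auto } };
}
```

For example, `setAutoSync({ listName: "Reading", _codex: { sendTags: true } }, true)` keeps `listName` and `sendTags` and only adds `autoSync: true`.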
Surface the scheduled-sync controls in the web app:
- Admin: a "Sync Schedule (cron)" input on the plugin edit form, shown only for plugins whose manifest declares the user_read_sync capability. It initializes from the plugin and is sent only for sync-capable plugins (a string sets the cadence, empty clears it).
- User: an "Automatic sync" switch on the connection card, shown only for connected, sync-capable integrations. Toggling it calls the sync-mode endpoint and reflects the connection's autoSync state.
Adds userPluginsApi.setSyncMode and wires it through a mutation with cache invalidation and toasts.
Includes component tests covering the switch's visibility (capability gating), checked state, and toggle callback.
Document the admin-managed sync cadence and the per-user auto/manual opt-in:
- Plugins overview gains a "Scheduled (Automatic) Sync" section covering how an admin sets a per-plugin cron on the Execution tab, how users opt their connection in (off by default), and what happens when the schedule fires (enabled + connected + opted-in connections only, deduped, expired credentials skipped). Adds autoSync to the _codex settings reference.
- AniList Sync page gains a "Manual vs Automatic Sync" subsection that points at the overview.
The "fetch library data + aggregate per-series reading progress" logic was copy-pasted across build_user_library (recommendations) and build_push_entries/build_unmatched_entries (sync), each re-fetching books, progress, metadata, and ratings and folding them the same way. Collapse it into a single batched build_series_engagements in codex-services: one pass fetches series metadata, books, the user's reading progress and ratings, library names, and optionally taxonomy, then folds book-level progress into a per-series SeriesEngagement. Each caller projects that aggregate into its own wire DTO; the sync path adds a shared project_sync_entry and drops the now-dead series_library_info. An EngagementOptions::include_taxonomy flag lets the sync path skip the genres/tags/alternate-title/external-id queries it doesn't need, keeping its query count unchanged. No behaviour change; existing sync push and recommendation tests pass unmodified, with new builder unit tests added.
Add a SeriesMetadata bibliographic block (summary, publisher, role-carrying authors, age rating, language, reading direction) plus two optional, independently gated fields — metadata and customMetadata — to the UserLibraryEntry and SyncEntry wire types. Both are omitted when unset, so existing plugins are unaffected. SeriesEngagement gains projection helpers that build the block and parse the user-defined custom_metadata JSON. Author parsing is refactored into a shared helper that is resilient per-element: an entry with an unrecognized role keeps its name rather than dropping the whole list. Covers and a standalone artists field are deliberately omitted (no external cover URL; role on the author subsumes artists). No caller enables the new fields yet; this only establishes the contract. Tests added for serialization, omission, author parsing, and the projections.
Wire the previously-defined enrichment fields to plugins, controlled by a
manifest capability plus four independent per-user toggles so a connection
sends exactly as much as it opts into.
- Add a `wantsFullMetadata` plugin capability; the host only does the extra
work, and the UI only shows the toggles, when a plugin declares it.
- Add four host-only `_codex` toggles (sendTags, sendGenres, sendMetadata,
sendCustomMetadata) with entity accessors; the effective flag per field is
capability AND the toggle. All default off, so existing connections push the
same minimal payload as before.
- Add top-level genres/tags to sync entries (they carried no taxonomy), so a
rules-based sync plugin can filter on tags without receiving heavier
bibliographic fields.
- Attach each opted-in field on sync push (matched and search-fallback entries)
and on recommendation seeds; taxonomy is only fetched when a toggle needs it.
Recommendations keep sending genres/tags as before.
- Add PATCH /api/v1/user/plugins/{id}/metadata-settings (partial update,
preserves sibling _codex keys, rejected when the plugin lacks the
capability), and expose the capability and current toggle values on the
user-plugin DTO.
Tests added across the entity accessors, settings parsing, per-flag sync push,
and the new endpoint.
…ings UI
Surface the four per-field metadata opt-ins on a plugin connection. The settings modal gains a "Metadata Enrichment" section, shown only when the plugin declares the wantsFullMetadata capability, with switches for sending tags, genres, the bibliographic metadata block, and custom metadata. Tags and genres appear only for sync plugins (recommendation entries already carry them); descriptions call out that metadata (summaries) is the heavy option and that custom metadata can expose private fields.
Toggles persist through the existing config-update path, preserving other _codex settings. Mock fixtures advertise the capability so the mock UI shows the section, and the plugin sync docs describe the capability and trade-offs. Adds component tests for the capability gating and initial state.
Extend SyncProgress with maxVolume/maxChapter — the highest read volume and chapter derived from Codex's per-book number detection — alongside a readBooks per-book breakdown, backed by a new SyncBookProgress struct carrying detected volume/chapter and page position. Unlike the existing `volumes` count (relative books-read), maxVolume and maxChapter stay accurate for libraries that don't start at volume 1 or have gaps, giving sync plugins the data to report absolute progress. readBooks exposes the raw per-book detail so authors of custom sync targets can map progress however their service expects. All new fields are additive and optional, omitted from the wire when unset, so existing plugins and the `volumes` field are unaffected. This lands the protocol contract only; computing and attaching the values in the push path is wired up separately. Includes serialization, round-trip, and backward-compatibility tests.
Declare a wantsDetailedProgress plugin capability that gates the per-book reading-progress breakdown (readBooks) on sync entries. The accurate maxVolume/maxChapter fields are always sent and stay ungated; only the heavier per-book detail is opt-in, so plugins that don't consume it pay no extra fetch or payload cost. Surface the capability on the user-plugin capabilities DTO so tooling can detect support, populated from the cached manifest alongside the existing capability flags, and regenerate the OpenAPI spec and TypeScript types to match. The flag is inert until the push path attaches the breakdown. Includes capability parse tests and a DTO serialization assertion.
Wire the detailed-progress fields end to end. The shared engagement builder gains an opt-in per-book detail pass: when enabled it batch-fetches book metadata and folds each book's detected volume/chapter and page position into a per-book breakdown, leaving the recommendations path untouched. The sync push projection then derives the highest read volume and chapter from that breakdown — accurate for libraries that don't start at volume 1 or have gaps, unlike the existing relative count — and, for plugins that declare wantsDetailedProgress, attaches the full per-book breakdown. The matched and search-fallback paths share one projection so both carry the same data, and the extra metadata query runs only when a detail-consuming plugin is connected. Includes builder and push projection tests.
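The difference between the relative count and the derived maxima can be sketched as follows. This is a TypeScript illustration under assumptions: the per-book field names (`volume`, `chapter`, `completed`) and the rule that a book counts as read when `completed` is true are hypothetical, since the commit does not spell out the SyncBookProgress fields.

```typescript
// Hypothetical per-book breakdown entry; field names are assumptions.
interface BookProgressSketch {
  volume?: number;
  chapter?: number;
  completed: boolean;
}

interface ProgressSummary {
  volumes: number; // relative books-read count
  maxVolume: number | undefined; // highest detected volume among read books
  maxChapter: number | undefined; // highest detected chapter among read books
}

function highest(values: Array<number | undefined>): number | undefined {
  let best: number | undefined;
  for (const v of values) {
    if (v !== undefined && (best === undefined || v > best)) best = v;
  }
  return best;
}

function summarize(books: BookProgressSketch[]): ProgressSummary {
  const read = books.filter((b) => b.completed);
  return {
    volumes: read.length,
    maxVolume: highest(read.map((b) => b.volume)),
    maxChapter: highest(read.map((b) => b.chapter)),
  };
}
```

For a library that holds only volumes 5 to 8 with 5, 6, and 7 read, the relative count is 3 while the derived highest volume is 7, which is the gap the commit describes.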
…ss SDK types
Make the highest read volume/chapter always flow on the sync push path so every sync plugin gets accurate progress for libraries with gaps, with no opt-in; only the heavier per-book breakdown stays gated behind the wantsDetailedProgress capability. Previously the accurate numbers rode the same gate as the breakdown, which would have forced the AniList plugin to receive (and ignore) the full breakdown just to read two numbers.
Mirror maxVolume/maxChapter/readBooks and SyncBookProgress into the TypeScript SDK, and update the AniList plugin to prefer maxVolume/maxChapter (flooring fractional chapters) over the relative books-read count, falling back to the count for older hosts. Document the accuracy behaviour, the gapped-library example, and the capability/payload trade-off. Includes plugin mapping tests.
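The plugin-side preference described in this commit can be sketched as below. It is a hedged TypeScript illustration, not the AniList plugin's mapping: `maxVolume`, `maxChapter`, and `volumes` are the wire fields named in this PR, while the interface names, the output shape, and the choice to fall back only for the volume number are assumptions.

```typescript
// Subset of the sync progress payload relevant here (names from the PR).
interface SyncProgressLike {
  volumes?: number; // relative books-read count (older hosts send only this)
  maxVolume?: number; // highest read volume
  maxChapter?: number; // highest read chapter, may be fractional
}

// Hypothetical shape of what the plugin reports to the external service.
interface ReportedProgress {
  volume: number | undefined;
  chapter: number | undefined;
}

function toReportedProgress(progress: SyncProgressLike): ReportedProgress {
  return {
    // Prefer the accurate number; fall back to the count for older hosts.
    volume: progress.maxVolume ?? progress.volumes,
    // Fractional chapters (e.g. 12.5) are floored to a whole chapter.
    chapter:
      progress.maxChapter === undefined
        ? undefined
        : Math.floor(progress.maxChapter),
  };
}
```

A payload of `{ volumes: 3, maxVolume: 7, maxChapter: 12.5 }` maps to volume 7 and chapter 12, whereas a payload from an older host carrying only `{ volumes: 3 }` maps to volume 3.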
Deploying codex with Cloudflare Pages
| Latest commit: | 2934468 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://f0a56d38.codex-asm.pages.dev |
| Branch Preview URL: | https://plugin-sync-cron.codex-asm.pages.dev |
The UserPluginCapabilitiesDto gained a required wantsDetailedProgress field, but the MSW mock handlers for user plugins were not updated, breaking the frontend type-check and build. Add the field to all three capability objects in the mock handlers (anilist enabled plugin, mangabaka available plugin, and the dynamically created enable handler) so the mocks satisfy the generated type and the build passes.
Add a file-based payload recorder to the echo (debug) plugins, starting with
metadata-echo. On each call it writes the request and its matching response to
paired JSON files under the plugin's host-provided data directory, so the
host->plugin protocol traffic can be inspected without trawling server logs.
- Files share a sortable basename (yyyy-MM-dd-HH-mm-ss-{id}-{method}, UTC) and
differ only by a -request.json / -response.json suffix.
- Each file is a JSON envelope holding the payload plus a snapshot of the active
config. Credentials are never written; secret-like config keys are redacted.
- Recording is bounded (oldest files pruned), falls back to a temp dir when no
data directory is provided, and is best-effort so a disk error never breaks an
RPC response. Toggled via recordPayloads / maxPayloadFiles admin config.
Surface two fields the host already supports but the TypeScript SDK omitted:
InitializeParams.dataDir (the scoped writable file-storage directory) and
PluginCapabilities.wantsDetailedProgress (opt in to per-book sync progress).
Add unit tests for the recorder.
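The file naming and redaction rules can be sketched as follows. This is a minimal TypeScript illustration under assumptions: the function names, the `[redacted]` placeholder, and the regular expression used to decide what counts as a "secret-like" key are hypothetical; only the basename layout and the fact that secret-like keys are redacted come from the commit.

```typescript
const pad2 = (n: number): string => (n < 10 ? `0${n}` : String(n));

// Sortable basename: yyyy-MM-dd-HH-mm-ss-{id}-{method}, in UTC.
function recordBasename(at: Date, id: string | number, method: string): string {
  const stamp = [
    String(at.getUTCFullYear()),
    pad2(at.getUTCMonth() + 1),
    pad2(at.getUTCDate()),
    pad2(at.getUTCHours()),
    pad2(at.getUTCMinutes()),
    pad2(at.getUTCSeconds()),
  ].join("-");
  return `${stamp}-${id}-${method}`;
}

// Assumed pattern for "secret-like" keys; the real rule is not in the commit.
const SECRET_LIKE = /secret|token|password|key/i;

// Shallow redaction of a config snapshot before it is written to disk.
function redactConfig(config: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const name of Object.keys(config)) {
    out[name] = SECRET_LIKE.test(name) ? "[redacted]" : config[name];
  }
  return out;
}
```

The request and response files for one call would then be `recordBasename(...)` plus the `-request.json` and `-response.json` suffixes, so a plain directory listing sorts them chronologically and keeps each pair adjacent.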
A test/debug sync plugin that talks to no external service. It accepts any push (echoing every entry back as alternating created/updated, never failing), returns deterministic, fully-populated entries on pull (respecting the request limit), and reports canned status. Declares wantsDetailedProgress so it receives the per-book detailed progress payload, and records every request and response to files via the shared payload recorder, making it easy to inspect what the host sends over the sync protocol without trawling server logs. Includes unit tests for the provider and the recording integration.
Add the recommendations-echo debug plugin: it echoes the user's library seeds back as fully-populated recommendations (respecting limit and excludeIds), implements updateProfile/dismiss/clear, and records every request/response to files via the shared payload recorder. Includes tests.
Put the echo plugins on equal footing in the build and packaging setup:
- CI: add sync-echo and recommendations-echo to the plugin test matrices in both ci.yml and build.yml, and to the publish matrix in build.yml.
- docker-compose: mount, build, and watch both new plugins in the dev stack.
- .dockerignore: exclude all plugins' build artifacts from the build context via globs.
- docs: list the echo sync/recommendations plugins and document payload recording.
Treat the echo plugins as debug-only rather than store offerings: remove the echo entry from the user-facing Official Plugins gallery and instead provision the echoes via the seed config (sample updated), where test/dev plugins belong.
The Makefile needs no change; its plugin glob already discovers the new plugins.
Force-pushed from 61684ba to 1f178c7
…tity
Two related changes that make no-auth user plugins (e.g. a debug echo or a NocoDB-backed sync) first-class.
Per-user identity: the host now sends userId and userPluginId to every user plugin in the initialize params (absent for system plugins). A plugin that has no per-user credential (or authenticates with an admin-configured shared key) can't derive the user's identity from its credentials, so it needs this to scope data per user in its own backend. The identifier is sent regardless of whether the plugin requires auth.
No-auth / shared-key support: a plugin is now "connected" when it has credentials OR requires no per-user authentication. requires_authentication() on the manifest (declares OAuth or required credentials) drives a new UserPlugin::is_connected(requires_auth), applied consistently to the user-plugin DTO (which also exposes requires_auth), the manual-sync guard, the sync-status DTO, and the auto-sync scheduler eligibility. Previously such plugins could never connect, so they could be neither synced manually nor scheduled.
Frontend: the integrations card shows sync controls (Sync Now, Automatic sync) and an "Enabled" badge for a no-auth plugin, and hides Disconnect since there is no external account to unlink; auth plugins are unchanged.
OpenAPI types regenerated. Tests added across the manifest, entity, scheduler, and card.
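The connection rule reduces to a small predicate, sketched here in TypeScript. The manifest shape (`oauth`, `requiredCredentials`) and the function names are hypothetical; only the rule itself (connected when credentials exist or no per-user authentication is required) is taken from the commit.

```typescript
// Hypothetical manifest fragment; the real manifest fields may differ.
interface ManifestAuthSketch {
  oauth?: object;
  requiredCredentials?: string[];
}

// A plugin requires per-user authentication when it declares OAuth
// or at least one required credential.
function requiresAuthentication(manifest: ManifestAuthSketch): boolean {
  const required = manifest.requiredCredentials ?? [];
  return manifest.oauth !== undefined || required.length > 0;
}

// Connected = has credentials OR needs no per-user authentication.
function isConnected(hasCredentials: boolean, requiresAuth: boolean): boolean {
  return hasCredentials || !requiresAuth;
}
```

Under this rule a no-auth echo plugin counts as connected with no stored credentials, while an OAuth plugin stays disconnected until credentials exist.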
…the sync cron
Whether a plugin receives enriched series data is a property of the plugin, not a per-user preference, so move that control to the admin side and keep only a genuine privacy choice for the user.
- Tags, genres, and the bibliographic metadata block are now admin policy on the plugin (config._codex.send*, default on when the manifest declares wantsFullMetadata), exposed via plugins::Model accessors and configured in the plugin's Configure dialog. The per-user tags/genres/metadata toggles are removed.
- custom_metadata stays a per-user privacy opt-out (default off), shown on the connection settings; the metadata-settings endpoint is narrowed to that field.
- Effective send = capability AND admin policy (tags/genres/metadata) or AND user opt-out (custom). Recommendations keep genres/tags as baseline taste signal.
- Per-user rules use the existing userConfigSchema, which now supports a multi-line JSON/textarea field type. The echo plugins declare wantsFullMetadata and a sample rules field, and the plugin SDK gains the capability.
- Relocate the automatic-sync cron from the admin Edit modal to the Configure dialog, alongside the library filter (sync plugins only).
Also makes the sync-plugin test helper declare OAuth so sync plugins require authentication.
Tests added for the admin policy, the user opt-out, and the relocated cron.
…policy to recommendations
Round out the admin metadata-enrichment policy in the plugin Configure dialog:
- Add an "Allow custom metadata" admin gate (config._codex.allowCustomMetadata, default on) layered over the user's per-connection opt-in. Custom metadata is sent only when the plugin declares the capability, the admin allows it, and the user opts in; this is wired into both the sync and recommendation handlers.
- Show the tags/genres policy for recommendation plugins too, not just sync. genres/tags remain baseline taste signal for recommendations (default on), and the recommendation handler strips them only when an admin explicitly turns the policy off, so there's no change unless an admin opts to trim payload.
The Configure dialog now offers send tags/genres/metadata plus allow-custom for any plugin that declares wantsFullMetadata, sync or recommendation. Docs and tests updated.
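The layered policy for the sync path can be sketched as a pure function. This TypeScript illustration rests on assumptions: the interface and function names are hypothetical, and the recommendation handler's baseline treatment of genres/tags is intentionally not modelled. The defaults (admin flags on, user opt-in off) follow the commit messages.

```typescript
// Admin policy stored under config._codex on the plugin; unset means "on".
interface AdminPolicySketch {
  sendTags?: boolean;
  sendGenres?: boolean;
  sendMetadata?: boolean;
  allowCustomMetadata?: boolean;
}

interface EffectiveSend {
  tags: boolean;
  genres: boolean;
  metadata: boolean;
  customMetadata: boolean;
}

const defaultOn = (flag: boolean | undefined): boolean => flag ?? true;

function effectiveSend(
  wantsFullMetadata: boolean, // manifest capability
  admin: AdminPolicySketch,
  userSendCustomMetadata: boolean, // per-connection opt-in, default off
): EffectiveSend {
  return {
    tags: wantsFullMetadata && defaultOn(admin.sendTags),
    genres: wantsFullMetadata && defaultOn(admin.sendGenres),
    metadata: wantsFullMetadata && defaultOn(admin.sendMetadata),
    // Custom metadata needs all three: capability, admin gate, user opt-in.
    customMetadata:
      wantsFullMetadata &&
      defaultOn(admin.allowCustomMetadata) &&
      userSendCustomMetadata,
  };
}
```

With an untouched admin config and a user who has not opted in, a capable plugin receives tags, genres, and the metadata block but no custom metadata; a plugin without the capability receives none of the four.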
The worker built its PluginManager without `.with_plugin_file_storage(...)`, unlike the serve path. As a result `resolve_plugin_data_dir` returned None for every plugin spawned in the worker, and the init message carried `dataDir: null`. Plugins that write files (the sync cron and recommendations refresh, plus the echo plugins' recorded request/response payloads) then fell back to a container-local temp dir, so their output never reached the configured plugins directory. Metadata plugins were unaffected because enrichment runs in the serve process, which already wires file storage.
Construct PluginFileStorage from config.files.plugins_dir in the worker command and pass it to the manager, mirroring serve.
Also fix the compose wiring so the written files are actually visible:
- mount ./docker/data/plugins into the production codex service
- mount the sync-echo and recommendations-echo dist dirs into the screenshots service
The task priority scheme was reworked so user-facing and integration work (plugin sync/recommendations, metadata fetch) is claimed ahead of bulk background work like scanning and analysis. Two integration assertions in task_queue_integration still expected the old defaults and were failing:
- AnalyzeBook is now 500 (was 800)
- ScanLibrary is now 600 (was 1000)
Update both assertions and their messages to match default_priority().
Follow-up to the priority rework: the task_priority_ordering and task_queue integration tests still asserted the old default priorities and were failing. Plugin and metadata work is now claimed ahead of scanning and analysis.
- Rewrite the expected claim order and category comments so user plugin and metadata tasks precede scanning, analysis, and thumbnails.
- Update per-type priority values: scan_library 1000->600, analyze_book 800->500, find_duplicates 400->860.
- Refresh stale ordering comments in the PostgreSQL ordering test.
Ordering-only assertions were left intact since scan-before-analysis still holds under the new values.
Three per-variant extraction tests still asserted pre-rework priorities that the central test_default_priority_values block had already been updated past:
- refresh_library_metadata 385 -> 890
- poll_release_source 170 -> 850
- bulk_track_for_releases 155 -> 840
These match the values in default_priority(). The ordering-invariants test is relative-only and needed no change.
… YAML
The seed config's library `title_preprocessing_rules` and `auto_match_conditions` fields were typed as raw JSON strings, forcing JSON-in-YAML with double-escaped regex backslashes and deferring any validation until scan time, where a malformed value was silently stored and only failed later.
Deserialize both as native types (Vec<PreprocessingRule> and AutoMatchConditions) and serialize them to the JSON the libraries table stores in the seed loop, with error context naming the offending library. Patterns can now be written as single-quoted YAML scalars (e.g. '\s*\(Digital\)$') with no escaping, and a bad shape fails fast at seed time. This also brings these two fields in line with how allowed_formats and excluded_patterns are already handled.
Document both fields with a commented example in the sample config and add seed-parsing tests.
The scheduler runs in every serve process, but the API's handle to it (AppState.scheduler) was gated on CODEX_DISABLE_WORKERS. In the standard split web/worker topology the web process has workers disabled, so the handle was None and reload_schedules() silently no-op'd. An admin-set plugin sync cron was written to the DB but never registered in the live scheduler, so it never fired until the process restarted. The same gap affected every live schedule edit (library scans, release sources). Decouple the handle from disable_workers: the scheduler is always started in serve, so the API always gets a handle and runtime schedule edits take effect without a restart. Also surface the admin-set cadence to users: UserPluginDto now carries a read-only syncCronSchedule, and the connection card shows a human-readable cadence (via a shared describeCron helper) or "not set up yet" with the auto-sync toggle disabled when no schedule is configured. Tests added for the DTO field, the cron description helper, and the card's configured/not-configured states.
Every serve replica runs its own in-process scheduler, so in a horizontally-scaled deployment each plugin sync cron fires once per replica and fans out duplicate per-user syncs. The only guard was a check-then-insert pending check, which races under concurrency.
Add a per-firing claim: a scheduled_firing_claims table whose composite primary key (job_key, fire_slot) lets exactly one replica win a given firing. The plugin-sync closure claims "plugin_sync:<id>" at the firing's minute slot before fanning out and skips if it lost; the slot is truncated to the minute so replicas firing for the same occurrence agree despite clock skew. Claiming fails open so a transient claim-table error keeps syncs flowing.
Back that with a partial unique index enforcing at most one pending/processing user_plugin_sync task per (plugin_id, user_id). The keys live in the params JSON, so the index is on the extracted values (params->>'...' on Postgres, json_extract on SQLite). enqueue already retries on a unique violation, so this turns its racy dedup into an atomic one, closing the minute-boundary and scheduled-vs-manual overlap windows.
Scope: only the plugin-sync fan-out is gated on the claim, since that is the firing that does redundant external work. Entity-keyed jobs are already deduped by existing unique indexes; non-entity-keyed jobs still fire per replica and can reuse the generic claim helper in a follow-up.
Tests added for claim election under concurrency, slot truncation, and the unique index rejecting duplicate pending rows.
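The per-firing claim can be sketched as follows. This is a TypeScript illustration under stated assumptions: an in-memory `Set` stands in for the scheduled_firing_claims table and its composite primary key, and the class and function names are hypothetical. The `plugin_sync:<id>` job key, the minute truncation, and the fail-open behaviour come from the commit.

```typescript
const MINUTE_MS = 60000;

// Truncate a firing time to its minute so replicas firing for the same
// occurrence compute the same slot despite small clock differences.
function fireSlot(firedAt: Date): string {
  const truncated = Math.floor(firedAt.getTime() / MINUTE_MS) * MINUTE_MS;
  return new Date(truncated).toISOString();
}

// In-memory stand-in for a table keyed on (job_key, fire_slot):
// the first insert for a key wins, every later one loses.
class FiringClaims {
  private readonly claimed = new Set<string>();

  tryClaim(jobKey: string, slot: string): boolean {
    const key = `${jobKey}|${slot}`;
    if (this.claimed.has(key)) return false;
    this.claimed.add(key);
    return true;
  }
}

// Returns true when this replica should fan out for the firing.
function shouldFanOut(claims: FiringClaims, pluginId: string, firedAt: Date): boolean {
  try {
    return claims.tryClaim(`plugin_sync:${pluginId}`, fireSlot(firedAt));
  } catch (_err) {
    return true; // fail open: a claim-store error must not stop syncs
  }
}
```

Two replicas firing at 12:00:00.200 and 12:00:01.900 both compute the 12:00 slot, so only the first claim succeeds; the firing at 12:01 is a new slot and is claimed independently.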
Summary
Expands the plugin integration system with four capabilities: admin-scheduled automatic sync, opt-in metadata enrichment of sync and recommendation payloads, accurate per-book reading progress in the sync protocol, and a set of "echo" debug plugins that record host→plugin traffic to disk. Together these let integration authors build richer, rules-driven plugins, fix a long-standing progress-accuracy bug for libraries with volume gaps, and give operators a credential-free way to inspect what the server actually sends to plugins.
Motivation
Previously a plugin sync only ran when a user clicked "sync", and the sync payload carried reading progress as a relative count of books read, which misreports the true position for any library that doesn't start at volume 1. Plugin authors also had no way to build behavior driven by series metadata (tags, genres, summaries) because that data was never sent, and no way to inspect sync or recommendation payloads without wiring up a real external account and reading interleaved server logs. This work addresses all of those: keeping integrations current automatically on an admin-controlled cadence, shipping richer data under user control, sending correct progress numbers, and making the wire protocol inspectable.
Changes
- … `wantsFullMetadata` and `wantsDetailedProgress` capabilities and the stored sync schedule.
- `sync-echo` and `recommendations-echo` plugins (alongside the existing metadata echo) record each request and response as paired JSON files in their data directory, with credentials redacted, for protocol inspection.
Notes