
Service Accounts #2943

Closed
GregorShear wants to merge 13 commits into master from greg/service-accounts-phase1


@GregorShear GregorShear commented May 13, 2026

Contributor

Summary

  • Adds service accounts: non-login identities (backed by auth.users rows) that authenticate via API keys and are authorized through the existing user_grants / role_grants system. Two new tables: internal.service_accounts and internal.api_keys.
  • Adds GraphQL mutations createServiceAccount, addServiceAccountGrant, removeServiceAccountGrant, createApiKey, revokeApiKey, and a paginated serviceAccounts query.
  • Extends POST /api/v1/auth/token with an api_key grant type that exchanges a flow_sa_-prefixed API key for a short-lived access token.
  • An API key may also be presented directly as an Authorization: Bearer credential.
  • Adds a fine-grained ManageServiceAccounts capability (included in the TeamAdmin bundle) that gates service-account management.
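Because an API key and a signed access token can both arrive as an Authorization: Bearer credential, the receiving side has to tell them apart. A minimal sketch of that dispatch, keyed on the flow_sa_ prefix from the summary, might look like the following. `BearerCredential` and `classify_bearer` are hypothetical names for illustration and are not taken from this PR.

```rust
/// Prefix carried by service-account API keys, per the PR summary.
pub const API_KEY_PREFIX: &str = "flow_sa_";

/// Hypothetical classification of a bearer credential.
#[derive(Debug, PartialEq)]
pub enum BearerCredential<'a> {
    /// A long-lived service-account API key.
    ApiKey(&'a str),
    /// Anything else, assumed here to be a signed access token.
    AccessToken(&'a str),
}

/// Splits an `Authorization` header value into its credential kind.
/// Returns `None` when the scheme is not `Bearer` or the token is empty.
pub fn classify_bearer(header: &str) -> Option<BearerCredential<'_>> {
    let token = header.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        return None;
    }
    if token.starts_with(API_KEY_PREFIX) {
        Some(BearerCredential::ApiKey(token))
    } else {
        Some(BearerCredential::AccessToken(token))
    }
}
```

The sketch only classifies the credential; verifying the key or the token is a separate step.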

Key design decisions

  • Service accounts are auth.users rows. All existing RLS policies, PostgREST authorization, user_roles() resolution, and role_grants traversal work unchanged. Each account gets a synthetic, non-login address (<catalog_name>@service.estuary.dev) and stores its catalog name as full_name.
  • catalog_name is a management anchor. It is unique and determines who may manage the account (admins of a covering prefix, via ManageServiceAccounts) and how the account is addressed.
  • Access is governed solely by the service account's user_grants, which may span multiple prefixes.
  • Authorization is split to prevent privilege escalation. Managing an account (create, mint/revoke keys, add/remove grants) requires ManageServiceAccounts on the catalog name. Adding a grant additionally requires CreateGrant on the granted prefix, so a caller can't hand a service account reach they couldn't grant anyone. Removing a grant requires only management capability, since narrowing access is always safe.
  • API keys are hashed with SHA-256. Refresh tokens are hashed with bcrypt, whose deliberately slow hashing normally protects against brute-force attacks; that protection is unnecessary for a high-entropy random secret.
  • API keys use fixed expiry. validFor is an ISO-8601 duration (e.g. P90D, P1Y), bounded to (0, 1 year], converted to now() + interval. This is distinct from the sliding window used by refresh tokens.
  • Service accounts cannot create refresh tokens. createRefreshToken rejects SA principals, since a refresh token would bypass the expiring, revocable API-key model.
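The validFor bound described above can be sketched as a small validator. This is an illustration under stated assumptions, not the PR's implementation: it accepts only date components (Y, M, W, D), does not enforce component order, and approximates a year as 365 days and a month as 30 days, whereas the PR converts the duration to a database interval. `parse_valid_for_days` and `ValidForError` are hypothetical names.

```rust
/// Hypothetical validation errors for a `validFor` duration.
#[derive(Debug, PartialEq)]
pub enum ValidForError {
    Malformed,
    NonPositive,
    TooLong,
}

/// Upper bound of the (0, 1 year] range, approximated in days (assumption).
pub const MAX_DAYS: u64 = 365;

/// Parses a date-only ISO-8601 duration such as `P90D` or `P1Y` into days
/// and checks it against the (0, 1 year] bound.
pub fn parse_valid_for_days(input: &str) -> Result<u64, ValidForError> {
    let body = input.strip_prefix('P').ok_or(ValidForError::Malformed)?;
    if body.is_empty() {
        return Err(ValidForError::Malformed);
    }
    let mut total: u64 = 0;
    let mut digits = String::new();
    for ch in body.chars() {
        if ch.is_ascii_digit() {
            digits.push(ch);
            continue;
        }
        // Approximate day counts per unit; a simplification for this sketch.
        let per_unit: u64 = match ch {
            'Y' => 365,
            'M' => 30,
            'W' => 7,
            'D' => 1,
            _ => return Err(ValidForError::Malformed),
        };
        // A unit with no preceding digits fails to parse and is malformed.
        let count: u64 = digits.parse().map_err(|_| ValidForError::Malformed)?;
        digits.clear();
        let days = count.checked_mul(per_unit).ok_or(ValidForError::TooLong)?;
        total = total.checked_add(days).ok_or(ValidForError::TooLong)?;
    }
    if !digits.is_empty() {
        // Trailing digits without a unit designator.
        return Err(ValidForError::Malformed);
    }
    if total == 0 {
        return Err(ValidForError::NonPositive);
    }
    if total > MAX_DAYS {
        return Err(ValidForError::TooLong);
    }
    Ok(total)
}
```

Under these assumptions `P90D` and `P1Y` are accepted, while `P0D`, `P2Y`, and a non-ISO string such as `90 days` are rejected, matching the cases listed in the test plan.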

Test plan

  • Create service account with grants → mint API key → exchange for an access token via POST /api/v1/auth/token
  • Present an API key directly as Authorization: Bearer and reach an authenticated endpoint
  • A caller without ManageServiceAccounts cannot create, manage, or list another tenant's service accounts
  • addServiceAccountGrant requires CreateGrant on the target prefix; removeServiceAccountGrant does not
  • Revoke API key → exchange and bearer authentication both fail; last_used_at not updated
  • validFor validation: reject non-ISO-8601 durations, non-positive durations, and durations over 1 year
  • Reject a service-account principal attempting createRefreshToken
  • serviceAccounts query is scoped to the caller's ManageServiceAccounts prefixes
  • Duplicate catalogName is rejected with a clear error

@GregorShear GregorShear marked this pull request as draft May 13, 2026 22:49
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from 044861d to 899d324 Compare May 19, 2026 22:22
Comment thread .gitignore
@@ -25,3 +25,4 @@ __pycache__

.claude/*
!.claude/skills/
mise.local.toml

Contributor Author

not related to this PR but helping me not commit a file related to a parallel workstream...

@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from 899d324 to 3c9246f Compare May 20, 2026 18:21
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from b07c268 to 66e5620 Compare May 29, 2026 01:20
@GregorShear GregorShear requested a review from jshearer May 29, 2026 01:20
@GregorShear GregorShear marked this pull request as ready for review May 29, 2026 01:20
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from 0917cc1 to 312d85f Compare May 29, 2026 03:01
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 3 times, most recently from dbbeaeb to 18896c6 Compare June 1, 2026 16:27
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 3 times, most recently from 3a02899 to 6948da6 Compare June 10, 2026 20:11
@GregorShear GregorShear changed the base branch from master to greg/refresh-token-sha256 June 12, 2026 01:54
@GregorShear GregorShear force-pushed the greg/refresh-token-sha256 branch from 610ec1f to 04f5da4 Compare June 12, 2026 02:06
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from 6948da6 to a812db9 Compare June 12, 2026 02:06
@GregorShear GregorShear force-pushed the greg/refresh-token-sha256 branch from 04f5da4 to 7fdbf6d Compare June 12, 2026 03:07
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from a812db9 to d552d56 Compare June 12, 2026 03:07
@GregorShear GregorShear force-pushed the greg/refresh-token-sha256 branch from 7fdbf6d to 73d3226 Compare June 12, 2026 04:45
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from d552d56 to 4338574 Compare June 12, 2026 04:45
@GregorShear GregorShear force-pushed the greg/refresh-token-sha256 branch from 73d3226 to d53e19a Compare June 12, 2026 06:04
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from fd157d9 to 2bff6de Compare June 12, 2026 15:52
@GregorShear GregorShear changed the base branch from greg/refresh-token-sha256 to greg/refresh-token-exchange June 12, 2026 15:54
@GregorShear GregorShear changed the title from "service accounts: Phase 1 — backend CRUD and API key token exchange" to "Service Accounts" Jun 12, 2026
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from 62f67b0 to b2d35a5 Compare June 12, 2026 19:44
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 2 times, most recently from 862524a to 55e2824 Compare June 16, 2026 04:23
@GregorShear GregorShear force-pushed the greg/refresh-token-exchange branch from b23e0e7 to f21579a Compare June 16, 2026 23:34
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch 3 times, most recently from 2404c5d to f394743 Compare June 17, 2026 01:24
Restacked onto greg/refresh-token-exchange, which carries the refresh-token GraphQL operations, the /api/v1/auth/token endpoint, and stateful bearer authentication that this branch previously bundled. This commit squashes the prior branch history (PR feedback, tenant filtering, revocation semantics, SHA-256 key hashing) into the remaining service-account delta:

- internal.service_accounts and internal.api_keys tables. A service account is a non-login auth.users identity whose access is determined solely by its user_grants; its prefix is a management anchor determining who may manage it.
- GraphQL operations: serviceAccounts query (tenant-filterable, paginated, with batch-loaded API keys), createServiceAccount, revokeServiceAccountGrants (the kill switch), createApiKey, revokeApiKey.
- api_key grant on POST /api/v1/auth/token: verifies flow_sa_ keys against SHA-256 hashes and mints access tokens in the application layer.
- ManageServiceAccounts orthogonal capability, bundled into TeamAdmin; authorized_prefixes gains a tenant filter that narrows (never widens) the caller's authorized set.
- Service-account principals are denied refresh tokens, and SA synthetic emails are excluded from Stripe billing-contact selection.
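The "narrows (never widens)" property of the tenant filter can be illustrated with a simple prefix intersection. This sketch assumes authorized prefixes are plain strings and that a tenant is itself a top-level prefix such as `acmeCo/`; `filter_authorized_prefixes` is a hypothetical name, and the real authorized_prefixes logic is not shown in this PR description.

```rust
/// Narrows the caller's authorized prefixes to those under `tenant`.
/// With no filter the authorized set is returned unchanged. The filter can
/// only drop entries; it never adds a prefix the caller did not already hold.
pub fn filter_authorized_prefixes(authorized: &[&str], tenant: Option<&str>) -> Vec<String> {
    authorized
        .iter()
        .filter(|prefix| match tenant {
            Some(t) => prefix.starts_with(t),
            None => true,
        })
        .map(|prefix| prefix.to_string())
        .collect()
}
```

Filtering on a tenant the caller has no reach into yields an empty set rather than an error in this sketch; how the actual query reports that case is not stated here.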

Add an api_key grant to POST /api/v1/auth/token so a service-account key
can be traded for a short-lived signed access token, alongside the
existing refresh_token grant. This suits clients that prefer the standard
OAuth shape: present the long-lived key once, then carry a JWT.

Factor the stateful key verification out of authenticate_api_key into
server::verify_api_key, which returns the verified user_id (secret hash +
expiry + revocation check, plus the last_used_at stamp). Both the bearer
path and the exchange endpoint call it and then build their own claims:
the bearer path wraps them in a request-scoped Verified, while the
exchange signs a JWT. The minted token gets a 1-hour expiry matching the
refresh-token exchange, so both grant types yield equivalent access
tokens; the bearer path keeps its 5-minute expiry, which is right for
per-request re-verification.

A signed token can't be revoked before it expires, so the 1-hour lifetime
bounds how long an exchanged token outlives a key revocation; direct
bearer use still cuts off immediately. Revoked or expired keys can't be
exchanged, since the exchange routes through the same verification.
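The verification order described in this commit (secret hash, then revocation and expiry, with last_used_at stamped only on success) can be sketched against an in-memory stand-in for a key row. Assumptions: `ApiKeyRow`, `VerifyError`, and `digests_match` are hypothetical, the caller has already computed the SHA-256 digest of the presented secret (the Rust standard library has no SHA-256), and the real function works against the database rather than a struct.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory stand-in for a row of `internal.api_keys`.
pub struct ApiKeyRow {
    pub user_id: u64,
    pub secret_hash: [u8; 32],
    pub expires_at: SystemTime,
    pub revoked_at: Option<SystemTime>,
    pub last_used_at: Option<SystemTime>,
}

#[derive(Debug, PartialEq)]
pub enum VerifyError {
    BadSecret,
    Revoked,
    Expired,
}

/// Lifetime of an exchanged access token, per the commit message.
pub const ACCESS_TOKEN_TTL: Duration = Duration::from_secs(60 * 60);

/// Compares two digests without short-circuiting on the first mismatch.
fn digests_match(a: &[u8; 32], b: &[u8; 32]) -> bool {
    a.iter().zip(b.iter()).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Returns the verified user id, stamping `last_used_at` only on success.
/// `presented_hash` is assumed to be the SHA-256 digest of the presented key.
pub fn verify_api_key(
    row: &mut ApiKeyRow,
    presented_hash: &[u8; 32],
    now: SystemTime,
) -> Result<u64, VerifyError> {
    if !digests_match(&row.secret_hash, presented_hash) {
        return Err(VerifyError::BadSecret);
    }
    if row.revoked_at.is_some() {
        return Err(VerifyError::Revoked);
    }
    if now >= row.expires_at {
        return Err(VerifyError::Expired);
    }
    row.last_used_at = Some(now);
    Ok(row.user_id)
}
```

Because both the bearer path and the exchange endpoint would call the same function, a revoked or expired key fails in both places, and a failed attempt leaves last_used_at untouched.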

Also drop the bulk revokeApiKeys mutation and its test coverage; keys are
revoked individually via revokeApiKey.

…after merged migrations

The service_accounts migration was dated 20260528, earlier than the
20260601 (grant_bundles_select) and 20260605 (anon_connector_grants)
migrations that have since landed on master. Once this branch merges, a
migration with a timestamp earlier than already-applied upstream
migrations is out of order. Rename it to 20260615 so it sorts last and
applies cleanly everywhere. Content is unchanged.

…lities

The service-account management surface was asymmetric: listing authorized on
the fine-grained `ManageServiceAccounts` capability while every mutation
required the full `Admin` bundle. That produced a class of principal (a
`TeamAdmin` without `Admin`) who could see service accounts but manage none of
them, and it meant the named capability the feature defines was never the
actual gate for any write.

Make the surface track the capabilities it defines:

- Anchor checks (create/add-grant/remove-grant/create-key/revoke-key, all on
  the account's `catalog_name`) now require `ManageServiceAccounts`, matching
  the listing query. A `TeamAdmin` can fully manage accounts and their keys
  without holding `Admin`.
- The per-grant prefix check now requires `CreateGrant` on each granted prefix
  rather than `Admin`. This is the anti-escalation guard — a caller still
  can't hand a service account reach they couldn't grant anyone — but it keys
  off the capability that authorizes granting, not the whole Admin bundle.
  Human-user grant creation still lives in PostgREST; when it migrates to
  GraphQL it should adopt this same `CreateGrant` gate.
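The two-part check above can be sketched as follows. This is a simplified model, not the PR's code: the caller's authorizations are flattened into (prefix, capability) pairs and matched by string prefix, whereas the real system resolves them by traversing role_grants. `Caller`, `authorize_add_grant`, and `authorize_remove_grant` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Capability {
    ManageServiceAccounts,
    CreateGrant,
}

/// Hypothetical flattened view of a caller's authorizations.
pub struct Caller {
    pub grants: Vec<(String, Capability)>,
}

impl Caller {
    /// True when the caller holds `cap` on a prefix covering `name`.
    fn holds(&self, cap: Capability, name: &str) -> bool {
        self.grants
            .iter()
            .any(|(prefix, held)| *held == cap && name.starts_with(prefix.as_str()))
    }
}

/// Adding a grant needs the anchor check plus the anti-escalation check.
pub fn authorize_add_grant(
    caller: &Caller,
    catalog_name: &str,
    granted_prefix: &str,
) -> Result<(), String> {
    if !caller.holds(Capability::ManageServiceAccounts, catalog_name) {
        return Err(format!("ManageServiceAccounts is required on '{}'", catalog_name));
    }
    if !caller.holds(Capability::CreateGrant, granted_prefix) {
        return Err(format!("CreateGrant is required on '{}'", granted_prefix));
    }
    Ok(())
}

/// Removing a grant needs only the anchor check, since it narrows access.
pub fn authorize_remove_grant(caller: &Caller, catalog_name: &str) -> Result<(), String> {
    if caller.holds(Capability::ManageServiceAccounts, catalog_name) {
        Ok(())
    } else {
        Err(format!("ManageServiceAccounts is required on '{}'", catalog_name))
    }
}
```

A caller who manages `acmeCo/` but holds CreateGrant only on `acmeCo/data/` can grant the account `acmeCo/data/` yet is refused for `acmeCo/secrets/`, which is the escalation boundary the commit describes.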

To express fine-grained capabilities at these call sites, generalize
`evaluate_names_authorization` and `verify_authorization` to accept anything
that converts into a `CapabilitySet` (legacy `models::Capability`, a single
`models::authz::Capability` bit, or an explicit set). The BFS primitive
already operated on `CapabilitySet`, so this only lifts the wrapper signatures;
existing legacy-capability callers are unaffected. Add a `Display` impl for
`models::authz::Capability` so denial messages render the required capability.
These three changes mirror the same generalization in #2944 so the two
branches converge cleanly on whichever merges second.
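The shape of that generalization, a wrapper that accepts anything convertible into a capability set and renders the missing capability in its denial message, might look roughly like this. The bit values, the `CapabilitySet` layout, and the message wording are assumptions for illustration; only the names `CapabilitySet`, `Capability`, and `verify_authorization` appear in the commit message, and their real signatures are not shown there.

```rust
use std::fmt;

/// Single capability bits (hypothetical values).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Capability {
    CreateGrant = 1 << 0,
    ManageServiceAccounts = 1 << 1,
}

impl fmt::Display for Capability {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Capability::CreateGrant => "CreateGrant",
            Capability::ManageServiceAccounts => "ManageServiceAccounts",
        })
    }
}

/// A set of capability bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CapabilitySet(u32);

impl CapabilitySet {
    pub fn union(self, other: impl Into<CapabilitySet>) -> CapabilitySet {
        CapabilitySet(self.0 | other.into().0)
    }
    pub fn contains(self, required: CapabilitySet) -> bool {
        (self.0 & required.0) == required.0
    }
}

impl From<Capability> for CapabilitySet {
    fn from(cap: Capability) -> Self {
        CapabilitySet(cap as u32)
    }
}

impl fmt::Display for CapabilitySet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let names: Vec<String> = [Capability::CreateGrant, Capability::ManageServiceAccounts]
            .iter()
            .filter(|cap| self.contains((**cap).into()))
            .map(|cap| cap.to_string())
            .collect();
        f.write_str(&names.join("+"))
    }
}

/// Accepts a single capability or an explicit set as the requirement.
pub fn verify_authorization<C>(held: CapabilitySet, required: C) -> Result<(), String>
where
    C: Into<CapabilitySet> + fmt::Display,
{
    let label = required.to_string();
    if held.contains(required.into()) {
        Ok(())
    } else {
        Err(format!("missing required capability {}", label))
    }
}
```

Existing callers that pass a single capability keep compiling unchanged, since the conversion happens inside the wrapper.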

Add `test_team_admin_manages_without_full_admin`, which seeds a caller holding
only the `team_admin` bundle (no `Admin`) and asserts both the positive path
(manage accounts, mint keys, grant prefixes within reach) and the
anti-escalation boundary (cannot grant a prefix they lack `CreateGrant` on).

Also correct the migration comments, which claimed API keys are never
exchanged for a JWT — the token-exchange endpoint mints a short-lived access
token from a key.

Service accounts add a third caller of `Verified::assert_authenticity` —
`authenticate_api_key`, which mints an authentication proof from a verified
API-key secret. This is the new authentication entry point this branch
introduces over its refresh-token base.

Account for it where the choke point is documented and guarded: add the API-key
caller to the SECURITY doc's enumerated set, bump the `authenticity_census`
count for server/mod.rs from one to two, and annotate the call site. Without
this, the census inherited from the base branch fails the build — which is the
intended signal that a new way to authenticate a request was added and needs
review.

The service_accounts.last_used_at column was a denormalized cache of max(api_keys.last_used_at): verify_api_key stamped both the key and its owning account with the same now() in one round-trip, so the account column never held information not already present on api_keys.

Drop the column and derive ServiceAccount.lastUsedAt at read time as the max over the account's keys (revoked included, preserving the old column's semantics). This collapses verify_api_key — on the per-request auth hot path — from two row updates to one, shifting the negligible aggregate cost to the low-frequency admin listing. The GraphQL field and SDL are unchanged.
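Deriving the account-level value at read time amounts to a maximum over the keys' timestamps, revoked keys included. A minimal sketch, assuming timestamps are represented as seconds since the epoch and using a hypothetical `ApiKey` struct:

```rust
/// Hypothetical view of an API key as loaded for the admin listing.
pub struct ApiKey {
    pub revoked: bool,
    /// Seconds since the Unix epoch; `None` if the key was never used.
    pub last_used_at: Option<u64>,
}

/// Latest use across all of an account's keys, revoked ones included.
/// Returns `None` when no key has ever been used.
pub fn account_last_used_at(keys: &[ApiKey]) -> Option<u64> {
    keys.iter().filter_map(|key| key.last_used_at).max()
}
```

The `revoked` flag is deliberately ignored by the aggregate, which preserves the semantics the dropped column had.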

Reworks how a service-account API key presented as an Authorization: Bearer credential is authenticated, to match how refresh tokens now work. Instead of verifying the key statefully and asserting request-scoped claims, the Envelope exchanges the key for a short-lived signed access token and verifies that token like any JWT. The stateful database check (verify_api_key: secret hash, expiry, revocation, last_used_at stamp) remains the source of trust; the signature just lets the minted token flow through the normal verify path.

Replaces server::authenticate_api_key with server::exchange_api_key, which verifies the key and signs a 1-hour JWT for the verified identity. Both the bearer path (which mints, verifies, and discards the token within the request) and the /api/v1/auth/token exchange endpoint route through it, so the inline verify-and-sign in the endpoint collapses to a single call and a revoked or expired key can never yield a token.

This also drops the assert_authenticity seam and its census, consistent with the refresh-token revert: API keys, like refresh tokens, no longer construct claims directly. Tests and doc comments are updated accordingly.

A catalog name is a service account's handle, so two accounts must not share one. Add a unique btree index on internal.service_accounts (catalog_name) — scoped to the table, so it does not collide with live_specs or other catalog entities, and case-sensitive per the catalog's no-lower()-folding convention. The existing SP-GiST index stays for the listing query's ^@ prefix scan, which the btree wouldn't serve.

create_service_account maps the resulting unique-violation (SQLSTATE 23505) to a clean "a service account already exists for catalog name '...'" error rather than surfacing the raw database error. Covered by a duplicate-rejection assertion in test_service_account_lifecycle.
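The error mapping can be sketched without a database driver by modeling the error as a SQLSTATE code plus a message. `DbError` and `describe_create_error` are hypothetical stand-ins; the actual code presumably inspects the driver's error type, which is not shown here. SQLSTATE 23505 is PostgreSQL's unique_violation code.

```rust
/// PostgreSQL SQLSTATE for a unique-constraint violation.
pub const UNIQUE_VIOLATION: &str = "23505";

/// Hypothetical stand-in for a database driver error.
pub struct DbError {
    pub sqlstate: Option<String>,
    pub message: String,
}

/// Maps a failed insert to a caller-facing message, translating the
/// unique-violation case into the clean duplicate error.
pub fn describe_create_error(err: &DbError, catalog_name: &str) -> String {
    match err.sqlstate.as_deref() {
        Some(UNIQUE_VIOLATION) => format!(
            "a service account already exists for catalog name '{}'",
            catalog_name
        ),
        // The fallback wording is an assumption for this sketch.
        _ => format!("failed to create service account: {}", err.message),
    }
}
```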

…ic API

Now that catalog_name is unique, it serves as a service account's stable public handle, so the backing auth.users id no longer needs to be exposed. This decouples the API from that storage detail; the id can be reintroduced later if a real need arises (e.g. correlating with grants or audit data).

- Remove the id field from the ServiceAccount type.
- addServiceAccountGrant, removeServiceAccountGrant, and createApiKey now take catalogName instead of serviceAccountId. They authorize ManageServiceAccounts directly on the supplied name (no user_id -> name round-trip) and resolve the backing user_id internally for the write.
- Replace the forward lookup_service_account(user_id) -> name with the reverse resolve_service_account(name) -> user_id; revoke_api_key JOINs to recover the catalog name from a key id.

Tests address accounts by catalog name and read the backing user_id from the DB where they assert at the row level.
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from 17702fa to 28941ff Compare June 17, 2026 14:06
@GregorShear GregorShear changed the base branch from greg/refresh-token-exchange to master June 17, 2026 21:14
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from 7c8d949 to f96bc46 Compare June 17, 2026 23:08
@GregorShear GregorShear force-pushed the greg/service-accounts-phase1 branch from f96bc46 to 4c1627b Compare June 17, 2026 23:13
@GregorShear

Contributor Author

closing in favor of #3058


2 participants