Service Accounts #2943
Closed
GregorShear wants to merge 13 commits into
Closed
Conversation
044861d to
899d324
Compare
GregorShear
commented
May 20, 2026
| @@ -25,3 +25,4 @@ __pycache__ | |||
|
|
|||
| .claude/* | |||
| !.claude/skills/ | |||
| mise.local.toml | |||
Contributor
Author
There was a problem hiding this comment.
not related to this PR but helping me not commit a file related to a parallel workstream...
899d324 to
3c9246f
Compare
b07c268 to
66e5620
Compare
0917cc1 to
312d85f
Compare
dbbeaeb to
18896c6
Compare
3a02899 to
6948da6
Compare
610ec1f to
04f5da4
Compare
6948da6 to
a812db9
Compare
04f5da4 to
7fdbf6d
Compare
a812db9 to
d552d56
Compare
7fdbf6d to
73d3226
Compare
d552d56 to
4338574
Compare
73d3226 to
d53e19a
Compare
fd157d9 to
2bff6de
Compare
62f67b0 to
b2d35a5
Compare
862524a to
55e2824
Compare
b23e0e7 to
f21579a
Compare
2404c5d to
f394743
Compare
Restacked onto greg/refresh-token-exchange, which carries the refresh-token GraphQL operations, the /api/v1/auth/token endpoint, and stateful bearer authentication that this branch previously bundled. This commit squashes the prior branch history (PR feedback, tenant filtering, revocation semantics, SHA-256 key hashing) into the remaining service-account delta: - internal.service_accounts and internal.api_keys tables. A service account is a non-login auth.users identity whose access is determined solely by its user_grants; its prefix is a management anchor determining who may manage it. - GraphQL operations: serviceAccounts query (tenant-filterable, paginated, with batch-loaded API keys), createServiceAccount, revokeServiceAccountGrants (the kill switch), createApiKey, revokeApiKey. - api_key grant on POST /api/v1/auth/token: verifies flow_sa_ keys against SHA-256 hashes and mints access tokens in the application layer. - ManageServiceAccounts orthogonal capability, bundled into TeamAdmin; authorized_prefixes gains a tenant filter that narrows (never widens) the caller's authorized set. - Service-account principals are denied refresh tokens, and SA synthetic emails are excluded from Stripe billing-contact selection.
Add an api_key grant to POST /api/v1/auth/token so a service-account key can be traded for a short-lived signed access token, alongside the existing refresh_token grant. This suits clients that prefer the standard OAuth shape: present the long-lived key once, then carry a JWT. Factor the stateful key verification out of authenticate_api_key into server::verify_api_key, which returns the verified user_id (secret hash + expiry + revocation check, plus the last_used_at stamp). Both the bearer path and the exchange endpoint call it and then build their own claims: the bearer path wraps them in a request-scoped Verified, while the exchange signs a JWT. The minted token gets a 1-hour expiry matching the refresh-token exchange, so both grant types yield equivalent access tokens; the bearer path keeps its 5-minute expiry, which is right for per-request re-verification. A signed token can't be revoked before it expires, so the 1-hour lifetime bounds how long an exchanged token outlives a key revocation; direct bearer use still cuts off immediately. Revoked or expired keys can't be exchanged, since the exchange routes through the same verification. Also drop the bulk revokeApiKeys mutation and its test coverage; keys are revoked individually via revokeApiKey.
…after merged migrations The service_accounts migration was dated 20260528, earlier than the 20260601 (grant_bundles_select) and 20260605 (anon_connector_grants) migrations that have since landed on master. Once this branch merges, a migration with a timestamp earlier than already-applied upstream migrations is out of order. Rename it to 20260615 so it sorts last and applies cleanly everywhere. Content is unchanged.
…lities The service-account management surface was asymmetric: listing authorized on the fine-grained `ManageServiceAccounts` capability while every mutation required the full `Admin` bundle. That produced a class of principal (a `TeamAdmin` without `Admin`) who could see service accounts but manage none of them, and it meant the named capability the feature defines was never the actual gate for any write. Make the surface track the capabilities it defines: - Anchor checks (create/add-grant/remove-grant/create-key/revoke-key, all on the account's `catalog_name`) now require `ManageServiceAccounts`, matching the listing query. A `TeamAdmin` can fully manage accounts and their keys without holding `Admin`. - The per-grant prefix check now requires `CreateGrant` on each granted prefix rather than `Admin`. This is the anti-escalation guard — a caller still can't hand a service account reach they couldn't grant anyone — but it keys off the capability that authorizes granting, not the whole Admin bundle. Human-user grant creation still lives in PostgREST; when it migrates to GraphQL it should adopt this same `CreateGrant` gate. To express fine-grained capabilities at these call sites, generalize `evaluate_names_authorization` and `verify_authorization` to accept anything that converts into a `CapabilitySet` (legacy `models::Capability`, a single `models::authz::Capability` bit, or an explicit set). The BFS primitive already operated on `CapabilitySet`, so this only lifts the wrapper signatures; existing legacy-capability callers are unaffected. Add a `Display` impl for `models::authz::Capability` so denial messages render the required capability. These three changes mirror the same generalization in #2944 so the two branches converge cleanly on whichever merges second. Add `test_team_admin_manages_without_full_admin`, which seeds a caller holding only the `team_admin` bundle (no `Admin`) and asserts both the positive path (manage accounts, mint keys, grant prefixes within reach) and the anti-escalation boundary (cannot grant a prefix they lack `CreateGrant` on). Also correct the migration comments, which claimed API keys are never exchanged for a JWT — the token-exchange endpoint mints a short-lived access token from a key.
Service accounts add a third caller of `Verified::assert_authenticity` — `authenticate_api_key`, which mints an authentication proof from a verified API-key secret. This is the new authentication entry point this branch introduces over its refresh-token base. Account for it where the choke point is documented and guarded: add the API-key caller to the SECURITY doc's enumerated set, bump the `authenticity_census` count for server/mod.rs from one to two, and annotate the call site. Without this, the census inherited from the base branch fails the build — which is the intended signal that a new way to authenticate a request was added and needs review.
The service_accounts.last_used_at column was a denormalized cache of max(api_keys.last_used_at): verify_api_key stamped both the key and its owning account with the same now() in one round-trip, so the account column never held information not already present on api_keys. Drop the column and derive ServiceAccount.lastUsedAt at read time as the max over the account's keys (revoked included, preserving the old column's semantics). This collapses verify_api_key — on the per-request auth hot path — from two row updates to one, shifting the negligible aggregate cost to the low-frequency admin listing. The GraphQL field and SDL are unchanged.
Reworks how a service-account API key presented as an Authorization: Bearer credential is authenticated, to match how refresh tokens now work. Instead of verifying the key statefully and asserting request-scoped claims, the Envelope exchanges the key for a short-lived signed access token and verifies that token like any JWT. The stateful database check (verify_api_key: secret hash, expiry, revocation, last_used_at stamp) remains the source of trust; the signature just lets the minted token flow through the normal verify path. Replaces server::authenticate_api_key with server::exchange_api_key, which verifies the key and signs a 1-hour JWT for the verified identity. Both the bearer path (which mints, verifies, and discards the token within the request) and the /api/v1/auth/token exchange endpoint route through it, so the inline verify-and-sign in the endpoint collapses to a single call and a revoked or expired key can never yield a token. This also drops the assert_authenticity seam and its census, consistent with the refresh-token revert: API keys, like refresh tokens, no longer construct claims directly. Tests and doc comments are updated accordingly.
A catalog name is a service account's handle, so two accounts must not share one. Add a unique btree index on internal.service_accounts (catalog_name) — scoped to the table, so it does not collide with live_specs or other catalog entities, and case-sensitive per the catalog's no-lower()-folding convention. The existing SP-GiST index stays for the listing query's ^@ prefix scan, which the btree wouldn't serve. create_service_account maps the resulting unique-violation (SQLSTATE 23505) to a clean "a service account already exists for catalog name '...'" error rather than surfacing the raw database error. Covered by a duplicate-rejection assertion in test_service_account_lifecycle.
…ic API Now that catalog_name is unique, it serves as a service account's stable public handle, so the backing auth.users id no longer needs to be exposed. This decouples the API from that storage detail; the id can be reintroduced later if a real need arises (e.g. correlating with grants or audit data). - Remove the id field from the ServiceAccount type. - addServiceAccountGrant, removeServiceAccountGrant, and createApiKey now take catalogName instead of serviceAccountId. They authorize ManageServiceAccounts directly on the supplied name (no user_id -> name round-trip) and resolve the backing user_id internally for the write. - Replace the forward lookup_service_account(user_id) -> name with the reverse resolve_service_account(name) -> user_id; revoke_api_key JOINs to recover the catalog name from a key id. Tests address accounts by catalog name and read the backing user_id from the DB where they assert at the row level.
17702fa to
28941ff
Compare
7c8d949 to
f96bc46
Compare
f96bc46 to
4c1627b
Compare
Contributor
Author
|
closing in favor of #3058 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
auth.usersrows) that authenticate via API keys and are authorized through the existinguser_grants/role_grantssystem. Two new tables:internal.service_accountsandinternal.api_keys.createServiceAccount,addServiceAccountGrant,removeServiceAccountGrant,createApiKey,revokeApiKey, and a paginatedserviceAccountsquery.POST /api/v1/auth/tokenwith anapi_keygrant type that exchanges aflow_sa_-prefixed API key for a short-lived access token.Authorization: Bearercredential.ManageServiceAccountscapability (included in theTeamAdminbundle) that gates service-account management.Key design decisions
auth.usersrows. All existing RLS policies, PostgREST authorization,user_roles()resolution, androle_grantstraversal work unchanged. Each account gets a synthetic, non-login address (<catalog_name>@service.estuary.dev) and stores its catalog name asfull_name.catalog_nameis a management anchor It is unique and determines who may manage the account (admins of a covering prefix, viaManageServiceAccounts) and how the account is addressed.user_grants, which may span multiple prefixes.ManageServiceAccountson the catalog name. Adding a grant additionally requiresCreateGranton the granted prefix, so a caller can't hand a service account reach they couldn't grant anyone. Removing a grant requires only management capability, since narrowing access is always safe.validForis an ISO-8601 duration (e.g.P90D,P1Y), bounded to(0, 1 year], converted tonow() + interval. This is distinct from the sliding window used by refresh tokens.createRefreshTokenrejects SA principals, since a refresh token would bypass the expiring, revocable API-key model.Test plan
POST /api/v1/auth/tokenAuthorization: Bearerand reach an authenticated endpointManageServiceAccountscannot create, manage, or list another tenant's service accountsaddServiceAccountGrantrequiresCreateGranton the target prefix;removeServiceAccountGrantdoes notlast_used_atnot updatedvalidForvalidation: reject non-ISO-8601 durations, non-positive durations, and durations over 1 yearcreateRefreshTokenserviceAccountsquery is scoped to the caller'sManageServiceAccountsprefixescatalogNameis rejected with a clear error