docs(governance): remediation approval ADR + role matrix + RBAC drift-lock #604
Merged
…BAC docs

- ADR (A-keep): keep the governance machinery, make human approval conditional: free-core single-rule remediation needs none, licensed bulk/auto keeps request->approve with self-review. Records the one-man-shop decision.
- New role matrix doc: remediation + exception request/approve/execute grants per built-in role, plus the no-bypass self-review rule.
- Fix HOSTS_AND_REMEDIATION.md: Python-era role names (SUPER_ADMIN, scan:*) and the wrong 'executes automatically' claim -> real Go RBAC + manual Fix.
- Fix rbac_registry.md: remediation:execute is free core now (not license-gated); reconcile the approver_roles fiction (no approvals policy is configured; the enforced gate is the remediation:approve/exception:approve permission).
Add a spec constraint + test asserting built-in role grants match the remediation/exception governance matrix, so a permissions.yaml edit that breaks separation of duties (e.g. granting ops_lead remediation:approve, or dropping auditor's exception:approve) fails the build. Verified against BuiltInRoles.
This was referenced Jun 19, 2026
remyluslosius added a commit that referenced this pull request Jun 20, 2026
… + auth fix) (#609)

* fix(auth): return 401 for anonymous callers on protected endpoints

  An anonymous request (no credentials, or a session cookie that expired in the browser and is no longer sent) to a protected endpoint now returns 401 auth.required instead of 403. The SPA redirects to login on a 401, so an expired session surfaces as a clean re-login prompt rather than a dead-end 'failed to load'. An authenticated caller whose role lacks the permission still gets 403 authz.permission_denied; the audit event is unchanged for both.

* test+spec: update anonymous-denial contract to 401 across specs/tests

  The 12 specs/tests that strictly asserted anonymous -> 403 now assert 401 auth.required (alerts, audit-events-query, fleet-observability, host-system-info, os-intelligence, system-rbac AC-09/AC-15, system/fleet connectivity, discovery/intelligence config). Authenticated-but-unauthorized -> 403 language preserved. Specs that already said '401/403' are unchanged.

* feat(remediation): conditional approval (A-keep) - free-core auto-approves

  Implements the A-keep ADR: free-core single-rule remediation no longer requires a separate human approval, so a single operator can request and Fix a finding directly (removing the self-review deadlock). The approve/reject flow with separation of duties is retained for the licensed bulk/auto track.

  - Request(...requiresApproval bool): false (free core) inserts an 'approved' row directly (reviewed_at set, reviewed_by NULL, auto-approved review_note) and emits remediation.requested + remediation.approved; true (licensed bulk/auto) inserts 'pending_approval' and goes through Approve/Reject.
  - The single-rule request handler passes false.
  - Tests: AC-01 covers auto-approve + the approval-required path; the HTTP AC-05/AC-06 approve and pending-execute paths seed a pending_approval request (the free-core POST auto-approves).

  Frontend unchanged (the hook already renders approved -> Fix and keeps the pending_approval/approve UI for the licensed track). Note: the ADR + governance docs land in #604; their status flips to 'implemented' once both merge.

* fix(remediation): serialize concurrent fixes on a host instead of failing

  Clicking Fix on several findings on the same host enqueued multiple jobs that ran concurrently; the second collided on the per-host SSH guard (ErrHostBusy) and the remediation worker marked it failed. Now the worker treats a busy host as transient: it backs off and requeues (queue.EnqueueAfter) until the host is free, so the fixes apply one at a time.

  - queue: add a delayed-visibility column (migration 0039 available_at) + EnqueueAfter(delay); Dequeue skips not-yet-available rows so the requeue does not busy-loop the drain (job-queue AC-13).
  - remediation: HostHasExecuting + RevertToApproved primitives (api-remediation AC-08); worker processExecute/processRollback pre-check the host and revert+requeue on an ErrHostBusy race instead of failing the request.

* feat(frontend): live remediation status via remediation.completed SSE

  The Remediation tab required a manual refresh to see a fix finish. The worker already publishes remediation.completed on the event bus; useLiveEvents now subscribes to it and invalidates ['host', id, 'remediations'] + ['host', id], so the tab and the compliance score update automatically when a queued fix or rollback reaches its terminal state. frontend-live-events AC-09 + AC-01 (topic set grows to 6).

* chore(release): bump Kensa to v0.5.2 and prepare 0.2.0-rc.11

  Kensa v0.5.2 is a PATCH release with a frozen api/ surface, so OpenWatch's library integration is unchanged. Its notable fix corrects a config_value matching bug ('" "' delimiter now matches any whitespace incl. TAB), which removes a class of false FAILs on TAB-delimited rules (RHEL login.defs); affected hosts may see their compliance score improve. The jsonl skipped-vs-error fix (kensa#104) is confirmed no-impact for the library path (issue #603).

  - go.mod kensa v0.5.1 -> v0.5.2; KensaModuleVersion + kensa-executor spec pin updated to match (version-pin tests pass; corpus stays at 539 rules, the variable-catalog AC still sees exactly 3 placeholders).
  - version.env -> 0.2.0-rc.11; README + operator guides + CHANGELOG cut a 0.2.0-rc.11 section.

* docs(changelog): reconcile rc.11 section (bundle #604-#608)
Remediation/exception governance: ADR, role matrix, doc fixes, and a drift-lock test
Addresses the governance questions raised against the just-landed remediation feature: what happens to the model for a one-man shop, who can request vs. approve, and whether the role grants are documented and enforced. This matters most for the licensed OpenWatch+ track.
What's here
ADR: docs/engineering/remediation_governance_adr.md (A-keep, Accepted)
Keep the governance machinery; make the human approval step conditional: free-core single-rule remediation needs no separate approval, the licensed bulk/auto track keeps request → approve with the self-review separation-of-duties guard. Records the one-man-shop decision (a lone operator can't self-approve today; the free tier drops the gate rather than relaxing self-review).

Role matrix: docs/engineering/remediation_exception_governance.md (new)
The accurate per-role grants for remediation + exceptions (request / approve / execute / rollback / revoke), the no-bypass self-review rule, and the approver_roles clarification.

Stale-doc fixes
- HOSTS_AND_REMEDIATION.md: replaced Python-era role names (SUPER_ADMIN, scan:approve/rollback) and the false "executes automatically after approval" claim with the real Go RBAC + operator-initiated Fix.
- rbac_registry.md: remediation:execute is free core now (was shown license-gated); reconciled the approver_roles: [security_admin, ops_lead] fiction. No approvals policy is configured, and the enforced gate is the remediation:approve / exception:approve permission (ops_lead does not hold it).

Drift-lock test: system-rbac C-08 / AC-17
New spec constraint + TestGovernanceRoleMatrix asserting built-in role grants match the matrix against BuiltInRoles. A permissions.yaml edit that breaks separation of duties (e.g. granting ops_lead remediation:approve, or dropping auditor's exception:approve) now fails the build. All 110 specs at 100% structural coverage; the test passes.

The matrix being locked
Docs-and-test only; no behavior change. The conditional-approval implementation in the ADR is a separate follow-up.
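
The drift-lock idea can be sketched in Go as a comparison between the grants each built-in role actually holds and a table of expected grants. This is an illustration under stated assumptions, not the real TestGovernanceRoleMatrix: the `grant` type, `matrixDrift`, and the map shape standing in for BuiltInRoles are hypothetical, and `remediation:request` on ops_lead is a placeholder grant. Only two expectations come from the description above: ops_lead must not hold remediation:approve, and auditor must hold exception:approve.

```go
package main

import "fmt"

// grant is one hypothetical cell of the governance matrix:
// whether Role is expected to hold Permission.
type grant struct {
	Role, Permission string
	Want             bool
}

// holds reports whether perms contains p.
func holds(perms []string, p string) bool {
	for _, q := range perms {
		if q == p {
			return true
		}
	}
	return false
}

// matrixDrift returns one message per cell where the actual grants
// disagree with the expected matrix; an empty result means no drift.
func matrixDrift(roles map[string][]string, matrix []grant) []string {
	var drift []string
	for _, g := range matrix {
		if got := holds(roles[g.Role], g.Permission); got != g.Want {
			drift = append(drift, fmt.Sprintf("%s: %s granted=%t, want %t",
				g.Role, g.Permission, got, g.Want))
		}
	}
	return drift
}

func main() {
	// Stand-in for BuiltInRoles; the real structure is not shown in this PR.
	roles := map[string][]string{
		"ops_lead": {"remediation:request"}, // placeholder grant
		"auditor":  {"exception:approve"},
	}
	matrix := []grant{
		{"ops_lead", "remediation:approve", false},
		{"auditor", "exception:approve", true},
	}
	fmt.Println("drift before edit:", len(matrixDrift(roles, matrix)))

	// Simulate a permissions.yaml edit that breaks separation of duties.
	roles["ops_lead"] = append(roles["ops_lead"], "remediation:approve")
	for _, d := range matrixDrift(roles, matrix) {
		fmt.Println("drift:", d)
	}
}
```

In a real test, a non-empty result would be reported through the test framework so that the build fails, which is the behaviour the spec constraint asks for.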