Commit 8283173

docs: update ADR to make it more straightforward
1 parent 4a36f7b commit 8283173

1 file changed

Lines changed: 71 additions & 114 deletions

docs/decisions/0012-auditability.rst

@@ -14,8 +14,7 @@ The existing architecture (see `ADR 0005`_) introduced ``ExtendedCasbinRule``, w
1414
This is not an audit trail: there is no actor, no operation type, and no mechanism for
1515
downstream consumers to react to changes.
1616

17-
As the framework is adopted across more Open edX services, operators and developers need
18-
answers the current system cannot provide:
17+
Operators and developers need answers the current system cannot provide:
1918

2019
- Who assigned this role, and when?
2120
- Who removed a user's access, and was it intentional?
@@ -28,27 +27,14 @@ Auditability decomposes into three dimensions:
2827
2. **Explainability**: why was access granted or denied? (policy evaluation at check time)
2928
3. **Usage**: who used access? (resource access events, business operations)
3029

31-
SpiceDB and OpenFGA version the entire authorization graph, enabling historical
32-
reconstruction. Keycloak uses event listeners on administrative actions. openedx-authz sits
33-
between these: a mutable policy store with no built-in audit layer.
30+
`SpiceDB`_ and `OpenFGA`_ track the full authorization graph as a versioned changelog,
31+
enabling historical reconstruction. Keycloak uses event listeners on administrative actions.
32+
openedx-authz sits between these: a mutable policy store with no built-in audit layer.
33+
(See `OEPM-Spike\: RBAC AuthZ Auditability`_ for the peer system analysis.)
3434

35-
The pycasbin ecosystem has no audit plugin and no mechanism in the
36-
``casbin-django-orm-adapter`` for change tracking. ``WatcherEx`` provides rule-level hooks
37-
but carries no actor context and does not cover update operations.
38-
39-
Two transitive dependencies already cover what is needed:
40-
41-
- **django-crum** (``0.7.9``, via ``edx-django-utils``): ``get_current_user()`` from
42-
thread-local. Returns ``None`` in non-request contexts, treated as a system actor.
43-
- **django-simple-history** (``3.11.0``, via ``edx-organizations``): model-level change
44-
tracking with actor, timestamp, and before/after state. Not applied to any openedx-authz
45-
model yet.
46-
47-
The Auth0 FGA Logging API (October 2025) defines three acceptance criteria for this feature:
48-
49-
- Who made a permission change? (attribution)
50-
- What did a user access or attempt? (explainability + usage)
51-
- Can logs be exported to external systems? (SIEM, Aspects)
35+
The pycasbin ecosystem has no audit plugin. Two transitive dependencies cover what is needed:
36+
``django-crum`` (via ``edx-django-utils``) for actor capture, and ``django-simple-history``
37+
(via ``edx-organizations``) for point-in-time state reconstruction.
5238

5339
Decision
5440
********
@@ -60,34 +46,31 @@ Three independent mechanisms, each answering a different question:
6046
- ``django-simple-history`` on ``ExtendedCasbinRule``: what was the full state at time T
6147
(future work)
6248

63-
Attribution: Role Lifecycle Events and Audit Table
64-
==================================================
49+
See the `OEPM-Spike\: RBAC AuthZ Auditability`_ for the architecture diagram of the three
50+
flows.
51+
52+
#. Attribution: Role Lifecycle Events and Audit Table
53+
=====================================================
6554

6655
Emit an ``OpenedxPublicSignal`` from ``openedx_authz.api.roles`` after every successful role
67-
assignment or removal, via ``transaction.on_commit``. A Celery handler writes the event to
68-
``RoleAssignmentAudit``.
56+
assignment or removal, via ``transaction.on_commit``. A synchronous Django signal receiver
57+
writes the event to ``RoleAssignmentAudit`` in the same process.
6958

7059
The handler is enabled by default. Operators with Aspects or a SIEM can disable it via a
7160
Django setting to avoid the redundant write. If the handler fails, the Casbin write and the
7261
event are unaffected.
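
The flow described above can be sketched without Django. This is a framework-free simulation under stated assumptions, not the openedx-authz implementation: ``on_commit``, ``commit``, ``send_to_receivers``, ``ENABLE_AUDIT_HANDLER``, ``policy_store``, and ``audit_table`` are hypothetical stand-ins for ``transaction.on_commit``, the signal dispatch, the Django setting, the Casbin policy table, and ``RoleAssignmentAudit``.

```python
# Simulation only: every name here is a hypothetical stand-in, not a real API.
policy_store = set()         # stands in for the Casbin policy table
audit_table = []             # stands in for RoleAssignmentAudit
delivery_errors = []         # receiver failures, kept for inspection
_pending_callbacks = []      # callbacks queued until the "transaction" commits
ENABLE_AUDIT_HANDLER = True  # stands in for the Django setting


def on_commit(callback):
    """Queue a callback so it runs only after commit()."""
    _pending_callbacks.append(callback)


def commit():
    """Simulate a successful commit by running the queued callbacks in order."""
    while _pending_callbacks:
        _pending_callbacks.pop(0)()


def audit_receiver(event):
    """Synchronous receiver: copy the event into the audit table if enabled."""
    if ENABLE_AUDIT_HANDLER:
        audit_table.append(dict(event))


def failing_receiver(event):
    """A broken consumer, used to show that failures stay isolated."""
    raise RuntimeError("consumer is down")


receivers = [failing_receiver, audit_receiver]


def send_to_receivers(event):
    """Call every receiver; one failure does not stop the others."""
    for receiver in receivers:
        try:
            receiver(event)
        except Exception as exc:
            delivery_errors.append((receiver.__name__, exc))


def assign_role(subject, role, scope, actor=None):
    """Write the policy first, then schedule the event for after commit."""
    policy_store.add((subject, role, scope))
    event = {
        "operation": "created",
        "subject": subject,
        "role": role,
        "scope": scope,
        "actor": actor,
    }
    on_commit(lambda: send_to_receivers(event))


assign_role(
    "user^alice",
    "role^instructor",
    "course-v1^course-v1:Org+Course+Run",
    actor="admin",
)
rows_before_commit = len(audit_table)  # the audit row does not exist yet
commit()
```

In this sketch the policy write happens before any receiver runs, so a failing receiver only adds an entry to ``delivery_errors`` and the audit receiver still records the event.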
7362

74-
.. note::
75-
76-
Whether to write to the audit table in the same process (no Celery) or via a separate
77-
task is an open question. Needs latency benchmarking before implementation.
78-
7963
Event payload
8064
-------------
8165

8266
.. code:: python
8367
8468
{
85-
"operation": "ASSIGN" | "REMOVE",
86-
"user": "<namespaced subject key, e.g. user^alice>",
69+
"operation": "created" | "deleted",
70+
"subject": "<namespaced subject key, e.g. user^alice>",
8771
"role": "<namespaced role key, e.g. role^instructor>",
8872
"scope": "<namespaced scope key, e.g. course-v1^course-v1:Org+Course+Run>",
89-
"actor": "<username of the caller, or None for system actor>",
90-
"timestamp": "<ISO 8601 UTC datetime>",
73+
"actor": "<User object for the caller, or None for system actor>",
9174
}
9275
9376
The actor is resolved from ``django_crum.get_current_user()`` at API call time. No callers
@@ -117,8 +100,8 @@ events (notifications, cache updates, analytics). Developers without an event bu
117100
the underlying Django signal directly. If an event bus is configured, events are forwarded to
118101
Aspects or external systems automatically.
119102

120-
Explainability: Real-Time Decision Context
121-
==========================================
103+
#. Explainability: Real-Time Decision Context
104+
=============================================
122105

123106
Expose ``enforce_ex()`` through the public Python API. It returns ``(result, explain_rule)``:
124107
the boolean decision and the matched policy rule. Callers get the exact rule that allowed or
@@ -134,6 +117,8 @@ options are available, both requiring a breaking change to ``is_user_allowed`` t
134117

135118
- **Option A (event replay):** Replay ``ASSIGN``/``REMOVE`` events from ``RoleAssignmentAudit``
136119
up to T. No extra infrastructure; the data is already there once attribution is implemented.
120+
The `Auth0 FGA Logging API`_ uses this same pattern: their logging API is an event store
121+
that you replay to answer historical questions.
137122
- **Option B (snapshots):** Add ``HistoricalRecords()`` to ``ExtendedCasbinRule`` and use
138123
``as_of(T)`` for the full rule state, including policy definitions. History collection must
139124
start before the target timestamp.
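
Option A can be illustrated with a short replay function. This is a minimal sketch under assumptions: audit rows are modeled as plain dictionaries, ``recorded_at`` is a hypothetical timestamp column on ``RoleAssignmentAudit``, ``roles_as_of`` is a hypothetical helper, and the operation values follow the event payload in this revision (``"created"`` / ``"deleted"``).

```python
from datetime import datetime, timezone


def roles_as_of(audit_rows, subject, when):
    """Replay audit rows in time order; return the (role, scope) pairs held at `when`."""
    held = set()
    for row in sorted(audit_rows, key=lambda r: r["recorded_at"]):
        if row["recorded_at"] > when:
            break  # everything after T is irrelevant to the question
        if row["subject"] != subject:
            continue
        key = (row["role"], row["scope"])
        if row["operation"] == "created":
            held.add(key)
        elif row["operation"] == "deleted":
            held.discard(key)
    return held


SCOPE = "course-v1^course-v1:Org+Course+Run"

# Hypothetical audit rows: Alice is assigned a role, then loses it.
AUDIT_ROWS = [
    {
        "operation": "created",
        "subject": "user^alice",
        "role": "role^instructor",
        "scope": SCOPE,
        "recorded_at": datetime(2025, 1, 10, tzinfo=timezone.utc),
    },
    {
        "operation": "deleted",
        "subject": "user^alice",
        "role": "role^instructor",
        "scope": SCOPE,
        "recorded_at": datetime(2025, 3, 1, tzinfo=timezone.utc),
    },
]

before_removal = roles_as_of(
    AUDIT_ROWS, "user^alice", datetime(2025, 2, 1, tzinfo=timezone.utc)
)
after_removal = roles_as_of(
    AUDIT_ROWS, "user^alice", datetime(2025, 4, 1, tzinfo=timezone.utc)
)
```

Replay only reconstructs role assignments. It says nothing about the policy definitions in force at T, which is the gap Option B covers.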
@@ -145,49 +130,47 @@ detect whether the model changed.
145130
Consequences
146131
************
147132

148-
Attribution
149-
===========
150-
151-
- Operators get a filterable role assignment history in Django admin. No external tooling
152-
required.
153-
- Developers get a stable ``OpenedxPublicSignal`` extension point. First formally defined
154-
event in openedx-authz.
155-
- Events are best-effort: if the audit write fails, the Casbin policy is still durable.
156-
Consumers requiring guaranteed delivery must implement their own retry logic.
157-
- ``actor`` is nullable. Non-request contexts (management commands, background tasks) record
158-
``None``, logged as a system operation.
159-
- No new dependencies introduced.
160-
- Callers of ``openedx_authz.api.roles`` need no signature changes.
161-
162-
Explainability
163-
==============
164-
165-
- Developers can retrieve the matched policy rule at check time for "why was this denied?"
166-
debugging.
167-
- The explanation is point-in-time only. Historical explainability is deferred.
168-
- Enforcement events are opt-in by design. Enabling them without an external consumer
169-
produces events that are emitted and discarded.
170-
- No new dependencies introduced.
171-
172-
Both flows
173-
==========
174-
175-
- ``RoleAssignmentAudit`` introduces a new migration. No existing table is modified.
176-
- The ``OpenedxPublicSignal`` schema is a public API surface. Field additions are
177-
backward-compatible; removals and renames are breaking changes.
178-
- Usage auditing belongs at the application layer (Open edX tracking events, Aspects), not
179-
in the authorization library.
180-
- ``RoleAssignmentAudit`` is not tamper-proof. Compliance-grade immutability is a
181-
later-phase concern.
182-
- Audit records are independent from live authorization state. Deleting a subject, scope, or
183-
role does not remove its audit history. Records may reference identifiers that no longer
184-
exist in the system.
185-
- ``actor`` is the exception: it is stored as a FK to the ``User`` model with ``SET_NULL``.
186-
Deleting a user sets ``actor`` to ``None``, losing attribution for any audit records they
187-
produced. This is an accepted trade-off: user deletion is rare in Open edX (the standard
188-
path is retirement, which anonymizes rather than hard-deletes), and the FK enables direct
189-
admin filtering by actor. If unconditional attribution durability is needed, ``actor``
190-
should be changed to a plain string field.
133+
#. **Operators get a filterable role assignment history in Django admin.** No external
134+
tooling required.
135+
136+
#. **Developers get a stable** ``OpenedxPublicSignal`` **extension point.** First formally
137+
defined event in openedx-authz. Callers of ``openedx_authz.api.roles`` need no signature
138+
changes.
139+
140+
#. **Events are best-effort.** If the audit write fails, the Casbin policy is still durable.
141+
Consumers requiring guaranteed delivery must implement their own retry logic.
142+
143+
#. ``actor`` **is nullable.** Non-request contexts (management commands, background tasks)
144+
record ``None``, logged as a system operation. ``actor`` is stored as a FK to ``User``
145+
with ``SET_NULL``: deleting a user loses attribution for their audit records. This is
146+
accepted because user deletion is rare in Open edX (retirement anonymizes rather than
147+
hard-deletes), and the FK enables admin filtering by actor. If unconditional attribution
148+
durability is needed, ``actor`` should be a plain string field instead.
149+
150+
#. **Audit records are independent from live authorization state.** Deleting a subject,
151+
scope, or role does not remove its audit history. Records may reference identifiers that
152+
no longer exist.
153+
154+
#. ``RoleAssignmentAudit`` **introduces a new migration.** No existing table is modified.
155+
156+
#. **The** ``OpenedxPublicSignal`` **schema is a public API surface.** Field additions are
157+
backward-compatible; removals and renames are breaking changes.
158+
159+
#. ``RoleAssignmentAudit`` **is not tamper-proof.** Compliance-grade immutability is a
160+
later-phase concern.
161+
162+
#. **No new dependencies introduced.** ``django-crum`` and ``django-simple-history`` are
163+
already transitive dependencies.
164+
165+
#. **Usage auditing belongs at the application layer** (Open edX tracking events, Aspects),
166+
not in the authorization library.
167+
168+
#. **Developers can retrieve the matched policy rule at check time** for "why was this
169+
denied?" debugging. The explanation is point-in-time only; historical explainability is
170+
deferred.
171+
172+
#. **Enforcement events are opt-in by design.** Enabling them without an external consumer
173+
produces events that are emitted and discarded.
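
The ``SET_NULL`` trade-off for ``actor`` can be seen in a small standalone example. This sketch uses Python's ``sqlite3`` module with a database-level ``ON DELETE SET NULL`` constraint purely to show the observable effect. The table and column names are hypothetical and are not the schema that the openedx-authz migration would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical tables standing in for the User model and RoleAssignmentAudit.
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute(
    """
    CREATE TABLE role_assignment_audit (
        id INTEGER PRIMARY KEY,
        operation TEXT NOT NULL,
        subject TEXT NOT NULL,
        role TEXT NOT NULL,
        scope TEXT NOT NULL,
        actor_id INTEGER REFERENCES auth_user(id) ON DELETE SET NULL
    )
    """
)

conn.execute("INSERT INTO auth_user (id, username) VALUES (1, 'admin')")
conn.execute(
    "INSERT INTO role_assignment_audit (operation, subject, role, scope, actor_id) "
    "VALUES (?, ?, ?, ?, ?)",
    ("created", "user^alice", "role^instructor", "course-v1^course-v1:Org+Course+Run", 1),
)

# Hard-deleting the actor keeps the audit row but drops the attribution.
conn.execute("DELETE FROM auth_user WHERE id = 1")
remaining = conn.execute(
    "SELECT operation, subject, actor_id FROM role_assignment_audit"
).fetchall()
```

The audit row survives the deletion, but its actor column is now null, so the record can no longer say who made the change. Storing the actor as a plain string would avoid that loss, at the cost of the direct admin filter.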
191174

192175
Alternatives Considered
193176
***********************
@@ -197,45 +180,15 @@ Alternatives Considered
197180

198181
Rejected for three reasons:
199182

200-
- ``save_policy`` does bulk delete + bulk create and bypasses model signals. Any policy
201-
reload creates a new snapshot. The ``history_date`` reflects when the table was written,
202-
not when a role was assigned. Snapshot diffs cannot tell apart "Alice was assigned
203-
instructor" from "policy reloaded, Alice already had the role."
204-
- Model signals are not fired for bulk operations, so writes through ``save_policy`` are not
205-
captured at all.
183+
- ``save_policy`` (`casbin-django-orm-adapter adapter.py`_) uses ``QuerySet.delete()`` and
184+
``bulk_create``, both of which bypass model signals. History snapshots reflect when the
185+
table was written, not when a role was assigned.
206186
- ``ExtendedCasbinRule`` fields (``ptype``, ``v0``--``v5``) are semi-opaque and require an
207187
interpretation layer. ``RoleAssignmentAudit`` translates at write time.
208188

209189
``django-simple-history`` remains the right tool for Option B (point-in-time state
210190
reconstruction), where it is a snapshot mechanism, not an operation log.
211191

212-
Use Cases Addressed
213-
*******************
214-
215-
+------------------------------------------------------------+---------------+
216-
| Description | Flow |
217-
+============================================================+===============+
218-
| Operator: who assigned a role to a user, and when? | Attribution |
219-
+------------------------------------------------------------+---------------+
220-
| Operator: who removed a role from a user, and when? | Attribution |
221-
+------------------------------------------------------------+---------------+
222-
| Operator: full role history for a given user | Attribution |
223-
+------------------------------------------------------------+---------------+
224-
| Operator: access control history for a given resource | Attribution |
225-
+------------------------------------------------------------+---------------+
226-
| Developer: hook into role lifecycle events from a plugin | Attribution |
227-
+------------------------------------------------------------+---------------+
228-
| Operator/Developer: query role assignment history via API | Attribution |
229-
+------------------------------------------------------------+---------------+
230-
| Developer: understand why a permission check was denied | Explainability|
231-
+------------------------------------------------------------+---------------+
232-
| Operator/Developer: inspect a user's current permissions | Explainability|
233-
+------------------------------------------------------------+---------------+
234-
235-
Deferred: resource access history / usage auditing; export to SIEM / Aspects (available as
236-
a side effect of the event signal once an event bus is configured, not a first-class
237-
deliverable of this ADR).
238-
239192
References
240193
**********
241194

@@ -246,12 +199,16 @@ References
246199
- `openedx-events documentation`_
247200
- `django-simple-history documentation`_
248201
- `django-crum documentation`_
249-
- OEPM-Spike: RBAC AuthZ Auditability
202+
- `OEPM-Spike: RBAC AuthZ Auditability`_
250203

251204
.. _ADR 0002: https://github.com/openedx/openedx-authz/blob/main/docs/decisions/0002-authorization-model-foundation.rst
252205
.. _ADR 0004: https://github.com/openedx/openedx-authz/blob/main/docs/decisions/0004-technology-selection.rst
253206
.. _ADR 0005: https://github.com/openedx/openedx-authz/blob/main/docs/decisions/0005-architecture-and-data-modeling.rst
254207
.. _Auth0 FGA Logging API: https://auth0.com/blog/auth0-fga-logging-api-a-complete-audit-trail-for-authorization/
208+
.. _SpiceDB: https://github.com/authzed/spicedb
209+
.. _OpenFGA: https://openfga.dev/
255210
.. _openedx-events documentation: https://docs.openedx.org/projects/openedx-events/en/latest/
256211
.. _django-simple-history documentation: https://django-simple-history.readthedocs.io/
257212
.. _django-crum documentation: https://pypi.org/project/django-crum/
213+
.. _casbin-django-orm-adapter adapter.py: https://github.com/officialpycasbin/django-orm-adapter/blob/main/casbin_adapter/adapter.py
214+
.. _OEPM-Spike\: RBAC AuthZ Auditability: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/6045859842/Spike+-+RBAC+AuthZ+-+Auditability
