
[FC-0099] feat: add model to be used as backreference to maintain rules consistent #100

Merged
mariajgrimaldi merged 26 commits into main from MJG/consistency-mechanism on Nov 11, 2025
@mariajgrimaldi mariajgrimaldi commented Oct 16, 2025

Description

This PR introduces a consistency mechanism between Open edX domain objects (such as users and content libraries) and Casbin policies. The goal is to keep authorization data synchronized with the lifecycle of real objects, ensuring that no orphaned or stale policies remain after deletions.

Initially, the plan was to use an event-based approach that listened to Open edX lifecycle events. However, this approach had three limitations: it would require creating new events for the user lifecycle, the platform does not expose a custom User model, and the events would only be useful in the short term, since the long-term plan was already to move to a back-reference model.

Because of these limitations, this implementation goes straight for the more sustainable option: a back-reference model that ensures transactional consistency at the Django application layer.

⚠️ This plan is still open for discussion. If it feels too big for now, we could start with the simpler event-based solution for the MVP and improve it later.

Core Components

Entity Relation Diagram

[Figure: ExtendedCasbinRule entity relation diagram]

Extended Casbin Rule Model

The new model, ExtendedCasbinRule, extends Casbin's base policy and adds scope and subject references using Django foreign keys:

  • Scope → the context of the rule (e.g., a ContentLibrary)
  • Subject → the entity that has the role (e.g., a User)

When a resource is deleted, Django's cascade mechanism ensures all related entries are cleaned up automatically through a complete deletion chain:

ContentLibrary → ContentLibraryScope → Scope → ExtendedCasbinRule → CasbinRule
    (DB CASCADE)      (DB CASCADE)    (DB CASCADE)   (signal handler)
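
To illustrate the chain above outside Django, here is a minimal sketch using SQLite foreign keys. The table names, column names, and sample values are hypothetical and simplified (the ContentLibraryScope level is folded into `scope`); this is not the PR's schema. It shows why the last hop needs explicit handling: the foreign-key cascade removes the back-reference row but leaves the referenced `casbin_rule` row behind, so `delete_library` collects the linked rule ids first and removes them afterwards, standing in for the signal handler.

```python
import sqlite3

# Hypothetical, simplified schema; names are illustrative, not the PR's models.
SCHEMA = """
CREATE TABLE casbin_rule (id INTEGER PRIMARY KEY, ptype TEXT, v0 TEXT, v1 TEXT, v2 TEXT);
CREATE TABLE content_library (id INTEGER PRIMARY KEY, library_key TEXT);
CREATE TABLE scope (
    id INTEGER PRIMARY KEY,
    library_id INTEGER REFERENCES content_library(id) ON DELETE CASCADE
);
CREATE TABLE extended_casbin_rule (
    id INTEGER PRIMARY KEY,
    casbin_rule_id INTEGER UNIQUE REFERENCES casbin_rule(id) ON DELETE CASCADE,
    scope_id INTEGER REFERENCES scope(id) ON DELETE CASCADE
);
"""


def count(conn, table):
    """Return the number of rows in a table (table names come from this script only)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def build_db():
    """Create an in-memory database with one library, scope, rule, and back-reference."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO content_library VALUES (1, 'lib:OpenedX:CSPROB')")
    conn.execute("INSERT INTO scope VALUES (1, 1)")
    # Made-up sample values for a role assignment.
    conn.execute(
        "INSERT INTO casbin_rule VALUES (1, 'g', 'alice', 'library_admin', 'lib:OpenedX:CSPROB')"
    )
    conn.execute("INSERT INTO extended_casbin_rule VALUES (1, 1, 1)")
    return conn


def delete_library(conn, library_id):
    """Delete a library, then remove the Casbin rules its cascade left behind."""
    # Collect linked casbin_rule ids before the cascade removes the back-references.
    rule_ids = [
        row[0]
        for row in conn.execute(
            "SELECT e.casbin_rule_id FROM extended_casbin_rule e "
            "JOIN scope s ON s.id = e.scope_id WHERE s.library_id = ?",
            (library_id,),
        )
    ]
    conn.execute("DELETE FROM content_library WHERE id = ?", (library_id,))
    # Stand-in for the signal handler hop: the FK cascade does not reach casbin_rule.
    conn.executemany("DELETE FROM casbin_rule WHERE id = ?", [(i,) for i in rule_ids])
    return rule_ids
```

With a plain `DELETE FROM content_library`, the `scope` and `extended_casbin_rule` rows disappear while the `casbin_rule` row survives; `delete_library` closes that gap.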

Bidirectional Cascade Implementation

The implementation uses database-level CASCADE for parent→child relationships (ContentLibrary→Scope, User→Subject, Scope/Subject→ExtendedCasbinRule, CasbinRule→ExtendedCasbinRule via OneToOneField).

For the reverse direction (ExtendedCasbinRule→CasbinRule), we use a post_delete signal handler that fires for both direct .delete() calls and CASCADE deletions to delete the CasbinRule, preventing orphaned authorization rules. The handler includes recursion protection using ContextVar to prevent infinite loops.

Registry-Based Polymorphic Design

The authorization system uses a registry pattern to avoid hardcoding foreign keys in the base models:

  • Base models (Scope, Subject) do not define resource-specific fields
  • Subclasses (like ContentLibraryScope, UserSubject) register with a namespace identifier
  • Managers use this registry to look up the right subclass

New scope or subject types can be added in separate apps by simply subclassing and registering, without touching core models or requiring redeployment.
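
As a rough sketch of the registry idea, using plain Python classes rather than Django models and managers: subclasses register themselves under a namespace, and the base class resolves a key to the right subclass. The `NAMESPACE` attribute, the `for_key` method, and the rule that the namespace is the text before the first colon are assumptions made for this example, not the PR's API.

```python
class Scope:
    """Base scope: holds no resource-specific fields."""

    NAMESPACE = None  # each subclass sets a unique namespace
    _registry = {}    # namespace -> subclass, shared across all subclasses

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.NAMESPACE is None:
            raise TypeError(f"{cls.__name__} must define NAMESPACE")
        if cls.NAMESPACE in Scope._registry:
            raise ValueError(f"Namespace {cls.NAMESPACE!r} is already registered")
        Scope._registry[cls.NAMESPACE] = cls

    def __init__(self, external_key):
        self.external_key = external_key

    @classmethod
    def for_key(cls, key):
        """Build the registered subclass that matches the key's namespace."""
        namespace = key.split(":", 1)[0]  # assumed convention: "lib:Org:Code"
        try:
            subclass = cls._registry[namespace]
        except KeyError:
            raise LookupError(f"No scope type registered for {namespace!r}") from None
        return subclass(key)


class ContentLibraryScope(Scope):
    """Registered from its own module; the base class is left untouched."""

    NAMESPACE = "lib"
```

For example, `Scope.for_key("lib:OpenedX:CSPROB")` returns a `ContentLibraryScope`, while a key with an unregistered namespace raises `LookupError`.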

Signal Handler for Reverse Cascade

We use a post_delete signal handler to delete the CasbinRule when ExtendedCasbinRule is deleted. The signal handler:

  1. Fires after the row is deleted: Django sends post_delete once the ExtendedCasbinRule row has been deleted. The signal is sent inside the transaction that wraps the deletion rather than after it commits, so the handler's CasbinRule deletion runs in that same transaction (it is not wrapped in an additional atomic block)
  2. Works with CASCADE deletions: The signal fires when ExtendedCasbinRule is deleted via CASCADE (e.g., when Scope/Subject is deleted)
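
Django documents that `on_delete=CASCADE` is emulated by the ORM and that pre_delete and post_delete are sent for every deleted object, which is why the handler also runs for cascaded rows. The toy model below mimics that behavior with an in-memory dispatcher. It is not Django code: the `Signal` and `Record` classes are hypothetical stand-ins, and the CasbinRule to ExtendedCasbinRule cascade is left out for brevity.

```python
from collections import defaultdict


class Signal:
    """Minimal dispatcher: handlers are registered per sender class."""

    def __init__(self):
        self._receivers = defaultdict(list)

    def connect(self, handler, sender):
        self._receivers[sender].append(handler)

    def send(self, sender, instance):
        for handler in self._receivers[sender]:
            handler(sender=sender, instance=instance)


post_delete = Signal()


class Record:
    """In-memory row whose cascade is performed in application code."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.deleted = False
        if parent is not None:
            parent.children.append(self)

    def delete(self):
        # Cascade in Python first, so every removed row gets its own signal.
        for child in list(self.children):
            child.delete()
        self.deleted = True
        post_delete.send(sender=type(self), instance=self)


class Scope(Record):
    pass


class CasbinRule(Record):
    pass


class ExtendedCasbinRule(Record):
    def __init__(self, name, parent, casbin_rule):
        super().__init__(name, parent)
        self.casbin_rule = casbin_rule  # back-reference to the policy row


def delete_casbin_rule(sender, instance, **kwargs):
    """Reverse cascade: remove the policy row once its back-reference is gone."""
    instance.casbin_rule.delete()


post_delete.connect(delete_casbin_rule, sender=ExtendedCasbinRule)
```

Deleting a `Scope` here marks its `ExtendedCasbinRule` child as deleted and, through the handler, the linked `CasbinRule` as well.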

Test Cases

This PR includes the /integration folder, which covers the critical cascade deletion workflows. Consider the following:

CASE 1:

  • Assign a user to a role in a library scope through the REST API, let's say, lib:OpenedX:CSPROB
  • Check that the extended model was created (I use a MySQL database client, but the Django shell works too)
  • Delete the user or the content library; the extended model and the Casbin rule should both be deleted

CASE 2:
Repeat the same steps, but unassign the user from the role instead of deleting the user or library. The extended model should then be removed, along with the related subject and scope records.

To run these test cases:

tutor dev exec lms bash
pytest /openedx/openedx-authz/openedx_authz/tests/integration/

Caveats

  • User retirement will be handled via events in a different PR
  • Signal handler dependency: The reverse cascade (ExtendedCasbinRule → CasbinRule) relies on Django's post_delete signal, which fires for both direct .delete() calls and CASCADE deletions, but does not fire for deletions that bypass Django's deletion machinery (e.g., raw SQL or certain bulk operations)
  • Scope limited to deletions: The current implementation focuses on deletion consistency. Updates to policies (e.g., changing scope/subject) are not covered but can be added if needed
  • Metadata fields: The model includes timestamps and description fields that can be removed if not needed for MVP
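
For deletions that slip past the signal handler (for example raw SQL), one possible safety net is a periodic sweep for Casbin rules that no longer have a back-reference row. The sketch below uses SQLite with hypothetical table and column names. It assumes that role assignments are stored with `ptype = 'g'` and that every such rule is expected to have a back-reference; both assumptions would need checking against the real schema before anything like this is run.

```python
import sqlite3


def find_orphaned_rule_ids(conn):
    """Return ids of role-assignment rules that have no back-reference row."""
    rows = conn.execute(
        "SELECT r.id FROM casbin_rule r "
        "LEFT JOIN extended_casbin_rule e ON e.casbin_rule_id = r.id "
        "WHERE e.id IS NULL AND r.ptype = 'g' "  # assumption: 'g' marks role assignments
        "ORDER BY r.id"
    )
    return [row[0] for row in rows]


def sweep_orphans(conn, dry_run=True):
    """Report orphaned rules; delete them only when dry_run is False."""
    orphan_ids = find_orphaned_rule_ids(conn)
    if not dry_run:
        conn.executemany(
            "DELETE FROM casbin_rule WHERE id = ?", [(i,) for i in orphan_ids]
        )
    return orphan_ids


def demo_db():
    """Build a tiny database with one orphaned role assignment (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE casbin_rule (id INTEGER PRIMARY KEY, ptype TEXT, v0 TEXT, v1 TEXT);
        CREATE TABLE extended_casbin_rule (id INTEGER PRIMARY KEY, casbin_rule_id INTEGER);
        INSERT INTO casbin_rule VALUES (1, 'g', 'alice', 'library_admin');
        INSERT INTO casbin_rule VALUES (2, 'g', 'bob', 'library_user');
        INSERT INTO casbin_rule VALUES (3, 'p', 'library_admin', 'manage_library');
        INSERT INTO extended_casbin_rule VALUES (1, 1);
        """
    )
    return conn
```

Defaulting to `dry_run=True` lets an operator review the candidate ids before removing anything.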

Alternatives Considered

Event-Based Consistency: As described in issue #74, we considered using Open edX events. However, content libraries have lifecycle events but users don't, so this would require a mix of events and Django signals. That would create inconsistent behavior and lack transactional guarantees.

Hybrid Strategy: Use the backreference model for data consistency (users, libraries) and events only for async cases like user retirement. This is flexible and our long-term direction, but it was deemed too complex for the MVP.


@openedx-webhooks added the open-source-contribution (PR author is not from Axim or 2U) and core contributor (PR author is a Core Contributor, who may or may not have write access to this repo) labels Oct 16, 2025
openedx-webhooks commented Oct 16, 2025

Thanks for the pull request, @mariajgrimaldi!

This repository is currently maintained by @openedx/committers-openedx-authz.


Comment thread openedx_authz/api/roles.py
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/consistency-mechanism branch from 747652c to 42d62bb Compare October 21, 2025 11:10
@mariajgrimaldi changed the title from "Mjg/consistency mechanism" to "feat: add model to be used as backreference to maintain rules consistent" Oct 21, 2025
@mariajgrimaldi changed the title from "feat: add model to be used as backreference to maintain rules consistent" to "[FC-0099] feat: add model to be used as backreference to maintain rules consistent" Oct 21, 2025
@mariajgrimaldi mariajgrimaldi linked an issue Oct 22, 2025 that may be closed by this pull request
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/consistency-mechanism branch from 42d62bb to 88f8695 Compare October 22, 2025 13:09

@bmtcril (Contributor) left a comment

A couple of open questions, I think:

  1. How to handle wildcard scopes
  2. How to handle objects that do have local database entries

I'm not sure that we have to support number 2, but wanted to raise it for cases like granting or denying access to Aspects dashboards based on RBAC roles or being able to query APIs for certain objects in a Student Information System that uses UUIDs as identifiers.

@mariajgrimaldi mariajgrimaldi marked this pull request as ready for review October 22, 2025 16:49
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/consistency-mechanism branch from 03646c9 to 162adf0 Compare October 22, 2025 17:37

@mariajgrimaldi (Member, Author)

@bmtcril: thanks for raising this!

How to handle wildcard scopes

For wildcard scopes, like all libraries, I was thinking we could set the scope to None. That should be fine since deleting any library wouldn’t affect the user's Casbin rule, as it doesn’t reference a specific one. Does that make sense, or do you see another case we should consider?

How to handle objects that do have local database entries

I’m guessing you meant those that don't have local database entries, right? I think we’d still need some way to reference them, but their deletion flow might work differently, maybe event-based but managed in this app, so that when the upstream object is deleted, the local entry is also removed once we’re notified.

@bmtcril (Contributor) commented Oct 22, 2025

@mariajgrimaldi that all makes sense to me!

@MaferMazu MaferMazu moved this to Ready for testing in RBAC AuthZ Board Oct 22, 2025

@rodmgwgu (Contributor) left a comment

Thanks for the detailed documentation. I haven't finished reviewing the code, but here are my initial impressions:

At first, I didn't like the idea of introducing database models for tracking relations for rules, as in my mind, keeping the authz model decoupled and generic simplifies maintainability and extensibility.

But looking at consistency, I don't currently see a better/most optimal way of doing it other than this approach.

Comment thread openedx_authz/handlers.py Outdated


@receiver(post_delete, sender=ExtendedCasbinRule)
def delete_casbin_rule_on_extended_rule_deletion(sender, instance, **kwargs):

Contributor

Do you know what happens if, for example, we have multiple instances of the LMS (for scalability reasons)? Will the post_delete signal cause this to be executed on all the instances?

As we are just calling .delete() on CasbinRule, this shouldn't be an issue, but would be good to know to take into account for future changes to this handler.

Contributor

@rodmgwgu Signals should take place in a single process, but there is a reload thread in all processes that currently refreshes the policy on a timer. In the future we're hoping to replace that with a more robust solution.

However, there are some cases we should be aware of, like bulk deletes, where these signals are not fired for each object deleted.

Member Author

This is definitely another caveat. This approach only works for application-level operations that aren’t immediately translated into SQL, like bulk operations. However, I’m not too worried about it. We could add a command to clean the Casbin table and ensure the relationship tree stays consistent.

Comment thread openedx_authz/handlers.py Outdated

The handler keeps authorization data symmetric with three common flows:
- Direct ExtendedCasbinRule deletes (API/UI) trigger removal of the linked CasbinRule.
- Cascades from `Scope` or `Subject` deletions clear their ExtendedCasbinRule rows and, via this handler, the matching CasbinRule entries.

Contributor

Are you sure that when ExtendedCasbinRule gets deleted through a CASCADE action, the post_delete event will be triggered?

My understanding is that the cascade deletion is done at the database level, so I'm not sure Django has a way of detecting this? (I haven't tried this so I may be missing something)

Contributor

oh I see that you have a test for that here.

Member Author

Yes! I was also wondering if this would get triggered and which cases we were actually covering, but I tried to be thorough in the integration tests to make sure we cover at least our main use cases.

@mphilbrick211 mphilbrick211 added the FC Relates to an Axim Funded Contribution project label Oct 22, 2025
@mphilbrick211 mphilbrick211 moved this from Needs Triage to In Eng Review in Contributions Oct 22, 2025
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/consistency-mechanism branch from 162adf0 to a11ae03 Compare October 23, 2025 14:21

@mariajgrimaldi (Member, Author)

Thanks for your early review, @rodmgwgu!

Another alternative we considered was making it event based, but we ran into a few blockers when thinking about it long term. We'd need lifecycle events for each scope to cascade deletes to the upstream Casbin rule. That's not terrible by itself, but failure handling would be much more complex than using a lower level approach like foreign keys, which Django manages automatically. It's definitely more coupled but also less error prone in my opinion.

With the current approach, we're only deleting in one direction (CasbinRule -> ExtendedCasbinRule), which feels more manageable. The event based option isn't completely off the table though, since there are cases we can't handle through references, like the user retirement one (#110).

Future scopes would just need to inherit from the base scope model, and Django would keep the Casbin table consistent, with the same caveats we already discussed, which would still apply if we went with the event based approach.

@mariajgrimaldi mariajgrimaldi force-pushed the MJG/consistency-mechanism branch from eaa743e to 9be04f5 Compare November 11, 2025 10:38
Comment thread CHANGELOG.rst
* Implement custom matcher to check for staff and superuser status.

-0.13.1 - 2025-11-10
+0.13.1 - 2025-11-11

Member Author

I've been thinking the entire day that it is the 10th.

@mariajgrimaldi mariajgrimaldi merged commit c5ce1d1 into main Nov 11, 2025
13 of 14 checks passed
@mariajgrimaldi mariajgrimaldi deleted the MJG/consistency-mechanism branch November 11, 2025 19:35
@github-project-automation github-project-automation Bot moved this from Ready for review to Done in RBAC AuthZ Board Nov 11, 2025
@github-project-automation github-project-automation Bot moved this from In Eng Review to Done in Contributions Nov 11, 2025

Labels

core contributor PR author is a Core Contributor (who may or may not have write access to this repo). FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U

Projects

Archived in project
Status: Done

Development

Successfully merging this pull request may close these issues.

Consistency Mechanism

6 participants