feat: History log logic for Components and Containers [FC-0123]#38178
feat: History log logic for Components and Containers [FC-0123]#38178ChrisChV merged 2 commits intoopenedx:masterfrom
Conversation
|
Thanks for the pull request, @ChrisChV! This repository is currently maintained by Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review. 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. DetailsWhere can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources: When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
| path('assets/', blocks.LibraryBlockAssetListView.as_view()), | ||
| path('assets/<path:file_path>', blocks.LibraryBlockAssetView.as_view()), | ||
| path('publish/', blocks.LibraryBlockPublishView.as_view()), | ||
| # Get the draft change history for this block |
There was a problem hiding this comment.
Question: Can these endpoints be moved to xblock/rest_api and abstract the library logic?
I'm afraid no, but we will probably need to implement something like this there, for course xblocks, right?
There was a problem hiding this comment.
I'm afraid there's no time to do it now; we can do it when this is implemented for the courses.
There was a problem hiding this comment.
I think it might make sense to move these endpoints into openedx-core eventually. See openedx/openedx-core#515
The only two things this needs that aren't in core are awareness (1) of learning context / usage keys, and (2) usernames / profile pictures.
ormsbee
left a comment
There was a problem hiding this comment.
I still have a lot more to review, but I wanted to submit this partial review before I turned in for today. Will look through this more in the morning.
ormsbee
left a comment
There was a problem hiding this comment.
Some more questions/suggestions on how LibraryHistoryContributor is handled.
| if old_version.title != new_version.title: | ||
| return "renamed" | ||
| return "edited" |
There was a problem hiding this comment.
Ack. Per @sdaitzman, edits takes precedence over renames in terms of display, i.e. if something is both edited and renamed at the same time, we're supposed to say "edited". But the publishing applet has no way to actually tell if a change in versions was just a rename or actually edit + rename, does it?
I suppose this means that we'll eventually (post-Verawood) want to add yet another field to the change log record that enumerates the kind of change it is, so that we can have better clarity there.
There was a problem hiding this comment.
Correct, there's no way to distinguish a pure rename from a rename + edit with the current data model. We can track this as a follow-up
ormsbee
left a comment
There was a problem hiding this comment.
There's a lot of duplication in my requests between this round and the last. Some of these are low-level nits like variable naming or unnecessary calls to list.
More seriously, it seems like some of this code does nothing useful, e.g. checking for None where it's not possible, de-duplicating entries that are already distinct because they're returned that way from openedx-core APIs. This makes review awkward because I keep on scanning through the openedx-core APIs to see if somehow the states being guarded against are actually possible. There's also a fair amount of code duplication as well, though I don't know how hard refactoring that would be, or if it would lead to too much complexity. (I'm not requesting that you do any large scale refactoring at this time.)
At a higher level, I do worry at how expensive some of these queries look, especially as history grows over time. Do you have a sense for how the history behavior scales as the log grows over time?
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class LibraryHistoryEntry: |
There was a problem hiding this comment.
If this is specifically scoped to the Draft History, let's rename it (and LibraryHistoryContributor) to reflect that.
There was a problem hiding this comment.
Both LibraryHistoryEntry and LibraryHistoryContributor are used across different scopes, including both draft history and publish history. They are general types: LibraryHistoryEntry represents an individual entry in any scope, and LibraryHistoryContributor represents an individual contributor in any scope.
| [ 🛑 UNSTABLE ] Return the combined draft history for a container and all of its descendant | ||
| components, sorted from most-recent to oldest. |
There was a problem hiding this comment.
Am I correct in understanding that there's no way to paginate through this? I am concerned that this will become quite large over time.
There was a problem hiding this comment.
You're right, pagination isn't straightforward here since we're merging and sorting entries from multiple entities in memory. It would require changes to query them together at the DB level. We can track this as a follow-up
It's a valid concern. What would the average case look like for a list of entries in a group (draft or publish)? In my opinion, in the average case, the draft history between two publication events is naturally limited: authors typically don't make hundreds of edits before publishing, so the period between publications remains manageable, and therefore the entries in a publish log as well. The entries in each publish group are loaded when the group is opened; that is also controlled. The worst case would be a library that's never published, where draft history accumulates indefinitely across all descendant components with no cutoff. A large library with many descendants that never gets published could get expensive over time. We can open a follow-up issue to discuss pagination and scaling more thoroughly |
|
@ormsbee Thanks for the review! I've fixed all the nits and responded to your comments. I just need to implement the earliest version. I will implement it first thing tomorrow morning. |
There was a problem hiding this comment.
One minor, non-blocking request. I'm assuming that the test failures are unrelated and will be fixed in a rebase.
Otherwise, please squash your commits and add details from your PR description into your commit message. Please also mention that Claude was used for part of this code in the commit message. Thank you!
2fc3204 to
9142f4e
Compare
**Publish history groups: Pre-Verawood vs Post-Verawood** The `openedx-content` library added a `direct` field to `PublishLogRecord` starting in the Verawood release (openedx/openedx-core#539). This field changes how publish history groups are structured, so the endpoint handles both eras: **Pre-Verawood** (`PublishLogRecord.direct is None`): When a container and its components are published together, each entity produces its own independent group, even though they share the same `publish_log_uuid`. For example, publishing a Unit with 3 components creates 4 separate groups. Each group has `scope_entity_key` set to that specific entity's key, which the frontend must pass to the entries endpoint to fetch that entity's individual changes. **Post-Verawood** (`PublishLogRecord.direct is not None`): The `direct` field marks which entities the user explicitly clicked "Publish" on (`direct=True`) vs. which were pulled in as side effects (`direct=False`, e.g. a shared component published from a sibling container). In this era, all entities from the same `PublishLog` are merged into a single group, and `direct_published_entities` lists only the explicitly published items. The `scope_entity_key` is `null` — the frontend uses the current container key to fetch entries. This design means the frontend does not need era awareness: it always uses `group.scope_entity_key ?? currentContainerKey` when calling the entries endpoint. **Functions** - Implements python api and REST_API functions to get the history log for a component: - `get_library_component_draft_history`: Return the draft change history for a library component since its last publication. - `get_library_component_publish_history`: Return the publish history of a library component as a list of groups. - `get_library_component_publish_history_entries`: Return the individual draft change entries for a specific publish event. - `get_library_component_creation_entry`: Return the creation entry (who created it and when). - Implements python api and REST_API functions to get the history log for containers: - `get_library_container_draft_history`: Return the combined draft history for a container and all of its descendant components. - `get_library_container_publish_history`: Return the publish history of a container as a list of groups. - `get_library_container_publish_history_entries`: Return the individual draft change entries for the container entity in a specific publish event. - `get_library_container_creation_entry`: Return the creation entry for a container. Note: This used Claude's help to write the separation of the Post and Pre-Verawood eras temp
9142f4e to
e917816
Compare
|
Is there a discussion somewhere explaining why we created 8 different endpoints instead of one combined "get entity history" endpoint? Also: the API responses are very repetitive because of the user profile information, e.g. the following can be repeated hundreds of times in a given response: I would suggest we consider a follow up to remove the profile image URLs from the responses. They can either be retrieved via a dedicated API, or use a standard redirect approach (e.g. |
**Publish history groups: Pre-Verawood vs Post-Verawood** The `openedx-content` library added a `direct` field to `PublishLogRecord` starting in the Verawood release (openedx/openedx-core#539). This field changes how publish history groups are structured, so the endpoint handles both eras: **Pre-Verawood** (`PublishLogRecord.direct is None`): When a container and its components are published together, each entity produces its own independent group, even though they share the same `publish_log_uuid`. For example, publishing a Unit with 3 components creates 4 separate groups. Each group has `scope_entity_key` set to that specific entity's key, which the frontend must pass to the entries endpoint to fetch that entity's individual changes. **Post-Verawood** (`PublishLogRecord.direct is not None`): The `direct` field marks which entities the user explicitly clicked "Publish" on (`direct=True`) vs. which were pulled in as side effects (`direct=False`, e.g. a shared component published from a sibling container). In this era, all entities from the same `PublishLog` are merged into a single group, and `direct_published_entities` lists only the explicitly published items. The `scope_entity_key` is `null` — the frontend uses the current container key to fetch entries. This design means the frontend does not need era awareness: it always uses `group.scope_entity_key ?? currentContainerKey` when calling the entries endpoint. **Functions** - Implements python api and REST_API functions to get the history log for a component: - `get_library_component_draft_history`: Return the draft change history for a library component since its last publication. - `get_library_component_publish_history`: Return the publish history of a library component as a list of groups. - `get_library_component_publish_history_entries`: Return the individual draft change entries for a specific publish event. - `get_library_component_creation_entry`: Return the creation entry (who created it and when). - Implements python api and REST_API functions to get the history log for containers: - `get_library_container_draft_history`: Return the combined draft history for a container and all of its descendant components. - `get_library_container_publish_history`: Return the publish history of a container as a list of groups. - `get_library_container_publish_history_entries`: Return the individual draft change entries for the container entity in a specific publish event. - `get_library_container_creation_entry`: Return the creation entry for a container. Note: This used Claude's help to write the separation of the Post and Pre-Verawood eras
|
@ChrisChV could you please let me know your thoughts about the API endpoints and moving the avatar info ^ ? |
Description
Publish history groups: Pre-Verawood vs Post-Verawood
The
openedx-contentlibrary added adirectfield toPublishLogRecordstarting in the Verawood release (openedx/openedx-core#539). This field changes how publish history groups are structured, so the endpoint handles both eras:Pre-Verawood (
PublishLogRecord.direct is None): When a container and its components are published together, each entity produces its own independent group, even though they share the samepublish_log_uuid. For example, publishing a Unit with 3 components creates 4 separate groups. Each group hasscope_entity_keyset to that specific entity's key, which the frontend must pass to the entries endpoint to fetch that entity's individual changes.Post-Verawood (
PublishLogRecord.direct is not None): Thedirectfield marks which entities the user explicitly clicked "Publish" on (direct=True) vs. which were pulled in as side effects (direct=False, e.g. a shared component published from a sibling container). In this era, all entities from the samePublishLogare merged into a single group, anddirect_published_entitieslists only the explicitly published items. Thescope_entity_keyisnull— the frontend uses the current container key to fetch entries.This design means the frontend does not need era awareness: it always uses
group.scope_entity_key ?? currentContainerKeywhen calling the entries endpoint.Functions
get_library_component_draft_history: Return the draft change history for a library component since its last publication.get_library_component_publish_history: Return the publish history of a library component as a list of groups.get_library_component_publish_history_entries: Return the individual draft change entries for a specific publish event.get_library_component_creation_entry: Return the creation entry (who created it and when).get_library_container_draft_history: Return the combined draft history for a container and all of its descendant components.get_library_container_publish_history: Return the publish history of a container as a list of groups.get_library_container_publish_history_entries: Return the individual draft change entries for the container entity in a specific publish event.get_library_container_creation_entry: Return the creation entry for a container.List of features (TODOs):
Supporting information
history logapi functions [FC-0123] openedx-core#501Testing instructions
Follow the testing instructions in openedx/frontend-app-authoring#2948
Deadline
Before the Verawood cut.
Other information
I used Claude's help to write the separation of the Post and Pre-Verawood eras