Backup and Restore Final Adjustments#406
Conversation
|
Thanks for the pull request, @dwong2708! This repository is currently maintained by Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review. 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. DetailsWhere can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources: When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
| load_dump_zip_file(file_name) | ||
| message = f'{file_name} loaded successfully' | ||
| start_time = time.time() | ||
| response = load_dump_zip_file(file_name) |
There was a problem hiding this comment.
"response" might be misinterpreted to be related to HTTP request/response. If there's no better, more specific thing to call this, we can call this result for now. But load_dump_zip_file is also confusing function name, because "load" and "dump" are often opposite of each other (e.g. json.load(), json.dump())
There was a problem hiding this comment.
Right. I have adjusted. Thanks
| return { | ||
| "status": "error", | ||
| "log_file_error": self._write_errors(), # return a StringIO with the errors | ||
| "general_info": None | ||
| } |
There was a problem hiding this comment.
It would be useful to define an actual data structure here to return, like a dataclass.
It's also not clear what "general_info" means in this context.
There was a problem hiding this comment.
I agree. I've added it. Thanks
| @@ -425,12 +429,23 @@ def __init__(self, zipf: zipfile.ZipFile) -> None: | |||
| @transaction.atomic | |||
| def load(self) -> dict[str, Any]: | |||
There was a problem hiding this comment.
We're going to want an arg here to allow the target key of the LearningPackage to be passed in here.
There was a problem hiding this comment.
Though that can be a separate PR if you want.
There was a problem hiding this comment.
Might also be an optional param to the init() instead.
There was a problem hiding this comment.
Trying to do this before this PR can be merged. Otherwise I can do other PR.
There was a problem hiding this comment.
According to this comment the staged key generation logic was added.
| "general_info": { | ||
| "learning_package_key": learning_package.key, | ||
| "learning_package_title": learning_package.title, | ||
| "containers": num_containers, |
There was a problem hiding this comment.
We're probably going to want the breakdown of sections/subsections/units here, not just total containers. I think you can just leave this out until we get UX guidance.
| } | ||
|
|
||
| def preliminary_check(self) -> Tuple[list[dict[str, Any]], dict[str, Any]]: | ||
| """Performs a preliminary check of the zip file structure and mandatory files.""" |
There was a problem hiding this comment.
If this function is only doing a check for the existence of the package file, then please rename it accordingly. The term "preliminary check" is overly vague.
| for valid_draft in containers.get("unit_drafts", []): | ||
| entity_key = valid_draft.pop("entity_key") | ||
| version_num = valid_draft["version_num"] # Should exist, validated earlier | ||
| entity_version_identifier = f"{entity_key}__v{version_num}" |
There was a problem hiding this comment.
Please use tuples instead, so that other code later doesn't have to try to parse it back out. Tuples work fine as dict keys.
There was a problem hiding this comment.
Changes have been applied. Thanks
| entity_key = valid_draft.pop("entity_key") | ||
| version_num = valid_draft["version_num"] # Should exist, validated earlier | ||
| entity_version_identifier = f"{entity_key}__v{version_num}" | ||
| if entity_version_identifier in self.all_published_entities_versions: |
There was a problem hiding this comment.
You could make a helper method here and avoid having to comment about the same pattern multiple times, e.g.
| if entity_version_identifier in self.all_published_entities_versions: | |
| if self.version_already_exists(entity_version_identifier): # then the comment goes in the helper |
…omponents TOML files
- Added dataclass to represent load process result - Improved docstrings - Simplified _is_version_already_exists logic
b25c186 to
0931130
Compare
ormsbee
left a comment
There was a problem hiding this comment.
A few minor requests to adjust the interface and try to reduce edge cases.
Thank you!
|
|
||
|
|
||
| def load_dump_zip_file(path: str) -> None: | ||
| def load_library_from_zip(path: str, user: UserType | None = None, use_staged_lp_key: bool = False) -> dict: |
There was a problem hiding this comment.
A few things on this:
- This is a LearningPackage, not a Library. The distinction isn't important now, but it will be once we start storing courses in LC in the coming months.
load_learning_package()is fine. - Instead of making it take a
used_stage_lp_keyargument, please have it take akeyarg and generate ifkey==None. The default behavior of this function should not trust the key that was dumped in the archive itself, and it will be a potential security issue if people do trust that later on.
There was a problem hiding this comment.
So maybe something like:
| def load_library_from_zip(path: str, user: UserType | None = None, use_staged_lp_key: bool = False) -> dict: | |
| def load_learning_package(path: str, key: str = None, user: UserType | None = None) -> dict: |
There was a problem hiding this comment.
With this update, we’ll have two cases:
- Key provided: The user doesn’t need to generate the staged key.
- Key not provided: The user must generate the staged key.
Changes applied.
|
|
||
| The timestamp at the end ensures the key is unique. | ||
| """ | ||
| username = user.username if user else DEFAULT_USERNAME |
There was a problem hiding this comment.
Don't use a DEFAULT_USERNAME because "command" (or anything like that) is a valid username that someone could plausibly pick. It may be that nothing you have here is a security concern, but at some point, someone is likely to add something that assumes that a user has access to things that are under their username and won't check this edge case. It's also likely to not get cleaned up properly.
Honestly, I'd just force the user to put in a username when invoking the management command--it will reduce edge cases, and if someone has shell access to run management commands, they basically have access to everything anyway.
| Generate a staged learning package key based on the given base key. | ||
|
|
||
| Arguments: | ||
| lp_key (str): The base key of the learning package. |
There was a problem hiding this comment.
We should specify that this is the archive's LearningPackage key. Maybe even call it archive_lp_key to make sure it's not confused.
|
|
||
| _, org_slug, lp_slug = parts[:3] | ||
| timestamp = int(time.time() * 1000) # Current time in milliseconds | ||
| return f"lib-restore:{username}:{org_slug}:{lp_slug}:{timestamp}" |
There was a problem hiding this comment.
Since this temp name is coming from LC now instead of from libraries code (when I initially sketched it out), please make a prefix that's not "lib", so something like lp-restore.
Resolves: #386
Description
This PR introduces improvements to the validation process for restoring learning packages, along with minor adjustments to the backup process.
Changes
Backup
uuidfield[entity.container]section from component TOML filesRestore
loadAPI to provide clearer information to the restore endpoint