fix: include filename in UnicodeDecodeError during vault gather (#32) #60

Open
MohammadYusif wants to merge 1 commit into mfarragher:main from MohammadYusif:fix/issue-32

Summary

  • When a vault note contains non-UTF-8 bytes, reading it raised a bare error with no indication of which file was at fault, which is unworkable for a user with tens of thousands of notes (see issue #32).
  • Worse, the UnicodeDecodeError raised by f.read() in _get_md_front_matter_and_content was swallowed by the trailing bare except:, whose handler tried to return {}, file_string even though file_string had never been assigned. The user therefore saw a confusing UnboundLocalError instead of the real decode error.
  • This fix catches UnicodeDecodeError explicitly and re-raises it with the offending filepath appended to the message, so the failing file is immediately identifiable.
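A minimal sketch of the failure mode and the fix described above, not the library's actual code. The functions read_note_buggy and read_note_fixed are hypothetical stand-ins for _get_md_front_matter_and_content, and front-matter parsing is left out so that only the exception handling is shown.

```python
def read_note_buggy(filepath):
    """Hypothetical stand-in for the unpatched pattern."""
    try:
        with open(filepath, encoding="utf-8") as f:
            file_string = f.read()  # raises UnicodeDecodeError on invalid bytes
        return {}, file_string
    except:  # noqa: E722 (bare except kept on purpose to reproduce the bug)
        # file_string was never assigned, so this raises UnboundLocalError
        # and hides the original decode error from the user.
        return {}, file_string


def read_note_fixed(filepath):
    """Hypothetical stand-in for the patched pattern."""
    try:
        with open(filepath, encoding="utf-8") as f:
            file_string = f.read()
        return {}, file_string
    except UnicodeDecodeError as e:
        # Rebuild the same exception type with the path appended to the
        # reason, so the failing file is named in the message.
        raise UnicodeDecodeError(
            e.encoding, e.object, e.start, e.end,
            f"{e.reason} (in file: {filepath})",
        ) from e
    except Exception:
        # Assumed fallback for other failures; the real handler may differ.
        return {}, ""
```

Calling read_note_buggy on a file containing a stray 0xff byte raises UnboundLocalError, while read_note_fixed raises a UnicodeDecodeError whose message ends with the file path.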

Changes

  • obsidiantools/md_utils.py: in _get_md_front_matter_and_content, added an explicit except UnicodeDecodeError clause that re-raises a UnicodeDecodeError with (in file: <filepath>) appended to the reason. The clause sits ahead of the bare except: so the decode error is no longer swallowed.
  • tests/test_md_utils.py: added test_non_utf8_file_error_includes_filepath, which writes a temp file with an invalid 0xff byte, calls the public get_front_matter, and asserts the raised UnicodeDecodeError message contains the filepath. Verified this test fails on the unpatched code (it raised UnboundLocalError) and passes with the fix.
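The shape of such a regression test can be sketched as follows. This is an illustration under assumptions rather than the test added in the PR: load_note is a hypothetical stand-in for the patched reader (the real test calls get_front_matter), and unittest is used here only to keep the example self-contained.

```python
import tempfile
import unittest
from pathlib import Path


def load_note(filepath):
    """Hypothetical stand-in for the patched reader; not the library's code."""
    try:
        return Path(filepath).read_text(encoding="utf-8")
    except UnicodeDecodeError as e:
        raise UnicodeDecodeError(
            e.encoding, e.object, e.start, e.end,
            f"{e.reason} (in file: {filepath})",
        ) from e


class NonUtf8FileTest(unittest.TestCase):
    def test_error_includes_filepath(self):
        with tempfile.TemporaryDirectory() as tmp:
            note = Path(tmp) / "bad_note.md"
            # 0xff can never appear in valid UTF-8, so decoding must fail.
            note.write_bytes(b"---\ntitle: x\n---\n\xff\n")
            with self.assertRaises(UnicodeDecodeError) as ctx:
                load_note(note)
            # The message should name the offending file.
            self.assertIn(str(note), str(ctx.exception))
```

Saved as a module, this can be run with python -m unittest. Asserting on the exception type as well as the message guards against the earlier behaviour, where a different exception (UnboundLocalError) surfaced instead.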

Fixes #32

Development

Successfully merging this pull request may close these issues.

UnicodeDecodeError doesn't provide file details