fix(deps): update pdf-extract 0.7 -> 0.8 to fix PDF parsing crashes (#4) by andrehrferreira · Pull Request #5 · hivellm/transmutation

andrehrferreira · 2026-06-18T20:56:15Z

Description

Certain PDF files caused the application to crash during parsing with pdf-extract 0.7.x. This bumps pdf-extract to 0.8 (resolves to 0.8.2), which fixes the upstream parsing bugs. The extraction API (extract_text, extract_text_from_mem) is unchanged, so no source changes were needed. Also bumps the crate version to 0.3.3 and updates the CHANGELOG.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Performance improvement
Code refactoring

Related Issue

Fixes #4

Changes Made

Updated pdf-extract dependency from 0.7 to 0.8 (resolves to 0.8.2) in Cargo.toml
Bumped crate version from 0.3.2 to 0.3.3
Added a 0.3.3 bugfix entry to CHANGELOG.md
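
For readers wondering why a `0.8` requirement resolves to 0.8.2 but does not pick up 0.10.0: Cargo treats a bare version as a caret requirement, and for `0.x` versions the minor number acts as the compatibility boundary, so `0.8` means `>=0.8.0, <0.9.0`. The following is a minimal sketch of that rule using only the standard library. The helper names `parse_version` and `caret_matches` are hypothetical and are not part of Cargo or this crate; pre-release tags and `0.0.x` requirements are deliberately left out, and the version strings in `main` are illustrative inputs rather than a list of published releases.

```rust
/// Parse "MAJOR.MINOR" or "MAJOR.MINOR.PATCH"; a missing patch is treated as 0.
/// Hypothetical helper for illustration only (no pre-release or build metadata).
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = match parts.next() {
        Some(p) => p?,
        None => 0,
    };
    if parts.next().is_some() {
        return None; // more than three components is not supported here
    }
    Some((major, minor, patch))
}

/// Check whether `version` satisfies the caret requirement `req`.
/// Returns None for unparsable input and for 0.0.x requirements,
/// which follow a stricter rule that this sketch does not cover.
fn caret_matches(req: &str, version: &str) -> Option<bool> {
    let (rmaj, rmin, rpatch) = parse_version(req)?;
    let (vmaj, vmin, vpatch) = parse_version(version)?;
    if rmaj == 0 && rmin == 0 {
        return None;
    }
    let ok = if rmaj > 0 {
        // 1.x and above: same major, anything at or above the requirement.
        vmaj == rmaj && (vmin, vpatch) >= (rmin, rpatch)
    } else {
        // 0.x: the minor version is the compatibility boundary.
        vmaj == 0 && vmin == rmin && vpatch >= rpatch
    };
    Some(ok)
}

fn main() {
    // Illustrative candidates checked against the requirement "0.8".
    for v in ["0.7.0", "0.8.0", "0.8.2", "0.10.0"] {
        println!("\"0.8\" accepts {}: {:?}", v, caret_matches("0.8", v));
    }
}
```

Under this rule, moving to a later minor line would need an explicit requirement change in Cargo.toml, which is why the bump from 0.7 to 0.8 has to be made by hand.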

Testing

Unit tests added/updated
Integration tests added/updated
Manual testing performed
Benchmarks run (if performance-related)

Verified locally:

cargo build succeeds with pdf-extract 0.8.2 — no API changes required.
Full test suite passes (88 passed; 0 failed).
Manually re-ran the test_pdf_extract example against data/1706.03762v7.pdf; the PDF parses and extracts text without crashing.
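
The manual check above could be turned into an automated "does not crash" assertion. The sketch below assumes the crash surfaces as a Rust panic (not a process abort, and not a build with `panic = "abort"`), and wraps an arbitrary parse function in `std::panic::catch_unwind`. The name `parse_without_panic` and the stand-in closures are hypothetical; a real check would pass a closure that calls the crate's actual extraction path on the PDF bytes.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Run `parse` on `bytes` and turn a panic into an `Err`, so a test can
/// assert that parsing does not crash without taking down the test run.
/// Hypothetical helper: not part of pdf-extract or this crate.
fn parse_without_panic<F>(bytes: &[u8], parse: F) -> Result<String, String>
where
    F: FnOnce(&[u8]) -> Result<String, String>,
{
    // AssertUnwindSafe is acceptable here because nothing captured by the
    // closure is used again after a panic.
    match panic::catch_unwind(AssertUnwindSafe(|| parse(bytes))) {
        Ok(result) => result,
        Err(_) => Err("parser panicked".to_string()),
    }
}

fn main() {
    // Stand-in parsers for illustration only.
    let good = |b: &[u8]| -> Result<String, String> { Ok(format!("{} bytes", b.len())) };
    let bad = |_: &[u8]| -> Result<String, String> { panic!("simulated parser crash") };

    println!("{:?}", parse_without_panic(b"%PDF-1.7", good));
    println!("{:?}", parse_without_panic(b"%PDF-1.7", bad));
}
```

The default panic hook still prints the panic message to stderr; the helper only keeps the caller running so the outcome can be asserted.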

Checklist

My code follows the project's style guidelines
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged and published

Performance Impact

Conversion speed: No measurable change (dependency-only update)
Memory usage: No measurable change
Binary size: Negligible change

Additional Notes

This is a dependency-only fix — no source code changes. pdf-extract 0.10.0 is also available, but this PR stays within the requested 0.8.x line for a minimal, low-risk update. Note: the extraction example still prints upstream Unicode mismatch / missing char warnings from pdf-extract, which are non-fatal and unrelated to the crash.

Certain PDF files caused the application to crash during parsing with pdf-extract 0.7.x. Bump pdf-extract to 0.8 (resolves to 0.8.2), which fixes the upstream parsing bugs. The extraction API (extract_text, extract_text_from_mem) is unchanged, so no source changes were needed. Bump version to 0.3.3 and update CHANGELOG.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>

andrehrferreira merged commit 5241af0 into main Jun 18, 2026
12 of 13 checks passed

andrehrferreira merged 1 commit into main from fix/issue-4-pdf-extract-0.8
