Fix off-by-one error in parse_range_str page range handling #107

Open
yijinlee wants to merge 1 commit into datalab-to:master from yijinlee:patch-1
@yijinlee

parse_range_str returns 1-based page numbers, but load_pdf_images iterates with a 0-based index. This causes --page-range 1 to return page 2, --page-range 2 to return page 3, etc.

The fix is in parse_range_str, in chandra/input.py.
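
For illustration only, here is a minimal sketch of the kind of conversion such a fix could perform. It is not the actual code in chandra/input.py: the function name parse_range_zero_based is hypothetical, and the accepted syntax (comma-separated page numbers and inclusive start-end spans) is an assumption.

```python
from typing import List


def parse_range_zero_based(range_str: str) -> List[int]:
    """Hypothetical helper: turn a 1-based range string such as "1,3-5"
    into sorted, de-duplicated 0-based page indices."""
    indices = set()
    for part in range_str.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_s, end_s = part.split("-", 1)
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Subtract 1 from the start so user-facing page N maps to index N - 1;
        # range() excludes its upper bound, so `end` stays as-is.
        indices.update(range(start - 1, end))
    return sorted(indices)
```

With this sketch, a request for page 1 yields index 0, which matches a loader that counts pages from zero.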

Steps to reproduce error:

  1. Take a PDF where page 1 and page 2 have distinct content
  2. Run chandra input.pdf ./output --page-range 1
  3. Output contains page 2's content instead of page 1's
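
The steps above can be mirrored without a PDF. The sketch below uses a plain list as a stand-in for the document and a hypothetical select_pages helper that filters by 0-based index, which is how the description characterizes load_pdf_images. Neither name nor the filtering logic is taken from the chandra code base.

```python
# Stand-in for a three-page PDF; a real run would load page images instead.
pages = ["page 1 content", "page 2 content", "page 3 content"]


def select_pages(pages, indices):
    """Hypothetical loader: keep page i when its 0-based index i is requested."""
    return [page for i, page in enumerate(pages) if i in indices]


requested = 1  # what the user passes as --page-range 1

# Bug: the 1-based page number is used directly as a 0-based index.
buggy = select_pages(pages, [requested])

# Fix: convert to a 0-based index before selecting.
fixed = select_pages(pages, [requested - 1])

print(buggy)
print(fixed)
```

The buggy selection returns the second page and the converted one returns the first, which is the mismatch reported in step 3.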
