Tables: tolerate trailing whitespace on rows; reject empty delimiter cells #247
Merged
…cells

Two table delimiter-row divergences (carve-js and carve-rs agreed; the PHP impl was the outlier), ported from carve-php commit 28d4c10:

1. Trailing whitespace after a row's closing pipe broke recognition. A line like `| a |` followed by spaces, or a separator `|---|` followed by spaces, was not treated as a table row, so a table with a trailing-whitespace separator split into separate blocks and the separator rendered as a paragraph. Trailing spaces/tabs after the closing pipe are now stripped before the structural checks (isTableRow, isSeparatorRow, the cell parsers, and the BlockParser row loop).

2. A delimiter row with an empty cell (`|---||`) was accepted as a header separator. isSeparatorRow used a character class that put `|` inside it and so never validated per cell. It now splits the row into cells and requires each to be a delimiter cell (optional whitespace, optional leading `:`, one or more `-`, optional trailing `:`); an empty cell or any other content disqualifies the row, which then stays an ordinary data row.

Behavior now matches the JS and Rust implementations on these delimiter-row edge cases.
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #247 +/- ##
============================================
- Coverage 92.06% 92.05% -0.01%
- Complexity 3571 3576 +5
============================================
Files 107 107
Lines 10118 10131 +13
============================================
+ Hits 9315 9326 +11
- Misses 803 805 +2
Two table delimiter-row divergences where the PHP implementation was the outlier (carve-js and carve-rs agreed). Ported from carve-php commit 28d4c10.
1. Trailing whitespace on table rows
Trailing spaces or tabs after a row's closing pipe broke row recognition. A line like `| a |` followed by spaces, or a separator `|---|` followed by spaces, was not treated as a table row, so a table with a trailing-whitespace separator split into separate blocks and the separator rendered as a paragraph.

Trailing whitespace after the closing pipe is insignificant and is now stripped before the structural checks (`isTableRow`, `isSeparatorRow`, the cell parsers, and the `BlockParser` row loop).
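As an illustration of the stripping step, here is a minimal Python sketch. It is not the PHP code from this PR: `strip_row_trailing_whitespace` and `is_table_row` are hypothetical names, and the row check is a simplified stand-in that assumes a row must start and end with a pipe.

```python
def strip_row_trailing_whitespace(line: str) -> str:
    """Drop spaces and tabs that follow the closing pipe (hypothetical helper)."""
    return line.rstrip(" \t")


def is_table_row(line: str) -> bool:
    """Simplified stand-in for isTableRow (assumption: a row starts and ends with '|')."""
    row = strip_row_trailing_whitespace(line)
    return len(row) >= 2 and row.startswith("|") and row.endswith("|")
```

With the strip applied first, `is_table_row("| a |   ")` and `is_table_row("|---|\t")` both hold, whereas a check that inspects the raw last character would reject them.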
2. Empty delimiter cell
A delimiter row with an empty cell (`|---||`) was wrongly accepted as a header separator. The detection used a character class that placed `|` inside it and so never validated per cell. It now splits the row into cells and requires each to be a delimiter cell (optional whitespace, optional leading colon, one or more dashes, optional trailing colon). An empty cell or any other content disqualifies the row, which then stays an ordinary data row.

Behavior now matches the JS and Rust implementations on these delimiter-row edge cases.
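The per-cell validation can be sketched in Python as follows. This is an approximation under stated assumptions, not the PR's PHP implementation: the name `is_separator_row`, the `DELIMITER_CELL` pattern, and the allowance for whitespace on both sides of a cell are illustrative choices, and escaped pipes are ignored.

```python
import re

# One delimiter cell: optional whitespace, optional leading colon,
# one or more dashes, optional trailing colon, optional whitespace.
# (Assumed pattern; the actual implementation may differ in detail.)
DELIMITER_CELL = re.compile(r"[ \t]*:?-+:?[ \t]*")


def is_separator_row(line: str) -> bool:
    """Return True only if every cell of the row is a delimiter cell."""
    row = line.rstrip(" \t")  # trailing whitespace after the closing pipe is insignificant
    if len(row) < 2 or not (row.startswith("|") and row.endswith("|")):
        return False
    # Split the content between the outer pipes into cells and check each one.
    cells = row[1:-1].split("|")
    return all(DELIMITER_CELL.fullmatch(cell) is not None for cell in cells)
```

Under this sketch, `|---|` and `| :-- | --: |` are accepted, while `|---||` is rejected because its second cell is empty and cannot match `-+`.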