Move html character converting mechanism to regexp-handling #1570

Open
tompng wants to merge 1 commit into
ruby:master from
tompng:html_chars_regexp_handling
Conversation

@tompng
Member

@tompng tompng commented Jan 19, 2026

`Text#to_html_characters` was a postprocessing step that converted ASCII quotes and marks to multibyte characters. Postprocessing HTML to do that is not a good idea; converting plain text nodes with regexp-handling is better.

`Text#to_html_characters` was a postprocess that converts ASCII quotes/marks to multibyte characters.
Postprocessing HTML to do that is not a good idea. Converting plain text nodes is better.
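
To illustrate the idea, here is a minimal standalone sketch of converting marks inside plain text nodes only, so markup such as code spans is never touched. It does not use RDoc's API: `SKETCH_REPLACEMENTS`, `SKETCH_PATTERN`, `convert_plain_text`, and the `[kind, body]` node shape are all hypothetical names chosen for this example, and HTML escaping is omitted for brevity.

```ruby
# Hypothetical replacement table; RDoc keeps its own per-encoding mapping.
SKETCH_REPLACEMENTS = {
  '(c)' => '©',
  '...' => '…',
}.freeze

# Regexp.union escapes each key, so "(c)" and "..." match literally.
SKETCH_PATTERN = Regexp.union(SKETCH_REPLACEMENTS.keys)

# Convert ASCII marks in one plain text node.
def convert_plain_text(text)
  text.gsub(SKETCH_PATTERN) { |match| SKETCH_REPLACEMENTS[match] }
end

# Assumed node shape: [kind, body]. Only :text nodes are converted;
# :code nodes are wrapped verbatim.
nodes = [[:text, 'Copyright (c) 2026...'], [:code, "puts '(c)...'"]]

html = nodes.map { |kind, body|
  kind == :text ? convert_plain_text(body) : "<code>#{body}</code>"
}.join(' ')

puts html
```

Because the conversion runs before any tags exist, there is no need to parse or skip over generated HTML afterwards.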
@tompng tompng temporarily deployed to fork-preview-protection January 19, 2026 17:54 — with GitHub Actions Inactive
Comment thread lib/rdoc/text.rb
h[encoding] = {
:close_dquote => encode_fallback('”', encoding, '"'),
:close_squote => encode_fallback('’', encoding, '\''),
:copyright => encode_fallback('©', encoding, '(c)'),
Member Author

I think html character is `&copy;`, not `©`, but it's another issue.

@matzbot
Collaborator

matzbot commented Jan 19, 2026

🚀 Preview deployment available at: https://92299bff.rdoc-6cd.pages.dev (commit: 6eb654a)


TO_HTML_CHARACTERS = Hash.new do |h, encoding|
h[encoding] = {
:close_dquote => encode_fallback('”', encoding, '"'),
Member

Can we use Ruby 1.9 hash syntax?
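
For reference, a small sketch of what the suggestion amounts to. The variable names are made up for this example and the values are shortened; the point is only that the two literals build equal hashes.

```ruby
# Hash-rocket style, as in the diff above (values simplified).
old_style = { :close_dquote => '”', :close_squote => '’' }

# Ruby 1.9 style for the same symbol keys.
new_style = { close_dquote: '”', close_squote: '’' }

puts old_style == new_style

# Keys that are not symbol literals still need the rocket,
# e.g. a hash keyed by a variable holding a string.
character = '©'
by_variable = { character => '(c)' }
puts by_variable['©']
```

The change is purely syntactic, which is why it could also be applied mechanically in a separate PR.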

:close_squote => encode_fallback('’', encoding, '\''),
:copyright => encode_fallback('©', encoding, '(c)'),
:ellipsis => encode_fallback('…', encoding, '...'),
:dot_ellipsis => encode_fallback('.…', encoding, '....'),
Member

nit: I think we can align this one too:

Suggested change
:dot_ellipsis => encode_fallback('.…', encoding, '....'),
:dot_ellipsis => encode_fallback('.…', encoding, '....'),

# Transcodes +character+ to +encoding+ with a +fallback+ character.

def self.encode_fallback(character, encoding, fallback)
character.encode(encoding, :fallback => { character => fallback },
Member

nit: 1.9 hash syntax. Or we can do it via a cop in a separate PR too
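
As a rough illustration of the transcoding being discussed, the sketch below uses the `fallback:` option of `String#encode` in 1.9 keyword style. `encode_with_fallback` is a hypothetical helper written for this example; it is not the method from the diff, whose remaining arguments are cut off above and are not reproduced here.

```ruby
# Hypothetical helper: transcode +character+ to +encoding+, substituting
# +fallback+ when the target encoding cannot represent the character.
def encode_with_fallback(character, encoding, fallback)
  # The fallback hash is keyed by the undefined character itself,
  # so the rocket is still required for that inner hash.
  character.encode(encoding, fallback: { character => fallback })
end

ascii = encode_with_fallback('©', 'US-ASCII', '(c)')
utf8  = encode_with_fallback('©', 'UTF-8', '(c)')

puts ascii # the ASCII fallback, since US-ASCII has no copyright sign
puts utf8  # the original character, since UTF-8 can represent it
```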

@insquotes = true
end
end
TO_HTML_CHARACTERS[quote.encoding][type] if type
Member

Could TO_HTML_CHARACTERS[quote.encoding] ever be nil and cause a NoMethodError?
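
One way to reason about this question: the diff shows the constant built with `Hash.new` and a block, and a hash built that way runs the block for any missing key instead of returning nil. The sketch below uses a hypothetical stand-in, `SKETCH_TABLE`, with a single simplified entry. It does not cover the case where building the inner table itself raises.

```ruby
# Hypothetical stand-in for a per-encoding table built with a default block.
SKETCH_TABLE = Hash.new do |h, encoding|
  # The block stores the inner hash and returns it (the value of the assignment).
  h[encoding] = { copyright: '©' }
end

cached_before = SKETCH_TABLE.key?(Encoding::UTF_8) # nothing stored yet
inner         = SKETCH_TABLE[Encoding::UTF_8]      # block runs on first access
cached_after  = SKETCH_TABLE.key?(Encoding::UTF_8) # now stored

# The inner hash is a plain hash, so an unknown type yields nil, not an error.
missing = inner[:unknown]

p cached_before, inner, cached_after, missing
```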

type = @in_dquote ? :close_dquote : :open_dquote
@in_dquote = !@in_dquote
when "'"
if @insquotes
Member

Should these be @in_squote?
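
Beyond consistency with `@in_dquote`, the naming matters because Ruby does not raise on an unassigned instance variable. The sketch below shows what would happen if state were written under one name and read under the other; `QuoteStateSketch` and its methods are hypothetical and are not taken from the PR.

```ruby
# Hypothetical class showing how a mismatched ivar name fails silently.
class QuoteStateSketch
  def initialize
    @in_squote = true # state stored under one spelling
  end

  def consistent_read
    @in_squote # reads the value assigned above
  end

  def mismatched_read
    @insquotes # a different, never-assigned variable: evaluates to nil
  end
end

state = QuoteStateSketch.new
p state.consistent_read
p state.mismatched_read
```

Since nil is falsy, a branch such as `if @insquotes` would simply never be taken in that situation, with no exception to point at the typo.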

3 participants