Skip to content

Latest commit

 

History

History
166 lines (113 loc) · 9.05 KB

File metadata and controls

166 lines (113 loc) · 9.05 KB

Critical Analysis: WP_HTML_Template vs. Ticket #60229 Requirements

I've examined ticket #60229, the current branch implementation, and the test suite. Here's my analysis:


What's Working Well

1. Core Safety Goal Achieved The implementation delivers on the primary promise: context-aware auto-escaping. The code correctly escapes &, <, >, ", ' in both text and attribute contexts. Tests confirm placeholders can't be injected via replacement values (non-recursive replacement).

2. Funky Comment Syntax The </%placeholder> syntax is a smart choice—it's valid HTML (parsed as a funky comment), can't be nested by construction, and is visually distinctive. The parsing handles whitespace around names correctly.

3. Nested Templates Template composition (WP_HTML_Template as replacement value) works for text context with proper escaping propagation. The rejection of templates in attribute context is the right safety call.

4. Compile-once Design Lazy compilation with cached edits ($edits array) is efficient for template reuse.

5. Boolean Attribute Support Supply true to create a boolean attribute (disabled="</%d>" + truedisabled), or false/null to remove an attribute entirely. Only works for whole-attribute placeholders—partial placeholders reject boolean values.


What's Missing from the Ticket Requirements

1. No URL Escaping (Ticket TODO) The ticket explicitly says "does not escape URLs differently than other attributes." The XSS test shows javascript:alert("xss") only escapes quotes—no esc_url() equivalent. This is a security gap for href/src attributes.

2. No Attribute Spread Ticket comment 9 discusses "spread" attributes for making tags placeholders. Not implemented.

3. Missing Output Format Methods Ticket TODO lists ->final_output_to_browser(), ->final_output_to_plaintext(), ->final_output_to_markdown(), etc. None exist.

4. Embed Replacement in Tag Processor (Ticket TODO) The ticket wants replacement embedded in the Tag Processor. Current implementation uses a separate class with its own parsing pass.


Overlooked Issues in the Ticket

1. RAWTEXT/RCDATA Element Handling Is Half-Baked Tests show placeholders inside <script>, <style> are preserved literally, while <title>, <textarea> escape them as text. Neither allows actual replacement. The ticket doesn't acknowledge this limitation clearly. If I write a template for a script tag's content, I'd expect placeholders to work.

2. Table Context Parsing (Solvable with Private API) Test explicitly skipped: "IN TABLE templates are not supported yet." However, investigation reveals this is solvable using existing private APIs.

Key finding: Funky comment placeholders (</%name>) are explicitly handled in table contexts! From class-wp-html-processor.php line 3279-3283:

case '#comment':
case '#funky-comment':
case '#presumptuous-tag':
    $this->insert_html_element( $this->state->current_token );
    return true;

The "foster parenting bail" only happens for:

  1. Non-whitespace text directly inside table/tbody/thead/tfoot/tr
  2. Non-table elements like <div> inside table structure

The real blocker: The public create_fragment() API artificially rejects non-body contexts (line 296):

if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
    return null;
}

Private API path: create_fragment_at_current_node() (line 477) does the right thing:

  1. Takes current element as context
  2. Calls reset_insertion_mode_appropriately() which correctly sets table insertion modes
  3. Returns a fragment processor in the proper context

How to enable table support:

  1. Use Closure::bind to access create_fragment_at_current_node() without modifying WP_HTML_Processor
  2. Create full parser: <!DOCTYPE html><table><tbody>, navigate to <tbody>, call the private method via bound closure
  3. Fragment processor will be in IN_TABLE_BODY mode where <tr> and placeholders are valid

Limitations that would remain:

  • Cannot put arbitrary content (like <div>) inside table cells via placeholders (would trigger foster parenting)
  • Must structure templates so placeholders appear where table elements are expected

Why this matters: WordPress admin uses tables extensively. This approach would enable table template support without waiting for full foster parenting implementation.

3. No i18n Integration The ticket mentions "translation" as a sigil use case, and gziolo's comment asks about createInterpolateElement parity. Zero implementation of translation awareness. For WordPress core, this is a big miss.

4. Performance Not Benchmarked Normalization runs twice: once in compile() for text detection, once in render() for final output. For high-frequency template rendering (e.g., list items), this could add up.

5. Error Handling Philosophy Unclear render() returns false on errors, but bind() uses _doing_it_wrong() warnings and still returns a template. Mixed signals—should errors be fatal or recoverable? The test gaps document lists many untested failure cases.

6. Duplicate Attributes Bug with false/null Removal FIXED When using false or null to remove an attribute, only the first occurrence is removed. If the HTML contains duplicate attributes (e.g., <input disabled="</%d>" disabled>), the first disabled will be removed but the second will remain in the output. Fixed by emitting removal edits for duplicate attributes during compile(). All duplicate attributes are now stripped as part of template compilation.

Implementation note: The fix currently accesses $attributes and $duplicate_attributes on WP_HTML_Tag_Processor via protected visibility. This should be refactored to use Closure::bind instead, allowing the Tag Processor changes to be reverted.


Gaps in Test Coverage (From Test-Gaps.md + My Observations)

High Priority Missing:

  • Non-string replacement values (integers, arrays, nulls) → should fail gracefully ✅ Tests exist
  • Multiple placeholders in single attribute (e.g., href="</%scheme>://</%host>") ✅ Works, tests added
  • Template immutability verification (bind() returns new instance)

The PRE newline test is skipped but the behavior matters—WordPress often outputs preformatted code.

No tests for:

  • Very long templates/replacements (memory/performance)
  • Deeply nested templates (stack depth)
  • Unicode/multibyte characters in placeholder names
  • SVG/MathML elements

API Design Questions

1. Why from()bind()render() instead of a single call? For reusable templates, the separation makes sense. But most WordPress use cases are one-shot renders. A convenience method like ::render($template, $replacements) would be more ergonomic.

2. render() Returns string|false—Is That Right? Returning false on error is classic WordPress, but it means every call site needs error checking. Exceptions or a Result type would be safer, though breaking with WP conventions.

3. What About Streaming/Chunked Output? For large templates, building the entire string in memory before output is wasteful. No consideration of echo-as-you-go.


What's the Competition Doing That WP Isn't?

  • Blade/Twig: Rich conditionals, loops, inheritance. WP_HTML_Template is deliberately minimal.
  • htmx/alpine patterns: Server-side templating with client-side enhancement. No thought given here.
  • Lit/React SSR: Hydration markers. Not applicable, but WP could learn from structured output strategies.

The ticket's philosophy is "prefer trust and safety over features"—valid, but means this won't replace most template use cases. It's targeted at safe HTML generation, not a full templating language.


Summary Table

Requirement Status Notes
Context-aware escaping Works for text and attributes
Funky comment placeholders Clean implementation
Nested HTML via Templates Text context only
Boolean attributes true/false/null for whole-attr
Duplicate attribute removal Stripped during compile()
URL escaping Only generic escaping
Attribute spread Not implemented
Output format methods Not implemented
Tag Processor integration Separate class
Table context support Explicitly unsupported
i18n integration Not addressed
RAWTEXT/RCDATA replacement Placeholders don't work inside

Pending Work

  1. Revert Tag Processor visibility changesclass-wp-html-tag-processor.php has $attributes and $duplicate_attributes changed from private to protected. Revert these and use Closure::bind in WP_HTML_Template instead.