Sanitized markdown rendering + smooth streaming reveal #13

Merged: brentrager merged 1 commit into main from SMOODEV-markdown-render on Jun 27, 2026

@brentrager

Contributor

Problem

Assistant replies and citation snippets in @smooai/chat-widget render as raw plain text (everything goes through textContent as an XSS guard), so **bold**, numbered lists, and [links](url) show literally. Separately, the streaming bubble updates in lockstep with stream_token arrival — tokens come in variable-size bursts at uneven rates, so the text jumps in jerky chunks.

Solution

1. Sanitized markdown rendering (src/markdown.ts)

A tiny, safe-by-default markdown→HTML renderer — no new dependency. Chosen over markdown-it (html:false, ~10KB gz into an embeddable global bundle) because the requirement is a constrained safe subset; the renderer is safe by construction:

  • It is a tokenizer, not an HTML passthrough — it only ever emits a fixed allowlist of tags. A literal <script> is treated as text.
  • Every text run is HTML-escaped, so raw <script>, <img onerror=…>, <iframe> render as inert text.
  • Images dropped entirely: ![alt](src) → alt text, never an <img> (a scraped tracking pixel can't load).
  • Links gated to absolute http(s) only (javascript:/data:/relative → plain text), with target="_blank" + rel="noopener noreferrer nofollow".
  • Allowlisted tags: p br strong em ul/ol/li code/pre a blockquote. Headings downgraded to bold lines (a full <h1> is too big in a bubble/card).
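
The rules in the list above can be illustrated with a minimal inline-only sketch. It is not the code in src/markdown.ts: the names escapeHtml, safeHref, and renderInline are hypothetical, and only images, links, and bold are handled, to show how escaping every text run, dropping images, and gating links fit together.

```typescript
// Hypothetical sketch: none of these names are taken from src/markdown.ts.

const ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

/** Escape every character that could open a tag, an attribute value, or an entity. */
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => ESCAPES[ch] ?? ch);
}

/** Return the URL only when it is absolute http(s); anything else yields null. */
function safeHref(url: string): string | null {
  return /^https?:\/\/\S+$/i.test(url) ? url : null;
}

/** Render a plain text run: escape first, then apply **bold** on the escaped text. */
function renderText(text: string): string {
  return escapeHtml(text).replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>");
}

/** Render one line of inline markdown, emitting only <strong> and <a>. */
function renderInline(src: string): string {
  // Matches ![alt](src) and [label](url); everything between matches is a text run.
  const pattern = /(!?)\[([^\]]*)\]\(([^)\s]*)\)/g;
  let out = "";
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(src)) !== null) {
    const [whole, bang, label, url] = m;
    out += renderText(src.slice(last, m.index));
    const href = bang === "!" ? null : safeHref(url);
    if (href === null) {
      // Images and non-http(s) links degrade to their escaped label text.
      out += escapeHtml(label);
    } else {
      out +=
        `<a href="${escapeHtml(href)}" target="_blank" ` +
        `rel="noopener noreferrer nofollow">${escapeHtml(label)}</a>`;
    }
    last = m.index + whole.length;
  }
  return out + renderText(src.slice(last));
}
```

Because markup is only ever produced by the two template strings above, a payload such as `<script>alert(1)</script>` can reach the output only through escapeHtml, which is the sense in which a renderer of this shape is safe by construction.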

Wired into element.ts: only the final assistant turn renders markdown (user bubbles and mid-stream text stay plain, because partial markdown renders badly). Citation snippets are cleaned first (strip a leading logo image/link and nav boilerplate, collapse whitespace, truncate to ~260 chars at a word boundary), then rendered. Aurora-Glass styles added for the rendered elements (list indent, code/pre mono + subtle bg, themed links, tight spacing).
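
The snippet cleanup could look roughly like the sketch below. The function name cleanSnippet, the leading-image regex, and the trailing ellipsis are assumptions made for illustration; the nav-boilerplate rules are not described in this PR, so that step is left out.

```typescript
// Hypothetical sketch: cleanSnippet is an illustrative name, not the widget's API.

const MAX_SNIPPET_CHARS = 260; // the "~260 chars" limit from the PR description

function cleanSnippet(raw: string, maxChars: number = MAX_SNIPPET_CHARS): string {
  // Assumed shape of a "leading logo": one or more images at the very start,
  // either bare (![alt](src)) or wrapped in a link ([![alt](src)](href)).
  const leadingImages =
    /^(?:\s*(?:\[!\[[^\]]*\]\([^)]*\)\]\([^)]*\)|!\[[^\]]*\]\([^)]*\)))+/;

  // Drop the leading images, then collapse every whitespace run to one space.
  const text = raw.replace(leadingImages, "").replace(/\s+/g, " ").trim();
  if (text.length <= maxChars) return text;

  // Cut at the last space inside the limit so no word is split in half.
  const head = text.slice(0, maxChars);
  const lastSpace = head.lastIndexOf(" ");
  const cut = lastSpace > 0 ? head.slice(0, lastSpace) : head;

  // The ellipsis marks the truncation; it is an assumption of this sketch.
  return cut.trimEnd() + "…";
}
```

Cleaning before rendering means the markdown renderer only ever sees the shortened text, so a cut can never land in the middle of already-emitted HTML.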

2. Smooth streaming reveal

Display is decoupled from token arrival: incoming token text is buffered and revealed via a requestAnimationFrame typewriter at an adaptive rate (chars/frame scales with the pending backlog, so it drains a deep buffer fast and never falls behind the wire). Only the one streaming bubble's textContent is updated per frame (no list rebuild); on finalize it snaps to the full markdown render. Respects prefers-reduced-motion (snap, no animation), keeps the blinking cursor, and keeps auto-scroll without fighting a visitor who has scrolled up.
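
A minimal sketch of the buffered, backlog-scaled reveal follows. The constants, the RevealBuffer class, and runReveal are hypothetical; the PR does not state the actual tuning or structure. The frame scheduler is passed in as a parameter so the logic can run without a browser.

```typescript
// Hypothetical sketch: names and constants are illustrative, not from element.ts.

const MIN_CHARS_PER_FRAME = 1;
const DRAIN_FRAMES = 20; // aim to clear the current backlog in about this many frames

/** Characters to reveal this frame; grows with the backlog so a deep buffer drains fast. */
function charsPerFrame(backlog: number): number {
  if (backlog <= 0) return 0;
  return Math.max(MIN_CHARS_PER_FRAME, Math.ceil(backlog / DRAIN_FRAMES));
}

class RevealBuffer {
  private full = "";
  private shown = 0;

  /** Append token text as it arrives from the stream. */
  push(token: string): void {
    this.full += token;
  }

  /** Number of buffered characters not yet revealed. */
  get pending(): number {
    return this.full.length - this.shown;
  }

  /** Advance by one frame and return the text that should now be visible. */
  step(): string {
    this.shown += charsPerFrame(this.pending);
    return this.full.slice(0, this.shown);
  }

  /** Reveal everything at once (finalize, or prefers-reduced-motion). */
  snap(): string {
    this.shown = this.full.length;
    return this.full;
  }
}

/** Drive the reveal until the buffer is drained; the caller restarts it when new tokens arrive. */
function runReveal(
  buffer: RevealBuffer,
  paint: (visible: string) => void,
  schedule: (frame: () => void) => void,
  reducedMotion: boolean,
): void {
  if (reducedMotion) {
    paint(buffer.snap());
    return;
  }
  const frame = (): void => {
    if (buffer.pending === 0) return; // nothing left to reveal; stop scheduling
    paint(buffer.step());
    schedule(frame);
  };
  schedule(frame);
}
```

In a browser, schedule would typically be `(frame) => requestAnimationFrame(frame)` and paint would assign the streaming bubble's textContent. With these assumed constants a backlog of 200 characters reveals 10 characters on the next frame, while a backlog of 5 reveals 1.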

Verification

  • pnpm typecheck + pnpm test (58 unit tests) + pnpm build all green.
  • XSS unit tests prove neutralization of <script>alert(1)</script>, <img src=x onerror=alert(1)>, [x](javascript:alert(1)), ![y](http://evil/x.png), raw <a onclick=…>, data: links — asserted via real DOM parse (no script/img/iframe nodes, no on* attrs, http(s)-only hrefs).
  • Streaming unit tests: a burst of tokens reveals the complete final text (no dropped/dup chars); reduced-motion snaps instantly.
  • e2e (repro-stream-mock.spec.ts): drives the real shadow-DOM UI through a mock WS and asserts a streamed **bold**/list/[link] reply renders to <strong>/<li>/<a>, and that <img>/<script> payloads produce zero live nodes.

Bundle size

Global IIFE bundle: 23.2KB → 28.5KB gz (+5.3KB gz for both features), vs ~10KB gz for markdown-it alone.

Release

Changeset = minor → publishes 0.6.0 on merge.

…at-widget

Assistant replies and citation snippets rendered as raw plain text, so
**bold**, numbered lists, and [links](url) showed literally. Render them
through a tiny safe-by-default markdown renderer (markdown.ts) that escapes
all text (raw <script>/<img onerror>/<iframe> stay inert), drops images,
allows only http(s) links (target=_blank + rel=noopener noreferrer nofollow),
emits an allowlisted tag set, and downgrades headings to bold so they fit a
chat bubble. Hand-rolled over markdown-it (~10KB gz) to keep the embeddable
global bundle small: +5.3KB gz for BOTH features combined. User bubbles and
mid-stream text stay plain (partial markdown renders ugly); only the final
assistant turn renders markdown. Citation snippets are cleaned first (strip
leading logo/nav boilerplate, collapse whitespace, truncate ~260 chars at a
word boundary).

Also smooth the streaming reveal: buffer incoming token text and reveal it via
a requestAnimationFrame typewriter at an adaptive rate (chars/frame scales with
the pending backlog so it never lags the network); update only the one
streaming bubble per frame, snap to the full markdown render on finalize,
respect prefers-reduced-motion, and keep auto-scroll without fighting a
scrolled-up visitor.

Tests: markdown unit tests incl. XSS payloads (<script>, <img onerror>,
[x](javascript:), ![y](http://evil/x.png), <a onclick>); streaming reveal
unit tests (burst → full text, reduced-motion snap); e2e drives the real
shadow-DOM UI through the mock WS to assert <strong>/<li>/<a> render and no
<img>/<script> in the DOM.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@changeset-bot

changeset-bot Bot commented Jun 27, 2026


🦋 Changeset detected

Latest commit: d5d6168

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@smooai/chat-widget Minor


@brentrager brentrager merged commit 586003c into main Jun 27, 2026
1 check passed