Sanitized markdown rendering + smooth streaming reveal #13
Merged
…at-widget

Assistant replies and citation snippets rendered as raw plain text, so **bold**, numbered lists, and [links](url) showed literally. Render them through a tiny safe-by-default markdown renderer (markdown.ts) that escapes all text (raw <script>/<img onerror>/<iframe> stay inert), drops images, allows only http(s) links (target=_blank + rel=noopener noreferrer nofollow), emits an allowlisted tag set, and downgrades headings to bold so they fit a chat bubble. Hand-rolled instead of pulling in markdown-it (~10KB gz) to keep the embeddable global bundle small: +5.3KB gz for BOTH features combined. User bubbles and mid-stream text stay plain (partial markdown renders badly); only the final assistant turn renders markdown. Citation snippets are cleaned first (strip leading logo/nav boilerplate, collapse whitespace, truncate to ~260 chars at a word boundary).

Also smooth the streaming reveal: buffer incoming token text and reveal it via a requestAnimationFrame typewriter at an adaptive rate (chars/frame scales with the pending backlog so it never lags the network); update only the one streaming bubble per frame, snap to the full markdown render on finalize, respect prefers-reduced-motion, and keep auto-scroll without fighting a scrolled-up visitor.

Tests: markdown unit tests incl. XSS payloads (<script>, <img onerror>, [x](javascript:), a markdown image, <a onclick>); streaming reveal unit tests (burst → full text, reduced-motion snap); e2e drives the real shadow-DOM UI through the mock WS to assert <strong>/<li>/<a> render and no <img>/<script> in the DOM.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
🦋 Changeset detected
Latest commit: d5d6168
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Problem
Assistant replies and citation snippets in `@smooai/chat-widget` render as raw plain text (everything goes through `textContent` as an XSS guard), so `**bold**`, numbered lists, and `[links](url)` show literally. Separately, the streaming bubble updates in lockstep with `stream_token` arrival: tokens come in variable-size bursts at uneven rates, so the text jumps in jerky chunks.
Solution
1. Sanitized markdown rendering (`src/markdown.ts`)
A tiny, safe-by-default markdown→HTML renderer with no new dependency. Chosen over markdown-it (`html:false`, ~10KB gz into an embeddable global bundle) because the requirement is a constrained safe subset; the renderer is safe by construction:

- All text is escaped, so a literal `<script>` is treated as text: raw `<script>`, `<img onerror=…>`, and `<iframe>` render as inert text.
- Images → alt text, never an `<img>` (a scraped tracking pixel can't load).
- Links are `http(s)` only (`javascript:`/`data:`/relative → plain text), with `target="_blank"` + `rel="noopener noreferrer nofollow"`.
- Allowlisted tags only: `p`, `br`, `strong`, `em`, `ul`/`ol`/`li`, `code`/`pre`, `a`, `blockquote`. Headings are downgraded to bold lines (a full `<h1>` is too big in a bubble/card).

Wired into `element.ts`: only the final assistant turn renders markdown (user bubbles and mid-stream text stay plain, since partial markdown renders badly). Citation snippets are cleaned first (strip a leading logo image/link and nav boilerplate, collapse whitespace, truncate to ~260 chars at a word boundary), then rendered. Aurora-Glass styles were added for the rendered elements (list indent, code/pre mono + subtle bg, themed links, tight spacing).
2. Smooth streaming reveal
Display is decoupled from token arrival: incoming token text is buffered and revealed via a `requestAnimationFrame` typewriter at an adaptive rate (chars/frame scales with the pending backlog, so it drains a deep buffer fast and never falls behind the wire). Only the one streaming bubble's `textContent` is updated per frame (no list rebuild); on finalize it snaps to the full markdown render. Respects `prefers-reduced-motion` (snap, no animation), keeps the blinking cursor, and keeps auto-scroll without fighting a visitor who has scrolled up.
Verification
- `pnpm typecheck` + `pnpm test` (58 unit tests) + `pnpm build` all green.
- XSS payloads in the markdown unit tests: `<script>alert(1)</script>`, `<img src=x onerror=alert(1)>`, `[x](javascript:alert(1))`, a markdown image, raw `<a onclick=…>`, `data:` links, all asserted via real DOM parse (no `script`/`img`/`iframe` nodes, no `on*` attrs, http(s)-only hrefs).
- E2E (`repro-stream-mock.spec.ts`): drives the real shadow-DOM UI through a mock WS and asserts a streamed `**bold**`/list/`[link]` reply renders to `<strong>`/`<li>`/`<a>`, and that `<img>`/`<script>` payloads produce zero live nodes.

Bundle size
Global IIFE bundle: 23.2KB → 28.5KB gz (+5.3KB gz for both features), vs ~10KB gz for markdown-it alone.
Release
Changeset = minor → publishes 0.6.0 on merge.
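
For reviewers who want a concrete picture of the escaping and link rules listed under "Sanitized markdown rendering", here is a minimal sketch. The function names (`escapeHtml`, `safeHref`, `renderLink`) are hypothetical and are not taken from `src/markdown.ts`; falling back to the escaped label for a rejected URL is also an assumption about what "plain text" means there.

```typescript
// Hypothetical helpers for illustration; not the PR's actual implementation.

/** Escape HTML-significant characters so raw markup stays inert text. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

/** Accept only absolute http(s) URLs; javascript:, data: and relative URLs yield null. */
function safeHref(raw: string): string | null {
  const url = raw.trim();
  return /^https?:\/\//i.test(url) ? url : null;
}

/** Render one [label](url) link, or just the escaped label if the URL is rejected. */
function renderLink(label: string, url: string): string {
  const href = safeHref(url);
  if (href === null) return escapeHtml(label); // assumption: keep only the label
  return (
    `<a href="${escapeHtml(href)}" target="_blank" ` +
    `rel="noopener noreferrer nofollow">${escapeHtml(label)}</a>`
  );
}
```

Because every piece of text and every attribute value passes through `escapeHtml` before it is concatenated, the only tags that can appear in the output are the ones the renderer itself emits.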
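
The citation-snippet cleanup (collapse whitespace, truncate to ~260 chars at a word boundary) could look roughly like the sketch below. `truncateAtWord` is a hypothetical name, the trailing ellipsis is an assumption, and the logo/nav boilerplate stripping is left out because the description does not say which heuristics it uses.

```typescript
// Hypothetical helper for illustration; the PR's snippet cleaner may differ.
const MAX_SNIPPET_CHARS = 260; // the "~260 chars" from the description

function truncateAtWord(text: string, max: number = MAX_SNIPPET_CHARS): string {
  // Collapse every whitespace run (newlines, tabs, repeated spaces) to one space.
  const flat = text.replace(/\s+/g, " ").trim();
  if (flat.length <= max) return flat;

  const cut = flat.slice(0, max);
  const lastSpace = cut.lastIndexOf(" ");
  // If the cut already falls between words, or there is no space to back up to,
  // keep the cut as-is; otherwise drop the partial trailing word.
  const endsOnBoundary = flat.charAt(max) === " ";
  const head = endsOnBoundary || lastSpace <= 0 ? cut : cut.slice(0, lastSpace);
  return head + "…"; // assumption: mark the truncation with an ellipsis
}
```

For example, `truncateAtWord("hello world foo", 8)` cuts to `"hello wo"` and then backs up to the previous space, giving `"hello…"`.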
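
The adaptive rate described under "Smooth streaming reveal" can be sketched as a small DOM-free buffer. All names and constants here (`charsPerFrame`, `RevealBuffer`, `BASE_CHARS_PER_FRAME`, `BACKLOG_DIVISOR`) are assumptions for illustration, not values from the PR.

```typescript
// Hypothetical sketch; constants and names are illustrative assumptions.
const BASE_CHARS_PER_FRAME = 2; // floor, so a short backlog still animates
const BACKLOG_DIVISOR = 12;     // reveal roughly 1/12 of the backlog per frame

/** Chars to reveal this frame: grows with the backlog and never exceeds it. */
function charsPerFrame(backlog: number): number {
  if (backlog <= 0) return 0;
  const adaptive = Math.ceil(backlog / BACKLOG_DIVISOR);
  return Math.min(backlog, Math.max(BASE_CHARS_PER_FRAME, adaptive));
}

class RevealBuffer {
  private full = "";  // everything received from the stream so far
  private shown = 0;  // number of chars already revealed

  /** Buffer an incoming token without touching the display. */
  push(token: string): void {
    this.full += token;
  }

  /** Chars received but not yet revealed. */
  pending(): number {
    return this.full.length - this.shown;
  }

  /** Advance one frame; returns the text the streaming bubble should show. */
  step(): string {
    this.shown += charsPerFrame(this.pending());
    return this.full.slice(0, this.shown);
  }

  /** Reveal everything at once (finalize, or reduced motion). */
  snap(): string {
    this.shown = this.full.length;
    return this.full;
  }
}
```

In a browser, `step()` would be called from a `requestAnimationFrame` callback and its result assigned to the streaming bubble's `textContent`, rescheduling while `pending()` is non-zero; `snap()` would be used on finalize or when `window.matchMedia("(prefers-reduced-motion: reduce)").matches` is true. The sketch slices by UTF-16 code unit, so a real implementation may want to avoid splitting surrogate pairs mid-frame.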