Commit 08d6e0a
JSON escape rendered shoebox content
Once rendering is completed, the contents of the shoebox are converted
into a string of JSON and inserted into the HTML output that is sent
back to the browser.
However, because JSON and HTML content are mixed, there is the potential
for security vulnerabilities. Specifically, if an attacker can cause an
application to place user-generated content into the shoebox, that
content could trick the browser into treating the shoebox `script`
element as closed, and then evaluating arbitrary code in the origin of
the host.
For example, if an untrusted user could supply an article with the
title of `</script><script>alert("owned")</script>`, the naive
interpolation of that into the shoebox might look like:
```html
<script type="fastboot/shoebox" id="shoebox-article">
{"article":{"title":"</script><script>alert("owned")</script>"}}
</script>
```
In this case, the browser would interpret the `</script>` inside the
JSON string as a real closing `script` tag, and thus would allow the
attacker's code to execute in the application's origin ("XSS").
Upon examining the HTML5 parser specification, [we can observe that there
is one, and only one, way to exit the "script data" state][spec]: encountering
a `<` character, which moves the state machine into the "script data
less-than sign state". From the "script data less-than sign state", the
tokenizer can pass through several more states, some of which rely on a
temporary buffer.
[spec]: https://www.w3.org/TR/html5/syntax.html#script-data-state
Thus we can conclude that the simplest, most effective way to prevent
inadvertent end-of-script situations is to prevent the `<` character
from ever appearing in shoebox content. If you never leave the "script
data" state, you can feel fairly certain that you have prevented this
particular vector of XSS attacks.
The good news is that this is easily accomplished. Both the JavaScript
specification and the JSON specification allow for [Unicode escape
sequences](https://mathiasbynens.be/notes/javascript-escapes#unicode).
Before insertion into the HTML document, we can replace characters that
could be ambiguous to the HTML parser with Unicode escape sequences.
These are no different from the unescaped values to
the eyes of the JSON or JavaScript parser, but give us a high degree of
confidence that the HTML parser will not attempt to treat them as
anything other than script data.
This commit Unicode-escapes the following characters:
* `<` and `>`, to prevent ambiguity with opening and closing tags.
* `&`, to prevent ambiguity with HTML entities.
* `\u2028` and `\u2029`, Unicode line/paragraph separators, which the
JSON parser and JavaScript parser treat differently and thus can lead
to mismatched data if JavaScript is used as the JSON parser.

1 parent 9899016 commit 08d6e0a
3 files changed
Lines changed: 42 additions & 15 deletions