Upstream sanitizer api #12395

Open
noamr wants to merge 24 commits into whatwg:main from noamr:zcorpan/upstream-sanitizer-api

Conversation

@noamr
Collaborator

@noamr noamr commented Apr 21, 2026

Convert the incubated spec in https://wicg.github.io/sanitizer-api/ to the HTML format and make it part of the HTML standard.

(See WHATWG Working Mode: Changes for more details.)


/canvas.html ( diff )
/comms.html ( diff )
/dom.html ( diff )
/dynamic-markup-insertion.html ( diff )
/embedded-content-other.html ( diff )
/form-elements.html ( diff )
/imagebitmap-and-animations.html ( diff )
/index.html ( diff )
/indices.html ( diff )
/infrastructure.html ( diff )
/interaction.html ( diff )
/microdata.html ( diff )
/parsing.html ( diff )
/rendering.html ( diff )
/system-state.html ( diff )
/timers-and-user-prompts.html ( diff )
/web-messaging.html ( diff )
/webstorage.html ( diff )
/workers.html ( diff )

@noamr noamr marked this pull request as draft April 21, 2026 13:16
@noamr noamr changed the base branch from zcorpan/upstream-sanitizer-api to main April 21, 2026 13:17
@noamr noamr changed the title WIP upstream sanitizer api Upstream sanitizer api Apr 21, 2026
@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from 223a4d1 to d2034e5 Compare April 21, 2026 19:42
@noamr
Collaborator Author

noamr commented Apr 21, 2026

@zcorpan @evilpie @mozfreddyb @otherdaniel
initial review? :)
this is quite a big PR...

@evilpie
Member

evilpie commented Apr 22, 2026

Amazing, thanks for working on this.

The built-in safe default configuration is pretty integral to the API, where did it go?

For anyone else looking at this, the gist of the changes is in dynamic-markup-insertion.html.

@noamr
Collaborator Author

noamr commented Apr 22, 2026

> Amazing, thanks for working on this.
>
> The built-in safe default configuration is pretty integral to the API, where did it go?

Oh you're right, I had it on my todo list and forgot. Getting to it. Thanks!

Comment thread source
Comment thread source Outdated
@noamr
Collaborator Author

noamr commented Apr 22, 2026

> > Amazing, thanks for working on this.
> > The built-in safe default configuration is pretty integral to the API, where did it go?
>
> Oh you're right, I had it on my todo list and forgot. Getting to it. Thanks!

Done.

@noamr noamr marked this pull request as ready for review April 22, 2026 10:56
Comment thread source Outdated
Comment on lines +127178 to +127179
<p>The <dfn>built-in safe baseline configuration</dfn> is the result of <span data-x="parse a JSON
string to an Infra value">parsing</span> the following JSON string:</p>
Member

I think we should use this style for all of the built-in configs, since IMO the JSON is more readable.

Collaborator Author

Done

Member

This seems very strange to me. We've always used tables or lists for this kind of data. Using pseudo-code seems contrary to our usual style.

Collaborator Author

Yea perhaps we should stay consistent here. I don't feel strongly about this though. @zcorpan are you ok with me reverting this back to tables with links?

Member

OK. Tables are maybe even better for readability.

Collaborator Author

Done, I think. PR preview is broken though.
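
For readers following the JSON-versus-tables discussion in this thread, here is a rough TypeScript sketch of what "defining a configuration by parsing a JSON string" amounts to. The JSON below is a toy value made up for illustration and is not the string from the pull request; `toyConfigJson`, `ToyConfig`, and `parseToyConfig` are hypothetical names, and `JSON.parse` only approximates the spec's "parse a JSON string to an Infra value".

```typescript
// Illustrative only: this is NOT the built-in safe baseline configuration.
const toyConfigJson = `{
  "removeElements": [
    { "name": "script", "namespace": "http://www.w3.org/1999/xhtml" }
  ],
  "removeAttributes": []
}`;

// Hypothetical shapes, loosely modelled on the member names used in the thread.
interface ToyName {
  name: string;
  namespace: string | null;
}

interface ToyConfig {
  removeElements: ToyName[];
  removeAttributes: ToyName[];
}

// Parse the JSON text into a plain object. A real implementation would also
// validate the result instead of trusting the cast.
function parseToyConfig(json: string): ToyConfig {
  return JSON.parse(json) as ToyConfig;
}

const toyConfig = parseToyConfig(toyConfigJson);
```

The same data could equally be written as a table with one row per element, which is the alternative the thread settles on.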

@annevk
Member

annevk commented Apr 22, 2026

I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

This is a huge PR so I thought doing it in two stages, the first one being a purely technical upstream, would be easier to review?

Open and happy to incorporate the stream-while-parsing changes in this PR if you and @zcorpan are ok to review that in one go.

@noamr
Collaborator Author

noamr commented Apr 23, 2026

@zcorpan @annevk can we align on whether we upstream the sanitizer as is and then change it to stream-while-parsing, or do it in one go? I'm perfectly happy with both options.

@noamr noamr added the agenda+ To be discussed at a triage meeting label Apr 23, 2026
@zcorpan
Member

zcorpan commented Apr 23, 2026

I prefer doing the parser integration in a follow-up PR.

Comment thread source
<li><p>If <var>element</var> is a string, then return a new
<span>SanitizerElementNamespace</span> dictionary with its <code
data-x="dom-SanitizerElementNamespace-name">name</code> member set to <var>element</var> and its
<code data-x="dom-SanitizerElementNamespace-namespace">_namespace</code> member set to the
Member

@evilpie evilpie Apr 23, 2026

Do we really need to use _namespace instead of namespace everywhere? I thought that was only necessary in IDL to make the parser happy.

Collaborator Author

We don't. So the dictionary member in the IDL is _namespace but when the dictionary is looked up it's namespace? Does that work with WebIDL generation etc?

Member

Right, the underscore is just escaping in IDL syntax.

Collaborator Author

Done
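
To make the `_namespace` point in this thread concrete, here is a minimal TypeScript sketch of the string branch quoted at the top of the thread. `canonicalizeElement`, `SanitizerElementInput`, and the `defaultNamespace` parameter are hypothetical names for illustration. The quoted diff is cut off before it says which namespace is used, so the sketch takes it as an argument rather than guessing. The dictionary branch is also an assumption and is not taken from the quoted text.

```typescript
// Canonical form: the member is called "namespace". In Web IDL source it is
// written `_namespace` only because `namespace` is an IDL keyword; the leading
// underscore is an escape and is not part of the member's name.
interface SanitizerElementNamespace {
  name: string;
  namespace: string | null;
}

// Hypothetical input type: either a bare element name or a dictionary.
type SanitizerElementInput =
  | string
  | { name: string; namespace?: string | null };

function canonicalizeElement(
  element: SanitizerElementInput,
  defaultNamespace: string | null, // assumption: supplied by the caller
): SanitizerElementNamespace {
  // String branch, as in the quoted step: wrap the name in a dictionary.
  if (typeof element === "string") {
    return { name: element, namespace: defaultNamespace };
  }
  // Dictionary branch (assumed): fill in the namespace only when it is absent.
  return {
    name: element.name,
    namespace:
      element.namespace === undefined ? defaultNamespace : element.namespace,
  };
}

// Usage with the HTML namespace as an illustrative default.
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";
const canonicalDiv = canonicalizeElement("div", HTML_NAMESPACE);
```

Script-visible objects would therefore be read as `canonicalDiv.namespace`, never `canonicalDiv._namespace`.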

@noamr noamr removed the agenda+ To be discussed at a triage meeting label Apr 23, 2026
@evilpie
Member

evilpie commented Apr 23, 2026

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I think these three PRs would be good to merge before merging into the HTML standard:

Since some security sensitive changes rely on "sanitizing while parsing", and that in turn relies on the current post-processing sanitizer being upstreamed, I don't think we should delay upstreaming any further.

Can we race it? If any of these go in before the upstream PR is in I'll incorporate them into the HTML PR.

@noamr noamr closed this Apr 23, 2026
@noamr noamr reopened this Apr 23, 2026
4 participants