RFC 237: Exclusions in WEB_FEATURES.yml files by jugglinmike · Pull Request #237 · web-platform-tests/rfcs

jugglinmike · 2026-03-26T21:40:24Z

jcscottiii · 2026-04-01T19:06:29Z

The goal of establishing symbolic relationships (Single Source of Truth) is fantastic and definitely the right direction for maintainability over time.

I notice the RFC leans toward a string-based micro-syntax (!#) because it keeps the YAML schema "flat" and simple for the parser. However, I believe this shifts the complexity onto the human author. (or your favorite AI haha)

I’d love for us to consider a hybrid schema that supports both simple strings and objects.

Example

We keep simple strings for standard paths (no visual tax), but allow objects when a rule needs metadata (like an exclusion).

features:
  - name: alerts
    files:
      - path: ./*
        exclude_ids:       # Standard YAML list, zero ambiguity
          - print
          - logging
  - name: print
    files:
      - ./print-*         # Simple file path can stay a simple string!

Compared to the original

features:
  - name: alerts
    files:
      - ./*
      - "!#print"
      - "!#logging"

One major benefit:

No Custom Parsing Logic: Any standard YAML parser reads exclude_ids as a list natively. We don't need custom regex to find where the feature name starts in !#feature-name. Something I sometimes regret doing with the **
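
To make the "no custom parsing" point concrete, here is a minimal sketch of how a consumer might normalize the hybrid entries. It assumes the YAML has already been loaded into plain Python lists and dicts; the helper name `normalize_file_entry` is hypothetical and not part of any existing WPT tooling.

```python
def normalize_file_entry(entry):
    """Return (path, exclude_ids) for either a plain string or an object entry."""
    if isinstance(entry, str):
        # Simple strings carry no exclusions.
        return entry, []
    if isinstance(entry, dict) and "path" in entry:
        # Object entries may optionally list feature IDs to exclude.
        return entry["path"], list(entry.get("exclude_ids", []))
    raise ValueError(f"unsupported files entry: {entry!r}")


# Parsed form of the "alerts" and "print" entries from the example above.
features = [
    {"name": "alerts",
     "files": [{"path": "./*", "exclude_ids": ["print", "logging"]}]},
    {"name": "print", "files": ["./print-*"]},
]

normalized = {
    feature["name"]: [normalize_file_entry(e) for e in feature["files"]]
    for feature in features
}
```

Both entry shapes come back as the same `(path, exclude_ids)` pair, so nothing has to be recovered from inside a string.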

jugglinmike · 2026-04-07T02:41:48Z

Hi @jcscottiii! Thanks for your feedback!

One novel aspect of your proposal is that it scopes exclusions to individual path patterns. While I think that could be tenable, it's not a capability that we've specifically felt a need for.

It sounds like we're aligned on prioritizing human authors/readers. To that end, I think "scoped" exclusions may make these rules more difficult to understand since they would effectively introduce a grouping operator that hasn't been motivated by experience. (Well, not our experience, anyway. I'd be happy to hear about any instances where you wanted it!)

How would you feel about expressing exclusions with a standalone object so that each list item could be either a string value or a dict with a single key (namely, exclude_ids)? For example:

 features:
   - name: alerts
     files:
-      - path: ./*
-        exclude_ids:       # Standard YAML list, zero ambiguity
+      - ./*
+      - exclude_ids:       # Standard YAML list, zero ambiguity
           - print
           - logging
   - name: print
     files:
       - ./print-*         # Simple file path can stay a simple string!

Anecdotally, I've only observed a small number (read: 1 to 3) of exclusions per feature entry, so a nested list like exclude_ids might be more structure than we truly need. I'm curious about simplifying it further to a string value for a key named exclude_id (despite the bit of repetition it adds to the running example):

 features:
   - name: alerts
     files:
-      - path: ./*
-        exclude_ids:       # Standard YAML list, zero ambiguity
-          - print
-          - logging
+      - ./*
+      - exclude_id: print   # Standard YAML string, zero ambiguity
+      - exclude_id: logging # Standard YAML string, zero ambiguity
   - name: print
     files:
       - ./print-*         # Simple file path can stay a simple string!

...but I don't feel this flattening would have a huge impact on ergonomics, so I could go either way!

In any case (in your proposal and in my suggested amendments), the ./ prefix is not technically necessary. Are you suggesting it should become mandatory?

jcscottiii · 2026-04-08T15:42:19Z

@jgraham mentioned that this should be a set of rules to web-feature-ids. Something like this:

	file-1.html: css-flexbox
	file-2.html: css-grid
	*: css-multicol

I'll let @jgraham provide a more thorough review.

jugglinmike · 2026-05-04T22:29:21Z

A bit more context for the benefit of tomorrow's RFC triage call:

Since this RFC is about streamlining the metadata, here's an approximation of its impact. While the syntax is still under consideration, that patch shows the order of magnitude: "130 files changed, 275 insertions(+), 791 deletions(-)"

Until recently, WPT didn't formally document the WEB_FEATURES.yml files. We landed some documentation last week, so you can now find it online at https://web-platform-tests.org/writing-tests/out-of-band-metadata.html

Looking forward to discussing more tomorrow!

jgraham · 2026-05-05T11:14:50Z

@jgraham mentioned that this should be a set of rules to web-feature-ids

Yes.

Mapping feature ids to filename patterns makes sense if you are trying to work out "which files are part of this feature". You find the id you care about and then evaluate the given rules against all the files in the directory. That is indeed a common use case for this data, but critically it's a use case that's basically always automated.

On the other hand if you have one file and want to know what features it will correspond to you have to evaluate the full set of rules. That's a use case you have when adding a file when you want to know if it will be correctly labelled, or if a rule update is required. Typically that use case isn't automated.

So if we want to optimise for the latter, we should look to make it as easy as possible to figure out what features correspond to a given file. In that case having the rules be a list of patterns that could apply and stopping on the first that does apply seems likely to be much easier to work with e.g.

*-grid-*: [css-grid, css-multicol]
*-flex-*: [css-flexbox, css-multicol]
*: css-multicol

There is arguably a bug there: a file which has both -grid- and -flex- in it would only be labelled as css-grid and css-multicol. In theory that seems bad. In practice it seems likely to be fine to fix it by adding another rule like:

*-grid-flex-*: [css-grid, css-flexbox, css-multicol]

and just assuming that people can follow a naming convention rather than requiring a perfectly general syntax. In particular, I think that once you start making things additive you end up back at the starting point for this RFC which is "we need to invent an exclusion mechanism so that some rules don't apply in some cases" and then you're back at it being really hard for a human to correctly deduce the impact of the rules.
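
The first-match evaluation described above can be sketched as follows. This is only an illustration: it assumes shell-style globbing via Python's `fnmatch` module (the real matcher may differ), the function name `features_for` is hypothetical, and placing the combined `*-grid-flex-*` rule first is an assumption, since a more specific rule has to precede the ones it overlaps.

```python
from fnmatch import fnmatchcase

# Ordered rules: the first pattern that matches decides the labels.
RULES = [
    ("*-grid-flex-*", ["css-grid", "css-flexbox", "css-multicol"]),
    ("*-grid-*", ["css-grid", "css-multicol"]),
    ("*-flex-*", ["css-flexbox", "css-multicol"]),
    ("*", ["css-multicol"]),
]


def features_for(filename, rules=RULES):
    """Return the feature IDs of the first rule whose pattern matches."""
    for pattern, ids in rules:
        if fnmatchcase(filename, pattern):
            return ids
    return []
```

To answer "what will this new file be labelled as?", a contributor reads down the list and stops at the first pattern that fits the file name.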

jugglinmike · 2026-05-28T17:55:16Z

we should look to make it as easy as possible to figure out what features correspond to a given file.

I believe that we can achieve that goal without any changes to the schema. Since all file-matching patterns are strictly ordered in the current design (via lists of lists), we can simply change the way the files are interpreted (namely by adding strict precedence). Just like in the flat structure that @jgraham sketched out above, this will obviate the need for the ! prefix, reducing total entry count.
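
As a rough sketch of that reinterpretation, assuming the existing `name`/`files` schema has been loaded into Python dicts and that patterns are shell-style globs (the function name `classify` is hypothetical):

```python
from fnmatch import fnmatchcase


def classify(filename, features):
    """Walk entries and their patterns in document order; first match wins."""
    for feature in features:
        for pattern in feature["files"]:
            if fnmatchcase(filename, pattern):
                return feature["name"]
    return None


# Under strict precedence, "alerts" needs no "!print-*" entry, because
# the earlier "print" entry has already claimed those files.
features = [
    {"name": "print", "files": ["print-*"]},
    {"name": "alerts", "files": ["*"]},
]
```

Here `classify("print-basic.html", features)` yields `"print"` and `classify("modal.html", features)` yields `"alerts"`, with no negation entries in the data.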

This might be worth considering because the flat structure would introduce more repetition in entries that hold a lot of patterns. The pathological case is ./css/css-conditional/container-queries/WEB_FEATURES.yml with its 128 patterns for container-queries. We shouldn't design for the pathological case, though, so I've collected some stats to give a better sense for the number of file patterns typically used (taking care to ignore the !-prefixed entries).

Mean: 2.3628988642509463
Standard deviation: 5.748726896571267

xychart
    title "File patterns per web-feature entry"
    x-axis "# of file patterns" [1, 2, 3, 4, 5, 6, 7, 8, 9+]
    y-axis "# of web-features" 1 --> 869
    bar [869, 748, 78, 43, 27, 17, 12, 4, 51]

Grist for the mill, in any case!

Source code

#!/usr/bin/env python3

import itertools
import os
import yaml

MAX_BUCKET = 9

def find(root):
    for dirpath, dirnames, filenames in os.walk(root):
        if 'WEB_FEATURES.yml' not in filenames:
            continue

        filename = os.path.join(dirpath, 'WEB_FEATURES.yml')
        with open(filename, 'r') as handle:
            yield (filename, yaml.safe_load(handle)['features'])

def int_list(ints):
    return ', '.join(map(str, ints))

def render(mean, standard_deviation, grouped):
    x_axis = int_list(range(1, MAX_BUCKET)) + f', {MAX_BUCKET}+'
    max_value = max(grouped.values())
    values = int_list(grouped.values())
    return f'''
Mean: {mean}  
Standard deviation: {standard_deviation}

```mermaid
xychart
    title "File patterns per web-feature entry"
    x-axis "# of file patterns" [{x_axis}]
    y-axis "# of web-features" 1 --> {max_value}
    bar [{values}]
```
    '''

def main(root):
    file_entry_counts = []
    grouped = {x: 0 for x in range(1, MAX_BUCKET + 1)}
    for filename, features in find(root):
        for feature in features:
            # Negation entries will be obviated by the new semantics
            count = len([x for x in feature['files'] if not x.startswith('!')])
            grouped[min(MAX_BUCKET, count)] += 1

            file_entry_counts.append(count)

    size = len(file_entry_counts)
    mean = sum(file_entry_counts) / size
    variance = sum([(count - mean) ** 2 for count in file_entry_counts]) / size
    standard_deviation = variance ** 0.5
    return render(mean, standard_deviation, grouped)

if __name__ == '__main__':
    print(main('.'))

jugglinmike · 2026-06-04T18:00:58Z

@jcscottiii @jgraham @gsnedders I've updated the proposal to reflect the consensus of the latest RFCs & Infrastructure meeting.

jcscottiii

This looks a lot better. I have some questions. Feel free to post your thoughts on them.

jcscottiii · 2026-06-10T15:50:53Z

+```yaml
+features:
+- foo.html: NULL
+- print-*: print
+- "*": alerts


The old schema used explicit properties (name and files). Because of this, it was trivial to add new metadata fields in the future (e.g., bug_url, reason) without breaking the parser.

We could change it to be something like:

Suggested change

-```yaml
-features:
-- foo.html: NULL
-- print-*: print
-- "*": alerts
+```yaml
+features:
+- pattern: "print-*"
+  features: ["print"]
+- pattern: "*"
+  features: ["alerts"]
+- pattern: "foo.html"
+  features: []
+  reason: "Excluded due to flakiness" # Easily extensible in the future. And not just rely on yaml comments that don't make it into the metadata

jugglinmike · Jun 11, 2026

If we feel that future extensions are sufficiently likely (and if we think that another schema revision would be sufficiently disruptive), then I would vote for keeping the existing design as-is. As noted in this latest version of the RFC, the schema currently in use is expressive enough to implement the new semantics. And as you note, the schema currently in use also supports extension. In addition, it is the most concise out of all the designs we've considered thus far.

That said, I've read in @jgraham's feedback a general interest in promoting usability for contributors today. The latest proposal seems optimized for that. It might be easier to justify a design that favors future work if we had some indication about the likelihood/timeline for that work.

jgraham · Jun 12, 2026

If we wanted to extend it we could do it like

features:
- print-*:
    features: ["print"]
    reason: "All the print reftests"

That's slightly less verbose in any case and would continue to allow the optimisation where the right hand side can just be a string or a list and that's interpreted as identical to an object with just a "features" key.

jcscottiii · Jun 16, 2026

I like @jgraham's version of this. Thoughts on adopting this way? @jugglinmike. I am in favor of this. For now, features would be the only key.

jugglinmike · Jun 16, 2026

@jgraham's suggestion is an extension to syntax currently proposed by the RFC, not an alternative to it. I'm in favor of supporting an extension because it would mean we would only pay the price of overhead when we need it, and because I still don't have any insight into the likelihood/timeline of additional metadata.

Although making that syntax the only option would simplify the parser, it would still increase verbosity well beyond what our current needs require. I feel that the existing syntax strikes a much stronger balance between parser simplicity, maintainability, and concision.

In any case, I think all extant proposals which include a property name for the list of web-feature IDs should be revised. In the case of the existing syntax, the property name is "name". It should instead be "ids" both to reflect the plurality of the value and to align with the WebDX terminology (where a web-feature's "name" is distinct from its "ID"). The name "ids" is especially appealing in this latest alternative because it avoids using the same property for two distinct parts of the data structure.

It seems like we're close; here are the options that are on the table right now:

1. keep the existing syntax (but replace "name" with "ids"):

   features:
   - ids: []
     files:
     - foo.html
   - ids: [print]
     files:
     - print-*

2. use the abbreviated syntax with an optional extension:

   features:
   - foo.html: []
   - print-*:
       ids: [print]

3. use the extended form of the abbreviated syntax only:

   features:
   - foo.html:
       ids: []
   - print-*:
       ids: [print]

My preference is option 2 followed by option 1 followed by option 3. What do you folks think?

jcscottiii · Jun 17, 2026

We can go with option 2.

For background: Initially I was in favor of option 3 because I was looking ahead to increased automation (e.g., tooling like wpt-gen or LLMs). Consistent schemas (always objects) are generally easier for tools to write and modify programmatically without bugs.

Since these YAML files are internal to WPT infrastructure and changes here won't impact the final consumers of WEB_FEATURES_MANIFEST.json, I felt we had the flexibility to prioritize that tool-friendliness.

But after thinking about it, I agree with what you said that keeping the barrier to entry low for human contributors is really important (which option 2 does).

One thing I wanted to point out. I like the fact that you mentioned changing name to ids. Maybe for option 2, we could do the same for the high level key and change to files or rules. Either is okay. But, the fact that the underlying list is essentially pointing to files or rules makes more sense than features now.

files: # or rules here.
- foo.html: [print, alerts]
- print-*:
    ids: [print]
    reason: "All the print reftests"
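
For tooling, the two forms in option 2 could be folded into one shape with something like the sketch below. It assumes the rules have already been loaded into Python dicts and that the shorthand value is always a list of IDs; the helper name `normalize_rule` and the output key `pattern` are hypothetical.

```python
def normalize_rule(rule):
    """Expand a single-key rule mapping into a uniform dict."""
    if len(rule) != 1:
        raise ValueError(f"expected exactly one pattern per rule: {rule!r}")
    (pattern, value), = rule.items()
    if isinstance(value, list):
        # Shorthand: the value is just the list of web-feature IDs.
        return {"pattern": pattern, "ids": list(value)}
    if isinstance(value, dict) and "ids" in value:
        # Extended form: "ids" plus any extra metadata such as "reason".
        return {"pattern": pattern, **value}
    raise ValueError(f"unsupported rule value for {pattern!r}: {value!r}")


# Parsed form of the example above.
rules = [
    {"foo.html": ["print", "alerts"]},
    {"print-*": {"ids": ["print"], "reason": "All the print reftests"}},
]
normalized = [normalize_rule(rule) for rule in rules]
```

Authors keep the short form in the common case, while tools only ever see objects with an `ids` key.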

jugglinmike · Jun 17, 2026

Awesome!

Swapping out "features" makes a lot of sense to me! I like "rules" more than "files" because the values describe more than just files.

This conversation about extensibility is worth highlighting. I've tried to do that by framing it as a risk within the RFC itself: how the proposal appears to be less extensible, but how we've considered the direction that future additions can take. This will help guide our grandchildren when they find the time to add metadata.

Thanks @jcscottiii! Thanks @jgraham!

jcscottiii

@jugglinmike Sorry it took so long on the follow up review. PTAL at the unresolved conversations.

jcscottiii · 2026-06-18T15:21:51Z

 deviation of approximately 5.7487. This reduces the benefit of optimizing for
 the case of web-features with many associated file-matching patterns.

+### Extendability


Thanks for adding this section! I get that this is a risk. But supporting both the short hand and extended form out of the box will be good long term!

RFC 236: Exclusions in WEB_FEATURES.yml files

4c5469a

jugglinmike mentioned this pull request Mar 26, 2026

Extend WEB_FEATURES.yml to support feature exclusions web-platform-tests/wpt#58757

Draft

jugglinmike changed the title ~~RFC 236: Exclusions in WEB_FEATURES.yml files~~ RFC 237: Exclusions in WEB_FEATURES.yml files Mar 26, 2026

jugglinmike added 2 commits March 26, 2026 17:47

Correct RFC number

2361fbe

Add missing word

14a034e

jcscottiii mentioned this pull request Apr 1, 2026

Adds new lint rule to detect test overlap in web feature files web-platform-tests/wpt#58827

Draft

Correct typos

31cb54f

Update proposal

e3e50b3

jugglinmike requested review from gsnedders, jcscottiii and jgraham June 4, 2026 17:59

jcscottiii reviewed Jun 10, 2026

Implement review feedback

7aef291

jgraham approved these changes Jun 12, 2026

jgraham reviewed Jun 12, 2026

Comment thread rfcs/web_features_exclusions.md

jcscottiii reviewed Jun 16, 2026

jugglinmike added 3 commits June 16, 2026 16:37

Require web-feature IDs to be declared in a list

b5d74ad

Replace "features" with "rules"

380c8be

Add section on extensibility

d3984a5

jcscottiii approved these changes Jun 18, 2026
