HTML API: Introduce CSS class-list splitter.#10043
HTML API: Introduce CSS class-list splitter.#10043dmsnell wants to merge 2 commits intoWordPress:trunkfrom
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
b1fa4c5 to
e0a9bff
Compare
| * @return Generator Use this in a foreach loop to iterate over the class names. | ||
| */ | ||
| function wp_split_class_names( $class_attribute_string ) { |
There was a problem hiding this comment.
| * @return Generator Use this in a foreach loop to iterate over the class names. | |
| */ | |
| function wp_split_class_names( $class_attribute_string ) { | |
| * @return Generator<non-empty-string> Use this in a foreach loop to iterate over the class names. | |
| */ | |
| function wp_split_class_names( $class_attribute_string ): Generator { |
Naturally, the same should be applied to \WP_HTML_Tag_Processor::class_list()
There was a problem hiding this comment.
Something
function wp_split_class_names( $class_attribute_string ): ?Generator {
There was a problem hiding this comment.
Given that this function contains yield statements, I don't think it ever can return null. It will always return Generator. See https://3v4l.org/YgDnb
| @@ -0,0 +1,59 @@ | |||
| <?php | |||
|
|
|||
| * @return Generator Use this in a foreach loop to iterate over the class names. | ||
| */ | ||
| function wp_split_class_names( $class_attribute_string ) { | ||
| if ( '' === $class_attribute_string ) { |
There was a problem hiding this comment.
| if ( '' === $class_attribute_string ) { | |
| if ( '' === $class_attribute_string || ! is_string( $class_attribute_string ) ) { |
There was a problem hiding this comment.
added in 8185519
I’m really torn over these new functions and how to balance types. I don’t want to make permissive functions that hide problems, but I also don’t want sites to crash.
maybe we should add a _doing_it_wrong() for the non-string case?
There was a problem hiding this comment.
To me I think types are good to use except in the filter callback context, where other plugins can return mixed values from other callbacks.
So actually I think I would go ahead and add the string type to this param then there's no need for the type check. I don't think all APIs need be crafted with kid gloves. Adding type checks for each and every function and _doing_it_wrong() for every parameter seems not ideal, as PHP has this built-in with types, and otherwise there is phpdoc for static analysis to alert to the dev that they're doing it wrong.
There was a problem hiding this comment.
Adding type checks for each and every function and _doing_it_wrong() for every parameter seems not ideal
With this I agree. My main problem is that the static type annotations have a poor mix of behaviors:
- They will lead to crashes in production when we know developers will likely be testing with string values. They are less likely to test for the cases that crash it, and may miss that they have type errors.
- They are not enforced, meaning the addition of type annotations and the removal of type checks hides invalid values deeper in the code. Even if this code had strict mode enabled, PHP wouldn’t enforce the types unless the file containing the calling code enables strict mode as well.
I’ve just seen way too many crashes when applying type annotations to function arguments, and it makes me feel like it’s a bad fit here.
| } | ||
|
|
||
| // Get these from the HTML API to avoid ad-hoc parsing HTML or CSS class names. | ||
| $processor = new WP_HTML_Tag_Processor( '<wp-noop>' ); |
There was a problem hiding this comment.
If WP_HTML_Tag_Processor or class_list() can throw exceptions? Consider try/catch or input validation to avoid fatal errors.
There was a problem hiding this comment.
They should not throw exceptions.
e2b7b9b to
a561a69
Compare
e6ff4e2 to
cdb967d
Compare
What about |
Seems cool, but do we have any use cases for this in core PHP? It would be nice to include some example implementations in the core codebase for this function to actually leverage it. |
I like this, though I still like Would love to continue stewing on the name. Overly-short, overly-long, it’s hard to find one that’s just right. |
82e9ae0 to
500bbb8
Compare
|
I’ve turned this into a static method on the Tag Processor, but I instantly don’t like it because it lost the nuance of decoding HTML character references. This is a conundrum, however, because existing code mixes decoded and non-decoded class names. For example, code will read the I may revert the last commit. While it’s helpful that this function properly splits and deduplicates that class names, decoding the HTML character references was an important piece as well, and I think that’s a bit harder to merge into the Tag Processor’s interface. |
500bbb8 to
81ef5e0
Compare
|
@westonruter I tossed out some refactors in #10215. They highlight two things to me:
It also leads me to feel like having a new separate function is best and exporting the internals of the HTML API is a mistake. Perhaps there is room for two new functions:
Something like this to more clearly communicate whether things like null bytes and character references shall be transformed or whether it’s assumed that the class names are the “raw” and unescaped class names build within source code. |
858440d to
661d588
Compare
661d588 to
9430d73
Compare
This patch introduces a new CSS helper module containing a new function, `wp_split_class_names()`. This function wraps some code to rely on the HTML API to take an HTML `class` attribute value and return a `Generator` to iterate over the classes in that value. Many existing functions perform ad-hoc parsing of CSS class names, usually by splitting on a space character. However, there are issues with this approach: - There is no decoding of HTML character references, which is normative inside HTML attributes. - There is no handling of null bytes. - Class names can be split by more than just the space character. - There is no handling of duplicates, and while mostly benign, code forgetting to account for duplicates can lead to defects. The new function handles the nuances to let developers focus on reading CSS class names, adding new class names, and removing class names. This serves a middleground between legacy code interacting with CSS class names in isolation and code processing full HTML documents.
9430d73 to
e5a3a62
Compare
Trac ticket: Core-63694.
This patch extracts
WP_HTML_Tag_Processor->class_list()for static calls via a new staticWP_HTML_Tag_Processor::parse_class_list()method. This new method contains the internal CSS parsing code to take an HTMLclassattribute value and return aGeneratorto iterate over the classes in that value. Class names are appropriately deduplicated according to the given document compatibility mode, whose default is no-quirks mode.Design review requests
The name isn’t great. It iterates over the class names, splits them, but also “deduplicates” them according to the parsing rules for aclassattribute _no-quirks-mode_ HTML and it decodes HTML entities. It also _MUST_ represent a fullclassattribute because the parsing of trailing character references which are missing a semicolon or otherwise incomplete is dependent on whether they fall at the end of a string.wp_classname_walker()wp_walk_class_attribute()wp_unique_classnames()classnames()in JS? We could pass varargs which arestring|falseor an array of additional class names to add.Background
Many existing functions perform ad-hoc parsing of CSS class names, usually by splitting on a space character. However, there are issues with this approach:
The new function handles the nuances to let developers focus on reading CSS class names, adding new class names, and removing class names. This serves a middleground between legacy code interacting with CSS class names in isolation and code processing full HTML documents.