Skip to content

Make Literal::byte_character_value work with bytes as well#157338

Open
GuillaumeGomez wants to merge 1 commit into
rust-lang:mainfrom
GuillaumeGomez:byte_character_value
Open

Make Literal::byte_character_value work with bytes as well#157338
GuillaumeGomez wants to merge 1 commit into
rust-lang:mainfrom
GuillaumeGomez:byte_character_value

Conversation

@GuillaumeGomez

@GuillaumeGomez GuillaumeGomez commented Jun 2, 2026

Copy link
Copy Markdown
Member

View all comments

As noted in this comment, byte_character_value should work for bytes, so this PR fixes it.

r? @traviscross

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 2, 2026
@rust-log-analyzer

This comment has been minimized.

@GuillaumeGomez

Copy link
Copy Markdown
Member Author

Just realized the failure is because of the GCC backend. Not sure why, gonna ignore this test for now for the GCC backend.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from 22e7e40 to d6ee541 Compare June 3, 2026 13:27
@GuillaumeGomez

Copy link
Copy Markdown
Member Author

CI is happy now.

Comment thread library/proc_macro/src/lib.rs Outdated
pub fn byte_character_value(&self) -> Result<u8, ConversionErrorKind> {
self.0.symbol.with(|symbol| match self.0.kind {
bridge::LitKind::Char => {
bridge::LitKind::Char | bridge::LitKind::Byte => {

@traviscross traviscross Jun 3, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bridge::LitKind::Char | bridge::LitKind::Byte => {
bridge::LitKind::Byte => {

What's the explanation for checking for both rather than just for Byte? And what error would we expect character('é').byte_character_value() to return?

View changes since the review

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'A' is a valid byte (like all ascii characters). As for the error, I expect ConversionErrorKind::InvalidLiteralKind.

@traviscross traviscross Jun 4, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the error, I expect ConversionErrorKind::InvalidLiteralKind.

I'm getting Err(FailedToUnescape(NonAsciiCharInByte)) instead.

(Let's please assert the correct answer in the test rather than just checking .is_err().)

@rustbot author

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is even better, gonna update the test though to reflect it.

@traviscross traviscross Jun 8, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'A' is a valid byte (like all ascii characters).

We don't accept:

let _: u8 = 'A'; //~ ERROR mismatched types

The documentation says this function:

Returns the unescaped character value if the current literal is a byte character literal.

The behavior I would therefore expect here is:

#![feature(proc_macro_value, rustc_private)]
extern crate proc_macro;
use proc_macro::{Literal, TokenStream, TokenTree, ConversionErrorKind::*, EscapeError::*};

#[proc_macro]
pub fn demo(input: TokenStream) -> TokenStream {
  let c = r"b'A'".parse::<Literal>().unwrap();
  assert_eq!(c.character_value(), Err(InvalidLiteralKind));
  assert_eq!(c.byte_character_value(), Ok(65));
  let c = r"'A'".parse::<Literal>().unwrap();
  assert_eq!(c.character_value(), Ok('A'));
  assert_eq!(c.byte_character_value(), Err(InvalidLiteralKind));
  "()".parse().unwrap()
}

Thoughts?

@GuillaumeGomez GuillaumeGomez Jun 8, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My base argument for all this is that we already allow to get the value between integer/float types even if the type doesn't match exactly. So I think it's acceptable in this case to be a bit more flexible.

@traviscross traviscross Jun 8, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about what you said about integer/float types. Regarding your example:

But this code is valid:

fn main() {
    let x: u8 = 'a' as _;
    let y: char = b'a' as _;
    println!("{x:?} {y:?}");
}

Note that this branch does not accept this in the opposite direction. I.e., it conforms to this behavior:

#[proc_macro]
pub fn demo(input: TokenStream) -> TokenStream {
  let c = r"b'A'".parse::<Literal>().unwrap();
  assert_eq!(c.character_value(), Err(InvalidLiteralKind));
  assert_eq!(c.byte_character_value(), Ok(65));
  let c = r"'A'".parse::<Literal>().unwrap();
  assert_eq!(c.character_value(), Ok('A'));
  assert_eq!(c.byte_character_value(), Ok(65));
  "()".parse().unwrap()
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Should I allow it to match integer/float types then?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I wouldn't think so. We're operating at a syntactic level here. I would not expect the availability of as casts or other later-stage conversions to affect what we accept at this level. I'd expect byte_character_value to return Ok(…) only for byte character literals.

In a similar way, in #154608, what you implemented conformed to this behavior:

  let c = r"0u8".parse::<Literal>().unwrap();
  assert_eq!(c.u8_value(), Ok(0));
  assert_eq!(c.u16_value(), Err(InvalidLiteralKind));
  assert_eq!(c.f32_value(), Err(InvalidLiteralKind));
  let c = r"0u16".parse::<Literal>().unwrap();
  assert_eq!(c.u8_value(), Err(InvalidLiteralKind));
  assert_eq!(c.u16_value(), Ok(0));
  assert_eq!(c.f32_value(), Err(InvalidLiteralKind));
  let c = r"0".parse::<Literal>().unwrap();
  assert_eq!(c.u8_value(), Ok(0));
  assert_eq!(c.u16_value(), Ok(0));
  assert_eq!(c.f32_value(), Err(InvalidLiteralKind));
  let c = r"0.".parse::<Literal>().unwrap();
  assert_eq!(c.u8_value(), Err(InvalidLiteralKind));
  assert_eq!(c.u16_value(), Err(InvalidLiteralKind));
  assert_eq!(c.f32_value(), Ok(0.));

That seems correct to me. u8_value() accepts 0u8 and 0 as those are both valid u8 literals. u16_value() accepts 0u16 and 0 as those are both valid u16 literals. But u16_value() does not accept 0u8 even though we can as cast from it to a value of type u16. And f32_value() doesn't accept any of the integer literals.

My base argument for all this is that we already allow to get the value between integer/float types even if the type doesn't match exactly.

I'm not certain that I understand the argument that you're making here. If I didn't capture it above, I'm curious to know, so please do elaborate.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

@traviscross

Copy link
Copy Markdown
Contributor

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2026
@GuillaumeGomez

Copy link
Copy Markdown
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2026
@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from d6ee541 to a41f12c Compare June 4, 2026 10:15
@@ -1,6 +1,8 @@
//@ edition: 2021

#![feature(proc_macro_span)]
#![feature(proc_macro_value)]
#![feature(rustc_private)]

@GuillaumeGomez GuillaumeGomez Jun 4, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange that we need this one:

error[E0658]: use of unstable library feature `rustc_private`: this crate is being loaded from the sysroot, an unstable location; did you mean to load this crate from crates.io via `Cargo.toml` instead?
  --> /home/imperio/rust/rust/tests/ui/proc-macro/auxiliary/api/literal.rs:3:18
   |
LL | use proc_macro::{EscapeError, Literal, ConversionErrorKind};
   |                  ^^^^^^^^^^^
   |
   = note: see issue #27812 <https://github.com/rust-lang/rust/issues/27812> for more information
   = help: add `#![feature(rustc_private)]` to the crate attributes to enable
   = note: this compiler was built on YYYY-MM-DD; consider upgrading it if it is out of date

Because the reexport is done as follow in proc-macro:

#[unstable(feature = "proc_macro_value", issue = "136652")]
pub use rustc_literal_escaper::EscapeError;

I suppose it's because it comes from a dependency, but that remains surprising.

View changes since the review

@GuillaumeGomez

Copy link
Copy Markdown
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 4, 2026
@rust-bors

This comment has been minimized.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from a41f12c to 34966c8 Compare June 5, 2026 13:47
@rustbot

This comment has been minimized.

@GuillaumeGomez

Copy link
Copy Markdown
Member Author

Fixed merge conflict.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from 34966c8 to 33dd5f4 Compare June 8, 2026 20:21
@rustbot

This comment has been minimized.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from 33dd5f4 to e27598e Compare June 9, 2026 10:05
@GuillaumeGomez

Copy link
Copy Markdown
Member Author

Updated.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from e27598e to f720766 Compare June 9, 2026 10:23
@rustbot rustbot added A-CI Area: Our Github Actions CI A-testsuite Area: The testsuite used to check the correctness of rustc T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Jun 9, 2026
@rustbot

This comment has been minimized.

@GuillaumeGomez GuillaumeGomez force-pushed the byte_character_value branch from f720766 to e27598e Compare June 9, 2026 10:23
@rustbot

rustbot commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-CI Area: Our Github Actions CI A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants