Using Slang's language inference tool #1158

Merged
Janther merged 12 commits into main from slangs-infer-language
Jun 17, 2025

Conversation

@Janther
Member

@Janther Janther commented Jun 2, 2025

Slang also offers a new tool to infer the language using pragma statements.

With this we can drop our custom language inference tool, but we have to keep some of its logic for the cases where the pragma allows a very broad set of language versions but the code uses a very specific syntax that was not taken into consideration when the pragma statement was defined. The Slang team stated that this is beyond the scope of their tool.

Another change is that the Slang documentation contains a list of every version that has a syntax change. With this information we can change our old behaviour of using minor versions as syntax change steps.

Also, in the previous version we were caching the last successful parser. Since Prettier's functions are async, multiple calls can be made with different pragmas and syntaxes, and this could create issues if the cached parser is changed asynchronously between calls.
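
A minimal sketch of the caching hazard described above, not the plugin's actual code. `cachedVersion`, `parseWithSharedCache` and `parseWithLocalState` are hypothetical names, and the `await` stands in for any asynchronous step between selecting a parser and using it.

```typescript
// Hypothetical module-level cache, standing in for "the last successful parser".
let cachedVersion: string | undefined;

// Hypothetical async entry point: stores the selection, yields, then reads the cache.
async function parseWithSharedCache(version: string): Promise<string> {
  cachedVersion = version;
  // Any await lets another pending call run and overwrite the shared cache.
  await Promise.resolve();
  return `parsed with ${cachedVersion}`;
}

// Variant that keeps the selection local to each call.
async function parseWithLocalState(version: string): Promise<string> {
  const selected = version;
  await Promise.resolve();
  return `parsed with ${selected}`;
}

// Runs two overlapping calls against each variant and returns both result pairs.
async function demo(): Promise<string[][]> {
  const shared = await Promise.all([
    parseWithSharedCache("0.7.6"),
    parseWithSharedCache("0.8.30"),
  ]);
  const local = await Promise.all([
    parseWithLocalState("0.7.6"),
    parseWithLocalState("0.8.30"),
  ]);
  return [shared, local];
}
```

In the shared variant the second call overwrites the cache before the first call resumes, so both results report the second version. Keeping the selection local to each call avoids that.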

@Janther Janther requested a review from fvictorio June 2, 2025 18:18
@fvictorio
Member

I'm not fully convinced about this one. Let me first check if I'm understanding correctly:

We are now using Slang's inferLanguageVersions. This takes some Solidity code and uses the solidity version pragmas to return a list of versions that are supported by Slang and compatible with those pragmas. We could technically just use the latest version of this list, but your point is that the user's pragmas might be wrong and that we should then go over the list (in descending order) and use the latest version that correctly parses the file.

In addition to that, we can filter that list returned by Slang so that it only includes "milestone" versions; that is, versions where some syntax actually changed. This is just an optimization, as far as I understand.
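
The approach summarised above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `inferred` stands for the list returned by Slang's inferLanguageVersions, while `milestones` and the `parses` callback are hypothetical inputs. The lowest inferred version is kept as a candidate because its grammar may predate every milestone in the range.

```typescript
// Compare two "major.minor.patch" strings numerically.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// inferred:   versions compatible with the file's pragmas.
// milestones: hypothetical set of versions where the syntax changed.
// parses:     hypothetical callback, true when the file parses with that version.
function latestParsingVersion(
  inferred: string[],
  milestones: Set<string>,
  parses: (version: string) => boolean,
): string | undefined {
  const sorted = [...inferred].sort(compareVersions);
  // Keep the lowest inferred version plus every milestone inside the range.
  const candidates = sorted.filter((v, i) => i === 0 || milestones.has(v));
  // Try the newest candidate first and stop at the first one that parses.
  return candidates.reverse().find(parses);
}
```

With an illustrative list such as 0.6.0 through 0.8.30, milestones 0.7.0 and 0.8.0, and a file that only parses before 0.7.0, the sketch tries 0.8.0, then 0.7.0, and settles on 0.6.0.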

There are two things about this approach I don't quite like:

  • This seems like a lot of complexity added for what in my mind is very much an edge case: the user has an incorrect pragma, which is very permissive and includes backwards-incompatible changes, and the code uses syntax from before those incompatible changes. An example of this would be a file with a pragma that is >=0.6.0, which uses gwei as an identifier, and is then parsed with a 0.8.x version.
  • This also means that we have a list of milestone versions we have to manually keep up to date. Maybe this is not a big deal because new syntax needs changes in our code anyway, but it still adds maintenance cost for something that IMO is not likely to happen (and has an easy fix if it happens: fix your pragmas or set a specific compiler in the Prettier Solidity options).

So, unless you feel strongly that this is a good trade-off, I'd propose just using the latest version inferred by Slang. Regardless of this discussion, we should improve our parsing errors to include the parser version we used (I don't think we are doing it now), and that should make that edge case easier to fix when/if it happens.

@Janther
Member Author

Janther commented Jun 6, 2025

The only other example I also had in mind is code without a pragma, where the version is chosen based on the syntax only.

Keeping this list up to date will indeed be an extra load to keep in mind every time Slang is updated, and I can imagine us forgetting to update the list if we are not focused.

The only issue for me is that people will rely mostly on this feature, and in all of these cases it will fail.

@OmarTawfik

> I'd propose just using the latest version inferred by Slang

I assume that a user writing a pragma ^0.7.5 means: I'm currently building/testing with Solc 0.7.5, and I want to keep using the syntax/semantics/APIs of 0.7.5. I also want to get any possible future bug fixes/optimizations that they might provide, assuming the first point holds.

But because Solc releases don't necessarily follow SemVer (breaking changes in patch releases), we might want to use the first/earliest inferred version by Slang, not the last/latest. This is because we have the ability to precisely "use the syntax/semantics/APIs of 0.7.5", and exclude any breaking changes in later versions.

Prettier is not using Solc here, so it will never benefit from "possible future bug fixes/optimizations" in any case.
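
To make the earliest-versus-latest choice concrete, here is a small self-contained sketch. It hand-rolls a simplified caret check following npm-style semver rules (for a `0.x.y` base, the minor version is pinned) rather than calling Slang or the semver package, and the `supported` list is an illustrative subset, not Slang's real list.

```typescript
type Triple = [number, number, number];

// Split "major.minor.patch" into numbers.
function parseVersion(v: string): Triple {
  const [major = 0, minor = 0, patch = 0] = v.split(".").map(Number);
  return [major, minor, patch];
}

function compareTriples(a: Triple, b: Triple): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Simplified caret semantics: the leftmost non-zero component of `base` is pinned.
function satisfiesCaret(version: string, base: string): boolean {
  const v = parseVersion(version);
  const b = parseVersion(base);
  if (compareTriples(v, b) < 0) return false;
  if (b[0] > 0) return v[0] === b[0];
  if (b[1] > 0) return v[0] === 0 && v[1] === b[1];
  return compareTriples(v, b) === 0;
}

// Illustrative subset of supported versions (an assumption for this example).
const supported = ["0.7.4", "0.7.5", "0.7.6", "0.8.0"];

const matching = supported
  .filter((v) => satisfiesCaret(v, "0.7.5"))
  .sort((a, b) => compareTriples(parseVersion(a), parseVersion(b)));

const earliest = matching[0]; // the "use exactly what I tested with" choice
const latest = matching[matching.length - 1]; // the "newest allowed" choice
```

For the pragma ^0.7.5 the matching set here is 0.7.5 and 0.7.6, so the two strategies differ only in which end of that list they take.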

> we have a list of milestone versions we have to manually keep up to date

If you still need this, it can also be generated automatically. A jq expression can select the enabled and reserved properties from Slang's JSON, and the result can be checked in next to the plugin code. Note that it is just an internal/informal format of Slang’s own grammar, so we make no guarantees about compatibility/stability of that file.

> we should improve our parsing errors to include the parser version we used

This is a great point. Something that says:

  • We inferred your code to be using Solidity version X. If you would like to change that, update the pragmas in your source file, or specify a version in .prettierrc or VSCode’s settings.json.

If there are cases where the correct version can never be inferred, adding an override in config might help there!
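
One possible shape for such a message, as a sketch only: `formatParseError` is a hypothetical helper, and the wording simply combines the suggestion above with the syntax-error text.

```typescript
// Hypothetical helper: builds a parse error that names the inferred version.
function formatParseError(version: string, syntaxError: string): string {
  return [
    `We inferred your code to be using Solidity version ${version}.`,
    "If you would like to change that, update the pragmas in your source file,",
    "or specify a version in .prettierrc or VSCode's settings.json.",
    "",
    "We encountered the following syntax error:",
    "",
    `\t${syntaxError}`,
  ].join("\n");
}

// Example usage with made-up inputs.
const exampleMessage = formatParseError("0.8.30", "Expected ';' at line 3");
```

Including the version in the thrown error gives the user enough information to either fix the pragma or pin a compiler in the configuration.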

@Janther Janther force-pushed the slangs-infer-language branch from c87faf6 to 7b8530c Compare June 9, 2025 11:03
@fvictorio
Member

> The only other example I also had in mind is code without a pragma, where the version is chosen based on the syntax only.

I don't think this happens a lot in practice (it's a bad idea, and also you get an annoying compilation warning). So if we are just using the latest supported version (vs. inferring it), I think it's fine.

> we might want to use the first/earliest inferred version by Slang, not the last/latest

This is an interesting idea. The problem is that tools (or at least Hardhat, but I think Truffle used to do something similar) infer the latest possible version. For example, in Hardhat if you have ^0.8.0, and you have configured compilers 0.8.0 and 0.8.1, the latter will be used. I understand that it's a different question for the parser, but still, I would err on the side of consistency I guess.

tl;dr, I think using the latest supported version is simple and consistent with other tools. Also, I wouldn't consider this behavior something that needs to be super stable. Ideally we don't change it later, but if this turns out to be a bad decision, it would be fine to change it because you can always specify the compiler you want to use yourself to override the default behavior.

@Janther
Member Author

Janther commented Jun 11, 2025

Ok sorry for the few days AFK,

To explain my approach a bit: I'm thinking of supporting developers while minimising the configuration that needs to be used with prettier-plugin-solidity (the compiler option is available but should not be required), to the point that it just works.

I consider the pragmaless scenario because I still want the user to see the formatting on an unfinished piece of code or, for example, when writing an MD file with an embedded Solidity example.

I also consider the use case where the developer started with a pragma (e.g. ^0.8.0), eventually adopted a new syntax, and just changed the installed compiler without updating the pragma. It's not Prettier's job to warn them (that's a linter's job); we just need to format the code. This creates the issue where we can have a range of versions that fit the pragma statement and a range of versions that fit the syntax being used in the code. For this latter case, I don't mind being slower in identifying the correct version (it's the tax for not being thorough when developing), but it still needs to work when possible and only raise an error when the pragma statement and the syntax really don't match.

I reckon my current solution using a list of the breaking changes in the syntax has a few possible issues:

  • us not updating the list when updating Slang.
  • Slang's team not updating the documentation website when adding a syntax change.
  • Slang's team deciding that this list is not worth keeping in their documentation.

All of these could be solved by @OmarTawfik's offer to support this list (unofficially) based on their JSON file, which is autogenerated as part of their own development of the parser's languages.

I also want to say that we did not have this issue with our old parser because it didn't rely on the version, so we never got to discuss who our end user is.

Happy to keep this conversation open so we can adapt the decisions in the future.

@Janther Janther force-pushed the slangs-infer-language branch from 7b8530c to 5301dd6 Compare June 12, 2025 18:37
@Janther
Member Author

Janther commented Jun 13, 2025

I just added a new commit removing the iteration through the list.

I had to also provide specific versions for a few of the tests that were relying on version inference.

I also thought about another reason to have the list (hopefully automated from the Slang API). There have been talks about supporting .yul files. These don't have any pragma, so we will always infer the latest supported version, which might not align with the code being formatted.

@Janther Janther force-pushed the slangs-infer-language branch 2 times, most recently from d871b1d to 61c2f23 Compare June 16, 2025 18:02
@Janther Janther force-pushed the slangs-infer-language branch from 61c2f23 to ff18c5d Compare June 17, 2025 10:31
@fvictorio fvictorio requested a review from Copilot June 17, 2025 13:45
Copilot AI left a comment

Pull Request Overview

This PR replaces our custom language inference tool with Slang's built‐in capabilities while retaining fallback logic for very specific syntax cases. It also updates test cases and snapshots to include explicit compiler versions and adjusts parser instantiation and error messaging accordingly.

  • Updated runFormatTest calls to supply a compiler version.
  • Adjusted parser creation logic and error handling in src/slangSolidityParser.ts and src/slang-utils/create-parser.ts.
  • Removed outdated pragma statements from snapshots and Solidity source files.

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/format/MemberAccess/*.js | Updated test to include compiler version for Slang. |
| tests/format/IndexOf/*.js, FunctionDefinitions/*.js, etc. | Similar updates to pass compiler version and update snapshots accordingly. |
| tests/format/Comments/*.snap, AllSolidityFeatures/*.snap | Removed outdated pragmas and updated snapshot references. |
| tests/config/run-format-test.js | Changed property access in parser creation to account for new object shape. |
| src/slangSolidityParser.ts | Updated parser destructuring; removed explicit error throwing for invalid parse. |
| src/slang-utils/create-parser.ts | Refactored parser creation, changed semver filter from maxSatisfying to minSatisfying, and updated error messages. |
| package.json | Upgraded dependency versions for ESLint and related tooling. |

Comments suppressed due to low confidence (1)

src/slangSolidityParser.ts:24

  • The removal of the error throw when parseOutput is invalid may lead to silent failures if parsing fails. Consider reintroducing error handling to ensure that invalid parse outputs are not silently ignored.
}

Comment thread src/slang-utils/create-parser.ts Outdated

if (!result.parseOutput.isValid())
throw new Error(
'We encoutered the following syntax error:\n\n\t' +

Copilot AI Jun 17, 2025

The word 'encoutered' appears to be misspelled; consider correcting it to 'encountered'.

Suggested change
- 'We encoutered the following syntax error:\n\n\t' +
+ 'We encountered the following syntax error:\n\n\t' +

Member

I enabled the Copilot review because I saw this typo and was curious whether it would catch it, and whether it would catch anything else.

Member

@fvictorio fvictorio left a comment

Tiny comment. Feel free to merge with or without applying that suggestion.

}
const result = parserAndOutput(
text,
inferredRanges[inferredLength === supportedLength ? inferredLength - 1 : 0]
Member

Can inferredRanges be empty? I think this can only happen if you have two pragma statements whose intersection is empty, which is very unlikely, but maybe throwing an assertion error wouldn't be a bad idea.

Member Author

So far Slang just throws an error in this case, but I could add a check here in case Slang changes this behaviour.
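
A sketch of what such a check might look like, written as a standalone function rather than against the real code in create-parser.ts. `selectVersion` is a hypothetical name, and both lists are assumed to be sorted in ascending order.

```typescript
// Hypothetical helper mirroring the index rule in the line under review,
// with an explicit guard for an empty inferred list.
function selectVersion(inferred: string[], supported: string[]): string {
  // When inference did not narrow the supported list (e.g. no pragma),
  // use the last entry; otherwise use the first entry of the inferred list.
  const index = inferred.length === supported.length ? inferred.length - 1 : 0;
  const version = inferred[index];
  if (version === undefined) {
    // Reached only when `inferred` is empty, e.g. pragmas with no overlap.
    throw new Error(
      "No supported Solidity version satisfies the pragma statements.",
    );
  }
  return version;
}
```

The guard costs one comparison and turns a possible `undefined` version into an explicit error if Slang ever returns an empty list instead of throwing.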

@Janther Janther merged commit 3c24b79 into main Jun 17, 2025
7 checks passed
@Janther Janther deleted the slangs-infer-language branch June 17, 2025 15:32
4 participants