BEP-049 M1–M4: backtick strings, interpolation, control flow, tagged templates#3577
BEP-049 M1–M4: backtick strings, interpolation, control flow, tagged templates#3577codeshaunted wants to merge 37 commits into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Lands the inert form of BEP-049 backtick strings: lex, parse, lower, and
format `` `...` `` (including multi-tick ladder + auto-dedent), with `${...}`
left as literal text until M2 wires interpolation.
- Lexer: new `Backtick` token + `BACKTICK` syntax kind.
- Parser: anchored-close ladder per BEP §8 (incl. JEP-326 cases). Detects
empty multi-tick (`` `` ``) at the opener and emits a clean diagnostic
instead of silently consuming downstream content.
- AST: `BacktickStringLiteral` with `value()` that decodes escapes and runs
multi-line content through `baml_base::dedent::preprocess_template`.
- HIR lowering: backtick literal → `Expr::Literal::String`.
- Formatter: prints backtick strings verbatim, round-trip idempotent.
- Consolidation: moves `unescape_string_literal` into `baml_base::escape`
(kills two near-identical copies) and adds `unescape_backtick_string_literal`
for `` \` `` and `\${` escapes. Lifts `preprocess_template` out of
`sys_llm::jinja::render` into `baml_base::dedent` so both prompt rendering
and the new backtick lowering share a single dedenter.
Tests:
- Unit: 3 lexer + 8 parser + 9 escape + 6 dedent + 6 HIR lowering + 3 fmt.
- Fixture pipeline: `baml_tests/projects/compiles/backtick_strings/`
covers lex → parser → HIR → TIR → MIR → diagnostics → codegen → formatter.
- Runtime: 4 end-to-end tests in `bex_engine` execute backtick functions
on the VM and assert the returned `BexExternalValue::String` matches.
Tracking doc at `architecture/bep_049_milestones.md` lays out M1–M7.
Lexer snapshot churn is mechanical: existing `.baml` fixtures with literal
backticks in docstrings/error-message strings used to lex as `Error "\``"`
and now lex as `Backtick "\``"`.
159d8e3 to
3ab9197
Compare
Binary size checks passed✅ 7 passed
Generated by |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds BEP-049 backtick and tagged-template support end-to-end: shared dedent/escape utilities, lexer+parser grammar, CST/AST backtick wrappers, lowering to Expr::Template (tagged and default), MIR/TIR/HIR analyses and diagnostics, runtime/to_string adapters, formatter support, and comprehensive tests. ChangesBEP-049 Backtick String Literals
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
baml_language/crates/baml_compiler_syntax/src/ast.rs (1)
1063-1073: 💤 Low valueDedent condition checks decoded content instead of source multiline status.
The condition
decoded.contains('\n')determines whether to apply dedent based on the decoded content having newlines. This differs fromis_multiline()which correctly checks whether the source spans multiple lines. Per BEP-049, dedent should apply only to source-level multiline strings, not to single-line sources containing\nescape sequences.Practically,
preprocess_templateon single-line content with embedded newlines produces the same result, so tests pass. However, the semantic intent is clearer withis_multiline().♻️ Suggested fix for clarity
pub fn value(&self) -> String { let Some(inner) = self.raw_inner() else { return String::new(); }; let decoded = unescape_backtick_string_literal(&inner); - if decoded.contains('\n') { + if self.is_multiline() { baml_base::dedent::preprocess_template(&decoded) } else { decoded } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler_syntax/src/ast.rs` around lines 1063 - 1073, The value() method currently decides to call baml_base::dedent::preprocess_template based on decoded.contains('\n'); change it to use the source-level multiline check by calling is_multiline() instead (i.e., replace the decoded.contains('\n') condition with self.is_multiline()). Keep the rest of the logic (unescape_backtick_string_literal(&inner) and returning either preprocess_template(&decoded) or decoded) unchanged so dedent runs only for source multiline strings.baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs (1)
2300-2301: 💤 Low valueRemove redundant import.
rowan::ast::AstNodeis already imported at line 10, so the localusestatement on line 2301 is unnecessary.♻️ Remove redundant import
fn lower_backtick_string_literal(&mut self, node: &SyntaxNode) -> ExprId { use baml_compiler_syntax::BacktickStringLiteral; - use rowan::ast::AstNode; let value = BacktickStringLiteral::cast(node.clone())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs` around lines 2300 - 2301, Remove the redundant import of rowan::ast::AstNode in lower_expr_body.rs: delete the duplicate `use rowan::ast::AstNode;` line (the file already imports AstNode earlier), leaving only the single existing import to avoid duplicate imports and clippy/warning noise; locate the extra `use` near the block that also imports `baml_compiler_syntax::BacktickStringLiteral` and remove it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_compiler_parser/src/parser.rs`:
- Around line 1840-1849: The helper find_first_backtick_pos (and the similar
helper at the other location) only skips basic trivia via is_basic_trivia, but
parse_backtick_string is entered via self.at(TokenKind::Backtick) which skips
full trivia (including comments), so these helpers can miss an opening backtick
when comments lie between current and the backtick. Fix both helpers to skip the
same set of trivia that self.at uses (i.e., include comments) — replace the
is_basic_trivia check with the broader trivia predicate used by self.at (or
reuse the same utility method used by self.at, or add a small wrapper like
is_full_trivia/is_trivia_including_comments) so the loop advances past comments
as well and then checks for TokenKind::Backtick; update both occurrences
(find_first_backtick_pos and the other helper around the parse_backtick_string
flow) to use this corrected trivia-skipping logic.
---
Nitpick comments:
In `@baml_language/crates/baml_compiler_syntax/src/ast.rs`:
- Around line 1063-1073: The value() method currently decides to call
baml_base::dedent::preprocess_template based on decoded.contains('\n'); change
it to use the source-level multiline check by calling is_multiline() instead
(i.e., replace the decoded.contains('\n') condition with self.is_multiline()).
Keep the rest of the logic (unescape_backtick_string_literal(&inner) and
returning either preprocess_template(&decoded) or decoded) unchanged so dedent
runs only for source multiline strings.
In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs`:
- Around line 2300-2301: Remove the redundant import of rowan::ast::AstNode in
lower_expr_body.rs: delete the duplicate `use rowan::ast::AstNode;` line (the
file already imports AstNode earlier), leaving only the single existing import
to avoid duplicate imports and clippy/warning noise; locate the extra `use` near
the block that also imports `baml_compiler_syntax::BacktickStringLiteral` and
remove it.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e5e898f2-8a4b-4eaf-afda-5b333e80befe
⛔ Files ignored due to path filters (55)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_tests/snapshots/broken_syntax/banned_expressions/baml_tests__broken_syntax__banned_expressions__01_lexer__banned_expressions.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/broken_syntax/error_cases/baml_tests__broken_syntax__error_cases__01_lexer__syntax_errors.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/broken_syntax/mismatched_brackets/baml_tests__broken_syntax__mismatched_brackets__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/broken_syntax/namespaces_same_name/baml_tests__broken_syntax__namespaces_same_name__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/broken_syntax/parser_error_recovery/baml_tests__broken_syntax__parser_error_recovery__01_lexer__partial_input.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/broken_syntax/type_annotation_errors/baml_tests__broken_syntax__type_annotation_errors__01_lexer__type_malformed.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__assert_std__/baml_tests__compiles____assert_std____01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__testing_std__/baml_tests__compiles____testing_std____01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__05_diagnostics.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/deep_method_call/baml_tests__compiles__deep_method_call__01_lexer__deep_method_call.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/host_callable_call/baml_tests__compiles__host_callable_call__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/is_operator/baml_tests__compiles__is_operator__01_lexer__is_operator.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_alias_basic/baml_tests__compiles__json_alias_basic__01_lexer__json_alias_basic.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_cross_namespace_static_call/baml_tests__compiles__json_cross_namespace_static_call__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_cross_namespace_static_call/baml_tests__compiles__json_cross_namespace_static_call__01_lexer__ns_pkg_ns_inner_box.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_llm_return_type/baml_tests__compiles__json_llm_return_type__01_lexer__json_llm_return_type.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_map_literal/baml_tests__compiles__json_map_literal__01_lexer__json_map_literal.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_to_from_string_composite_generic/baml_tests__compiles__json_to_from_string_composite_generic__01_lexer__json_to_from_string_composite_generic.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_to_from_string_concrete/baml_tests__compiles__json_to_from_string_concrete__01_lexer__json_to_from_string_concrete.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_to_from_string_generic_forwarding/baml_tests__compiles__json_to_from_string_generic_forwarding__01_lexer__json_to_from_string_generic_forwarding.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/json_to_from_string_three_level/baml_tests__compiles__json_to_from_string_three_level__01_lexer__json_to_from_string_three_level.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/lambda_fat_arrow/baml_tests__compiles__lambda_fat_arrow__01_lexer__lambda_fat_arrow.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/lambda_field_access/baml_tests__compiles__lambda_field_access__01_lexer__lambda_field_access.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/lexical_scoping/baml_tests__compiles__lexical_scoping__01_lexer__lexical_scoping.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/patterns_new/baml_tests__compiles__patterns_new__01_lexer__patterns_new.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/reflect_type_of_user_generic/baml_tests__compiles__reflect_type_of_user_generic__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/stream_llm_inferred_typeargs/baml_tests__compiles__stream_llm_inferred_typeargs__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/type_annotation/baml_tests__compiles__type_annotation__01_lexer__type_alias_interaction.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/type_annotation/baml_tests__compiles__type_annotation__01_lexer__type_as_identifier.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/type_annotation/baml_tests__compiles__type_annotation__01_lexer__type_positions.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/captured_field_chain/baml_tests__diagnostic_errors__captured_field_chain__01_lexer__captured_field_chain.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/catch_return_type_mismatch/baml_tests__diagnostic_errors__catch_return_type_mismatch__01_lexer__catch_return_type_mismatch.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/catch_throw_regressions/baml_tests__diagnostic_errors__catch_throw_regressions__01_lexer__regressions.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/format_checks/baml_tests__diagnostic_errors__format_checks__01_lexer__match_exprs.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/generic_call_ambiguity/baml_tests__diagnostic_errors__generic_call_ambiguity__01_lexer__array_compare.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/generic_call_ambiguity/baml_tests__diagnostic_errors__generic_call_ambiguity__01_lexer__generic_call.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/generics/baml_tests__diagnostic_errors__generics__01_lexer__classes.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/instanceof_removed/baml_tests__diagnostic_errors__instanceof_removed__01_lexer__instanceof_removed.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/json_static_method_arity/baml_tests__diagnostic_errors__json_static_method_arity__01_lexer__json_static_method_arity.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/match_exhaustiveness/baml_tests__diagnostic_errors__match_exhaustiveness__01_lexer__match_exhaustiveness.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/patterns_new/baml_tests__diagnostic_errors__patterns_new__01_lexer__patterns_new.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/stream_types/baml_tests__diagnostic_errors__stream_types__01_lexer__builtin_refs.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/test_with_runner_ambiguity/baml_tests__diagnostic_errors__test_with_runner_ambiguity__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/type_reflection_strict/baml_tests__diagnostic_errors__type_reflection_strict__01_lexer__reflect_intrinsic_errors.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/type_reflection_strict/baml_tests__diagnostic_errors__type_reflection_strict__01_lexer__strict_normalization.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/unknown_type_error/baml_tests__diagnostic_errors__unknown_type_error__01_lexer__unknown_type_error.snapis excluded by!**/*.snap
📒 Files selected for processing (15)
baml_language/crates/baml_base/src/dedent.rsbaml_language/crates/baml_base/src/escape.rsbaml_language/crates/baml_base/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler_lexer/src/tokens.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/Cargo.tomlbaml_language/crates/baml_compiler_syntax/src/ast.rsbaml_language/crates/baml_compiler_syntax/src/syntax_kind.rsbaml_language/crates/baml_fmt/src/ast/expressions.rsbaml_language/crates/baml_fmt/src/ast/tokens.rsbaml_language/crates/baml_fmt/src/lib.rsbaml_language/crates/baml_tests/projects/compiles/backtick_strings/main.bamlbaml_language/crates/bex_engine/tests/backtick_strings.rs
Lands the active form of BEP-049 backtick strings: `${expr}` interpolates,
with full block-expression body semantics (BEP §4 — statements + optional
trailing expression). Plus TypeScript-parity tightening for escape decoding
identified by auditing typescript-go's scanner.
## Parser / segments / lowering
- Parser: detect adjacent `${` (no trivia between `$` and `{`) and dispatch
to `parse_backtick_interpolation`, which reuses `parse_block_expr` for
the body — so the full host block grammar (let, while, if, throw, return,
…) works inside `${...}`.
- `BacktickSegment` AST type: alternating `Text` / `Interp` segments
consumed by the lowerer. M4's tagged-template lowering will share the
same primitive — different transform, same input.
- Multi-line dedent (BEP §12) treats `${...}` as opaque placeholders for
min-indent calculation per §12 rule 8.
- Lowering: text segments + `Interp.to_string()` concatenated into a
left-folded `+` chain. Empty / statement-only blocks (BEP §4 unit case)
emit `""` directly — no `unit.to_string()` failure.
## Implicit `.to_string()` (BEP §11)
- Added `to_string(self) -> string` to `Int`, `Bool`, `Float`, `String`
in `baml_std` (Rust impls in `bex_vm::package_baml::*`). String's
variant is a no-op so the lowering doesn't need to special-case it.
- Untagged `${expr}` lowers to `expr.to_string()`. Non-string types
(incl. `int?`, `null`, user classes without `to_string`) produce a
compile error — BEP §7 explicitly endorses this: "surfaces null-handling
bugs in prompts at type-check time instead of at LLM-call time."
## TS-parity escape tightening (audit vs typescript-go/internal/scanner)
- AA: `\r\n` and bare `\r` in backtick text normalize to `\n` (mirrors
scanner.go:1650-1660).
- BB: `\b`, `\v`, `\f` extended escapes added to the canonical escape
decoder (mirrors scanner.go:1721-1736). Applied to both quote and
backtick flavors.
## Bug fixes surfaced by audit
- Lexer regex previously absorbed a trailing `$` into the preceding Word
(`before-$` → one token), masking `${` adjacency. Tightened to require
any `$` to be a separator between identifier chunks. Caught by
`backtick_nested_in_interpolation`.
- `BacktickStringLiteral::delimiter_count` used `filter_map(into_token)
.take_while(BACKTICK)`, silently skipping past child nodes and
miscounting the closing run as part of the opener for `${a}${b}`
(adjacent interps). Now breaks on first non-BACKTICK element.
## Tests
- 38 runtime tests in `bex_engine` exercising every shape: while-loop in
interp body, recursive function calls, mixed-type implicit to_string,
TS-parity escapes, CR/CRLF normalization, comments inside `${...}`,
trailing `$`, multi-tick + interp, nested backticks.
- 18 parser unit tests (incl. segments adjacency regression, empty
multi-tick, JEP-326 anchored close, escape-doesn't-interpolate).
- 9 escape-decoder unit tests + 7 lowering unit tests.
- Compile-pipeline fixtures:
- `compiles/backtick_strings/main.baml` extended with M2 cases —
8 stages green (lexer → parser → HIR → TIR → MIR → diagnostics →
codegen → formatter).
- `diagnostic_errors/backtick_strict_types/main.baml` pins the
strict-type compile errors for null / class-without-to_string /
optional interpolation cases.
## Fixture / snapshot churn (incidental)
- `__baml_std__` snapshots: updated for 4 new `to_string` methods.
- `bytecode_display_expanded_unoptimized`: builtin offsets shifted.
- `diagnostic_errors/control_flow/main.baml`: previously errored on
`z.to_string()` (Int had no to_string); rewrote to use a guaranteed-
missing method so the project keeps its tier classification.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
baml_language/crates/baml_compiler_parser/src/parser.rs (1)
1919-1923:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConsume only the immediate raw token after
\.The second
bump_raw()skips intervening whitespace/newlines, so inputs like`\ ${name}`or`\ `can swallow the later$/`and suppress interpolation or the closing delimiter. After consuming the backslash, this path needs to emit exactly one raw token, not “next non-trivia token.”Proposed fix
+ /// Consume exactly one raw token at `self.current` without skipping trivia. + fn bump_exact_raw(&mut self) -> bool { + let Some(token) = self.tokens.get(self.current) else { + return false; + }; + self.events.push(Event::Token { + kind: token_kind_to_syntax_kind(token.kind), + text: token.text.clone(), + }); + self.current += 1; + true + } + fn parse_backtick_content(&mut self, opening_ticks: usize) { let mut loop_counter = 0; loop { @@ if self.at_raw(TokenKind::Backslash) { self.bump_raw(); if self.current < self.tokens.len() { - self.bump_raw(); + self.bump_exact_raw(); } continue; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler_parser/src/parser.rs` around lines 1919 - 1923, After detecting a raw backslash (self.at_raw(TokenKind::Backslash)) we should consume exactly one immediate token (the token directly following the backslash) instead of calling bump_raw which advances to the next non-trivia token; change the second consume to advance/consume only the next token in the token stream (the token at self.current, after the backslash bump) so interpolation delimiters or closing backticks aren't skipped. Update the logic around self.bump_raw(), self.current and self.tokens so the backslash path consumes exactly one following token (the immediate token) rather than “next non-trivia token.”
🧹 Nitpick comments (1)
baml_language/crates/baml_compiler_lexer/src/tokens.rs (1)
853-873: ⚡ Quick winAdd a regression case for
before-${x}token splitting.Line 856 validates
${...}tokenization, but it doesn’t cover the specific guard documented in Lines 112-116 (preventingWord("before-$")). Please add a case assertingWord("before-"), Dollar, LBrace, Word("x"), RBracefor`before-${x}`.Proposed test addition
#[test] fn test_backtick_with_interpolation_tokens() { // ${...} inside backtick — lexer emits the raw tokens; parser assembles let source = "`Hello ${name}`"; let tokens = lex_no_whitespace(source); assert_eq!( tokens, vec![ TokenKind::Backtick, TokenKind::Word, // Hello TokenKind::Dollar, TokenKind::LBrace, TokenKind::Word, // name TokenKind::RBrace, TokenKind::Backtick, ] ); assert_eq!(reconstruct_source(&lex(source)), source); + + // Regression: `${` must not be absorbed into a trailing-$ word segment. + let source2 = "`before-${x}`"; + let tokens2 = lex_no_whitespace(source2); + assert_eq!( + tokens2, + vec![ + TokenKind::Backtick, + TokenKind::Word, // before- + TokenKind::Dollar, + TokenKind::LBrace, + TokenKind::Word, // x + TokenKind::RBrace, + TokenKind::Backtick, + ] + ); + assert_eq!(reconstruct_source(&lex(source2)), source2); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler_lexer/src/tokens.rs` around lines 853 - 873, Update the backtick interpolation test to cover the guard that prevents combining a trailing '$' into a Word token: inside the test_backtick_with_interpolation_tokens (which uses lex_no_whitespace and lex/reconstruct_source), add an assertion for the source "`before-${x}`" that expects tokens [TokenKind::Backtick, TokenKind::Word (with text "before-"), TokenKind::Dollar, TokenKind::LBrace, TokenKind::Word (with text "x"), TokenKind::RBrace, TokenKind::Backtick] and also assert reconstruct_source(&lex(source)) == source to ensure round-trip correctness; locate uses of lex_no_whitespace, lex, and reconstruct_source in the test to add this new case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_compiler_syntax/src/ast.rs`:
- Around line 1168-1196: The dedent remapping in BacktickStringLiteral::segments
misaligns when the first decoded part is RawPart::Interp because the current
loop only advances the split iterator for RawPart::Text; fix by changing the
remapping loop that walks decoded (the for p in &mut decoded after
dedented.split(PLACEHOLDER)) so that it consumes iter.next() for every part
(call iter.next() and ignore its value for RawPart::Interp, assign it to
RawPart::Text for Text parts) thereby preserving split positions, keep the
existing tail-handling logic that appends any leftover rest to the last
RawPart::Text, and add a new test covering a multiline template that starts with
an interpolation to prevent regressions.
In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs`:
- Around line 2300-2302: The doc comment on lower_backtick_string_literal is
stale: update the sentence "Implicit `.to_string()` for non-string types is task
`#13`" to state that lowering now performs implicit `.to_string()` on non-string
interpolations (reflecting the conversion logic in the interpolation-handling
block that forces non-string child expressions to strings), so the comment
matches the implemented behavior in lower_backtick_string_literal and the
adjacent interpolation-lowering code paths.
---
Outside diff comments:
In `@baml_language/crates/baml_compiler_parser/src/parser.rs`:
- Around line 1919-1923: After detecting a raw backslash
(self.at_raw(TokenKind::Backslash)) we should consume exactly one immediate
token (the token directly following the backslash) instead of calling bump_raw
which advances to the next non-trivia token; change the second consume to
advance/consume only the next token in the token stream (the token at
self.current, after the backslash bump) so interpolation delimiters or closing
backticks aren't skipped. Update the logic around self.bump_raw(), self.current
and self.tokens so the backslash path consumes exactly one following token (the
immediate token) rather than “next non-trivia token.”
---
Nitpick comments:
In `@baml_language/crates/baml_compiler_lexer/src/tokens.rs`:
- Around line 853-873: Update the backtick interpolation test to cover the guard
that prevents combining a trailing '$' into a Word token: inside the
test_backtick_with_interpolation_tokens (which uses lex_no_whitespace and
lex/reconstruct_source), add an assertion for the source "`before-${x}`" that
expects tokens [TokenKind::Backtick, TokenKind::Word (with text "before-"),
TokenKind::Dollar, TokenKind::LBrace, TokenKind::Word (with text "x"),
TokenKind::RBrace, TokenKind::Backtick] and also assert
reconstruct_source(&lex(source)) == source to ensure round-trip correctness;
locate uses of lex_no_whitespace, lex, and reconstruct_source in the test to add
this new case.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 47ec80ea-80ac-4bcb-af3d-65b1627839d4
⛔ Files ignored due to path filters (27)
baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_item_by_definition.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_string.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_truncation_hint.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__05_diagnostics.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/backtick_strict_types/baml_tests__diagnostic_errors__backtick_strict_types__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__05_diagnostics.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/control_flow/baml_tests__diagnostic_errors__control_flow__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/null_handling/baml_tests__diagnostic_errors__null_handling__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/diagnostic_errors/null_handling/baml_tests__diagnostic_errors__null_handling__05_diagnostics.snapis excluded by!**/*.snap
📒 Files selected for processing (18)
baml_language/crates/baml_base/src/escape.rsbaml_language/crates/baml_builtins2/baml_std/baml/bool.bamlbaml_language/crates/baml_builtins2/baml_std/baml/float.bamlbaml_language/crates/baml_builtins2/baml_std/baml/int.bamlbaml_language/crates/baml_builtins2/baml_std/baml/string.bamlbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler_lexer/src/tokens.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/ast.rsbaml_language/crates/baml_tests/projects/compiles/backtick_strings/main.bamlbaml_language/crates/baml_tests/projects/diagnostic_errors/backtick_strict_types/main.bamlbaml_language/crates/baml_tests/projects/diagnostic_errors/control_flow/main.bamlbaml_language/crates/bex_engine/tests/backtick_strings.rsbaml_language/crates/bex_vm/src/package_baml/float.rsbaml_language/crates/bex_vm/src/package_baml/int.rsbaml_language/crates/bex_vm/src/package_baml/primitives.rsbaml_language/crates/bex_vm/src/package_baml/string.rs
✅ Files skipped from review due to trivial changes (1)
- baml_language/crates/baml_tests/projects/diagnostic_errors/backtick_strict_types/main.baml
Three follow-ups from CodeRabbit review:
1. ast.rs:1196 — segments() dedent remapping misaligned text pieces when
the first decoded part was Interp (multi-line literal starting with
`${...}`). The old loop walked decoded and advanced the split iterator
only for Text parts; the leading empty piece (for "no Text before
first Interp") was assigned to the first Text part, shifting every
subsequent Text by one. Rewrote to walk the pieces/interps in lockstep
instead — one piece per text position, interps interleaved. Skips
empty Text pieces to preserve the "no empty Text segments" invariant.
Regression test: backtick_segments_multiline_starts_with_interp.
2. parser.rs:1849 — count_consecutive_backticks /
find_first_backtick_pos only skipped basic trivia, but the entry
guard `self.at(TokenKind::Backtick)` skips comments too. A literal
like `let s = /*c*/ <backtick>ok<backtick>` reached the helpers with
comment tokens still in front and returned 0 → spurious parse error.
Both helpers now use the existing comment-aware
`skip_trivia_and_comments_from`. Regression tests:
backtick_after_block_comment_parses, backtick_after_line_comment_parses.
3. lower_expr_body.rs:2302 — doc comment said implicit `.to_string()`
was deferred to task #13, but the lowering already applies it. Updated
to reflect the actual behavior.
Six items from two ultrareview runs: - bug_001 (normal): preprocess_template panicked on multi-byte Unicode whitespace (NBSP U+00A0, LINE SEPARATOR U+2028) mixed with ASCII indent. min_indent computed in bytes, then sliced at a non-char- boundary. Reachable via macOS Option+Space or rich-text paste. Walk leading whitespace by char; strip per-line up to target_bytes while always landing on a char boundary (over-strip by one whitespace char at worst). 3 new tests in baml_base::dedent. - bug_011 (normal): backslash escape inside a backtick string overshot — the second bump_raw() skipped trivia before taking the escape target, so backslash + space + closing-backtick silently consumed the closer as the escape target. Added bump_one_token_raw that emits exactly the token at self.current with no trivia walk; use it for the escape target. Regression test: backtick_backslash_followed_by_whitespace_does_not_eat_closer. - bug_008 + bug_002 (normal, same helper): has_backtick_close_ahead_from walked tokens to EOF without scope or comment-awareness. Any downstream backtick literal OR backtick run inside a // line comment (lexer doesn't tokenize comments — // is two Slash tokens, content goes through raw) satisfied the look-ahead and disabled the empty-multi-tick guard. Bound the scan to the current top-level item (abort on intervening function/class/etc. keyword) and skip comments via skip_comment_at. Tests verify the diagnostic fires at the actual bad opener (byte offset check), not from a downstream remnant. 2 new tests. - bug_006 (nit): segments() hardcoded U+F8FF as the in-band dedent placeholder, but U+F8FF is a legitimate PUA codepoint (Apple logo on macOS) that users can type or paste. Collision silently dropped the user's char and misaligned interpolation positions. Walk the PUA range U+E000..U+F8FF and pick the first codepoint that doesn't appear in content. Practically always U+E000. 1 new test. - bug_010 (nit, doc): escape.rs module doc listed 6 always-on escapes but \b/\v/\f are also unconditionally decoded (TS parity, pinned by test). Updated to list all 9 with a TS-parity note.
`if` and `while` already accept paren-optional conditions; `match`
previously required them. Bring it in line — both `match (x) { ... }`
and `match x { ... }` now parse. The optional `: Type` annotation is
only accepted in the parenthesized form (parens are the syntactic anchor
that disambiguates the colon from a binding ascription).
The no-paren form guards against object-literal postfix capture by
incrementing `suppress_object_literal_depth` around the scrutinee
parse, mirroring how `parse_spawn_expr` handles the same hazard. So
`match Foo { _ => 0 }` correctly parses Foo as the scrutinee and
`{ _ => 0 }` as the match body — not as a struct constructor.
Tests:
- match_scrutinee_parens_optional
- match_scrutinee_with_type_annotation_no_parens
- match_paren_form_still_works
Snapshot churn (all real improvements, not regressions):
- compiler2_mir::match_expr previously captured `baml.sys.panic
("parse error")` because its fixture used the no-paren form;
now compiles correctly to a real jump table.
- backtick_strings 03_hir / 04_tir / 04_5_mir / 06_codegen all
dropped a trailing `+ ""` op — incidental fallout from the
segments() lockstep rewrite in PR #3577's earlier round (stale
snapshots that weren't re-run until this full-workspace pass).
There was a problem hiding this comment.
🧹 Nitpick comments (1)
baml_language/crates/baml_compiler_parser/src/parser.rs (1)
7772-7785: ⚡ Quick winAdd a negative test for unparenthesized type-ascribed scrutinees.
You’ve documented
: Typeas parenthesized-only, but current additions only assert positive forms. Add amatch x: int { ... }test that expects diagnostics to lock that contract.Suggested test
+ #[test] + fn match_scrutinee_type_annotation_requires_parens() { + let source = " +function Demo(x: int) -> int { + match x: int { + _ => 0, + } +} +"; + let (_root, errors) = parse_source(source); + assert!( + !errors.is_empty(), + "expected parse errors for unparenthesized type-ascribed scrutinee" + ); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler_parser/src/parser.rs` around lines 7772 - 7785, Add a negative test that ensures unparenthesized type-ascribed scrutinees (e.g., `match x: int { ... }`) produce diagnostics: create a new test function (e.g., match_scrutinee_with_type_annotation_unparenthesized) that calls parse_source with a source containing `match x: int { _ => 0, }`, then assert that the returned errors are non-empty (e.g., assert!(errors.len() > 0)) or otherwise verify the expected diagnostic appears; use the same helpers seen in the file (parse_source and the errors variable) to locate and validate the failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@baml_language/crates/baml_compiler_parser/src/parser.rs`:
- Around line 7772-7785: Add a negative test that ensures unparenthesized
type-ascribed scrutinees (e.g., `match x: int { ... }`) produce diagnostics:
create a new test function (e.g.,
match_scrutinee_with_type_annotation_unparenthesized) that calls parse_source
with a source containing `match x: int { _ => 0, }`, then assert that the
returned errors are non-empty (e.g., assert!(errors.len() > 0)) or otherwise
verify the expected diagnostic appears; use the same helpers seen in the file
(parse_source and the errors variable) to locate and validate the failure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a7464d8b-0070-4c5e-90a2-cfffc7288af2
⛔ Files ignored due to path filters (5)
baml_language/crates/baml_tests/snapshots/compiler2_mir/baml_tests__compiler2_mir__match_expr.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/backtick_strings/baml_tests__compiles__backtick_strings__06_codegen.snapis excluded by!**/*.snap
📒 Files selected for processing (1)
baml_language/crates/baml_compiler_parser/src/parser.rs
…-interpolation # Conflicts: # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snap # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snap
…lattening)
Tagged templates with ${for}/${if} blocks previously hit an
emit_panic_call stub ("not yet implemented (M4e.1b)") at runtime,
because parts/values lengths are data-dependent.
Give Expr::Template's Custom tag a `body`: a desugared closure body built
at AST lowering that flattens the segments into baml.TaggedString { parts,
values } — text runs concatenated into a `cur` accumulator flushed into
`parts` at each interpolation, each ${expr} pushed RAW (uncoerced, §10/§11)
into `values`, with ${for}/${if} driving runtime growth via real
loops/branches. Built from the same lowered ExprIds the segments hold, so
it reuses MIR's generic for/if/method-call/aggregate lowering rather than
hand-emitting MIR.
- TIR Custom arm types `body` (push/TaggedString resolutions for MIR) with
the tag's body-lambda params bound, replacing the segment-only walk.
- HIR walks the flatten block's contents inside the template's Lambda scope
so captures + ${for} bindings register; synthetic accumulators are
leading-space names anchored at the template start (inside the scope).
- MIR keeps the static fast-path (fixed arrays) for text+interp-only
templates; the dynamic None arm lowers `body`. It restores the enclosing
metadata scope first — the flatten block was inferred inline in the parent
(not a real lambda), so its TIR resolutions are keyed there; without this
`parts.push(...)` mis-lowered to a map-element access.
Adds runtime tests: for-loop (incl. empty), if (taken/skipped), mixed
static+for, nested for-in-if. M4 (tagged templates) is now complete.
A `${for}` body whose `${expr}` contains a nested lambda capturing both the
loop-local and an enclosing local (`["1"].map((y) -> { x + outer + y })...`).
An adversarial review flagged this as a possible capture-index collision in
the tagged closure; executing it proves MIR's standard nested-lambda capture
machinery handles loop-locals correctly inside the flatten for-loop.
…olation # Conflicts: # baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_item_by_definition.snap # baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_string.snap # baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_truncation_hint.snap # baml_language/crates/baml_compiler2_mir/src/lower.rs
The $stream / $parse_stream companions hardcoded `is_tagged_template_tag: false`, diverging from the AST-level companions (companions.rs) which copy `parent.is_tagged_template_tag`. Inherit it for consistency. Functionally a no-op — these companions are generated only for LLM functions (never tagged-string tags) and are never resolved as a tag — but it aligns the two companion paths. Addresses PR #3577 review.
…olation # Conflicts: # baml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____03_hir.snap # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snap # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snap
⏭️ Performance benchmarks were skippedPerf benchmarks (CodSpeed) are opt-in on pull requests — they no longer run on every push. They always run automatically after merge to To run them on this PR, do any of the following, then push a commit (or re-run CI):
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs`:
- Around line 2686-2750: The synthetic accumulator reads/writes are anchored at
TextRange::empty(span.end()) which places them outside the block scope; change
the anchoring so the empty range falls inside the block (e.g. use
TextRange::empty(span.start()) or otherwise compute an empty range within the
block span) by updating the after_span creation and its uses (the variable
after_span and all Expr/Stmt allocations that use it: acc_path_lhs,
acc_path_rhs, concat, assign_stmt, loop_body, for_stmt, acc_tail) so the
synthesized __m3_for references resolve inside the block scope rather than at
span.end().
- Around line 2283-2294: The lowered member currently loses a leading '$'
because seen_accessor is only a bool; change the accessor tracking to remember
whether the accessor token was a DOLLAR (e.g., replace seen_accessor: bool with
last_accessor_is_dollar: bool or make seen_accessor an Option<SyntaxKind>), set
it when you detect SyntaxKind::DOLLAR (where currently you set seen_accessor =
true), and when creating the field (where you call field =
Some(Name::new(token.text())) and field_range = Some(token.text_range())),
prepend a '$' to the identifier text (and adjust the range if necessary) when
last_accessor_is_dollar is true so x.$watch remains distinct from x.watch; touch
the code paths in try_lower_bare_token/base handling and the is_ident_token
branch to use the new flag.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 90881378-39e9-48a9-bac8-1274258a67ee
⛔ Files ignored due to path filters (6)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_package_listing.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_alias_string.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_item_by_definition.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_string.snapis excluded by!**/*.snapbaml_language/crates/baml_lsp2_actions/src/snapshots/baml_lsp2_actions__describe_tests__describe_builtin_string_with_compiler2_visible_files.snapis excluded by!**/*.snap
📒 Files selected for processing (31)
baml_language/crates/baml_builtins2/baml_std/baml/bool.bamlbaml_language/crates/baml_builtins2/baml_std/baml/core.bamlbaml_language/crates/baml_builtins2/baml_std/baml/float.bamlbaml_language/crates/baml_builtins2/baml_std/baml/int.bamlbaml_language/crates/baml_builtins2/baml_std/baml/string.bamlbaml_language/crates/baml_builtins2_codegen/src/codegen.rsbaml_language/crates/baml_compiler2_ast/src/ast.rsbaml_language/crates/baml_compiler2_ast/src/auto_derive_json.rsbaml_language/crates/baml_compiler2_ast/src/companions.rsbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_cst.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler2_ast/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/baml_compiler2_hir/src/builder.rsbaml_language/crates/baml_compiler2_hir/src/item_tree.rsbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_ppir/src/lib.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler2_tir/src/infer_context.rsbaml_language/crates/baml_compiler2_tir/src/normalize.rsbaml_language/crates/baml_compiler2_tir/src/throws_analysis.rsbaml_language/crates/baml_compiler_lexer/src/tokens.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/ast.rsbaml_language/crates/baml_compiler_syntax/src/syntax_kind.rsbaml_language/crates/baml_fmt/src/ast/expressions.rsbaml_language/crates/baml_fmt/src/ast/tokens.rsbaml_language/crates/baml_fmt/src/lib.rsbaml_language/crates/baml_lsp2_actions/src/check.rsbaml_language/crates/baml_lsp2_actions/src/completions.rs
💤 Files with no reviewable changes (19)
- baml_language/crates/baml_compiler2_hir/src/item_tree.rs
- baml_language/crates/baml_compiler2_ast/src/lower_type_expr.rs
- baml_language/crates/baml_lsp2_actions/src/check.rs
- baml_language/crates/baml_compiler2_tir/src/infer_context.rs
- baml_language/crates/baml_compiler2_tir/src/throws_analysis.rs
- baml_language/crates/baml_compiler2_ppir/src/lib.rs
- baml_language/crates/baml_compiler2_emit/src/lib.rs
- baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
- baml_language/crates/baml_compiler2_tir/src/normalize.rs
- baml_language/crates/baml_compiler_lexer/src/tokens.rs
- baml_language/crates/baml_lsp2_actions/src/completions.rs
- baml_language/crates/baml_compiler2_mir/src/lower.rs
- baml_language/crates/baml_fmt/src/ast/expressions.rs
- baml_language/crates/baml_compiler2_tir/src/builder.rs
- baml_language/crates/baml_fmt/src/lib.rs
- baml_language/crates/baml_fmt/src/ast/tokens.rs
- baml_language/crates/baml_compiler_syntax/src/ast.rs
- baml_language/crates/baml_compiler_parser/src/parser.rs
- baml_language/crates/baml_compiler2_hir/src/builder.rs
✅ Files skipped from review due to trivial changes (1)
- baml_language/crates/baml_builtins2_codegen/src/codegen.rs
🚧 Files skipped from review as they are similar to previous changes (8)
- baml_language/crates/baml_compiler2_ast/src/companions.rs
- baml_language/crates/baml_builtins2/baml_std/baml/string.baml
- baml_language/crates/baml_builtins2/baml_std/baml/bool.baml
- baml_language/crates/baml_builtins2/baml_std/baml/float.baml
- baml_language/crates/baml_builtins2/baml_std/baml/int.baml
- baml_language/crates/baml_compiler2_ast/src/lower_cst.rs
- baml_language/crates/baml_compiler2_ast/src/auto_derive_json.rs
- baml_language/crates/baml_compiler2_ast/src/lib.rs
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs`:
- Around line 2686-2750: The synthetic accumulator reads/writes are anchored at
TextRange::empty(span.end()) which places them outside the block scope; change
the anchoring so the empty range falls inside the block (e.g. use
TextRange::empty(span.start()) or otherwise compute an empty range within the
block span) by updating the after_span creation and its uses (the variable
after_span and all Expr/Stmt allocations that use it: acc_path_lhs,
acc_path_rhs, concat, assign_stmt, loop_body, for_stmt, acc_tail) so the
synthesized __m3_for references resolve inside the block scope rather than at
span.end().
- Around line 2283-2294: The lowered member currently loses a leading '$'
because seen_accessor is only a bool; change the accessor tracking to remember
whether the accessor token was a DOLLAR (e.g., replace seen_accessor: bool with
last_accessor_is_dollar: bool or make seen_accessor an Option<SyntaxKind>), set
it when you detect SyntaxKind::DOLLAR (where currently you set seen_accessor =
true), and when creating the field (where you call field =
Some(Name::new(token.text())) and field_range = Some(token.text_range())),
prepend a '$' to the identifier text (and adjust the range if necessary) when
last_accessor_is_dollar is true so x.$watch remains distinct from x.watch; touch
the code paths in try_lower_bare_token/base handling and the is_ident_token
branch to use the new flag.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 90881378-39e9-48a9-bac8-1274258a67ee
⛔ Files ignored due to path filters (6)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_package_listing.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_alias_string.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_item_by_definition.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_describe_builtin_string.snapis excluded by!**/*.snapbaml_language/crates/baml_lsp2_actions/src/snapshots/baml_lsp2_actions__describe_tests__describe_builtin_string_with_compiler2_visible_files.snapis excluded by!**/*.snap
📒 Files selected for processing (31)
baml_language/crates/baml_builtins2/baml_std/baml/bool.bamlbaml_language/crates/baml_builtins2/baml_std/baml/core.bamlbaml_language/crates/baml_builtins2/baml_std/baml/float.bamlbaml_language/crates/baml_builtins2/baml_std/baml/int.bamlbaml_language/crates/baml_builtins2/baml_std/baml/string.bamlbaml_language/crates/baml_builtins2_codegen/src/codegen.rsbaml_language/crates/baml_compiler2_ast/src/ast.rsbaml_language/crates/baml_compiler2_ast/src/auto_derive_json.rsbaml_language/crates/baml_compiler2_ast/src/companions.rsbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_cst.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler2_ast/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/baml_compiler2_hir/src/builder.rsbaml_language/crates/baml_compiler2_hir/src/item_tree.rsbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_ppir/src/lib.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler2_tir/src/infer_context.rsbaml_language/crates/baml_compiler2_tir/src/normalize.rsbaml_language/crates/baml_compiler2_tir/src/throws_analysis.rsbaml_language/crates/baml_compiler_lexer/src/tokens.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/ast.rsbaml_language/crates/baml_compiler_syntax/src/syntax_kind.rsbaml_language/crates/baml_fmt/src/ast/expressions.rsbaml_language/crates/baml_fmt/src/ast/tokens.rsbaml_language/crates/baml_fmt/src/lib.rsbaml_language/crates/baml_lsp2_actions/src/check.rsbaml_language/crates/baml_lsp2_actions/src/completions.rs
💤 Files with no reviewable changes (19)
- baml_language/crates/baml_compiler2_hir/src/item_tree.rs
- baml_language/crates/baml_compiler2_ast/src/lower_type_expr.rs
- baml_language/crates/baml_lsp2_actions/src/check.rs
- baml_language/crates/baml_compiler2_tir/src/infer_context.rs
- baml_language/crates/baml_compiler2_tir/src/throws_analysis.rs
- baml_language/crates/baml_compiler2_ppir/src/lib.rs
- baml_language/crates/baml_compiler2_emit/src/lib.rs
- baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
- baml_language/crates/baml_compiler2_tir/src/normalize.rs
- baml_language/crates/baml_compiler_lexer/src/tokens.rs
- baml_language/crates/baml_lsp2_actions/src/completions.rs
- baml_language/crates/baml_compiler2_mir/src/lower.rs
- baml_language/crates/baml_fmt/src/ast/expressions.rs
- baml_language/crates/baml_compiler2_tir/src/builder.rs
- baml_language/crates/baml_fmt/src/lib.rs
- baml_language/crates/baml_fmt/src/ast/tokens.rs
- baml_language/crates/baml_compiler_syntax/src/ast.rs
- baml_language/crates/baml_compiler_parser/src/parser.rs
- baml_language/crates/baml_compiler2_hir/src/builder.rs
✅ Files skipped from review due to trivial changes (1)
- baml_language/crates/baml_builtins2_codegen/src/codegen.rs
🚧 Files skipped from review as they are similar to previous changes (8)
- baml_language/crates/baml_compiler2_ast/src/companions.rs
- baml_language/crates/baml_builtins2/baml_std/baml/string.baml
- baml_language/crates/baml_builtins2/baml_std/baml/bool.baml
- baml_language/crates/baml_builtins2/baml_std/baml/float.baml
- baml_language/crates/baml_builtins2/baml_std/baml/int.baml
- baml_language/crates/baml_compiler2_ast/src/lower_cst.rs
- baml_language/crates/baml_compiler2_ast/src/auto_derive_json.rs
- baml_language/crates/baml_compiler2_ast/src/lib.rs
🛑 Comments failed to post (2)
baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs (2)
2283-2294:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve the
$in$watch-style member names.Once
SyntaxKind::DOLLARis treated as just another accessor marker, the lowered member becomeswatchinstead of$watch. That changes lookup/diagnostic behavior for the existing$...member surface and makesx.$watchindistinguishable fromx.watch.Based on learnings: `watch_when.baml` exercises `$watch` member syntax, so preserving the `$` is part of the current language surface.Suggested fix
- let mut seen_accessor = false; + let mut seen_accessor = false; + let mut saw_dollar = false; @@ - if matches!(token.kind(), SyntaxKind::DOT | SyntaxKind::DOLLAR) { + if token.kind() == SyntaxKind::DOT { seen_accessor = true; + } else if token.kind() == SyntaxKind::DOLLAR { + seen_accessor = true; + saw_dollar = true; } else if is_ident_token(token.kind()) { if !seen_accessor && base.is_none() { @@ } else if seen_accessor { - field = Some(Name::new(token.text())); + let member = if saw_dollar { + format!("${}", token.text()) + } else { + token.text().to_string() + }; + field = Some(Name::new(member)); field_range = Some(token.text_range()); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs` around lines 2283 - 2294, The lowered member currently loses a leading '$' because seen_accessor is only a bool; change the accessor tracking to remember whether the accessor token was a DOLLAR (e.g., replace seen_accessor: bool with last_accessor_is_dollar: bool or make seen_accessor an Option<SyntaxKind>), set it when you detect SyntaxKind::DOLLAR (where currently you set seen_accessor = true), and when creating the field (where you call field = Some(Name::new(token.text())) and field_range = Some(token.text_range())), prepend a '$' to the identifier text (and adjust the range if necessary) when last_accessor_is_dollar is true so x.$watch remains distinct from x.watch; touch the code paths in try_lower_bare_token/base handling and the is_ident_token branch to use the new flag.
2686-2750:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winAnchor the synthetic
${for}accumulator inside the block scope, not atspan.end().These synthetic refs use
TextRange::empty(span.end()), but the tagged-template helpers below already document thatspan.endis the exclusive scope boundary. That makes the generated__m3_forreads/writes liable to fall outside the synthetic block's scope, so${for}in untagged templates can fail to resolve its own accumulator.Suggested fix
- // Accumulator binding. The leading space makes the name unparseable as - // a user identifier, so it can never collide with a user binding; name - // resolution is plain string equality, so the synthesized references - // below still resolve to it. References sit at an empty range after the - // template so they land inside the let binding's visibility window. - let after_span = TextRange::empty(span.end()); + // Accumulator binding. Keep both the binding and its references anchored + // inside the synthetic block's half-open source range. + let at = TextRange::empty(span.start()); let acc_name = Name::new(" __m3_for"); let acc_pat = self.alloc_pattern( Pattern::Bind { name: acc_name.clone(), subpat: None, }, - span, + at, ); - let empty_init = self.alloc_expr(Expr::Literal(Literal::String(String::new())), span); + let empty_init = self.alloc_expr(Expr::Literal(Literal::String(String::new())), at); let let_stmt = self.alloc_stmt( Stmt::Let { pattern: acc_pat, initializer: Some(empty_init), is_watched: false, origin: LetOrigin::Compiler, else_branch: None, }, - span, + at, ); - let acc_path_lhs = self.alloc_expr(Expr::Path(vec![acc_name.clone()]), after_span); - let acc_path_rhs = self.alloc_expr(Expr::Path(vec![acc_name.clone()]), after_span); + let acc_path_lhs = self.alloc_expr(Expr::Path(vec![acc_name.clone()]), at); + let acc_path_rhs = self.alloc_expr(Expr::Path(vec![acc_name.clone()]), at); let concat = self.alloc_expr( Expr::Binary { op: BinaryOp::Add, lhs: acc_path_rhs, rhs: body_string, }, - after_span, + at, ); let assign_stmt = self.alloc_stmt( Stmt::Assign { target: acc_path_lhs, value: concat, }, - after_span, + at, ); let loop_body = self.alloc_expr( Expr::Block { stmts: vec![assign_stmt], tail_expr: None, }, - after_span, + at, ); let for_stmt = self.alloc_stmt( Stmt::For { binding, collection, body: loop_body, }, - after_span, + at, ); - let acc_tail = self.alloc_expr(Expr::Path(vec![acc_name]), after_span); + let acc_tail = self.alloc_expr(Expr::Path(vec![acc_name]), at);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs` around lines 2686 - 2750, The synthetic accumulator reads/writes are anchored at TextRange::empty(span.end()) which places them outside the block scope; change the anchoring so the empty range falls inside the block (e.g. use TextRange::empty(span.start()) or otherwise compute an empty range within the block span) by updating the after_span creation and its uses (the variable after_span and all Expr/Stmt allocations that use it: acc_path_lhs, acc_path_rhs, concat, assign_stmt, loop_body, for_stmt, acc_tail) so the synthesized __m3_for references resolve inside the block scope rather than at span.end().
First slice of M5 (the built-in `prompt` tag). Adds the render-time types a
`prompt` tag body references (BEP-049 §10), analogous to M4a adding TaggedString:
- `baml.llm.Role { name, metadata }` — typed chat-role marker (produced by the
in-template `role(...)` constructor; replaces the magic-delimiter mechanism).
- `baml.llm.ContextClient { name, provider, default_role, allowed_roles }` —
flattened view of the resolved client, mirroring the runtime RenderContextClient.
- `baml.llm.Context { client, tags }` — the per-render `ctx`. (output_format
accessors land in M5b; they need a $rust_function backing.)
`Context.client` required parser support for `client` (a keyword) as a class
field + member-access name so `ctx.client` parses (BEP §10 surface):
- parser: at_member_name accepts KW_CLIENT; class-body missing-brace recovery
guards `client` via looks_like_client_declaration_start (only `client<…>`
declarations recover, not a `client Type` field) + regression tests.
- syntax: Field::name accepts KW_CLIENT.
Types only — no rust backing, no prompt tag yet. fmt already applied.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_compiler_parser/src/parser.rs`:
- Around line 1030-1043: The lookahead in looks_like_client_declaration_start is
too permissive (it treats any `client` followed by `<` as a top-level
declaration); change it to validate that the `<` actually begins a
generic-parameter list for a declaration by further peeking: use
skip_trivia_and_comments_from to get the token after `Less` and ensure it is a
generic parameter start (e.g., TokenKind::Ident or TokenKind::Greater if you
allow empty params) and that there is a matching closing `TokenKind::Greater`
with a sensible follow-up token that indicates a declaration (not a
parameter-list-then-`(` as in a class field). Update the same tightened
lookahead logic at the other occurrence mentioned (the block around lines
3167-3172) so both use the stricter sequence validation rather than just
checking for `Less`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f6b22671-0295-4309-9682-c969abc21abf
⛔ Files ignored due to path filters (9)
baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_namespace_llm.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_package_listing.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snap
📒 Files selected for processing (3)
baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.bamlbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/ast.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- baml_language/crates/baml_compiler_syntax/src/ast.rs
`baml.llm.prompt`...`` now evaluates to a `(Context) -> PromptAst` closure
that folds the template's flattened parts/values into chat messages —
`${role("system")}` markers split the content structurally, with no magic
delimiters. Adds the `prompt`/`make_role`/`assemble_prompt_ast` stdlib
functions plus `assemble_prompt_ast_impl` (the message-folding) in sys_ops.
Two compiler fixes unblock it:
- TIR: honor an explicit `unknown[]` (or any `unknown`-containing)
let-binding annotation instead of falling back to an evolving `never[]`
that pins to the first pushed value's type. `ty_contains_recovery_unknown`
conflated the builtin `unknown` top type with the `Unknown`/`Error`
resolution placeholders; split out `ty_contains_resolution_failure` (used
only by the `pattern_expected_ty` gate) so `unknown` annotations pin the
binding and the `values` accumulator holds a heterogeneous mix (a `Role`,
then a string) without a bogus "expected Role, got string".
- MIR: resolve a qualified tagged-template tag (`baml.llm.prompt`) via the
TIR-recorded `MemberResolution`, not just bare-name lookup of the last
path segment — the latter missed the stdlib function and panicked at
runtime with "tag did not resolve to a function".
The `prompt` body callback is constrained to `throws never` (pure templates
for this slice). The parser accepts `client` (a keyword) as an object-literal
field name for `Context { client: ... }`. Snapshots updated for the new
stdlib functions (additions plus function-index/line-number shifts only).
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
baml_language/crates/baml_compiler2_mir/src/lower.rs (1)
3342-3438:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMember-path tags still lose their receiver.
This only re-resolves the tag to a
func_loc, then always emitsConstant::Function(tag_item_ref)with a single closure arg. For tags written as member expressions / local-rooted paths, MIR stores the callable inpath_member_resolutions, and aBoundMethodneeds its receiver preserved. As written,obj.tag\...`can still fall through to the bare-name fallback or become a wrong-aritytag()call instead ofobj.tag()`.Please mirror
lower_callhere: resolve the callee the same way as normal calls, preserve bound receivers, and use the resolvedfunc_loconly for reading the tag signature/body params.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler2_mir/src/lower.rs` around lines 3342 - 3438, The code re-resolves the tag only to tag_func_loc and always emits callee = Operand::Constant(Constant::Function(tag_item_ref)), losing a bound receiver for member-path tags; mirror lower_call: consult path/member resolutions (same lookup used earlier, e.g. resolutions.get(&self.expr_metadata_key(tag)) and the stored MemberResolution variants such as BoundMethod/UnboundMethod/Free) to construct the correct callee Operand (preserve the receiver for BoundMethod or emit the appropriate method/constant operand) and only use tag_func_loc/tag_item_ref for reading the function signature and body params; then pass the preserved receiver operand into builder.call (instead of always calling the bare Constant::Function) and keep build_tagged_body_closure, tag_sig, and tag_func_loc usage unchanged for type/signature info.baml_language/crates/baml_compiler_parser/src/parser.rs (1)
6610-6615:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winInclude
TokenKind::Clientin missing-comma recovery guards.Line 6610 and Line 6638 still treat
clientas invalid field-start during recovery. After this change, a missing comma before aclient:field can consume the key token and cascade errors.🐛 Proposed fix
- if !self.at(TokenKind::Word) + if !self.at(TokenKind::Word) + && !self.at(TokenKind::Client) && !self.at(TokenKind::Quote) && !self.at(TokenKind::Hash) && !self.at(TokenKind::DotDotDot) && !self.at(TokenKind::RBrace) { self.bump(); } @@ - if !self.at(TokenKind::Word) + if !self.at(TokenKind::Word) + && !self.at(TokenKind::Client) && !self.at(TokenKind::Quote) && !self.at(TokenKind::Hash) && !self.at(TokenKind::DotDotDot) && !self.at(TokenKind::RBrace) { // Skip unexpected token self.bump(); }Also applies to: 6638-6643
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_compiler_parser/src/parser.rs` around lines 6610 - 6615, The recovery guard that checks allowed field-start tokens (the if that calls self.at(TokenKind::Word/Quote/Hash/DotDotDot/RBrace)) is missing TokenKind::Client so a bare `client:` gets consumed during missing-comma recovery; update that conditional (the same check around the field-parsing/recovery logic) to also check self.at(TokenKind::Client) so `client` is treated as a valid field start and not swallowed by the missing-comma recovery path.
🧹 Nitpick comments (1)
baml_language/crates/baml_tests/tests/prompt_tag_runtime.rs (1)
58-65: 💤 Low valueConsider inspecting PromptAst structure directly.
The debug string matching (
dbg.contains(...)) is acceptable for initial coverage, but it's fragile — any change to the Debug impl formatting will break the test. IfPromptAstexposes fields like.messagesor.roles, consider asserting against those directly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_tests/tests/prompt_tag_runtime.rs` around lines 58 - 65, The test currently asserts on the Debug string of PromptAst which is fragile; instead access PromptAst's structure (e.g., the PromptAst value returned in this test, its .messages vector and each Message's .role and .content fields) and assert directly that there is a system message with content "You are helpful." and a user message with content "Hi World!" (use PromptAst.messages, iterate or index into messages, and compare Message.role / Message.content rather than matching the debug string).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@baml_language/crates/baml_compiler_parser/src/parser.rs`:
- Around line 6610-6615: The recovery guard that checks allowed field-start
tokens (the if that calls self.at(TokenKind::Word/Quote/Hash/DotDotDot/RBrace))
is missing TokenKind::Client so a bare `client:` gets consumed during
missing-comma recovery; update that conditional (the same check around the
field-parsing/recovery logic) to also check self.at(TokenKind::Client) so
`client` is treated as a valid field start and not swallowed by the
missing-comma recovery path.
In `@baml_language/crates/baml_compiler2_mir/src/lower.rs`:
- Around line 3342-3438: The code re-resolves the tag only to tag_func_loc and
always emits callee = Operand::Constant(Constant::Function(tag_item_ref)),
losing a bound receiver for member-path tags; mirror lower_call: consult
path/member resolutions (same lookup used earlier, e.g.
resolutions.get(&self.expr_metadata_key(tag)) and the stored MemberResolution
variants such as BoundMethod/UnboundMethod/Free) to construct the correct callee
Operand (preserve the receiver for BoundMethod or emit the appropriate
method/constant operand) and only use tag_func_loc/tag_item_ref for reading the
function signature and body params; then pass the preserved receiver operand
into builder.call (instead of always calling the bare Constant::Function) and
keep build_tagged_body_closure, tag_sig, and tag_func_loc usage unchanged for
type/signature info.
---
Nitpick comments:
In `@baml_language/crates/baml_tests/tests/prompt_tag_runtime.rs`:
- Around line 58-65: The test currently asserts on the Debug string of PromptAst
which is fragile; instead access PromptAst's structure (e.g., the PromptAst
value returned in this test, its .messages vector and each Message's .role and
.content fields) and assert directly that there is a system message with content
"You are helpful." and a user message with content "Hi World!" (use
PromptAst.messages, iterate or index into messages, and compare Message.role /
Message.content rather than matching the debug string).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8019b9b0-9627-46db-a019-dcf1b2e1d1a9
⛔ Files ignored due to path filters (9)
baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_namespace_llm.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_package_listing.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snap
📒 Files selected for processing (7)
baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.bamlbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_tests/src/compiler2_tir/phase3a.rsbaml_language/crates/baml_tests/tests/prompt_tag_runtime.rsbaml_language/crates/sys_ops/src/lib.rs
👮 Files not reviewed due to content moderation or server errors (1)
- baml_language/crates/baml_compiler2_tir/src/builder.rs
…olation # Conflicts: # baml_language/crates/baml_compiler2_tir/src/builder.rs # baml_language/crates/baml_compiler2_tir/src/throws_analysis.rs # baml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_5_mir.snap # baml_language/crates/baml_tests/src/compiler2_tir/mod.rs # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snap # baml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snap # baml_language/crates/tools_onionskin/src/compiler.rs
Adds `render_output_format(return_type: type) -> string` to the `llm` stdlib
namespace (sys_op → `sys_llm::render_output_format`), which builds the return
type's `OutputFormatContent` and renders it with `RenderOptions::default()` —
byte-equivalent to the legacy Jinja `{{ ctx.output_format }}`
(`OutputFormatObject`'s `Display`). Adds an `output_format string` field to
`Context` so a `prompt` body can read `${ctx.output_format}`.
The orchestrator (M5e) will populate `Context.output_format` per attempt via
`render_output_format(return_type)` — assembling `Context` directly in BAML, so
no separate context-builder sys_op is needed. The parameterized
`output_format_with(...)` accessor (rare `{{ ctx.output_format(...) }}` case)
is a later slice.
Tests: sys_llm unit (schema parity vs a direct default-options render) +
baml_tests integration (`reflect.type_of<Person>()` → `render_output_format`
→ `${ctx.output_format}` in a `prompt` body embeds the schema). Stdlib
snapshots regenerated (additive: new fn + Context field, index/line shifts).
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.baml (1)
12-14:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Role.metadatais currently lost before it reachesPromptAst.
assemble_prompt_ast_implinbaml_language/crates/sys_ops/src/lib.rs:710-763only reads the role'snameand then constructs eachPromptAst::Messagewithmetadata: Default::default(). Any metadata supplied here is silently discarded, so this public field's contract is not actually honored yet. Either thread the metadata through in this slice or keep it out of the supported surface until the runtime preserves it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.baml` around lines 12 - 14, Role.metadata is being discarded when building PromptAst messages; in assemble_prompt_ast_impl (the function assembling PromptAst::Message instances) it only reads Role.name and sets PromptAst::Message.metadata to Default::default(), so any provided metadata is lost. Fix by reading and passing the role's metadata through into the PromptAst::Message construction (use the Role.metadata field when creating each PromptAst::Message in assemble_prompt_ast_impl) so PromptAst preserves the role-level metadata instead of defaulting it away.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.baml`:
- Around line 44-45: The change made Context.output_format to a required field
breaking manual constructions; revert to a backward-compatible optional field or
provide a default so existing code compiles: make Context.output_format optional
(e.g., Option<String> or a field with a default value via
Default/#[serde(default)]) in the Context struct or add a default impl for
Context/output_format, and update any manual constructions (e.g., instances
created as `baml.llm.Context { client: cc, tags: {} }` in the
prompt_tag_builds_promptast_with_role_messages test) to either set output_format
explicitly or rely on the new default. Ensure modifications target the Context
type and its Default/constructor logic so tests and manual `baml.llm.Context {
... }` literals continue to compile.
---
Outside diff comments:
In `@baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.baml`:
- Around line 12-14: Role.metadata is being discarded when building PromptAst
messages; in assemble_prompt_ast_impl (the function assembling
PromptAst::Message instances) it only reads Role.name and sets
PromptAst::Message.metadata to Default::default(), so any provided metadata is
lost. Fix by reading and passing the role's metadata through into the
PromptAst::Message construction (use the Role.metadata field when creating each
PromptAst::Message in assemble_prompt_ast_impl) so PromptAst preserves the
role-level metadata instead of defaulting it away.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d243e0c0-113b-4463-8b50-f76968b6f468
⛔ Files ignored due to path filters (9)
baml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_namespace_llm.snapis excluded by!**/*.snapbaml_language/crates/baml_cli/src/snapshots/baml_cli__describe_command_tests__render_builtin_package_listing.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/compiles/__baml_std__/baml_tests__compiles____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snap
📒 Files selected for processing (4)
baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.bamlbaml_language/crates/baml_tests/tests/prompt_tag_runtime.rsbaml_language/crates/sys_llm/src/lib.rsbaml_language/crates/sys_ops/src/lib.rs
| /// The return type's schema, rendered with default options. | ||
| output_format string |
There was a problem hiding this comment.
Making Context.output_format required breaks existing constructors.
baml_language/crates/baml_tests/tests/prompt_tag_runtime.rs still has let ctx = baml.llm.Context { client: cc, tags: {} } in prompt_tag_builds_promptast_with_role_messages, so this new required field makes an existing runtime test stop compiling. Please update every manual Context construction in the same slice, or keep this field backward-compatible until the orchestrator is the only constructor.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@baml_language/crates/baml_builtins2/baml_std/baml/ns_llm/llm_types.baml`
around lines 44 - 45, The change made Context.output_format to a required field
breaking manual constructions; revert to a backward-compatible optional field or
provide a default so existing code compiles: make Context.output_format optional
(e.g., Option<String> or a field with a default value via
Default/#[serde(default)]) in the Context struct or add a default impl for
Context/output_format, and update any manual constructions (e.g., instances
created as `baml.llm.Context { client: cc, tags: {} }` in the
prompt_tag_builds_promptast_with_role_messages test) to either set output_format
explicitly or rely on the new default. Ensure modifications target the Context
type and its Default/constructor logic so tests and manual `baml.llm.Context {
... }` literals continue to compile.
A `prompt`/tagged template synthesizes its own `ScopeKind::Lambda` in HIR, but
its `${...}` references attributed only to that synthetic lambda. When the
template sat inside a user lambda `f`, `f` never recorded that it must capture
an outer variable the template uses — so TIR (which types the template body
inline in `f`'s scope) reported `[E0003] unresolved name`, and a binary-op
error on that variable cascaded to a `0..0` span.
A real nested lambda gets this for free: its references attribute to it AND its
captures thread up through each enclosing lambda. Mirror that for the synthetic
template lambda — after computing its captures, re-record each as a reference
owned by the enclosing lambda, so `f` captures it too (TIR resolves it; MIR
forwards it into the template closure).
Regression test: an `outer` local captured through a user lambda into a nested
`cap`...`` template — previously failed to compile, now returns "aOUTb".
`TirTypeError` is span-free for Salsa caching: a diagnostic stores only an arena `ExprId`, and the span is resolved at render time against the owning scope's source map. A nested lambda body is inferred *inline* in its enclosing scope (`infer_lambda_body`), so the diagnostics it emits carry the lambda's own arena IDs but live in the enclosing scope's diagnostic set — and are rendered with the *enclosing* scope's source map, which can't resolve a nested-arena ID, collapsing the span to `0..0`. `infer_lambda_body` already swaps in the lambda's source map during inference, so resolve those diagnostics' spans there: `freeze_diagnostic_spans_from` converts the locations recorded during the lambda pass to absolute `DiagnosticLocation::Span`s before the enclosing map is restored. Deeper lambdas freeze their own first, so nested cases compose; already-frozen spans are left unchanged. Fixes mislocated diagnostics for any type error inside a lambda body (binary ops, test-name expressions, etc.) — previously `!! 0..0:`. Snapshots updated to the now-correct spans (and one previously-duplicated, mis-spanned diagnostic now dedups). Regression test: a binary-op error in a nested lambda body must have a real span.
Adds the parameterized output-format accessor: `${ctx.output_format_with(
prefix=..., quote_class_fields=..., ...)}` re-renders the return type's schema
with caller options (the rare `{{ ctx.output_format(...) }}` Jinja form). The
schema is carried on `Context` as an opaque `_output_format` handle (an
`OutputFormat { _data $rust_type }` wrapping the prebuilt `OutputFormatContent`,
built by `build_output_format(return_type)`); the `output_format_with` method
re-renders it (the optional params map to `RenderOptions` inside sys_llm). The
common `${ctx.output_format}` (M5b.1) is unchanged.
`output_format_with` was the first feature to need two pieces of general infra,
both fixed here:
1. MIR — method calls in a tagged-template body. The static-layout interp arm of
`build_tagged_body_closure` lowered interps under the synthetic lambda's
metadata scope, but TIR keys their resolutions under the ENCLOSING scope (it
type-checks the body inline). Field accesses survived via an accidental
map-index fallback; method calls (`${ctx.m()}`) hit that fallback too and
crashed at runtime (`expected Map, got Instance`). Fixed by restoring the
enclosing metadata scope before lowering the interps — matching the
dynamic-layout arm. Corrects resolution for any `${a.b}`/`${a.m()}`/`${f(x)}`.
2. MIR — `$rust_io_function` (sys-op) optional params. Sys-ops have no bytecode
body, so they never run the default-parameter prologue; an omitted optional
reached the engine as an `OmittedArg` sentinel and panicked in
`vm_arg_to_bex_value`. Fixed by folding the callee's default literal to a
constant at the call site, read from the callee's OWN defaults arena (correct
cross-file, where the caller's TIR tables don't cover the callee).
`check_sys_op` is refactored into `sys_op_callee` (returns the IO `FunctionLoc`).
Test exercises both: a `prompt` body calling `${ctx.output_format_with(prefix=...)}`
with 7 of 8 optional params omitted. Stdlib snapshots regenerated (additive).
Implements BEP-049 (String Interpolation) milestones M1 through M4.
What's in this PR
M1 — backtick string literals
`...`: newBACKTICKtoken, multi-tick delimiter ladders, escapes, multi-line auto-dedent (§12), usable in any expression position.M2 —
${expr}interpolation:${...}as a block expression (§4), implicit.to_string()for non-string values (§11), and strict typing (§7/§11) — interpolating a nullable or non-stringable value is a compile-time error.M3 — block control flow:
${for (let x in xs)}…${endfor},${if (c)}…${else if}…${else}…${endif}using host syntax, with §13 whitespace control.M4 — tagged templates (§10):
tag`...`with the//baml:tagged_stringmarker,TaggedString { parts, values }, tag resolution + signature validation, body-lambda params scoping into${…}, and full VM execution — static layouts (M4e.1a) and${for}/${if}runtime flattening (M4e.1b).Architecture: check-then-lower (clean diagnostics)
Both backtick forms are unified behind a first-class
Expr::Template { tag: Default { elaborated } | Custom { tag, body }, segments }node that survives through TIR. Untagged backticks no longer desugar early — interpolation diagnostics point at the original${…}span instead of leaking a synthetic.to_string()call. This matches how tagged templates (and TypeScript) check the structured node and lower late.Testing
Full workspace suite green. Adds unit + runtime coverage for backtick literals, interpolation, strict-type diagnostics, control flow, and tagged-template execution (static,
${for}/${if}, nested, mixed, captures, body-param injection, nested lambdas capturing loop-locals).Not in this PR
M5 (built-in
prompttag), M6 (legacy#"..."#interop + migrator), M7 (drop minijinja).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
${...}interpolation, multi-tick delimiters, multiline dedent/CR normalization, extended escapes, block control-flow (for/if), tagged templates, TaggedString runtime, and built-in prompt/Role API.Documentation
to_stringmethods for core primitives to define interpolation formatting.Diagnostics
Tests