05 Semantic Actions
Semantic actions are JSON command blocks that describe how to build TypedAST nodes from parsed syntax. This chapter covers all available commands and patterns.
Every grammar rule can have a semantic action:
rule_name = { pattern }
-> ResultType {
// JSON commands
}
The ResultType indicates what kind of AST node the rule produces:
- TypedExpression - Expressions (literals, operations, calls)
- TypedStatement - Statements (if, while, return)
- TypedDeclaration - Top-level declarations (functions, structs)
- TypedBlock - Statement blocks
- TypedParameter - Function parameters
- TypedField - Struct fields
- TypedVariant - Enum variants
- Type - Type expressions
- List - Collect multiple children
- String - Extract text
Commands can reference parsed children using $N syntax:
// Children are numbered by their position in the parse tree
// $1 = first non-silent child, $2 = second, etc.
binary_expr = { left ~ "+" ~ right }
-> TypedExpression {
"commands": [
{ "define": "binary", "args": { "left": "$1", "op": "+", "right": "$2" } }
]
}
Special references:
- $result - The result of previous commands (like get_text)
- $1, $2, ... - Child nodes by position
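The reference rules above can be modelled in a few lines of Python. This is only a sketch of the behaviour described here, not the actual implementation: the helper names resolve and resolve_args are hypothetical, and returning None for an out-of-range reference is an assumption.

```python
from typing import Any, Optional


def resolve(value: Any, children: list, result: Optional[Any] = None) -> Any:
    """Resolve one argument value from a semantic action.

    "$result" maps to the previous command's output, "$N" maps to the
    N-th child (1-based), and anything else is passed through literally.
    """
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name == "result":
            return result
        if name.isdigit():
            index = int(name) - 1  # $1 is the first child
            # Assumption: a reference past the end resolves to None.
            return children[index] if 0 <= index < len(children) else None
    return value


def resolve_args(args: dict, children: list, result: Any = None) -> dict:
    """Resolve every value in an "args" object."""
    return {key: resolve(val, children, result) for key, val in args.items()}


# binary_expr = { left ~ "+" ~ right }: in the example above the "+"
# literal is not counted, so the rule has two children.
children = ["<left node>", "<right node>"]
args = {"left": "$1", "op": "+", "right": "$2"}
resolved = resolve_args(args, children)
print(resolved)
```

Literal values such as "+" pass through unchanged, which is why the op argument can be written inline.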
The get_text command extracts the matched text as a string:
identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
-> String {
"get_text": true
}
The parse_int command parses the extracted text as an integer:
integer_literal = @{ "-"? ~ ASCII_DIGIT+ }
-> TypedExpression {
"get_text": true,
"parse_int": true,
"define": "int_literal",
"args": { "value": "$result" }
}
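To make the data flow explicit, here is a minimal Python model of the get_text, parse_int, define chain. It assumes the three steps run in that fixed order and that each step overwrites $result. The run_action function and the dict used as a node are hypothetical stand-ins, not the real builder.

```python
def run_action(action: dict, matched_text: str):
    """Model of get_text -> parse_int -> define, threading $result."""
    result = None
    if action.get("get_text"):
        result = matched_text  # raw source text of the match
    if action.get("parse_int"):
        result = int(result)  # "42" becomes 42
    if "define" in action:
        args = {
            key: (result if val == "$result" else val)
            for key, val in action.get("args", {}).items()
        }
        # Stand-in for the node the real builder would create.
        result = {"node": action["define"], **args}
    return result


action = {
    "get_text": True,
    "parse_int": True,
    "define": "int_literal",
    "args": {"value": "$result"},
}
node = run_action(action, "-17")
print(node)  # {'node': 'int_literal', 'value': -17}
```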
The get_child command gets a specific child by zero-based index:
// Get the first child (index 0)
expr = { inner_expr }
-> TypedExpression {
"get_child": { "index": 0 }
}
The get_all_children command collects all children into a list:
statements = { statement* }
-> List {
"get_all_children": true
}
The define command calls an AST builder method with arguments:
return_stmt = { "return" ~ expr? ~ ";" }
-> TypedStatement {
"commands": [
{ "define": "return_stmt", "args": { "value": "$1" } }
]
}
The define command is the primary way to create AST nodes. It calls a method on the TypedAstBuilder.
{ "define": "method_name", "args": { "arg1": "value1", "arg2": "$1" } }

| Method | Arguments | Creates |
|---|---|---|
| int_literal | value | Integer literal |
| bool_literal | value | Boolean literal |
| string_literal | value | String literal |
| char_literal | value | Character literal |
bool_literal = { "true" | "false" }
-> TypedExpression {
"get_text": true,
"define": "bool_literal",
"args": { "value": "$result" }
}
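One way to picture how define maps a method name to a builder call is name-based dispatch. The sketch below is an illustration under assumptions: SketchBuilder is a hypothetical stand-in for TypedAstBuilder, nodes are plain dicts, and converting the text "true" to a boolean inside bool_literal is a guess at what the real builder does.

```python
class SketchBuilder:
    """Hypothetical stand-in for TypedAstBuilder; nodes are plain dicts."""

    def bool_literal(self, value):
        # Assumption: the builder converts the matched text to a boolean.
        return {"kind": "bool_literal", "value": value == "true"}

    def int_literal(self, value):
        return {"kind": "int_literal", "value": value}


def define(builder, method_name: str, args: dict):
    """Look up the named method on the builder and call it with args."""
    method = getattr(builder, method_name, None)
    if method is None:
        raise ValueError(f"unknown builder method: {method_name}")
    return method(**args)


node = define(SketchBuilder(), "bool_literal", {"value": "true"})
print(node)  # {'kind': 'bool_literal', 'value': True}
```

In this model the keys of the args object become keyword arguments, so a misspelled argument name fails at the call.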
| Method | Arguments | Creates |
|---|---|---|
| variable | name | Variable reference |
| field_access | object, field | Field access (obj.field) |
| index | object, index | Index access (arr[i]) |
field_expr = { atom ~ "." ~ identifier }
-> TypedExpression {
"commands": [
{ "define": "field_access", "args": { "object": "$1", "field": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| binary | op, left, right | Binary operation |
| unary | op, operand | Unary operation |
| call | callee, args | Function call |
unary_expr = { unary_op ~ primary }
-> TypedExpression {
"commands": [
{ "define": "unary", "args": { "op": "$1", "operand": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| let_stmt | name, init, is_const, type? | Variable declaration |
| return_stmt | value? | Return statement |
| if | condition, then_branch, else_branch? | If statement |
| while | condition, body | While loop |
| for | iterable, binding, body | For loop |
| expression_stmt | expr | Expression statement |
| assignment | target, value | Assignment |
| break | (none) | Break statement |
| continue | (none) | Continue statement |
if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2",
"else_branch": "$3"
}}
]
}
| Method | Arguments | Creates |
|---|---|---|
| block | statements | Statement block |
| program | declarations | Program root |
block = { "{" ~ statement* ~ "}" }
-> TypedBlock {
"get_all_children": true,
"define": "block",
"args": { "statements": "$result" }
}
| Method | Arguments | Creates |
|---|---|---|
| function | name, params, return_type, body | Function declaration |
| struct | name, fields | Struct declaration |
| enum | name, variants | Enum declaration |
| param | name, type | Function parameter |
| field | name, type | Struct field |
| variant | name | Enum variant |
fn_decl = { "fn" ~ identifier ~ "(" ~ fn_params ~ ")" ~ type_expr ~ block }
-> TypedDeclaration {
"commands": [
{ "define": "function", "args": {
"name": "$1",
"params": "$2",
"return_type": "$3",
"body": "$4"
}}
]
}
| Method | Arguments | Creates |
|---|---|---|
| struct_init | type_name, fields | Struct literal |
| struct_field_init | name, value | Field initializer |
| array_literal | elements | Array literal |
struct_init = { identifier ~ "{" ~ struct_init_fields? ~ "}" }
-> TypedExpression {
"commands": [
{ "define": "struct_init", "args": { "type_name": "$1", "fields": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| primitive_type | name | Primitive type (i32, bool, etc.) |
| pointer_type | pointee | Pointer type (*T) |
| optional_type | inner | Optional type (?T) |
| array_type | size, element | Array type ([N]T) |
primitive_type = { "i32" | "i64" | "bool" | "void" }
-> Type {
"get_text": true,
"define": "primitive_type",
"args": { "name": "$result" }
}
The fold_binary command builds left-associative binary expression trees from repetition patterns.
Given input 1 + 2 + 3, we want:

          +
         / \
        +   3
       / \
      1   2

Not:

        +
       / \
      1   +
         / \
        2   3

addition = { term ~ ((add_op | sub_op) ~ term)* }
-> TypedExpression {
"fold_binary": { "operand": "term", "operator": "add_op|sub_op" }
}
Parameters:
- operand: Name of the operand rule
- operator: Operator rules (pipe-separated for multiple)
For input 1 + 2 - 3:
1. Parse produces: [term(1), add_op(+), term(2), sub_op(-), term(3)]
2. Fold starts with first term: result = 1
3. Process pairs: result = binary(+, result, 2) → (1 + 2)
4. Continue: result = binary(-, result, 3) → ((1 + 2) - 3)
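The steps above amount to a left fold over the flat child list. A minimal Python sketch, assuming the children arrive as an alternating operand, operator, operand sequence and using tuples as hypothetical stand-ins for binary nodes:

```python
def fold_binary(children):
    """Left-fold [operand, op, operand, op, operand, ...] into nested tuples."""
    if not children:
        raise ValueError("fold_binary needs at least one operand")
    result = children[0]
    # Walk the remaining children as (operator, operand) pairs.
    for i in range(1, len(children) - 1, 2):
        op, right = children[i], children[i + 1]
        result = ("binary", op, result, right)
    return result


tree = fold_binary([1, "+", 2, "-", 3])
print(tree)  # ('binary', '-', ('binary', '+', 1, 2), 3)
```

Because each new node takes the running result as its left operand, the earliest operator ends up deepest in the tree, which is what left associativity requires. A single operand with no operators is returned unchanged.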
Handle different operators at the same precedence level:
comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
-> TypedExpression {
"fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
}
Use the commands array to execute one or more commands in sequence:
typed_var_decl = { "const" ~ identifier ~ ":" ~ type_expr ~ "=" ~ expr ~ ";" }
-> TypedDeclaration {
"commands": [
{ "define": "let_stmt", "args": {
"name": "$1",
"type": "$2",
"init": "$3",
"is_const": true
}}
]
}
When an optional child is absent, its reference resolves to None and the builder accepts the missing value:
// expr? produces None if missing
return_stmt = { "return" ~ expr? ~ ";" }
-> TypedStatement {
"commands": [
{ "define": "return_stmt", "args": { "value": "$1" } }
]
}
For complex cases, use separate rules:
// Split into variants to avoid indexing issues
if_stmt = { if_else | if_only }
if_only = { "if" ~ "(" ~ expr ~ ")" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2"
}}
]
}
if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2",
"else_branch": "$3"
}}
]
}
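One plausible source of such indexing issues is an optional child in the middle of a rule: if an absent child does not occupy a position, every later $N shifts. The Python sketch below illustrates that reading. The let_decl rule and both helper functions are hypothetical, and the assumption that absent children are simply omitted from the child list is not confirmed by this chapter.

```python
def children_of_let(has_type: bool) -> list:
    """Children of a hypothetical rule:

    let_decl = { "let" ~ identifier ~ (":" ~ type_expr)? ~ "=" ~ expr }
    """
    children = ["identifier"]
    if has_type:
        children.append("type_expr")
    children.append("expr")
    return children


def child(children: list, n: int):
    """Look up $n (1-based); None when out of range."""
    return children[n - 1] if 1 <= n <= len(children) else None


typed = children_of_let(has_type=True)
untyped = children_of_let(has_type=False)

print(child(typed, 2), child(typed, 3))      # type_expr expr
print(child(untyped, 2), child(untyped, 3))  # expr None
```

Under this assumption $2 means the type in one case and the initializer in the other, so a single action cannot serve both shapes. Splitting the rule into two variants gives each one fixed positions.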
Sometimes a rule just selects between alternatives without transforming:
// Just pass through the matched child
expr = { logical_or }
-> TypedExpression {
"get_child": { "index": 0 }
}
statement = { if_stmt | while_stmt | return_stmt | expr_stmt }
-> TypedStatement {
"get_child": { "index": 0 }
}
Here's a complete expression grammar with proper operator precedence:
expr = { logical_or }
-> TypedExpression { "get_child": { "index": 0 } }
logical_or = { logical_and ~ (or_op ~ logical_and)* }
-> TypedExpression {
"fold_binary": { "operand": "logical_and", "operator": "or_op" }
}
logical_and = { comparison ~ (and_op ~ comparison)* }
-> TypedExpression {
"fold_binary": { "operand": "comparison", "operator": "and_op" }
}
comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
-> TypedExpression {
"fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
}
addition = { multiplication ~ ((add_op | sub_op) ~ multiplication)* }
-> TypedExpression {
"fold_binary": { "operand": "multiplication", "operator": "add_op|sub_op" }
}
multiplication = { unary ~ ((mul_op | div_op) ~ unary)* }
-> TypedExpression {
"fold_binary": { "operand": "unary", "operator": "mul_op|div_op" }
}
unary = { unary_with_op | primary }
-> TypedExpression { "get_child": { "index": 0 } }
unary_with_op = { unary_op ~ primary }
-> TypedExpression {
"commands": [
{ "define": "unary", "args": { "op": "$1", "operand": "$2" } }
]
}
primary = { integer | identifier_expr | paren_expr }
-> TypedExpression { "get_child": { "index": 0 } }
paren_expr = _{ "(" ~ expr ~ ")" }
integer = @{ ASCII_DIGIT+ }
-> TypedExpression {
"get_text": true,
"parse_int": true,
"define": "int_literal",
"args": { "value": "$result" }
}
identifier_expr = { identifier }
-> TypedExpression {
"get_text": true,
"define": "variable",
"args": { "name": "$result" }
}
// Operators
and_op = { "and" } -> String { "get_text": true }
or_op = { "or" } -> String { "get_text": true }
eq_op = { "==" } -> String { "get_text": true }
neq_op = { "!=" } -> String { "get_text": true }
lt_op = { "<" } -> String { "get_text": true }
gt_op = { ">" } -> String { "get_text": true }
add_op = { "+" } -> String { "get_text": true }
sub_op = { "-" } -> String { "get_text": true }
mul_op = { "*" } -> String { "get_text": true }
div_op = { "/" } -> String { "get_text": true }
unary_op = { "-" | "!" } -> String { "get_text": true }
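To see how the layered rules produce precedence, the following Python sketch mirrors the grammar with a small hand-written recursive-descent parser. It is a model for experimentation, not the grammar engine itself: the tokenizer, the Parser class, and the tuple node shapes are all hypothetical. Each entry in LEVELS plays the role of one fold_binary rule, from logical_or down to multiplication.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|==|!=|[-+*/<>!()])")
IDENT = re.compile(r"[A-Za-z_]\w*")

# One entry per grammar level, loosest binding first.
LEVELS = [{"or"}, {"and"}, {"==", "!=", "<", ">"}, {"+", "-"}, {"*", "/"}]


def tokenize(src: str) -> list:
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def level(self, n=0):
        """Parse one precedence level as a left fold over the next level."""
        if n == len(LEVELS):
            return self.unary()
        result = self.level(n + 1)
        while self.peek() in LEVELS[n]:
            op = self.take()
            result = ("binary", op, result, self.level(n + 1))
        return result

    def unary(self):
        # Mirrors unary_with_op = { unary_op ~ primary }.
        if self.peek() in ("-", "!"):
            return ("unary", self.take(), self.primary())
        return self.primary()

    def primary(self):
        tok = self.take()
        if tok == "(":
            inner = self.level()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok.isdigit():
            return ("int_literal", int(tok))
        if IDENT.fullmatch(tok):
            return ("variable", tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(src: str):
    parser = Parser(tokenize(src))
    tree = parser.level()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))
```

Because multiplication sits below addition in the chain, 2 * 3 is grouped first and becomes the right operand of the + node, while operators at the same level still fold to the left.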