05 Semantic Actions
Semantic actions are JSON command blocks that describe how to build TypedAST nodes from parsed syntax. This chapter covers all available commands and patterns.
Every grammar rule can have a semantic action:
rule_name = { pattern }
-> ResultType {
// JSON commands
}
The ResultType indicates what kind of AST node the rule produces:
- TypedExpression - Expressions (literals, operations, calls)
- TypedStatement - Statements (if, while, return)
- TypedDeclaration - Top-level declarations (functions, structs)
- TypedBlock - Statement blocks
- TypedParameter - Function parameters
- TypedField - Struct fields
- TypedVariant - Enum variants
- Type - Type expressions
- List - Collect multiple children
- String - Extract text
Commands can reference parsed children using $N syntax:
// Children are numbered by their position in the parse tree
// $1 = first non-silent child, $2 = second, etc.
binary_expr = { left ~ "+" ~ right }
-> TypedExpression {
"commands": [
{ "define": "binary", "args": { "left": "$1", "op": "+", "right": "$2" } }
]
}
Special references:
- $result - The result of previous commands (like get_text)
- $1, $2, ... - Child nodes by position
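The reference rules above can be sketched in a few lines of Python. This is an illustration of the numbering only, not the actual implementation: the function `resolve_ref` and its parameters are hypothetical, and returning `None` for an out-of-range index is an assumption based on the note later in this chapter that a missing `expr?` produces None.

```python
def resolve_ref(value, children, result=None):
    """Resolve a "$N" or "$result" reference; return other values unchanged.

    `children` holds the non-silent children in parse order.
    All names here are hypothetical, for illustration only.
    """
    if not isinstance(value, str) or not value.startswith("$"):
        return value                      # plain literal such as "+" or True
    if value == "$result":
        return result                     # output of the previous command
    if not value[1:].isdigit():
        raise ValueError(f"unknown reference: {value}")
    index = int(value[1:])                # "$1" -> 1 (references are 1-based)
    if 1 <= index <= len(children):
        return children[index - 1]
    return None                           # assumed behaviour for a missing child


# Children of `left ~ "+" ~ right`: the "+" literal is not a child.
children = ["left_node", "right_node"]
args = {"left": "$1", "op": "+", "right": "$2"}
resolved = {key: resolve_ref(val, children) for key, val in args.items()}
print(resolved)  # {'left': 'left_node', 'op': '+', 'right': 'right_node'}
```

Literal values such as `"+"` pass through untouched, which is why the `op` argument in the `binary_expr` example can be written directly in the action.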
The get_text command extracts the matched text as a string:
identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
-> String {
"get_text": true
}
The parse_int command parses the extracted text as an integer:
integer_literal = @{ "-"? ~ ASCII_DIGIT+ }
-> TypedExpression {
"get_text": true,
"parse_int": true,
"define": "int_literal",
"args": { "value": "$result" }
}
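The `integer_literal` action chains three steps, with `$result` carrying the value from one step to the next. A minimal Python sketch of that flow follows. It assumes the steps run in the order get_text, parse_int, define; the function `run_action` is hypothetical, and plain dicts stand in for real TypedAST nodes.

```python
def run_action(action, matched_text):
    """Run a flat action object against the text matched by the rule.

    Hypothetical sketch: assumes get_text, then parse_int, then define.
    """
    result = None
    if action.get("get_text"):
        result = matched_text                 # raw text of the match
    if action.get("parse_int"):
        result = int(result)                  # "-42" -> -42
    if "define" in action:
        args = {
            key: (result if value == "$result" else value)
            for key, value in action.get("args", {}).items()
        }
        result = {"node": action["define"], **args}   # stand-in AST node
    return result


action = {
    "get_text": True,
    "parse_int": True,
    "define": "int_literal",
    "args": {"value": "$result"},
}
print(run_action(action, "-42"))  # {'node': 'int_literal', 'value': -42}
```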
The get_child command gets a specific child by its zero-based index:
// Get the first child (index 0)
expr = { inner_expr }
-> TypedExpression {
"get_child": { "index": 0 }
}
The get_all_children command collects all children into a list:
statements = { statement* }
-> List {
"get_all_children": true
}
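Note the two numbering schemes in play: get_child uses a zero-based index, while `$N` references start at 1. The Python sketch below contrasts them. The helper names are hypothetical and the child values are placeholder strings.

```python
def get_child(children, index):
    """Zero-based lookup, as in { "get_child": { "index": 0 } }."""
    return children[index]


def get_all_children(children):
    """Return every child as a list, as in { "get_all_children": true }."""
    return list(children)


def positional(children, n):
    """One-based lookup, as in "$1", "$2", ..."""
    return children[n - 1]


children = ["stmt_a", "stmt_b", "stmt_c"]
print(get_child(children, 0))        # stmt_a
print(positional(children, 1))       # stmt_a  (same child, different numbering)
print(get_all_children(children))    # ['stmt_a', 'stmt_b', 'stmt_c']
```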
The define command calls an AST builder method with arguments:
return_stmt = { "return" ~ expr? ~ ";" }
-> TypedStatement {
"commands": [
{ "define": "return_stmt", "args": { "value": "$1" } }
]
}
The define command is the primary way to create AST nodes. It calls a method on the TypedAstBuilder.
{ "define": "method_name", "args": { "arg1": "value1", "arg2": "$1" } }

| Method | Arguments | Creates |
|---|---|---|
| int_literal | value | Integer literal |
| bool_literal | value | Boolean literal |
| string_literal | value | String literal |
| char_literal | value | Character literal |
bool_literal = { "true" | "false" }
-> TypedExpression {
"get_text": true,
"define": "bool_literal",
"args": { "value": "$result" }
}
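Conceptually, define is a lookup by method name followed by a call with the named arguments. The Python sketch below shows that dispatch. `SketchBuilder` is a hypothetical stand-in for TypedAstBuilder with only two methods, and its nodes are plain dicts; the real builder's signatures and node types are not shown in this chapter.

```python
class SketchBuilder:
    """Hypothetical stand-in for TypedAstBuilder; nodes are plain dicts."""

    def int_literal(self, value):
        return {"kind": "int_literal", "value": value}

    def bool_literal(self, value):
        return {"kind": "bool_literal", "value": value}


def define(builder, method_name, args):
    """Look up `method_name` on the builder and call it with `args`."""
    method = getattr(builder, method_name, None)
    if method is None:
        raise ValueError(f"unknown builder method: {method_name}")
    return method(**args)


builder = SketchBuilder()
node = define(builder, "bool_literal", {"value": "true"})
print(node)  # {'kind': 'bool_literal', 'value': 'true'}
```

A misspelled method name fails at the lookup step, so a typo in an action surfaces as an explicit error rather than a silently missing node.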
| Method | Arguments | Creates |
|---|---|---|
| variable | name | Variable reference |
| field_access | object, field | Field access (obj.field) |
| index | object, index | Index access (arr[i]) |
field_expr = { atom ~ "." ~ identifier }
-> TypedExpression {
"commands": [
{ "define": "field_access", "args": { "object": "$1", "field": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| binary | op, left, right | Binary operation |
| unary | op, operand | Unary operation |
| call | callee, args | Function call |
unary_expr = { unary_op ~ primary }
-> TypedExpression {
"commands": [
{ "define": "unary", "args": { "op": "$1", "operand": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| let_stmt | name, init, is_const, type? | Variable declaration |
| return_stmt | value? | Return statement |
| if | condition, then_branch, else_branch? | If statement |
| while | condition, body | While loop |
| for | iterable, binding, body | For loop |
| expression_stmt | expr | Expression statement |
| assignment | target, value | Assignment |
| break | (none) | Break statement |
| continue | (none) | Continue statement |
if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2",
"else_branch": "$3"
}}
]
}
| Method | Arguments | Creates |
|---|---|---|
| block | statements | Statement block |
| program | declarations | Program root |
block = { "{" ~ statement* ~ "}" }
-> TypedBlock {
"get_all_children": true,
"define": "block",
"args": { "statements": "$result" }
}
| Method | Arguments | Creates |
|---|---|---|
| function | name, params, return_type, body | Function declaration |
| struct | name, fields | Struct declaration |
| enum | name, variants | Enum declaration |
| param | name, type | Function parameter |
| field | name, type | Struct field |
| variant | name | Enum variant |
fn_decl = { "fn" ~ identifier ~ "(" ~ fn_params ~ ")" ~ type_expr ~ block }
-> TypedDeclaration {
"commands": [
{ "define": "function", "args": {
"name": "$1",
"params": "$2",
"return_type": "$3",
"body": "$4"
}}
]
}
| Method | Arguments | Creates |
|---|---|---|
| struct_init | type_name, fields | Struct literal |
| struct_field_init | name, value | Field initializer |
| array_literal | elements | Array literal |
struct_init = { identifier ~ "{" ~ struct_init_fields? ~ "}" }
-> TypedExpression {
"commands": [
{ "define": "struct_init", "args": { "type_name": "$1", "fields": "$2" } }
]
}
| Method | Arguments | Creates |
|---|---|---|
| primitive_type | name | Primitive type (i32, bool, etc.) |
| pointer_type | pointee | Pointer type (*T) |
| optional_type | inner | Optional type (?T) |
| array_type | size, element | Array type ([N]T) |
primitive_type = { "i32" | "i64" | "bool" | "void" }
-> Type {
"get_text": true,
"define": "primitive_type",
"args": { "name": "$result" }
}
These commands create pattern nodes for switch expressions and pattern matching:
| Method | Arguments | Creates |
|---|---|---|
| literal_pattern | value | Match a literal value (int, string) |
| wildcard_pattern | (none) | Match anything (_ or else) |
| range_pattern | start, end, inclusive | Match a range (1..10) |
| identifier_pattern | name | Bind matched value to variable |
| struct_pattern | name, fields | Match struct with field patterns |
| field_pattern | name, pattern? | Match a struct field |
| enum_pattern | name, variant, fields | Match enum/tagged union variant |
| array_pattern | elements | Match array elements |
| pointer_pattern | inner, mutable | Match pointer dereference |
| error_pattern | name | Match error value (error.OutOfMemory) |
| switch_expr | scrutinee, cases | Switch expression |
| switch_case | pattern, body | Single switch case arm |
// Literal pattern: match exact value
switch_literal_pattern = { integer_literal }
-> TypedExpression {
"commands": [
{ "define": "literal_pattern", "args": { "value": "$1" } }
]
}
// Wildcard pattern: match anything
switch_wildcard_pattern = { "_" }
-> TypedExpression {
"commands": [
{ "define": "wildcard_pattern" }
]
}
// Range pattern: match value in range
switch_range_pattern = { integer_literal ~ ".." ~ integer_literal }
-> TypedExpression {
"commands": [
{ "define": "range_pattern", "args": {
"start": { "define": "literal_pattern", "args": { "value": "$1" } },
"end": { "define": "literal_pattern", "args": { "value": "$2" } },
"inclusive": false
}}
]
}
// Struct pattern: match struct fields
switch_struct_pattern = { identifier ~ "{" ~ struct_field_patterns? ~ "}" }
-> TypedExpression {
"commands": [
{ "define": "struct_pattern", "args": {
"name": { "text": "$1" },
"fields": "$2"
}}
]
}
// Tagged union pattern: .some, .none
switch_tagged_union_pattern = { "." ~ identifier }
-> TypedExpression {
"commands": [
{ "define": "enum_pattern", "args": {
"name": "",
"variant": { "text": "$1" },
"fields": []
}}
]
}
// Error pattern: error.OutOfMemory
switch_error_pattern = { "error" ~ "." ~ identifier }
-> TypedExpression {
"commands": [
{ "define": "error_pattern", "args": {
"name": { "text": "$1" }
}}
]
}
// Pointer pattern: *x
switch_pointer_pattern = { "*" ~ switch_pattern }
-> TypedExpression {
"commands": [
{ "define": "pointer_pattern", "args": {
"inner": "$1",
"mutable": false
}}
]
}
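The range pattern example nests define commands inside args: each bound is built as a literal_pattern before range_pattern receives it. One way to picture this is a recursive evaluation of argument values, sketched below in Python. The function `build` is hypothetical, nodes are plain dicts, and the `{ "text": "$1" }` form used in other examples is deliberately not handled here.

```python
def build(command, children):
    """Evaluate a define command, building nested define arguments first.

    Hypothetical sketch: handles "$N" references, nested defines, and
    literal values only.
    """
    args = {}
    for key, value in command.get("args", {}).items():
        if isinstance(value, dict) and "define" in value:
            args[key] = build(value, children)           # nested node
        elif isinstance(value, str) and value.startswith("$") and value[1:].isdigit():
            args[key] = children[int(value[1:]) - 1]     # 1-based child reference
        else:
            args[key] = value                            # literal such as False
    return {"kind": command["define"], **args}


range_command = {
    "define": "range_pattern",
    "args": {
        "start": {"define": "literal_pattern", "args": {"value": "$1"}},
        "end": {"define": "literal_pattern", "args": {"value": "$2"}},
        "inclusive": False,
    },
}
pattern = build(range_command, children=[1, 10])   # e.g. the input 1..10
print(pattern["start"])  # {'kind': 'literal_pattern', 'value': 1}
print(pattern["end"])    # {'kind': 'literal_pattern', 'value': 10}
```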
The fold_binary command is a special command that builds left-associative binary expression trees from repetition patterns.
Given input 1 + 2 + 3, we want:
+
/ \
+ 3
/ \
1 2
Not:
+
/ \
1 +
/ \
2 3
addition = { term ~ ((add_op | sub_op) ~ term)* }
-> TypedExpression {
"fold_binary": { "operand": "term", "operator": "add_op|sub_op" }
}
Parameters:
- operand: Name of the operand rule
- operator: Operator rules (pipe-separated for multiple)
For input 1 + 2 - 3:
- Parse produces: [term(1), add_op(+), term(2), sub_op(-), term(3)]
- Fold starts with first term: result = 1
- Process pairs: result = binary(+, result, 2) → (1 + 2)
- Continue: result = binary(-, result, 3) → ((1 + 2) - 3)
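The steps above amount to a left fold over operator/operand pairs. The following Python sketch reproduces them under stated assumptions: children arrive as `(rule_name, value)` pairs in parse order, tuples stand in for TypedAST nodes, and the function is a model of the behaviour described here rather than the actual implementation.

```python
def fold_binary(children, operand, operator):
    """Fold [operand, op, operand, op, ...] into a left-leaning tree.

    `children` is a list of (rule_name, value) pairs in parse order.
    Nodes are plain tuples standing in for real TypedAST nodes.
    """
    operator_rules = set(operator.split("|"))     # "add_op|sub_op" -> two names
    if not children or children[0][0] != operand:
        raise ValueError("expected an operand first")
    result = children[0][1]                       # fold starts with the first operand
    rest = children[1:]
    if len(rest) % 2 != 0:
        raise ValueError("operator without a right-hand operand")
    for (op_rule, op), (rhs_rule, rhs) in zip(rest[0::2], rest[1::2]):
        if op_rule not in operator_rules or rhs_rule != operand:
            raise ValueError(f"unexpected child: {op_rule} / {rhs_rule}")
        result = ("binary", op, result, rhs)      # previous result becomes the left side
    return result


children = [("term", 1), ("add_op", "+"), ("term", 2), ("sub_op", "-"), ("term", 3)]
tree = fold_binary(children, operand="term", operator="add_op|sub_op")
print(tree)  # ('binary', '-', ('binary', '+', 1, 2), 3)
```

Because each new node takes the running result as its left operand, the tree leans left, which is what makes `1 - 2 - 3` evaluate as `(1 - 2) - 3`.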
Handle different operators at the same precedence level:
comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
-> TypedExpression {
"fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
}
Use the commands array to execute multiple commands in sequence:
typed_var_decl = { "const" ~ identifier ~ ":" ~ type_expr ~ "=" ~ expr ~ ";" }
-> TypedDeclaration {
"commands": [
{ "define": "let_stmt", "args": {
"name": "$1",
"type": "$2",
"init": "$3",
"is_const": true
}}
]
}
When an optional child is absent, its reference produces None, and the builder handles the missing value gracefully:
// expr? produces None if missing
return_stmt = { "return" ~ expr? ~ ";" }
-> TypedStatement {
"commands": [
{ "define": "return_stmt", "args": { "value": "$1" } }
]
}
For complex cases, use separate rules:
// Split into variants to avoid indexing issues
if_stmt = { if_else | if_only }
if_only = { "if" ~ "(" ~ expr ~ ")" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2"
}}
]
}
if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
-> TypedStatement {
"commands": [
{ "define": "if", "args": {
"condition": "$1",
"then_branch": "$2",
"else_branch": "$3"
}}
]
}
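The indexing issue that motivates the split can be seen with a small Python sketch. It assumes that an absent optional child is simply left out of the child list, so later children move up one position. The `let_decl` rule in the comment is hypothetical and exists only to show an optional child in the middle of a sequence.

```python
def resolve(ref, children):
    """Resolve a 1-based "$N" reference; an out-of-range index yields None."""
    index = int(ref[1:])
    return children[index - 1] if 1 <= index <= len(children) else None


# Hypothetical rule with an optional child in the middle:
#   let_decl = { "let" ~ identifier ~ (":" ~ type_expr)? ~ "=" ~ expr }
with_type = ["x", "i32", "42"]      # identifier, type_expr, expr
without_type = ["x", "42"]          # type_expr absent: expr moves to position 2

print(resolve("$2", with_type))     # i32
print(resolve("$2", without_type))  # 42  (an expr, not a type)
print(resolve("$3", without_type))  # None

# A trailing optional, as in return_stmt = { "return" ~ expr? ~ ";" },
# causes no shift: "$1" is either the expr or None.
print(resolve("$1", []))            # None
```

Splitting a rule into variants such as if_only and if_else gives every variant a fixed child count, so each `$N` always refers to the same kind of child.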
Sometimes a rule just selects between alternatives without transforming:
// Just pass through the matched child
expr = { logical_or }
-> TypedExpression {
"get_child": { "index": 0 }
}
statement = { if_stmt | while_stmt | return_stmt | expr_stmt }
-> TypedStatement {
"get_child": { "index": 0 }
}
Here's a complete expression grammar with proper operator precedence:
expr = { logical_or }
-> TypedExpression { "get_child": { "index": 0 } }
logical_or = { logical_and ~ (or_op ~ logical_and)* }
-> TypedExpression {
"fold_binary": { "operand": "logical_and", "operator": "or_op" }
}
logical_and = { comparison ~ (and_op ~ comparison)* }
-> TypedExpression {
"fold_binary": { "operand": "comparison", "operator": "and_op" }
}
comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
-> TypedExpression {
"fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
}
addition = { multiplication ~ ((add_op | sub_op) ~ multiplication)* }
-> TypedExpression {
"fold_binary": { "operand": "multiplication", "operator": "add_op|sub_op" }
}
multiplication = { unary ~ ((mul_op | div_op) ~ unary)* }
-> TypedExpression {
"fold_binary": { "operand": "unary", "operator": "mul_op|div_op" }
}
unary = { unary_with_op | primary }
-> TypedExpression { "get_child": { "index": 0 } }
unary_with_op = { unary_op ~ primary }
-> TypedExpression {
"commands": [
{ "define": "unary", "args": { "op": "$1", "operand": "$2" } }
]
}
primary = { integer | identifier_expr | paren_expr }
-> TypedExpression { "get_child": { "index": 0 } }
paren_expr = _{ "(" ~ expr ~ ")" }
integer = @{ ASCII_DIGIT+ }
-> TypedExpression {
"get_text": true,
"parse_int": true,
"define": "int_literal",
"args": { "value": "$result" }
}
identifier_expr = { identifier }
-> TypedExpression {
"get_text": true,
"define": "variable",
"args": { "name": "$result" }
}
// Operators
and_op = { "and" } -> String { "get_text": true }
or_op = { "or" } -> String { "get_text": true }
eq_op = { "==" } -> String { "get_text": true }
neq_op = { "!=" } -> String { "get_text": true }
lt_op = { "<" } -> String { "get_text": true }
gt_op = { ">" } -> String { "get_text": true }
add_op = { "+" } -> String { "get_text": true }
sub_op = { "-" } -> String { "get_text": true }
mul_op = { "*" } -> String { "get_text": true }
div_op = { "/" } -> String { "get_text": true }
unary_op = { "-" | "!" } -> String { "get_text": true }
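To check what tree this rule chain should produce for a given input, it can help to model the same precedence levels outside the grammar. The Python sketch below is a hand-written recursive-descent parser, not the PEG engine: it mirrors the five fold_binary levels plus the unary and primary rules, and uses tuples in place of TypedAST nodes. All names in it are hypothetical.

```python
import re

# Precedence levels from loosest to tightest, mirroring the rule chain
# logical_or -> logical_and -> comparison -> addition -> multiplication.
LEVELS = [{"or"}, {"and"}, {"==", "!=", "<", ">"}, {"+", "-"}, {"*", "/"}]
TOKEN = re.compile(r"\s*(==|!=|[-+*/<>()!]|\d+|[A-Za-z_]\w*)")


def tokenize(source):
    source, tokens, pos = source.rstrip(), [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_level(tokens, level=0):
    if level == len(LEVELS):
        return parse_unary(tokens)
    result = parse_level(tokens, level + 1)       # first operand
    while tokens and tokens[0] in LEVELS[level]:
        op = tokens.pop(0)
        rhs = parse_level(tokens, level + 1)
        result = ("binary", op, result, rhs)      # left fold, like fold_binary
    return result


def parse_unary(tokens):
    if tokens and tokens[0] in ("-", "!"):        # unary_with_op = unary_op ~ primary
        return ("unary", tokens.pop(0), parse_primary(tokens))
    return parse_primary(tokens)


def parse_primary(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":                              # paren_expr is silent: no node
        inner = parse_level(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if token.isdigit():
        return ("int_literal", int(token))
    if re.fullmatch(r"[A-Za-z_]\w*", token):
        return ("variable", token)
    raise SyntaxError(f"unexpected token: {token}")


def parse(source):
    tokens = tokenize(source)
    tree = parse_level(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token: {tokens[0]}")
    return tree


print(parse("1 + 2 * 3"))
# ('binary', '+', ('int_literal', 1), ('binary', '*', ('int_literal', 2), ('int_literal', 3)))
```

Multiplication binds tighter than addition because the addition level takes multiplication results as its operands, the same layering the grammar expresses by naming the next rule as each level's operand.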