05 Semantic Actions

Chapter 5: Semantic Actions

Semantic actions are JSON command blocks that describe how to build TypedAST nodes from parsed syntax. This chapter covers all available commands and patterns.

Basic Structure

Every grammar rule can have a semantic action:

rule_name = { pattern }
  -> ResultType {
      // JSON commands
  }

The ResultType indicates what kind of AST node the rule produces:

TypedExpression - Expressions (literals, operations, calls)
TypedStatement - Statements (if, while, return)
TypedDeclaration - Top-level declarations (functions, structs)
TypedBlock - Statement blocks
TypedParameter - Function parameters
TypedField - Struct fields
TypedVariant - Enum variants
Type - Type expressions
List - Collect multiple children
String - Extract text

Child References

Commands can reference parsed children using $N syntax:

// Children are numbered by their position in the parse tree
// $1 = first non-silent child, $2 = second, etc.

binary_expr = { left ~ "+" ~ right }
  -> TypedExpression {
      "commands": [
          { "define": "binary", "args": { "left": "$1", "op": "+", "right": "$2" } }
      ]
  }

Special references:

$result - The value produced by the preceding command (for example, the text from get_text or the integer from parse_int)
$1, $2, ... - Child nodes by position
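The reference rules above can be modelled in a few lines. This is a minimal Python sketch, not the actual implementation: the `resolve` function, the use of plain strings as stand-ins for child nodes, and the choice to return `None` for an out-of-range reference are all assumptions made for illustration.

```python
import re

def resolve(value, children, result=None):
    """Resolve one argument value from a semantic action.

    Hypothetical model: `children` holds the non-silent children in
    parse order, `result` is whatever the previous command produced.
    """
    if value == "$result":
        return result
    if isinstance(value, str):
        m = re.fullmatch(r"\$(\d+)", value)
        if m:
            # $N is 1-based: $1 maps to children[0].
            index = int(m.group(1)) - 1
            # Assumption: a reference past the end yields None.
            return children[index] if 0 <= index < len(children) else None
    # Anything else (such as the literal "+") passes through unchanged.
    return value

# binary_expr = { left ~ "+" ~ right }: the "+" literal is not a child,
# so the rule has exactly two children.
children = ["left_node", "right_node"]
args = {"left": "$1", "op": "+", "right": "$2"}
resolved = {name: resolve(v, children) for name, v in args.items()}
print(resolved)  # {'left': 'left_node', 'op': '+', 'right': 'right_node'}
```

Note that the string literal in the pattern does not consume an index, which is why `right` is `$2` rather than `$3`.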

Core Commands

get_text

Extracts the matched text as a string:

identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
  -> String {
      "get_text": true
  }

parse_int

Parses extracted text as an integer:

integer_literal = @{ "-"? ~ ASCII_DIGIT+ }
  -> TypedExpression {
      "get_text": true,
      "parse_int": true,
      "define": "int_literal",
      "args": { "value": "$result" }
  }
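To make the flow of `$result` through this action concrete, here is a small Python sketch. It assumes the keys run in the fixed order get_text, parse_int, define, and it models `define` as returning a `(method, args)` tuple rather than calling a real builder; `run_action` is a hypothetical name.

```python
def run_action(matched_text, action):
    """Run get_text / parse_int / define in order, threading $result.

    Hypothetical model: the execution order and the tuple-shaped
    "node" are assumptions made for illustration only.
    """
    result = None
    if action.get("get_text"):
        result = matched_text          # the text matched by the rule
    if action.get("parse_int"):
        result = int(result)           # replaces the text with an integer
    if "define" in action:
        args = {k: (result if v == "$result" else v)
                for k, v in action.get("args", {}).items()}
        result = (action["define"], args)
    return result

action = {
    "get_text": True,
    "parse_int": True,
    "define": "int_literal",
    "args": {"value": "$result"},
}
node = run_action("-42", action)
print(node)  # ('int_literal', {'value': -42})
```

By the time `define` runs, `$result` holds the parsed integer, not the original text.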

get_child

Gets a specific child by index. The index is zero-based, unlike $N references, which start at $1:

// Get the first child (index 0)
expr = { inner_expr }
  -> TypedExpression {
      "get_child": { "index": 0 }
  }

get_all_children

Collects all children into a list:

statements = { statement* }
  -> List {
      "get_all_children": true
  }

define

Calls an AST builder method with arguments:

return_stmt = { "return" ~ expr? ~ ";" }
  -> TypedStatement {
      "commands": [
          { "define": "return_stmt", "args": { "value": "$1" } }
      ]
  }

The `define` Command

The define command is the primary way to create AST nodes. It calls a method on the TypedAstBuilder.

Syntax

{ "define": "method_name", "args": { "arg1": "value1", "arg2": "$1" } }
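One way to picture this is as a name-based dispatch onto the builder. The Python sketch below is an assumption-laden model: `SketchBuilder` is a hypothetical stand-in for TypedAstBuilder that returns plain dicts, and `run_define` is an invented helper, not part of the real API.

```python
class SketchBuilder:
    """Hypothetical stand-in for TypedAstBuilder; returns plain dicts."""

    def int_literal(self, value):
        return {"kind": "int_literal", "value": value}

    def binary(self, op, left, right):
        return {"kind": "binary", "op": op, "left": left, "right": right}


def run_define(builder, command, children):
    """Look up the method named by "define" and call it with resolved args."""
    method = getattr(builder, command["define"])
    args = {}
    for name, value in command.get("args", {}).items():
        if isinstance(value, str) and value.startswith("$") and value[1:].isdigit():
            value = children[int(value[1:]) - 1]  # $N is 1-based
        args[name] = value
    return method(**args)


builder = SketchBuilder()
one = builder.int_literal(1)
two = builder.int_literal(2)
node = run_define(
    builder,
    {"define": "binary", "args": {"left": "$1", "op": "+", "right": "$2"}},
    [one, two],
)
print(node["kind"], node["op"])  # binary +
```

Because arguments are passed by name, their order inside "args" does not matter in this model.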

Available Methods

Literals

Method	Arguments	Creates
`int_literal`	`value`	Integer literal
`bool_literal`	`value`	Boolean literal
`string_literal`	`value`	String literal
`char_literal`	`value`	Character literal

bool_literal = { "true" | "false" }
  -> TypedExpression {
      "get_text": true,
      "define": "bool_literal",
      "args": { "value": "$result" }
  }

Variables and Access

Method	Arguments	Creates
`variable`	`name`	Variable reference
`field_access`	`object`, `field`	Field access (obj.field)
`index`	`object`, `index`	Index access (arr[i])

field_expr = { atom ~ "." ~ identifier }
  -> TypedExpression {
      "commands": [
          { "define": "field_access", "args": { "object": "$1", "field": "$2" } }
      ]
  }

Operations

Method	Arguments	Creates
`binary`	`op`, `left`, `right`	Binary operation
`unary`	`op`, `operand`	Unary operation
`call`	`callee`, `args`	Function call

unary_expr = { unary_op ~ primary }
  -> TypedExpression {
      "commands": [
          { "define": "unary", "args": { "op": "$1", "operand": "$2" } }
      ]
  }

Statements

Method	Arguments	Creates
`let_stmt`	`name`, `init`, `is_const`, `type`?	Variable declaration
`return_stmt`	`value`?	Return statement
`if`	`condition`, `then_branch`, `else_branch`?	If statement
`while`	`condition`, `body`	While loop
`for`	`iterable`, `binding`, `body`	For loop
`expression_stmt`	`expr`	Expression statement
`assignment`	`target`, `value`	Assignment
`break`		Break statement
`continue`		Continue statement

if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
  -> TypedStatement {
      "commands": [
          { "define": "if", "args": {
              "condition": "$1",
              "then_branch": "$2",
              "else_branch": "$3"
          }}
      ]
  }

Blocks and Programs

Method	Arguments	Creates
`block`	`statements`	Statement block
`program`	`declarations`	Program root

block = { "{" ~ statement* ~ "}" }
  -> TypedBlock {
      "get_all_children": true,
      "define": "block",
      "args": { "statements": "$result" }
  }

Declarations

Method	Arguments	Creates
`function`	`name`, `params`, `return_type`, `body`	Function declaration
`struct`	`name`, `fields`	Struct declaration
`enum`	`name`, `variants`	Enum declaration
`param`	`name`, `type`	Function parameter
`field`	`name`, `type`	Struct field
`variant`	`name`	Enum variant

fn_decl = { "fn" ~ identifier ~ "(" ~ fn_params ~ ")" ~ type_expr ~ block }
  -> TypedDeclaration {
      "commands": [
          { "define": "function", "args": {
              "name": "$1",
              "params": "$2",
              "return_type": "$3",
              "body": "$4"
          }}
      ]
  }

Structs and Enums

Method	Arguments	Creates
`struct_init`	`type_name`, `fields`	Struct literal
`struct_field_init`	`name`, `value`	Field initializer
`array_literal`	`elements`	Array literal

struct_init = { identifier ~ "{" ~ struct_init_fields? ~ "}" }
  -> TypedExpression {
      "commands": [
          { "define": "struct_init", "args": { "type_name": "$1", "fields": "$2" } }
      ]
  }

Types

Method	Arguments	Creates
`primitive_type`	`name`	Primitive type (i32, bool, etc.)
`pointer_type`	`pointee`	Pointer type (*T)
`optional_type`	`inner`	Optional type (?T)
`array_type`	`size`, `element`	Array type ([N]T)

primitive_type = { "i32" | "i64" | "bool" | "void" }
  -> Type {
      "get_text": true,
      "define": "primitive_type",
      "args": { "name": "$result" }
  }

Pattern Matching

These define methods create pattern nodes for switch expressions and pattern matching:

Method	Arguments	Creates
`literal_pattern`	`value`	Match a literal value (int, string)
`wildcard_pattern`		Match anything (`_` or `else`)
`range_pattern`	`start`, `end`, `inclusive`	Match a range (`1..10`)
`identifier_pattern`	`name`	Bind matched value to variable
`struct_pattern`	`name`, `fields`	Match struct with field patterns
`field_pattern`	`name`, `pattern`?	Match a struct field
`enum_pattern`	`name`, `variant`, `fields`	Match enum/tagged union variant
`array_pattern`	`elements`	Match array elements
`pointer_pattern`	`inner`, `mutable`	Match pointer dereference
`error_pattern`	`name`	Match error value (`error.OutOfMemory`)
`switch_expr`	`scrutinee`, `cases`	Switch expression
`switch_case`	`pattern`, `body`	Single switch case arm

// Literal pattern: match exact value
switch_literal_pattern = { integer_literal }
  -> TypedExpression {
      "commands": [
          { "define": "literal_pattern", "args": { "value": "$1" } }
      ]
  }

// Wildcard pattern: match anything
switch_wildcard_pattern = { "_" }
  -> TypedExpression {
      "commands": [
          { "define": "wildcard_pattern" }
      ]
  }

// Range pattern: match value in range
switch_range_pattern = { integer_literal ~ ".." ~ integer_literal }
  -> TypedExpression {
      "commands": [
          { "define": "range_pattern", "args": {
              "start": { "define": "literal_pattern", "args": { "value": "$1" } },
              "end": { "define": "literal_pattern", "args": { "value": "$2" } },
              "inclusive": false
          }}
      ]
  }

// Struct pattern: match struct fields
switch_struct_pattern = { identifier ~ "{" ~ struct_field_patterns? ~ "}" }
  -> TypedExpression {
      "commands": [
          { "define": "struct_pattern", "args": {
              "name": { "text": "$1" },
              "fields": "$2"
          }}
      ]
  }

// Tagged union pattern: .some, .none
switch_tagged_union_pattern = { "." ~ identifier }
  -> TypedExpression {
      "commands": [
          { "define": "enum_pattern", "args": {
              "name": "",
              "variant": { "text": "$1" },
              "fields": []
          }}
      ]
  }

// Error pattern: error.OutOfMemory
switch_error_pattern = { "error" ~ "." ~ identifier }
  -> TypedExpression {
      "commands": [
          { "define": "error_pattern", "args": {
              "name": { "text": "$1" }
          }}
      ]
  }

// Pointer pattern: *x
switch_pointer_pattern = { "*" ~ switch_pattern }
  -> TypedExpression {
      "commands": [
          { "define": "pointer_pattern", "args": {
              "inner": "$1",
              "mutable": false
          }}
      ]
  }
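The range pattern example above nests define commands inside the arguments of another define. A plausible reading is that nested commands are evaluated first and their results passed to the outer method. The Python sketch below models that reading only; `evaluate` and the `(method, args)` tuple representation are hypothetical, and the `{ "text": "$1" }` form used by some examples is not modelled.

```python
def evaluate(value, children):
    """Recursively evaluate an argument value.

    Hypothetical model: a built node is a (method, args) tuple.
    """
    if isinstance(value, dict) and "define" in value:
        # Evaluate nested arguments first, then "build" the outer node.
        args = {k: evaluate(v, children)
                for k, v in value.get("args", {}).items()}
        return (value["define"], args)
    if isinstance(value, str) and value.startswith("$") and value[1:].isdigit():
        return children[int(value[1:]) - 1]  # $N is 1-based
    return value  # plain literals such as False pass through

range_cmd = {"define": "range_pattern", "args": {
    "start": {"define": "literal_pattern", "args": {"value": "$1"}},
    "end": {"define": "literal_pattern", "args": {"value": "$2"}},
    "inclusive": False,
}}

# Children stand in for the two integer_literal nodes of `1..10`.
node = evaluate(range_cmd, [1, 10])
print(node[0])              # range_pattern
print(node[1]["start"])     # ('literal_pattern', {'value': 1})
```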

The `fold_binary` Command

This special command builds left-associative binary expression trees from repetition patterns.

Problem It Solves

Given input 1 + 2 + 3, we want the left-associative tree:

      +
     / \
    +   3
   / \
  1   2

Not:

  +
 / \
1   +
   / \
  2   3

Usage

addition = { term ~ ((add_op | sub_op) ~ term)* }
  -> TypedExpression {
      "fold_binary": { "operand": "term", "operator": "add_op|sub_op" }
  }

Parameters:

operand: Name of the operand rule
operator: Operator rules (pipe-separated for multiple)

How It Works

For input 1 + 2 - 3:

Parse produces: [term(1), add_op(+), term(2), sub_op(-), term(3)]
Fold starts with first term: result = 1
Process pairs: result = binary(+, result, 2) → (1 + 2)
Continue: result = binary(-, result, 3) → ((1 + 2) - 3)
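The steps above amount to a left fold over the flat child list. Here is a minimal Python sketch of that fold. It assumes children arrive as `(rule_name, value)` pairs and models each binary node as a `("binary", op, left, right)` tuple; both are illustrative stand-ins, not the real data structures.

```python
def fold_binary(children, operand, operators):
    """Left-fold [operand, op, operand, op, operand, ...] into a tree.

    `operators` is the pipe-separated string from the action.
    """
    operator_rules = set(operators.split("|"))
    first_rule, result = children[0]
    assert first_rule == operand
    rest = children[1:]
    # Walk the remaining children as (operator, operand) pairs.
    for (op_rule, op), (rhs_rule, rhs) in zip(rest[0::2], rest[1::2]):
        assert op_rule in operator_rules and rhs_rule == operand
        # The tree built so far becomes the LEFT child of the new node.
        result = ("binary", op, result, rhs)
    return result

children = [("term", 1), ("add_op", "+"), ("term", 2),
            ("sub_op", "-"), ("term", 3)]
tree = fold_binary(children, "term", "add_op|sub_op")
print(tree)  # ('binary', '-', ('binary', '+', 1, 2), 3)
```

With a single operand and no operators, the loop never runs and the operand is returned unchanged, so the rule behaves like a passthrough in that case.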

Multiple Operators

Handle different operators at the same precedence level:

comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
  -> TypedExpression {
      "fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
  }

Command Sequences

Use the commands array to execute multiple commands in sequence:

typed_var_decl = { "const" ~ identifier ~ ":" ~ type_expr ~ "=" ~ expr ~ ";" }
  -> TypedDeclaration {
      "commands": [
          { "define": "let_stmt", "args": {
              "name": "$1",
              "type": "$2",
              "init": "$3",
              "is_const": true
          }}
      ]
  }

Handling Optional Children

When a child might be absent, the builder handles the null or missing value gracefully:

// expr? produces None if missing
return_stmt = { "return" ~ expr? ~ ";" }
  -> TypedStatement {
      "commands": [
          { "define": "return_stmt", "args": { "value": "$1" } }
      ]
  }

For complex cases, use separate rules:

// Split into variants to avoid indexing issues
if_stmt = { if_else | if_only }

if_only = { "if" ~ "(" ~ expr ~ ")" ~ block }
  -> TypedStatement {
      "commands": [
          { "define": "if", "args": {
              "condition": "$1",
              "then_branch": "$2"
          }}
      ]
  }

if_else = { "if" ~ "(" ~ expr ~ ")" ~ block ~ "else" ~ block }
  -> TypedStatement {
      "commands": [
          { "define": "if", "args": {
              "condition": "$1",
              "then_branch": "$2",
              "else_branch": "$3"
          }}
      ]
  }

Passthrough Rules

Sometimes a rule just selects between alternatives without transforming:

// Just pass through the matched child
expr = { logical_or }
  -> TypedExpression {
      "get_child": { "index": 0 }
  }

statement = { if_stmt | while_stmt | return_stmt | expr_stmt }
  -> TypedStatement {
      "get_child": { "index": 0 }
  }

Complete Example

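Here's a complete expression grammar. Operator precedence comes from rule nesting: each rule's operands are the next, tighter-binding rule, from logical_or (loosest) down to multiplication and then unary: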

expr = { logical_or }
  -> TypedExpression { "get_child": { "index": 0 } }

logical_or = { logical_and ~ (or_op ~ logical_and)* }
  -> TypedExpression {
      "fold_binary": { "operand": "logical_and", "operator": "or_op" }
  }

logical_and = { comparison ~ (and_op ~ comparison)* }
  -> TypedExpression {
      "fold_binary": { "operand": "comparison", "operator": "and_op" }
  }

comparison = { addition ~ ((eq_op | neq_op | lt_op | gt_op) ~ addition)* }
  -> TypedExpression {
      "fold_binary": { "operand": "addition", "operator": "eq_op|neq_op|lt_op|gt_op" }
  }

addition = { multiplication ~ ((add_op | sub_op) ~ multiplication)* }
  -> TypedExpression {
      "fold_binary": { "operand": "multiplication", "operator": "add_op|sub_op" }
  }

multiplication = { unary ~ ((mul_op | div_op) ~ unary)* }
  -> TypedExpression {
      "fold_binary": { "operand": "unary", "operator": "mul_op|div_op" }
  }

unary = { unary_with_op | primary }
  -> TypedExpression { "get_child": { "index": 0 } }

unary_with_op = { unary_op ~ primary }
  -> TypedExpression {
      "commands": [
          { "define": "unary", "args": { "op": "$1", "operand": "$2" } }
      ]
  }

primary = { integer | identifier_expr | paren_expr }
  -> TypedExpression { "get_child": { "index": 0 } }

paren_expr = _{ "(" ~ expr ~ ")" }

integer = @{ ASCII_DIGIT+ }
  -> TypedExpression {
      "get_text": true,
      "parse_int": true,
      "define": "int_literal",
      "args": { "value": "$result" }
  }

identifier_expr = { identifier }
  -> TypedExpression {
      "get_text": true,
      "define": "variable",
      "args": { "name": "$result" }
  }

// Operators
and_op = { "and" } -> String { "get_text": true }
or_op = { "or" } -> String { "get_text": true }
eq_op = { "==" } -> String { "get_text": true }
neq_op = { "!=" } -> String { "get_text": true }
lt_op = { "<" } -> String { "get_text": true }
gt_op = { ">" } -> String { "get_text": true }
add_op = { "+" } -> String { "get_text": true }
sub_op = { "-" } -> String { "get_text": true }
mul_op = { "*" } -> String { "get_text": true }
div_op = { "/" } -> String { "get_text": true }
unary_op = { "-" | "!" } -> String { "get_text": true }
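To see why this layering yields the expected precedence and associativity, the following Python sketch mirrors the rule structure with a small hand-written parser. It is an analogy only, not how the grammar is executed: the tokenizer, the `LEVELS` table, and the tuple-shaped trees are all assumptions made for illustration.

```python
import re

# One entry per grammar level, loosest-binding first, mirroring
# logical_or, logical_and, comparison, addition, multiplication.
LEVELS = [
    {"or"},
    {"and"},
    {"==", "!=", "<", ">"},
    {"+", "-"},
    {"*", "/"},
]

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|==|!=|[-+*/<>!()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_level(tokens, level=0):
    """Parse one precedence level and left-fold its operators."""
    if level == len(LEVELS):
        return parse_unary(tokens)
    result = parse_level(tokens, level + 1)
    while tokens and tokens[0] in LEVELS[level]:
        op = tokens.pop(0)
        rhs = parse_level(tokens, level + 1)
        result = (op, result, rhs)  # same fold as fold_binary
    return result

def parse_unary(tokens):
    # Mirrors unary = { unary_with_op | primary }.
    if tokens and tokens[0] in ("-", "!"):
        op = tokens.pop(0)
        return (op, parse_primary(tokens))
    return parse_primary(tokens)

def parse_primary(tokens):
    # Mirrors primary = { integer | identifier_expr | paren_expr }.
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_level(tokens)
        assert tokens.pop(0) == ")"
        return inner
    return int(tok) if tok.isdigit() else tok

print(parse_level(tokenize("1 + 2 * 3")))  # ('+', 1, ('*', 2, 3))
print(parse_level(tokenize("1 - 2 - 3")))  # ('-', ('-', 1, 2), 3)
```

Multiplication binds tighter than addition because the addition level only ever sees whole multiplication results as operands, and each level folds to the left.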

Next Steps

Chapter 6: Understand the TypedAST structure these commands create
Chapter 7: Use the builder API directly in Rust
Chapter 8: See all commands used in a complete grammar
