Skip to content

Implement language server protocol (LSP) and VS Code extension client#81

Merged
ehwan merged 22 commits into
mainfrom
lsp
Jun 23, 2026
Merged

Implement language server protocol (LSP) and VS Code extension client#81
ehwan merged 22 commits into
mainfrom
lsp

Conversation

@ehwan

@ehwan ehwan commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Description

This pull request introduces complete, rich editor integration for RustyLR grammar files via the new rusty_lr_lsp language server and the rustylr-vscode VS Code extension client.

Key Features Implemented

1. Language Server Protocol (rusty_lr_lsp)

  • Syntax Highlighting (Semantic Tokens): Theme-aligned token coloring for terminals, non-terminals, directives (%...), bindings (var=), location bindings (@...), and variables ($...).
  • Diagnostics: Inline warning/error reporting for syntax errors, conflict resolutions, unproductive rules, and unused symbols.
  • Auto-Formatting: Standardizes whitespace and handles block indentation of grammar rules and reduce action bodies.
  • Go to Definition: Jump to rule definitions, terminal declarations, and precedence rules (including %prec and precedence symbol declarations).
  • Find References: Find all usages of terminals, non-terminals, precedence symbols, and the error keyword throughout the grammar file (excluding the internals of reduce action blocks).
  • Hover Tooltips: Interactive documentation and type details for symbols, keywords, patterns, and variables.
  • Inlay Hints: Inline type annotations and reduce action block labels.
  • Auto-Completion: Intelligent suggestions for directives, symbols, locations, variables, and diagnostics.

2. VS Code Extension Client (rustylr-vscode)

  • Packaged configuration and release metadata in package.json (categories, keywords, repository tracking).
  • Implemented settings for customizing the language server executable path, arguments, and workspace root.
  • Created extension documentation (README.md, CHANGELOG.md).

@ehwan ehwan self-assigned this Jun 23, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new Language Server Protocol (LSP) server (rusty_lr_lsp) and a companion VS Code extension (editors/vscode-rustylr) to provide rich editor support for RustyLR grammar files, alongside minor API exposure in rusty_lr_parser. The review feedback highlights several robustness and correctness issues across the implementation: a parsing bug in the formatter's semicolon detection when comments are present, a potential out-of-bounds panic in the hover logic, an incorrect offset calculation in position mapping when characters are out of bounds, and an unhandled filesystem exception in the VS Code extension's root detection.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +123 to +159
fn find_directive_semicolon(content: &str, start: usize) -> Option<usize> {
let mut quote = None;
let mut escaped = false;
let mut paren_depth = 0usize;
let mut bracket_depth = 0usize;
let mut brace_depth = 0usize;

let remaining = &content[start..];
for (relative_idx, ch) in remaining.char_indices() {
if let Some(quote_ch) = quote {
if escaped {
escaped = false;
} else if ch == '\\' {
escaped = true;
} else if ch == quote_ch {
quote = None;
}
continue;
}

match ch {
'"' => quote = Some(ch),
'\'' if is_single_quote_literal_start(remaining, relative_idx) => quote = Some(ch),
'(' => paren_depth += 1,
')' => paren_depth = paren_depth.saturating_sub(1),
'[' => bracket_depth += 1,
']' => bracket_depth = bracket_depth.saturating_sub(1),
'{' => brace_depth += 1,
'}' => brace_depth = brace_depth.saturating_sub(1),
';' if paren_depth == 0 && bracket_depth == 0 && brace_depth == 0 => {
return Some(start + relative_idx);
}
_ => {}
}
}
None
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The find_directive_semicolon function does not skip comments (// or /* ... */). If a directive contains a comment with a semicolon (e.g., %token num Token::Num(_); // comment; here), the formatter will incorrectly identify the semicolon inside the comment as the end of the directive. This can lead to malformed formatting or syntax errors.

We should update the function to use a peekable iterator and skip comments properly.

fn find_directive_semicolon(content: &str, start: usize) -> Option<usize> {
    let mut quote = None;
    let mut escaped = false;
    let mut paren_depth = 0usize;
    let mut bracket_depth = 0usize;
    let mut brace_depth = 0usize;

    let remaining = &content[start..];
    let mut chars = remaining.char_indices().peekable();
    while let Some((relative_idx, ch)) = chars.next() {
        if let Some(quote_ch) = quote {
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == quote_ch {
                quote = None;
            }
            continue;
        }

        match ch {
            '"' => quote = Some(ch),
            '\'' if is_single_quote_literal_start(remaining, relative_idx) => quote = Some(ch),
            '/' => {
                if let Some((_, '/')) = chars.peek() {
                    chars.next();
                    while let Some((_, next_ch)) = chars.next() {
                        if next_ch == '\n' {
                            break;
                        }
                    }
                } else if let Some((_, '*')) = chars.peek() {
                    chars.next();
                    while let Some((_, next_ch)) = chars.next() {
                        if next_ch == '*' {
                            if let Some((_, '/')) = chars.peek() {
                                chars.next();
                                break;
                            }
                        }
                    }
                }
            }
            '(' => paren_depth += 1,
            ')' => paren_depth = paren_depth.saturating_sub(1),
            '[' => bracket_depth += 1,
            ']' => bracket_depth = bracket_depth.saturating_sub(1),
            '{' => brace_depth += 1,
            '}' => brace_depth = brace_depth.saturating_sub(1),
            ';' if paren_depth == 0 && bracket_depth == 0 && brace_depth == 0 => {
                return Some(start + relative_idx);
            }
            _ => {}
        }
    }
    None
}

Comment thread rusty_lr_lsp/src/hover.rs Outdated
Comment on lines +115 to +122
if action_range.start < content.len() && content.as_bytes()[action_range.start] == b'{' {
let start_brace_end = if action_range.start + 1 < action_range.end
&& content.as_bytes()[action_range.start + 1] == b'{'
{
action_range.start + 2
} else {
action_range.start + 1
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In reduce_action_brace_at_offset, if the document is concurrently modified or truncated, the cached action_range might exceed the current content.len(). Although action_range.start < content.len() is checked, action_range.start + 1 is not verified before indexing content.as_bytes()[action_range.start + 1]. This can cause a panic and crash the LSP server.

We should add a bounds check for action_range.start + 1 < content.len().

Suggested change
if action_range.start < content.len() && content.as_bytes()[action_range.start] == b'{' {
let start_brace_end = if action_range.start + 1 < action_range.end
&& content.as_bytes()[action_range.start + 1] == b'{'
{
action_range.start + 2
} else {
action_range.start + 1
};
if action_range.start < content.len() && content.as_bytes()[action_range.start] == b'{' {
let start_brace_end = if action_range.start + 1 < action_range.end
&& action_range.start + 1 < content.len()
&& content.as_bytes()[action_range.start + 1] == b'{'
{
action_range.start + 2
} else {
action_range.start + 1
};

Comment on lines +31 to +49
pub fn position_to_offset(content: &str, pos: Position) -> usize {
let mut line = 0;
let mut character = 0;
let mut byte_offset = 0;

for c in content.chars() {
if line == pos.line && character >= pos.character {
break;
}
byte_offset += c.len_utf8();

if c == '\n' {
line += 1;
character = 0;
} else {
character += c.len_utf16() as u32;
}
}
byte_offset

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In position_to_offset, if pos.character is out of bounds for the requested pos.line, the loop will continue iterating through subsequent lines until it reaches the end of the file. This can cause the function to return an incorrect offset far beyond the target line.

To prevent this, we should break early if line > pos.line.

pub fn position_to_offset(content: &str, pos: Position) -> usize {
    let mut line = 0;
    let mut character = 0;
    let mut byte_offset = 0;

    for c in content.chars() {
        if line > pos.line {
            break;
        }
        if line == pos.line && character >= pos.character {
            break;
        }
        byte_offset += c.len_utf8();

        if c == '\n' {
            line += 1;
            character = 0;
        } else {
            character += c.len_utf16() as u32;
        }
    }
    byte_offset
}

Comment on lines +166 to +186
function findRustyLrRoot(startPath) {
if (!startPath) {
return undefined;
}

let current = fs.statSync(startPath).isDirectory() ? startPath : path.dirname(startPath);
while (true) {
if (
fs.existsSync(path.join(current, "Cargo.toml")) &&
fs.existsSync(path.join(current, "rusty_lr_lsp", "Cargo.toml"))
) {
return current;
}

const parent = path.dirname(current);
if (parent === current) {
return undefined;
}
current = parent;
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In findRustyLrRoot, fs.statSync(startPath) can throw an exception (e.g., ENOENT or permission errors) if the path does not exist or is inaccessible. This is especially common in virtual or remote workspaces in VS Code. An unhandled exception here will prevent the extension from activating properly.

We should wrap fs.statSync in a try-catch block to handle this gracefully.

function findRustyLrRoot(startPath) {
  if (!startPath) {
    return undefined;
  }

  let current;
  try {
    current = fs.statSync(startPath).isDirectory() ? startPath : path.dirname(startPath);
  } catch (err) {
    return undefined;
  }
  while (true) {
    if (
      fs.existsSync(path.join(current, "Cargo.toml")) &&
      fs.existsSync(path.join(current, "rusty_lr_lsp", "Cargo.toml"))
    ) {
      return current;
    }

    const parent = path.dirname(current);
    if (parent === current) {
      return undefined;
    }
    current = parent;
  }
}

@ehwan ehwan merged commit e825e03 into main Jun 23, 2026
1 check passed
@ehwan ehwan deleted the lsp branch June 23, 2026 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant