Lexer outputs confused remainder when the terminal NUMBER prohibits trailing dot #195

@Yosshi999

version: v0.4.12

According to the JSON specification (RFC 8259), a number must not end with a trailing dot: the fraction part requires at least one digit after the decimal point.
However, when I define a JSON grammar like the one below, the lexer behaves suspiciously.

// RFC 8259 without complex STRING definition
?start: value

?value: object
| array
| STRING
| NUMBER
| "true"             -> true
| "false"            -> false
| "null"             -> null

object: "{" [member ("," member)*] "}"
member: STRING ":" value
array : "[" [value ("," value)*] "]"

NUMBER: MINUS? INT FRAC? EXP?
MINUS: "-"
INT: "0" | ("1".."9") DIGIT*
DIGIT: "0".."9"
FRAC: "." DIGIT+
EXP: ("e"|"E") ["+"|"-"] DIGIT+

STRING: /\"[^"]*\"/
WS: /[ \t\f\r\n]/+

%ignore WS
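
For reference, the NUMBER terminal above can be cross-checked against an equivalent regular expression. This is only a sketch using Python's standard `re` module; `NUMBER_RE` and `is_json_number` are names made up for illustration and are not part of the library under discussion.

```python
import re

# Regex transcription of: NUMBER: MINUS? INT FRAC? EXP?
# (illustrative only; not taken from the library's source)
NUMBER_RE = re.compile(
    r"-?"                  # MINUS?
    r"(?:0|[1-9][0-9]*)"   # INT: "0" | ("1".."9") DIGIT*
    r"(?:\.[0-9]+)?"       # FRAC?: "." DIGIT+
    r"(?:[eE][+-]?[0-9]+)?"  # EXP?: ("e"|"E") ["+"|"-"] DIGIT+
)


def is_json_number(text: str) -> bool:
    """Return True if the whole string is a complete NUMBER token."""
    return NUMBER_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["10", "10.", "10.0", "-0.5e+3", "01"]:
        print(repr(sample), is_json_number(sample))
```

With this definition `10` and `10.0` are complete numbers, while `10.` is not a complete number on its own, although it is still a valid prefix of one.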

The observed behavior:

>>> grammar_engine._parse_partial_code(0, '{ "cap": 10.0', b'', accepted_generation=True)
(remainder : b'10.0', remainder_state: RemainderState.MAYBE_COMPLETE, accept_sequences: {accept_terminals: ['NUMBER', 'COMMA'], accept_terminals: ['NUMBER', 'WS', 'COMMA'], accept_terminals: ['LBRACE'], accept_terminals: ['WS'], accept_terminals: ['NULL'], accept_terminals: ['STRING'], accept_terminals: ['NUMBER', 'WS', 'RBRACE'], accept_terminals: ['NUMBER', 'RBRACE'], accept_terminals: ['TRUE'], accept_terminals: ['FALSE'], accept_terminals: ['LSQB']}, next_ac_indents: None, False)

# ↑ This looks correct.

>>> grammar_engine._parse_partial_code(0, '{ "cap": 10.', b'', accepted_generation=True)
(remainder : b'.', remainder_state: RemainderState.INCOMPLETE, accept_sequences: {accept_terminals: ['COMMA'], accept_terminals: ['WS'], accept_terminals: ['RBRACE']}, next_ac_indents: None, False)

# ↑ Shouldn't this remainder be b'10.'?

It seems that the lexer gets confused when the state for NUMBER moves from an accepting state (digits) to a live but non-accepting state (trailing dot) and then back to an accepting state (digits).
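
To make the state sequence concrete, here is a small hand-written DFA for NUMBER that labels every prefix as accepted, live, or dead. It is a sketch written for this report, not the library's implementation. All names (`step`, `classify`, `trace`, `remainder_commit_last_accept`, `remainder_keep_live_prefix`) are hypothetical, and the "commit" strategy is only a guess at how a remainder of `.` could arise.

```python
DIGITS = "0123456789"
ACCEPTING = {"zero", "int", "frac", "exp"}


def step(state, ch):
    """Return the next DFA state for NUMBER, or None if no transition exists."""
    if state in ("start", "minus"):
        if ch == "-" and state == "start":
            return "minus"
        if ch == "0":
            return "zero"
        if ch in "123456789":
            return "int"
    elif state in ("zero", "int"):
        if ch in DIGITS and state == "int":
            return "int"
        if ch == ".":
            return "dot"      # live: FRAC needs at least one digit
        if ch in "eE":
            return "e"
    elif state in ("dot", "frac"):
        if ch in DIGITS:
            return "frac"
        if ch in "eE" and state == "frac":
            return "e"
    elif state == "e":
        if ch in "+-":
            return "sign"
        if ch in DIGITS:
            return "exp"
    elif state in ("sign", "exp"):
        if ch in DIGITS:
            return "exp"
    return None


def classify(text):
    """Label text as 'accepted', 'live' (valid prefix, not complete), or 'dead'."""
    state = "start"
    for ch in text:
        state = step(state, ch)
        if state is None:
            return "dead"
    return "accepted" if state in ACCEPTING else "live"


def trace(text):
    """Classification of every non-empty prefix of text."""
    return [(text[:i], classify(text[:i])) for i in range(1, len(text) + 1)]


def remainder_commit_last_accept(text):
    """Hypothetical strategy: if text is not accepted, commit the longest
    accepted prefix as a token and keep only what follows it."""
    if classify(text) == "accepted":
        return text
    last = max((i for i in range(1, len(text) + 1)
                if classify(text[:i]) == "accepted"), default=0)
    return text[last:]


def remainder_keep_live_prefix(text):
    """Expected strategy: keep the whole text while it can still become a NUMBER."""
    if classify(text) != "dead":
        return text
    return remainder_commit_last_accept(text)


if __name__ == "__main__":
    print(trace("10.0"))
    print(remainder_commit_last_accept("10."), remainder_keep_live_prefix("10."))
```

Under these assumptions, the prefixes of `10.0` go accepted, accepted, live, accepted. The "commit" strategy leaves `.` as the remainder for `10.`, which resembles the output above, while keeping the live prefix leaves `10.`.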
