The bpeg Grammar
The bpeg grammar is modelled after Python’s own parser grammar as per PEP 617.
It provides indentation-based rules, an EBNF-like expression grammar,
and efficient literal declarations.
Top-level Rules
comment: '#' :: NEW_LINE
A line comment, discarding the entire line starting at the # symbol.
define: name ':' NEW_LINE INDENT rule+
A named collection of ordered rules. If more than one rule matches, the uppermost matching rule is preferred.
rule: '|' e [ '{' action '}' ] NEW_LINE
A rule to match any input matching the expression e. The optional action defines how to translate the matched input.
Compound Expressions
sequence: e1 e2
Ordered sequence of clauses, matching e1 followed by e2.
"return" expression
choice: e1 | e2
Ordered choice, matching either e1 or e2. If both match, e1 is preferred.
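The ordering of sequence and choice can be illustrated with a few lines of Python. This is a minimal sketch and not how bpeg is implemented: the matcher convention (a function returning the end position or None) and the helper names literal, sequence and choice are assumptions made for this example only.

```python
from typing import Callable, Optional

# Hypothetical convention: a matcher returns the end position of a match,
# or None if the expression does not match at the given position.
Matcher = Callable[[str, int], Optional[int]]

def literal(value: str) -> Matcher:
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(value) if text.startswith(value, pos) else None
    return match

def sequence(*parts: Matcher) -> Matcher:
    # e1 e2: every part must match, each starting where the previous one ended
    def match(text: str, pos: int) -> Optional[int]:
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*alternatives: Matcher) -> Matcher:
    # e1 | e2: alternatives are tried in order, the first match wins
    def match(text: str, pos: int) -> Optional[int]:
        for alternative in alternatives:
            end = alternative(text, pos)
            if end is not None:
                return end
        return None
    return match

# "a" | "ab": both alternatives match the input "ab", but "a" is preferred
first_wins = choice(literal("a"), literal("ab"))
print(first_wins("ab", 0))  # prints 1
```

Because the choice is ordered, swapping the alternatives to "ab" | "a" would consume both characters instead.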
group: ( e )
Match e. Useful to enforce precedence.
expr (',' expr)*
option: [ e ]
Match e or nothing. Always succeeds, may be zero width.
[ "async" ] "def"
not: !e
Match if e does not match. Matches with zero width.
! NEW_LINE
and: &e
Match if e does match, but with zero width. This is an optimised form of !!e.
begin (& ':') colon_block
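The relation between the two zero-width predicates can be checked with a small Python sketch. It is only an illustration under assumed conventions, not bpeg's implementation: matchers are plain functions returning an end position or None, and the names literal, not_ and and_ are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical convention: a matcher returns the end position of a match,
# or None if the expression does not match at the given position.
Matcher = Callable[[str, int], Optional[int]]

def literal(value: str) -> Matcher:
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(value) if text.startswith(value, pos) else None
    return match

def not_(e: Matcher) -> Matcher:
    # !e: succeed without consuming input if e fails
    def match(text: str, pos: int) -> Optional[int]:
        return pos if e(text, pos) is None else None
    return match

def and_(e: Matcher) -> Matcher:
    # &e: succeed without consuming input if e matches
    def match(text: str, pos: int) -> Optional[int]:
        return pos if e(text, pos) is not None else None
    return match

colon = literal(":")
for text in (":", "x", ""):
    # &e and !!e agree on every input; &e merely avoids the nested negation
    assert and_(colon)(text, 0) == not_(not_(colon))(text, 0)
```

Neither predicate advances the position, so whatever follows in a sequence sees the same input again.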
repeat: e+
Match e once or several times.
':' statement+
any: e*
Match e zero or several times. Equivalent to [ e+ ].
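The equivalence of e* and [ e+ ] can be sketched in Python by building any directly from option and repeat. The matcher convention and the names literal, repeat, option and any_ are assumptions for this example, not part of bpeg.

```python
from typing import Callable, Optional

# Hypothetical convention: a matcher returns the end position of a match,
# or None if the expression does not match at the given position.
Matcher = Callable[[str, int], Optional[int]]

def literal(value: str) -> Matcher:
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(value) if text.startswith(value, pos) else None
    return match

def repeat(e: Matcher) -> Matcher:
    # e+: e must match at least once, then as often as possible
    def match(text: str, pos: int) -> Optional[int]:
        end = e(text, pos)
        if end is None:
            return None
        while True:
            following = e(text, end)
            # stop on failure, and on zero-width matches to avoid looping forever
            if following is None or following == end:
                return end
            end = following
    return match

def option(e: Matcher) -> Matcher:
    # [ e ]: match e if possible, otherwise match nothing
    def match(text: str, pos: int) -> Optional[int]:
        end = e(text, pos)
        return pos if end is None else end
    return match

def any_(e: Matcher) -> Matcher:
    # e*: expressed literally as [ e+ ]
    return option(repeat(e))

ab = literal("ab")
print(any_(ab)("ababx", 0))   # prints 4
print(any_(ab)("x", 0))       # prints 0
print(repeat(ab)("x", 0))     # prints None
```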
commit: ~ e
Match e or record a failure. Always succeeds, may be zero width: if e does not match, the failure is recorded and parsing proceeds “as if” e had matched. Useful for accurate failure reports.
# fail on empty and mismatched parentheses
'(' ~ expr ')' | expr
A commit covers the rest of its sequence but does not extend across a choice: ~e1 e2 | e3 is equivalent to (~e1 ~e2) | e3.
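The "record the failure, then carry on" behaviour can be pictured with a small Python sketch. It follows the description above only loosely and is not bpeg's implementation: the failures list, the matcher signature and the names literal, sequence, choice, commit and parse are all assumptions, and expr is a stand-in that accepts just the digits 1 and 2.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical convention: a matcher returns the end position or None and
# may append (position, description) pairs to a shared list of failures.
Failures = List[Tuple[int, str]]
Matcher = Callable[[str, int, Failures], Optional[int]]

def literal(value: str) -> Matcher:
    def match(text: str, pos: int, failures: Failures) -> Optional[int]:
        return pos + len(value) if text.startswith(value, pos) else None
    return match

def sequence(*parts: Matcher) -> Matcher:
    def match(text: str, pos: int, failures: Failures) -> Optional[int]:
        for part in parts:
            pos = part(text, pos, failures)
            if pos is None:
                return None
        return pos
    return match

def choice(*alternatives: Matcher) -> Matcher:
    def match(text: str, pos: int, failures: Failures) -> Optional[int]:
        for alternative in alternatives:
            end = alternative(text, pos, failures)
            if end is not None:
                return end
        return None
    return match

def commit(e: Matcher, description: str) -> Matcher:
    # ~e: on a mismatch, record the failure and continue with zero width
    def match(text: str, pos: int, failures: Failures) -> Optional[int]:
        end = e(text, pos, failures)
        if end is None:
            failures.append((pos, description))
            return pos
        return end
    return match

expr = choice(literal("1"), literal("2"))  # stand-in for a real expression rule

# '(' ~ expr ')' | expr, written out as ('(' ~expr ~')') | expr
parens = choice(
    sequence(literal("("), commit(expr, "expr"), commit(literal(")"), "')'")),
    expr,
)

def parse(text: str) -> Tuple[Optional[int], Failures]:
    failures: Failures = []
    return parens(text, 0, failures), failures

print(parse("(1)"))  # prints (3, [])
print(parse("(1"))   # prints (2, [(2, "')'")])
print(parse("()"))   # prints (2, [(1, 'expr')])
```

Without the commit, both "(1" and "()" would simply fail to match, leaving no hint of where the parentheses went wrong.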
capture: name=e
Capture the result of matching e with a given name for use in a rule action.
Special Terminals
nothing: '' or ""
Zero-length literal, always matches at any position. Can be used to construct option and any rules, though the dedicated [ e ] and e* forms should be preferred for readability.
anything: .
Match any input of width one. May lead to excessive matches; prefer range or delimited literals.
Literal Terminals
literal: " :: " or ' :: '
Match any input exactly equal to the literal.
"def"
range: literal1 - literal2
Match any input greater than or equal to literal1 and less than or equal to literal2.
"a" - "z"
delimited: literal1 :: literal2
Match literal1 followed by literal2 with arbitrary matches in between. More efficient version of literal1 ( !literal2 . )* literal2.
literal:
    | '"' :: '"'
    | "'" :: "'"
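Why a dedicated delimited form can be cheaper than scanning one input item at a time is easy to see in a Python sketch. Both functions below are illustrations under assumed conventions (matchers return an end position or None; the names delimited and delimited_stepwise are hypothetical) and say nothing about how bpeg implements the rule.

```python
from typing import Callable, Optional

# Hypothetical convention: a matcher returns the end position of a match,
# or None if the expression does not match at the given position.
Matcher = Callable[[str, int], Optional[int]]

def delimited(start: str, stop: str) -> Matcher:
    # literal1 :: literal2, using a single substring search for the stop literal
    def match(text: str, pos: int) -> Optional[int]:
        if not text.startswith(start, pos):
            return None
        close = text.find(stop, pos + len(start))
        return None if close == -1 else close + len(stop)
    return match

def delimited_stepwise(start: str, stop: str) -> Matcher:
    # Same result, but checking "not the stop literal, then anything"
    # at every position, as the step-by-step expression would
    def match(text: str, pos: int) -> Optional[int]:
        if not text.startswith(start, pos):
            return None
        pos += len(start)
        while pos < len(text) and not text.startswith(stop, pos):
            pos += 1
        return pos + len(stop) if text.startswith(stop, pos) else None
    return match

source = '"hello" rest'
print(delimited('"', '"')(source, 0))           # prints 7
print(delimited_stepwise('"', '"')(source, 0))  # prints 7
print(delimited('"', '"')('"unterminated', 0))  # prints None
```

Both variants stop at the first occurrence of the closing literal, so the match is never longer than necessary.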