Quick Guide
Using bootpeg to create a parser is done in three steps: Define a Grammar to match input, provide some Actions to transform matches, and tell bootpeg to use both to create the parser.
This guide shows the parts needed to create a parser for basic scientific number notation.
It can parse numbers written as integers, such as 10 or -16, or in scientific notation, such as 1e1 or -160E-1.
Grammar and Actions
The Grammar is a textual definition of how to match input. [1] You can write a grammar in one of many meta-grammars, though we suggest the bootpeg meta-grammar by default. A grammar consists of rules that decide which input is valid and how it is to be interpreted:
# bootpeg uses # for line comments
# the first rule must match the entire input, usually via other rules
top:
    | scientific | integer
# a named rule – can be referred to in other rules by name
integer:
    | literal=([ "-" ] "1"-"9" "0"-"9"*) { Integer(literal) }
#     ^        ^       ^                 ^ interpret match as Integer
#     |        |       \ match a digit between 1 and 9 followed by arbitrarily many digits
#     |        \ match an optional leading sign
#     \ capture (parts of) match to transform them
scientific:
    | base=integer ("E" | "e") exponent=integer { base * (10 ** exponent) }
#     ^            ^ either E or e may be used for the exponent
#     \ capture part of the match to interpret it separately
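To get a feel for which inputs these rules accept, the following sketch restates them as Python regular expressions. It is only an approximation for illustration and says nothing about how bootpeg works internally; the names INTEGER, SCIENTIFIC and accepts are made up for this example.

```python
import re

# Hypothetical regex equivalents of the grammar rules (illustration only).
INTEGER = r"-?[1-9][0-9]*"               # optional sign, a digit 1-9, then any digits
SCIENTIFIC = rf"{INTEGER}[Ee]{INTEGER}"  # base, then E or e, then exponent


def accepts(text: str) -> bool:
    """Return True if the entire text is a scientific or an integer number."""
    return re.fullmatch(rf"{SCIENTIFIC}|{INTEGER}", text) is not None


for sample in ("10", "-16", "1e1", "-160E-1", "0", "1.5"):
    print(sample, accepts(sample))
```

Read this way, the rules reject 0 (the first digit must be between 1 and 9) and 1.5 (there is no rule for fractions).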
The Actions are callables and constants needed to transform input.
You can define actions freely using any code you like; bootpeg merely expects them in a mapping from each of the grammar’s names to the respective action:
>>> # map from names used in the grammar to callables
>>> example_actions = {
...     # use the builtin `int` to interpret any matched Integer
...     'Integer': int
... }
The Grammar uses captures like base=integer to select part of the input.
The transformations like { Integer(literal) } define how captured values are passed to Actions.
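Conceptually, a transformation reads like an expression evaluated with the captures and actions as names. The plain-Python sketch below mimics this by hand for the input -160E-1. It does not use bootpeg at all, and the variables are illustrative stand-ins for what a parser would capture.

```python
# Actions as in the guide: the grammar name Integer maps to the builtin int.
example_actions = {'Integer': int}
Integer = example_actions['Integer']

# Stand-ins for the text covered by the two `integer` sub-matches of "-160E-1".
base = Integer("-160")    # { Integer(literal) } applied to the base
exponent = Integer("-1")  # ... and to the exponent

# The transformation of the `scientific` rule, written out by hand.
result = base * (10 ** exponent)
print(result)
```

Note that Python's ** operator yields a float for a negative integer exponent, so the result here is a float rather than an int.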
Boot the PEG
There are two convenient ways to create a parser from the grammar and actions: use bootpeg.create_parser to load the grammar from a string, or use bootpeg.import_parser to load the grammar from a packaged file.
Generally, you should work with a string for interactive development, and a packaged file in any other case. In addition to the grammar, both functions also take the actions of your new parser, and the bootpeg dialect of the grammar.
Simply pass in the grammar, the actions defined, and the dialect, in this case bootpeg.grammars.bpeg for the bootpeg meta-grammar.
>>> from bootpeg import create_parser
>>> from bootpeg.grammars import bpeg
>>> example_grammar = """\
... top:
...     | scientific | integer
... integer:
...     | [ "-" ] "1" - "9" ("0" - "9")* { Integer(.*) }
... scientific:
...     | base=integer ("E" | "e") exponent=integer { .base * (10 ** .exponent) }
... """
>>> example_actions = {'Integer': int}
>>> parse = create_parser(example_grammar, bpeg, example_actions)
>>> parse("12")
12
>>> parse("12E6")
12000000
Where to next?
As bootpeg uses the PEG formalism, grammars are order-dependent.
In the example, swapping | scientific | integer for | integer | scientific would not work well: matching 12E6 would just match 12 as an integer and be done.
See Choices and Precedence on how to use ordering to your advantage.
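The effect of ordering can be sketched without bootpeg. The snippet below uses regular expressions as hypothetical stand-ins for the two rules and a small helper, ordered_choice, that tries alternatives in order and stops at the first success. This is an illustration of ordered choice in general, not bootpeg's implementation.

```python
import re

# Hypothetical regex stand-ins for the two rules (illustration only).
INTEGER = re.compile(r"-?[1-9][0-9]*")
SCIENTIFIC = re.compile(r"-?[1-9][0-9]*[Ee]-?[1-9][0-9]*")


def ordered_choice(alternatives, text):
    """Return the text matched by the first alternative that matches, else None."""
    for pattern in alternatives:
        match = pattern.match(text)
        if match is not None:
            # First success wins; later alternatives are never tried.
            return match.group()
    return None


print(ordered_choice([SCIENTIFIC, INTEGER], "12E6"))  # prints 12E6
print(ordered_choice([INTEGER, SCIENTIFIC], "12E6"))  # prints 12
```

With integer tried first, the choice succeeds on the prefix 12 and never considers scientific, which leaves E6 unconsumed.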
Separating the tasks of grammars and actions is integral to how bootpeg operates. Likewise, you should aim to create parsers in which each part plays to its strengths. In the example, the grammar recognizes numbers while the actions interpret them. See Actions and Captures on how to best match the tasks of grammars and actions.