Choices and Precedence

Every bootpeg grammar is unambiguous since the underlying PEG formalism enforces unambiguous choices: the first choice matched always wins. On the one hand, this is useful since it avoids distinct matches “magically” invalidating each other. On the other hand, it means that ordering in the grammar must be chosen with care to achieve the desired language coverage and precedence.

Ordered vs. unordered Choices

An obvious demonstration for PEG’s ordered choice are between two terminals where one is a subsequence of the other:

("a" | "ab") "c"

With PEG, the second choice branch can never be reached: Parsing the input abc matches with the first choice and consumes a, then fail after the choice c is expected but b was not consumed. The key is that later branches in a choice are only tried if earlier branches fail; if a failure happens after the choice, there is no backtracking into the choice.

In contrast, other matching schemes such as RegEx will try every possible combination of choices until one succeeds:

>>> import re
>>> head, tail = r"a|ab", r"c"
>>> rule = re.compile(rf"^({head}){tail}$")
>>> rule.match("abc") is not None
True

While ordered choices might seem impractical in such simple cases, they simplify complex grammars and especially left-recursion.

Order and left-recursion

Any left-recursive rule may be matched at the same position more than once. This has a powerful implication for ordered choice: “during” matching the first branch its match has neither failed nor succeeded, so recursion will try trailing branches. Whenever ordered choice recurses into itself, several of its branches are matched; the choice order determines how they are nested into each other.

A simple demonstration uses recursion and an ordered choice to match a sequence:

# during matching `as as` we can match `"a"` at the same position
as: as as | "a"

This rule has a higher precedence for “multiple as” over “one "a"”. Due to left-recursion, matching the first branch as as uses the second branch "a" for the first clause as but the full rule as as | "a" for the second clause as.

Notably, recursion is what makes the choice match several branches. When swapping the branches to "a" | as as, either the first branch will match successfully and skip the second branch, or the first branch will not match and thus make it impossible for the second branch to match as well.

Note

With recursive-choices, use terminals in early branches to prevent nesting but use terminals in later branches to enable nesting.

Precedence via ordered recursion

Properly ordering choices not only decides what input is matched, but also how it is matched – in specific, choice order determines the precedence of its clauses.

A common example is precedence of mathematical operations. For example, one can express the precedence of multiplication over addition as “addition clauses may contain multiplication clauses”:

top:
    | addition
addition:
    | lhr=addition ' '* '+' ~ ' '* rhs=multiplication { add(lhs, rhs) }
    | multiplication
multiplication:
    | lhr=multiplication ' '* '*' ~ ' '* rhs=power { mul(lhs, rhs) }
    | power
power:
    | lhs=primitive ' '* '^' ~ ' '* rhs=power { mul(lhs, rhs) }
    | primitive
primitive:
    | '(' ~ expr=top ')' { expr }
    | literal=("1"-"9" "0"-"9"*) { int(literal) }

This structure is called “precedence climbing”: Every clause may contain a clause of the next precedence level and “climb up” at the current position. Notably, the top clause starts at the lowest precedence which then works its way up to the highest precedence; the jump back down from the highest to lowest precedence is only possible after advancing the position (by a literal match of an opening ( or a number).

In addition, the grammar expresses associativity via recursion: left-associative rules such as addition use left-recursion, whereas right-associative rules such as power use right-recursion. [1] This allows the rule to expand in the respective direction.

Note

Use choice ordering for precedence, and left-/right-recursion for left-/right-associativity.