written 8.4 years ago by
teamques10
★ 68k
|
•
modified 8.4 years ago
|
- Bottom-up parsers for a large class of context-free grammars can be easily developed using operator grammars.
- Operator grammars have the property that no production right side is empty or has two adjacent nonterminal.
- This property enables the implementation of efficient operator-precedence parsers.
- These parser rely on the following three precedence relations:
Relation |
Meaning |
a <· b |
a yields precedence to b |
a =·b |
a has the same precedence as b |
a·>b |
a takes precedence over b |
- These operator precedence relations allow to delimit the handles in the right sentential forms: $\lt \cdot $ marks the left end, $=\cdot$ appears in the interior of the handle, and $\cdot\gt$ marks the right end.
- Let assume that between the symbols $ai$ and $ai+1$ there is exactly one precedence relation. Suppose that \$ is the end of the string.
- Then for all terminals we can write: $ \lt \cdot b \ and \ b \cdot \gt $. If we remove all nonterminal and place the correct precedence relation: $\lt \cdot, =\cdot, \cdot\gt$ between the remaining terminals, there remain strings that can be analyzed by easily developed parser.
- For example, the following operator precedence relations can be introduced for simple expressions:
|
id |
+ |
* |
\$ |
id |
|
$ \cdot \gt$ |
$ \cdot \gt$ |
$ \cdot \gt$ |
+ |
$ \lt \cdot $ |
$ \cdot \gt$ |
$ \lt \cdot $ |
$ \cdot \gt$ |
* |
$ \lt \cdot $ |
$ \cdot \gt$ |
$ \cdot \gt$ |
$ \cdot \gt$ |
\$ |
$ \lt \cdot $ |
$ \lt \cdot $ |
$ \lt \cdot $ |
$ \cdot \gt$ |
Stack |
Input |
a |
b |
Action |
\$ |
id + id \$ |
\$ |
id |
Shift id |
\$id |
+id \$ |
id |
+ |
Reduce E -> id |
\$ E |
+id \$ |
\$ |
+ |
Shift + |
\$E+ |
id\$ |
+ |
id |
Shift id |
\$ E+id |
\$ |
id |
\$ |
Reduce E -> id |
\$ E + E |
\$ |
+ |
\$ |
Reduce E -> E+E |
\$E |
\$ |
\$ |
\$ |
Accept |
Example: The input string:
$\text{id1 + id2 * id3} $
After inserting precedence relations becomes
$\text{\lt $\cdot$ id1 $\cdot$ \gt + \lt $\cdot$ id2 $\cdot$\gt * \lt $\cdot$ id3 $\cdot$ \gt}$
- Having precedence relations allows to identify handles as follows:
- Scan the string from left until seeing •>
- Scan backwards the string from right to left until seeing <•
- Everything between the two relations <• and •> forms the handle.
Note that not the entire sentential form is scanned to find the handle.
Operator Precedence Parsing Algorithm
Initialize: Set ip to point to the first symbol of w \$
Repeat: Let X be the top stack symbol, and a the symbol pointed to by ip
if \$ is on the top of the stack and ip points to \$ then return
else
/* Let a be the top terminal on the stack, and b the symbol pointed to by ip*/
If a< $\cdot$ b or a =• b then
push b onto the stack
advance ip to the next input symbol
else if a •>bthen
repeat
pop the stack
until the top stack terminal is related by <•
to the terminal most recently popped
else error()
end
Making Operator Precedence Relations
- The operator precedence parsers usually do not store the precedence table with the relations, rather they are implemented in a special way.
- Operator precedence parsers use precedence functions that map terminal symbols to integers, and so the precedence relations between the symbols are implemented by numerical comparison.
- Not every table of precedence relations has precedence functions but in practice for most grammars such functions can be designed.
Algorithm for Constructing Precedence Functions
- Create functions $f_a$ for each grammar terminal a and for the end of string symbol;
- Partition the symbols in groups so that $f_a$ and $g_b$ are in the same group if a =• b (there can be symbols in the same group even if they are not connected by this relation);
- Create a directed graph whose nodes are in the groups, next for each symbols a and b do: place an edge from the group of $g_b$ to the group of fa if a <• b, otherwise if a •>b place an edge from the group of $f_a$ to that of $g_b$;
- If the constructed graph has a cycle then no precedence functions exist. When there are no cycles collect the length of the longest paths from the groups of fa and $g_b$ respectively.
Example: Consider the above table
|
id |
+ |
* |
id |
|
$ \cdot \gt$ |
$ \cdot \gt$ |
+ |
$ \lt \cdot $ |
$ \cdot \gt$ |
$ \lt \cdot $ |
* |
$ \lt \cdot $ |
$ \cdot \gt$ |
$ \cdot \gt$ |
\$ |
$ \lt \cdot $ |
$ \lt \cdot $ |
$ \lt \cdot $ |
Using the algorithm leads to the following graph:
From which we extract the following precedence functions:
|
id |
+ |
* |
\$ |
f |
4 |
2 |
4 |
0 |
g |
5 |
1 |
3 |
0 |
Error Recovery in operator-precedence parsing:
There are two points in the parsing process at which an operator-precedence parser can discover syntactic error:
- If no precedence relation holds between the terminal on top of the stack and the current input.
- If a handle has been found, but there is no production with this handle as a right side.
The error checker does the following errors:
- Missing operand
- Missing operator
- No expression between parenthesis
- Illegal b on line (line containing b)
- Missing d on line (line containing c)
- Missing E on line (line containing b)
These error diagnostic issued at handling of errors during reduction.
During handling of shift/reduce errors; the diagnostic’s issues are:
- Missing operand
- Unbalanced right parenthesis
- Missing right parenthesis
- Missing operator
Advantages:
- Simplicity, easy to construct by hand
Disadvantages:
- It is hard to handle tokens like the unary operators
- Since the relationship between a grammar for the language being parsed and the operator precedence parser itself is tenuous, one cannot always be sure the parser accepts exactly the desired language.
- Only a small class of grammars can be parsed using operator-precedence techniques.