Language Implementation Patterns Reading Notes
LL (k) parser
The k in LL(k) grammar stands for lookahead, a few characters ahead. In LL(1), you can select the corresponding parsing expression by looking ahead to a lexical unit, such as the if statement. If the next lexical unit is if, you need to select the expression of the if statement for parsing. To implement the LL(k) parser, cache k lexical units (tokens) during parsing. The implementation can use k-length arrays, but access in a circular manner, access to the last lexical unit, fetch the next, return to the first element of the array. The pseudocode is as follows:
private Token[] lookahead = new Token[k];
private Lexer lexer;
int pos = 0;
public void consume(a) {
lookahead[p] = lexer.nextToken(); // Put the lexical unit in the next place
pos = (pos + 1) % k; // When the end of the cache is reached, increment starts from 0
}
// Returns the lexical unit looking ahead to I
public Token LT(int i) {
return lookahead[(pos + i - 1) % k]; // Ring values
}
// Returns the lexical unit type for forward-looking I
public int LA(int i) {
return LT(i).type;
}
public void match(int x) {
if (LA(1) == x) {
consume();
}
throw new Exception();
}
Copy the code
Backtracking parser
The traceback Parser uses any number of forward symbols to parse, so the cache in the Parser needs to be dynamically expanded, and the starting position of the parse lexical unit must be marked. The pseudocode is as follows:
private List<Token> lookahead; // Variable size cache
private List<Integer> markers; // stack to store the mark of the record location
private Lexer lexer;
int pos = 0;
// Returns the lexical unit looking ahead to I
public Token LT(int i) {
return lookahead[(pos + i - 1) % k]; // Ring values
}
// Returns the lexical unit type for forward-looking I
public int LA(int i) {
return LT(i).type;
}
public void match(int x) {
if (LA(1) == x) {
consume();
}
throw new Exception();
}
public void consume(a) {
pos++;
// The state is not inferred and the end of the buffer has been reached
if(pos == lookahead.size() && ! isSpeculating()) { pos =0; // At the end, fill in the new lexical unit starting at 0
lookahead.clear();
}
sync(1); // Take a new lexical unit
}
public void sync(int i) {
if (pos + i - 1 > (lookahead.size() - 1)) {
int n = (pos + i - 1) - (lookahead.size() - 1); // Get n lexical unitsfill(n); }}public void fill(int n) {
for (int i = 1; i <= n; i++) { lookahead.add(lexer.nextToken()); }}// Tag related
public int mark(a) {
markers.add(pos);
return pos;
}
public void release(a) {
int marker = markers.get(markers.size() - 1);
markers.remove(markers.size() - 1);
seek(marker);
}
public void seek(int index) {
pos = index;
}
public bookean isSpeculating(a) {
return markers.size() > 0;
}
/ / parsing
public void stat(a) {
if(speculate_stat_alt1()) { list(); match(Lexer.EOF); }}// Try to match option 1
public boolean speculate_stat_alt1(a) {
boolean success = true;
mark(); // Mark the current location
try {
list();
match(Lexer.EOF);
} catch (RecognitionExceptin e) {
success = false;
}
release(); // Put the input back regardless of whether it matches
return success;
}
Copy the code