Lexer
A lexer converts an input stream of characters into a sequence of tokens, recording the type, value and source location of each token.
Tokens can then be used by a parser to build a program tree. If there is an error (e.g. an unexpected/invalid sequence of tokens), the location information can be used to formulate a useful error message.
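The idea can be sketched with a small regular-expression lexer. This is a minimal illustration written with Python's standard `re` module, not the implementation behind `pl0_lexer.py`; the `Token` type, the `TOKEN_SPEC` table, and the `tokenize` function are hypothetical names, and only a small subset of PL/0 is covered.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str      # token category, e.g. NAME or NUMBER
    value: object  # matched text (converted to int for numbers)
    lineno: int    # 1-based line number
    pos: int       # 0-based character offset from the start of the input


# Hypothetical token table for a small PL/0 subset (an assumption for this sketch).
KEYWORDS = {"VAR", "BEGIN", "END"}
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("UPDATE", r":="),
    ("TIMES", r"\*"),
    ("COMMA", r","),
    ("EOS", r";"),
    ("PRINT", r"!"),
    ("DOT", r"\."),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("ERROR", r"."),  # any other character is invalid
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens with their location; raise SyntaxError on an invalid character."""
    lineno = 1
    for match in MASTER.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "NEWLINE":
            lineno += 1
        elif kind == "SKIP":
            continue
        elif kind == "ERROR":
            # The recorded location is what makes the message useful.
            raise SyntaxError(
                f"line {lineno}, offset {match.start()}: unexpected character {value!r}"
            )
        else:
            if kind == "NAME" and value in KEYWORDS:
                kind = value  # keywords get their own token type
            elif kind == "NUMBER":
                value = int(value)
            yield Token(kind, value, lineno, match.start())
```

Calling `list(tokenize(source))` gives a parser everything it needs: the token types drive the grammar, and the line number and offset are available when something goes wrong.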
Example
We can load up any valid PL/0 program and inspect the sequence of tokens produced.
Here is the multiply program:
VAR x, y, z;

BEGIN
    x := 10;
    y := 20;

    z := x * y;

    ! z
END.
Here is the token stream the lexer generates for this program:
$ ./pl0_lexer.py < examples/multiply.pl1
LexToken(VAR,'VAR',2,1)
LexToken(NAME,'x',2,5)
LexToken(COMMA,',',2,6)
LexToken(NAME,'y',2,8)
LexToken(COMMA,',',2,9)
LexToken(NAME,'z',2,11)
LexToken(EOS,';',2,12)
LexToken(BEGIN,'BEGIN',4,15)
LexToken(NAME,'x',5,22)
LexToken(UPDATE,':=',5,24)
LexToken(NUMBER,10,5,27)
LexToken(EOS,';',5,29)
LexToken(NAME,'y',6,32)
LexToken(UPDATE,':=',6,34)
LexToken(NUMBER,20,6,37)
LexToken(EOS,';',6,39)
LexToken(NAME,'z',8,44)
LexToken(UPDATE,':=',8,46)
LexToken(NAME,'x',8,49)
LexToken(TIMES,'*',8,51)
LexToken(NAME,'y',8,53)
LexToken(EOS,';',8,54)
LexToken(PRINT,'!',10,59)
LexToken(NAME,'z',10,61)
LexToken(END,'END',11,63)
LexToken(DOT,'.',11,66)
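Each token above is printed with four fields, which appear to be its type, its value, a line number, and a position. The last field keeps growing across lines (it reaches 66 for the final '.' on line 11), so it looks like a character offset from the start of the input rather than a column. Under that assumption, an error message that wants a column number has to derive it from the source text. The helper below is a hypothetical sketch of that conversion; `column` is not a function of the project.

```python
def column(text: str, pos: int) -> int:
    """Return the 1-based column of the character at 0-based offset `pos` in `text`."""
    # rfind returns -1 when there is no earlier newline, so the first line starts at 0.
    line_start = text.rfind("\n", 0, pos) + 1
    return pos - line_start + 1
```

Combined with the line number stored in the token, this is enough to report an error as "line L, column C" instead of a raw offset.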