From start
on antlr grammar for E:
Here is a still incomplete but reasonable first cut at
the E grammar, redone for Antlr
(I should probably reorder the productions for clarity). There is
a grammar and lexer for E,
and a separate grammar and lexer for the nested quasiparser language. The real
thing should have a similar nested grammar for doc comments (which are
the canonical example of nested grammars in Antlr). Mostly you want to
look at the files e.g and quasi.g. Many tree-building actions are filled in, though
some are still wrong or missing. The lexers are far from complete,
since they were not the primary point, but they should be straightforward to
extend to cover all of E. The parsers
are generated with Antlr, something like:
C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar antlr.Tool e.g
C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar antlr.Tool quasi.g
eMain.java takes my sequence of examples from example.e:
def foo {}
def bar { to baz(a :int) {^a} }
def func(a,b) {a | b | false}
!3
def z
def x := 5; y
3>4;;
{
67
x *= 6
}
[3,4,]
if(33){43};
a := 4**5 * 3;
a.fry(4, b*4);
x(45,"hello");
a[4]:=5;
33
33+43+4
;
`identest$id`
`etest${345}a`
`hello`
e`go`
and produces:
C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar eMain
[,<SeqExpr>]
[def,<ObjectExpr>]
[,<FinalPattern>]
[foo,<IDENT>]
[{,<EScript>]
[def,<ObjectExpr>]
[,<FinalPattern>]
[bar,<IDENT>]
[{,<EScript>]
[to,<EMethod>]
[baz,<IDENT>]
[,<List>]
[,<FinalPattern>]
[a,<IDENT>]
[int,<IDENT>]
[^,<ReturnExpr>]
[a,<IDENT>]
[def,<ObjectExpr>]
[,<FinalPattern>]
[func,<IDENT>]
[,<List>]
[,<FinalPattern>]
[a,<IDENT>]
[,<FinalPattern>]
[b,<IDENT>]
[,<CallExpr>]
[,<CallExpr>]
[a,<IDENT>]
|
[b,<IDENT>]
|
[false,<IDENT>]
[,<CallExpr>]
[3,<INT>]
!
[def,<DefineExpr>]
[z,<IDENT>]
[def,<DefineExpr>]
[,<FinalPattern>]
[x,<IDENT>]
[5,<INT>]
[y,<IDENT>]
[,<CallExpr>]
[3,<INT>]
>
[4,<INT>]
[,<HideExpr>]
[67,<INT>]
[,<AssignExpr>]
[x,<IDENT>]
*=
[6,<INT>]
[[,<TupleExpr>]
[3,<INT>]
[4,<INT>]
[if,<IfExpr>]
[33,<INT>]
[43,<INT>]
[:=,<AssignExpr>]
[a,<IDENT>]
[,<CallExpr>]
[,<CallExpr>]
[4,<INT>]
**
[5,<INT>]
*
[3,<INT>]
[.,<CallExpr>]
[a,<IDENT>]
[fry,<IDENT>]
[4,<INT>]
[,<CallExpr>]
[b,<IDENT>]
*
[4,<INT>]
[run,<CallExpr>]
[x,<IDENT>]
[run,<STRING>]
[45,<INT>]
["hello",<STRING>]
[:=,<AssignExpr>]
[get,<CallExpr>]
[a,<IDENT>]
[get,<STRING>]
[4,<INT>]
[5,<INT>]
[33,<INT>]
[,<CallExpr>]
[,<CallExpr>]
[33,<INT>]
+
[43,<INT>]
+
[4,<INT>]
[simple,<QuasiLiteralExpr>]
[simple,<STRING>]
[,<QuasiContent>]
[identest,<QUASIBODY>]
[id,<QIDENT>]
[simple,<QuasiLiteralExpr>]
[simple,<STRING>]
[,<QuasiContent>]
[etest,<QUASIBODY>]
[345,<INT>]
[a,<QUASIBODY>]
[simple,<QuasiLiteralExpr>]
[simple,<STRING>]
[,<QuasiContent>]
[hello,<QUASIBODY>]
[,<QuasiLiteralExpr>]
[e,<IDENT>]
[,<QuasiContent>]
[go,<QUASIBODY>]
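For readers unfamiliar with the dump format above, each line shows a node as [text,<Type>]. The following is only a minimal sketch of how such a dump could be produced, not the code in eMain.java. The Node class, the two-space indentation, and the nesting used in the usage note are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tree printer; not taken from eMain.java. */
public class TreeDumpSketch {

    /** A toy AST node holding token text, a type name, and children. */
    static final class Node {
        final String text;
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String text, String type) {
            this.text = text;
            this.type = type;
        }

        /** Appends a child and returns this node so calls can be chained. */
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Renders the tree rooted at n, one node per line, children indented. */
    static String dump(Node n) {
        StringBuilder out = new StringBuilder();
        dump(n, 0, out);
        return out.toString();
    }

    private static void dump(Node n, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  "); // assumed indentation: two spaces per level
        }
        out.append('[').append(n.text).append(",<").append(n.type).append(">]\n");
        for (Node child : n.children) {
            dump(child, depth + 1, out);
        }
    }
}
```

Building a guessed tree for "def foo {}" as an ObjectExpr node with a FinalPattern child (holding the IDENT foo) and an EScript child, then calling TreeDumpSketch.dump on it, yields the same four node lines as the start of the output above, with indentation added to show the nesting.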
The one important piece of Antlr syntax I forgot to put in comments is the
syntactic predicate "(a b) => x y", which looks ahead and, if it finds a
followed by b, commits the parser to the alternative x y. This allows
disambiguation in a few key productions.
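To make the lookahead behaviour concrete, here is a hand-written Java sketch of roughly what a predicate such as "(a b) => x y" amounts to: speculatively match a then b, rewind the input, and only then pick an alternative. This is not the code Antlr generates. The class, the string tokens, and the fallback alternative "z" are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the rule:  rule : (a b) => x y | z ;  */
public class SynPredSketch {
    private final List<String> tokens;
    private int pos = 0; // index of the next unconsumed token

    public SynPredSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    /** Current input position, exposed to show that lookahead consumes nothing. */
    public int position() {
        return pos;
    }

    /** Consumes the next token if it equals expected and reports success. */
    private boolean match(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equals(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    /** Speculative match: true if the expected tokens come next, in order. */
    private boolean lookahead(String... expected) {
        int mark = pos;          // remember where we started
        boolean ok = true;
        for (String e : expected) {
            if (!match(e)) {
                ok = false;
                break;
            }
        }
        pos = mark;              // always rewind; the real parse happens later
        return ok;
    }

    /** Returns the name of the alternative the predicate selects. */
    public String choose() {
        return lookahead("a", "b") ? "x y" : "z";
    }
}
```

With the tokens a, b, c the sketch selects "x y"; with a, c it falls through to "z". In both cases position() is still 0 afterwards, since the predicate only peeks and the chosen alternative then parses the same input from the start.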