CS2210 Compiler Construction Fall 2022
Part II: Syntax Analysis
1. Objective
In this phase of the project, you are required to write a parser using YACC for the CS 2210 programming language, MINI-JAVA. The parser communicates with the lexer you built in Part I and output the parse tree of the input MINI-JAVA program.
2. Due Date
The assignment has no due date.
3. Grammar Specification
The grammar is specified by syntax diagrams (Appendix B).
4. Implementation 4.1 Parser Structure
grammar.y ¡ú ¡ú y.tab.c
lexl.l ¡ú ¡ú lex.yy.c parser
your_other_files.c, proj2.c terminal# parser < test1.java
Grammar.y has similar file structure as that of ¡°lex.l¡±.
%{ /* definition */ #include "proj2.h" #include
%token
%% /* yacc specification */
Program : PROGRAMnum IDnum SEMInum ClassDecl
{ $$ = MakeTree(ProgramOp, $4, MakeLeaf(IDNode, $2)); printtree($$, 0); }
/* other rules */
Expression : SimpleExpression {$$ = $1;}
| SimpleExpression Comp_op SimpleExpression
{ MkLeftC($1, $2); $$ = MkRightC($3, $2); }
int yycolumn, yyline;
FILE *treelst;
main() { treelst = stdout; yyparse(); }
yyerror(char *str) { printf(“yyerror: %s at line %d\n”, str, yyline); } #include “lex.yy.c”
Modification has to be made in your lex.l. When assigning yylval, you need to
{int} {yycolumn += yyleng; yylval.intg = atoi(yytext); return(ICONSTnum);} {variable} { …. yylval.intg = index; … }
4.2 Data Strutures
Appendix A lists functions that are provided for your convenience to implement and debug your code. The C source code ¡°proj2.c¡± and header file ¡°proj2.h¡± could be found from class webpage.
The parse tree is defined as follows.
You need to distinguish the following kinds of nodes (defined in proj2.h): IDNode, NUMNode, STRINGNode, DUMMYNode, INTEGERTNode or EXPRNode. The first 4 kinds correspond to an identifier, an integer constant, a string constant and a null node. A leaf node of INTEGERTNode kind is created for ¡°int¡± type declaration, i.e. create the node for every INTnum token. All interior nodes are of EXPRNode kind.
Each Leaf node contains an IntVal field. For an ID or string constant node, IntVal is the value to find its lexeme ( a pointer to symbol table). For a NUMNode, it is the value. For a DUMMYNode, it is always 0.
Each interior node is associated with an operator type. Defined in proj2.h, we have the following types.
/* syntax tree node struct */ typedef struct treenode
{ int NodeKind, NodeOpType, IntVal; struct treenode *LeftC, *RightC;
} ILTree, *tree;
program, root node operator
class body, method body, decl body, statmentlist body. each declaration has this operator
connected by ¡°,¡±
array type
type id operator
bound for array variable declaration
head of method,
arguments specified by ¡°VAL¡± .e.g. abc(VAL int x) statement
if-then-else
while statement
specification of parameters
routine call
assign operator
return statement
ProgramOp:
ArrayTypeOp:
RArgTypeOp:
VargTypeOp:
RoutineCallOp:
AddOp, SubOp, MultOp, DivOp, LTOp, GTOp, EQOp, NEOp, LEOp, GEOp, AndOp, OrOp, UnaryNegOp,
NotOp: ALU operations VarOp:
SelectOp: IndexOp: FieldOp: ClassOp: MethodOp: ClassDefOp:
to access a field/index variable follow ¡°[]¡± to access a variable follow ¡°.¡± to access a variable for each class
for each method
for each class defintion
Functions makeleaf, maketree are used to create leaf nodes and intermediate nodes respectively. Printtree(tree nd, int depth) is used to output a tree structure. You need to provide the implementation of following two functions in order to have variable name and string const correctly printed. That is, replace the following code in ¡°proj2.c¡± with your version.
To grade your project, you are also required to print out the parse tree at top level after you have successfully built it. Syntax errors should be reported in your yyerror function. You need to give the line number where the error occurs.
The sample output for the example is:
extern char strg_tbl[];
char* getname(int i) /* i is the index of the table, passed through yylval*/ { return( strg_tbl+i );/*return string table indexed at i*/ }
char* getstring(int i)
{ return( strg_tbl+i );/*return string table indexed at i*/}
+-[IDNode,0,”xyz”]
R-[ProgramOp]
| +-[IDNode,4,”test”]
| +-[ClassDefOp]
| | | +-[MethodOp]
| | | | | +-[DUMMYnode]
| | | | | +-[SpecOp]
+-[DUMMYnode]
+-[CommaOp]
| +-[STRINGNode,29,”Hello World !!!”]
+-[RoutineCallOp]
| | +-[DUMMYnode]
| | +-[SelectOp]
| | | | +-[DUMMYnode]
| | | +-[FieldOp]
| | | +-[IDNode,21,”println”]
| +-[VarOp]
| +-[IDNode,14,”system”]
+-[StmtOp]
| +-[DUMMYnode]
+-[BodyOp]
| +-[DUMMYnode]
| | | | | | +-[DUMMYnode]
| | | | +-[HeadOp]
| | | | +-[IDNode,9,”main”]
| | +-[BodyOp]
| | +-[DUMMYnode]
+-[ClassOp]
+-[DUMMYnode]
5. Assignment Submission
Thereisno submission for this project.
Appendix A: Provided functions
function NullExp(); return *ILTree
Returns a null node with kind=DummyNode and semantic value=0.
function MakeLeaf(Kind: NodeKindType; N: integer); return *ILTree
Returns a leaf node of specified Kind with integer semantic value N.
function MakeTree(Op: NodeOpType; Left,Right: *ILTree); return *ILTree
Returns an internal node, T, such that NodeOp(T)=Op; LeftChild(T)=Left; RightChild(T)=Right and
NodeKind(T)=InteriorNode.
function NodeOp(T: *ILTree); return NodeOpType
See MakeTree. Returns the integer constant representing NodeOpType of T if T is an interior node, else returns UndefOp.
Uses NodeKind(T) to distinguish leaf from interior. function NodeKind(T: *ILTree); return NodeKindType
Returns the kind of node T. function LeftChild(T: *ILTree); return *ILTree
Returns pointer to left child of T. Returns pointer to null node if NodeKind(T) <> InteriorNode. function RightChild(T: *ILTree); return *ILTree
Returns pointer to right child of T. Returns pointer to null node if NodeKind(T)!= InteriorNode. function IntVal(T: *ILTree); return integer
See MakeLeaf. Returns integer semantic value of node T if NodeKind(T) = IDNode, STRGNode, NUMNode, or BOOLNode. Otherwise returns Undefined.
function IsNull(T: *ILTree); return boolean IsNull(T) iff T is null node.
function SetNodeOp(T: *ILTree; Op: NodeOpType)
NodeKind(T) must be InteriorNode. Makes NodeOp(T) = Op.
function SetNodeKind(T: *ILTree; Kind: NodeKindType)
NodeKind(T) must not be InteriorNode. Makes NodeKind(T) = Kind.
function SetNodeVal(T: *ILTree; Val: integer)
NodeKind(T) must not be InteriorNode. Makes IntVal(T) = Val.
function SetLeftChild(T,NewChild: *ILTree)
NodeKind(T) must be InteriorNode. Makes LeftChild(T) = NewChild.
function SetRightChild(T,NewChild: *ILTree)
NodeKind(T) must be InteriorNode. Makes RightChild(T) = NewChild.
Computer Science Tutoring
Appendix B: Syntax diagrams
VariableInitializer
)snekot( slobmys lanimret sispille dilos
slobmys lanimretnon sexob dehsad :dnegeL
eertbus espilce dedahs
sedon lamron espilce :dnegeL
lceDdohteM
dIlceDelbairaV
rezilaitinIelbairaV
ro F eertbuS
ro F eertbuS
eertbus gniwollof eht sah raV hcaE
roF eertbuS
roF eertbuS
lceDdohteM
roF eertbuS
SNOITARALCEDDNESNOITARALCED
roF eertbuS
erom ro eno
erom ro orez
roF eertbuS
lceDdohteM
roF eertbuS
pOfeDssalC
erom ro orez
ro F eertbuS
erom ro eno
roF eertbuS
roF eertbuS
.eertbusehtgnidliub
ni desu eb yam ti taht hcus )elbairav labolg(
retniop etarapes a ni derots eb dluohs epyT
; REIFITNEDI
REIFITNEDI
VariableDeclId
VariableInitializer
ArrayInitializer
ArrayCreationExpression
MethodDecl
noisserpxE
rezilaitinIyarrA
noisserpxEnoitaerCyarrA
rezilaitinIelbairaV
noisserpxE
tsiLretemaraPlamroF
.seertbus dna eht gnidliub
ni desu eb yam ti taht hcus )elbairav labolg(
retniop etarapes a ni derots eb dluohs epyT
pOepyTyarrA
roF eertbuS
roF eertbuS
pOepyTyarrA
lanimret-non taht
roF eertbuS
roF eertbuS
roF eertbuS
roF eertbuS
erom ro eno
erom ro eno
][REIFITNEDI
roF eertbuS
kcolB)(REIFITNEDIDOHTEM
FormalParameterList
R/VA rgTypeOp
Sub tree Fo r
VAL INT IDENTIFIER
Comm aOp R /VA rgTypeOp
IDNode INTEGERT Comm aOp Dumm y
StatementList
IDNode INTEGERT
Sub tree Fo r Sub tree Fo r
Decls Stm tList
Decls Statem entL ist
Sub tree Fo r
ID/INTEGER
zeroormore
IDENT IF IER
SubtreeFor Dummy
{ Statement } SubtreeFor
SubtreeFor
Dummy Stm tList
Sub tree Fo r
thatnon-term inal
Stm tL ist
A s s ig nm e n tS ta tem e n t
M e th o dC a l lS ta tem e n t
ReturnStatem ent
IfStatem ent
W hileStatement
Programming Help
AssignmentStatement
A s s ig nO p
A s s ig nO p
Sub tree Fo r
Dumm y Var
RoutineCallOp
SubtreeFor
V ariab le := Expression
MethodCallStatement
Variable ( Expression ) Comm aOp
Sub tree Fo r
Sub tree Fo r
, Comm aOp
Sub tree Fo r
ReturnStatement
RETURN Expression
IfStatement
IF Expression StatementList
Sub tree Fo r
one orm ore
WhileStatement
WH ILE Expression Statem entL ist
Sub tree Fo r
ELSE StatementList Stm tList
Expression
Dumm y Comm aOp
Sub tree Fo r Sub tree Fo r
Expr Stm tL ist
SubtreeFor SubtreeFor
Expr Stm tL ist
SimpleExpression SimpleExpression
Sub tree Fo r Sub tree Fo r
>= S im pleExpr S im pleExpr
Code Help, Add WeChat: cstutorcs
SimpleExpression
UnsignedConstant
tnatsnoCdengisnU
tnemetatSllaCdohteM
noisserpxE
noisserpxE
ymmuDedoNDI
dleiF/xednI
erom ro eno
dleiF/xednI
ro F eertbuS
REIFITNEDI .
TNATSNOCGNIRTS
TNATSNOCREGETNI
erom ro orez
pOtceleSedoNDI
lanimret-nontaht
roF eertbuS
lanimret-nontaht
roF eertbuS
pOgeNyranU
mreT/rtocaFmreT/rotcaF
roF eertbuS
roF eertbuS
mreT/rotcaFpOddA
roF eertbuS
erom ro eno
roF eertbuS
roF eertbuS
roF eertbuS
REIFITNEDI