-
Notifications
You must be signed in to change notification settings - Fork 14
/
Copy pathattribute.html
430 lines (382 loc) · 18.8 KB
/
attribute.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
<!DOCTYPE html>
<html lang="en">
<head>
<title>SILC | ATTRIBUTES</title>
<meta charset="UTF-8">
<link rel="stylesheet" href="css/style.css">
</head>
<body>
<header class="center clearfix" id="navtop"> <a href="index.html" class="logo fleft"><img src="img/logo.png" alt=""></a>
<nav class="fright">
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="about.html">About</a></li>
<!-- <li><a href="help.html">Help</a></li> -->
<li><a href="roadmap.html">Roadmap</a></li>
<li><a href="documentation.html" class="navactive">Documentation</a></li>
</ul>
</nav>
</header>
<div class="about center part clearfix">
<header class="title">
<h3 class="fleft">Contents</h3>
</header>
<aside class="column4 mright">
<menu>
<ul>
<li><a href="#" class="sec">Lorem sipsum</a></li>
<li><!--For blank space between lins and download button--> </li>
<!-- <li><a href="#" class="button"> Download as PDF </a></li> -->
<li><!--Blank line for space between download button and main title--> </li>
</ul>
</menu>
</aside>
<section class="columnthird content"> <h1 class="mright">ATTRIBUTE SYNTHESIS</h1>
<article id="navintro" class="detail">
<h2>Introduction to attributes</h2>
<p>In the <a href="yacc.html#navpassingvalues">last section</a> of the YACC documentation we noted that it is possible to pass values associated with tokens from yylex() to yyparse(). We had described the term 'attribute' as a value associated with a token. YACC uses yylval to facilitate passing values from the lexical analyzer to the parser. In this document we will explore attribute values for non-terminals and how these attributes are handled by YACC internally. We will also be exploring the usage of YYSTYPE to define custom attribute types.
</p>
<p>yylval is a global variable of the default return type <a href="#navyystype">YYSTYPE</a> declared by YACC in y.tab.c. In the YACC documentation, we had seen an <a href="yacc.html#navexy1">example</a> which illustrates the passing of attributes from yylex() to yyparse(). If the programmer were to use LEX to generate yylex(), then the attributes will have to be passed to yyparse() in a similar fashion i.e, using yylval (as shown below). In the LEX program, the yylex() simply returns the token by its name while the associated attribute is assigned to yylval which can be accessed by yyparse(). Note that, for LEX to return a token, the token must be declared in the declarations section of the YACC program. The following example is a LEX program which returns a token DIGIT when it finds a string matching the "number" pattern.
</p>
<pre>
%{
#include "y.tab.h"
#include<e><</e>stdlib.h<e>></e>
#include<e><</e>stdio.h<e>></e>
%}
number [0-9]+
%%
{number} {
yylval = atoi(<a href="lex.html#navyytext">yytext</a>);
return DIGIT;
}
. return *yytext;
%%
</pre>
<p>In this example, we want to return the token DIGIT when an integer is found in the input stream. In addition to the token, we need to pass the value found in the input stream to yyparse(). The lexeme found in the input stream is a string which contains the integer found. <a href="http://en.cppreference.com/w/cpp/string/byte/atoi">atoi()</a> is a built-in function of the type int defined in the stdlib.h header file. We use atoi() to obtain the integer equivalent of the lexeme found. The obtained integer value is then assigned to yylval.
</p>
<p> The following code segment is a YACC program which declares the token DIGIT. Recall that the following program must be run with a <a href="ywl.html#navdflag">-d flag</a> to the yacc command. This flag is used to create the <a href="ywl.html#navdflag">y.tab.h</a> file which facilitates a one way communication from YACC to LEX i.e., it is used to communicate the tokens declared in the YACC program to the LEX program. Let us refer to the task of LEX obtaining the token declarations from YACC as LEX <i>importing</i> the declarations. The y.tab.h file contains declarations of all the tokens in the YACC program. Note that the LEX program above includes the y.tab.h file in the auxiliary declarations section to <i>import</i> the declarations.
</p>
<pre>
%{
#include <<e>stdio.h</e>>
%}
%token DIGIT
%%
start : expr '\n' {printf("\nComplete");exit(1);}
;
expr: expr '+' expr {printf("+ ");}
| expr '*' expr {printf("* ");}
| '(' expr ')'
| DIGIT {printf("%d ",$1);}
;
%%
yyerror()
{
printf("Error");
}
main()
{
yyparse();
return 1;
}
</pre>
</article>
<article id="navily" class="detail">
<h2>Non-terminal attributes</h2>
<p>
In addition to values associated with terminal symbols, YACC also allows values to be associated with non-terminals. We extend our notion of an attribute to: "An attribute is a value associated with a terminal or non-terminal grammar symbol". The attribute of a grammar symbol in a production can be referred to in the actions section of a rule using a special YACC syntax: $1 for the first symbol in the body of a production, $2 for the second symbol, $3 for the third and so on. For example consider the following example of a YACC rule.
</p>
<pre>
X: B C D
</pre>
<p id="navstackvar">
The value of a symbol 'B' can be referred to by $1, value of 'B' can be referred to by $2 and value of symbol 'C' can be referred to by $3. It is also possible to assign a value to the non-terminal which occurs as the head of a rule's production using $$. A value can be assigned to the symbol X using $$. Let's call these the <i>attribute stack variables</i>.</p>
<p>
Consider the problem of displaying two numbers in an input stream if they occur as a pair separated by a comma. Also suppose that the numbers must be displayed ONLY after a pair is found. Let us look at a YACC program that solves the problem.
</p>
<pre>
pair: number ',' number { printf("pair(%d,%d),$1,$3"); }
;
number: DIGIT { $$=$1; }
;
</pre>
<p>
In the program segment, the first rule displays the value of the 'number' symbols when found as a pair in the input stream. The action of the rule refers to the value of the first number symbol as $1 and and the second number as $3. (Note that $2 refers to the <i>literal token</i> ',' which does not have any value associated with it). Since 'number' is a non-terminal, its attribute cannot be set by yylex(). <a href="yacc.html#navprod">Recall</a> that every non-terminal symbol in the CFG must have a corresponding production with the non-terminal as the head. The attribute value of a non-terminal is set by the action of the rule which contains the corresponding production. In the example, the value of the non-terminal 'number' is set by the rule:
</p>
<pre>
number: DIGIT { $$=$1; }
</pre>
<p>
The action of the rule sets the value of 'number' (reffered to using $$) to the value of DIGIT (referred to using $1).
</p>
<p>
Sample I/O:
<p>
<pre>
I: 2,5<br>
O: pair(2,5)<br><br>
I: 3,5,7<br>
O: syntax error
</pre>
</article>
<article id="navastack" class="detail">
<h2>Attribute Stack</h2>
<p>
YACC maintains a parse stack to achieve <a href="yacc.html#navshiftreduce">shift-reduce parsing</a>. The parse stack contains grammar symbols (both terminal and non-terminal symbols) representing the current <a href="yacc.html#">configuration of the parser</a>. Similar to the parse stack, YACC also maintains an attribute stack to maintain the values of the grammar symbols in the parse stack. The attribute stack is <i>synchronous</i> with the parse stack. <i>Synchronous</i> because the i'th value on the attribute stack will be the attribute of the i'th symbol on the parse stack.
</p>
<p><a href="lex.html#">Recall</a> that when a token has been identified by LEX in the input stream, it stores the token's value in yylval. When a token is <a href="yacc.html#navparseact">shifted</a> to the parse stack, the value of yylval is pushed on to the attribute stack. Similarly, during a <a href="yacc.html#navparseact">reduction</a> when a non-terminal replaces a production <a href="yacc.html#navparseact">handle</a> on top of the stack, the attributes of the grammar symbols of the handle are popped from the stack and the attribute value of the non-terminal (head of the production) is pushed on to the attribute stack. The following example is a complete YACC program that evaluates an expression.
</p>
<pre>
%{
#include <<e>stdio.h</e>>
int result;
%}
%token DIGIT
%left '+'
%left '*'
%%
start : expr '\n' {result = $1; exit(1);}
;
expr: expr '+' expr {$$ = $1 + $2;}
| expr '*' expr {$$ = $1 * $3;}
| '(' expr ')' {$$ = $2;}
| DIGIT {$$ = $1;}
;
%%
yyerror()
{
printf("Error");
}
main()
{
yyparse();
printf("Expression value = %d",result);
return 1;
}
</pre>
<p>Sample I/O:</p>
<pre>
I: 2+3*(4+5)<br>
O: 29<br>
</pre>
<p>Consider the two productions from the example.</p>
<pre>
expr: expr '+' expr {$$ = $1 + $2;} <br>
expr: DIGIT {$$ = $1;}
</pre>
<p> When the input 2+3*(4+5) is fed to the parser, yylex() reads the first character and finds a matching token DIGIT. yylex() returns DIGIT and asigns the value 2 to yylval. When yylex() has returned, yyparse() pushes DIGIT to the parse stack and the value of yylval to the attribute stack. $1, $2, $3 etc., refers to the attribute values on top of the stack before a reduction takes place. $$ refers to the attribute value of the non-terminal which is the head of the production which contains the handle. When the non-terminal is pushed on to the parse stack, the value of $$ is pushed on to the attribute stack. $$ refers to the symbol on top of the stack after a reduction has taken place. The following table shows the configuration of the parse stack and the attribute stack at every step of the parsing process. Assume that whenever yylex() returns a token with no attribute, yyparse() pushes a '<b>.</b>' to the attribute stack.</p>
<table class="tg">
<tr>
<th class="tg-e3zv">PARSE STACK<br></th>
<th class="tg-e3zv">ATTRIBUTE STACK<br></th>
<th class="tg-e3zv">I/P BUFFER<br></th>
<th class="tg-e3zv">PARSER-ACTION EXECUTED<br></th>
</tr>
<tr>
<td class="tg-031e"></td>
<td class="tg-031e"></td>
<td class="tg-031e">2 + 3 * (4 + 5) $<br></td>
<td class="tg-031e">_</td>
</tr>
<tr>
<td class="tg-031e">DIGIT</td>
<td class="tg-031e">2</td>
<td class="tg-031e">+ 3 * ( 4 + 5 ) $<br></td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr</td>
<td class="tg-031e">2</td>
<td class="tg-031e">+ 3 * ( 4 + 5 ) $</td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr +<br></td>
<td class="tg-031e">2 <b>.</b></td>
<td class="tg-031e">3 * ( 4 + 5 ) $</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + DIGIT<br></td>
<td class="tg-031e">2 <b>.</b> 3</td>
<td class="tg-031e">* ( 4 + 5 ) $</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr<br></td>
<td class="tg-031e">2 <b>.</b> 3</td>
<td class="tg-031e">* ( 4 + 5) $<br></td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr + expr *<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>.</b></td>
<td class="tg-031e">( 4 + 5 ) $</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * (<br></td>
<td class="tg-031e">2 <b>.</b> 3<b> . .</b></td>
<td class="tg-031e">4 + 5 ) $<br></td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( DIGIT<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 4</td>
<td class="tg-031e">+ 5 ) $</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 4 </td>
<td class="tg-031e">+ 5 ) $</td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr +<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 4 <b>.</b></td>
<td class="tg-031e">5 ) $</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr + DIGIT<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 4 <b>.</b> 5</td>
<td class="tg-031e">) $<br></td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr + expr <br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 4 <b>.</b> 5</td>
<td class="tg-031e">) $<br></td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 9</td>
<td class="tg-031e">) $<br></td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * ( expr )<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>. .</b> 9 .</td>
<td class="tg-031e">$</td>
<td class="tg-031e">SHIFT</td>
</tr>
<tr>
<td class="tg-031e">expr + expr * expr<br></td>
<td class="tg-031e">2 <b>.</b> 3 <b>.</b> 9</td>
<td class="tg-031e">$</td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr + expr <br></td>
<td class="tg-031e">2 <b>.</b> 27</td>
<td class="tg-031e">$</td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">expr</td>
<td class="tg-031e">29</td>
<td class="tg-031e">$</td>
<td class="tg-031e">REDUCE</td>
</tr>
<tr>
<td class="tg-031e">$expr</td>
<td class="tg-031e">29</td>
<td class="tg-031e">$</td>
<td class="tg-031e">ACCEPT</td>
</tr>
</table>
</article>
<article id="navyystype" class="detail">
<h2>YYSTYPE</h2>
<p> Attributes of terminals can be passed from yylex() to yyparse() and attributes of a non-terminal can be <b>synthesized</b>. An attribute of a non-terminal grammar symbol is said to <i>synthesized</i> if it has been calculated from the attribute values of it's children in the <a href="http://en.wikipedia.org/wiki/Parse_tree">parse tree</a>. A synthesized attribute is an attribute of a non-terminal than has been calculated using the attribute values of the symbols in the handle that it replaces. An example of synthesized attribute:
</p>
<pre>Z: X {printf("Result=%d",$1);}
X: A '+' B { $$ = $1 + $3; }
</pre>
<p>The attribute value of X is a synthesized attribute as it has been calculated using the attribute values of the symbols in the handle that it replaces. The attribute stack consists of attributes of token and synthesized attributes. The attribute stack is of the type YYSTYPE i.e, all the <a href="#navstackvar">stack variables</a> are of the type YYSTYPE. For example, in the above production, $$,$1 and $3 are all of the type YYSTYPE. YYSTYPE is a YACC-defined data type. yylval is declared by YACC to be of the type YYSTYPE. By default YACC defines YYSTYPE to be the type int. It means that, by default only integer valued attributes can be passed from yylex() to yyparse() and only integer attributes can be synthesized. If we were to attempt to assign any other value to yylval or any of the attribute stack variables, a type error would be flagged on compiling y.tab.c using gcc.
</p>
<p> This default type definition of YYSTYPE can overriden with any built-in or userdefined data type. For example if we wanted to print the prefix form of an expression:
</p>
<pre>
expr: expr OP expr { printf("%c %c %c",$2,$1,$3);}
</pre>
<p>
The type of YYSTYPE can be overriden manually as shown below. This line has to be added to the declarations section of the YACC program. This may be used (not recommended) to change the type of all the attributes from int to some other type.
</p>
<pre>
typedef char YYSTYPE;
</pre>
<p>But inorder to have multiple custom attribute values, YACC offers a useful feature called %union to customize the type of YYSTYPE. %union is useful when we require to have different tokens of different types. For example if we wanted some tokens to be of the type int and some tokens to be of the type char. The following code segment can be added to declarations section of the YACC program to achieve that.
</p>
<pre>
/* YACC Auxiliary declarations*/
/* YACC Declarations*/
%union
{
char character;
int integer;
};
%token <e><</e>character<e>></e> OP
%token <e><</e>integer<e>></e> NUMBER
%%
expr: expr OP expr { printf("%c %d %d",$2,$1,$3);}
| DIGIT {$$=$1;}
;
%%
/* Auxiliary functions */
</pre>
<p> Note that the type of the attribute of each token must be mentioned when the token is being declared using the following syntax.
<pre>
%token <e><</e>type<e>></e> tokenname
</pre>
'type' must be declared under %union prior to use in the declaration of a token. If the type of a token is not explicitly mentioned, no attribute value can be assigned to the token i.e, it is assumed to be of type <a href="http://en.wikipedia.org/wiki/Void_type">void</a>.
</p>
</article>
<article id="ref" class="detail">
<h2>References</h2>
<p>
<a href="http://epaperpress.com/lexandyacc/pry1.html">http://epaperpress.com/lexandyacc/pry1.html</a> <br>
<a href="http://www.sci.brooklyn.cuny.edu/~zhou/teaching/cis707/attr/tsld003.htm">http://www.sci.brooklyn.cuny.edu/~zhou/teaching/cis707/attr/tsld003.htm<a><br>
<a href="http://cs.nyu.edu/~gottlieb/courses/2008-09-fall/compilers/lectures/lecture-08.html">http://cs.nyu.edu/~gottlieb/courses/2008-09-fall/compilers/lectures/lecture-08.html</a><br>
<a href="http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools"> http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools</a> <br>
</p>
</artcile>
<article id="navreferences" class="detail">
<h2>References</h2>
<p> For further details on the topics covered in this document, the reader may refer to the following :</p>
<ul>
<li>1. Compilers : Principles,Techniques and Tools by Alfred V.Aho, Monica S. Lam, Ravi Sethi and Jeffrey D.Ulman .</li>
<li>2. Modern Compiler Implementation in C by Andrew W.Appel</li>
<li>3. Flex & Bison by John Levine</li>
<li>4. <a href="http://dinosaur.compilertools.net/"> http://dinosaur.compilertools.net/</a></li>
</ul>
</article>
<div class="up grid col-one-third" style="float:right">
<a href="#navtop" title="Go back up"> top ↑</a>
</div>
</section>
</div>
<footer class="center part clearfix">
<ul class="social column3 mright">
<li><a href="https://github.com/silcnitc">Github</a></li>
<li> <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="img/creativecommons.png" /></a></li>
</ul>
<div class="up column3 mright"> <a href="#navtop" class="ir">Go up</a> </div>
<nav class="column3">
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="about.html">About</a></li>
<!-- <li><a href="uc.html">Contact</a></li> -->
</ul>
</nav>
</footer>
<script src="http://code.jquery.com/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/jquery-1.5.1.min.js"><\/script>')</script>
<script src="js/scripts.js"></script>
<script src="js/inject.js"></script>
</body>
</html>