Preface vii
Part I Fundamentals of Compilation
1 Introduction 3
1.1 Modules and interfaces 4
1.2 Tools and software 5
1.3 Data structures for tree languages 7
2 Lexical Analysis 16
2.1 Lexical tokens 17
2.2 Regular expressions 18
2.3 Finite automata 21
2.4 Nondeterministic finite automata 24
2.5 Lex: a lexical analyzer generator 30
3 Parsing 39
3.1 Context-free grammars 41
3.2 Predictive parsing 46
3.3 LR parsing 56
3.4 Using parser generators 69
3.5 Error recovery 76
4 Abstract Syntax 88
4.1 Semantic actions 88
4.2 Abstract parse trees 92
5 Semantic Analysis 103
5.1 Symbol tables 103
5.2 Bindings for the Tiger compiler 112
5.3 Type-checking expressions 115
5.4 Type-checking declarations 118
6 Activation Records 125
6.1 Stack frames 127
6.2 Frames in the Tiger compiler 135
7 Translation to Intermediate Code 150
7.1 Intermediate representation trees 151
7.2 Translation into trees 154
7.3 Declarations 170
8 Basic Blocks and Traces 176
8.1 Canonical trees 177
8.2 Taming conditional branches 185
9 Instruction Selection 191
9.1 Algorithms for instruction selection 194
9.2 CISC machines 202
9.3 Instruction selection for the Tiger compiler 205
10 Liveness Analysis 218
10.1 Solution of dataflow equations 220
10.2 Liveness in the Tiger compiler 229
11 Register Allocation 235
11.1 Coloring by simplification 236
11.2 Coalescing 239
11.3 Precolored nodes 243
11.4 Graph coloring implementation 248
11.5 Register allocation for trees 257
12 Putting It All Together 265
Part II Advanced Topics
13 Garbage Collection 273
13.1 Mark-and-sweep collection 273
13.2 Reference counts 278
13.3 Copying collection 280
13.4 Generational collection 285
13.5 Incremental collection 287
13.6 Baker's algorithm 290
13.7 Interface to the compiler 291
14 Object-Oriented Languages 299
14.1 Classes 299
14.2 Single inheritance of data fields 302
14.3 Multiple inheritance 304
14.4 Testing class membership 306
14.5 Private fields and methods 310
14.6 Classless languages 310
14.7 Optimizing object-oriented programs 311
15 Functional Programming Languages 315
15.1 A simple functional language 316
15.2 Closures 318
15.3 Immutable variables 319
15.4 Inline expansion 326
15.5 Closure conversion 332
15.6 Efficient tail recursion 335
15.7 Lazy evaluation 337
16 Polymorphic Types 350
16.1 Parametric polymorphism 351
16.2 Type inference 359
16.3 Representation of polymorphic variables 369
16.4 Resolution of static overloading 378
17 Dataflow Analysis 383
17.1 Intermediate representation for flow analysis 384
17.2 Various dataflow analyses 387
17.3 Transformations using dataflow analysis 392
17.4 Speeding up dataflow analysis 393
17.5 Alias analysis 402
18 Loop Optimizations 410
18.1 Dominators 413
18.2 Loop-invariant computations 418
18.3 Induction variables 419
18.4 Array-bounds checks 425
18.5 Loop unrolling 429
19 Static Single-Assignment Form 433
19.1 Converting to SSA form 436
19.2 Efficient computation of the dominator tree 444
19.3 Optimization algorithms using SSA 451
19.4 Arrays, pointers, and memory 457
19.5 The control-dependence graph 459
19.6 Converting back from SSA form 462
19.7 A functional intermediate form 464
20 Pipelining and Scheduling 474
20.1 Loop scheduling without resource bounds 478
20.2 Resource-bounded loop pipelining 482
20.3 Branch prediction 490
21 The Memory Hierarchy 498
21.1 Cache organization 499
21.2 Cache-block alignment 502
21.3 Prefetching 504
21.4 Loop interchange 510
21.5 Blocking 511
21.6 Garbage collection and the memory hierarchy 514
Appendix: Tiger Language Reference Manual 518
A.1 Lexical issues 518
A.2 Declarations 518
A.3 Variables and expressions 521
A.4 Standard library 525
A.5 Sample Tiger programs 526
Bibliography 528
Index 537
Over the past decade, there have been several shifts in the way compilers are built. New kinds of programming languages are being used: object-oriented languages with dynamic methods, functional languages with nested scope and first-class function closures; and many of these languages require garbage collection. New machines have large register sets and a high penalty for memory access, and can often run much faster with compiler assistance in scheduling instructions and managing instructions and data for cache locality.
This book is intended as a textbook for a one- or two-semester course in compilers. Students will see the theory behind different components of a compiler, the programming techniques used to put the theory into practice, and the interfaces used to modularize the compiler. To make the interfaces and programming examples clear and concrete, I have written them in the C programming language. Other editions of this book are available that use the Java and ML languages.
Implementation project. The "student project compiler" that I have outlined is reasonably simple, but is organized to demonstrate some important techniques that are now in common use: abstract syntax trees to avoid tangling syntax and semantics, separation of instruction selection from register allocation, copy propagation to give flexibility to earlier phases of the compiler, and containment of target-machine dependencies. Unlike many "student compilers" found in textbooks, this one has a simple but sophisticated back end, allowing good register allocation to be done after instruction selection.
Each chapter in Part I has a programming exercise corresponding to one module of a compiler. Software useful for the exercises can be found at
Exercises. Each chapter has pencil-and-paper exercises; those marked with a star are more challenging, two-star problems are difficult but solvable, and the occasional three-star exercises are not known to have a solution.
Course sequence. The figure shows how the chapters depend on each other.
· A one-semester course could cover all of Part I (Chapters 1-12), with students implementing the project compiler (perhaps working in groups); in addition, lectures could cover selected topics from Part II.
· An advanced or graduate course could cover Part II, as well as additional topics from the current literature. Many of the Part II chapters can stand independently from Part I, so that an advanced course could be taught to students who have used a different book for their first course.
· In a two-quarter sequence, the first quarter could cover Chapters 1-8, and the second quarter could cover Chapters 9-12 and some chapters from Part II.
Acknowledgments. Many people have provided constructive criticism or helped me in other ways on this book. I would like to thank Leonor Abraido-Fandino, Scott Ananian, Stephen Bailey, Max Hailperin, David Hanson, Jeffrey Hsu, David MacQueen, Torben Mogensen, Doug Morgan, Robert Netzer, Elma Lee Noah, Mikael Petterson, Todd Proebsting, Anne Rogers, Barbara Ryder, Amr Sabry, Mooly Sagiv, Zhong Shao, Mary Lou Sofia, Andrew Tolmach, Kwangkeun Yi, and Kenneth Zadeck.