Code Generation

Code Generation

Download, print and study this document offline
 Page 1


1
11/30/2007 1
Code Generation
CSE413
Autumn 2007
11/30/2007 2
Compiler Structure (review)
Source Target
Scanner
Parser
Middle
(optimization)
Code Gen
characters
tokens
IR
IR (maybe different)
Assembly code
11/30/2007 3
What We Need
 To run a D program:
 Space needs to be allocated for a stack 
and a heap
 ESP and other registers need to have 
sensible initial values
 We need some way to communicate with 
the outside world (I/O)
11/30/2007 4
Bootstraping from C
 Idea: take advantage of the existing C 
runtime library 
 Write a small C main program that calls 
the main method in the assembly you 
generate as if it were a C function.
 C’s standard library provides the execution 
environment
 We can write C functions for get() and 
put() and call them from our asm code.
11/30/2007 5
Bootstrap Program
 The bootstrap will be a tiny C program 
that calls your compiled code as if it 
were an ordinary C function
 It also contains some functions that 
compiled code can call as needed
 Mini “runtime library”
 get(), put()
11/30/2007 6
Example Bootstrap Program
#include <stdio.h>
int d$main();   /* prototype for external function */
int main() {
printf("\nValue returned from D main: %d.\n", d$main());
return 0;
}
/* return next integer from standard input */
int get() { scanf.... }
/* write x to standard output */
int put(int x) { printf... }
Page 2


1
11/30/2007 1
Code Generation
CSE413
Autumn 2007
11/30/2007 2
Compiler Structure (review)
Source Target
Scanner
Parser
Middle
(optimization)
Code Gen
characters
tokens
IR
IR (maybe different)
Assembly code
11/30/2007 3
What We Need
 To run a D program:
 Space needs to be allocated for a stack 
and a heap
 ESP and other registers need to have 
sensible initial values
 We need some way to communicate with 
the outside world (I/O)
11/30/2007 4
Bootstraping from C
 Idea: take advantage of the existing C 
runtime library 
 Write a small C main program that calls 
the main method in the assembly you 
generate as if it were a C function.
 C’s standard library provides the execution 
environment
 We can write C functions for get() and 
put() and call them from our asm code.
11/30/2007 5
Bootstrap Program
 The bootstrap will be a tiny C program 
that calls your compiled code as if it 
were an ordinary C function
 It also contains some functions that 
compiled code can call as needed
 Mini “runtime library”
 get(), put()
11/30/2007 6
Example Bootstrap Program
#include <stdio.h>
int d$main();   /* prototype for external function */
int main() {
printf("\nValue returned from D main: %d.\n", d$main());
return 0;
}
/* return next integer from standard input */
int get() { scanf.... }
/* write x to standard output */
int put(int x) { printf... }
2
11/30/2007 7
Code Generation in Our 
Project 
11/30/2007 8
Generating .asm Code
 Suggestion: isolate the actual output 
operations in a handful of routines
 Modularity & saves some typing
 Possibilities
// write code string s to .asm output
void gen(String s) { … }
// write “op  src,dst” to .asm output
void genbin(String op, String src, String dst) { … }
// write label L to .asm output as “L:”
void genLabel(String L) { … }
 A handful of these methods should do it
11/30/2007 9
Agenda
 Mapping source code to x86
 Today: Stuff Needed for project: basic 
statements and expressions
 Next: Other cool stuff: Object 
representation, method calls, and dynamic 
dispatch
11/30/2007 10
Conventions for Examples
 Examples show code snippets in isolation
 A “real” code generator needs to worry about things 
like :
 Which registers are busy at which point in the program
 Which registers to spill into memory when a new register is 
needed and no free ones are available
 You won’t need to worry about this for your compiler!
 Register eax used below as a generic example 
 Rename as needed for more complex code involving multiple 
registers
 But for the most part in our compiler we can stick to just 
using eax and ecx.
11/30/2007 11
A Simple Code Generation 
Strategy
 Priority: quick ‘n dirty correct code first, 
optimize later if time
 Treat the x86 as a 1-register stack 
machine
11/30/2007 12
x86 as a Stack Machine
 Idea: Use x86 stack for expression evaluation with 
eax as the “top” of the stack
 Invariant: Whenever an expression (or part of one) 
is evaluated at runtime, the result is in eax
 If a value needs to be preserved while another 
expression is evaluated, push eax, evaluate, then pop 
when needed
 Remember: always pop what you push
 Will produce lots of redundant, but correct, code
Page 3


1
11/30/2007 1
Code Generation
CSE413
Autumn 2007
11/30/2007 2
Compiler Structure (review)
Source Target
Scanner
Parser
Middle
(optimization)
Code Gen
characters
tokens
IR
IR (maybe different)
Assembly code
11/30/2007 3
What We Need
 To run a D program:
 Space needs to be allocated for a stack 
and a heap
 ESP and other registers need to have 
sensible initial values
 We need some way to communicate with 
the outside world (I/O)
11/30/2007 4
Bootstraping from C
 Idea: take advantage of the existing C 
runtime library 
 Write a small C main program that calls 
the main method in the assembly you 
generate as if it were a C function.
 C’s standard library provides the execution 
environment
 We can write C functions for get() and 
put() and call them from our asm code.
11/30/2007 5
Bootstrap Program
 The bootstrap will be a tiny C program 
that calls your compiled code as if it 
were an ordinary C function
 It also contains some functions that 
compiled code can call as needed
 Mini “runtime library”
 get(), put()
11/30/2007 6
Example Bootstrap Program
#include <stdio.h>
int d$main();   /* prototype for external function */
int main() {
printf("\nValue returned from D main: %d.\n", d$main());
return 0;
}
/* return next integer from standard input */
int get() { scanf.... }
/* write x to standard output */
int put(int x) { printf... }
2
11/30/2007 7
Code Generation in Our 
Project 
11/30/2007 8
Generating .asm Code
 Suggestion: isolate the actual output 
operations in a handful of routines
 Modularity & saves some typing
 Possibilities
// write code string s to .asm output
void gen(String s) { … }
// write “op  src,dst” to .asm output
void genbin(String op, String src, String dst) { … }
// write label L to .asm output as “L:”
void genLabel(String L) { … }
 A handful of these methods should do it
11/30/2007 9
Agenda
 Mapping source code to x86
 Today: Stuff Needed for project: basic 
statements and expressions
 Next: Other cool stuff: Object 
representation, method calls, and dynamic 
dispatch
11/30/2007 10
Conventions for Examples
 Examples show code snippets in isolation
 A “real” code generator needs to worry about things 
like :
 Which registers are busy at which point in the program
 Which registers to spill into memory when a new register is 
needed and no free ones are available
 You won’t need to worry about this for your compiler!
 Register eax used below as a generic example 
 Rename as needed for more complex code involving multiple 
registers
 But for the most part in our compiler we can stick to just 
using eax and ecx.
11/30/2007 11
A Simple Code Generation 
Strategy
 Priority: quick ‘n dirty correct code first, 
optimize later if time
 Treat the x86 as a 1-register stack 
machine
11/30/2007 12
x86 as a Stack Machine
 Idea: Use x86 stack for expression evaluation with 
eax as the “top” of the stack
 Invariant: Whenever an expression (or part of one) 
is evaluated at runtime, the result is in eax
 If a value needs to be preserved while another 
expression is evaluated, push eax, evaluate, then pop 
when needed
 Remember: always pop what you push
 Will produce lots of redundant, but correct, code
3
11/30/2007 13
Constants
 Source
17
 x86
mov eax, 17
 Idea: realize constant value in a register
 Aside: Optimization: if constant is 0
xor eax, eax
11/30/2007 14
Use (RHS) of Variables
 Source
(use of variable) a
 x86
mov eax, [ebp+ -8]
 All variables in our programs will be addressable 
relative to ebp.  
 In this example a is a local variable.
 If the offset was positive, it would be a parameter.
 Check your symbol table to generate the correct 
offset for a given variable.
11/30/2007 15
Assign (LHS) of Variables
 Source
a = exp
 x86
<code for exp, leaves result in eax>
mov [ebp+ -8],  eax
11/30/2007 16
Example:  var = exp;  
 Assuming that var is a local variable
 <code for exp>
 Generates code that leaves the result of 
evaluating exp in eax
 gen(mov [ebp+offset of variable],eax)
11/30/2007 17
Example: Generate Code for 
Constants and Identifiers
 Integer constants, say 17
gen(mov eax,17)
 leaves value in eax
 Variables
gen(mov eax, [ebp + appropriate offset])
 also leaves value in eax
11/30/2007 18
Assignment Statement
 Source
var = exp;
 x86
<code to evaluate exp, 
leaving result in register eax>
mov [ebp+offset
var
], eax
Page 4


1
11/30/2007 1
Code Generation
CSE413
Autumn 2007
11/30/2007 2
Compiler Structure (review)
Source Target
Scanner
Parser
Middle
(optimization)
Code Gen
characters
tokens
IR
IR (maybe different)
Assembly code
11/30/2007 3
What We Need
 To run a D program:
 Space needs to be allocated for a stack 
and a heap
 ESP and other registers need to have 
sensible initial values
 We need some way to communicate with 
the outside world (I/O)
11/30/2007 4
Bootstraping from C
 Idea: take advantage of the existing C 
runtime library 
 Write a small C main program that calls 
the main method in the assembly you 
generate as if it were a C function.
 C’s standard library provides the execution 
environment
 We can write C functions for get() and 
put() and call them from our asm code.
11/30/2007 5
Bootstrap Program
 The bootstrap will be a tiny C program 
that calls your compiled code as if it 
were an ordinary C function
 It also contains some functions that 
compiled code can call as needed
 Mini “runtime library”
 get(), put()
11/30/2007 6
Example Bootstrap Program
#include <stdio.h>
int d$main();   /* prototype for external function */
int main() {
printf("\nValue returned from D main: %d.\n", d$main());
return 0;
}
/* return next integer from standard input */
int get() { scanf.... }
/* write x to standard output */
int put(int x) { printf... }
2
11/30/2007 7
Code Generation in Our 
Project 
11/30/2007 8
Generating .asm Code
 Suggestion: isolate the actual output 
operations in a handful of routines
 Modularity & saves some typing
 Possibilities
// write code string s to .asm output
void gen(String s) { … }
// write “op  src,dst” to .asm output
void genbin(String op, String src, String dst) { … }
// write label L to .asm output as “L:”
void genLabel(String L) { … }
 A handful of these methods should do it
11/30/2007 9
Agenda
 Mapping source code to x86
 Today: Stuff Needed for project: basic 
statements and expressions
 Next: Other cool stuff: Object 
representation, method calls, and dynamic 
dispatch
11/30/2007 10
Conventions for Examples
 Examples show code snippets in isolation
 A “real” code generator needs to worry about things 
like :
 Which registers are busy at which point in the program
 Which registers to spill into memory when a new register is 
needed and no free ones are available
 You won’t need to worry about this for your compiler!
 Register eax used below as a generic example 
 Rename as needed for more complex code involving multiple 
registers
 But for the most part in our compiler we can stick to just 
using eax and ecx.
11/30/2007 11
A Simple Code Generation 
Strategy
 Priority: quick ‘n dirty correct code first, 
optimize later if time
 Treat the x86 as a 1-register stack 
machine
11/30/2007 12
x86 as a Stack Machine
 Idea: Use x86 stack for expression evaluation with 
eax as the “top” of the stack
 Invariant: Whenever an expression (or part of one) 
is evaluated at runtime, the result is in eax
 If a value needs to be preserved while another 
expression is evaluated, push eax, evaluate, then pop 
when needed
 Remember: always pop what you push
 Will produce lots of redundant, but correct, code
3
11/30/2007 13
Constants
 Source
17
 x86
mov eax, 17
 Idea: realize constant value in a register
 Aside: Optimization: if constant is 0
xor eax, eax
11/30/2007 14
Use (RHS) of Variables
 Source
(use of variable) a
 x86
mov eax, [ebp+ -8]
 All variables in our programs will be addressable 
relative to ebp.  
 In this example a is a local variable.
 If the offset was positive, it would be a parameter.
 Check your symbol table to generate the correct 
offset for a given variable.
11/30/2007 15
Assign (LHS) of Variables
 Source
a = exp
 x86
<code for exp, leaves result in eax>
mov [ebp+ -8],  eax
11/30/2007 16
Example:  var = exp;  
 Assuming that var is a local variable
 <code for exp>
 Generates code that leaves the result of 
evaluating exp in eax
 gen(mov [ebp+offset of variable],eax)
11/30/2007 17
Example: Generate Code for 
Constants and Identifiers
 Integer constants, say 17
gen(mov eax,17)
 leaves value in eax
 Variables
gen(mov eax, [ebp + appropriate offset])
 also leaves value in eax
11/30/2007 18
Assignment Statement
 Source
var = exp;
 x86
<code to evaluate exp, 
leaving result in register eax>
mov [ebp+offset
var
], eax
4
11/30/2007 19
Binary +
 Source
exp1 + exp2
 x86
<code evaluating exp1 into eax>
<code evaluating exp2 into ecx>
add eax, ecx
11/30/2007 20
Example: Generate Code for 
exp1 + exp2
 <code to calculate exp1>
 generates code to evaluate exp1 and put result in eax
 gen(push eax)
 generate a push instruction
 <code to calculate exp2>
 generates code for exp2; result in eax
 gen(pop ecx)
 pop left argument into ecx; cleans up stack
 gen(add eax, ecx)
 perform the addition; result in eax
11/30/2007 21
Binary -, *
 Same as +
 Use sub for –
 // Be sure to get order of operands correct!!
 Use imul for *
 Aside: Optimizations
 Use left shift to multiply by powers of 2
 Use x+x instead of 2*x, etc. (faster)
 Use dec for x-1
11/30/2007 22
Control Flow
 Basic idea: decompose higher level operation 
into conditional and unconditional gotos
 In the following, j
false
is used to mean:
“jump when a condition is false”
 No such instruction on x86
 Will have to realize with appropriate sequence of 
instructions to set condition codes followed by 
conditional jumps
 Normally won’t actually generate the value “true”
or “false” in a register
11/30/2007 23
While
 Source
while (cond) stmt
 x86
test: <code evaluating cond>
j
false
done
<code for stmt>
jmp test
done:
11/30/2007 24
Labels
 In x86 assembly language we’ll need to 
produce unique labels for each if, while, 
etc.
 Labels can appear on a line by 
themselves.
jmp label 
 will start execution on the first line of x86 
code it finds after seeing the label.
Page 5


1
11/30/2007 1
Code Generation
CSE413
Autumn 2007
11/30/2007 2
Compiler Structure (review)
Source Target
Scanner
Parser
Middle
(optimization)
Code Gen
characters
tokens
IR
IR (maybe different)
Assembly code
11/30/2007 3
What We Need
 To run a D program:
 Space needs to be allocated for a stack 
and a heap
 ESP and other registers need to have 
sensible initial values
 We need some way to communicate with 
the outside world (I/O)
11/30/2007 4
Bootstraping from C
 Idea: take advantage of the existing C 
runtime library 
 Write a small C main program that calls 
the main method in the assembly you 
generate as if it were a C function.
 C’s standard library provides the execution 
environment
 We can write C functions for get() and 
put() and call them from our asm code.
11/30/2007 5
Bootstrap Program
 The bootstrap will be a tiny C program 
that calls your compiled code as if it 
were an ordinary C function
 It also contains some functions that 
compiled code can call as needed
 Mini “runtime library”
 get(), put()
11/30/2007 6
Example Bootstrap Program
#include <stdio.h>
int d$main();   /* prototype for external function */
int main() {
printf("\nValue returned from D main: %d.\n", d$main());
return 0;
}
/* return next integer from standard input */
int get() { scanf.... }
/* write x to standard output */
int put(int x) { printf... }
2
11/30/2007 7
Code Generation in Our 
Project 
11/30/2007 8
Generating .asm Code
 Suggestion: isolate the actual output 
operations in a handful of routines
 Modularity & saves some typing
 Possibilities
// write code string s to .asm output
void gen(String s) { … }
// write “op  src,dst” to .asm output
void genbin(String op, String src, String dst) { … }
// write label L to .asm output as “L:”
void genLabel(String L) { … }
 A handful of these methods should do it
11/30/2007 9
Agenda
 Mapping source code to x86
 Today: Stuff Needed for project: basic 
statements and expressions
 Next: Other cool stuff: Object 
representation, method calls, and dynamic 
dispatch
11/30/2007 10
Conventions for Examples
 Examples show code snippets in isolation
 A “real” code generator needs to worry about things 
like :
 Which registers are busy at which point in the program
 Which registers to spill into memory when a new register is 
needed and no free ones are available
 You won’t need to worry about this for your compiler!
 Register eax used below as a generic example 
 Rename as needed for more complex code involving multiple 
registers
 But for the most part in our compiler we can stick to just 
using eax and ecx.
11/30/2007 11
A Simple Code Generation 
Strategy
 Priority: quick ‘n dirty correct code first, 
optimize later if time
 Treat the x86 as a 1-register stack 
machine
11/30/2007 12
x86 as a Stack Machine
 Idea: Use x86 stack for expression evaluation with 
eax as the “top” of the stack
 Invariant: Whenever an expression (or part of one) 
is evaluated at runtime, the result is in eax
 If a value needs to be preserved while another 
expression is evaluated, push eax, evaluate, then pop 
when needed
 Remember: always pop what you push
 Will produce lots of redundant, but correct, code
3
11/30/2007 13
Constants
 Source
17
 x86
mov eax, 17
 Idea: realize constant value in a register
 Aside: Optimization: if constant is 0
xor eax, eax
11/30/2007 14
Use (RHS) of Variables
 Source
(use of variable) a
 x86
mov eax, [ebp+ -8]
 All variables in our programs will be addressable 
relative to ebp.  
 In this example a is a local variable.
 If the offset was positive, it would be a parameter.
 Check your symbol table to generate the correct 
offset for a given variable.
11/30/2007 15
Assign (LHS) of Variables
 Source
a = exp
 x86
<code for exp, leaves result in eax>
mov [ebp+ -8],  eax
11/30/2007 16
Example:  var = exp;  
 Assuming that var is a local variable
 <code for exp>
 Generates code that leaves the result of 
evaluating exp in eax
 gen(mov [ebp+offset of variable],eax)
11/30/2007 17
Example: Generate Code for 
Constants and Identifiers
 Integer constants, say 17
gen(mov eax,17)
 leaves value in eax
 Variables
gen(mov eax, [ebp + appropriate offset])
 also leaves value in eax
11/30/2007 18
Assignment Statement
 Source
var = exp;
 x86
<code to evaluate exp, 
leaving result in register eax>
mov [ebp+offset
var
], eax
4
11/30/2007 19
Binary +
 Source
exp1 + exp2
 x86
<code evaluating exp1 into eax>
<code evaluating exp2 into ecx>
add eax, ecx
11/30/2007 20
Example: Generate Code for 
exp1 + exp2
 <code to calculate exp1>
 generates code to evaluate exp1 and put result in eax
 gen(push eax)
 generate a push instruction
 <code to calculate exp2>
 generates code for exp2; result in eax
 gen(pop ecx)
 pop left argument into ecx; cleans up stack
 gen(add eax, ecx)
 perform the addition; result in eax
11/30/2007 21
Binary -, *
 Same as +
 Use sub for –
 // Be sure to get order of operands correct!!
 Use imul for *
 Aside: Optimizations
 Use left shift to multiply by powers of 2
 Use x+x instead of 2*x, etc. (faster)
 Use dec for x-1
11/30/2007 22
Control Flow
 Basic idea: decompose higher level operation 
into conditional and unconditional gotos
 In the following, j
false
is used to mean:
“jump when a condition is false”
 No such instruction on x86
 Will have to realize with appropriate sequence of 
instructions to set condition codes followed by 
conditional jumps
 Normally won’t actually generate the value “true”
or “false” in a register
11/30/2007 23
While
 Source
while (cond) stmt
 x86
test: <code evaluating cond>
j
false
done
<code for stmt>
jmp test
done:
11/30/2007 24
Labels
 In x86 assembly language we’ll need to 
produce unique labels for each if, while, 
etc.
 Labels can appear on a line by 
themselves.
jmp label 
 will start execution on the first line of x86 
code it finds after seeing the label.
5
11/30/2007 25
Example:
Control Flow: Unique Labels
 Needed: a String-valued method that 
returns a different label each time it is 
called (e.g., L1, L2, L3, …)
 Variation: a set of methods that generate 
different kinds of labels for different 
constructs (can really help readability of 
the generated code)
 (while1, while2, while3, …; if1, if2, …; else1, 
else2, ….)
11/30/2007 26
If
 Source
if (cond) stmt
 x86
<code evaluating cond>
j
false
skip
<code for stmt>
skip:
11/30/2007 27
If-Else
 Source
if (cond) stmt1 else stmt2
 x86
<code evaluating cond>
j
false
else
<code for stmt1>
jmp done
else: <code for stmt2>
done:
11/30/2007 28
Boolean Expressions
 What do we do with this?
x > y
 It is an expression that evaluates to 
true or false
 Could generate the value (0/1 or whatever 
the local convention is)
 But normally we don’t want/need the 
value; we’re only trying to decide whether 
to jump
11/30/2007 29
Code for exp1 > exp2
 Basic idea: designate jump target, and 
whether to jump if the condition is true 
or if it is false
 Example: exp1 > exp2, target L123, 
jump on false
<evaluate exp1 to eax>
<evaluate exp2 to ecx>
cmp eax, ecx
jng L123
11/30/2007 30
Example exp1 < exp2
 Similar to other binary operators
 Difference: context is a target label and whether to 
jump if true or false
 x86:
<evaluate exp1 to eax>
gen(push eax)
<evaluate exp2 to eax>// eax contains exp2
gen(pop ecx) // ecx contains exp1
gen(cmp ecx, eax)
gen(condjump targetLabel)
 appropriate conditional jump depending on sense of test
Read More

FAQs on Code Generation

1. What is code generation?
Ans. Code generation is the process of automatically generating source code from a higher-level representation or model. It involves transforming a high-level specification into a lower-level implementation, which can be executed directly by a computer.
2. Why is code generation important?
Ans. Code generation is important as it saves time and effort by automating the process of writing repetitive or boilerplate code. It also helps in improving productivity, reducing the chances of human errors, and ensuring consistency across codebases.
3. What are the benefits of using code generation?
Ans. Using code generation offers several benefits, including increased productivity, reduced development time, improved code quality, and easier maintenance. It also facilitates code reuse, enables rapid prototyping, and provides a higher level of abstraction for developers.
4. How does code generation work?
Ans. Code generation works by taking a higher-level representation of the desired code, such as a model or specification, and automatically transforming it into executable code. This can be done through the use of templates, code templates, or by utilizing specific code generation tools or frameworks.
5. Are there any drawbacks or limitations to code generation?
Ans. While code generation brings numerous benefits, it also has some limitations. Code generated through automated processes may be less human-readable or harder to debug compared to manually written code. Additionally, code generation tools may have limitations in supporting complex or specific requirements, requiring manual intervention in some cases.
Download as PDF
Explore Courses for exam
Signup for Free!
Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.
10M+ students study on EduRev
Download free EduRev App
Track your progress, build streaks, highlight & save important lessons and more!