Data Flow Analysis
- Properties of data taken into consideration the control flow in a function (also known as flow-sensitive analysis)
- Intraprocedural analysis
Examples of optimizations
- Constant propagation
- Common subexpression elimination
- Dead code elimination
Value of x?
Which "definition" defines x?
Is the definition still meaningful (live)?
Static program vs. dynamic execution

- Statically: Finite program
- Dynamically: Can have infinitely many possible execution paths
- Data flow analysis abstraction:
- For each static point in the program:
combines information of all the dynamic instances of the same program point.
- Example of a data flow question:
- Which definition defines the value used in statement "b = a"?
Reaching Definitions

- Every assignment is a definition
- A definition d reaches a point p
- if there exists a path from the point immediately following d to p
- such that d is not killed (overwritten) along that path.
- Problem statement
- For each point in the program, determine if each definition in the program reaches the point
- A bit vector per program point, vector-length = #defs
Data Flow Analysis Schema

- Build a flow graph (nodes = basic blocks, edges = control flow)
- Set up a set of equations between in[b] and out[b] for all basic blocks b
- Effect of code in basic block:
Transfer function fb relates in[b] and out[b], for same b - Effect of flow of control:
relates out[b1], in[b2] if b1 and b2 are adjacent
- Find a solution to the equations
Effects of a Basic Block

- A statement s (d: x = y + z)
out[s] = fs(in[s]) = Gen[s] U (in[s]-Kill[s])- Gen[s]: definitions generated: Gen[s] = {d}
- in[s]-Kill[s] : propagated definitions: in[s] - Kill[s], where Kill[s]=set of all other defs to x in rest of program
- out[B] = fd2fd1fd0(in[B])
=Gen[d2] U (Gen[d1] U (Gen[d0] U (in[B]-Kill[d0]))-Kill[d1])) -Kill[d2]
= Gen[d2] U (Gen[d1] U (Gen[d0] - Kill[d1]) - Kill[d2]) U in[B] - (Kill[d0] U Kill[d1] U Kill[d2])
= Gen[B] U (in[B] - Kill[B]) - Gen[B]: locally exposed definitions (available at end of bb)
- Kill[B]: set of definitions killed by B
Effects of the Edges (acyclic)

- Join node: a node with multiple predecessors
- meet operator (∧): ∪
in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], where
p1, ..., pn are predecessors of b
Cyclic Graphs

- Equations still hold
- out[b] = fb(in[b])
- in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], p1, ..., pn pred.
- Find: fixed point solution
Reaching Definitions: Iterative Algorithm
input: control flow graph CFG = (N, E, Entry, Exit)
// Boundary condition
OUT[Entry] = ø
// Initialization for iterative algorithm
For each basic block B other than Entry
OUT[B] = ø
// iterate
While (changes to any OUT occur) {
For each basic block B other than Entry {
in[B] = ∪ (out[p]), for all predecessors p of B
out[B] = fB(in[B]) // out[B]=gen[B]∪(in[B]-kill[B])
}
Summary of Reaching Definitions
II. Live Variable Analysis
Definition
- A variable v is live at point p
if the value of v is used
along some path in the flow graph starting at p. - Otherwise, the variable is dead.
Problem statement
- For each basic block b,
- determine if each variable is live at the start/end point of b
- Size of bit vector: one bit for each variable
Effects of a Basic Block (Transfer Function)
- Observation:Trace uses back to the definitions

- Direction: backward: in[b] = fb(out[b])
- Transfer function for statement s: x = y + z
- generate live variables: Use[s] = {y, z}
- propagate live variables: out[s] - Def[s], Def[s] = x
- in[s] = Use[s] ∪ (out(s)-Def[s])
- Transfer function for basic block b:
- in[b] = Use[b] ∪ (out(b)-Def[b])
- Use[b], set of locally exposed uses in b,
uses not covered by definitions in b - Def[b]= set of variables defined in b.b.
Liveness: Iterative Algorithm
input: control flow graph CFG = (N, E, Entry, Exit)
// Boundary condition
IN[Exit] = ø
// Initialization for iterative algorithm
For each basic block B other than Exit
IN[B] = ø
// iterate
While (changes to any IN occur) {
For each basic block B other than Exit {
out[B] = ∪ (in[s]), for all successors of B
in[B] = fB(out[B]) // in[B]=Use[B]∪(out[B]-Def[B])
}
Framework

Problem 1: "Must-Reach" Definitions
- A definition D (a = b+c) must reach point P iff
- D appears at least once along on all paths leading to P
- a is not redefined along any path after last appearance of D and before P
- How do we formulate the data flow algorithm for this problem?
Problem 2: A legal solution to (May) Reaching Def?
