➤ Introduction
Copy propagation allows us to perform optimizations of this form:

    l : x ← y
    k : instr(x)
    ⇒  k : instr(y)

It is natural to ask about the analogous transformation for compound expressions:

    l : x ← s1 ⊕ s2
    k : instr(x)
    ⇒  k : instr(s1 ⊕ s2)

However, this will not work most of the time. The result may not even be a valid instruction: for example, if instr(x) = (y ← x ⊕ 1), then instr(s1 ⊕ s2) = (y ← (s1 ⊕ s2) ⊕ 1) is not in our instruction language. Even if it is valid, we have made our program bigger, and possibly more expensive to run. However, we can consider the opposite direction: in a situation

    l : x ← s1 ⊕ s2
    ...
    k : y ← s1 ⊕ s2

we can replace the second computation of s1 ⊕ s2 by a reference to x (under some conditions), saving a redundant computation. This is called common subexpression elimination (CSE).
➤ Common Subexpression Elimination
The thorny issue for common subexpression elimination is determining when the optimization above may be performed. Consider the following program in SSA form:

    Lab1 : x ← a ⊕ b
           if (...) goto Lab2 else goto Lab3
    Lab2 : y ← a ⊕ b
           goto Lab4
    Lab3 : ...
           goto Lab4
    Lab4 : u ← a ⊕ b

If we want to use CSE to replace the calculation of a ⊕ b in Lab4, then there appear to be two candidates: we can rewrite u ← a ⊕ b as u ← x or as u ← y. However, only the first of these is correct! If control flow passes through Lab3 instead of Lab2, then y will not have the right value (indeed, any value at all) in Lab4.

In order to rewrite u ← a ⊕ b at some line k as u ← x, where x is defined at some line l, in general we need to know that x will have the right value when execution reaches line k. Being in SSA form helps us, because it lets us know that right-hand sides always have the same meaning if they are syntactically identical. But we also need to know that x is even defined along every control-flow path that takes us to Lab4.
What we would like to know is that every control flow path from the beginning of the code (that is, the beginning of the function we are compiling) to line k goes through line l. Then we can be sure that x has the right value when we reach k. This is the definition of the dominance relation between lines of code. We write l ≥ k if l dominates k and l > k if l strictly dominates k. We see how to define it in the next section; once it is defined we use it as follows:

    l : x ← s1 ⊕ s2
    k : y ← s1 ⊕ s2
    l ≥ k
    ⇒  k : y ← x
It was suggested in lecture that this optimization would be correct even if the binary operator is effectful. The reason is that if l dominates k then we always execute l first. If the operation does not raise an exception, then the use of x in k is correct. If it does raise an exception, we never reach k. So, yes, this optimization works even for binary operations that may potentially raise an exception.
➤ Dominance
On general control flow graphs, dominance is an interesting relation and there are several algorithms for computing it. We can cast it as a form of forward dataflow analysis. One approach exploits the simplicity of our language to directly generate the dominance relationship as part of code generation. We briefly discuss this here. The drawback is that if your code generation is slightly different or more efficient, or if your transformations change the essential structure of the control flow graph, then you need to update the relationship. A simple and fast algorithm that works particularly well in our simple language is described by Cooper et al. [CHK06]; it is empirically faster than the traditional Lengauer-Tarjan algorithm [LT79] (which is asymptotically faster). In this lecture, we consider just the basic cases.
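To make the graph-based route concrete, here is a minimal Python sketch of the Cooper et al. [CHK06] algorithm; the CFG representation (an entry node plus a successor map) is an assumption of the sketch, not something fixed by these notes. It numbers the nodes in reverse postorder and repeatedly intersects the dominator-tree paths of each node's already-processed predecessors until it reaches a fixed point.

    # A minimal sketch of the Cooper-Harvey-Kennedy dominance algorithm [CHK06].
    # Assumes a CFG given as an entry node and succs: node -> list of successors.

    def compute_idoms(entry, succs):
        # Number the reachable nodes in reverse postorder (entry comes first).
        order, seen = [], set()
        def dfs(n):
            seen.add(n)
            for s in succs.get(n, []):
                if s not in seen:
                    dfs(s)
            order.append(n)                  # postorder
        dfs(entry)
        rpo = list(reversed(order))          # reverse postorder
        index = {n: i for i, n in enumerate(rpo)}

        preds = {n: [] for n in rpo}
        for n in rpo:
            for s in succs.get(n, []):
                preds[s].append(n)

        idom = {entry: entry}                # by convention, the root points to itself

        def intersect(a, b):
            # Walk up the (partially built) dominator tree until the paths meet.
            while a != b:
                while index[a] > index[b]:
                    a = idom[a]
                while index[b] > index[a]:
                    b = idom[b]
            return a

        changed = True
        while changed:
            changed = False
            for n in rpo:
                if n is entry:
                    continue
                processed = [p for p in preds[n] if p in idom]
                new = processed[0]           # the DFS parent is always processed first
                for p in processed[1:]:
                    new = intersect(new, p)
                if idom.get(n) != new:
                    idom[n] = new
                    changed = True
        return idom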
For straight-line code the predecessor of each line is its immediate dominator, and any preceding line is a dominator.
For conditionals, consider if(e, s1, s2).

We translate this to the following code, where ě, ŝ1, and ŝ2 are the code for e, s1, and s2, respectively, and ê is the temp through which we can refer to the result of evaluating e.

    l0  : ě
    l'0 : if (ê) goto l1 else goto l2
    l1  : ŝ1
    l'1 : goto l3
    l2  : ŝ2
    l'2 : goto l3
    l3  : ...

In the corresponding control-flow graph, l'0 has edges to l1 and l2, and l'1 and l'2 both have edges to l3. Now the immediate dominator of l1 should be l'0, and the immediate dominator of l2 should also be l'0. For l3 we don't know if we arrive from l'1 or from l'2. Therefore, neither of these nodes dominates l3. Instead, its immediate dominator is again l'0, the last node we can be sure to traverse before we arrive at l3. Indicating immediate dominators with dashed red lines on the control-flow graph, we would find idom(l1) = idom(l2) = idom(l3) = l'0, with idom(l'1) = l1 and idom(l'2) = l2.
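We can check this claim against the compute_idoms sketch above (the string labels, with l0' standing for l'0 and so on, are just an encoding chosen for the sketch):

    # The conditional's CFG from above.
    succs = {
        "l0":  ["l0'"],
        "l0'": ["l1", "l2"],
        "l1":  ["l1'"],
        "l1'": ["l3"],
        "l2":  ["l2'"],
        "l2'": ["l3"],
        "l3":  [],
    }
    idom = compute_idoms("l0", succs)
    assert idom["l1"] == idom["l2"] == "l0'"
    assert idom["l3"] == "l0'"    # join point: dominated by the branch, not by either arm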
However, if it turns out that, say, l'1 is not reachable, then the dominator relationship looks different. This is the case, for example, if s1 is a return statement or is known to raise an error. Then l'2 is the only reachable predecessor of l3, and it becomes l3's immediate dominator.
In this case, l'1 : goto l3 is unreachable code and can be optimized away. Of course, the case where l'2 is unreachable is symmetric.
For loops, it is pretty easy to see that the beginning of the loop dominates all the statements in the loop. Again, consider the straightforward compilation of a while loop while(e, s), preceded by some statement at p:

    p   : ...
    p'  : goto l0
    l0  : ě
    l'0 : if (ê) goto l1 else goto l2
    l1  : ŝ
    l'1 : goto l0
    l2  : ...

In the corresponding control-flow graph, the loop header l0 has two predecessors: p' and the back edge from l'1.
Interesting here is mainly that the node p' just before the loop header l0 is indeed the immediate dominator of l0, even though l0 has l'1 as another predecessor. The definition makes this obvious: when we enter the loop we have to come through p'; on subsequent iterations we come from l'1. So we cannot be guaranteed to come through l'1, but we are guaranteed to come through p' on our way to l0.
➤ Implementing Common Subexpression Elimination
To implement common subexpression elimination we traverse the program, looking for definitions l : x ← s1 ⊕ s2. If the expression s1 ⊕ s2 is already in a hash table, associated there with a variable y defined at line k, we replace l : x ← s1 ⊕ s2 with l : x ← y, provided k dominates l. Otherwise, we add the expression, line, and variable to the hash table.
Dominance can usually be checked quite quickly if we maintain a dominator tree, where each line has a pointer to its immediate dominator. To check whether k dominates l, we follow these pointers starting from l until we either reach k (and so k ≥ l) or the root of the control-flow graph (in which case k does not dominate l).
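Putting the two together, here is a minimal Python sketch of the pass; the tuple encoding of instructions and the idom map (with the root mapped to itself) are assumptions of the sketch, not fixed by these notes.

    # A minimal sketch of CSE over SSA instructions l : x <- s1 (op) s2.
    # Assumes: instrs is a list of (line, dest, op, s1, s2) tuples in program order,
    # and idom maps each line to its immediate dominator (the root maps to itself).

    def dominates(k, l, idom):
        # Follow immediate-dominator pointers from l; k dominates l iff we reach k.
        while True:
            if l == k:
                return True
            if idom[l] == l:            # reached the root of the dominator tree
                return False
            l = idom[l]

    def cse(instrs, idom):
        table = {}                      # (op, s1, s2) -> (line k, variable y)
        out = []
        for (l, x, op, s1, s2) in instrs:
            key = (op, s1, s2)
            if key in table:
                k, y = table[key]
                if dominates(k, l, idom):
                    out.append((l, x, "move", y, None))   # l : x <- y
                    continue
            table[key] = (l, x)
            out.append((l, x, op, s1, s2))
        return out

Keying the table on the triple (operator, s1, s2) relies on SSA: syntactically identical right-hand sides are guaranteed to denote the same value.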
➤ Memory Optimization
Even on modern architectures with hierarchical memory caches, memory access, on average, is still significantly more expensive than register access or even most arithmetic operations. Therefore, memory optimizations play a significant role in generating fast code. As we will see, whether certain memory optimizations are possible or not depends on properties of the whole language. For example, whether or not we can obtain pointers to the middle of heap-allocated objects will be a crucial question to answer.
We will use a simple running example to illustrate applying common subexpression elimination to memory reads. In this example, mult(A, p, q) multiplies the 2×2 matrix A with the vector p and returns the result in the vector q.

    struct point {
      int x;
      int y;
    };

    void mult(int[] A, struct point* p, struct point* q) {
      q->x = A[0] * p->x + A[1] * p->y;
      q->y = A[2] * p->x + A[3] * p->y;
    }
Below is the translation into abstract assembly, with the small twist that we have allowed memory references of the form M[base + offset]. The memory optimization question we investigate is whether some load instructions t ← M[s] can be avoided because the corresponding value is already held in a temp.

    t1  ← M[p+0]        ; load p->x
    t2  ← M[A+0]        ; load A[0]
    t3  ← t2 * t1
    t4  ← M[p+4]        ; load p->y
    t5  ← M[A+4]        ; load A[1]
    t6  ← t5 * t4
    t7  ← t3 + t6
    M[q+0] ← t7         ; store q->x
    t8  ← M[A+8]        ; load A[2]
    t9  ← M[p+0]        ; load p->x again
    t10 ← t8 * t9
    t11 ← M[A+12]       ; load A[3]
    t12 ← M[p+4]        ; load p->y again
    t13 ← t11 * t12
    t14 ← t10 + t13
    M[q+4] ← t14        ; store q->y
We see that the source refers to p->x and p->y twice, and those are reflected in the two potentially redundant loads above (t9 and t12). Before you read on, consider whether we could replace those lines with t9 ← t1 and t12 ← t4. We can do that if we can be assured that memory at the addresses p + 0 and p + 4, respectively, has not changed since the previous load instructions.
It turns out that in C0 the second load is definitely redundant, but the first one may not be.
The first load (t9 ← M[p+0]) is not redundant because when this function is called, the pointers p and q might be the same (they might be aliased). When this is the case, the store to M[q+0] changes the value stored at M[p+0], so replacing the load with t9 ← t1 would give a different answer than the original program.

On the other hand, this cannot happen for the second load (t12 ← M[p+4]), because M[q+0] can never be the same location as M[p+4]: one accesses the x field and the other the y field of a struct.

Of course, the answer is most likely wrong anyway when p = q. One could either rewrite the code, or require p ≠ q as a precondition of the function.
In C, the question is more delicate because the use of the address-of (&) operator could obtain pointers to the middle of objects. For example, the argument int[] A would be int* A in C, and such a pointer might have been obtained with &q->x.
➤ Using the Results of Alias Analysis
In C0, the types of pointers are a powerful basis for alias analysis. Alias analysis is usually phrased as a may-alias analysis, because we try to infer which pointers in a program may alias. Then we know, for optimization purposes, that if two pointers are not in the may-alias relation, they must be different: writing to one address cannot change the value stored at the other.
Let’s consider how we might use the results of alias analysis, embodied in a predicate may-alias(a; b) for two addresses a and b. We assume we have a load instruction

    l : t ← M[a]

and we want to infer if its result is still available at some other load k : t' ← M[a], so we could replace that line with k : t' ← t. Our optimization rule has the same form as previous instances of common subexpression elimination:

    l : t ← M[a]
    k : t' ← M[a]
    l ≥ k
    avail(l; k)
    ⇒  k : t' ← t

The fact that l dominates k is sufficient here in SSA form to guarantee that the meaning of t and a remains unchanged; avail(l; k) is supposed to check that M[a] also remains unchanged.
Availability of memory references can be computed by a simple forward dataflow analysis. If a node has two or more incoming control flow edges, the value must be available along all of them. For the purpose of traversing loops we assume availability, essentially trying to find a counterexample in the loop. To express this concisely, our analysis rules propagate unavailability of a definition l : t ← M[a] to other instructions k that are dominated by l.
For unavailability, unavail(l; k), we have a seeding rule and a general propagation rule:

    l  : t ← M[a]
    l' : M[b] ← s          (a store, with may-alias(a; b))
    k the successor of l', with l > k
    ⇒  unavail(l; k)

    unavail(l; k)
    k' a successor of k, with l > k'
    ⇒  unavail(l; k')

Because we are in SSA, we know in the seeding rule that l > k, where k is the (unique) successor of the store l'. The propagation rule includes the cases of jumps and conditional jumps. This ensures that at a node with multiple predecessors, if a value is unavailable in just one of them, it will be unavailable at the node. Function calls can also seed unavailability:

    l  : t ← M[a]
    l' : call f(..., b, ...)    (some argument b is a pointer or array)
    k the successor of l', with l > k
    ⇒  unavail(l; k)
Unfortunately, it is enough if just one of the function arguments is a memory reference, because from one memory reference we may be able to reach another by following pointers and offsets.
With more information on the shape of memory this rule can be relaxed.
From unavailability we can deduce which memory values are still available, namely those that are not unavailable (restricting attention to lines dominated by the load; otherwise the question is not asked):

    l : t ← M[a]
    l > k
    ¬unavail(l; k)
    ⇒  avail(l; k)

Note that stratification is required: we need to saturate unavail(l; k) before applying this rule.
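As a concrete rendering of these rules, here is a minimal Python sketch; the list-of-tuples program representation and the dominates and may_alias helpers are assumptions of the sketch, not fixed by these notes.

    # A minimal sketch of computing unavail(l, k) for loads l : t <- M[a].
    # Assumes: loads = [(l, a)]; stores = [(l2, b)]; calls = lines of call sites
    # that pass at least one pointer or array argument; succs maps each line to
    # its successor lines; dominates(a, b) is true iff line a dominates line b;
    # may_alias(a, b) comes from alias analysis.

    def unavailability(loads, stores, calls, succs, dominates, may_alias):
        unavail = set()
        # Seeding: a store to a possibly aliased address, or a call that
        # receives a memory reference, kills the load for its successors.
        for (l, a) in loads:
            killers = [l2 for (l2, b) in stores if may_alias(a, b)] + list(calls)
            for l2 in killers:
                for k in succs.get(l2, []):
                    if dominates(l, k):
                        unavail.add((l, k))
        # Propagation: unavailability flows forward along CFG edges,
        # restricted to lines dominated by the load.
        changed = True
        while changed:
            changed = False
            for (l, k) in list(unavail):
                for k2 in succs.get(k, []):
                    if dominates(l, k2) and (l, k2) not in unavail:
                        unavail.add((l, k2))
                        changed = True
        return unavail   # avail(l, k) iff dominates(l, k) and (l, k) not in unavail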
➤ Type-Based Alias Analysis
The simplest form of alias analysis is based on the type and offset of the address. We call this an alias class, with the idea that pointers in different alias classes cannot alias. The basic predicate here is class(a; τ; offset), which expresses that a is an address derived from a source of type τ, at offset offset from the start of the memory of type τ. Then the may-alias relation is defined by

    class(a; τ; k)
    class(b; τ; k)
    ⇒  may-alias(a; b)

that is, two addresses may alias only if they agree on both type and offset.
There are a couple of special cases we do not treat explicitly. For example, the location of the array length (which is stored in safe mode, at least) may be at offset -8. But such a location can never be written to (array lengths never change once allocated), so a load of the array length is available at all locations dominated by the load.
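In code, the may-alias check is then just a comparison of alias classes. Below is a minimal Python sketch, anticipating the unknown offset ⊤ introduced later in this section; the pair representation of class facts is an assumption of the sketch.

    # A class fact is a pair (tau, offset), where offset is an int or TOP (unknown).
    TOP = "TOP"

    def may_alias(class_a, class_b):
        (tau_a, off_a) = class_a
        (tau_b, off_b) = class_b
        if tau_a != tau_b:
            return False           # different alias classes never alias
        if off_a == TOP or off_b == TOP:
            return True            # unknown offset: must be conservative
        return off_a == off_b      # same type: alias only at equal offsets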
The seeding of the class relation comes from the compiler, which annotates an address with this information. In our example,
the compiler would generate
class(A; int[ ]; 0)
class(p; struct point*; 0)
class(q; struct point*; 0)
We now propagate the information through a forward dataflow analysis. For example:

    l : a' ← a
    class(a; τ; k)
    ⇒  class(a'; τ; k)

    l : a' ← a + $n
    class(a; τ; k)
    ⇒  class(a'; τ; k + n)
In the second case we have written $n to emphasize that the second summand is a constant n. Unfortunately, if it is a variable, we cannot precisely calculate the offset.
This may happen with arrays, but not with pointers, including pointers to structs.
So we need to generalize the third argument of class to be either a constant or ⊤, which indicates that any value may be possible. We then have, for example:

    l : a' ← a + b        (b a variable)
    class(a; τ; k)
    ⇒  class(a'; τ; ⊤)

Now ⊤ behaves like an information sink. For example, ⊤ + k = k + ⊤ = ⊤. Since in SSA form a is defined only once, we should not have to change our mind about the class assigned to a variable. However, at parameterized jump targets (which are equivalent to φ-functions), we need to “disjoin” the information, so that if the argument is known to be k at one predecessor but unknown (⊤) at another predecessor, the result is ⊤.
Because of loops, we then need to generalize further and introduce ⊥, which means that we believe (for now) that the variable is never used. Because of the seeding by the compiler, this will mostly happen for loop variables. The values are arranged in a lattice:

                  ⊤
          /   /   |   \   \
    ...  -4   0   4   8   ...
          \   \   |   /   /
                  ⊥

where at the bottom we have the most information and at the top the least. The join operation ⊔ between lattice elements finds the least upper bound of its two arguments.
For example, 0 ⊔ 4 = ⊤ and ⊥ ⊔ 2 = 2. We use it in SSA form to combine information about offsets. We now read an assertion class(a; τ; k) as saying that the offset is at least k under the lattice ordering. Then we have, at a parameterized jump target l(x) reached by goto l(a1) and goto l(a2):

    class(a1; τ; k1)
    class(a2; τ; k2)
    ⇒  class(x; τ; k1 ⊔ k2)
Because of loops we might perform this calculation multiple times until we have reached a fixed point. In this case the fixed point is the least upper bound of all the offset classes we compute, which is a little different from the saturated database we considered before.
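A minimal Python sketch of this offset lattice, with BOT and TOP as assumed sentinels standing for ⊥ and ⊤:

    # A minimal sketch of the flat offset lattice: BOT <= any constant <= TOP.
    BOT, TOP = "BOT", "TOP"

    def join(k1, k2):
        # Least upper bound of two lattice elements.
        if k1 == BOT:
            return k2
        if k2 == BOT:
            return k1
        if k1 == k2:
            return k1
        return TOP                 # two distinct constants join to TOP

    def offset_add(k, n):
        # Abstract addition of a constant; TOP and BOT absorb.
        if k == TOP or k == BOT:
            return k
        return k + n

    # The examples from the text:
    assert join(0, 4) == TOP
    assert join(BOT, 2) == 2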
This is an example of abstract interpretation, which may be the subject of a future lecture. One can obtain a more precise alias analysis by refining the abstract domain, which here is the lattice shown above.
➤ Allocation-Based Alias Analysis
Another technique to infer that pointers may not alias is based on their allocation point. In brief, if two pointers are allocated with different calls to alloc or alloc_array, then they cannot be aliased. Because allocation may happen in a different function than the one we are currently compiling (and hopefully optimizing), this is an example of an interprocedural analysis.