The Input: Three-Address Code (TAC) - The Universal Blueprint for Execution
Audio Book
Dive deep into the subject with an immersive audiobook experience.
Three-Address Code (TAC): The Compiler's Intermediate Language
Chapter 1 of 1
π Unlock Audio Chapter
Sign up and enroll to access the full audio experience
Chapter Content
Three-Address Code (TAC) is a fundamental Intermediate Representation (IR) that simplifies complex programs into a sequence of atomic operations, each typically involving at most three addresses, often using temporary variables.
Detailed Explanation
Before a compiler can generate the final machine instructions that your CPU understands, it first transforms the original source code into an intermediate, more manageable form. One of the most common and intuitive of these forms is called Three-Address Code, or TAC. The name "Three-Address" comes from a simple rule: most instructions in TAC refer to at most three locations β two for input values (operands) and one for storing the result. What truly defines TAC, however, is its atomic operations principle. This means that even a complex high-level statement like x = a + b * c is broken down into a series of incredibly simple, single-step operations. For instance, t1 = b * c, then t2 = a + t1, and finally x = t2. Notice the use of t1 and t2? These are temporary variables, generated by the compiler itself, to hold those intermediate results. They are crucial for systematically decomposing complex expressions. Control flow, like if-else statements or loops, is handled by explicit JUMP instructions. Why is TAC such an ideal input for the final code generation stage? First, its simplicity is a huge advantage. The code generator only needs to know how to translate a small, predefined set of atomic operations, rather than dealing with the full complexity of the original programming language's syntax tree. Second, its flat, sequential structure makes it perfect for applying many machine-independent optimizations β like removing redundant calculations β before the code is tailored to a specific CPU. Finally, TAC is relatively close to hardware, meaning each TAC instruction often maps directly to just one or a few actual machine instructions, making the final translation direct and efficient. TAC truly acts as a universal blueprint, streamlining the journey from your high-level code to the raw instructions your computer executes.
Examples & Analogies
Imagine you have a complex recipe from a famous chef that uses fancy terms and combines many steps. An Intermediate Representation like TAC is like a sous-chef taking that complex recipe and breaking it down into a list of incredibly simple, step-by-step instructions for an apprentice. Each step uses only a few ingredients at a time ("three addresses"), and any intermediate mixtures are put into clearly labeled temporary bowls ("temporary variables"). This simple, atomized list is then easy for any apprentice (the code generator) to follow, regardless of their native language (CPU architecture), and it's also easier to find ways to make the process more efficient (optimizations) before the final cooking (machine code generation) begins.
Key Concepts
-
Role of IR: Bridge between high-level source and low-level machine code.
-
Definition of TAC: Sequence of atomic operations with at most three addresses.
-
Atomic Operations: Simplification of complexity.
-
Temporary Variables: Crucial for managing intermediate results.
-
Explicit Control Flow: Using
JUMP,CALL,RETURN. -
Advantages of TAC: Simplifies code generation, enables machine-independent optimizations, close to hardware.
-
-
Examples
-
Complex Expression to TAC:
-
High-level:
y = (a + b) * c / d; -
TAC:
-
t1 = a + b -
t2 = t1 * c -
t3 = t2 / d -
y = t3 -
Conditional Statement to TAC:
-
High-level:
if (x < y) { z = 1; } else { z = 0; } -
TAC:
-
IF x < y GOTO L1 -
z = 0 -
GOTO L2 -
L1: z = 1 -
L2: -
Array Access to TAC:
-
High-level:
x = arr[i];(assuming 4-byte integers) -
TAC:
-
t1 = i * 4(calculate offset) -
t2 = arr + t1(calculate address of element) -
x = *t2(dereference to get value) -
-
Flashcards
-
Term: Three-Address Code (TAC)
-
Definition: Intermediate code with atomic ops, $\le$3 addresses per instruction.
-
Term: Atomic Operation
-
Definition: A single, elementary operation per TAC instruction.
-
Term: Temporary Variable
-
Definition: Compiler-generated variable to hold intermediate results in TAC.
-
Term: IR
-
Definition: Intermediate Representation; bridge between source and machine code.
-
Term: Advantage of TAC (for code gen)
-
Definition: Simplifies complexity, enables machine-independent opts, close to hardware.
-
-
Memory Aids
-
Mnemonic: For TAC's characteristics: A.T.S.J.
-
Atomic operations
-
Three addresses (at most)
-
Temporary variables (the second 'T' is for Temporary)
-
Sequential execution with Jumps
-
Analogy: Think of TAC as a very detailed, single-step instruction manual written in a universal language, that any specialized robot (code generator for a specific CPU) can easily follow. Each line is one simple, clear command.
-
Concept Connection: Remember TAC breaks down complex "thoughts" (high-level code) into simple "sentences" (atomic operations) that are easy to translate into the "electrical signals" (machine instructions) the CPU understands.
-
Examples & Applications
Complex Expression to TAC:
High-level: y = (a + b) * c / d;
TAC:
t1 = a + b
t2 = t1 * c
t3 = t2 / d
y = t3
Conditional Statement to TAC:
High-level: if (x < y) { z = 1; } else { z = 0; }
TAC:
IF x < y GOTO L1
z = 0
GOTO L2
L1: z = 1
L2:
Array Access to TAC:
High-level: x = arr[i]; (assuming 4-byte integers)
TAC:
t1 = i * 4 (calculate offset)
t2 = arr + t1 (calculate address of element)
x = *t2 (dereference to get value)
Flashcards
Term: Three-Address Code (TAC)
Definition: Intermediate code with atomic ops, $\le$3 addresses per instruction.
Term: Atomic Operation
Definition: A single, elementary operation per TAC instruction.
Term: Temporary Variable
Definition: Compiler-generated variable to hold intermediate results in TAC.
Term: IR
Definition: Intermediate Representation; bridge between source and machine code.
Term: Advantage of TAC (for code gen)
Definition: Simplifies complexity, enables machine-independent opts, close to hardware.
Memory Aids
Mnemonic: For TAC's characteristics: A.T.S.J.
Atomic operations
Three addresses (at most)
Temporary variables (the second 'T' is for Temporary)
Sequential execution with Jumps
Analogy: Think of TAC as a very detailed, single-step instruction manual written in a universal language, that any specialized robot (code generator for a specific CPU) can easily follow. Each line is one simple, clear command.
Concept Connection: Remember TAC breaks down complex "thoughts" (high-level code) into simple "sentences" (atomic operations) that are easy to translate into the "electrical signals" (machine instructions) the CPU understands.
Memory Aids
Interactive tools to help you remember key concepts
Memory Tools
For TAC's characteristics: A.T.S.J.
- Atomic operations
- Three addresses (at most)
- Temporary variables (the second 'T' is for Temporary)
- Sequential execution with Jumps
- **Analogy
Memory Tools
Remember TAC breaks down complex "thoughts" (high-level code) into simple "sentences" (atomic operations) that are easy to translate into the "electrical signals" (machine instructions) the CPU understands.
Flash Cards
Glossary
- Control Flow
The order in which individual statements, instructions, or function calls of an imperative program are executed. In TAC, managed by explicit
JUMP,CALL, andRETURNinstructions.
- Advantages of TAC
Simplifies code generation, enables machine-independent optimizations, close to hardware.
- Array Access to TAC
- High-level:
x = arr[i];(assuming 4-byte integers)
- High-level:
- Definition
Simplifies complexity, enables machine-independent opts, close to hardware.
- Concept Connection
Remember TAC breaks down complex "thoughts" (high-level code) into simple "sentences" (atomic operations) that are easy to translate into the "electrical signals" (machine instructions) the CPU understands.