Compiler Design

An introduction to compiler design: the structure of a compiler, how it analyzes and transforms a program, how parsing works, and how DAGs (directed acyclic graphs) are used for basic blocks.

What Exactly Is Compiler Design?

Compiler design is the set of principles and techniques that govern how a compiler analyzes, translates, and optimizes a program. A compiler is a program that translates source code written in a high-level language into a low-level language, such as assembly or machine code, without changing the meaning of the program.

Compiler Architecture

Compilation is divided into two parts:

Analysis Phase: The analysis phase is also known as the front end. It reads the source code, breaks it into its constituent parts, and checks for lexical, syntax, and semantic errors. It then generates an intermediate representation of the source code, together with a symbol table, and passes both to the synthesis phase as input.

Synthesis Phase: The synthesis phase is also known as the back end. It produces the target code from the intermediate representation and the symbol table.

Phases of Compiler Design

Source code passes through several phases of the compiler before the target code is produced; together these phases make up the compilation process. Each phase takes its input from the previous phase and feeds its output to the next one. The phases are described below.

Lexical Analysis [Lexer, Tokenization, Scanner]

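This phase reads the program as a stream of characters and groups them into lexemes, the meaningful character sequences of the language. Each lexeme is classified as a token such as an identifier, keyword, operator, constant, separator, or special character. Whitespace and comments are usually discarded at this stage.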
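The following is a minimal sketch of a lexer, not a definitive implementation. The token categories, the keyword list, and the `tokenize` function are assumptions made for illustration; a real language defines its own.

```python
import re

# Hypothetical token categories for a tiny C-like language (illustrative only).
# Order matters: KEYWORD must be tried before IDENTIFIER.
TOKEN_SPEC = [
    ("CONSTANT",   r"\d+"),
    ("KEYWORD",    r"\b(?:int|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"==|!=|<=|>=|[+\-*/=<>]"),
    ("SEPARATOR",  r"[();{},]"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group characters into lexemes and label each one with a token kind."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":          # whitespace carries no meaning here
            continue
        if kind == "MISMATCH":      # a character no rule recognizes
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("int x = 42;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'x'), ('OPERATOR', '='), ('CONSTANT', '42'), ('SEPARATOR', ';')]
```

Each pair holds the token kind and the lexeme it was recognized from, which is the form the parser consumes next.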

Syntax Analysis

The syntax analysis phase is also called parsing. It takes the tokens produced by lexical analysis as input and builds a tree called the parse tree or syntax tree. In doing so, the parser checks whether the token sequence conforms to the grammar of the language. There are two types of parsing: top-down and bottom-up.

Top-down parser: A top-down parser begins at the start symbol and expands it until it reaches the terminals, that is, the input tokens. It follows a leftmost derivation. Top-down parsers are divided into two types: recursive descent parsers and non-recursive (table-driven) predictive parsers.
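As an illustration of the top-down approach, here is a small recursive descent parser sketch. The grammar (expr, term, factor), the tuple-based tree shape, and the `parse` function are assumptions chosen for this example, not part of any particular compiler.

```python
import re


def parse(text):
    """Recursive descent parser for a hypothetical arithmetic grammar:

        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | IDENTIFIER | '(' expr ')'

    Returns a nested tuple (operator, left, right) as the syntax tree.
    """
    # Simplified tokenizer: characters that match no pattern are silently skipped.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():                       # one function per non-terminal
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok                    # a number or an identifier (a terminal)

    tree = expr()                     # start from the start symbol
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("a + b * 2"))    # ('+', 'a', ('*', 'b', '2'))
print(parse("(a + b) * 2"))  # ('*', ('+', 'a', 'b'), '2')
```

Parsing starts at `expr`, the start symbol, and works down to the terminals, which mirrors the description above.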

Bottom-up parser: A bottom-up parser begins at the terminals, that is, the input tokens, and reduces them step by step until it reaches the start symbol. It traces a rightmost derivation in reverse. Bottom-up parsers are classified into two types: LR parsers and operator precedence parsers.
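For contrast, the sketch below shows the bottom-up idea in the style of an operator precedence parser: tokens are shifted onto stacks and reduced into subtrees once a lower-or-equal precedence operator arrives. The precedence table, the `shift_reduce` name, and the lack of parenthesis support are simplifying assumptions.

```python
# Hypothetical precedence table: a higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def shift_reduce(tokens):
    """Build a syntax tree bottom-up from a flat list of operand/operator tokens."""
    operands, operators = [], []

    def reduce():
        # Replace "left op right" on the stacks with a single subtree.
        if len(operands) < 2:
            raise SyntaxError("operator is missing an operand")
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok in PRECEDENCE:
            # Reduce while the operator on the stack binds at least as tightly.
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]:
                reduce()
            operators.append(tok)     # shift the operator
        else:
            operands.append(tok)      # shift the operand

    while operators:                  # input exhausted: reduce what is left
        reduce()
    if len(operands) != 1:
        raise SyntaxError("malformed expression")
    return operands[0]


print(shift_reduce(["a", "+", "b", "*", "c"]))  # ('+', 'a', ('*', 'b', 'c'))
print(shift_reduce(["a", "*", "b", "+", "c"]))  # ('+', ('*', 'a', 'b'), 'c')
```

The tree is assembled from the leaves upward, and the root, which corresponds to the start symbol, is produced last.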

Semantic Analysis

This phase checks whether the tree produced by the parser is consistent with the semantic rules of the language. The semantic analyzer keeps track of the types of identifiers and expressions, checks that operands are type-compatible, and checks that identifiers are declared before use. It produces an annotated syntax tree as output.
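Two of these checks, declaration before use and type compatibility, can be sketched as follows. The statement format (`"declare"` and `"assign"` tuples), the two types, and the rule that a float cannot be assigned to an int are assumptions made only for this example.

```python
def type_of(operand, declared):
    """Return the type of a literal or declared name, or None if undeclared."""
    if isinstance(operand, int):
        return "int"
    if isinstance(operand, float):
        return "float"
    return declared.get(operand)


def check(statements):
    """Collect semantic errors from a list of hypothetical statements.

    ("declare", name, type)       declares a variable
    ("assign", target, operands)  assigns an expression built from operands
    """
    declared = {}                     # name -> declared type
    errors = []
    for lineno, stmt in enumerate(statements, start=1):
        if stmt[0] == "declare":
            _, name, type_name = stmt
            if name in declared:
                errors.append(f"line {lineno}: '{name}' redeclared")
            else:
                declared[name] = type_name
        elif stmt[0] == "assign":
            _, target, operands = stmt
            result_type = "int"
            for operand in operands:
                operand_type = type_of(operand, declared)
                if operand_type is None:
                    errors.append(f"line {lineno}: '{operand}' used before declaration")
                elif operand_type == "float":
                    result_type = "float"     # any float operand widens the result
            if target not in declared:
                errors.append(f"line {lineno}: '{target}' used before declaration")
            elif declared[target] == "int" and result_type == "float":
                errors.append(f"line {lineno}: cannot assign float to int '{target}'")
    return errors


program = [
    ("declare", "x", "int"),
    ("assign", "x", [1, 2]),
    ("assign", "y", ["x"]),        # y was never declared
    ("declare", "f", "float"),
    ("assign", "x", ["f", 1]),     # float expression assigned to an int
]
for error in check(program):
    print(error)
# line 3: 'y' used before declaration
# line 5: cannot assign float to int 'x'
```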

Intermediate Code Generation

After semantic analysis, the compiler generates an intermediate representation of the source program. This intermediate code sits between the high-level and the low-level language: it is largely independent of the target machine, yet simple enough to be translated easily into target machine code. Three-address code is a common form.
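One widely used intermediate form is three-address code, in which each instruction has at most one operator. The sketch below lowers a syntax tree into that form; the tuple tree shape and the temporary names `t1`, `t2`, ... are assumptions for illustration.

```python
from itertools import count


def to_three_address(tree):
    """Lower a nested (operator, left, right) tuple into three-address code.

    Returns the list of instructions and the name that holds the final result.
    """
    code = []
    temp_ids = count(1)

    def emit(node):
        if not isinstance(node, tuple):   # a leaf: a name or a constant
            return str(node)
        op, left, right = node
        left_name = emit(left)            # operands are lowered first
        right_name = emit(right)
        temp = f"t{next(temp_ids)}"       # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result


code, result = to_three_address(("+", "a", ("*", "b", "c")))
print(code)    # ['t1 = b * c', 't2 = a + t1']
print(result)  # t2
```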

Code Optimization

In code optimization, the intermediate code is improved so that the resulting program runs faster or uses fewer resources. The optimizer removes unnecessary code and rearranges the order of statements without changing the meaning of the program.
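Two classic optimizations, constant folding and dead code elimination, can be sketched on a toy intermediate code. The `(dest, op, a, b)` instruction format, the `"const"` pseudo-operator, and both function names are assumptions for this example; the input is assumed to use only `+`, `-`, and `*`.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Evaluate operations on known constants at compile time."""
    known = {}                        # name -> constant value it currently holds
    out = []
    for dest, op, a, b in code:
        a = known.get(a, a)           # substitute constants already computed
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)
            known[dest] = value
            out.append((dest, "const", value, None))
        else:
            known.pop(dest, None)     # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out


def eliminate_dead_code(code, live_out):
    """Drop instructions whose result is never used (single backward pass)."""
    live = set(live_out)              # names needed after the block
    kept = []
    for dest, op, a, b in reversed(code):
        if dest in live:
            kept.append((dest, op, a, b))
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
    kept.reverse()
    return kept


code = [
    ("t1", "*", 4, 2),        # both operands constant: folded to 8
    ("t2", "+", "x", "t1"),   # becomes t2 = x + 8
    ("t3", "*", "x", "x"),    # t3 is never used: dead code
    ("y", "+", "t2", 1),
]
optimized = eliminate_dead_code(fold_constants(code), live_out={"y"})
print(optimized)  # [('t2', '+', 'x', 8), ('y', '+', 't2', 1)]
```

Four instructions shrink to two, and the value of `y` at the end of the block is unchanged.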

Code Generation

This phase maps the optimized intermediate code to the target machine. The code generator translates the intermediate code into a sequence of relocatable machine code (or assembly) instructions that perform the same task as the intermediate code.
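A deliberately naive sketch of this mapping is shown below. The target is a hypothetical single-register machine; the `LOAD`, `STORE`, `ADD`, `SUB`, `MUL`, and `DIV` mnemonics and the `(dest, op, a, b)` instruction format are invented for illustration and do not correspond to a real instruction set.

```python
# Hypothetical mnemonics for an imaginary single-register machine.
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(code):
    """Translate (dest, op, a, b) instructions into pseudo-assembly.

    Every instruction is expanded the same way: load the first operand,
    apply the operation with the second, store the result. A real code
    generator would also allocate registers and avoid redundant loads.
    """
    asm = []
    for dest, op, a, b in code:
        asm.append(f"LOAD R0, {a}")
        asm.append(f"{MNEMONIC[op]} R0, {b}")
        asm.append(f"STORE {dest}, R0")
    return asm


for line in generate([("t1", "*", "b", "c"), ("x", "+", "a", "t1")]):
    print(line)
# LOAD R0, b
# MUL R0, c
# STORE t1, R0
# LOAD R0, a
# ADD R0, t1
# STORE x, R0
```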

Symbol Table

A symbol table is a data structure maintained by the compiler to store information about the names in a program, such as variables, functions, objects, and classes. Both the analysis and the synthesis phases consult it: the analysis phase records and checks names, and the synthesis phase uses the stored information to generate target code.

A symbol table carries out the following roles:

  • To store names of the entities in a structured form in a single place.
  • To check whether a variable is declared.
  • To resolve the scope of a name.
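The roles listed above can be sketched with a stack of dictionaries, one per scope. The `SymbolTable` class and its method names are assumptions for illustration; production compilers often use more elaborate structures.

```python
class SymbolTable:
    """A minimal scoped symbol table: a stack of dictionaries."""

    def __init__(self):
        self.scopes = [{}]            # the first entry is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, info):
        scope = self.scopes[-1]       # declarations go into the innermost scope
        if name in scope:
            raise NameError(f"'{name}' already declared in this scope")
        scope[name] = info

    def lookup(self, name):
        # Scope resolution: search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                   # the name is not declared anywhere


table = SymbolTable()
table.declare("x", {"type": "int"})
table.enter_scope()
table.declare("x", {"type": "float"})     # shadows the outer x
print(table.lookup("x"))                  # {'type': 'float'}
table.exit_scope()
print(table.lookup("x"))                  # {'type': 'int'}
print(table.lookup("y"))                  # None
```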

Basic Block

A basic block is a straight-line sequence of statements with a single entry point and a single exit point. The structure of a basic block can be represented by a directed acyclic graph (DAG). The DAG shows how values flow between the statements of the block and exposes opportunities to improve the block.

  • A DAG is used to apply transformations to a basic block.
  • A DAG makes these transformations easier, because identical computations share a single node.
  • A DAG shows how a value computed by one statement is used by later statements.
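A minimal sketch of DAG construction for a basic block follows. The `(dest, op, a, b)` statement format and the `build_dag` function are assumptions for illustration; the point is that an identical computation is given the same node instead of a new one.

```python
def build_dag(block):
    """Build a DAG for a basic block of (dest, op, a, b) statements.

    Returns the node list and a map from each name to the node that
    currently holds its value.
    """
    nodes = []      # node id -> ("leaf", name) or (op, left_id, right_id)
    index = {}      # node description -> node id, so equal nodes are shared
    current = {}    # name -> id of the node holding its current value

    def node_for(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        # A name that has not been assigned in the block is an input (leaf).
        if name not in current:
            current[name] = node_for(("leaf", name))
        return current[name]

    for dest, op, a, b in block:
        current[dest] = node_for((op, operand(a), operand(b)))
    return nodes, current


block = [
    ("a", "+", "b", "c"),
    ("d", "+", "b", "c"),   # same computation as a: reuses the node
    ("e", "*", "a", "d"),
]
nodes, current = build_dag(block)
print(nodes)
# [('leaf', 'b'), ('leaf', 'c'), ('+', 0, 1), ('*', 2, 2)]
print(current["a"] == current["d"])  # True
```

Because `a` and `d` map to the same node, `b + c` needs to be computed only once, which is the common subexpression a DAG makes visible.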

Conclusion

Compiler design describes how programs written in a high-level language are translated into low-level machine language to produce an executable program. Studying it shows what happens in each phase of compilation and how the phases work together to produce efficient output. Knowing the concepts of compiler design is beneficial for anyone interested in programming.