What are the steps to convert a program from source code to an executable file?
Pre-compile
cpp hello.c > hello.i
gcc -E hello.c -o hello.i
delete all #define directives and expand every macro
handle all conditional directives: #if, #ifdef, #elif, #else, #endif
handle #include: insert the contents of the included file at the position of the directive. This is recursive, because an included file may include other files.
delete all comments
add line numbers and file name markers, such as # 1 "hello.c". The compiler uses them to generate line numbers for debugging and to show the line number when it reports an error or a warning.
keep all #pragma directives, since the compiler uses them.
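To see these steps at work, here is a small C fragment to feed to gcc -E. It is only a sketch: the names SQUARE, VERBOSE, square_area and current_line are invented for this example. The output should show the macro expanded, the #if branch dropped, the comments gone and line markers added, but the exact text depends on the compiler and on the contents of stdio.h.

```c
#include <stdio.h>

/* Hypothetical macros, invented for this sketch. */
#define SQUARE(x) ((x) * (x))
#define VERBOSE 0

/* After preprocessing, the body reads roughly: return ((side) * (side)); */
int square_area(int side)
{
#if VERBOSE
    /* This branch is dropped because VERBOSE is 0. */
    printf("side = %d\n", side);
#endif
    return SQUARE(side);
}

/* __LINE__ is replaced by the preprocessor with the current line number. */
int current_line(void)
{
    return __LINE__;
}
```

Changing VERBOSE to 1 and running gcc -E again should make the printf call appear in the output.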
Compile
The compiler turns the pre-compiled file into assembly code by passing it through the Scanner, the Grammar Parser, the Semantic Parser and the Optimizer.
gcc -S hello.i -o hello.s
or directly use:
gcc -S hello.c -o hello.s
Assemble
The assembler translates assembly language into machine code. It works from a reference table that records which machine code corresponds to each assembly instruction.
You can use the following command to produce the object file, which the machine can recognize:
gcc -c hello.s -o hello.o
Link
The linker links all object files together to produce an executable file.
How does the compiler work?
The front end of the compiler
Scanner
The Scanner reads the source code and uses a “Finite State Machine” to split it into tokens.
These tokens are classified as:
Keyword
Identifier
Literal (numbers, strings)
Special Symbol
While scanning, identifiers are stored in a symbol table, and literals such as numbers and strings in a literal table.
The program called lex implements this step. It splits its input into tokens according to rules written by the user. So the scanner doesn’t need to be developed from scratch; programmers only change the rules to fit their requirements.
Note that for the C programming language, macro expansion and file inclusion are not done by the scanner. They happen earlier, in the pre-compile step.
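The scanning step can be sketched in C. This is a hand-written toy, not what lex generates: the names Token, TokenKind and next_token are made up, the keyword list is deliberately incomplete, and only integer literals and single-character symbols are recognized. Each branch of the if chain plays the role of one state of the finite state machine.

```c
#include <ctype.h>
#include <string.h>

/* Token classes from the list above; TOK_END marks the end of the input. */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_LITERAL, TOK_SYMBOL, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the token */
    size_t length;      /* number of characters in the token */
} Token;

/* A tiny, incomplete keyword list, chosen only for this sketch. */
static int is_keyword(const char *s, size_t n)
{
    static const char *const keywords[] = { "int", "return", "if", "else", "while" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strlen(keywords[i]) == n && strncmp(keywords[i], s, n) == 0)
            return 1;
    }
    return 0;
}

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor)
{
    const char *p = *cursor;
    while (isspace((unsigned char)*p))
        p++;

    Token tok = { TOK_END, p, 0 };
    if (*p == '\0') {
        /* End of input: keep TOK_END. */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* Letters, digits and '_' form a keyword or an identifier. */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        tok.length = (size_t)(p - tok.start);
        tok.kind = is_keyword(tok.start, tok.length) ? TOK_KEYWORD : TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        /* A run of digits forms an integer literal. */
        while (isdigit((unsigned char)*p))
            p++;
        tok.length = (size_t)(p - tok.start);
        tok.kind = TOK_LITERAL;
    } else {
        /* Anything else is treated as a one-character special symbol. */
        p++;
        tok.length = 1;
        tok.kind = TOK_SYMBOL;
    }
    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on the text int x = 42; should give a keyword, an identifier, a symbol, a literal and a symbol, followed by TOK_END.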
Grammar Parser
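The grammar parser builds the syntax tree by analyzing the tokens produced by the scanner. The process uses a Context-free Grammar.
The syntax tree is a tree whose nodes are expressions. Symbols and numbers are the smallest expressions: they are not made up of other expressions, so they are the leaves of the tree. While analyzing the grammar, the parser also determines the priority and the meaning of the operators. Some symbols have more than one meaning. In C, for example, * is both the multiplication operator and the pointer dereference operator.
If an expression is illegal, for example because a parenthesis is not closed, the compiler reports the error in the grammar parser.
The tool called yacc (Yet Another Compiler Compiler) generates a grammar parser from grammar rules written by the user. Such a tool is called a Compiler Compiler.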
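How the tree reflects operator priority can be shown with a small hand-written recursive-descent parser. This is a sketch under assumptions, not the table-driven parser that yacc generates: the names Node, parse_expr, parse_term and parse_factor are invented, only integers, + - * / and parentheses are supported, and nodes are never freed.

```c
#include <ctype.h>
#include <stdlib.h>

/* Syntax tree node: a leaf holds a number, an inner node holds an operator. */
typedef struct Node {
    char op;                    /* '+', '-', '*', '/' or 0 for a number leaf */
    long value;                 /* used only when op == 0 */
    struct Node *left, *right;
} Node;

static Node *new_node(char op, long value, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->op = op; n->value = value; n->left = left; n->right = right;
    return n;
}

static void skip_spaces(const char **s)
{
    while (isspace((unsigned char)**s))
        (*s)++;
}

Node *parse_expr(const char **s);

/* factor := number | '(' expr ')' */
static Node *parse_factor(const char **s)
{
    skip_spaces(s);
    if (**s == '(') {
        (*s)++;
        Node *inner = parse_expr(s);
        skip_spaces(s);
        if (inner == NULL || **s != ')')
            return NULL;        /* illegal expression: missing ')' */
        (*s)++;
        return inner;
    }
    if (!isdigit((unsigned char)**s))
        return NULL;            /* illegal expression: operand expected */
    char *end;
    long v = strtol(*s, &end, 10);
    *s = end;
    return new_node(0, v, NULL, NULL);
}

/* term := factor (('*' | '/') factor)*   binds tighter than + and - */
static Node *parse_term(const char **s)
{
    Node *left = parse_factor(s);
    for (;;) {
        skip_spaces(s);
        if (left == NULL || (**s != '*' && **s != '/'))
            return left;
        char op = *(*s)++;
        Node *right = parse_factor(s);
        left = right ? new_node(op, 0, left, right) : NULL;
    }
}

/* expr := term (('+' | '-') term)* */
Node *parse_expr(const char **s)
{
    Node *left = parse_term(s);
    for (;;) {
        skip_spaces(s);
        if (left == NULL || (**s != '+' && **s != '-'))
            return left;
        char op = *(*s)++;
        Node *right = parse_term(s);
        left = right ? new_node(op, 0, left, right) : NULL;
    }
}
```

For the input 2 + 3 * 4 the root of the tree is the + node and the * node is its right child, because a term is parsed completely before + is considered. For an illegal input such as (2 + 3 the function returns NULL.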
Semantic Parser
The grammar parser only checks the form of an expression; it does not know whether the expression is meaningful. The semantic parser checks the meaning of statements.
Semantics has two aspects:
Static
Static semantics can be determined at compile time.
It includes checking that identifiers and types match.
Dynamic
Dynamic semantics can only be determined at run time. Division by zero is an example.
After semantic analysis, every node in the syntax tree is marked with a type. If a node needs an implicit type conversion, the semantic analyzer inserts a conversion node into the syntax tree.
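The insertion of a conversion node can be sketched as follows. Everything here is an assumption made for illustration: the Expr structure, the two types and the function names are invented, and only an addition of int and double operands is handled.

```c
#include <stdlib.h>

typedef enum { TYPE_INT, TYPE_DOUBLE } Type;
typedef enum { EXPR_LEAF, EXPR_ADD, EXPR_CONVERT } ExprKind;

typedef struct Expr {
    ExprKind kind;
    Type type;                  /* filled in by semantic analysis */
    struct Expr *left, *right;  /* EXPR_CONVERT uses only left */
} Expr;

Expr *make_expr(ExprKind kind, Type type, Expr *left, Expr *right)
{
    Expr *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->kind = kind; e->type = type; e->left = left; e->right = right;
    return e;
}

/* Wraps the operand in a conversion node if its type differs from target. */
static Expr *convert_to(Expr *operand, Type target)
{
    if (operand->type == target)
        return operand;
    return make_expr(EXPR_CONVERT, target, operand, NULL);
}

/* Marks an EXPR_ADD node with its result type. The operands must already
   be typed. If one operand is double, the other is converted to double. */
void annotate_add(Expr *add)
{
    Type result = (add->left->type == TYPE_DOUBLE || add->right->type == TYPE_DOUBLE)
                      ? TYPE_DOUBLE
                      : TYPE_INT;
    add->left = convert_to(add->left, result);
    add->right = convert_to(add->right, result);
    add->type = result;
}
```

For an int leaf added to a double leaf, annotate_add marks the addition as double and replaces the int operand with an EXPR_CONVERT node that points to the original leaf.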
Build the Intermediate Language
The Source Code Optimizer optimizes the syntax tree and converts it into Intermediate Code.
The intermediate code is close to machine code, but it doesn’t depend on any platform. For example, it doesn’t contain data sizes, variable addresses or register names.
The Intermediate Code acts as an internal language. Because it is independent of the platform, it splits the compiler into two parts:
The front end: platform independent, produces the one intermediate code.
The back end: translates the intermediate code and has a separate implementation for each platform.
The back end of the compiler
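One common form of intermediate code is three-address code, where every instruction has at most one operator and results go into numbered temporaries instead of registers. The sketch below generates it from a small expression tree. The Tree and Output types and the function emit_tac are invented for this example, and the fixed buffer sizes are an arbitrary simplification.

```c
#include <stdio.h>
#include <string.h>

typedef struct Tree {
    char op;                          /* '+', '-', '*', '/' or 0 for a leaf */
    const char *name;                 /* leaf text such as "x" or "2" */
    const struct Tree *left, *right;
} Tree;

typedef struct {
    char text[256];  /* generated code, one instruction per line */
    int next_temp;   /* number of the last temporary used: t1, t2, ... */
} Output;

/* Appends the code for t to out and stores the name of the operand
   that holds the value of t in result. */
void emit_tac(const Tree *t, Output *out, char result[16])
{
    if (t->op == 0) {
        /* A leaf needs no instruction; its name is used directly. */
        snprintf(result, 16, "%s", t->name);
        return;
    }
    char lhs[16], rhs[16], line[64];
    emit_tac(t->left, out, lhs);
    emit_tac(t->right, out, rhs);
    snprintf(result, 16, "t%d", ++out->next_temp);
    snprintf(line, sizeof line, "%s = %s %c %s\n", result, lhs, t->op, rhs);
    /* Append only if the line fits; a real compiler would grow the buffer. */
    if (strlen(out->text) + strlen(line) < sizeof out->text)
        strcat(out->text, line);
}
```

For the tree of x + 2 * y, starting from a zero-initialized Output, the generated text is t1 = 2 * y followed by t2 = x + t1. No register, address or data size appears in it, which is what lets the same intermediate code feed different back ends.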
Generate the machine code
The Code Generator translates the intermediate code into machine code. This process depends on the hardware of the target machine, because different machines have different word lengths, different registers and different data sizes.
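A quick way to see that data sizes depend on the machine is to ask the compiler for them. The function name describe_target is made up for this sketch; the values it reports are whatever the platform it is compiled for uses.

```c
#include <limits.h>
#include <stdio.h>

/* Writes a one-line summary of this platform's data sizes into buf.
   Returns the number of characters snprintf wanted to write. */
int describe_target(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "bits per byte: %d, int: %zu, long: %zu, pointer: %zu",
                    CHAR_BIT, sizeof(int), sizeof(long), sizeof(void *));
}
```

Compiling the same function for a 32-bit target and for a 64-bit target will usually report different sizes for long and for pointers.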
Target Code Optimizer
Finally, the Target Code Optimizer optimizes the machine code generated by the Code Generator.
Then the object code, which is machine code, is linked into an executable file.
How does the linker work?
What work does the linker do?
relocating addresses when the code changes
replacing each symbol with the real address it stands for
A symbol in assembly language can stand for the address of a variable or a function.
joining all modules to generate an executable file
An object file containing machine code is called a module. Joining modules is a communication problem.
All modules communicate with each other through symbols. The process of joining them is called linking.
Static Linker
The job of the linker is to handle all references between modules so that the modules connect correctly.
The process of linking is:
Address and Storage Allocation
Symbol Resolution
Relocation
The linker replaces each placeholder address that stands for a symbol with the real address. This process is called Relocation. Every place that holds such a placeholder is called a Relocation Entry.
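Symbol resolution and relocation can be sketched on a plain byte buffer. This is a heavy simplification made up for illustration: the RelocEntry and Symbol structures are not those of any real object file format, every address is assumed to be a 32-bit absolute value, and the address is stored in the byte order of the host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation entry: where to patch and for which symbol. */
typedef struct {
    size_t offset;        /* byte offset of the placeholder inside the code */
    const char *symbol;   /* name of the symbol whose address is needed */
} RelocEntry;

typedef struct {
    const char *name;
    uint32_t address;     /* final address, known after address allocation */
} Symbol;

/* Symbol resolution: look the name up among the defined symbols. */
static const Symbol *resolve(const Symbol *symbols, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbols[i].name, name) == 0)
            return &symbols[i];
    }
    return NULL;
}

/* Relocation: overwrite each placeholder with the resolved address.
   Returns 0 on success, -1 on an undefined symbol or a bad offset. */
int relocate(uint8_t *code, size_t code_size,
             const RelocEntry *entries, size_t entry_count,
             const Symbol *symbols, size_t symbol_count)
{
    for (size_t i = 0; i < entry_count; i++) {
        const Symbol *sym = resolve(symbols, symbol_count, entries[i].symbol);
        if (sym == NULL)
            return -1;
        if (code_size < sizeof sym->address ||
            entries[i].offset > code_size - sizeof sym->address)
            return -1;
        /* memcpy stores the address in the host's byte order. */
        memcpy(code + entries[i].offset, &sym->address, sizeof sym->address);
    }
    return 0;
}
```

A module whose code contains four zero bytes as the placeholder for a symbol named shared would carry one entry with the offset of those bytes. After relocate runs, the four bytes hold the address assigned to shared. A reference to a name that no module defines makes the function fail, which corresponds to an undefined symbol error from a real linker.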
yinquan