Category: C/C++
2011-03-30 22:10:37
Compilation
Compilation refers
to the processing of source code files (.c, .cc, or .cpp) and the creation of
an 'object' file. This step doesn't create anything the user can actually run.
Instead, the compiler merely produces the machine language instructions that
correspond to the source code file that was compiled. For instance, if you
compile (but don't link) three separate files, you will have three object files
created as output, each with a .o or .obj extension (the extension will depend on
your compiler). Each of these files contains a translation of your source code
file into machine language -- but you can't run them yet! You need to
turn them into executables your operating system can use. That's where the
linker comes in.
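As a minimal sketch of a file that compiles but cannot run on its own, consider the translation unit below. The file name area.c and the function rect_area are made up for illustration. Compiling it with gcc -c should produce an object file, but since the file has no main, it cannot become an executable by itself.

```c
/* area.c: hypothetical file name, used only for illustration.
 *
 * Compile without linking:
 *     gcc -c area.c        (produces area.o)
 *
 * The object file holds machine code for rect_area, but there is
 * no main() here, so this file alone cannot be linked into a program.
 */
#include <assert.h>

/* Returns the area of a width-by-height rectangle. */
int rect_area(int width, int height)
{
    assert(width >= 0 && height >= 0);  /* sketch-level sanity check */
    return width * height;
}
```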
Linking
Linking
refers to the creation of a single executable file from multiple object files.
In this step, it is common for the linker to complain about undefined
functions (commonly, main itself). During compilation, if the compiler could
not find the definition of a particular function, it would just assume that
the function was defined in another file. If this isn't the case, there's no
way the compiler would know -- it doesn't look at the contents of more than one
file at a time. The linker, on the other hand, looks at all of the object files
and tries to find definitions for the functions that were referenced but not defined.
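The sketch below illustrates the split between a declaration (enough for the compiler) and a definition (what the linker has to find). The names triple and triple_plus_one are hypothetical. Both parts are shown in one listing for brevity; in a real project the definition would typically live in a separate source file.

```c
#include <assert.h>

/* Declaration only: tells the compiler the function's type.
 * The compiler accepts calls to triple() without seeing its body
 * and leaves an unresolved reference in the object file. */
int triple(int n);

/* Uses triple(); compiles fine even if triple() is defined elsewhere. */
int triple_plus_one(int n)
{
    return triple(n) + 1;
}

/* Definition: in a multi-file program this would typically sit in
 * another .c file. If no object file provided it, the linker (not the
 * compiler) would report an undefined reference to triple. */
int triple(int n)
{
    assert(n > -1000000 && n < 1000000);  /* keep the sketch overflow-free */
    return 3 * n;
}
```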
You might ask why there are separate compilation and
linking steps. First, it's probably easier to implement things that way. The
compiler does its thing, and the linker does its thing -- by keeping the
functions separate, the complexity of the program is reduced. Another (more
obvious) advantage is that this allows the creation of large programs without
having to redo the compilation step every time a file is changed. Instead,
using so-called "conditional compilation" (more often called incremental
compilation, and unrelated to the preprocessor's #if/#ifdef), it is necessary
to compile only those source files that have changed; for the rest, the existing
object files are sufficient input for the linker. Finally, this makes it simple to implement
libraries of pre-compiled code: just create object files and link them just
like any other object file. (The fact that each file is compiled separately
from information contained in other files, incidentally, is called the
"separate compilation model".)
To get the full benefits of conditional compilation,
it's probably easier to get a program to help you than to try and remember
which files you've changed since you last compiled. (You could, of course, just
recompile every file that has a timestamp greater than the timestamp of the
corresponding object file.) If you're working with an integrated development
environment (IDE) it may already take care of this for you. If you're using
command line tools, there's a nifty utility that comes with most *nix distributions. Along with
conditional compilation, it has several other nice features for programming,
such as allowing different compilations of your program -- for instance, if you
have a version producing verbose output for debugging.
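The timestamp rule mentioned in the parenthesis above can be sketched roughly as follows; build tools such as make perform a check along these lines. The function names is_stale and needs_rebuild are hypothetical, and the sketch assumes a POSIX system where stat() is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>
#include <sys/stat.h>

/* True when the source was modified after the object file was built. */
bool is_stale(time_t src_mtime, time_t obj_mtime)
{
    return difftime(src_mtime, obj_mtime) > 0.0;
}

/* Returns true if the object file is missing or older than the source.
 * Relies on POSIX stat() to read the modification times. */
bool needs_rebuild(const char *src_path, const char *obj_path)
{
    struct stat src, obj;

    assert(src_path != NULL && obj_path != NULL);

    if (stat(src_path, &src) != 0)
        return true;   /* source unreadable: let the compiler report it */
    if (stat(obj_path, &obj) != 0)
        return true;   /* no object file yet */

    return is_stale(src.st_mtime, obj.st_mtime);
}
```

A real build tool also tracks header dependencies, which this sketch ignores.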
Knowing the difference between the compilation phase
and the link phase can make it easier to hunt for bugs. Compiler errors are
usually syntactic in nature -- a missing semicolon, an extra parenthesis.
Linking errors usually have to do with missing or multiple definitions. If the
linker reports that a function or variable is defined multiple times, that's a
good indication that two of your source code files define the same function or
variable.
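One common way to avoid multiple-definition errors is to repeat only the declaration and keep a single definition. The sketch below shows the pattern in one listing; the names request_count and record_request are hypothetical, and in practice the extern line would usually sit in a shared header.

```c
#include <assert.h>

/* Declaration: safe to repeat in every file that uses the variable
 * (normally via a shared header). It reserves no storage. */
extern int request_count;

/* Definition: must appear in exactly one source file. A second
 * "int request_count = 0;" in another file would typically make the
 * linker report a multiple-definition error. */
int request_count = 0;

/* Increments the shared counter and returns the new value. */
int record_request(void)
{
    assert(request_count >= 0);
    return ++request_count;
}
```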
The Compilation Process
All 5 stages are driven by one program in UNIX,
namely cc, or in our case, gcc (or g++). The general order of things goes gcc
-> gcc -E (preprocess) -> gcc -S (compile to assembly) -> as (assemble) -> ld (link).
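To make the sequence concrete, here is a small sketch of a source file annotated with the gcc invocation that stops after each stage. The file names stages.c, main.o, and prog are assumptions for illustration, as are the macro and function names.

```c
/* stages.c: hypothetical file name.
 *
 *   gcc -E stages.c -o stages.i   preprocess only: SQUARE(n) is expanded
 *   gcc -S stages.i -o stages.s   compile to assembly
 *   gcc -c stages.s -o stages.o   assemble into an object file
 *   gcc stages.o main.o -o prog   link (gcc invokes the linker; main.o is
 *                                 an assumed second object file with main)
 */
#include <assert.h>

/* Expanded textually by the preprocessor before compilation proper. */
#define SQUARE(x) ((x) * (x))

/* Returns n squared plus one. */
int square_plus_one(int n)
{
    assert(n > -46340 && n < 46340);  /* keep n * n within int range */
    return SQUARE(n) + 1;
}
```

Inspecting stages.i after the first command is a quick way to see that the macro has already disappeared before the compiler proper runs.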