# Linkers part 1

I’ve been working on and off on a new linker. To my surprise, I’ve discovered
in talking about this that some people, even some computer programmers, are
unfamiliar with the details of the linking process. I’ve decided to write some
notes about linkers, with the goal of producing an essay similar to my existing
one about the GNU configure and build system.

As I only have the time to write one thing a day, I’m going to do this on my
blog over time, and gather the final essay together later. I believe that I may
be up to five readers, and I hope y’all will accept this digression into stuff
that matters. I will return to random philosophizing and minding other people’s
business soon enough.

## A Personal Introduction

Who am I to write about linkers?

I wrote my first linker back in 1988, for the AMOS operating system, which ran
on Alpha Micro systems. (If you don’t understand the following description,
don’t worry; all will be explained below.) I used a single global database to
register all symbols. Object files were checked into the database after they
had been compiled. The link process mainly required identifying the object file
holding the main function. Other object files were pulled in by reference. I
reverse-engineered the object file format, which was undocumented but quite
simple. The goal of all this was speed, and indeed this linker was much faster
than the system one, mainly because of the speed of the database.
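
For readers who like to see an idea in code, here is a minimal sketch of the "pulled in by reference" scheme described above. It is not the AMOS linker's code, and I am not claiming it matches its design in detail. The names `symbol_db`, `references`, `check_in`, and `objects_to_link` are hypothetical, and a pair of Python dictionaries stands in for the global database.

```python
from collections import deque

# Hypothetical in-memory stand-in for the global symbol database:
# maps each symbol name to the object file that defines it.
symbol_db = {}
# Maps each object file to the symbols it uses but does not define.
references = {}


def check_in(obj_name, defined, undefined):
    """Register an object file's symbols after it has been compiled."""
    for sym in defined:
        symbol_db[sym] = obj_name
    references[obj_name] = list(undefined)


def objects_to_link(entry_symbol="main"):
    """Find the object defining entry_symbol, then pull in others by reference."""
    start = symbol_db[entry_symbol]
    needed = [start]
    queue = deque([start])
    while queue:
        obj = queue.popleft()
        for sym in references[obj]:
            if sym not in symbol_db:
                raise KeyError(f"undefined symbol: {sym}")
            provider = symbol_db[sym]
            if provider not in needed:
                needed.append(provider)
                queue.append(provider)
    return needed


check_in("main.o", defined=["main"], undefined=["parse", "emit"])
check_in("parse.o", defined=["parse"], undefined=["emit"])
check_in("emit.o", defined=["emit"], undefined=[])
check_in("unused.o", defined=["helper"], undefined=[])

print(objects_to_link())  # ['main.o', 'parse.o', 'emit.o']
```

Note that `unused.o` is registered in the database but never linked, because nothing reachable from `main` refers to it.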

I wrote my second linker in 1993 and 1994. This linker was designed and
prototyped by Steve Chamberlain while we both worked at Cygnus Support (later
Cygnus Solutions, later part of Red Hat). This was a complete reimplementation
of the BFD-based linker which Steve had written a couple of years before.
The primary targets were a.out and COFF. Again the goal was speed, especially
compared to the original BFD-based linker. On SunOS 4 this linker was almost as
fast as running the cat program on the input .o files.

The linker I am now working on, called gold, will be my third. It is
exclusively an ELF linker. Once again, the goal is speed, in this case being
faster than my second linker. That linker has been significantly slowed down
over the years by adding support for ELF and for shared libraries. This support
was patched in rather than being designed in. Future plans for the new linker
include support for incremental linking, which is another way of increasing
speed.

There is an obvious pattern here: everybody wants linkers to be faster. This is
because the job which a linker does is uninteresting. The linker is a speed
bump for a developer, a process which takes a relatively long time but adds no
real value. So why do we have linkers at all? That brings us to our next topic.

## A Technical Introduction

What does a linker do?

It’s simple: a linker converts object files into executables and shared
libraries. Let’s look at what that means. For cases where a linker is used,
the software development process consists of writing program code in some
language: e.g., C or C++ or Fortran (but typically not Java, as Java normally
works differently, using a loader rather than a linker). A compiler translates
this program code, which is human-readable text, into another form of
human-readable text known as assembly code. Assembly code is a readable form of
the machine language which the computer can execute directly. An assembler is
used to turn this assembly code into an object file. For completeness, I’ll
note that some compilers include an assembler internally, and produce an object
file directly. Either way, this is where things get interesting.

In the old days, when dinosaurs roamed the data centers, many programs were
complete in themselves. In those days there was generally no compiler (people
wrote directly in assembly code), and the assembler actually generated an
executable file which the machine could execute directly. As languages like
Fortran and Cobol started to appear, people began to think in terms of
libraries of subroutines, which meant that there had to be some way to run the
assembler at two different times, and combine the output into a single
executable file. This required the assembler to generate a different type of
output, which became known as an object file (I have no idea where this name
came from). And a new program was required to combine different object files
together into a single executable. This new program became known as the linker
(the source of this name should be obvious).
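
To make "combine different object files together into a single executable" a little more concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the `ObjectFile` class, its `defines` and `refs` fields, and the `link` function are hypothetical and do not correspond to a.out, COFF, ELF, or any real linker. Machine code is modelled as a list of integers, with `0` marking a slot whose address is not yet known.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Toy object file: an illustration only, not any real format."""
    name: str
    code: list                                   # machine words; 0 marks an unresolved slot
    defines: dict = field(default_factory=dict)  # symbol -> offset within this file's code
    refs: list = field(default_factory=list)     # (offset, symbol) slots to fill in


def link(objects):
    """Combine object files into one flat image and resolve symbol references."""
    image, symtab, bases = [], {}, {}
    # Pass 1: lay the files out one after another and record where each symbol lands.
    for obj in objects:
        bases[obj.name] = len(image)
        for sym, offset in obj.defines.items():
            if sym in symtab:
                raise ValueError(f"duplicate symbol: {sym}")
            symtab[sym] = bases[obj.name] + offset
        image.extend(obj.code)
    # Pass 2: fill every reference slot with the final address of its symbol.
    for obj in objects:
        for offset, sym in obj.refs:
            if sym not in symtab:
                raise ValueError(f"undefined symbol: {sym}")
            image[bases[obj.name] + offset] = symtab[sym]
    return image, symtab


# main.o calls a subroutine that lives in a separately assembled library object.
main_o = ObjectFile("main.o", code=[10, 0, 11], defines={"main": 0}, refs=[(1, "sqrt")])
lib_o = ObjectFile("lib.o", code=[20, 21], defines={"sqrt": 0})

image, symtab = link([main_o, lib_o])
print(image)   # [10, 3, 11, 20, 21]
print(symtab)  # {'main': 0, 'sqrt': 3}
```

The point of the sketch is only that the assembler can run on each file separately, leaving holes where it does not know an address, and a later program fills in those holes once every file has a place in the final image.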

Linkers still do the same job today. In the decades since, one new
feature has been added: shared libraries.

More tomorrow.