# Linkers part 1

I’ve been working on and off on a new linker. To my surprise, I’ve discovered
in talking about this that some people, even some computer programmers, are
unfamiliar with the details of the linking process. I’ve decided to write some
notes about linkers, with the goal of producing an essay similar to my existing
one about the GNU configure and build system.

As I only have the time to write one thing a day, I’m going to do this on my
blog over time, and gather the final essay together later. I believe that I may
be up to five readers, and I hope y’all will accept this digression into stuff
that matters. I will return to random philosophizing and minding other people’s
business soon enough.

## A Personal Introduction

Who am I to write about linkers?

I wrote my first linker back in 1988, for the AMOS operating system which ran
on Alpha Micro systems. (If you don’t understand the following description,
don’t worry; all will be explained below.) I used a single global database to
register all symbols. Object files were checked into the database after they
had been compiled. The link process mainly required identifying the object file
holding the main function. Other object files were pulled in by reference. I
reverse engineered the object file format, which was undocumented but quite
simple. The goal of all this was speed, and indeed this linker was much faster
than the system one, mainly because of the speed of the database.

I wrote my second linker in 1993 and 1994. This linker was designed and
prototyped by Steve Chamberlain while we both worked at Cygnus Support (later
Cygnus Solutions, later part of Red Hat). This was a complete reimplementation
of the BFD-based linker which Steve had written a couple of years before.
The primary targets were a.out and COFF. Again the goal was speed, especially
compared to the original BFD-based linker. On SunOS 4 this linker was almost as
fast as running the cat program on the input .o files.

The linker I am now working on, called gold, will be my third. It is
exclusively an ELF linker. Once again, the goal is speed, in this case being
faster than my second linker. That linker has been significantly slowed down
over the years by adding support for ELF and for shared libraries. This support
was patched in rather than being designed in. Future plans for the new linker
include support for incremental linking, which is another way of increasing
speed.

There is an obvious pattern here: everybody wants linkers to be faster. This is
because the job which a linker does is uninteresting. The linker is a speed
bump for a developer, a process which takes a relatively long time but adds no
real value. So why do we have linkers at all? That brings us to our next topic.

## A Technical Introduction

What does a linker do?

It’s simple: a linker converts object files into executables and shared
libraries. Let’s look at what that means. For cases where a linker is used,
the software development process consists of writing program code in some
language: e.g., C or C++ or Fortran (but typically not Java, as Java normally
works differently, using a loader rather than a linker). A compiler translates
this program code, which is human readable text, into another form of
human readable text known as assembly code. Assembly code is a readable form of
the machine language which the computer can execute directly. An assembler is
used to turn this assembly code into an object file. For completeness, I’ll
note that some compilers include an assembler internally, and produce an object
file directly. Either way, this is where things get interesting.

In the old days, when dinosaurs roamed the data centers, many programs were
complete in themselves. In those days there was generally no compiler (people
wrote directly in assembly code), and the assembler actually generated an
executable file which the machine could execute directly. As languages like
Fortran and Cobol started to appear, people began to think in terms of
libraries of subroutines, which meant that there had to be some way to run the
assembler at two different times, and combine the output into a single
executable file. This required the assembler to generate a different type of
output, which became known as an object file (I have no idea where this name
came from). And a new program was required to combine different object files
together into a single executable. This new program became known as the linker
(the source of this name should be obvious).

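
To make the idea of combining object files a little more concrete, here is a
minimal sketch in C. It is a toy model under loud assumptions, not a
description of any real linker or object file format: the `toy_symbol` and
`toy_object` structures, the `toy_link_address` function, and the
`TOY_NOT_FOUND` marker are all names invented for this illustration. The sketch
assumes the "linker" simply places the objects back to back starting at some
base address, so a symbol's final address is the start of its object plus its
offset inside that object. For instance, with a 64-byte `main.o` followed by a
48-byte `mathlib.o` that defines `isqrt` at offset 16, linking at base 0x1000
would put `isqrt` at 0x1000 + 64 + 16 = 0x1050.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: a symbol defined by an object, at an offset from its start. */
struct toy_symbol {
    const char *name;
    size_t offset;
};

/* Hypothetical "object file": a size in bytes plus the symbols it defines. */
struct toy_object {
    const char *filename;
    size_t size;                      /* bytes of code and data */
    const struct toy_symbol *symbols; /* symbols this object defines */
    size_t symbol_count;
};

/* Returned when no object defines the requested symbol. */
#define TOY_NOT_FOUND ((size_t)-1)

/*
 * Pretend to link `count` objects by placing them back to back starting at
 * `base`, and return the final address of the symbol `name`, or
 * TOY_NOT_FOUND if no object defines it. The first definition found wins.
 */
size_t toy_link_address(const struct toy_object *objects, size_t count,
                        size_t base, const char *name)
{
    size_t address = base; /* where the current object starts */

    for (size_t i = 0; i < count; i++) {
        const struct toy_object *obj = &objects[i];

        for (size_t j = 0; j < obj->symbol_count; j++) {
            /* A symbol must lie inside the object that defines it. */
            assert(obj->symbols[j].offset < obj->size);
            if (strcmp(obj->symbols[j].name, name) == 0)
                return address + obj->symbols[j].offset;
        }
        address += obj->size; /* the next object starts right after this one */
    }
    return TOY_NOT_FOUND;
}
```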

Linkers still do the same job today. In the decades since, one new
feature has been added: shared libraries.

More tomorrow.