Better understanding Linux secondary dependencies solving with examples
08 Jan 2015 by David CorvoysierA few months ago I stumbled upon a linking problem with secondary dependencies I couldn’t solved without overlinking the corresponding libraries.
I only realized today in a discussion with my friend Yann E. Morin that not only did I use the wrong solution for that particular problem, but that my understanding of the gcc linking process was not as good as I had imagined.
This blog post is to summarize what I have now understood.
There is also a small repository on github with the mentioned samples.
A few words about Linux libraries
This paragraph is only a brief summary of what is very well described in The Linux Documentation Project library howto.
Man pages for the linux linker and loader are also a good source of information.
There are three kind of libraries in Linux: static, shared and dynamically loaded (DL).
Dynamically loaded libraries are very specific to some use cases like plugins, and would deserve an article on their own. I will only focus here on static and shared libraries.
Static libraries
A static library is simply an archive of object files conventionally starting with the lib
prefix and ending with the .a
suffix.
Example:
libfoobar.a
Static libraries are created using the ar program:
$ ar rcs libfoobar.a foo.o bar.o
Linking a program with a static library is as simple as adding it to the link command either directly with its full path:
$ gcc -o app main.c /path/to/foobar/libfoobar.a
or indirectly using the -l
/L
options:
$ gcc -o app main.c -lfoobar -L/path/to/foobar
Shared libraries
A shared library is an ELF object loaded by programs when they start.
Shared libraries follow the same naming conventions as static libraries, but with the .so
suffix instead of .a
.
Example:
libfoobar.so
Shared library objects need to be compiled with the -fPIC
option that produces position-independent code, ie code that can be relocated in memory.
$ gcc -fPIC -c foo.c
$ gcc -fPIC -c bar.c
The gcc command to create a shared library is similar to the one used to create a program, with the addition of the -shared
option.
$ gcc -shared -o libfoobar.so foo.o bar.o
Linking against a shared library is achieved using the exact same commands as linking against a static library:
$ gcc -o app main.c libfoobar.so
or
$ gcc -o app main.c -lfoobar -L/path/to/foobar
Shared libraries and undefined symbols
An ELF object maintains a table of all the symbols it uses, including symbols belonging to another ELF object that are marked as undefined.
At compilation time, the linker will try to resolve an undefined symbol by linking it either statically to code included in the overall output ELF object or dynamically to code provided by a shared library.
If an undefined symbol is found in a shared library, a DT_NEEDED
entry is created for that library in the output ELF target.
The content of the DT_NEEDED
field depends on the link command:
- the full path to the library if the library was linked with an absolute path,
- the library name otherwise (or the library soname if it was defined).
You can check the dependencies of an ELF object using the readelf command:
$ readelf -d main
or
$ readelf -d libbar.so
When producing an executable a symbol that remains undefined after the link will raise an error: all dependencies must therefore be available to the linker in order to produce the output binary.
For historic reason, this behavior is disabled when building a shared library: you need to specify the --no-undefined
(or -z defs
) flag explicitly if you want errors to be raised when an undefined symbol is not resolved.
$ gcc -Wl,--no-undefined -shared -o libbar.so -fPIC bar.c
or
$ gcc -Wl,-zdefs -shared -o libbar.so -fPIC bar.c
Note that when producing a static library, which is just an archive of object files, no actual ‘linking’ operation is performed, and undefined symbols are kept unchanged.
Library versioning and compatibility
Several versions of the same library can coexist in the system.
By conventions, two versions of the same library will use the same library name with a different version suffix that is composed of three numbers:
- major revision,
- minor revision,
- build revision.
Example:
libfoobar.so.1.2.3
This is often referred as the library real name.
Also by convention, the library major version should be modified every time the library binary interface (ABI) is modified.
Following that convention, an executable compiled with a shared library version is theoretically able to link with another version of the same major revision.
This concept if so fundamental for expressing compatibility between programs and shared libraries that each shared library can be associated a soname, which is the library name followed by a period and the major revision:
Example:
libfoobar.so.1
The library soname is stored in the DT_SONAME
field of the ELF shared object.
The soname has to be passed as a linker option to gcc.
$ gcc -shared -Wl,-soname,libfoobar.so.1 -o libfoobar.so foo.o bar.o
As mentioned before, whenever a library defines a soname, it is that soname that is stored in the DT_NEEDED
field of ELF objects linked against that library.
Solving versioned libraries dependencies at build time
As mentioned before, libraries to be linked against can be specified using a shortened name and a path:
$ gcc -o app main.c -lfoobar -L/path/to/foobar
When installing a library, the installer program will typically create a symbolic link from the library real name to its linker name to allow the linker to find the actual library file.
Example:
/usr/lib/libfoobar.so -> libfoobar.so.1.5.3
The linker uses the following search paths to locate required shared libraries:
- directories specified by
-rpath-link
options (more on that later) - directories specified by
-rpath
options (more on that later) - directories specified by the environment variable
LD_RUN_PATH
- directories specified by the environment variable
LD_LIBRARY_PATH
- directories specified in
DT_RUNPATH
orDT_RPATH
of a shared library are searched for shared libraries needed by it - default directories, normally
/lib
and/usr/lib
- directories listed inthe
/etc/ld.so.conf
file
Solving versioned shared libraries dependencies at runtime
On GNU glibc-based systems, including all Linux systems, starting up an ELF binary executable automatically causes the program loader to be loaded and run.
On Linux systems, this loader is named /lib/ld-linux.so.X
(where X is a version number). This loader, in turn, finds and loads recursively all other shared libraries listed in the DT_NEEDED
fields of the ELF binary.
Please note that if a soname was specified for a library when the executable was compiled, the loader will look for the soname instead of the library real name. For that reason, installation tools automatically create symbolic names from the library soname to its real name.
Example:
/usr/lib/libfoobar.so.1 -> libfoobar.so.1.5.3
When looking fo a specific library, if the value described in the DT_NEEDED
doesn’t contain a /
, the loader will consecutively look in:
- directories specified at compilation time in the ELF object
DT_RPATH
(deprecated), - directories specified using the environment variable
LD_LIBRARY_PATH
, - directories specified at compile time in the ELF object
DT_RUNPATH
, - from the cache file
/etc/ld.so.cache
, which contains a compiled list of candidate libraries previously found in the augmented library path (can be disabled at compilation time), - in the default path
/lib
, and then/usr/lib
(can be disabled at compilation time).
Proper handling of secondary dependencies
As mentioned in the introduction, my issue was related to secondary dependencies, ie shared libraries dependencies that are exported from one library to a target.
Let’s imagine for instance a program main that depends on a library libbar that itself depends on a shared library libfoo.
We will use either a static libbar.a or a shared libbar.so.
foo.c
int foo()
{
return 42;
}
bar.c
int foo();
int bar()
{
return foo();
}
main.c
int bar();
int main(int argc, char** argv)
{
return bar();
}
Creating the libfoo.so shared library
libfoo has no dependencies but the libc, so we can create it with the simplest command:
$ gcc -shared -o libfoo.so -fPIC foo.c
Creating the libbar.a static library
As said before, static libraries are just archives of object files, without any means to declare external dependencies.
In our case, there is therefore no explicit connection whatsoever between libbar.a and libfoo.so.
$ gcc -c bar.c
$ ar rcs libbar.a bar.o
Creating the libbar.so dynamic library
The proper way to create the libbar.so shared library it by explicitly specifying it depends on libfoo:
$ gcc -shared -o libbar2.so -fPIC bar.c -lfoo -L$(pwd)
This will create the library with a proper DT_NEEDED
entry for libfoo.
$ readelf -d libbar.so
Dynamic section at offset 0xe08 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libfoo.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
However, since undefined symbols are not by default resolved when building a shared library, we can also create a “dumb” version without any DT_NEEDED
entry:
$ gcc -shared -o libbar_dumb.so -fPIC bar.c
Note that it is very unlikely that someone actually chooses to create such an incomplete library on purpose, but it may happen that by misfortune you encounter one of these beasts in binary form and still need to link against it (yeah, sh… happens !).
Linking against the libbar.a static library
As mentioned before, when linking an executable, the linker must resolve all undefined symbols before producing the output binary.
Trying to link only with libbar.a produces an error, since it has an undefined symbol and the linker has no clue where to find it:
$ gcc -o app_s main.c libbar.a
libbar.a(bar.o): In function `bar':
bar.c:(.text+0xa): undefined reference to `foo'
collect2: error: ld returned 1 exit status
Adding libfoo.so to the link command solves the problem:
$ gcc -o app main.c libbar.a -L$(pwd) -lfoo
You can verify that the app binary now explicitly depends on libfoo:
$ readelf -d app
Dynamic section at offset 0xe18 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libfoo.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
At run-time, the dynamic linker will look for libfoo.so, so unless you have installed it in standard directories (/lib
or /usr/lib
) you need to tell it where it is:
LD_LIBRARY_PATH=$(pwd) ./app
To summarize, when linking an executable against a static library, you need to specify explicitly all dependencies towards shared libraries introduced by the static library on the link command.
Note however that expressing, discovering and adding implicit static libraries dependencies is typically a feature of your build system (autotools, cmake).
Linking against the libbar.so shared library
As specified in the linker documentation, when the linker encounters an input shared library it processes all its DT_NEEDED
entries as secondary dependencies:
- if the linker output is a shared relocatable ELF object (ie a shared library), and the –copy-dt-needed-entries option is set (this is the legacy behavior) it will add all
DT_NEEDED
entries from the input library as newDT_NEEDED
entries in the output, - if the linker output is a shared relocatable ELF object (ie a shared library), and if the –no-copy-dt-needed-entries option is set (this is the new default behavior for binutils, following a move initiated by major distros like Fedora ) it will simply ignore all
DT_NEEDED
entries from the input library, - if the linker ouput is a non-shared, non-relocatable link (our case), it will automatically add the libraries listed in the
DT_NEEDED
of the input library on the link command line, producing an error if it can’t locate them.
So, let’s see what happens when dealing with our two shared libraries.
Linking against the “dumb” library
When trying to link an executable against the “dumb” version of libbar.so, the linker encounters undefined symbols in the library itself it cannot resolve since it lacks the DT_NEEDED
entry related to libfoo:
$ gcc -o app main.c -L$(pwd) -lbar_dumb
libbar_dumb.so: undefined reference to `foo'
collect2: error: ld returned 1 exit status
Let’s see how we can solve this.
Adding explicitly the libfoo.so dependency
Just like we did when we linked against the static version, we can just add libfoo to the link command to solve the problem:
$ gcc -o app main.c -L$(pwd) -lbar_dumb -lfoo
It creates an explicit dependency in the app binary:
$ readelf -d app
Dynamic section at offset 0xe18 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libbar_dumb.so]
0x0000000000000001 (NEEDED) Shared library: [libfoo.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
Again, at runtime you may need to tell the dynamic linker where libfoo.so is:
$ LD_LIBRARY_PATH=$(pwd) ./app
Note that having an explicit dependency to libfoo is not quite right, since our application doesn’t use directly any symbols from libfoo. What we’ve just done here is called overlinking, and it is BAD.
Let’s imagine for instance that in the future we decide to provide a newer version of libbar that uses the same ABI, but based on a new version of libfoo with a different ABI: we should theoretically be able to use that new version of libbar without recompiling our application, but what would really happen here is that the dynamic linker would actually try to load the two versions of libfoo at the same time, leading to unpredictable results. We would therefore need to recompile our application even if it is still compatible with the newest libbar.
As a matter of fact, this actually happened in the past: a libfreetype update in the debian distro caused 583 packages to be recompiled, with only 178 of them actually using it.
Ignoring libfoo dependency
There is another option you can use when dealing with the “dumb” library: tell the linker to ignore its undefined symbols altogether:
$ gcc -o app main.c -L$(pwd) -lbar_dumb -Wl,--allow-shlib-undefined
This will produce a binary that doesn’t declare its hidden dependencies towards libfoo:
$ readelf -d app
Dynamic section at offset 0xe18 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libbar_dumb.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
This isn’t without consequences at runtime though, since the dynamic linker is now unable to resolve the executable dependencies:
$ ./app: symbol lookup error: ./libbar_dumb.so: undefined symbol: foo
Your only option is then to load libfoo explicitly (yes, this is getting uglier and uglier):
$ LD_PRELOAD=$(pwd)/libfoo.so LD_LIBRARY_PATH=$(pwd) ./app
Linking against the “correct” library
Doing it the right way
As mentioned before, when linking against the correct shared library, the linker encounters the libfoo.so DT_NEEDED
entry, adds it to the link command and finds it at the path specified by -L
, thus solving the undefined symbols … or at least that is what I expected:
$ gcc -o app main.c -L$(pwd) -lbar
/usr/bin/ld: warning: libfoo.so, needed by libbar.so, not found (try using -rpath or -rpath-link)
/home/diec7483/dev/linker-example/libbar.so: undefined reference to `foo'
collect2: error: ld returned 1 exit status
Why the error ? I thought I had done everything by the book !
Okay, let’s take a look at the ld
man page again, looking at the -rpath-link
option. This says:
When using ELF or SunOS, one shared library may require another. This happens when an “ld -shared” link includes a shared library as one of the input files. When the linker encounters such a dependency when doing a non-shared, non-relocatable link, it will automatically try to locate the required shared library and include it in the link, if it is not included explicitly. In such a case, the -rpath-link option specifies the first set of directories to search. The -rpath-link option may specify a sequence of directory names either by specifying a list of names separated by colons, or by appearing multiple times.
Ok, this is not crystal-clear, but what it actually means is that when specifying the path for a secondary dependency, you should not use -L
but -rpath-link
:
$ gcc -o app main.c -L$(pwd) -lbar -Wl,-rpath-link=$(pwd)
You can now verify that app depends only on libbar:
$ readelf -d app
Dynamic section at offset 0xe18 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libbar.so]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
And this is finally how things should be done.
You may also use
-rpath
instead of-rpath-link
but in that case the specified path will be stored in the resulting executable, which is not suitable if you plan to relocate your binaries. Tools like cmake use the-rpath
during the build phase (make
), but remove the specified path from the executable during the installation phase(make install
).
Conclusion
To summarize, when linking an executable against:
-
a static library, you need to specify all dependencies towards other shared libraries this static library depends on explicitly on the link command.
-
a shared library, you don’t need to specify dependencies towards other shared libraries this shared library depends on, but you may need to specify the path to these libraries on the link command using the
-rpath
/-rpath-link
options.
comments powered by DisqusNote however that expressing, discovering and adding implicit libraries dependencies is typically a feature of your build system (autotools, cmake), as demonstrated in my samples.