symbol-granularity trace output for ld and ldd?

Daniel S. Wilkerson dsw@cs.berkeley.edu
Mon Jul 10 23:55:00 GMT 2006


We are doing whole-program static analysis of C and C++.  Our tool
works very much like a C++ compiler, except that instead of compiling
code it computes various properties of the code that someone might
want to know about, such as what data can flow where.  This is handy
for finding remotely-exploitable security holes, for example.

First we replace the system compiler, linker, etc. with scripts that
record their input and then call the real tool (see
build-interceptor.tigris.org).  Our scripts capture the .i files as
they go into the compiler and keep them around; they also pass extra
flags to the linker such as --trace to find out what the linker is
doing.
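
The interception step can be sketched roughly as follows.  This is a
minimal illustration in Python, not the actual build-interceptor code:
the path of the renamed real linker, the log location, and the function
names are all assumptions made for the example.

```python
import json
import os
import time

# Hypothetical locations: where the real linker was moved to, and where
# the wrapper appends its records.  Neither comes from build-interceptor.
REAL_LD = os.environ.get("REAL_LD", "/usr/bin/ld.real")
LOG_PATH = os.environ.get("LD_INTERCEPT_LOG", "/tmp/ld-intercept.jsonl")


def build_command(argv, real_tool=REAL_LD):
    """Return the command line for the real linker, adding --trace once."""
    args = list(argv[1:])  # drop argv[0], the wrapper's own name
    if "--trace" not in args and "-t" not in args:
        args.append("--trace")
    return [real_tool] + args


def record_invocation(argv, cwd, log_path=LOG_PATH):
    """Append one JSON line describing this invocation to the log."""
    entry = {"time": time.time(), "cwd": cwd, "argv": list(argv)}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def main(argv):
    """Record the call, then replace this process with the real linker."""
    record_invocation(argv, os.getcwd())
    cmd = build_command(argv)
    os.execv(cmd[0], cmd)
```

A script installed in place of ld would finish by calling
main(sys.argv).  Using exec rather than a child process means the
build sees the real linker's exit status and output unchanged.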

Then we take the captured .i files and run them through our analysis
tool.  Right now we are doing ok on the compiling part, but we are
having some trouble with linking.  We really do not want to
re-implement all of the subtleties of the ld linker and the dynamic
linker; it would be error-prone and a waste.  Instead we would rather
just ask the real linker what symbols it links to what and then do
what it says in our linker.  That is, our linker can "link" the
results of our analysis, but we don't want it to have to figure out
which symbols match up with which other symbols.

'ld --trace' and 'ldd' only give us information at file granularity.
This is probably ok for static linking, as we can sort of figure
things out; however, for dynamic linking we have heard that symbols
can be resolved lazily, one at a time, and that multiple libraries
sometimes provide definitions for the same symbol.  It would be really
handy if there were a way to get from the static and especially the
dynamic linker exactly which symbols were linked to which.  Any help
is appreciated.

Daniel Wilkerson
http://www.cs.berkeley.edu/~dsw/



More information about the Binutils mailing list