I have a binary that is linked like this:

/opt/rh/devtoolset-8/root/usr/bin/g++ -O3 -DNDEBUG -Wl,--gc-sections -s -Wl,--as-needed <few .o files> -o procmon.e -Wl,-rpath,/usr/local/lib64 <bunch of my static libs .a> /usr/local/lib64/libxalan-c.so /usr/local/lib64/libxerces-c.so <bunch of static libs (1) built with "vcpkg" intermixed with (2)>

(1) curl z aws-cpp-sdk-s3 aws-cpp-sdk-core ssl crypto aws-c-event-stream aws-c-common aws-checksums azurestorage uuid xml2 lzma cpprest boost_log boost_log_setup boost_filesystem boost_thread boost_date_time boost_regex boost_chrono boost_atomic
(2) -lcrypt -lrt -lm -ldl -pthread

but (even though --as-needed is present) ldd still reports unused dependencies:

$ ldd -u -r procmon.e
Unused direct dependencies:
        /usr/local/lib64/libxalan-c.so.111
        /lib64/libcrypt.so.1
        /lib64/libm.so.6

Info:
CentOS 7
g++ (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3)
GNU ld version 2.30-47.el7
As a first step, please check your ld command line by passing -Wl,-v to g++.
Here is the output with "-Wl,-v": collect2 version 8.2.1 20180905 (Red Hat 8.2.1-3) /opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/ld -plugin /opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/liblto_plugin.so -plugin-opt=/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper -plugin-opt=-fresolution=/tmp/cc4U4v6Q.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o procmon.e -s /lib/../lib64/crt1.o /lib/../lib64/crti.o /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/crtbegin.o -L/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8 -L/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/../../.. 
--as-needed --gc-sections -v CMakeFiles/procmon.e.dir/procmon.cpp.o CMakeFiles/procmon.e.dir/proc.cpp.o -rpath /usr/local/lib64 <bunch of my static libs (.a)> /usr/local/lib64/libxalan-c.so /usr/local/lib64/libxerces-c.so /home/user/vcpkg/installed/x64-linux/lib/libcurl.a /home/user/vcpkg/installed/x64-linux/lib/libz.a -lcrypt /home/user/vcpkg/installed/x64-linux/lib/libaws-cpp-sdk-s3.a /home/user/vcpkg/installed/x64-linux/lib/libaws-cpp-sdk-core.a /home/user/vcpkg/installed/x64-linux/lib/libcurl.a /home/user/vcpkg/installed/x64-linux/lib/libssl.a /home/user/vcpkg/installed/x64-linux/lib/libcrypto.a /home/user/vcpkg/installed/x64-linux/lib/libaws-c-event-stream.a /home/user/vcpkg/installed/x64-linux/lib/libaws-c-common.a -lrt /home/user/vcpkg/installed/x64-linux/lib/libaws-checksums.a -lpthread /home/user/vcpkg/installed/x64-linux/lib/libazurestorage.a /home/user/vcpkg/installed/x64-linux/lib/libuuid.a /home/user/vcpkg/installed/x64-linux/lib/libxml2.a /home/user/vcpkg/installed/x64-linux/lib/liblzma.a /home/user/vcpkg/installed/x64-linux/lib/libz.a /home/user/vcpkg/installed/x64-linux/lib/libcpprest.a -lpthread /home/user/vcpkg/installed/x64-linux/lib/libz.a /home/user/vcpkg/installed/x64-linux/lib/libssl.a /home/user/vcpkg/installed/x64-linux/lib/libcrypto.a -ldl /home/user/vcpkg/installed/x64-linux/lib/libboost_log.a /home/user/vcpkg/installed/x64-linux/lib/libboost_log_setup.a /home/user/vcpkg/installed/x64-linux/lib/libboost_filesystem.a /home/user/vcpkg/installed/x64-linux/lib/libboost_thread.a /home/user/vcpkg/installed/x64-linux/lib/libboost_date_time.a /home/user/vcpkg/installed/x64-linux/lib/libboost_regex.a /home/user/vcpkg/installed/x64-linux/lib/libboost_chrono.a /home/user/vcpkg/installed/x64-linux/lib/libboost_atomic.a -lstdc++ -lm -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/crtend.o /lib/../lib64/crtn.o GNU ld version 2.30-47.el7
Maybe garbage collection (-Wl,--gc-sections) happens after effect of "-Wl,--as-needed"?
Is this the reason for this behaviour?

$ readelf -s procmon.e | grep xalan
   237: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 _ZN11xalanc_1_1111XalanVe
   289: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 _ZN11xalanc_1_1111XalanVe

How can I drill deeper? (i.e. figure out what kind of symbols these are, why my executable references them, etc)
(In reply to crusader.mike from comment #4)
> Is this the reason for this behaviour?
>
> $ readelf -s procmon.e | grep xalan
> 237: 0000000000415bb0 33 FUNC WEAK DEFAULT 13 _ZN11xalanc_1_1111XalanVe
> 289: 0000000000415bb0 33 FUNC WEAK DEFAULT 13 _ZN11xalanc_1_1111XalanVe
>
> How can I drill deeper? (i.e. figure out what kind of symbols these are, why
> my executable references them, etc)

Try readelf -sW. The non-truncated symbol should be decodable by c++filt.
Symbol table '.dynsym' contains 361 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
   237: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()
   289: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()
...

Symbol table '.symtab' contains 1412 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
   859: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()
  1374: 0000000000415bb0    33 FUNC    WEAK   DEFAULT   13 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()
...

Apologies, but my understanding is somewhat limited in this area (I am reading related docs right now, but it'll take time). Can someone explain what these entries mean and how to find out why my binary ended up having them?

Another question would be why there are two symbols (_ZN11xalanc_1_1111XalanVectorItNS_31MemoryManagedConstructionTraitsItEEED1Ev and _ZN11xalanc_1_1111XalanVectorItNS_31MemoryManagedConstructionTraitsItEEED2Ev) that demangle to the same C++ symbol?
> Maybe garbage collection (-Wl,--gc-sections) happens after
> effect of "-Wl,--as-needed"?

It does, and that might be why you have shared libraries seen as needed before garbage collection runs. If you link with -Wl,-Map,mapfilename then inspecting mapfilename will show you which symbol caused each shared library to be needed.
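To make "inspecting mapfilename" concrete, here is a minimal sketch of pulling the relevant entries out of a map file. It assumes the map contains a section headed "As-needed library included to satisfy reference by file (symbol)" and that each entry is a library name followed, on the same or the next indented line, by "file (symbol)". Both the header text and the layout are assumptions to verify against the map your own ld produces; the function name is hypothetical.

```python
import re

# Assumed header text of the relevant map-file section; verify it against
# the map file your linker actually writes.
AS_NEEDED_HEADER = "As-needed library included to satisfy reference by file (symbol)"

# Matches "referencing-file (symbol)"; the file part may itself contain
# parentheses, as in "libCommon.a(NXMLNodeUnix.cpp.o)".
_REF = re.compile(r"^(?P<file>.*\S)\s+\((?P<symbol>[^()]*)\)$")


def as_needed_reasons(map_text):
    """Return {library: (referencing_file, symbol)} for the as-needed section."""
    reasons = {}
    lines = iter(map_text.splitlines())
    for line in lines:
        if line.strip() == AS_NEEDED_HEADER:
            break
    else:
        return reasons  # section not present in this map file

    pending = None      # library whose reference is expected on the next line
    seen_entry = False
    for line in lines:
        if not line.strip():
            if seen_entry:
                break   # first blank line after the entries ends the section
            continue
        seen_entry = True
        if line[0].isspace():
            # Indented continuation line carrying "file (symbol)".
            match = _REF.match(line.strip())
            if pending and match:
                reasons[pending] = (match.group("file"), match.group("symbol"))
            pending = None
        else:
            parts = line.split(None, 1)
            rest = parts[1].strip() if len(parts) > 1 else ""
            match = _REF.match(rest)
            if match:
                reasons[parts[0]] = (match.group("file"), match.group("symbol"))
            else:
                pending = parts[0]
    return reasons
```

Calling as_needed_reasons(open("mapfilename").read()) would then give, per shared library, the object file and symbol that first made it count as needed.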
Alan, you are correct -- it looks like garbage collection can remove symbol references to the point that the final binary no longer needs a given DT_NEEDED shared lib. That is precisely what happens in my case. And if you carefully read the --as-needed documentation -- it works precisely as declared (not as expected :)).

Now the question is:

1. Is there any way to discard DT_NEEDED entries that are no longer needed? (apparently determining this isn't trivial according to my admittedly basic understanding of the dynamic linker's behavior)

2. Should --as-needed behavior be modified to address this? Or is it better to make --gc-sections sensitive to --as-needed presence (and perform additional cleanup)?

Additionally, I've read a lot about [dynamic] linker behavior (big thanks to gold's author for blog posts/etc) and can answer some of my own questions:

3. That weird symbol in my final binary is an inline C++ function (XalanVector<...> destructor) that wasn't inlined.

4. ~XalanVector() wasn't garbage collected very likely because there is a global variable that uses it. Is there any way to track down that variable?

5. There is no corresponding constructor because it was inlined.

6. I can't explain why that destructor produced two entries in the .dynsym table (which end with EEED1Ev and EEED2Ev respectively). Interestingly they both have the same address/type/etc. My mapfile mentions only one of them:

.text 0x00000000004074e0 0x2cd72
...
 .text._ZN11xalanc_1_1111XalanVectorItNS_31MemoryManagedConstructionTraitsItEEED2Ev
                0x0000000000415bb0 0x21 ../../CommonLib/libCommon.a(NXMLNodeUnix.cpp.o)
                0x0000000000415bb0 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()
                0x0000000000415bb0 xalanc_1_11::XalanVector<unsigned short, xalanc_1_11::MemoryManagedConstructionTraits<unsigned short> >::~XalanVector()

plus an entry in .gcc_except_table (I assume this is used during stack unwinding):

.gcc_except_table
...
 .gcc_except_table._ZN11xalanc_1_1111XalanVectorItNS_31MemoryManagedConstructionTraitsItEEED2Ev
                0x000000000044137e 0x4 ../../CommonLib/libCommon.a(NXMLNodeUnix.cpp.o)

I would appreciate any help explaining the origin and purpose of the EEED1Ev symbol.

7. I find it curious that my final binary contains a huge .dynsym table. Even if a few symbols are actually used by shared libs -- shouldn't the rest of those entries be removed to save space? Is there a way to find which of these symbols are used (and by which shared lib)?
... about #6, running my binary with LD_DEBUG:

LD_DEBUG=bindings LD_BIND_NOW=1 ./procmon.e

produces curious output:

...
27438: binding file /usr/local/lib64/libxalan-c.so.111 [0] to /<blah>/procmon.e [0]: normal symbol `_ZN11xalanc_1_1111XalanVectorItNS_31MemoryManagedConstructionTraitsItEEED1Ev'
...

i.e. libxalan-c ends up using the EEED1Ev symbol from my executable! EEED2Ev isn't mentioned in this output at all.

Another thing -- libxalan-c.so has both of these symbols in its .dynsym and .symtab tables, both weak, both with the same address. There is only one difference: only EEED1Ev is mentioned in .rela.dyn (table of relocations?)

I am still not sure what is going on here, though...
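The same LD_DEBUG=bindings output can be used to approach question 7 (which of the executable's dynamic symbols are really used, and by which shared lib). Below is a minimal sketch that tabulates such lines; the regular expression is modelled only on the single line quoted above, and the function name is hypothetical. It assumes the debug output was captured to a file first, for example by redirecting stderr.

```python
import re

# Pattern modelled on the one line quoted in this thread, e.g.
#   27438: binding file <user> [0] to <provider> [0]: normal symbol `<name>'
_BINDING = re.compile(
    r"binding file (?P<user>\S+) \[\d+\] to (?P<provider>\S+) \[\d+\]: "
    r"\w+ symbol `(?P<symbol>[^']+)'"
)


def symbols_bound_to(debug_text, provider_suffix):
    """Map each symbol provided by a matching object to the objects that use it.

    provider_suffix selects the providing object by the end of its path,
    e.g. "procmon.e" for the executable discussed here.
    """
    used = {}
    for line in debug_text.splitlines():
        match = _BINDING.search(line)
        if match and match.group("provider").endswith(provider_suffix):
            used.setdefault(match.group("symbol"), set()).add(match.group("user"))
    return used
```

Any .dynsym entry of the executable that never shows up as a key in the result was not bound by a shared library during that particular run, which is a hint rather than proof that it is unused.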
Regarding the interaction between --gc-sections and --as-needed, yes it would be possible to run a pass over as-needed dynamic objects after garbage collection to check whether their symbols are still needed. This might be a lot of work for little gain, and to do better than just removing DT_NEEDED entries would basically require iterating the link. (A dynamic object reference to symbols in the executable or shared library being linked marks the sections of those symbols against garbage collection.)

Here's a comment from gcc/cp/mangle.c that should help explain the various destructor symbol variations.

/* Handle destructor productions of non-terminal <special-name>.
   DTOR is a destructor FUNCTION_DECL.

     <special-name> ::= D0 # deleting (in-charge) destructor
                    ::= D1 # complete object (in-charge) destructor
                    ::= D2 # base object (not-in-charge) destructor  */
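As a small illustration of the D0/D1/D2 productions quoted from gcc/cp/mangle.c, the sketch below labels a mangled name by its destructor variant. It is a suffix heuristic, not a demangler: it assumes a nested-name destructor taking no arguments (so the name ends in D0Ev, D1Ev or D2Ev) and it would misfire on an ordinary member function that happens to be called D0, D1 or D2. The function and table names are hypothetical.

```python
import re

# Descriptions taken from the gcc/cp/mangle.c comment quoted in this thread.
DESTRUCTOR_KINDS = {
    "D0": "deleting (in-charge) destructor",
    "D1": "complete object (in-charge) destructor",
    "D2": "base object (not-in-charge) destructor",
}

# Heuristic: nested name (_ZN...) ending in D<n>, E (end of nested name)
# and v (no parameters). This is an assumption, not a full parse.
_DTOR_SUFFIX = re.compile(r"(D[012])Ev$")


def destructor_kind(mangled):
    """Return the destructor variant description, or None if not recognised."""
    if not mangled.startswith("_ZN"):
        return None
    match = _DTOR_SUFFIX.search(mangled)
    return DESTRUCTOR_KINDS[match.group(1)] if match else None
```

Applied to the two symbols from this report, the name ending in EEED1Ev is labelled as the complete object destructor and the one ending in EEED2Ev as the base object destructor, which fits them demangling to the same C++ name.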
> ... it would be possible ... a lot of work for little gain

Well, in my case (which I believe isn't very rare) the majority of functionality is locked in a few large static libs (of dubious quality), and all minor tools link them and (due to various effects) end up having dummy dependencies, forcing me to package unnecessary libs in deliverables (plus some waste at runtime). Fixing this is not huge, but nice to have.

In terms of work -- what about associating a counter with every lib, which goes up for every resolved symbol and goes down for every symbol marked for garbage collection? At the end -- remove all DT_NEEDED entries with counter at 0. Well, (since I am ignorant wrt ld implementation) it is probably a dumb idea, so I'll leave this problem with those who know what they are doing.

> ... to do better than just removing DT_NEEDED entries would basically require iterating the link

What do you mean by "to do better"?

> ... that should help explain the various destructor ...

Thank you, Alan. Now it makes sense.

Can you comment on #7? I.e. why does an ELF executable end up having a large .dynsym table? Is there a way to trim it down to only the stuff used by its shared libs?

Thank you.
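The counter idea proposed above can be written down as a toy model. This is only a sketch of the proposal as worded in this comment, not of how ld works internally: the (section, library) pairs, the set of collected sections and the function name are all hypothetical inputs invented for illustration.

```python
from collections import Counter


def needed_after_gc(references, collected_sections):
    """Toy model of the per-library counter proposal.

    references: iterable of (referencing_section, shared_lib) pairs, one per
        symbol reference that was resolved against a shared library.
    collected_sections: set of section names removed by garbage collection.
    Returns the sorted list of libraries whose counter stays above zero.
    """
    references = list(references)
    counts = Counter()
    for _section, lib in references:
        counts[lib] += 1            # counter goes up for every resolved symbol
    for section, lib in references:
        if section in collected_sections:
            counts[lib] -= 1        # and down when the referencing code is collected
    return sorted(lib for lib, count in counts.items() if count > 0)
```

The model leaves out the complication raised earlier in the thread: a dynamic object that references symbols in the executable keeps those sections alive, so dropping a library could in turn free more sections, which is where iterating the link would come in.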
As per comment #7