It seems wrong to me that calling a glibc function can mess up the application's namespace just because the glibc function loads libraries as an implementation detail. Looking around the web, I find references to similar namespace issues, e.g., http://blog.schmichael.com/2006/11/29/mdns-crashes-samba/ . Might there be a general solution to this issue, or do applications just need to keep working around it by renaming their functions?
A paste of the original GDB bug report follows, for convenience.
"GDB generates a SIGSEGV whenever target remote hostname:port command is used (either in .gdbinit or by command). Tracked this down to what appears to be a namespace conflict for timeval_add. Stack trace inserted below. There is a global symbol in libsamba-util.so.0 (timeval_add) and also in libiberty which is part of GDB. This is version Fedora 7.6.1-46.fc19.
[fc19 gdb-7.6.1]$ readelf -s /lib64/libsamba-util.so.0 | grep timeval_add
326: 0000003d78a0f0a0 59 FUNC GLOBAL DEFAULT 12 timeval_add@@SAMBA_UTIL_0.0.1
[fc19 gdb-7.6.1]$ readelf -s /lib64/libsamba-util.so.0 | grep timeval_current_ofs
231: 0000003d78a0f130 63 FUNC GLOBAL DEFAULT 12 timeval_current_ofs_msec@@SAMBA_UTIL_0.0.1
343: 0000003d78a0f100 43 FUNC GLOBAL DEFAULT 12 timeval_current_ofs@@SAMBA_UTIL_0.0.1
691: 0000003d78a0f170 59 FUNC GLOBAL DEFAULT 12 timeval_current_ofs_usec@@SAMBA_UTIL_0.0.1
#0 timeval_add (result=0x7fffffffd850, a=0x0, b=0x3d090) at ../../libiberty/timeval-utils.c:57
#1 0x0000003d78a0f124 in timeval_current_ofs () from /lib64/libsamba-util.so.0
#2 0x0000003d71e12b84 in name_query () from /usr/lib64/samba/libgse.so
#3 0x00007ffff0d3d489 in _nss_wins_gethostbyname_r () from /lib64/libnss_wins.so.2
#4 0x0000003d3ad0ebd3 in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
During symbol reading, Child DIE 0xa7383 and its abstract origin 0xa7e56 have different tags.
During symbol reading, Child DIE 0xa7383 and its abstract origin 0xa7e56 have different parents.
#5 0x0000003d3ad0e316 in gethostbyname () from /lib64/libc.so.6
#6 0x000000000047d69a in net_open (scb=0xdb30d0, name=<optimized out>) at ../../gdb/ser-tcp.c:194
During symbol reading, cannot get low and high bounds for subprogram DIE at 1217621.
During symbol reading, Multiple children of DIE 0x12ba07 refer to DIE 0x129485 as their abstract origin.
During symbol reading, DW_AT_GNU_call_site_target target DIE has invalid low pc, for referencing DIE 0x12efd3 [in module /home/bruce/rpmbuild/BUILD/gdb-7.6.1/build-x86_64-redhat-linux/gdb/gdb].
#7 0x0000000000637f04 in serial_open (name=<optimized out>, name@entry=0xc2219e "office:1234") at ../../gdb/serial.c:221
#8 0x00000000004a3eac in remote_serial_open (name=0xc2219e "hdev:1234") at ../../gdb/remote.c:3712
#9 remote_open_1 (name=<optimized out>, from_tty=0, target=0xb22160 <remote_ops>, extended_p=0) at ../../gdb/remote.c:4257
#10 0x000000000064127a in execute_command (p=0xc221a8 "4", p@entry=0xc22190 "target remote office:1234", from_tty=0) at ../../gdb/top.c:487
#11 0x0000000000641efb in command_loop () at ../../gdb/top.c:590
#12 0x0000000000641fb9 in read_command_file (stream=stream@entry=0xcd51a0) at ../../gdb/top.c:330
#13 0x00000000004bcc42 in script_from_file (stream=stream@entry=0xcd51a0, file=file@entry=0xd48860 "/home/bruce/.gdbinit") at ../../gdb/cli/cli-script.c:1654
#14 0x00000000004bd366 in source_script_from_stream (stream=0xcd51a0, file=0xd48860 "/home/bruce/.gdbinit") at ../../gdb/cli/cli-cmds.c:545
#15 0x00000000004bf04f in source_script_with_search (file=0xd48860 "/home/bruce/.gdbinit", from_tty=<optimized out>, search_path=0)
#16 0x0000000000584a2e in catch_command_errors (command=0x4bf170 <source_script>, arg=0xd48860 "/home/bruce/.gdbinit", from_tty=from_tty@entry=0,
mask=mask@entry=6) at ../../gdb/exceptions.c:573
#17 0x000000000058773f in captured_main (data=data@entry=0x7fffffffe010) at ../../gdb/main.c:958
#18 0x000000000058495a in catch_errors (func=func@entry=0x586430 <captured_main>, func_args=func_args@entry=0x7fffffffe010,
errstring=errstring@entry=0x7207cb "", mask=mask@entry=6) at ../../gdb/exceptions.c:546
#19 0x00000000005878d4 in gdb_main (args=args@entry=0x7fffffffe010) at ../../gdb/main.c:1144
#20 0x00000000004543de in main (argc=<optimized out>, argv=<optimized out>) at ../../gdb/gdb.c:34
Florian Weimer and I had a discussion about this last year. In the general case there is no option other than to track and enforce the global namespace of all libraries and exported symbols for the entire distribution, and to avoid overlap. Note that LSB does some of this by standardizing the exported symbol lists. Florian had a database that he constructed to test for these kinds of things.
The discussion ends here, where we both agree that no amount of tooling will fix this, but that tooling will help mitigate the problems:
We may be able to use dlmopen() to some extent to ensure that modules loaded by the implementation are hidden from the user's modules, but eventually the breakage will happen, and the only solution is to alter the linkage scope (drawing a wall around things) or fix the symbols.
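To illustrate the kind of overlap check such a database makes possible, here is a minimal sketch in C. It assumes the exported names of each object have already been extracted (for example with `nm -D --defined-only`) and that version suffixes such as `@@SAMBA_UTIL_0.0.1` have been stripped. The function name `find_symbol_clashes` is hypothetical, and this is not Florian's actual tooling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report every name that is defined in both symbol
   lists.  Writes at most out_cap matches to out and returns the total
   number of clashes found.  Quadratic, which is fine for a sketch; a real
   distribution-wide check would use a sorted index or a database. */
size_t find_symbol_clashes(const char *const *a, size_t na,
                           const char *const *b, size_t nb,
                           const char **out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            if (strcmp(a[i], b[j]) == 0) {
                if (found < out_cap)
                    out[found] = a[i];
                found++;
                break;  /* count each name from list a only once */
            }
        }
    }
    return found;
}
```

Fed the exported names of gdb and of libsamba-util.so.0, a check like this would flag timeval_add, the symbol that both objects define in the report above.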
Does that make sense?
Retitled to reflect RFE nature.
This is being discussed here:
The only solution to this is to use dlmopen to load the NSS modules, isolating the third-party objects from the application; the same applies to libgcc's use for unwinding.
This is blocked on fixing 18684 which should get dlmopen working correctly.
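For readers unfamiliar with dlmopen, the following is a minimal sketch of loading an object into a separate link-map namespace, so that its symbols and those of its dependencies neither bind to nor interpose the application's symbols. The helper names `load_isolated` and `namespace_of` are hypothetical, and this is not how glibc loads NSS modules today; it only shows the isolation mechanism being proposed.

```c
#define _GNU_SOURCE          /* for dlmopen, dlinfo, LM_ID_NEWLM */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load soname into a brand-new namespace and look up
   symbol there.  Returns the handle, or NULL on failure; the symbol's
   address is stored in *sym_out. */
void *load_isolated(const char *soname, const char *symbol, void **sym_out)
{
    void *handle;

    *sym_out = NULL;
    handle = dlmopen(LM_ID_NEWLM, soname, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    *sym_out = dlsym(handle, symbol);
    if (*sym_out == NULL) {
        dlclose(handle);
        return NULL;
    }
    return handle;
}

/* Hypothetical helper: return the namespace ID a handle belongs to, or -1
   on error.  The application itself lives in LM_ID_BASE. */
long namespace_of(void *handle)
{
    Lmid_t id;

    if (dlinfo(handle, RTLD_DI_LMID, &id) != 0)
        return -1;
    return (long)id;
}
```

As a stand-in for an NSS module, calling load_isolated("libm.so.6", "cos", &sym) should yield a handle whose namespace differs from LM_ID_BASE. On glibc older than 2.34 the program must be linked with -ldl.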
For reference, the original gdb bug report was further clarified and discussed here:
GDB is now carrying a workaround for the Python bug that makes GDB export all its symbols in the dynamic table, which in turn exposes the NSS module's namespace pollution. That workaround works as long as GDB is not configured against a static Python:
Roland's suggestion for tooling was interesting:
case you hit was in a third-party NSS module. Obviously we can't
ourselves do anything directly about the quality of third-party
modules. But we could potentially provide a tool to vet third-party
modules for our name space rules.
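A vetting tool along these lines might start from a check like the sketch below. It assumes the rule being enforced is the NSS naming scheme, under which a module for service X exports only symbols prefixed with `_nss_X_` (as `_nss_wins_gethostbyname_r` is in the backtrace above). Roland's message does not spell out the rules, so that choice and the function names are assumptions.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: does symbol carry the "_nss_<service>_" prefix?
   Returns 1 if it does, 0 otherwise (including when the service name is
   too long for the local buffer). */
int symbol_follows_nss_rule(const char *service, const char *symbol)
{
    char prefix[64];
    int n = snprintf(prefix, sizeof prefix, "_nss_%s_", service);

    if (n < 0 || (size_t)n >= sizeof prefix)
        return 0;
    return strncmp(symbol, prefix, (size_t)n) == 0;
}

/* Hypothetical driver: count the exported names that break the rule. */
size_t count_rule_violations(const char *service,
                             const char *const *symbols, size_t count)
{
    size_t violations = 0;

    for (size_t i = 0; i < count; i++)
        if (!symbol_follows_nss_rule(service, symbols[i]))
            violations++;
    return violations;
}
```

In the case reported here the offending timeval_add is exported by a dependency (libsamba-util.so.0), not by libnss_wins.so.2 itself, so a useful tool would have to apply the check to the module's whole dependency closure.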