libc hang in mutex acquisition on exit in single-threaded process

Christian Grothoff grothoff@gnunet.org
Sat Feb 16 19:33:00 GMT 2019


Dear GNU libc helpers,

I'm seeing some _very_ odd behavior with processes hanging on exit (?)
with GNU libc 2.28-6 on Debian (amd64 threadripper).  This seems to
happen at random (for random tests, with very low frequency!) in the
GNUnet (Git master) testsuite when a child process is about to exit.

With gdb, I see this:

(gdb) ba
#0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1 0x00007f6ea4dfddcb in __unregister_atfork (dso_handle=0x7f6ea4eb5940 <atfork_lock>, dso_handle@entry=0x55a865ebd0b8) at register-atfork.c:80
#2 0x00007f6ea4d314a9 in __cxa_finalize (d=0x55a865ebd0b8) at cxa_finalize.c:107
#3 0x000055a865eba233 in __do_global_dtors_aux ()
#4 0x00007ffe871814b0 in ?? ()
#5 0x00007f6ea510f686 in _dl_fini () at dl-fini.c:138
Backtrace stopped: frame did not save the PC

Note that (some of) our code uses the dlopen() API, but we do _not_ use
any threads.  The "__exit_funcs_lock" is 2 at the time, and the memory
around it does not appear to be corrupted (mostly zeros).  Valgrind is
happy with our code (but that of course is no assurance).

The hang happens rarely, but I've seen it at least twice now (that is
out of perhaps 100 test suite runs in the last two weeks, with 100000+
processes forked and waitpid'ed overall).
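
Since the hang is this rare, one way to catch it when it does occur is
to have the test harness wait for each child with a timeout instead of
blocking in waitpid() forever. The following is only a minimal sketch of
that idea, not the GNUnet code: the names wait_with_timeout() and
spawn_sleeper() are hypothetical, the 10 ms polling step is an arbitrary
choice, and the sleeping child merely stands in for a real test process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll for the exit of child `pid` for at most
   `timeout_ms` milliseconds.
   Returns 1 if the child was reaped (exit status stored in *status),
   0 if it is still running after the timeout, and -1 on error. */
static int wait_with_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    for (int waited_ms = 0; ; waited_ms += 10) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return 1;               /* child exited and was reaped */
        if (r == -1 && errno != EINTR)
            return -1;              /* e.g. ECHILD: not our child */
        if (waited_ms >= timeout_ms)
            return 0;               /* still running: possible hang */
        nanosleep(&step, NULL);
    }
}

/* Hypothetical stand-in for a test process: the child sleeps for
   `ms` milliseconds and then terminates. Returns the child's PID
   (or -1 if fork() failed). */
static pid_t spawn_sleeper(unsigned ms)
{
    pid_t pid = fork();
    if (pid == 0) {
        struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
        _exit(0);
    }
    return pid;
}
```

When wait_with_timeout() returns 0, the harness could log the PID and
leave the child alive, so that a debugger can be attached to it (for
example with "gdb -p PID") while it is still stuck.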


Florian Weimer suggested this was the right place to find help to
investigate this, as I don't even have any idea how to start.

Do you have any advice?

Best regards,

Christian


