The malloc.c accesses to the *hook variables are still not thread-safe. The __malloc_hook, __memalign_hook, __realloc_hook and __free_hook variables are shared among threads and therefore should only be read and written using atomic memory accesses. Usually a simple pointer access is an atomic operation but in this case the content of the hooks has to be checked against NULL before actually using it. To make this thread-safe a temporary variable has been added which the *hook value is copied to: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/malloc/malloc.c.diff?r1=1.85&r2=1.86&cvsroot=glibc But this is undone by GCC optimization. GCC is allowed to optimize a program as long as this does not change semantics of the single threaded execution of the program. Therefore it is correct for GCC to optimize code sequences like: __malloc_ptr_t (*hook) (size_t, __const __malloc_ptr_t) = __malloc_hook; if (__builtin_expect (hook != NULL, 0)) return (*hook)(bytes, RETURN_ADDRESS (0)); into several accesses to the __malloc_hook variable instead of actually using a copy as the C code would suggest. This means that a parallel thread might set __malloc_hook to NULL after the NULL check but before invoking the hook. This optimization can be seen in s390 64 bit code but will most likely also cause problems on other architectures. upstream libc built with gcc 4.1.2: __libc_malloc: .LFB95: stmg %r9,%r15,72(%r15) .LCFI38: larl %r13,.L1308 larl %r1,__malloc_hook aghi %r15,-160 .LCFI39: lgr %r3,%r14 lgr %r9,%r2 clc .L1309-.L1308(8,%r13),0(%r1) <-- 1. dereference check jne .L1299 ... .L1299: lg %r1,0(%r1) <-- 2. dereference lmg %r9,%r15,232(%r15) br %r1 <-- invokation
Created attachment 3827 [details] Possible fix for the problem The patch adds a compiler memory barrier for the respective hook using an inline assembly clobbering the memory location.
The hooks are not thread-safe. Period. What are you trying to fix? If anything, the variables should be marked volatile. You can try that.
(In reply to comment #2) > The hooks are not thread-safe. Period. What are you trying to fix? Due to your comment (from several years ago) to rev 1.86: "Make access to ..._hook pointers thread-safe." I assumed that these hooks originally were intended to be thread-safe and I really think they should. > If anything, the variables should be marked volatile. You can try that. Yes "volatile" would also prevent GCC from optimizing here but I think it goes beyond what we actually need. "volatile" basically disables every optimization wrt that variable. I think the inline assembly exactly expresses what happens here and is a more subtle way to fix the problem. Unfortunately I am not able to come up with a testcase. A customer just provided a core dump of a program which crashes due to the __malloc_hook variable set to NULL while trying to invoke it. The customer is already testing a similar patch and runs without problems for a week by now. The customer code does not set the __malloc_hook variable itself so I can just assume that the modification comes from within libc.
I've checked in a patch which should have the same effect.