Bug 21672 - sys-libs/glibc on ia64 crashes on thread exit: signal SIGSEGV, Segmentation fault: pthread_create.c:432: __madvise (pd->stackblock, freesize - PTHREAD_STACK_MIN, MADV_DONTNEED);
Description Sergei Trofimovich 2017-06-25 21:52:33 UTC
First found in Gentoo: https://bugs.gentoo.org/622694

The test file:

$ cat bug.c 
    // how to crash: gcc -O0 -ggdb3 -o r bug.c -pthread && ./r

    #include <pthread.h>

    static void * f (void * p)
    {
        return NULL;
    }

    int main (int argc, const char ** argv)
    {
        pthread_t t;
        pthread_create (&t, NULL, &f, NULL);

        pthread_join (t, NULL);
        return 0;
    }

How to crash:
$ gcc -O0 -ggdb3 -o r bug.c -pthread && ./r
Segmentation fault (core dumped)

$  gdb r core
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x2000000000077da0 in start_thread (arg=0x0) at pthread_create.c:432
432         __madvise (pd->stackblock, freesize - PTHREAD_STACK_MIN, MADV_DONTNEED);
[Current thread is 1 (Thread 0x2000000000b6b1f0 (LWP 20912))]

(gdb) list
427     #ifdef _STACK_GROWS_DOWN
428       char *sp = CURRENT_STACK_FRAME;
429       size_t freesize = (sp - (char *) pd->stackblock) & ~pagesize_m1;
430       assert (freesize < pd->stackblock_size);
431       if (freesize > PTHREAD_STACK_MIN)
432         __madvise (pd->stackblock, freesize - PTHREAD_STACK_MIN, MADV_DONTNEED);
433     #else
434       /* Page aligned start of memory to free (higher than or equal
435          to current sp plus the minimum stack size).  */
436       void *freeblock = (void*)((size_t)(CURRENT_STACK_FRAME

#0  0x2000000000077da0 in start_thread (arg=0x0) at pthread_create.c:432
        pd = 0x0
        now = <optimized out>
        unwind_buf = <error reading variable unwind_buf (Cannot access memory at address 0xfffffffffffffd90)>
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = 0x2000000000b6a870 ""
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#1  0x0000000000000000 in ?? ()
Comment 1 Sergei Trofimovich 2017-06-25 22:02:44 UTC
Created attachment 10221

The SIGSEGV is caused by the code responsible for stack cleanup
when a thread exits. madvise(MADV_DONTNEED) is called on a part of the
stack that is still actively in use at exit.

This happens because on ia64 the stack grows from both ends of the stack block:
 - the normal "sp" stack (the stack for local variables) grows down
 - the register stack "bsp" grows up from the opposite end of the stack block
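
To make the overlap concrete, here is a minimal sketch that models the
layout described above and repeats the arithmetic of the _STACK_GROWS_DOWN
branch from the listing. The struct stack_model type, its fields and both
helper functions are hypothetical names made up for illustration; they are
not glibc's struct pthread or any glibc API. The sketch assumes, as
described above, that the register stack starts at the low end of the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one thread stack block (not glibc's struct pthread). */
struct stack_model
{
  uintptr_t block;      /* lowest address of the stack block */
  size_t block_size;    /* size of the whole block in bytes */
  uintptr_t sp;         /* memory stack pointer, grows down from the top */
  uintptr_t bsp;        /* register stack pointer, grows up from 'block' */
};

/* Length of the range, starting at m->block, that the _STACK_GROWS_DOWN
   branch would hand to madvise.  Returns 0 when nothing is advised.  */
static size_t
advised_length (const struct stack_model *m, size_t pagesize, size_t stack_min)
{
  size_t freesize = (m->sp - m->block) & ~(pagesize - 1);
  assert (freesize < m->block_size);
  return freesize > stack_min ? freesize - stack_min : 0;
}

/* True when the advised range [block, block + length) overlaps the live
   register stack [block, bsp).  Both ranges start at 'block', so they
   overlap whenever both are non-empty.  */
static bool
register_stack_clobbered (const struct stack_model *m, size_t pagesize,
                          size_t stack_min)
{
  return advised_length (m, pagesize, stack_min) > 0 && m->bsp > m->block;
}
```

With any block in which "sp" is still far above the low end and "bsp" has
moved up at all, register_stack_clobbered reports an overlap: the advised
range is computed as if everything below "sp" were unused.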

madvise(MADV_DONTNEED) effectively zeroes the register stack (as memset(0)
would), which later causes a SIGSEGV at address 0x8 when a pointer loaded
from the stack is dereferenced.
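
The zeroing effect can be seen in isolation with a small Linux-only
sketch. It assumes a private anonymous mapping, which is how thread stacks
are normally allocated; the helper name value_after_dontneed is made up for
this illustration and the code is not taken from glibc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stores 'stored' into a fresh private anonymous page, discards the page
   with MADV_DONTNEED and returns what is read back afterwards.
   Returns -1 if the mapping or the madvise call fails.  */
static long
value_after_dontneed (long stored)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  long *page = mmap (NULL, pagesize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return -1;

  page[0] = stored;

  /* The kernel may drop the page; the next access gets a zero-filled one.  */
  if (madvise (page, pagesize, MADV_DONTNEED) != 0)
    {
      munmap (page, pagesize);
      return -1;
    }

  long seen = page[0];
  munmap (page, pagesize);
  return seen;
}
```

On Linux the stored value does not survive the call and the read returns 0.
A pointer saved on the register stack is lost the same way, so a later
dereference of it (plus a small field offset) faults near address 0.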
Comment 2 Sergei Trofimovich 2017-06-25 22:09:23 UTC
Sent the patch to libc-alpha as https://sourceware.org/ml/libc-alpha/2017-06/msg01265.html
Comment 3 Sergei Trofimovich 2017-06-27 09:57:41 UTC
Described the breakage mechanics in more detail: http://trofi.github.io/posts/202-stack-growth-direction-how-hard-can-it-be.html