Bug 14622 - exit handlers not thread safe
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: stdio
Version: 2.17
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2012-09-25 16:25 UTC by law
Modified: 2014-06-25 06:46 UTC

fweimer: security-


Description law 2012-09-25 16:25:44 UTC
The exit handlers that flush and close stdio streams are not thread-safe.

For example, given this code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *test(void *arg)
{
    return NULL;
}
void *writer(void *arg)
{
    for(;;) {
        char a[100];
        FILE *f = fopen("out", "w");

        if(f == NULL)
           abort();

        fputs("Test", f);

        if(fgets(a, 100, stdin))
            fputs(a, f);
        fclose(f);
    }

    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t tid1,tid2;


    pthread_create(&tid1, NULL, writer, NULL);
    pthread_create(&tid2, NULL, test, NULL);
    pthread_join(tid2, NULL);
    return 0;
}


while true; do
  echo test | valgrind --error-exitcode=2 ./a.out  || break  
done  


If run under valgrind enough times you'll eventually see the IO cleanup handlers referencing free'd memory:

==7234== Invalid read of size 4
==7234==    at 0x34A7275FC8: _IO_file_write@@GLIBC_2.2.5 (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7275EA1: new_do_write (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7276D44: _IO_do_write@@GLIBC_2.2.5 (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7278DB6: _IO_flush_all_lockp (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7278F07: _IO_cleanup (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7238BBF: __run_exit_handlers (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A7238BF4: exit (in /usr/lib64/libc-2.15.so)
==7234==    by 0x34A722173B: (below main) (in /usr/lib64/libc-2.15.so)
==7234==  Address 0x542f2e0 is 0 bytes inside a block of size 568 free'd
==7234==    at 0x4A079AE: free (vg_replace_malloc.c:427)
==7234==    by 0x34A726B11C: fclose@@GLIBC_2.2.5 (in /usr/lib64/libc-2.15.so)
==7234==    by 0x40087C: writer (t.c:22)
==7234==    by 0x34A7607D13: start_thread (in /usr/lib64/libpthread-2.15.so)
==7234==    by 0x34A72F167C: clone (in /usr/lib64/libc-2.15.so)



What's happening is that the "writer" thread and the main program's thread are racing.  The "main" thread starts processing its exit IO handlers; the "writer" thread then fcloses the stream, which deallocates the associated memory; the "main" thread then continues processing its exit IO handlers and dereferences the free'd memory.
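
As an application-side illustration only (this does nothing to fix glibc itself), one way to sidestep the race is to make sure no other thread can still be inside fclose when main returns.  Below is a minimal sketch under that assumption.  The names stop_requested and run_and_shutdown are hypothetical, and the stdin read from the test case is dropped so the loop notices the flag promptly:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag: set by the main thread to ask the writer to finish. */
static atomic_bool stop_requested;

static void *writer(void *arg)
{
    const char *path = arg;

    /* Same open/write/close loop as the test case, but it ends on request. */
    while (!atomic_load(&stop_requested)) {
        FILE *f = fopen(path, "w");

        if (f == NULL)
            break;
        fputs("Test", f);
        fclose(f);
    }
    return NULL;
}

/* Hypothetical helper: returns 0 once the writer has been stopped and
   joined, so no thread can be in fclose when exit later flushes streams. */
int run_and_shutdown(const char *path)
{
    pthread_t tid;

    assert(path != NULL);
    atomic_store(&stop_requested, false);
    if (pthread_create(&tid, NULL, writer, (void *)path) != 0)
        return -1;

    /* Ask the writer to stop, then wait for it before returning. */
    atomic_store(&stop_requested, true);
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return 0;
}
```

A caller would invoke run_and_shutdown("out") and return from main only after it reports 0.  This only helps programs that can stop and join their own threads; it is not a general answer to the problem described here.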

The exit IO handlers explicitly avoid taking both the stream lock and the list_all_lock, presumably so that exit cannot hang on an event that's never going to happen.


At first I thought we could continue to avoid taking the stream lock, but always honour the list_all_lock in _IO_flush_all_lockp when called from the exit IO handler.  However, that can block in a three-thread case.  Thread 1 holds its stream's lock while blocked waiting on an event that will never happen.  Thread 2 tries to fclose the same stream; it acquires the list_all_lock, then blocks on the stream lock.  Thread 3 calls exit and blocks because it can't acquire the list_all_lock.
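
The three-thread case can be modelled outside glibc with two ordinary pthread mutexes.  This is only a sketch: list_all_lock_model and stream_lock_model are hypothetical stand-ins for the real locks, the step counter exists purely to force the interleaving, and thread 3 uses pthread_mutex_trylock so the model can report the blockage and then unwind instead of hanging:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for glibc's list_all_lock and a stream's lock. */
static pthread_mutex_t list_all_lock_model = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t stream_lock_model = PTHREAD_MUTEX_INITIALIZER;

/* Step counter used only to force the interleaving described above. */
static pthread_mutex_t step_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t step_cond = PTHREAD_COND_INITIALIZER;
static int step;

static void set_step(int s)
{
    pthread_mutex_lock(&step_mutex);
    step = s;
    pthread_cond_broadcast(&step_cond);
    pthread_mutex_unlock(&step_mutex);
}

static void wait_step(int s)
{
    pthread_mutex_lock(&step_mutex);
    while (step < s)
        pthread_cond_wait(&step_cond, &step_mutex);
    pthread_mutex_unlock(&step_mutex);
}

/* Thread 1: holds the stream lock while waiting on an event.  In the real
   scenario the event never arrives; here step 3 releases it for cleanup. */
static void *stuck_holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&stream_lock_model);
    set_step(1);
    wait_step(3);
    pthread_mutex_unlock(&stream_lock_model);
    return NULL;
}

/* Thread 2: models fclose, taking the list lock and then the stream lock. */
static void *closer(void *arg)
{
    (void)arg;
    wait_step(1);
    pthread_mutex_lock(&list_all_lock_model);
    set_step(2);
    pthread_mutex_lock(&stream_lock_model);   /* blocks behind thread 1 */
    pthread_mutex_unlock(&stream_lock_model);
    pthread_mutex_unlock(&list_all_lock_model);
    return NULL;
}

/* Thread 3: models exit honouring the list lock.  Returns 1 if a blocking
   acquire would have hung, 0 otherwise. */
int exit_would_block(void)
{
    pthread_t t1, t2;
    int rc, blocked;

    step = 0;
    rc = pthread_create(&t1, NULL, stuck_holder, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, closer, NULL);
    assert(rc == 0);

    wait_step(2);                 /* thread 2 now owns the list lock */
    rc = pthread_mutex_trylock(&list_all_lock_model);
    blocked = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&list_all_lock_model);

    set_step(3);                  /* let thread 1 go so the model can end */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return blocked;
}
```

In this model thread 2 cannot release the list lock until thread 1 releases the stream lock, and thread 1 is not released until after the trylock, so thread 3 always finds the list lock busy.  With a blocking lock in place of the trylock and no step 3, all three threads would wait forever.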

I don't offhand see a good solution.  Even playing games with last_stamp doesn't seem like it would work to me.
Comment 1 Rich Felker 2012-09-25 18:03:42 UTC
Since I couldn't find anything in the standard detailing how implementations are required to deal with this, I opened a request for interpretation on the Austin Group tracker:

http://austingroupbugs.net/view.php?id=611

There's also a relevant question thread on Stack Overflow, where I first saw this issue raised:

http://stackoverflow.com/questions/12549704/glibc-possible-race-condition-between-closing-file-while-exiting/