[Bug libc/3429] New: Race in _dl_open with r_debug.r_state consistency check

While running some stress tests on one of our applications, we encountered an
assert() failure, as follows:

"Inconsistency detected by dl-open.c: 610: _dl_open: Assertion
`_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT' failed!

This was seen with glibc-2.4.31. From code inspection, the race also appears to
be present in the libc I got from CVS. We were able to reproduce it consistently
within 4-5 hours of running the tests.

Upon debugging we found that it is due to a race between two threads doing a

The scenario is something like this:

In elf/dl-open.c, _dl_open:

  /* Make sure we are alone.  */
  __rtld_lock_lock_recursive (GL(dl_load_lock));


  int errcode = _dl_catch_error (&objname, &errstring, &malloced,
                                 dl_open_worker, &args);
#ifndef MAP_COPY
  /* We must munmap() the cache file.  */
  _dl_unload_cache ();
#endif

  /* Release the lock.  */
  __rtld_lock_unlock_recursive (GL(dl_load_lock));

^^^^^ This wakes up any other thread waiting on the lock.

  if (__builtin_expect (errstring != NULL, 0))
    assert (_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT);

  assert (_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT);

If the thread that gets woken up is working on the same namespace, it can set
r_state to RT_ADD in _dl_map_object_from_fd before we reach either assert. This
is quite possible on an SMP system, or if the first thread is scheduled out
right after the unlock. In that case we hit the assert.

So it is not safe to assume that r_state stays unchanged once we have released
the lock.
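
One possible direction is to sample the state while the lock is still held and
only act on the result afterwards. The following is a minimal standalone sketch
of that idea, not a tested glibc patch. It uses a plain pthread mutex, and every
name in it (load_lock, r_state, do_load, open_checked, run_two_threads) is a
hypothetical stand-in rather than a real glibc internal.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for RT_CONSISTENT / RT_ADD and dl_load_lock.  */
enum state { CONSISTENT, ADD };

static pthread_mutex_t load_lock = PTHREAD_MUTEX_INITIALIZER;
static enum state r_state = CONSISTENT;

/* Simulates the work done under the lock: the state leaves CONSISTENT
   and returns to it before the lock is dropped.  */
static void do_load(void)
{
    r_state = ADD;
    r_state = CONSISTENT;
}

/* Returns 1 if the state was consistent when sampled.  The sample is
   taken before the unlock, so no other thread can have changed it.  */
int open_checked(void)
{
    pthread_mutex_lock(&load_lock);
    do_load();
    int consistent = (r_state == CONSISTENT);
    pthread_mutex_unlock(&load_lock);
    return consistent;
}

/* Each worker calls open_checked() repeatedly and counts failures.  */
static void *worker(void *arg)
{
    (void) arg;
    intptr_t failures = 0;
    for (int i = 0; i < 100000; i++)
        if (!open_checked())
            failures++;
    return (void *) failures;
}

/* Runs two competing workers and returns the total failure count.  */
long run_two_threads(void)
{
    pthread_t a, b;
    void *ra, *rb;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, &ra);
    pthread_join(b, &rb);
    return (long) (intptr_t) ra + (long) (intptr_t) rb;
}
```

In this sketch the check cannot be disturbed by the other thread, because
r_state is only read and written with load_lock held. Reading r_state after
pthread_mutex_unlock instead would reintroduce the window described above.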

