Help: has anyone ever seen a hang on _IO_lock_lock(list_all_lock)?

Wuqixuan wuqixuan@huawei.com
Wed Nov 13 06:30:00 GMT 2013


> That is odd. However it could be the result of an unbalanced set of
> locks and unlocks. That could result in the problem you're seeing.

> The IO lock can be taken recursively incrementing cnt, and
> decrementing cnt on unlock.

> Once it decrements to 0 the lock is unlocked.

> If something corrupted the cnt value then it will not unlock.

> e.g.
> #define _IO_lock_unlock(_name) \
>   do {                                                                        \
>     if (--(_name).cnt == 0)                                                   \
>       {                                                                       \
>         (_name).owner = NULL;                                                 \
>         lll_unlock ((_name).lock, LLL_PRIVATE);                               \
>       }                                                                       \
>   } while (0)

> See the `cnt == 0' won't be true and it won't unlock or clear the
> owner, and this thread will continue to do something else.

> The lock will be leaked at that point.

> Is it alive? Dead? Backtrace?

The issue happened only once on my side and I cannot reproduce it. The environment is no longer available.

Yes, if the cnt value is corrupted, nobody can take this lock anymore. But do you know how cnt could have been corrupted in our case, and how to reproduce it? I guess some other bug in glibc 2.4 causes the unbalanced set of locks and unlocks. Do you know what that bug is?
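
To make the failure mode concrete, here is a minimal single-threaded sketch of the counting logic. It is only a model: `struct demo_lock`, `demo_lock_lock`, `demo_lock_unlock` and `leaked_after_skew` are hypothetical names, and a plain int stands in for the low-level lock that lll_lock/lll_unlock operate on. The unlock path has the same shape as the quoted _IO_lock_unlock macro.

```c
#include <stddef.h>

/* Hypothetical stand-in for the stdio lock: a recursive lock that
   tracks its owner and a recursion count.  The real lock is a
   low-level futex lock; a plain int is enough to show the counting.  */
struct demo_lock {
  int lock;      /* 1 while held, 0 when free */
  int cnt;       /* recursion depth */
  void *owner;   /* identity of the holder, or NULL */
};

/* Single-threaded model of a recursive acquire.  */
static void demo_lock_lock(struct demo_lock *l, void *self)
{
  if (l->owner != self) {
    l->lock = 1;       /* real code would block in lll_lock here */
    l->owner = self;
  }
  ++l->cnt;
}

/* Same shape as the quoted _IO_lock_unlock macro.  */
static void demo_lock_unlock(struct demo_lock *l)
{
  if (--l->cnt == 0) {
    l->owner = NULL;
    l->lock = 0;       /* real code would call lll_unlock here */
  }
}

/* Returns 1 if the lock is still held after one lock/unlock pair in
   which cnt was disturbed by `skew` in between, 0 otherwise.  */
static int leaked_after_skew(int skew)
{
  struct demo_lock l = { 0, 0, NULL };
  int self;            /* its address serves as a fake thread identity */

  demo_lock_lock(&l, &self);
  l.cnt += skew;       /* simulate corruption or an unbalanced increment */
  demo_lock_unlock(&l);
  return l.lock != 0;
}
```

In this model, `leaked_after_skew(0)` releases the lock, while any nonzero skew leaves `lock` set and `owner` non-NULL, so a later caller with a different identity would block forever. It does not show what disturbed cnt in our case, only why one stray change is enough.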

We found http://sourceware.org/git/?p=glibc.git;a=commit;h=7583a88d1c7170caad26966bcea8bfc2c92093ba, a fix committed by schwab.
The patch seems to say that flush_cleanup has a bug that could corrupt cnt. Do you know what the exact issue was that this patch fixed?

Thanks a lot and regards,
Wuqixuan.

