The structure new_sem (internaltypes.h, __HAVE_64B_ATOMICS == 0) changed between glibc 2.20 and 2.21 in the way the semaphore value is stored. The value is now kept in the top 31 bits of the field "value"; bit 0 is used to indicate whether there are any waiters. This breaks the code of __old_sem_wait (nptl/sem_wait.c). In particular, the call to "atomic_decrement_if_positive" needs to be replaced by one that understands the new layout of new_sem; otherwise we may end up looping forever, because the futex value may be interpreted as negative. For example, if the user initializes the semaphore value to SEM_VALUE_MAX (2147483647), we actually store 0x7FFFFFFF << 1 == 0xFFFFFFFE, which as a signed int is -2, so "atomic_decrement_if_positive" will never return a value > 0. Not to mention that the decrement also clobbers bit 0.
Does the hang happen on every call to __old_sem_wait?
(In reply to Florian Weimer from comment #1) > Does the hang happen on every call to __old_sem_wait? It is hard to say. In my case the semaphore is initialized via "new_sem_init", but the wait is then done via "old_sem_wait". In that combination there will always be problems, as the two calls assume different layouts for sem_t. Prior to glibc 2.21 this combination worked, because the two structures agreed on the interpretation of the field "value". Normally, I would expect no problems with the combination old_sem_init/old_sem_wait. I should clarify that I ran into this problem using Python 2.7/multiprocessing.
We have two sets of routines: old_sem_init/old_sem_wait and new_sem_init/new_sem_wait. If we use matched pairs, there are no problems. But there is only one sem_open routine, and it uses the new_sem structure. So if we create a semaphore using sem_open, we must use new_sem_wait. That is not guaranteed, and as a matter of fact this mismatch is the cause of the problem I am seeing. So I think there should be two routines, new_sem_open and old_sem_open, as well.