This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/17980] New: sem_* interoperability between 32bit and 64bit binaries
- From: "glibc at schuster dot re" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Sat, 14 Feb 2015 16:42:10 +0000
- Subject: [Bug nptl/17980] New: sem_* interoperability between 32bit and 64bit binaries
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=17980
Bug ID: 17980
Summary: sem_* interoperability between 32bit and 64bit
binaries
Product: glibc
Version: 2.21
Status: NEW
Severity: normal
Priority: P2
Component: nptl
Assignee: unassigned at sourceware dot org
Reporter: glibc at schuster dot re
CC: drepper.fsp at gmail dot com
Created attachment 8128
--> https://sourceware.org/bugzilla/attachment.cgi?id=8128&action=edit
A minimal working example illustrating the issue
Description:
Since my distribution's update from glibc 2.20 to 2.21, a mixed 32bit/64bit
application using POSIX semaphores (sem_open, sem_post, ...) has started
aborting inside sem_post.
Mixed in this case means that a 32bit application sets up a named
semaphore using sem_open and waits for input from the 64bit application, which
calls sem_post upon completion of a certain task.
Running strace on the executable reveals the following futex call as the cause
(glibc calls abort() after the failed syscall):
futex(0x7f5281f83000, 0xf7784081 /* FUTEX_??? */, 1) = -1 ENOSYS (Function not implemented)
Unfortunately the code for this binary (as well as the build chain) is a bit
convoluted and ill-suited for debugging (if necessary, I can provide it,
along with build instructions).
A bit of debugging reveals that this issue also occurs with other 32bit/64bit
program combinations, so I produced a minimal working example (see the
attachment).
It does not crash like the original binary, but simply fails to work as expected:
Steps to Reproduce:
* Unpack the tar (tar xf minimal-working-example.tar.gz)
* Build all binaries (make all)
* Execute ./mwe and ./mwec in two terminals for the reference run
* One will observe an interleaved execution: mwe sleeps/waits once for each
  call of mwec-unlock
* Execute ./mwe_cleanup to reset/delete the named semaphore
* Execute ./mwe-32 in one terminal and ./mwec in another
* The semaphore unlock is not recognized by ./mwe-32
Looking into the sources, my best bet is that the performance optimization for
"new_sem" in 2.21 changed the struct-layout of new_sem in
"sysdeps/nptl/internaltypes.h" for 64bit binaries (as AMD64 provides
"__HAVE_64B_ATOMICS"), but the 32bit version does not provide those, so the
struct-layouts differ.
In the case of named semaphores those structs seem to be written to a file,
which is then mmapped into each process's address space by sem_open. As the
struct layouts now differ, 32bit and 64bit binaries are no longer
interoperable.
I've produced a patch that seems to fix the issue for me by reordering
struct members (see file 0001-Amortize-layout-of-struct-new_sem.patch
attached). Please note that this patch is more of a "proof of concept", as it
only takes AMD64 into account and most likely will not play nicely with
big-endian architectures...
The problem persists with:
* glibc-2.21
* git-master (latest commit is 3f293d614c9e641a0d96d347df5c1c5ee687762f)
Note:
Bugzilla seems to allow only one attachment per bug report, so I will attach
the patch in a comment below.