18726 – Futexes are broken on MIPS/n32

Bug 18726 - Futexes are broken on MIPS/n32

Summary: Futexes are broken on MIPS/n32

Status:	RESOLVED OBSOLETE

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	nptl (show other bugs)
Version:	unspecified

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2015-07-27 13:30 UTC by Petr Malat
Modified:	2018-07-09 08:55 UTC (History)
CC List:	4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Flags:	fweimer: security-

Attachments
Sample program illustrating the issue (953 bytes, text/x-csrc) 2015-07-27 13:30 UTC, Petr Malat	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Petr Malat 2015-07-27 13:30:52 UTC

Created attachment 8454 [details]
Sample program illustrating the issue

Hi,
there is a problem in generic syscall implementation, which breaks futexes, which break pthread_* functions using them.

The problem is in
  sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
file in the macro
  #define ARGIFY(X) ((long long) (__typeof__ ((X) - (X))) (X))
which is then used to load syscall arguments into registers. I assume its purpose is to avoid sign extension of an argument if it's of unsigned type, which seems to be wrong. The CPU always does sign extensions while it's loading 32bit integers and this is expected by Linux and GCC. The problem arises, if an argument passed this way and a loaded value are compared. Even if they are the same, comparison fails because one of them is sign extended and the second one isn't.

This can be fixed by replacing ARGIFY with a simple cast to long, however it wouldn't work for a syscall, which takes 64bit argument, but I do not know about a syscall, which would take such an argument on 32bit platform (e.g. llseek splits its arguments into two 32bit ones). Why this macro was introduced?

See the attached code, which illustrates the problem.

The issue doesn't arise if futex is called trough syscall().

Comment 1 jsm-csl@polyomino.org.uk 2015-07-27 15:33:40 UTC

On Mon, 27 Jul 2015, oss at malat dot biz wrote:

> Hi,
> there is a problem in generic syscall implementation, which breaks futexes,
> which break pthread_* functions using them.

What glibc and Linux kernel versions did you test with?

> The problem is in
>   sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
> file in the macro
>   #define ARGIFY(X) ((long long) (__typeof__ ((X) - (X))) (X))
> which is then used to load syscall arguments into registers. I assume its
> purpose is to avoid sign extension of an argument if it's of unsigned type,

No, its purpose is to avoid warnings for converting 32-bit pointers to 
64-bit integers - which means that 32-bit pointers have to be converted to 
32-bit integers before being converted to 64-bit integers.  See 
<https://sourceware.org/ml/libc-ports/2007-06/msg00003.html>.

> which seems to be wrong. The CPU always does sign extensions while it's loading
> 32bit integers and this is expected by Linux and GCC. The problem arises, if an

Linux (versions since SYSCALL_WRAPPERS, which I think includes all 2.6.29 
and later versions) should always do that sign extension on syscall entry 
to avoid a security issue (CVE-2009-0029, see bug 4459).  The minimum 
kernel supported by current glibc is 2.6.32 (likely to move to 3.2 in the 
not too distant future when the 2.6.32 kernel series ceases to be 
maintained).

> This can be fixed by replacing ARGIFY with a simple cast to long, however it
> wouldn't work for a syscall, which takes 64bit argument, but I do not know
> about a syscall, which would take such an argument on 32bit platform (e.g.

n32 syscalls take arguments in the same way as n32 userspace functions, in 
general (that is, 64-bit arguments go in single registers).  Look at the 
various *.c files in sysdeps/unix/sysv/linux/mips/mips64/n32/ for code 
using the syscall macros in such a way that requires them to work with 
64-bit integer arguments.

Comment 2 Petr Malat 2015-07-27 16:18:33 UTC

(In reply to joseph@codesourcery.com from comment #1)
> On Mon, 27 Jul 2015, oss at malat dot biz wrote:
> 
> > Hi,
> > there is a problem in generic syscall implementation, which breaks
> > futexes, which break pthread_* functions using them.
> 
> What glibc and Linux kernel versions did you test with?
Linux 2.6.32 and glibc-2.9 (with glibc-ports-2.9), then with 2.16 and then
I've checked the code trough gitweb and it's the same.

> > The problem is in
> >   sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
> > file in the macro
> >   #define ARGIFY(X) ((long long) (__typeof__ ((X) - (X))) (X))
> > which is then used to load syscall arguments into registers. I assume its
> > purpose is to avoid sign extension of an argument if it's of unsigned 
> > type,
> 
> No, its purpose is to avoid warnings for converting 32-bit pointers to 
> 64-bit integers - which means that 32-bit pointers have to be converted to 
> 32-bit integers before being converted to 64-bit integers.  See 
> <https://sourceware.org/ml/libc-ports/2007-06/msg00003.html>.
That's what I was looking for. The old code seems wrong too, it prevents sign extension.
 
> > which seems to be wrong. The CPU always does sign extensions while it's
> > loading 32bit integers and this is expected by Linux and GCC. The problem
> > arises, if an
> 
> Linux (versions since SYSCALL_WRAPPERS, which I think includes all 2.6.29 
> and later versions) should always do that sign extension on syscall entry 
> to avoid a security issue (CVE-2009-0029, see bug 4459).  The minimum 
> kernel supported by current glibc is 2.6.32 (likely to move to 3.2 in the 
> not too distant future when the 2.6.32 kernel series ceases to be 
> maintained).
> 
> > This can be fixed by replacing ARGIFY with a simple cast to long, however
> > it wouldn't work for a syscall, which takes 64bit argument, but I do not
> > know about a syscall, which would take such an argument on 32bit platform
> n32 syscalls take arguments in the same way as n32 userspace functions, in 
> general (that is, 64-bit arguments go in single registers).  Look at the 
> various *.c files in sysdeps/unix/sysv/linux/mips/mips64/n32/ for code 
> using the syscall macros in such a way that requires them to work with 
> 64-bit integer arguments.
In that case the fix is not going to be so simple (I will check the code), but I still think ARGIFY is wrong:
 - Assume the futex value is 0x8000 0000.
 - Because it's unsigned, ARGIFY expands to
      ((long long) (unsigned) (0x8000 0000))
   so the content of the register is:
      0x0000 0000 8000 0000
 - Then the futex syscall is made and the kernel loads the value for 
   comparison from the provided memory address into an unsigned variable 
   (using load word instruction)
 - The content of register is 0xFFFF FFFF 8000 0000 because sign extension has
   been done
 - Comparison fails (because whole registers are compared even when unsigned
   integers are used) and the process doesn't go to sleep. If this happens in
   e.g. pthread_cond_wail, the process ends up livelocked.

Comment 3 jsm-csl@polyomino.org.uk 2015-07-27 16:51:18 UTC

On Mon, 27 Jul 2015, oss at malat dot biz wrote:

> In that case the fix is not going to be so simple (I will check the code), but
> I still think ARGIFY is wrong:
>  - Assume the futex value is 0x8000 0000.
>  - Because it's unsigned, ARGIFY expands to
>       ((long long) (unsigned) (0x8000 0000))
>    so the content of the register is:
>       0x0000 0000 8000 0000
>  - Then the futex syscall is made and the kernel loads the value for 
>    comparison from the provided memory address into an unsigned variable 
>    (using load word instruction)

But the code in the kernel that handles syscall arguments should have 
sign-extended the register value before it ever reached the C 
implementation of futex in the kernel with a 32-bit argument type, so it 
shouldn't matter what the high word of the register passed to the kernel 
was.  If the syscall wrapper code in the kernel doesn't do so, then 
hostile userspace code can cause security issues by passing values to the 
kernel that result in C code having register values for arguments that 
don't conform to the ABI and so undefined behavior in that C code.  And 
because of the privilege boundary involved, the kernel can never rely on 
userspace code sign-extending arguments correctly.

So why isn't the kernel ensuring that the "int" value conforms to the ABI 
for int values (by sign-extending) before it gets to a C function in the 
kernel with an int argument?

Comment 4 Petr Malat 2015-07-27 17:09:19 UTC

(In reply to joseph@codesourcery.com from comment #3)
> On Mon, 27 Jul 2015, oss at malat dot biz wrote:
> 
> > In that case the fix is not going to be so simple (I will check the code), but
> > I still think ARGIFY is wrong:
> >  - Assume the futex value is 0x8000 0000.
> >  - Because it's unsigned, ARGIFY expands to
> >       ((long long) (unsigned) (0x8000 0000))
> >    so the content of the register is:
> >       0x0000 0000 8000 0000
> >  - Then the futex syscall is made and the kernel loads the value for 
> >    comparison from the provided memory address into an unsigned variable 
> >    (using load word instruction)
> 
> But the code in the kernel that handles syscall arguments should have 
> sign-extended the register value before it ever reached the C 
> implementation of futex in the kernel with a 32-bit argument type, so it 
> shouldn't matter what the high word of the register passed to the kernel 
> was.  If the syscall wrapper code in the kernel doesn't do so, then 
> hostile userspace code can cause security issues by passing values to the 
> kernel that result in C code having register values for arguments that 
> don't conform to the ABI and so undefined behavior in that C code.  And 
> because of the privilege boundary involved, the kernel can never rely on 
> userspace code sign-extending arguments correctly.
> 
> So why isn't the kernel ensuring that the "int" value conforms to the ABI 
> for int values (by sign-extending) before it gets to a C function in the 
> kernel with an int argument?
In this case it can't cause any security issues as the value is used only to
decide if the caller shall sleep on the futex. As the value is not stored
according to the ABI, it's not equal to any valid value and the process is 
not put to sleep.
I agree returning EINVAL would be better in such a case, but the bug would
have to be fixed in glibc anyway.
After checking implementation of these new syscalls, which were not present
in my old version, I think the proper fix would be to change futex value to signed int type.

Comment 5 jsm-csl@polyomino.org.uk 2015-07-27 17:24:38 UTC

On Mon, 27 Jul 2015, oss at malat dot biz wrote:

> > So why isn't the kernel ensuring that the "int" value conforms to the ABI 
> > for int values (by sign-extending) before it gets to a C function in the 
> > kernel with an int argument?
> In this case it can't cause any security issues as the value is used only to

The MIPS architecture specification says there is UNPREDICTABLE behavior 
if a non-sign-extended value is used with a 32-bit instruction.  The 
compiler may generate such instructions to work with the int value.  
While such architecturally UNPREDICTABLE behavior can't itself cross 
privilege boundaries, that only means it doesn't cause a security issue if 
such an instruction is executed in userspace - and the problem code is 
executing in the kernel.  So you're relying on particular combinations of 
(compiler code generation, behavior of particular processors on such 
inputs) to avoid security issues, which does not seem a good idea.

> After checking implementation of these new syscalls, which were not present
> in my old version, I think the proper fix would be to change futex value to
> signed int type.

Are you saying that in fact the bug was fixed in the kernel, but later 
than 2.6.32?  Because the conclusion in the kernel community was that the 
kernel was the right place to fix this (by always sign-extending affected 
syscall arguments on kernel entry).  And I think the conclusion from more 
recent discussions in the glibc context of a generic API for internal 
futex uses in glibc is that unsigned int is the preferred type for futexes 
within glibc.

Comment 6 Petr Malat 2015-07-28 10:27:40 UTC

(In reply to joseph@codesourcery.com from comment #5)
> On Mon, 27 Jul 2015, oss at malat dot biz wrote:
> > After checking implementation of these new syscalls, which were not present
> > in my old version, I think the proper fix would be to change futex value to
> > signed int type.
> 
> Are you saying that in fact the bug was fixed in the kernel, but later 
> than 2.6.32? 
No, I meant my original proposal of changing ARGIFY is not good, because it
will not work with ftruncate64, lockf64 etc., which are not present in my old version.

> Because the conclusion in the kernel community was that the 
> kernel was the right place to fix this (by always sign-extending affected 
> syscall arguments on kernel entry).
OK, I will try with a never kernel. Nevertheless I still think this is also 
a bug on glibc side as it violates the ABI.

> And I think the conclusion from more recent discussions in the glibc context
> of a generic API for internal futex uses in glibc is that unsigned int is 
> the preferred type for futexes within glibc.
I've got this idea from the prototype specified in futex(2), which says val 
argument is of the type int, but on the other hand using unsigned is aligned
with the kernel, as the value is unsigned there too.

Comment 7 Petr Malat 2015-07-28 16:07:53 UTC

I've tracked down a change, which prevents the problem in the kernel:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/kernel/futex_compat.c?id=90caf58dad77a4021e9dade5d051756cea2a2cd7

When COMPAT_SYSCALL_DEFINE is used a wrapper is created, which casts the 
futex value to unsigned long and then to unsigned int, which forces gcc to
emit sll instruction, which, according to the specification, can be used even
when a register is not properly sign extended.

I will report this to the kernel mailing list, as from current long term releases
2.6.32.67, 3.2.69 and 3.4.108 are affected. From my POV glibc shall be fixed too,
as it violates the ABI.

Comment 8 jsm-csl@polyomino.org.uk 2015-07-28 16:21:08 UTC

On Tue, 28 Jul 2015, oss at malat dot biz wrote:

> I will report this to the kernel mailing list, as from current long term 
> releases 2.6.32.67, 3.2.69 and 3.4.108 are affected.

Thanks.  As a potential security issue ("incomplete fix for 
CVE-2009-0029", I suppose it would be described) this clearly ought to be 
fixed in stable kernels.

> From my POV glibc shall be fixed too, as it violates the ABI.

Well, the kernel discussion in 2007 said the ABI should be for the kernel 
to do the sign extension, and the approach taken with syscall wrappers in 
the kernel in 2009 confirmed that as the intended ABI.  That doesn't rule 
out working around the kernel bug in glibc while some affected kernel 
versions are still supported, but does make clear that it's the kernel 
that's buggy.

Comment 9 Torvald Riegel 2017-01-12 10:23:54 UTC

What is the status of this bug?  Can it be closed?

Comment 10 jsm-csl@polyomino.org.uk 2017-01-12 14:13:59 UTC

I don't know the status in the kernel, but presumably we'd keep it open 
until the buggy kernel versions are no longer supported or we've decided 
not to work around the kernel bug.

Comment 11 Petr Malat 2018-07-09 08:55:52 UTC

Marking this as resolved, as all affected kernels are after their end of life.