Bug 14362

Summary: Incorrect ioctl declaration
Product: glibc
Component: libc
Version: 2.15
Status: NEW
Severity: normal
Priority: P2
Target Milestone: ---
Reporter: Linus Torvalds <torvalds>
Assignee: Not yet assigned to anyone <unassigned>
CC: carlos, drepper.fsp, emaste, kosaki.motohiro, mmaruska, ondra
Flags: fweimer: security-
Attachments: WIP patch to convert ioctl request type to `int'.

Description Linus Torvalds 2012-07-12 19:57:42 UTC
Both SuS and the man-pages agree: the correct declaration for ioctl() is

   extern int ioctl(int fd, int request, ...);

but glibc headers incorrectly have "unsigned long" for the request. 

This means that this simple C program results in an incorrect type conflict error:

   #include <sys/ioctl.h>
   extern int ioctl(int fd, int request, ...);

even though the declaration clearly matches documentation.

Perhaps more importantly, it's misleading for anybody who actually reads the header files. That's rare, but still. We had this discussion on the subsurface mailing list, because OS X has the same buggy declaration, and, what is worse, the OS X kernel is apparently buggy too, because doing

   int action = (level ? TIOCSBRK : TIOCCBRK);
   ...
   ioctl(fd, action, ...);

will sign-extend the 'int' into 'unsigned long', and the OS X FreeBSD-based kernel apparently actually looks at all 64 bits of an 'unsigned long' when it does things, leading to actual bugs (ie ioctl returning ENOTTY because the sign-extended 64-bit value is not recognized).

So having "unsigned long" in there can result in actual bugs. You don't see this on Linux, because 64-bit Linux will correctly truncate the request value to 32 bits (and then internally uses 'unsigned' to make sure no sign extension happens, but that's a separate issue).

So please fix the ioctl() declaration. "unsigned long" is misleading and actively incorrect and can cause bugs on non-Linux operating systems.

In case anybody uses glibc on OS X, I would suggest doing

   int ioctl(int fd, int __request, ...)
   {
       unsigned long request = (unsigned int) __request;
       ...

to avoid the OS X bug.
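A fuller sketch of such a wrapper, assuming Linux and going through syscall(2) so it compiles alongside the existing `<sys/ioctl.h>` prototype; the name `my_ioctl` and the use of raw `syscall()` are illustrative assumptions, not glibc's actual implementation:

```c
#include <stdarg.h>
#include <sys/syscall.h>
#include <unistd.h>

int my_ioctl(int fd, int __request, ...)
{
    /* Cast via unsigned int first: the request zero-extends, so a
     * bit-31 command never grows sign bits in the upper 32 bits. */
    unsigned long request = (unsigned int)__request;
    va_list ap;
    void *arg;

    va_start(ap, __request);
    arg = va_arg(ap, void *); /* the common single-pointer argument */
    va_end(ap);

    return (int)syscall(SYS_ioctl, fd, request, arg);
}
```

The key line is the double cast: converting `int` to `unsigned int` before widening to `unsigned long` guarantees the high 32 bits stay zero.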
Comment 1 Carlos O'Donell 2012-07-24 03:02:29 UTC
Created attachment 6551 [details]
WIP patch to convert ioctl request type to `int'.
Comment 2 Carlos O'Donell 2012-07-24 03:07:30 UTC
The history in the glibc ChangeLogs indicates that it was changed from `int' to `unsigned long int' around 1993 in order to align with BSD 4.4.

I consider this a serious issue for compliance with POSIX which has ioctl defined as `int ioctl(int fildes, int request, ... /* arg */);'.

I'll float around a WIP patch for discussion around the pros/cons of changing this established interface.
Comment 3 Carlos O'Donell 2012-07-24 03:58:52 UTC
Notes:
http://sourceware.org/ml/libc-alpha/2012-07/msg00446.html
Comment 4 KOSAKI Motohiro 2012-07-24 15:26:44 UTC
Link to man-pages bugzilla.

   https://bugzilla.kernel.org/show_bug.cgi?id=42705
Comment 5 Linus Torvalds 2012-07-24 17:41:28 UTC
(In reply to comment #4)
> Link to man-pages bugzilla.
> 
>    https://bugzilla.kernel.org/show_bug.cgi?id=42705

I think that one brings up interesting points, but is not actually correct or really relevant. But it does show some of the confusion in this area, yes, which makes it interesting.

The Linux kernel internally uses 'unsigned int' exactly because sign extension is such a ^&%*@ confusing pain in the ass when you have random big integers (and yes, they really historically are big, with high bits used for various things).

At the same time, in Linux, the kernel doesn't even really care: the sign extension is irrelevant as long as the comparison is always done in 32 bits. So the important part is not the "unsigned int", but that it is *some* kind of "int". Not a "long".

But I point out the internal use of "unsigned int", because I do think it's nice: it avoids the worry about sign extension, and makes things that have the high bit set not have any special behavior. So "unsigned int" is a better type, I think.

But when that bugzilla entry says that

        int x = IOCTL;
        if (x == IOCTL)
                /* ... */

  Will generate a "comparison always false due to limited range of data type"
  warning from gcc, and presumably will indeed compare false, which is clearly
  not the expected result.

it is actually incorrect. Because the IOCTL value is (or rather - can be) an 'unsigned int', when you do the comparison, the C typecasting rules will do the comparison in 'unsigned int', and everything will be correct. The manpages bugzilla entry is correct about this being *confusing*, though, and might generate warnings about comparing signed and unsigned values.

Now, if you do things in "long", you really can get screwed. Then you might sign-extend one and compare it against a non-sign-extended one, and then you're really screwed. That is, in fact, exactly what happens on OS X right now due to the OS X *incorrect* use of "unsigned long".

So I think the manpages bugzilla is wrong, but does bring up a valid source of confusion. I absolutely agree that we would have been *much* better off if POSIX had just said that the argument is "unsigned int".

But that's not what they did. And I think we should conform to POSIX, and admit that it's ugly, but that it's absolutely better than "unsigned long", which is just broken and causes actual bugs.

(Side note: the bugzilla would be correct if some IOCTL value is declared as a single constant, eg

   #define IOCTL 2147483648

because such a constant would be implicitly converted to "long", and then the comparison with an "int" (which would be sign-extended) would never be true.

However, that's not how the ioctl's with the high bits set are defined, they are done by shifting values that are of type "unsigned int" around, or done as hex constants, both of which will stay as "unsigned int"). And mixing "unsigned int" and "int" values can be *confusing*, but it won't be wrong - nothing will be extended to "long", but "int" will be silently converted to "unsigned int".

Yeah, C type conversion can be subtle.
Comment 6 Andreas Schwab 2013-01-15 11:32:46 UTC
Current FreeBSD still uses unsigned long.  Note that the POSIX ioctl is for STREAMS only and marked obsolete.  I think this should be considered a documentation bug.
Comment 7 Linus Torvalds 2013-01-15 17:03:58 UTC
(In reply to comment #6)
> Current FreeBSD still uses unsigned long.  Note that the POSIX ioctl is for
> STREAMS only and marked obsolete.  I think this should be considered a
> documentation bug.

Why would you consider it a documentation bug, when it clearly results in real bugs, and the FreeBSD/Darwin type is *wrong*?

The fact is, casting it to unsigned int is the correct thing to do. There are actual bug reports about BSD itself due to this exact same problem. 

If you google for it, you'll see this discussed on the gnulib lists, on the perl mailing lists, etc etc. The FreeBSD/Darwin ioctl() declaration is *wrong*.

And it's not wrong because it actually *uses* 64 bits. No, it is "unsigned long" just because of historical accidents, back when "long" was 32-bit. 

Even *Apple* knows it is a bug, and actually uses "unsigned int" in their own code. See for example

  http://valgrind-variant.googlecode.com/svn-history/r66/trunk/valgrind/coregrind/m_syswrap/syswrap-darwin.c

which has a big Apple copyright, and look for ioctl. Look at the code that actually reads the arguments (PRE_REG_READ3): the "request" argument is read as an "unsigned int".

As noted, this does not actually cause problems on Linux, because unlike FreeBSD, Linux knows what the f*ck it is doing, and just ignores the upper bits exactly because of possible sign confusion. So I don't care as a Linux person. But I think glibc should try to actually help Darwin. Because there are ioctl commands on Darwin that have bit 31 set, and they cause problems.

Go and look at the BSD sys/sys/ioccom.h file itself if you don't believe me. The constants are clearly 32-bit ones, with IOC_IN being bit #31. (OK, so I don't know where the actual Darwin SDK is, and I wouldn't want to download it anyway, but I can find the FreeBSD sources, and there is even a comment there talking about how the *high* three bits specify the direction. And they are high bits only in 32-bit.) See for example

    http://fxr.watson.org/fxr/source/sys/ioccom.h

The fact is, the FreeBSD/Darwin *declaration* is wrong. And no, it's not just wrong in the documentation sense. It's wrong in the "real code breaks" sense.
Comment 8 OndrejBilka 2013-05-15 10:09:47 UTC
What is status of patch?
Comment 9 Carlos O'Donell 2013-05-15 14:28:49 UTC
(In reply to comment #8)
> What is status of patch?

Someone needs to work through all of the feedback that we received on list for the MIPS, Power and TILE implementations. Link provided in comment 3.

Off the top of my head:
(a) Binary compatibility in the cases where ioctl has a C wrapper in glibc.
(b) MIPS, Power, and TILE interfaces when older kernels are used need to be verified with help from those maintainers.
(c) Validate that the changes, including casts in INLINE_SYSCALL don't break what strace shows for the arguments to ioctl (even if the claim is that strace looks at only the low 16-bits).
(d) At a minimum, look over the ruby code and verify their ioctl glue doesn't break with this change.