|
Sources Bugzilla – Full Text Bug Listing |
| Summary: | Bug (+fix) in readdir() due to getdents() | ||
|---|---|---|---|
| Product: | glibc | Reporter: | Dan Tsafrir <dan.tsafrir> |
| Component: | libc | Assignee: | GOTO Masanori <gotom> |
| Status: | RESOLVED INVALID | ||
| Severity: | critical | CC: | carlos, glibc-bugs |
| Priority: | P1 | Keywords: | testsuite |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Host: | i686-libranet-gnu-linux | Target: | |
| Build: | Last reconfirmed: | ||
| Attachments: |
Patch taken from bug comments.
Preliminary testcase adapted from bug comments.
Standalone preliminary testcase adapted from bug comments. |
||
|
Description
Dan Tsafrir
2004-02-09 15:07:14 UTC
This patch applies cleanly to sysdeps/unix/sysv/linux/getdents.c in the trunk. I have created a first pass at a testcase, but it requires an autofs filesystem and, according to the bug comments, is non-deterministic, so I have not actually reproduced the problem yet.

Created attachment 876
Patch taken from bug comments.
Created attachment 877
Preliminary testcase adapted from bug comments.
Created attachment 878
Standalone preliminary testcase adapted from bug comments.
Here is a standalone testcase as well if someone happens to have a system with
autofs to reproduce this and to refine the testcase.

AFAICS, the current kernel does put_user(file->f_pos, &lastdirent->d_off), thus d_off should have a well-defined meaning nowadays.

Petr, your reason for closing this bug is invalid:
(1) If you read the original bug report you'll notice that it's not enough for d_off to have a "well defined meaning". What you really need is for that meaning to be the same in both the kernel and in libc. This bug report specifically points out the fact that the meaning is different.
(2) Also, if you search for put_user(file->f_pos,&lastdirent->d_off) in the kernel code you'd see that it existed many, many years before this bug was submitted (the line is there since Linux-1.3.0, no less). Thus, having this line in the kernel doesn't prove or disprove anything.

I do not believe there is a glibc bug here.
* The bug report quotes a manpage for the readdir system call as saying that d_off refers to the current dirent rather than the next dirent. That is correct, but irrelevant. The readdir system call is a backwards compatibility one using a very old version of the dirent structure. With the getdents and getdents64 syscalls, different structures are used, and in those structures the semantics are that d_off refers to the next dirent, as you will see if you refer to the kernel sources or the getdents manpage. The getdents64 syscall is the relevant one here.
* The bug report claims there is an assumption that d_off contains a byte offset. Actually, there is no such assumption. The only requirement is that the kernel does not use -1 as an offset. Otherwise, the offsets are used as opaque values, stored, copied, converted to the userspace type (with the result only compared for equality, not used for arithmetic) and used as an argument for __lseek64. Using it as an argument for __lseek64 (with SEEK_SET) is not using the value as a byte offset; it's up to each filesystem in the kernel to implement seeking on directories correctly so that it works with the opaque d_off values provided by the kernel. Some filesystems may use byte offsets and some use other values involving a more complicated implementation of that seek operation in the kernel.
* Various different filesystems have been used for automounting (e.g. autofs / autofs4) and this report is not specific about exactly what filesystem, with what kernel version, is used in the problem situations. (Kernels before 2.6.0 are in any case not in practice relevant to current glibc, although some support code for them has yet to be removed.) But if a filesystem did not handle seeking with values provided in d_off, that would be a bug in the filesystem.
* It is certainly not correct to seek with byte offsets computed by glibc, since offsets for seeking in a directory are the same sort of opaque cookie provided in d_off and only those values may be passed to the kernel as a position to which to seek. So the patch here cannot be correct.
* Finally, there is a concern about the conversion of d_off from 64-bit to 32-bit. It's true that some applications may not use this value (used for telldir/seekdir), but the same applies to any other value returned by the kernel (inode numbers in particular); glibc cannot know what values the application wants and so must return EOVERFLOW if any value from the kernel cannot be represented in the userspace type. The way for applications to avoid this is to be built with _FILE_OFFSET_BITS=64 so that the 64-bit interfaces are used, and this is the normal practice nowadays for most applications given the desire to support files over 2GB on 32-bit systems.
Regarding the other issues described with the source code, note that nbytes is not provided by the user of glibc since getdents is an internal-only interface. There may well be issues there, but on any given system they would either be latent or appear every time the relevant code in getdents is executed, so it seems unlikely they affect any current system.
In any case, please always avoid mixing different issues in a single bug; file separate issues for these separate problems if you think they are still present.

Joseph,
It seems you've misread the bug report as claiming that making assumptions regarding the offset is somehow correct/justified, whereas, conversely, the bug report just points to *glibc code* that makes the erroneous assumption. Hence, the report does not need to specify the filesystem on which the bug was triggered, because it points to *glibc code* that suffers from what you yourself acknowledge is a serious problem.

Dan,
Please read my comments in more detail. Because offsets in directories are
opaque cookies (if you look at the autofs sources in the kernel, you'll see
that they are hash values for autofs, for example), and because the __lseek64
call's offset ends up being passed to the filesystem's llseek method which
expects just such an opaque cookie, it can never be correct to pass a value
calculated by adding up record lengths to __lseek64 on a file descriptor for a
directory - and so those parts of your patch (adjusting how last_offset is
computed) cannot be correct since last_offset is (only) used to pass to
__lseek64 for such a file descriptor. Maybe last_offset should be renamed to
make clear that in normal terms it is not an offset; it's an opaque value on
which no arithmetic is valid.
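
To illustrate the point about opaque cookies, here is a minimal sketch (not glibc code) that calls the getdents64 syscall directly, copies the d_off of the first record unchanged, seeks to it with SEEK_SET, and checks that reading resumes at the second record. The names kernel_dirent64 and resume_after_first are made up for this example; the record layout is the one described in getdents(2). It assumes Linux and a directory that is not modified while the function runs.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout returned by getdents64 (illustrative name, see getdents(2)). */
struct kernel_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;     /* opaque cookie identifying the NEXT entry */
    unsigned short d_reclen;  /* size of this record in the buffer */
    unsigned char  d_type;
    char           d_name[];
};

/* Returns 1 if seeking to the first record's d_off resumes at the second
   record, 0 if it does not, and -1 on error or if the first batch holds
   fewer than two records. */
int resume_after_first(const char *path)
{
    _Alignas(8) char buf[4096];
    char expected[256];
    struct kernel_dirent64 *first, *second;
    int64_t cookie = -1;      /* -1 as "no cookie yet", as in the quoted glibc code */
    int result = -1;
    long n;

    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    n = syscall(SYS_getdents64, fd, buf, sizeof buf);
    if (n <= 0)
        goto out;

    first = (struct kernel_dirent64 *) buf;
    if (first->d_reclen >= n)
        goto out;             /* only one record in this batch */
    second = (struct kernel_dirent64 *) (buf + first->d_reclen);
    snprintf(expected, sizeof expected, "%s", second->d_name);

    cookie = first->d_off;    /* copied as-is; no arithmetic is done on it */
    assert(cookie != -1);

    /* d_reclen was used only to walk the buffer; the seek position is the cookie. */
    if (lseek(fd, (off_t) cookie, SEEK_SET) == (off_t) -1)
        goto out;
    n = syscall(SYS_getdents64, fd, buf, sizeof buf);
    if (n <= 0)
        goto out;

    result = strcmp(((struct kernel_dirent64 *) buf)->d_name, expected) == 0;
out:
    close(fd);
    return result;
}
```

Calling resume_after_first("/") should return 1 on a filesystem whose llseek method accepts its own d_off values. Note that d_reclen is used only to step through the buffer, never to compute a position to seek to.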
I do not see any glibc code (without your patch) that does any arithmetic on
the offset values; it only copies them. If you see any code that assumes that
d_off values are meaningful in arithmetic (as opposed to being opaque values
that can be used later to seek to a particular point in a directory), what
specific lines of code (in current git master) are they?
For reference, "grep last_offset getdents.c" shows the following for me, none
of which treat last_offset as anything other than an opaque value.
off64_t last_offset = -1;
if (last_offset != -1)
__lseek64 (fd, last_offset, SEEK_SET);
last_offset = d_off;
assert (last_offset != -1);
__lseek64 (fd, last_offset, SEEK_SET);
last_offset = kdp->d_off;
Similarly, d_off is purely treated as opaque.
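
The same "copy, never compute" rule applies to applications using the public interface. A small sketch, assuming a directory that is not modified concurrently: the value from telldir is stored and handed back to seekdir unchanged. The function name seekdir_roundtrip is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if seekdir() to a saved telldir() value resumes at the same
   entry, 0 if it does not, and -1 on error or if the directory has fewer
   than two entries. */
int seekdir_roundtrip(const char *path)
{
    char expected[256];
    struct dirent *ent;
    long cookie;
    int result = -1;
    DIR *dir;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    if (readdir(dir) == NULL)          /* consume the first entry */
        goto out;
    cookie = telldir(dir);             /* opaque position of the second entry */
    if (cookie == -1)
        goto out;

    ent = readdir(dir);                /* remember the second entry's name */
    if (ent == NULL)
        goto out;
    snprintf(expected, sizeof expected, "%s", ent->d_name);

    seekdir(dir, cookie);              /* pass the saved value back unchanged */
    ent = readdir(dir);
    if (ent == NULL)
        goto out;
    result = strcmp(ent->d_name, expected) == 0;
out:
    closedir(dir);
    return result;
}
```

For example, seekdir_roundtrip("/") is expected to return 1; the cookie is only stored and compared against the error value, never incremented or added to.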
The second and third issues are likely bugs (albeit latent bugs), but
*different* bugs, and one issue in the tracker should only ever be used for a
single issue with the library. Thus, those second and third issues (relating
to nbytes, not d_off) should be filed as two separate tracker issues.
|