Tested with glibc-2.15, the relevant code verified to be the same in current git. CHOST: x86_64-unknown-linux-gnu Kernel: 3.5.0 (vanilla) Whenever fchmodat() is called with AT_SYMLINK_NOFOLLOW on Linux, it unconditionally fails with ENOTSUP. The following snippet from sysdeps/unix/sysv/linux/fchmodat.c is responsible for that: #ifndef __NR_lchmod /* Linux so far has no lchmod syscall. */ if (flag & AT_SYMLINK_NOFOLLOW) { __set_errno (ENOTSUP); return -1; } #endif POSIX.2008 however specifies[1] that: [EOPNOTSUPP] The AT_SYMLINK_NOFOLLOW bit is set in the flag argument, path names a symbolic link, and the system does not support changing ^^^^^^^^^^^^^^^^^^^^^^^^^^ the mode of a symbolic link. In other words, I believe that fchmodat() should only return ENOTSUP if AT_SYMLINK_NOFOLLOW is requested on a symbolic link. On regular files, it should just change the mode. [1]:http://pubs.opengroup.org/onlinepubs/9699919799/functions/chmod.html
glibc can't check if it's a symlink itself without a race condition. What does the kernel do if AT_SYMLINK_NOFOLLOW is passed to the fchmodat syscall (both on symlinks, and on non-symlinks? (Any code expecting an lchmod syscall can clearly be removed from glibc in any case - chmod support for symlinks in future would have to be via a *at syscall to accord with current Linux kernel practice regarding adding syscalls - so #ifndef __NR_lchmod should be unconditional, #ifdef __NR_lchmod removed.) If the syscall does the right thing, I suppose the check for AT_SYMLINK_NOFOLLOW could be moved after the code using the fchmodat syscall, so that the issue only applies to old kernels (for which it is unavoidable because of the race).
Ah, I see the problem is that the fchmodat syscall doesn't take a flags argument at all. In that case I don't think this is fixable (given the race) without a new syscall, and then only fixable for kernels using the new syscall. (I.e., unless anyone sees a way to avoid the race this should be SUSPENDED until the kernel is fixed.)
I'm not familiar with the semantics of O_PATH on Linux, but would it be possible to work around thus bug by opening the pathname with O_PATH and then using fstat on it to determine if it's a link? In other words, does O_PATH permanently associate the file descriptor with the symbolic link itself, or just with the current name the symbolic link has in the directory it's contained it?
I don't know if O_PATH will do the trick. If it does, what do kernels before 2.6.39 (when O_PATH was introduced) do when O_PATH is used? (If they just ignore it and open the file as if O_PATH were not present, then we'd need to make sure not to try using O_PATH on older kernels, to avoid any side-effects from opening device files, for example.)
I seem to have a solution. In pseudo-code: if (flags & AT_SYMLINK_NOFOLLOW) { tmpfd = openat(fd, pathname, O_PATH|O_NOFOLLOW); if (tmpfd < 0) { errno = ENOTSUPP; return -1; } int ret = fchmod(tmpfd, mode); close(tmpfd); return ret; } Obviously this needs some fixes for glibc, like avoiding cancellation points.
Joseph, you're right; actually, I was unfortunately testing on a pre-O_PATH kernel and didn't notice this, so I can't say for sure yet whether the code does what's desired on a new kernel with O_PATH. I can confirm your fear about older kernels though; the kernel happily opens the underlying file, which could be bad if it's a device file, etc. In order to test for working O_PATH, one could first open "/" with O_PATH and attempt to call getdents on the resulting fd. Assuming the open succeeded, that would work if and only if O_PATH support is missing. If the open failed due to missing permissions, that would also mean O_PATH support is missing. If it failed for another reason, we probably have a resource exhaustion issue and can just pass the error up.
There are a few places with version number checks on the kernel version at runtime, I'd probably just use one of those to see if O_PATH is available (in the absence of a --enable-kernel=2.6.39 or later option that would define __ASSUME_O_PATH).
On modern Linux (just tested on 3.10.4), the following solution works with no race condition: 1. Check fstatat, fail if symlink. 2. Use openat with O_PATH|O_NOFOLLOW. 3. Use fstat on the resulting fd, fail if symlink. 4. Use fchmod on the resulting fd. Step 1 is not strictly necessary, but it's nice to definitively fail on symlinks that aren't racing, whereas the subsequent tests could fail spuriously on hitting the open file limit, etc. There are however failure cases that need to be considered: Step 2 can act as a plain O_RDONLY open if O_PATH is not available on the kernel in use. I believe this can be tested for by attempting to open /dev/null with O_PATH and trying to write to the resulting file descriptor. If it succeeds, the kernel lacks O_PATH. Note that, aside from the nasty case of opening a device file and having it do something when you open it, plain O_RDONLY would do just fine and would make this workaround work even on ancient kernels, BUT it will spuriously fail if the file is mode 000 and the caller is not root. Also, if O_PATH is not supported, O_NOFOLLOW will cause ELOOP if the target is a symlink, instead of opening a fd to the symlink itself, which would need to be treated as an error condition. Also note that, since O_NOFOLLOW Is being used, there is no danger of following a symlink to a device file if O_PATH is not supported. The bad behavior with devices could only happen if the device name were passed directly. Step 2 can also simply fail for reasons like having too many open files. Step 3 can also fail. On a large number of kernel versions (2.6 up through 3.5 or maybe a little later), file descriptors obtained by O_PATH are not valid for passing to fstat, fchmod, etc. I do not believe there is any workaround for this case, unless you want to fall back to just using O_RDONLY. Since current Linux admits a workaround, I believe a solution based on this approach should be applied to glibc. However, some more discussion is needed to determine how to handle old kernels and errors.
Actually, we still have a problem. I hadn't checked cases that should succeed, and in fact fchmod STILL fails on O_PATH files. So I still don't know any way to avoid race conditions changing the permissions on a file unless you have permission to open it...
I have solved the issue: while various syscalls such as fchmod, and even fstat on all but very-new kernels, fail on O_PATH file descriptors, the corresponding non-f-prefixed calls work on /proc/self/fd/%d. I will be using this approach in musl libc to fix the fchmodat issue in a completely race-free way that works on all O_PATH kernels, and works modulo the need for read permissions to open the file even on older kernels. I will also be using it to fix fchmod, fstat, etc. on O_PATH file descriptors, so that O_PATH becomes a conforming implementation of O_SEARCH and O_EXEC. I think glibc could do the same. (Note that glibc may want to include the test I mentioned in an earlier post on this issue to make sure O_PATH is supported, in case unintended opening of files with O_RDONLY is deemed dangerous, e.g. for devices.)
(In reply to joseph@codesourcery.com from comment #4) > I don't know if O_PATH will do the trick. If it does, what do kernels > before 2.6.39 (when O_PATH was introduced) do when O_PATH is used? (If > they just ignore it and open the file as if O_PATH were not present, then > we'd need to make sure not to try using O_PATH on older kernels, to avoid > any side-effects from opening device files, for example.) O_PATH will do the trick for kernels 3.6 and newer; here's a discussion where qemu implemented a solution similar to Rich's pseudo-solution: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01600.html https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01812.html In the qemu case, the code is enforcing that no symlinks are dereferenced, so it is not quite the same as the general solution for making AT_SYMLINK_NOFOLLOW be implemented correctly, but the point remains that the fstat() test for symlinks followed by chmod(/proc/self/fd) is viable and avoids TOCTTOU races. Since the kernel has not yet accepted Greg's patch for a new fchmodat2() syscall, we could at least enhance the user-space wrapper to do the right thing.
(In reply to Eric Blake from comment #11) > (In reply to joseph@codesourcery.com from comment #4) > > I don't know if O_PATH will do the trick. If it does, what do kernels > > before 2.6.39 (when O_PATH was introduced) do when O_PATH is used? (If > > they just ignore it and open the file as if O_PATH were not present, then > > we'd need to make sure not to try using O_PATH on older kernels, to avoid > > any side-effects from opening device files, for example.) > > O_PATH will do the trick for kernels 3.6 and newer; here's a discussion > where qemu implemented a solution similar to Rich's pseudo-solution: > https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01600.html > https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01812.html I think the Linux 3.6 refers to fstat support for O_PATH descriptors. fstatat with AT_EMPTY_PATH should work for them since kernel 2.6.39: commit 65cfc6722361570bfe255698d9cd4dccaf47570d Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sun Mar 13 15:56:26 2011 -0400 readlinkat(), fchownat() and fstatat() with empty relative pathnames For readlinkat() we simply allow empty pathname; it will fail unless we have dfd equal to O_PATH-opened symlink, so we are outside of POSIX scope here. For fchownat() and fstatat() we allow AT_EMPTY_PATH; let the caller explicitly ask for such behaviour. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Patches posted: https://sourceware.org/ml/libc-alpha/2020-01/msg00510.html
The master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6b89c385d8bd0700b25bac2c2d0bebe68d5cc05d commit 6b89c385d8bd0700b25bac2c2d0bebe68d5cc05d Author: Florian Weimer <fweimer@redhat.com> Date: Wed Jan 22 18:56:04 2020 +0100 io: Implement lchmod using fchmodat [BZ #14578]
The master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=752dd17443e55a4535cb9e6baa4e550ede383540 commit 752dd17443e55a4535cb9e6baa4e550ede383540 Author: Florian Weimer <fweimer@redhat.com> Date: Wed Jan 22 19:01:20 2020 +0100 Linux: Emulate fchmodat with AT_SYMLINK_NOFOLLOW using O_PATH [BZ #14578] /proc/self/fd files are special and chmod on O_PATH descriptors in that directory operates on the symbolic link itself (like lchmod).
Fixed in glibc 2.32. I will backport this into the release branches in a week or so.
This appears to trigger an XFS kernel bug: https://www.sourceware.org/ml/libc-alpha/2020-02/msg00467.html
The master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c commit a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c Author: Florian Weimer <fweimer@redhat.com> Date: Tue Feb 18 17:52:27 2020 +0100 Linux: Work around kernel bugs in chmod on /proc/self/fd paths [BZ #14578] It appears that the ability to change symbolic link modes through such paths is unintended. On several file systems, the operation fails with EOPNOTSUPP, even though the symbolic link permissions are updated. The expected behavior is a failure to update the permissions, without file system changes. Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
I've posted a proposed kernel patch that will make it so emulation isn't needed: https://lkml.org/lkml/2020/9/10/671