This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: Fallout from dlopen() blocking SIGSYS


On 5/12/2019 17:03, Florian Weimer wrote:
> I have re-reviewed the referenced patch and posted:
> 
>   <https://sourceware.org/ml/libc-alpha/2019-12/msg00175.html>
>   <https://sourceware.org/ml/libc-alpha/2019-12/msg00176.html>
>   <https://sourceware.org/ml/libc-alpha/2019-12/msg00177.html>
> 
> Lazy binding is buggy and has races, but with the new patches, the
> NODELETE changes should not make matters worse.

Thanks, it seems like that should put out the immediate fire at least.

> But I think we do need something better for seccomp sandboxing in the
> medium term, so I'm happy to have a larger conversation now.

We'd probably need to include the Chromium people in this. Our
implementation is to a large extent based on theirs and, IIRC, that
team also did parts of the initial implementation of seccomp-bpf in the
kernel. The (only) reason this affected Firefox first is that dlopen()
was the first libc call to block signals, and we have a use case where
we need to call it inside the sandbox, and (apparently) Chromium
doesn't (yet).

But if the signal blocking is going to be required for other libc calls
to work, I would assume it is going to risk breaking Chromium too, as
they also use the SECCOMP_RET_TRAP mechanism in several places:

https://chromium.googlesource.com/chromium/src/+/refs/heads/master/services/service_manager/sandbox/linux/bpf_network_policy_linux.cc#38
https://chromium.googlesource.com/chromium/src/+/refs/heads/master/services/service_manager/sandbox/linux/bpf_gpu_policy_linux.cc#75
https://chromium.googlesource.com/chromium/src/+/refs/heads/master/services/service_manager/sandbox/linux/bpf_audio_policy_linux.cc#132

We could try to answer some of your questions, as we're also familiar
with the code, but it would be second-hand information to some extent.

> For (a), we really need a list of system calls which are safe to perform
> in such critical sections.  Can we call your interposed malloc, or will
> that try to open files in /proc in some cases?

It should do an anonymous mmap, AFAIK. Our filter is here:

https://searchfox.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp

We TRAP on anything trying to hit the filesystem:
open, openat, access, faccessat, stat, lstat, fstatat, chmod, link,
symlink, rename, mkdir, rmdir, unlink, readlink, readlinkat,
statx (in the future)

Anything affecting other processes or leaking too much data about the
system:
tkill, prctl (sometimes), getppid, connect, socketpair, socketcall,
sched, uname, fcntl, sched_getparam, sched_getscheduler,
sched_setscheduler (sometimes)

> When we fix bug 25098 and adopt clone3, you might be a bit of a problem
> because of the in-memory flags argument for clone3, and you can't
> intercept the system call due to the blocked signals.

Yes, that looks like it would be a serious problem, for both browsers:

https://searchfox.org/mozilla-central/rev/ea63a0888d406fae720cf24f4727d87569a8cab5/security/sandbox/linux/SandboxFilter.cpp#325

https://chromium.googlesource.com/chromium/src/+/refs/heads/master/sandbox/linux/seccomp-bpf-helpers/syscall_parameters_restrictions.cc#140

It sounds like what Christian Brauner is proposing would solve this, but
we can't rely on it being present, and for backwards compatibility we
likely won't be able to for many more years.

Some implementations in glibc try to use a new syscall, and if it
doesn't exist on the current kernel (ENOSYS), fall back to older
interfaces. If that's possible for clone3 usage, then we'd simply return
ENOSYS on that and force the fallback to regular clone, unless the
kernel is new enough that we can filter. That would decouple glibc's
ability to use the new syscall from the state of the seccomp filtering
implementation in the kernel. Could that work here?
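
The caller-side pattern I mean looks roughly like the sketch below. It is
not glibc's actual code: spawn_fn, spawn_with_fallback and the three
stand-in functions are hypothetical names for illustration, and the
stand-ins only set errno instead of issuing real system calls.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns a non-negative value on success,
   or -1 with errno set, like a libc system call wrapper. */
typedef long (*spawn_fn)(void *args);

/* Try the newer interface first; fall back to the older one only when
   the kernel (or a seccomp filter) reports ENOSYS. Any other error is
   passed through to the caller unchanged. */
static long spawn_with_fallback(spawn_fn try_new, spawn_fn try_old,
                                void *args)
{
    long ret = try_new(args);

    if (ret == -1 && errno == ENOSYS)
        return try_old(args);
    return ret;
}

/* Stand-ins for illustration only. */
static long new_call_unavailable(void *args)
{
    (void)args;
    errno = ENOSYS;   /* what the filter would return for clone3 */
    return -1;
}

static long new_call_denied(void *args)
{
    (void)args;
    errno = EPERM;    /* a real failure: must not trigger the fallback */
    return -1;
}

static long old_call_ok(void *args)
{
    (void)args;
    return 1234;      /* pretend this is the child's thread ID */
}
```

With these stand-ins, spawn_with_fallback(new_call_unavailable,
old_call_ok, NULL) ends up in the old path, while passing new_call_denied
returns -1 with errno still set to EPERM.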

We currently do this for statx, which just gets an ENOSYS instead of a
TRAP - glibc (and Rust's stdlib!) will happily use their fallback paths
until we write a broker implementation for it.
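
On the filter side, the difference between the two outcomes is just the
return action. A simplified sketch, not our real policy: example_filter
is a made-up name, a real filter must also check seccomp_data.arch
first, and installing it (PR_SET_NO_NEW_PRIVS plus the seccomp call) is
left out here.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Illustrative classic-BPF program for seccomp. */
static const struct sock_filter example_filter[] = {
    /* Load the system call number from struct seccomp_data. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),

    /* statx: fail with ENOSYS so callers take their fallback path.
       No signal is involved, so a blocked SIGSYS does not matter. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_statx, 0, 1),
    BPF_STMT(BPF_RET | BPF_K,
             SECCOMP_RET_ERRNO | (ENOSYS & SECCOMP_RET_DATA)),

    /* openat: raise SIGSYS so a handler can forward it to a broker. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_openat, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),

    /* Everything else is allowed in this toy example. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
```

The same ERRNO(ENOSYS) rule could be applied to clone3 if glibc's caller
falls back to clone on ENOSYS.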

-- 
GCP

