This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: system and popen fail in case of big application


Thanks for taking such a close look!

On Wed, Sep 12, 2018 at 6:27 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:

> The code you linked [1] contains some outdated information regarding
> vfork and posix_spawn:
>
>   - vfork is supported on Linux not because it is a good interface,
>     but rather because Linux policy is to not break userspace in any case.
>     And it is supported only on architectures that already provide
>     it; new ports do not automatically export vfork and they are not
>     encouraged to do so. Currently ARC, alpha, s390, microblaze, x86, sh,
>     powerpc, hexagon, parisc, m68k, arm, and aarch64 do so (and aarch64
>     only if compat mode is enabled; a 64-bit-only kernel won't have the
>     vfork syscall). For the remaining arches, glibc will just use clone
>     plus the required flags (CLONE_VM | CLONE_VFORK | SIGCHLD).
>
>   - I am almost sure the clone issues described are due to the old tid
>     caching with CLONE_VM, and these semantics have been changed in recent
>     glibc versions.  clone now should not touch any shared internal
>     structures; the only constraint is that it still requires a stack
>     argument.

Can you then update the bug from WONTFIX?
https://www.sourceware.org/bugzilla/show_bug.cgi?id=10311

> Also, on the vfork path I am not very confident that the openjdk
> implementation correctly works around the issues that posix_spawn has
> fixed recently:
>
>   - inherent usage issue with signal handlers (BZ#14750): I don't see
>     any signal handling in any of Java_java_lang_ProcessImpl_forkAndExec,
>     startChild, vforkChild, or childProcess (maybe it is handled by the
>     Java_java_lang_ProcessImpl_forkAndExec caller?).

We've never worried about signals arriving between vfork and exec, and
no one has ever reported a problem.

>   - closeDescriptors is inherently racy in the way it is implemented
>     for the vfork case: whether it iterates over /proc/%d/fd or up to
>     _SC_OPEN_MAX, you can't guarantee that another thread won't open
>     another file.  Worse, taking locks in the vfork helper case can be
>     quite troublesome for the aforementioned case where the helper process
>     is killed before calling either _exit or _execv*.

On vfork, parent and child must have independent file descriptor
tables; else how could the child close all file descriptors without
affecting the parent?

> You may not have hit any of these issues because pthread cancellation is
> not widely used, because of some compiler luck with code generation, and
> because the runtime does not issue misguided signals.  In any case I would
> say it would be better to use the posix_spawn strategy of issuing a helper
> process for Linux as well.
>
> And what I would really like is to see what prevents openjdk and other
> programs from using posix_spawn directly. It seems now that you need an
> extension to change the working directory and to close the inherited file
> descriptors. We are discussing the former and I think it would be feasible
> to raise it to the Austin Group as an extension.

Right.  Java needs to change the process environment and working
directory, and fiddle with file descriptors.
I do worry that an interface like posix_spawn will never be quite
complete enough to allow us to spawn the child we want in one step,
and that there will always remain some reason why we need to spawn a
small helper program first.
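
To make the gap concrete, here is a rough sketch (not what OpenJDK does) of
a posix_spawn call that sets the environment and rearranges file descriptors
through the standard file actions, but has to lean on a helper to change the
working directory.  The function name spawn_with_env_and_cwd, the GREETING
variable, and the use of /bin/sh as the stand-in helper are all assumptions
made for illustration.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: spawn a child with its own environment, stdout
 * redirected into a pipe, and `dir` as working directory.  The chdir step
 * is delegated to /bin/sh, which stands in for a small helper program.
 * Copies the child's output into `out` (outlen must be at least 1) and
 * returns its exit status, or -1 on error. */
int spawn_with_env_and_cwd(const char *dir, char *out, size_t outlen)
{
    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    /* File descriptor fiddling: standard posix_spawn covers this part. */
    posix_spawn_file_actions_t fa;
    posix_spawn_file_actions_init(&fa);
    posix_spawn_file_actions_adddup2(&fa, pipefd[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&fa, pipefd[0]);
    posix_spawn_file_actions_addclose(&fa, pipefd[1]);

    /* The helper does the chdir, then prints "$GREETING:$PWD". */
    char *argv[] = { "sh", "-c",
                     "cd \"$1\" && printf '%s:%s' \"$GREETING\" \"$PWD\"",
                     "sh", (char *)dir, NULL };
    /* Process environment: also covered, via the envp argument. */
    char *envp[] = { "GREETING=hello", NULL };

    pid_t pid;
    int rc = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, envp);
    posix_spawn_file_actions_destroy(&fa);
    close(pipefd[1]);
    if (rc != 0) {
        close(pipefd[0]);
        return -1;
    }

    /* Collect the child's output until EOF or until the buffer is full. */
    size_t used = 0;
    ssize_t n;
    while (used + 1 < outlen &&
           (n = read(pipefd[0], out + used, outlen - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(pipefd[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_with_env_and_cwd("/", buf, sizeof buf) should return 0 and
leave "hello:/" in buf.  The environment and the descriptors are handled in
one step; the working directory is the part that still needs the extra hop.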

> For the latter, I see no easy way to provide a closefrom-like function
> without either being racy (as the *BSD and openjdk implementations are)
> or giving up some scalability by adding serialization (so that open*
> syscalls cannot create new FDs while closeall is closing them). A better
> alternative would be to request some kernel helper, but I am not
> convinced that the current way of explicitly opening all file descriptors
> with O_CLOEXEC is not the better option. Unfortunately it does not help
> when running external code (through shared libraries) that still calls
> open in default mode.

It seems like traditional methods to create file descriptors will
never set the CLOEXEC flag by default, and most programmers will not
be aware of the problem, so those of us implementing subprocess
creation cannot ever rely on all the file descriptors having the flag
set.

After vfork, there should be no race with closefrom (CLONE_FILES is not used).
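
A rough way to check this on Linux, as a sketch only: the names close_from
and parent_fd_survives_child_closefrom are invented here, the 4096 cap is an
arbitrary bound to keep the loop short, and calling anything other than
_exit or exec in a vfork child is outside what POSIX guarantees.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical closefrom-like helper: close every descriptor in
 * [lowfd, max_fd).  Errors (EBADF for unused slots) are ignored. */
static void close_from(int lowfd, int max_fd)
{
    for (int fd = lowfd; fd < max_fd; fd++)
        close(fd);
}

/* Returns 1 if a descriptor opened by the parent is still open after a
 * vfork child closed everything from 3 upward, 0 if it is gone, and
 * -1 on error. */
int parent_fd_survives_child_closefrom(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;

    /* Compute the bound in the parent, before vfork.  The cap is an
     * assumption of this sketch, not a real closefrom semantic. */
    long max_fd = sysconf(_SC_OPEN_MAX);
    if (max_fd < 0 || max_fd > 4096)
        max_fd = 4096;

    pid_t pid = vfork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: shares memory with the parent, but has its own copy of
         * the descriptor table because CLONE_FILES is not set. */
        close_from(3, (int)max_fd);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        close(fd);
        return -1;
    }

    /* Parent: the descriptor should be unaffected by the child's closes. */
    int still_open = fcntl(fd, F_GETFD) != -1;
    close(fd);
    return still_open;
}
```

The parent is suspended until the child calls _exit, so no other code in
the child's view of the table can run concurrently with the close loop.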

> [1] http://hg.openjdk.java.net/jdk/jdk/file/tip/src/java.base/unix/native/libjava/ProcessImpl_md.c

