This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: system and popen fail in case of big application

On 12/09/2018 16:45, Martin Buchholz wrote:
> On Wed, Sep 12, 2018 at 9:29 AM, Zack Weinberg <> wrote:
>> I don't want to encourage people to use posix_spawn because
>> posix_spawn is a badly designed API.  It's difficult to use correctly.
>> It's *tedious* to use correctly, which means people won't bother.  It
>> can't do everything you might want to do on the child side (witness
>> the discussion of adding an extension to let it do chdir()).  Its
>> behavior in case of errors is underspecified.  And it might be
>> implemented in terms of fork, which means it doesn't guarantee to
>> solve Sergey's problem.
> I seem to be in agreement with Zack's eloquent post.  From java's
> point of view, we need to avoid the momentary doubling of memory
> caused by fork()ing a huge process, and we need to be able to specify
> a working directory for the child.  One strategy is to use posix_spawn
> to start a small helper process which then in turn execs the real
> child after necessary set up (including chdir).

The code you linked [1] contains some outdated information regarding
vfork and posix_spawn:

  - vfork would be supported on Linux not because its a good interface,
    but rather than Linux policy is to not broke userspace in any case.
    And it would be supported only on architectures that already provide
    it, new ports do not automatically export vfork and they are not 
    encouraged to do so. Currently ARC, alpha, s390, microblame, x86, sH,
    powerpc, hexagen, parisc, m68k, arm, and aarch64 do so (and aarch64 
    only if compat mode is enabled, a 64-bit kernel only won't have vfork
    syscall). For remaining arches, glibc will just use a clone plus
    required flags (CLONE_VM | CLONE_VFORK | SIGCHLD).

  - Almost sure the clone issues described is due the fact of old tid
    caching due CLONE_VM and this semantic has been changed in recent
    glibc version.  clone now should not touch any shared internal 
    structures, the only constraint is it still requires a stack argument.

Also, on vfork oath I am not very confident openjdk implementation works
around correctly on the issue that posix_spawn has fixed recently:

  - inherent usage issue with signal handlers (BZ#14750): I don't see
    any signal handling in neither Java_java_lang_ProcessImpl_forkAndExec,
    startChild, vforkChild, or childProcess (maybe it is handle by the
    Java_java_lang_ProcessImpl_forkAndExec caller?).

  - cancellation state being shared with the parent (BZ#10354): at least
    two cancellation entrypoints are called on childProcess (write and
    close), which in a vfork state is undefined behaviour if cancellation
    in pending and act upon it.

  - vfork limitation of possible parent clobber due stack spilling:
    although the call stack does have some care of assuming a shared
    memory environment, it does not have any treatment of possible
    compiler spilling the parent's stack (glibc's posix_spawn uses a
    mmaped stack for this issue).

  - Another issue is the case where the helper process itself
    is killed somehow (there were some discussion about it on BZ#22273).

  - The closeDescriptors is inherent racy in the way it is implemented
    for vfork case: either by iterating over proc/%d/fd or _SC_OPEN_MAX
    you can't guarantee that another thread won't open another file. 
    Worse, taking locks in the vfork helper case can't be quite
    troublesome for the aforementioned case where helper process is
    killed before calling either _exit or _execv*.

You may have not crossed any of this issue due the fact pthread cancellation
is not widely used, some compiler luck with code generation, and with runtime
not issuing misguided signals.  In any case I would say it would be better to
use the posix_spawn strategy by issuing a helper process for Linux as well.

And what I would really like is to see what prevents openjdk and other programs
to use posix_spawn directly. It seems now that you need an extension to change 
the working directory and to close the inherited file descriptor. We are 
discussing the former and I think it would be feasible to raise it to Austin 
group as an extension.

For former I see there is no easy way to provide a similar closefrom
function without either being racy (as *BSD and openjdk implementations)
or to remove some scalability by adding serialization (for open* syscall
to avoid create new FD while closeall is closing them). A better alternative
would to request for some kernel helper, but I am not convinced that
current way of explicit open all file descriptors with O_CLOEXEC is not
the better option. Unfortunately it does not help run external code
(through shared libraries) that still call open in default mode.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]