This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: clone with CLONE_VM behavior


On 04/13/2016 09:32 AM, Adhemerval Zanella wrote:
> In fact, pthread does not really require it, since it will use CLONE_THREAD
> and the glibc clone implementation won't change THERAD_SELF pid/tid field.
> So the question is if glibc really should change pid/tid value for
> CLONE_VM flag.
> 
> My impression is to allow programs that might use this flag to be able
> to use the raise function and force pthread call to fail (due invalid
> handle), however I am not convinced it is the right path to take
> (and with proposed solution on BZ#15368 I think using cached value is
> not an option).

Florian and I discussed this briefly, and I think we came to some more
dire conclusions. After a cup of coffee and further discussions with
Adhemerval and Szabolcs Nagy I realized things are not so bad. I would
still like to see Florian review this list and agree with me.

I went through several questions when thinking about this issue, and so
I wrote them all down, even though only (e) applies.

(a) Non-problem: vfork.

All glibc vfork implementations set tcb->pthread->pid to -PID during the vfork
operation. Calls to getpid or getppid in a signal handler at this point will
return valid results from having called getpid or getppid from the kernel
(because they see a negative value in the cache), and will *not* refresh the 
tcb->pthread->tid cache (tid is non-zero) because doing so would result in
the cache having a stale value in the child and a signal handler could observe
that.

Corollary: All async-signal-safe functions that operate on the tid/pid cache
           directly must understand -PID/-TID.

(b) Non-problem: failed exec after vfork changes parent errno value.

A failed exec after vfork sets errno, and that sets the parent's errno, and
thus vfork technically sets errno but is not described as setting errno.
POSIX says that since vfork doesn't describe setting errno that the value
is unspecified, and so this is OK from a standards perspective that calling
vfork may change the parent's errno.

(c) Non-problem: pthread_join from another thread to a vfork-ing thread.

A vfork-ing thread will have it's child write -PID into it's own PID cache,
but the TID remains intact. Therefore another thread may still call
pthread_join on the thread that is vfork-ing and wait for it to exit.

(d) Problem: raise is not async-signal safe.

Still true. See https://sourceware.org/bugzilla/show_bug.cgi?id=15368
Though with (e) fixed there is one less scenario to worry about.

(e) Problem: pthread_join from another thread to a posix_spawn'ing thread.

The test nptl/tst-exec1.c calls pthread_join on a thread using posix_spawn,
and this results in a testsuite failure given the new posix_spawn
implementation. The thread spawning the child using posix_spawn is able
to accelerate the spawn with clone, but that results in tid/pid caches
set to -1 and thus the pthread_join from the other thread fails, and the
test fails.

The real problem is that __clone sets tid/pid to -1 in child using
parent memory. 

All implmentations of clone.S in glibc write -1 to tid/pid in the
tcb->pthread->pid/tid fields. This modifies the tid/pid of the parent.
It would seem that the purpose of this is to reset the tid/pid cache
such that a signal handler calling gepid/getpid in the child will
call the syscall to get the new correct results. The consequence is
that any external parallel thread calling pthread_join will see the
value of -1 in tid and be unable to wait on the thread calling vfork
(there is no way to get the right tid value anymore). You should be
able to join on the thread calling posix_spawn. I expect it was an
oversight that pthread_join fails to work in this case e.g. vfork
from a thread was not considered.

There are two theoretical solutions, but only one is practical:

(e.1) Remove setting tid/pid to -1 in __clone and provide users with a clone
      that does *nothing* to the parent.

      This would fix a complaint from ASAN which has them using a direct clone
      syscall because our clone wrapper breaks the parent's tid/pid with CLONE_VM
      but not CLONE_THREAD (they use this for a detached pseudo-monitor thread).

      https://github.com/llvm-mirror/compiler-rt/blob/master/lib/sanitizer_common/sanitizer_linux.cc#L881

      Verify that posix_spawn{p} does not modify tid/pid. The child created by
      posix_spawn{p} will have default signal dispositions so no signal handler
      exists that can observe the wrong tid/pid.

      Change the tst-pid* tests in glibc to verify tid/pid does not change.

(e.2) Create a fake tcb->pthread.

      In the parent create a tcbhead_t, and struct pthread, and set them up with
      enough data they are valid, and reference the parent's TLS data or a copy
      of the parent's TLS data.

      Call clone with with CLONE_SETTLS pointing at the tcbhead_t copy, with
      CLONE_PARENT_SETTID pointing at the pid, and CLONE_CHILD_SETTID pointing
      at the tid. This will mean that you enter the child with unique data
      for struct pthread, tcbhead_t, and unique and correct values for pid/tid
      which are the same.

      You can immediately take a signal and the returned values from getpid
      and getppid in a signal handler would be valid.

      It is unclear exactly how hard it will be to get a valid working tcbhead_t
      or struct pthread setup and initialized for everything to operate
      correctly.

In practice (1) is the most reasonable.

The only problem is if we have an expectation that user code can call clone
or __clone2 and have getpid and getppid work correctly in the child? I think
the answer to that is: No we don't have any such expectation because clone
is designed for low-level runtime implementation.

I support removing the -1 tid/pid reset in clone/__clone2 (all clone.S assembly
files for all machines).

Other comments?

-- 
Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]