This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



pthreads & epoll, take 2


The problem that epoll_wait() is not a pthread cancellation point still
remains unsolved. An ACK/NAK on this issue would be very useful.

Since my last email to this list, I've rummaged through the Linux kernel
source code, inquired on the LKML, and have returned to this list. Judging
both from empirical testing and from reading the source code, glibc seems
to be the place where such functionality should be implemented, if at all.
The kernel seems to be doing everything correctly in this case.

In other words, after doing my homework, glibc seems to be the _correct_
place to be asking this question. If you need appropriate patches, I can
work on providing those.

Attached is a snippet of the LKML thread.

-Vadim Lobanov

---------- Forwarded message ----------
Date: Fri, 23 Sep 2005 12:09:00 -0700 (PDT)
From: Davide Libenzi <>
To: Vadim Lobanov <>
Cc: Linux Kernel Mailing List <>
Subject: Re: [RFC] epoll

On Fri, 23 Sep 2005, Vadim Lobanov wrote:

> On Fri, 23 Sep 2005, Davide Libenzi wrote:
>
>>>>> 3. Wakeup
>>>>> As determined by testing with userland code, the sys_tgkill() and
>>>>> sys_tkill() functions currently will NOT wake up a sleeping
>>>>> epoll_wait(). Effectively, this means that epoll_wait() is NOT a pthread
>>>>> cancellation point. There are two potential issues with this:
>>>>> - epoll_wait() meets the unofficial(?) definition of a "system call that
>>>>> may block".
>>>>> - epoll_wait() behaves differently from poll() and friends.
>>>>
>>>> The epoll_wait() wait loop is the standard one that even poll() uses (prep
>>>> wait, make interruptible, test signals, sched timeo). So if poll() is woken
>>>> up, so should epoll_wait(). A minimal code snippet that proves poll()
>>>> being woken up, and epoll_wait() not, would help.
>>>>
>>>
>>> Certainly. :-) See end of email for sample program.
>>
>> I'm afraid you need to bug the glibc guys, since I think they wrap
>> sys_poll(). Try the test program below, when defining _X_, that makes it
>> call sys_poll() directly. It will have the same epoll_wait() behaviour.
>
> I'm still a bit confused by how the pthread implementation fits
> together. Correct me if the following is wrong, please:
> Whenever the user wants to cancel a pthread, glibc eventually calls
> {sys-}tgkill() upon the given thread, causing the kernel to return EINTR
> to the blocking system call, in this case epoll_wait(). It is glibc's
> job to catch this return value and realize that the thread is ready to be
> killed, which it is not doing in the case of epoll_wait().
> Or is the "current thread has been cancelled and should be killed" check
> happening elsewhere / in some other way?

Please do not make me look at the glibc/pthread code, since I do not have
time ATM. I can only speculate on what is happening. The sys_poll() and
sys_epoll_wait() system calls, when called directly, have the same
behaviour (as you can see in the test code snippet). They both return
EINTR to the caller. When you call glibc's poll(), the behaviour changes
and the function is explicitly made a pthread cancellation point. glibc's
epoll_wait() is not wrapped by the same code, and this makes it unable to
be pthread-canceled. Try posting the code snippet to the glibc list, and see
if they want to make epoll_wait() pthread-cancel enabled too.

- Davide

