Indefinite hang in getaddrinfo / check_pf / make_request
Tue Sep 29 22:05:00 GMT 2015
On Sep 24, 2015, at 11:36 AM, Steven Schlansker <firstname.lastname@example.org> wrote:
> On Sep 22, 2015, at 9:59 PM, Steven Schlansker <email@example.com> wrote:
>>> On Sep 22, 2015, at 9:04 PM, Paul Pluzhnikov <firstname.lastname@example.org> wrote:
>>> On Tue, Sep 22, 2015 at 8:53 PM, Steven Schlansker
>>> <email@example.com> wrote:
>>>> We found the following issue:
>>> You may be seeing https://sourceware.org/bugzilla/show_bug.cgi?id=12926 instead.
>>> See if that patch has been applied to your sources as well.
>> Thanks for finding this. While that fix is not applied to our deployed version,
>> I think the symptoms are slightly different.
> Thanks Paul and Adhemerval for the advice. I believe I have evidence that this is
> not the same issue as either 15946 or 12926.
> I am going to spend some time distilling a test case that exercises only the
> check_pf code, to see if I can reproduce the hang in isolation.
> In the meantime, does anyone have ideas for further diagnostics that would be
> useful? I'm not sure how to check the kernel side of the netlink socket
> effectively, to see whether it actually tried to reply.
Hello again, in case anyone stumbles across this in the future --
I wrote a test case and narrowed the problem down further. It seems to be
related to incorrect kernel handling of netlink sockets; under contention
they can get lost.
Kernel 4.0.4 is known to be affected. We're testing out 4.0.9
in the hopes it is not. So this is in fact a new bug, albeit
not a glibc bug.
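
For anyone who wants to check a machine for this kind of hang, here is a
minimal sketch of a contention test. It is not the test case mentioned
above, only an illustration. It assumes that glibc queries the kernel over
netlink (the check_pf / make_request path) when getaddrinfo is called with
AI_ADDRCONFIG, and it uses a numeric address so no DNS traffic is involved.
The function names, thread count, iteration count, and timeout are all
hypothetical choices.

```python
import socket
import threading
import time


def resolve_loop(host, iterations, done):
    """Call getaddrinfo repeatedly, then signal completion."""
    for _ in range(iterations):
        try:
            # AI_ADDRCONFIG is assumed to make glibc ask the kernel about
            # configured addresses; AI_NUMERICHOST avoids any DNS lookup.
            socket.getaddrinfo(
                host, 80,
                family=socket.AF_UNSPEC,
                type=socket.SOCK_STREAM,
                flags=socket.AI_ADDRCONFIG | socket.AI_NUMERICHOST,
            )
        except socket.gaierror:
            # A resolution error still means the call returned; only a
            # call that never returns counts as a hang here.
            pass
    done.set()


def stress_getaddrinfo(host="127.0.0.1", threads=8, iterations=200,
                       timeout=30.0):
    """Return how many workers failed to finish within `timeout` seconds."""
    events = []
    for _ in range(threads):
        done = threading.Event()
        # Daemon threads, so a worker stuck inside getaddrinfo cannot
        # keep the process from exiting.
        threading.Thread(target=resolve_loop,
                         args=(host, iterations, done),
                         daemon=True).start()
        events.append(done)

    # One shared deadline for all workers, not `timeout` per worker.
    deadline = time.monotonic() + timeout
    stuck = 0
    for done in events:
        remaining = max(0.0, deadline - time.monotonic())
        if not done.wait(remaining):
            stuck += 1
    return stuck
```

Calling stress_getaddrinfo() on a healthy system should return 0. A nonzero
result only says that some calls did not return in time; it does not by
itself prove the netlink problem described here, so a stack trace of the
stuck threads would still be needed to confirm where they are blocked.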
Thank you for your time.