Bug 17977 - gethostbyname_r hangs forever
Summary: gethostbyname_r hangs forever
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: 2.22
Assignee: Dmitry V. Levin
URL: https://sourceware.org/ml/libc-alpha/...
Keywords:
Depends on:
Blocks: 5375
  Show dependency treegraph
 
Reported: 2015-02-13 20:51 UTC by Eric Newton
Modified: 2015-12-04 18:28 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
proposed patch for the bug (174 bytes, patch)
2015-02-13 20:51 UTC, Eric Newton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Newton 2015-02-13 20:51:55 UTC
Created attachment 8126 [details]
proposed patch for the bug

A large (java) multi-threaded server process was found to be hanging on calls to gethostbyname_r.

It was further determined that it only hung when /etc/hosts.conf contained "reorder on".

Inspecting the source for _res_hconf_reorder_addrs, it is straightforward to see the bug.

Assume there are 3 threads executing the function at the same time. All see num_ifs is -1 at line 407, and attempt to get the lock on line 422.

One thread gets the lock at line 422, initializes the static data structure, and unlocks the lock.

The next thread gets the lock.  It double-checks the value of num_ifs at line 425.  Seeing that it is now >0, it skips the initialization.

But this thread does not unlock the lock.

The last thread hangs on the lock forever.
Comment 1 Andreas Schwab 2015-02-13 21:31:26 UTC
Funny.  The bug was introduced by this change:

        [BZ #5375]
        * resolv/res_hconf.c (_res_hconf_reorder_addrs): Fix locking when
        initializing interface list.

The proposed patch in bug 5375 got it right.
Comment 2 Florian Weimer 2015-02-17 19:56:45 UTC
Eric, did this cause service availability issues in your setup?  Do you think it would be possible to trigger this deliberately?
Comment 3 Eric Newton 2015-02-17 21:12:23 UTC
The error did cause service availability failures.

Note:
* it requires the "reorder on" configuration
* it requires at least 3 threads attempting to run the _res_hconf_reorder_addrs function concurrently to lock up a single thread
* these concurrent accesses must be on the first run of the function
* future requests of _res_hconf_reorder_addrs will work just fine

In this case, a component of a large distributed database was asked by hundred of other services to perform work at the moment it came online. These requests locked other resources before performing a name lookup.

It would be hard, but not impossible to do this deliberately to a service.

It was not deliberately caused when it was found.  Just (un)lucky.
Comment 4 Florian Weimer 2015-02-17 21:24:09 UTC
(In reply to Eric Newton from comment #3)
> Note:
> * it requires the "reorder on" configuration
> * it requires at least 3 threads attempting to run the
> _res_hconf_reorder_addrs function concurrently to lock up a single thread
> * these concurrent accesses must be on the first run of the function
> * future requests of _res_hconf_reorder_addrs will work just fine

> It would be hard, but not impossible to do this deliberately to a service.

Based on your third note, it seems to me that this can only happen once per process, right?  Wouldn't this make it difficult to trigger *deliberately* in a typical long-running multi-threaded service?

(I'm trying to figure out if we have to treat this as a security issue or a mere bug.)
Comment 5 Eric Newton 2015-02-18 17:21:08 UTC
You are correct: it only happens once per process.
Comment 6 Christopher 2015-06-17 22:50:31 UTC
Is there any way this patch can be applied? It seems like a small, trivial fix to a potentially big, hard to debug, problem.

It'd be really useful to get this patch in, so it can be applied downstream, in https://bugzilla.redhat.com/show_bug.cgi?id=1192621
Comment 7 Sourceware Commits 2015-06-23 09:51:51 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  b57525f1a376149840f740a31535681c07152ba4 (commit)
       via  47852c972d1ad80d8b38d9e94507b27df0ede421 (commit)
      from  554edb23ffc7a953ca86309cc5f02dbd1a63abe0 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b57525f1a376149840f740a31535681c07152ba4

commit b57525f1a376149840f740a31535681c07152ba4
Author: Dmitry V. Levin <ldv@altlinux.org>
Date:   Thu Jun 18 21:40:46 2015 +0000

    Fix potential hanging of gethostbyaddr_r/gethostbyname_r
    
    When "reorder" resolver option is enabled, threads of a multi-threaded process
    could hang in gethostbyaddr_r, gethostbyname_r, or gethostbyname2_r.
    
    Due to a trivial bug in _res_hconf_reorder_addrs, simultaneous
    invocations of this function in a multi-threaded process could result to
    _res_hconf_reorder_addrs returning without releasing the lock it holds,
    causing other threads to block indefinitely while waiting for the lock
    that is not going to be released.
    
    [BZ #17977]
    * resolv/res_hconf.c (_res_hconf_reorder_addrs): Fix unlocking
    when initializing interface list, based on the bug analysis
    and the patch proposed by Eric Newton.
    * resolv/tst-res_hconf_reorder.c: New test.
    * resolv/Makefile [$(have-thread-library) = yes] (tests): Add
    tst-res_hconf_reorder.
    ($(objpfx)tst-res_hconf_reorder): Depend on $(libdl)
    and $(shared-thread-library).
    (tst-res_hconf_reorder-ENV): New variable.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=47852c972d1ad80d8b38d9e94507b27df0ede421

commit 47852c972d1ad80d8b38d9e94507b27df0ede421
Author: Dmitry V. Levin <ldv@altlinux.org>
Date:   Mon Jun 22 09:57:14 2015 +0000

    _res_hconf_reorder_addrs: fix typo in comment
    
    * resolv/res_hconf.c (_res_hconf_reorder_addrs): Fix typo in comment.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                      |   16 ++++++
 NEWS                           |   20 ++++----
 resolv/Makefile                |    4 ++
 resolv/res_hconf.c             |    6 +-
 resolv/tst-res_hconf_reorder.c |  112 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 145 insertions(+), 13 deletions(-)
 create mode 100644 resolv/tst-res_hconf_reorder.c
Comment 8 Dmitry V. Levin 2015-06-23 09:55:51 UTC
Fixed in master.
Comment 10 Florian Weimer 2015-12-04 18:28:12 UTC
Please disregard the previous comment.