[PATCH] __check_pf: Add a cancellation cleanup handler [BZ #20975]

Noah Goldstein goldstein.w.n@gmail.com
Thu Apr 27 20:42:50 GMT 2023


On Thu, Apr 27, 2023 at 3:06 PM H.J. Lu via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> There are reports for hang in __check_pf:
>
> https://github.com/JoeDog/siege/issues/4
>
> It is reproducible only under specific configurations:
>
> 1. Large number of cores (>= 64) and large number of threads (> 3X of
> the number of cores) with long lived socket connection.
> 2. Low power (frequency) mode.
> 3. Power management is enabled.
>
> While holding lock, __check_pf calls make_request which calls __sendto
> and __recvmsg.  Since __sendto and __recvmsg are cancellation points,
> lock held by __check_pf won't be released and can cause deadlock when
> thread cancellation happens in __sendto or __recvmsg.  Add a cancellation
> cleanup handler for __check_pf to unlock the lock when cancelled by
> another thread.  This fixes BZ #20975 and the siege hang issue.
> ---
>  sysdeps/unix/sysv/linux/Makefile   |  2 ++
>  sysdeps/unix/sysv/linux/check_pf.c | 15 +++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
> index aec7a94785..0160be8790 100644
> --- a/sysdeps/unix/sysv/linux/Makefile
> +++ b/sysdeps/unix/sysv/linux/Makefile
> @@ -529,6 +529,8 @@ sysdep_headers += \
>  sysdep_routines += \
>    netlink_assert_response \
>    # sysdep_routines
> +
> +CFLAGS-check_pf.c += -fexceptions
>  endif
>
>  # Don't compile the ctype glue code, since there is no old non-GNU C library.
> diff --git a/sysdeps/unix/sysv/linux/check_pf.c b/sysdeps/unix/sysv/linux/check_pf.c
> index b157c5126c..2b0b8b6368 100644
> --- a/sysdeps/unix/sysv/linux/check_pf.c
> +++ b/sysdeps/unix/sysv/linux/check_pf.c
> @@ -292,6 +292,14 @@ make_request (int fd, pid_t pid)
>    return NULL;
>  }
>
> +#ifdef __EXCEPTIONS
> +static void
> +cancel_handler (void *arg __attribute__((unused)))
> +{
> +  /* Release the lock.  */
> +  __libc_lock_unlock (lock);
> +}
> +#endif
>
>  void
>  attribute_hidden
> @@ -304,6 +312,10 @@ __check_pf (bool *seen_ipv4, bool *seen_ipv6,
>    struct cached_data *olddata = NULL;
>    struct cached_data *data = NULL;
>
> +#ifdef __EXCEPTIONS
> +  /* Make sure that lock is released when the thread is cancelled.  */
> +  __libc_cleanup_push (cancel_handler, NULL);
> +#endif
>    __libc_lock_lock (lock);
>
>    if (cache_valid_p ())
> @@ -338,6 +350,9 @@ __check_pf (bool *seen_ipv4, bool *seen_ipv6,
>         }
>      }
>
> +#ifdef __EXCEPTIONS
> +  __libc_cleanup_pop (0);
> +#endif
>    __libc_lock_unlock (lock);
>
>    if (data != NULL)
> --
> 2.40.0
>

Can we add a test the will just XFAIL (or XPASS) if cores < 64?


More information about the Libc-alpha mailing list