rfc/patch: debuginfod client $DEBUGINFOD_PROGRESS env var

Mark Wielaard mark@klomp.org
Tue Dec 10 23:01:00 GMT 2019


Hi Frank,

On Wed, 2019-12-04 at 16:10 -0500, Frank Ch. Eigler wrote:
>     debuginfod: usability tweaks, incl. $DEBUGINFOD_PROGRESS client
> support
>     
>     This facility allows a default progress-printing function
>     to be installed if the given environment variable is set.

I like this idea but have a bit of a security concern about the ability
to set it to any file. If someone manages to set it then they can
overwrite anything a process using the library can write to. I rather
it would just use stderr always when set since it is only meant as a
human debugging device.

>     Some larger usage experience (systemtap/kernels) indicates
>     the default timeout is too short, so bumped it to 30s.

My first thought is that we might need two timeouts. The original 5s
timeout is a good inactivity timeout, if the server couldn't be
connected to in 5 seconds, doesn't provide a response within 5 seconds
or a download doesn't see new data within 5 seconds that probably means
the data won't arrive, or too slow to be useful. But for larger data we
might want a timeout for the total download time, maybe 30 seconds is
too low for that. Given how large kernel debuginfo is it might take
several minutes.

libcurl does say about CURLOPT_TIMEOUT: "Since this option puts a hard
limit on how long time a request is  allowed to take, it has limited
use in dynamic use cases with varying transfer times. That is
especially apparent when using the multi interface, which may queue the
transfer, and that time is included." And it also has a
CURLOPT_CONNECTTIMEOUT option.

>     Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
> 
> diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
> index 8aa2944..a9483d0 100644
> --- a/debuginfod/ChangeLog
> +++ b/debuginfod/ChangeLog
> @@ -1,3 +1,13 @@
> +2019-12-04  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* debuginfod-client.c (default_progressfn): New function.
> +	(debuginfod_begin): Use it if $DEBUGINFOD_PROGRESS set.
> +	(server_timeout): Bump to 30 seconds.
> +	(debuginfod_query_server): Call progressfn -after- rather than
> +	before curl ops, to make it likely that a successful transfer
> +	results in final a=b call.  Tweak cleanup sequence.
> +	* debuginfod.h: Document $DEBUGINFOD_PROGRESS name.
> +
>  2019-11-26  Mark Wielaard  <mark@klomp.org>
>  
>  	* Makefile.am (BUILD_STATIC): Add needed libraries for libdw and
> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
> index 6e62b86..3ee5e48 100644
> --- a/debuginfod/debuginfod-client.c
> +++ b/debuginfod/debuginfod-client.c
> @@ -40,6 +40,7 @@
>  
>  #include "config.h"
>  #include "debuginfod.h"
> +#include "lib/system.h"
>  #include <assert.h>
>  #include <dirent.h>
>  #include <stdio.h>
> @@ -107,7 +108,7 @@ static const char url_delim_char = ' ';
>  /* Timeout for debuginfods, in seconds.
>     This env var must be set for debuginfod-client to run.  */
>  static const char *server_timeout_envvar = DEBUGINFOD_TIMEOUT_ENV_VAR;
> -static int server_timeout = 5;
> +static int server_timeout = 30;

So maybe have a new DEBUGINFOD_PROGRESS_TIMEOUT environment variable
with the original 5 second default, which would timeout the connection
if it doesn't succeed within 5 seconds, or if the download didn't see
any progress within the progress_timeout?

Then you could increase the server_timeout to something larger, say 60
seconds, which might be more appropriate for larger debuginfo files?

>  /* Data associated with a particular CURL easy handle. Passed to
>     the write callback.  */
> @@ -511,6 +512,28 @@ debuginfod_query_server (debuginfod_client *c,
>      {
>        CURLMcode curl_res;
>  
> +      /* Wait 1 second, the minimum DEBUGINFOD_TIMEOUT.  */
> +      curl_multi_wait(curlm, NULL, 0, 1000, NULL);
> +
> +      /* If the target file has been found, abort the other queries.  */
> +      if (target_handle != NULL)
> +        for (int i = 0; i < num_urls; i++)
> +          if (data[i].handle != target_handle)
> +            curl_multi_remove_handle(curlm, data[i].handle);
> +
> +      curl_res = curl_multi_perform(curlm, &still_running);
> +
> +      if (curl_res != CURLM_OK)
> +        {
> +          switch (curl_res)
> +            {
> +            case CURLM_CALL_MULTI_PERFORM: continue;
> +            case CURLM_OUT_OF_MEMORY: rc = -ENOMEM; break;
> +            default: rc = -ENETUNREACH; break;
> +            }
> +          goto out1;
> +        }
> +
>        if (c->progressfn) /* inform/check progress callback */
>          {
>            loops ++;
> @@ -554,27 +577,6 @@ debuginfod_query_server (debuginfod_client *c,
>            if ((*c->progressfn) (c, pa, pb))
>              break;
>          }
> -
> -      /* Wait 1 second, the minimum DEBUGINFOD_TIMEOUT.  */
> -      curl_multi_wait(curlm, NULL, 0, 1000, NULL);
> -
> -      /* If the target file has been found, abort the other queries.  */
> -      if (target_handle != NULL)
> -        for (int i = 0; i < num_urls; i++)
> -          if (data[i].handle != target_handle)
> -            curl_multi_remove_handle(curlm, data[i].handle);
> -
> -      curl_res = curl_multi_perform(curlm, &still_running);
> -      if (curl_res != CURLM_OK)
> -        {
> -          switch (curl_res)
> -            {
> -            case CURLM_CALL_MULTI_PERFORM: continue;
> -            case CURLM_OUT_OF_MEMORY: rc = -ENOMEM; break;
> -            default: rc = -ENETUNREACH; break;
> -            }
> -          goto out1;
> -        }
>      } while (still_running);

This seems to mean that if there is an error then the progress function
is never called, is that intended? Also for CURLM_CALL_MULTI_PERFORM we
continue/skip the progress function.

>    /* Check whether a query was successful. If so, assign its handle
> @@ -673,9 +675,9 @@ debuginfod_query_server (debuginfod_client *c,
>  
>    curl_multi_cleanup(curlm);
>    unlink (target_cache_tmppath);
> +  close (fd); /* before the rmdir, otherwise it'll fail */
>    (void) rmdir (target_cache_dir); /* nop if not empty */
>    free(data);
> -  close (fd);

This looks good. Just surprising. I assumed unlinking the file was
enough.

>   out0:
>    free (server_urls);
> @@ -684,6 +686,35 @@ debuginfod_query_server (debuginfod_client *c,
>    return rc;
>  }
>  
> +
> +/* Activate a basic form of progress tracing */
> +static int
> +default_progressfn (debuginfod_client *c, long a, long b)
> +{
> +  (void) c;
> +
> +  int fd = open(getenv(DEBUGINFOD_PROGRESS_ENV_VAR),
> +		O_APPEND|O_WRONLY|O_CREAT, 0666);
> +  if (fd < 0)
> +    goto out;
> +
> +  char *msg = NULL;
> +  int rc = asprintf(&msg,
> +		    "Downloading from debuginfod %ld/%ld%s", a, b,
> +		    ((a == b) ? "\n" : "\r")); /* XXX: include URL - stateful */
> +  if (rc < 0)
> +    goto out1;
> +
> +  (void) write_retry(fd, msg, rc);
> +  free (msg);
> +
> +  out1:
> +    close (fd);
> +  out:
> +    return 0;
> +}

The \r \n trick is nice.
Just always using stderr would simplify this a bit.
Also note that you can use dprintf to printf to a file descriptor. 

>  /* See debuginfod.h  */
>  debuginfod_client  *
>  debuginfod_begin (void)
> @@ -692,7 +723,12 @@ debuginfod_begin (void)
>    size_t size = sizeof (struct debuginfod_client);
>    client = (debuginfod_client *) malloc (size);
>    if (client != NULL)
> -    client->progressfn = NULL;
> +    {
> +      if (getenv(DEBUGINFOD_PROGRESS_ENV_VAR))
> +	client->progressfn = default_progressfn;
> +      else
> +	client->progressfn = NULL;
> +    }
>    return client;
>  }

OK. 

> diff --git a/debuginfod/debuginfod.h b/debuginfod/debuginfod.h
> index 6b1b1cc..33fae86 100644
> --- a/debuginfod/debuginfod.h
> +++ b/debuginfod/debuginfod.h
> @@ -33,6 +33,7 @@
>  #define DEBUGINFOD_URLS_ENV_VAR "DEBUGINFOD_URLS"
>  #define DEBUGINFOD_CACHE_PATH_ENV_VAR "DEBUGINFOD_CACHE_PATH"
>  #define DEBUGINFOD_TIMEOUT_ENV_VAR "DEBUGINFOD_TIMEOUT"
> +#define DEBUGINFOD_PROGRESS_ENV_VAR "DEBUGINFOD_PROGRESS"
>  
>  /* Handle for debuginfod-client connection.  */
>  typedef struct debuginfod_client debuginfod_client;
> diff --git a/doc/ChangeLog b/doc/ChangeLog
> index 00a61ac..9542161 100644
> --- a/doc/ChangeLog
> +++ b/doc/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-12-04  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* debuginfod-find.1: Bump default timeout to 30.
> +	* debuginfod_find_debuginfo.3: Ditto.
> +	Document $DEBUGINFOD_PROGRESS.
> +
>  2019-09-02  Mark Wielaard  <mark@klomp.org>
>  
>  	* readelf.1 (symbols): Add optional section name.
> diff --git a/doc/debuginfod-find.1 b/doc/debuginfod-find.1
> index a759ecb..7f14e8c 100644
> --- a/doc/debuginfod-find.1
> +++ b/doc/debuginfod-find.1
> @@ -121,7 +121,7 @@ debuginfod instances.  Alternate URL prefixes are separated by space.
>  .B DEBUGINFOD_TIMEOUT
>  This environment variable governs the timeout for each debuginfod HTTP
>  connection.  A server that fails to respond within this many seconds
> -is skipped.  The default is 5.
> +is skipped.  The default is 30.
>  
>  .TP 21
>  .B DEBUGINFOD_CACHE_PATH
> diff --git a/doc/debuginfod_find_debuginfo.3 b/doc/debuginfod_find_debuginfo.3
> index be8eed0..63766b3 100644
> --- a/doc/debuginfod_find_debuginfo.3
> +++ b/doc/debuginfod_find_debuginfo.3
> @@ -165,7 +165,15 @@ debuginfod instances.  Alternate URL prefixes are separated by space.
>  .B DEBUGINFOD_TIMEOUT
>  This environment variable governs the timeout for each debuginfod HTTP
>  connection.  A server that fails to respond within this many seconds
> -is skipped.  The default is 5.
> +is skipped.  The default is 30.
> +
> +.TP 21
> +.B DEBUGINFOD_PROGRESS
> +This environment variable governs the default progress function.  If
> +set, and if a progressfn is not explicitly set, then the library will
> +configure a default progressfn.  This function will append a simple
> +progress message periodically to the given file.  Consider using
> +"/dev/stderr" on platforms that support it.  The default is nothing.
>
>  .TP 21
>  .B DEBUGINFOD_CACHE_PATH
> diff --git a/tests/ChangeLog b/tests/ChangeLog
> index 6e3923f..d63d0eb 100644
> --- a/tests/ChangeLog
> +++ b/tests/ChangeLog
> @@ -1,3 +1,7 @@
> +2019-12-04  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* run-debuinfod-find.sh: Test $DEBUGINFOD_PROGRESS.
> +
>  2019-11-26  Mark Wielaard  <mark@klomp.org>
>  
>  	* Makefile.am (BUILD_STATIC): Add libraries needed for libdw.
> diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
> index 0ade03b..be6d565 100755
> --- a/tests/run-debuginfod-find.sh
> +++ b/tests/run-debuginfod-find.sh
> @@ -153,8 +153,11 @@ cmp $filename F/prog2
>  cat vlog
>  grep -q Progress vlog
>  tempfiles vlog
> -filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable $BUILDID2`
> +filename=`testrun env DEBUGINFOD_PROGRESS=vlog2 ${abs_top_builddir}/debuginfod/debuginfod-find executable $BUILDID2`
>  cmp $filename F/prog2
> +cat vlog2
> +grep -q Downloading vlog2
> +tempfiles vlog2
>  filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID2 ${PWD}/prog2.c`
>  cmp $filename ${PWD}/prog2.c

To test if you just use /dev/stderr would just need to redirect stderr
to a file.



More information about the Elfutils-devel mailing list