patch 5 debuginfod: prometheus metrics

Mark Wielaard mark@klomp.org
Fri Nov 15 17:26:00 GMT 2019


Hi,

On Thu, 2019-11-07 at 04:08 -0500, Frank Ch. Eigler wrote:
> This webapi extensions allows admins to hook up debuginfod to a
> prometheus-compatible monitoring system for general situational
> statistics.  The metrics are simple enough that local curl requests
> can give a user a sense of what's going on.  The metrics are
> documented as unstable with respect to future versions.
> +.SS /metrics
> +
> +This endpoint returns a Prometheus formatted text/plain dump of a
> +variety of statistics about the operation of the debuginfod server.
> +The exact set of metrics and their meanings may change in future
> +versions.  Caution: configuration information (path names, versions)
> +may be disclosed.

Could you also add a reference to the Prometheus Exposition format. I
see it is already in a comment in the code. Best to also add it as See
also in the docs.

>  .SH DATA MANAGEMENT
>  
>  debuginfod stores its index in an sqlite database in a densely
> packed
> @@ -291,7 +299,8 @@ a denial-of-service in terms of RAM, CPU, disk
> I/O, or network I/O.
>  If this is a problem, users are advised to install debuginfod with a
>  HTTPS reverse-proxy front-end that enforces site policies for
>  firewalling, authentication, integrity, authorization, and load
> -control.
> +control.  The \fI/metrics\fP webapi endpoint is probably not
> +appropriate for disclosure to the public.

So, should there be an option to turn it off?
 
>  When relaying queries to upstream debuginfods, debuginfod \fBdoes not\fP
>  include any particular security features.  It trusts that the binaries
> diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
> index 7c7a0c5d7ef5..efe8c80fa081 100644
> --- a/debuginfod/debuginfod.cxx
> +++ b/debuginfod/debuginfod.cxx
> @@ -72,6 +72,7 @@ extern "C" {
>  #include <cstring>
>  #include <vector>
>  #include <set>
> +#include <map>
>  #include <string>
>  #include <iostream>
>  #include <iomanip>
> @@ -98,6 +99,14 @@ using namespace std;
>  #include <sys/syscall.h>
>  #endif
>  
> +#ifdef __linux__
> +#define gettid() syscall(SYS_gettid)
> +#else
> +#define gettid() pthread_self()
> +#endif

You might want to rename this since newer glibc might expose gettid().

The rest of the code looks good as far as I can see.
But I would suggest you add a command line option to disable the
metrics, which would not install the metrics handler and make the
metrics update functions noops.

> diff --git a/tests/ChangeLog b/tests/ChangeLog
> index 3d50ee8623ee..156a693f8886 100644
> --- a/tests/ChangeLog
> +++ b/tests/ChangeLog
> @@ -1,3 +1,8 @@
> +2019-11-07  Frank Ch. Eigler  <fche@redhat.com>
> +
> +	* run-debuginfod-find.sh: Test debuginfod metrics via curl.
> +	Fix federated testing, asserted by metrics.
> +
>  2019-11-06  Frank Ch. Eigler  <fche@redhat.com>
>  
>  	* run-debuginfod-find.sh: Test debuginfod -L mode.  Drop
> diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
> index 644901073d75..4c3e3cb306c2 100755
> --- a/tests/run-debuginfod-find.sh
> +++ b/tests/run-debuginfod-find.sh
> @@ -181,7 +181,8 @@ sleep 3
>  
>  # have clients contact the new server
>  export DEBUGINFOD_URLS=http://localhost:$PORT2
> -testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1
> +rm -rf $DEBUGINFOD_CACHE_PATH
> +testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
>  
>  # confirm that first server can't resolve symlinked info in L/ but second can
>  BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
> @@ -202,6 +203,16 @@ export DEBUGINFOD_URLS="BAD http://localhost:$PORT1 localhost:$PORT1 http://loca
>  
>  testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog2 1
>  
> +########################################################################
> +
> +# Fetch some metrics, if curl program is installed
> +if which curl 2>/dev/null; then
> +    curl http://localhost:$PORT1/badapi
> +    curl http://localhost:$PORT1/metrics
> +    curl http://localhost:$PORT2/metrics
> +    curl http://localhost:$PORT1/metrics | grep -q 'http_responses_total.*result.*error'
> +    curl http://localhost:$PORT2/metrics | grep -q 'http_responses_total.*result.*upstream'
> +fi
>  
>  ########################################################################

I think it is better to check with:

if type curl >/dev/null 2>&1; then

Which avoid executing which, which might not be installed...

Cheers,

Mark



More information about the Elfutils-devel mailing list