This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: patch 2/2 debuginfod server etc.

From: "Frank Ch. Eigler" <fche at redhat dot com>
To: Mark Wielaard <mark at klomp dot org>
Cc: elfutils-devel at sourceware dot org, amerey at redhat dot com
Date: Fri, 15 Nov 2019 16:00:47 -0500
Subject: Re: patch 2/2 debuginfod server etc.
References: <20191028190438.GC14349@redhat.com> <20191028190602.GD14349@redhat.com> <20191028190726.GE14349@redhat.com> <f2e22335b1e5f671768d3b24d462e442d46b916a.camel@klomp.org>

Hi -

> > +// Roll this identifier for every sqlite schema incompatiblity.
> > +#define BUILDIDS "buildids9"
> > +
> > +#if SQLITE_VERSION_NUMBER >= 3008000
> > +#define WITHOUT_ROWID "without rowid"
> > +#else
> > +#define WITHOUT_ROWID ""
> > +#endif
> > +
> > +static const char DEBUGINFOD_SQLITE_DDL[] =
> > +  "pragma foreign_keys = on;\n"
> > +  "pragma synchronous = 0;\n" // disable fsync()s - this cache is disposa> [...]

> Is there any way these sql DDL statements can be put somewhere else?

Not an easy way.  Some of the content is parametrized based on sqlite
version; some of it on the schema-version literal BUILDIDS.


> > +// Print a standard timestamp.
> > +static ostream&
> > +timestamp (ostream &o)
> > +  char *now2 = ctime (&now);

> I think you want to use ctime_r with a stack allocated char[26].
> Also does the now2[19] always work? Isn't it better to strchr for the
> '\n'. Or maybe just use strftime "%c"?

Good idea.


> [...]
> Note that newer glibc actually define a function called gettid:
> https://sourceware.org/bugzilla/show_bug.cgi?id=6399
> So you might want to rename it to something that doesn't accidentally
> clashes.

OK.


> > +static string
> > +conninfo (struct MHD_Connection * conn)
> > +{
> > +  char hostname[128];
> > +  char servname[128];
> 
> boo, hardcode name sizes. But ok.

Yeah, that's the nature of the API though.  Will bump up those sizes a bit.


> > +  string popen_cmd = string("/usr/bin/rpm2cpio " + shell_escape(b_source0));
> 
> Why the hardcoded path?
> Could you check at startup if rpm2cpio is in the PATH?

Hm, since this is run under popen(), so it'll do $PATH searching for
us.  Checking whether it is present at runtime ... hmmm ugh ... how
about an autoconf test instead?  The worst thing that happens with the
current code is that on a non-rpm system, if it does find .rpm files,
the code will print errors and otherwise ignore RPMs.


> > +      // extract this file to a temporary file
> > +      char tmppath[PATH_MAX] = "/tmp/debuginfod.XXXXXX"; // XXX: $TMP_DIR etc.
> 
> Some other code uses:
>   const char *tmpdir = getenv ("TMPDIR") ?: P_tmpdir;
>   static const char suffix[] = "/debuginfod.XXXXXX";
> Also PATH_MAX?

OK -- and yeah if we getenv() we might need PATH_MAX.
Will try asprintf() here and the client instead.


> > +      rc = archive_read_data_into_fd (a, fd);
> > +      struct MHD_Response* r = MHD_create_response_from_fd (archive_entry_size(e), fd);
> > +      if (r == 0)
> > +        {
> > +          if (verbose)
> > +            obatched(clog) << "cannot create fd-response for " << b_source0 << endl;
> > +          close(fd);
> > +        }
> 
> Should this break; ?

Sure, continuing the loop just wastes time.

> Also I prefer checking against NULL, it is slightly more obvious (0
> returns  often means success).

... the C++ tradition is 0 for null pointers, but if you insist...


> BTW. The usage of "R" or _r_ in the code is slightly confusing. I
> would prefer it to just say RPM. Or isn't that what is meant?

Changed in the diagnostics, for user consumption.  In the code,
I kind of like the short size that lets f and r (and near future
d debs) look pretty parallel.


> > +////////////////////////////////////////////////////////////////////////
> > +
> > +
> > +// borrowed from src/nm.c get_local_names()
> 
> This is slightly misleading, most of the function is not from there.

Added a clarifying adverb for this credit kudos.


> > +  Dwarf_Off offset = 0;
> > +  Dwarf_Off old_offset;
> > +  size_t hsize;
> > +
> > +  while (dwarf_nextcu (dbg, old_offset = offset, &offset, &hsize, NULL, NULL, NULL) == 0)
> 
> These days I would prefer dwarf_get_units (). It is slightly higher
> level and immediately gives you the cudie and unit_type.

Will look into that later.


> > +    {
> > +      Dwarf_Die cudie_mem;
> > +      Dwarf_Die *cudie = dwarf_offdie (dbg, old_offset + hsize, &cudie_mem);
> > +
> > +      if (cudie == NULL)
> > +        continue;
> > +      if (dwarf_tag (cudie) != DW_TAG_compile_unit)
> > +        continue;
> > +
> > +      const char *cuname = dwarf_diename(cudie) ?: "unknown";
> > +
> > +      Dwarf_Files *files;
> > +      size_t nfiles;
> > +      if (dwarf_getsrcfiles (cudie, &files, &nfiles) != 0)
> > +        continue;
> 
> So you are really only interested in the file/line tables.
> In that case you could also use dwarf_next_lines which iterates through
> the debug line units directly, so you don't need to do the whole CU DIE
> tree iteration yourself (and it handles CUless tables).

Ditto.


> > +          string waldo;
> > +          if (hat[0] == '/') // absolute
> > +            waldo = (string (hat));
> > +          else // comp_dir relative
> > +            waldo = (string (comp_dir) + string("/") + string (hat));
> 
> Do you have to think about/handle a comp_dir that ends with a / ?
> Old debugedit truncated some strings by adding /// (to fill up the
> spaces till the '\0'...) Yes, terrible :{

It should just work(tm) if the debugger uses the documented convention
and preserves those extra slashes (just as if it preserved ../ and
such).


> > +                // NB: we catch exceptions from elf_classify here too, so that we can
> > +                // cache the corrupt-elf case (!executable_p && !debuginfo_p) just below,
> > +                // just as if we had an EPERM error from open(2).
> > +                    
> > +                catch (const reportable_exception& e)
> > +                  {
> > +                    e.report(clog);
> > +                  }
> 
> I think the comment is wrong since elf_classify seems to eat its own
> reportable_exceptions (and those thrown from
> dwarf_extract_source_paths) because it has its own try catch for
> reportable_exceptions and doesn't rethrow them.

Hm yeah, the comment describes an earlier version of code.  Will just
tweak the comment for now, but may be able to simplify the
elf_classify() side later on, using one of those fancy deferred_dtor<>
widgets.



> > +      sleep (1);
> > +      rescan_timer ++;
> > +      if (rescan_s)
> > +        rescan_timer %= rescan_s;
> > +    }
> > +  
> > +  return 0;
> > +}
> 
> Can we use something nicer than the hardcode sleep (1) ?

Sort of ... doing it this way allows us to have up-to-date metrics (as
per the following patch) as to how much longer to expect to wait for
any given scanner thread to get going again ... and protects us from
the case where only a single particular thread gets interrupted to
serve a signal like SIGUSR*, and the others just sleep for the much
larger rescan_s times, and miss responding quickly.

> > +  // XXX: _r_dalt
> 
> What?

Earlier thoughts, nuked.


> > +                if (0)  // XXX: if unsatisfied debugalt set is non-empty ...:
> > +                  break;
> 
> What?

Another earlier thought, mooted by code just later.



> > +static void
> > +signal_handler (int /* sig */)
> > +{
> > +  interrupted ++;
> > +
> > +  if (db)
> > +    sqlite3_interrupt (db);
> > +  
> > +  // NB: don't do anything else in here
> > +}
> 
> Nothing ever sets db to NULL after sqlite3_open.
> The documentation of sqlite3_interrupt says:
> 
> "it is not safe to call this routine with a database connection that is
> closed or might close before sqlite3_interrupt() returns"
> 
> So it seems this might cause trouble when this is called just
> before/while sqlite3_close (db) is being called.

Good catch, will clear db at close time.  OTOH the damage from such a
very late even SIGSEGV would be small indeed, as by then, all the
databases are closed, threads shut down, etc.


- FChE

Follow-Ups:
- Re: patch 2/2 debuginfod server etc.
  - From: Mark Wielaard

References:
- patch 0/2 debuginfod submission
  - From: Frank Ch. Eigler
- patch 1/2 debuginfod client
  - From: Frank Ch. Eigler
- patch 2/2 debuginfod server etc.
  - From: Frank Ch. Eigler
- Re: patch 2/2 debuginfod server etc.
  - From: Mark Wielaard

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]