[PATCH 2/2] gdb: Allow reading of enum definitions in PDB files

Thu May 11 13:41:05 GMT 2023

>>>>> "Mark" == Mark Harmstone <mark@harmstone.com> writes:

Mark> Adds a new file pdbread.c, which parses LF_ENUM records in PDB files.

Hi.  Thank you for the patch.  I'm happy to see this happening.

Normally, gdb patches should be sent to the gdb-patches list.  It's fine
IMO to just CC one like this, in the middle of a series, to gdb-patches.

I don't know anything about pdb, so I won't comment much on that.

Mark> --- a/gdb/configure.tgt
Mark> +++ b/gdb/configure.tgt
Mark> @@ -335,7 +335,7 @@ i[34567]86-*-cygwin*)
Mark>  	;;
Mark>  i[34567]86-*-mingw32*)
Mark>  	# Target: Intel 386 running win32
Mark> -	gdb_target_obs="i386-windows-tdep.o windows-tdep.o"
Mark> +	gdb_target_obs="i386-windows-tdep.o windows-tdep.o pdbread.o"

I don't think this should be handled here.

Object and debuginfo readers can either be compiled in unconditionally,
or by seeing whether any necessary support is present in BFD.  For the
latter, see how elfread.o is handled in gdb/configure.ac.

Mark> +/* Return the size in bytes of a PDB builtin type.  */
Mark> +static ULONGEST
Mark> +pdb_builtin_type_length (uint32_t t)
Mark> +{

Probably this can be removed in favor.  It's only used here:

+  t->set_target_type (pdb_builtin_type (builtin_type (objfile),
+					bfd_getl32 (&en.underlying_type)));
+  t->set_length (pdb_builtin_type_length (bfd_getl32 (&en.underlying_type)));

but this will do something weird if the builtin type and
pdb_builtin_type_length disagree about the size.  Instead you can do
something like:

   struct type *targ_type = pdb_builtin_type (...);
   t->set_target_type (targ_type);
   t->set_length (targ_type->length ());

Mark> +/* Some integers, such as enum values, get stored as an extended value if
Mark> +   they're too large to fit into two bytes.  The two bytes that would normally
Mark> +   be the value are instead a type indicator, and the actual value follows.  */
Mark> +static bool
Mark> +pdb_read_extended_value (uint16_t kind, char **ptr, uint16_t *length,
Mark> +			 LONGEST *ret)

gdb normally puts 'const' where it makes sense, so for instance 'const char **ptr'.
This requires a change in the caller but I was going to request that anyway.

Mark> +  buf = (char *) xmalloc (length);
Mark> +
Mark> +  if (bfd_bread (buf, length, tpi) != length)
Mark> +    goto end;

gdb generally uses RAII now for cleanups.  'buf' could either be a
unique_xmalloc_ptr<char>, or a gdb::char_vector (or even byte_vector
depending on if you want to use gdb_byte).  This will ensure proper
cleanup even in the face of future refactorings, and instead of 'goto end',
the code can just 'return'.

Mark> +	  /* Round to 4-byte boundary.  */
Mark> +	  if ((ptr - buf + 2) % 4)
Mark> +	    {
Mark> +	      if (length < 4)
Mark> +		goto end;
Mark> +
Mark> +	      length -= 4 - ((ptr - buf + 2) % 4);
Mark> +	      ptr += 4 - ((ptr - buf + 2) % 4);

I thought gdb had some macro for this but I can't find it now.

Mark> +	case LF_INDEX: {
Mark> +	  struct lf_index *ind;
Mark> +	  uint32_t fl_type;
Mark> +
Mark> +	  if (length < sizeof (*ind))
Mark> +	    goto end;
Mark> +
Mark> +	  ind = (struct lf_index *) ptr;

Not sure if this is really guaranteed to have correct alignment; if not
it could just be memcpy'd out.

Mark> +  name_len = length - sizeof (en);
Mark> +  name = (char * ) xmalloc (name_len + 1);

Either unique_xmalloc_ptr<char> or std::string here.

Mark> +  t = type_allocator (objfile).new_type ();
Mark> +  t->set_code (TYPE_CODE_ENUM);
Mark> +  t->set_name (name);

This will leak 'name'.

In gdb, types are allocated on an obstack; in this case, the objfile
obstack.  When the objfile is destroyed, the obstack is also freed --
but this does not run any destructors or free any data associated with
objects on the obstack.

Instead, if a string must be preserved, it should be copied to the
objfile obstack.  If the string is likely to be duplicated in the
objfile (if it might be needed multiple times), then objfile::intern can
be used.

Mark> +  num_types = last_type - first_type + 1;
Mark> +  types = (struct pdb_type *) xmalloc (sizeof (*types) * num_types);

Use std::vector here.

Mark> +
Mark> +  for (unsigned int i = 0; i < num_types; i++)

You can use C++ "foreach".

Mark> +	  bfd_close (tpi);

gdb has a complicated BFD reference-counting setup, but that might be
overkill for this use.  However, it's not difficult to make a deleter
type that wraps bfd_close, so that explicit closes on error paths aren't
needed.  This would be more idiomatic in gdb.  If you look for typedefs
of unique_ptr, you'll find some of these.

Mark> +static void
Mark> +pdb_symfile_read (struct objfile *objfile, symfile_add_flags symfile_flags)
Mark> +{
Mark> +  buildsym_compunit builder(objfile, "", nullptr, language_c, 0);

Assuming this is the first of several PDB-reading patches, it may be
worth considering the overall approach.

Other debug info readers in gdb work in two passes.  First they make
some kind of "quick" index.  Then, when a symbol lookup finds a match,
the full debug info for a given compilation unit is read in.

The code here only does the second step.  I don't know PDB, so maybe
that's the only way it can be done?

In gdb, the symbols are normally associated with a given compilation
unit.  This isn't done in this patch... I'm wondering if that's possible
to do?

Mark> +if {![istarget i*86-*-mingw*]
Mark> +  && ![istarget x86_64-*-mingw*]} {
Mark> +    return 0

Probably can use 'require {is_any_target ...}'

Mark> +gdb_test "ptype enum enum1" "type = enum enum1 \{red, green, blue = -1, yellow = 32768, purple = 4294967296\}.*" "ptype enum enum1"

Normally we'd split the lines so they fit in 80 columns.

Tom