libctf: new enum-related API functions: request for better names

Nick Alcock nick.alcock@oracle.com
Fri May 17 12:08:28 GMT 2024


So Stephen Brennan pointed out many years ago that libctf's handling of
enumeration constants is needlessly unhelpful: it treats them as if they
are scoped within a given enum. You can only map constant name -> value
and back within a given enum's scope, so if you don't already know which
enum a constant belongs to, you have to walk over every enum in the dict
hunting for it.
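
To make the cost of that walk concrete, here is a toy model of the lookup
pattern. This is not libctf code: the toy_* structs, functions and sample
data are made up for illustration, and toy_enum_value only mimics the
general shape of ctf_enum_value (status return, value passed back via a
pointer).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for CTF data: not libctf types.  */
struct toy_enumerator { const char *name; int value; };
struct toy_enum
{
  const char *tag;
  const struct toy_enumerator *vals;
  size_t nvals;
};

static const struct toy_enumerator color_vals[] = { {"RED", 0}, {"GREEN", 1} };
static const struct toy_enumerator state_vals[] = { {"IDLE", 0}, {"BUSY", 1} };
static const struct toy_enum toy_dict[] =
{
  { "color", color_vals, 2 },
  { "state", state_vals, 2 },
};

/* Per-enum lookup: return 0 and set *VALP if NAME is in E, else -1.  */
static int
toy_enum_value (const struct toy_enum *e, const char *name, int *valp)
{
  for (size_t i = 0; i < e->nvals; i++)
    if (strcmp (e->vals[i].name, name) == 0)
      {
        *valp = e->vals[i].value;
        return 0;
      }
  return -1;
}

/* Dict-wide lookup layered on top: try every enum in turn.  Returns the
   index of the first enum containing NAME, or -1 if none does.  */
static int
toy_find_enumerator (const struct toy_enum *enums, size_t nenums,
                     const char *name, int *valp)
{
  assert (name != NULL && valp != NULL);
  for (size_t i = 0; i < nenums; i++)
    if (toy_enum_value (&enums[i], name, valp) == 0)
      return (int) i;
  return -1;
}
```

Every caller that starts from a bare constant name ends up writing the
outer loop itself, which is the part a dict-wide API would absorb.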

Worse yet, we do not consider enum constants with clashing values to be
a sign of a type conflict, so we can easily end up with multiple distinct
enums containing enumeration constants with the *same name* appearing
in the shared dict. This definitely violates the principle of least
surprise and the (largely unstated) assumption that the shared dict
should be "as if" the entire C program's non-conflicting types were
declared in a single giant file which was compiled with -gctf: you can't
write a C file that declares the same enumeration constant twice!

Half of this is easy to fix: libctf, and in particular the deduplicator,
should track enumeration constant names just like it does all other
identifiers, and push enums with clashing names into child dicts. (This
might eat a lot of space when the enums have many other enumerators, but
most of that space is identical strings, which means we can win nearly
all the space back in v4 via the string-saving trick that is the second
entry in <https://sourceware.org/binutils/wiki/CTF/Todo/Compactness>.)

But I'm having trouble figuring out names for the new API functions
we'll need for the rest.  Right now libctf has these:

/* Convert the specified value to the corresponding enum tag name, if a
   matching name can be found.  Otherwise NULL is returned.  */

const char *ctf_enum_name (ctf_dict_t *fp, ctf_id_t type, int value);

/* Convert the specified enum tag name to the corresponding value, if a
   matching name can be found.  Otherwise CTF_ERR is returned.  */

int ctf_enum_value (ctf_dict_t *fp, ctf_id_t type, const char *name,
		    int *valp);

/* Iterate over the members of an ENUM.  We pass the string name and
   associated integer value of each enum element to the specified callback
   function.  */

int ctf_enum_iter (ctf_dict_t *fp, ctf_id_t type, ctf_enum_f *func, void *arg);

/* Iterate over the members of an enum TYPE, returning each enumerand's NAME or
   NULL at end of iteration or error, and optionally passing back the
   enumerand's integer VALue.  */

const char *ctf_enum_next (ctf_dict_t *fp, ctf_id_t type, ctf_next_t **it,
    	                   int *val);

At the very least we want something like dict-wide equivalents of the
first two: but ctf_enum_name has the very annoying behaviour of just
picking the first name if there are multiple conflicting ones with the
same value, and on a dict-wide basis there will be huge numbers of these
(can you imagine how many enumeration constants have the value 1? :) )
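
As a rough illustration of why a first-match lookup is not enough
dict-wide, here is a toy iterator over all names sharing a value. The
toy_* names and the flat sample table are assumptions for the sketch, and
the size_t cursor is only a stand-in for a real ctf_next_t iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of every enumerator in a dict.  */
struct toy_enumerator
{
  const char *enum_tag;
  const char *name;
  int64_t value;
};

static const struct toy_enumerator toy_dict[] =
{
  { "color", "RED", 0 },
  { "color", "GREEN", 1 },
  { "state", "IDLE", 0 },
  { "state", "BUSY", 1 },
  { "answer", "YES", 1 },
};
#define TOY_DICT_LEN (sizeof (toy_dict) / sizeof (toy_dict[0]))

/* Return the next name whose value is VALUE, searching from *POS, and
   advance *POS past it.  Returns NULL once there are no more matches.  */
static const char *
toy_name_next (size_t *pos, int64_t value)
{
  assert (pos != NULL);
  while (*pos < TOY_DICT_LEN)
    {
      const struct toy_enumerator *e = &toy_dict[(*pos)++];
      if (e->value == value)
        return e->name;
    }
  return NULL;
}
```

With this table, a first-match lookup for the value 1 would only ever
report GREEN; the iterator also reaches BUSY and YES.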

But we also don't want to fail completely when faced with existing shared
dicts that have many enumeration constants, in different enums, with
the same name. (This will still be possible even in the future,
albeit rarely, because when using custom linkers like the one I'm hoping
to upstream into the Linux kernel, CTF child dicts are not always the
same as C translation units: you can have two enumeration constants FOO
that are in different translation units in the same kernel module, and
the kernel CTF linker will combine them into the same CTF dict. One enum
will be marked hidden/non-root, but both will be there.)

So functions to iterate over all enumeration constants with a given
value are obviously necessary, but we probably want a non-iterator to
just return the specific enum value that a given name expands to: maybe
we want to just pick one if this is ambiguous (as ctf_enum_value already
does). Below I have the maximally general approach, controlled by a
flag, but this is probably total overkill. I suspect
ctf_enumerator_name_next may still be necessary for existing dicts, but
probably not the flag to ctf_enumerator_value -- but I'm not sure.

For now, I've got this (using int64_t for enum values -- I'm not sure
how to migrate the other enum functions that way in future, but we
clearly have to *somehow*).


I'm very unsatisfied with the naming: to me, ctf_enumerator_* does not
read "like ctf_enum_* but dict-wide": but ctf_dict_enum_* feels wrong
too, as if it were dealing with enums *of* dicts. Suggestions?

/* ... _CTF_ERRORS ... */
  _CTF_ITEM (ECTF_ENUM_NAME_CONFLICT, "Multiple enumeration constants exist with this name.")

/* Flags for ctf_enumerator_value.  */

#define CTF_ENUM_UNIQUE 0x1

/* Get the value of a given named enumerator, dict-wide: also optionally return
   the associated enum type.  It is possible, but rare, for dicts to have
   multiple values for some names, or the same values in multiple distinct enum
   types: if the flags include CTF_ENUM_UNIQUE, fail with
   ECTF_ENUM_NAME_CONFLICT in this case.  Otherwise, return the first
   found.

   There is no function that maps the other way because it is downright common
   (and legal C) to have a single value that many enum constants are defined as
   across an entire dict.  Use ctf_enumerator_name_next instead.  */

int64_t ctf_enumerator_value (ctf_dict_t *fp, const char *name, int flags);

/* Iterate over all enumeration constants with a given value in one
   enum.  */

const char *ctf_enum_name_next (ctf_dict_t *fp, ctf_id_t type, ctf_next_t **it,
                                int64_t value);

/* Iterate over all enumerands in a dict.  The enum TYPE and the
   enumerand VALUE may optionally be returned as well.  */
const char *ctf_enumerator_next (ctf_dict_t *fp, ctf_next_t **it,
                                 ctf_id_t *type, int64_t *value);

/* Iterate over the values of all enumeration constants in a dict with a given
   name.  The enum TYPE may optionally be returned as well.  */
int64_t ctf_enumerator_value_next (ctf_dict_t *fp, const char *name,
                                   ctf_next_t **it, ctf_id_t *type);

/* Iterate over the names of all enumeration constants in a dict with a given
   value.  The enum TYPE may optionally be returned as well.  */
const char *ctf_enumerator_name_next (ctf_dict_t *fp, int64_t value,
                                      ctf_next_t **it, ctf_id_t *type);
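
To show what the CTF_ENUM_UNIQUE flag is meant to buy, here is a toy
model of the proposed ctf_enumerator_value semantics. It is only a
sketch: the toy_* names, the status enum and the sample table are
invented, and unlike the prototype above it returns a status code and
passes the value back through a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ENUM_UNIQUE 0x1	/* Mirrors the proposed CTF_ENUM_UNIQUE.  */

/* Hypothetical status codes: TOY_NAME_CONFLICT plays the role of
   ECTF_ENUM_NAME_CONFLICT.  */
enum toy_status { TOY_OK = 0, TOY_NOT_FOUND = -1, TOY_NAME_CONFLICT = -2 };

struct toy_enumerator
{
  const char *enum_tag;
  const char *name;
  int64_t value;
};

/* FOO appears in two distinct enums, as in the kernel-module case.  */
static const struct toy_enumerator toy_dict[] =
{
  { "color", "RED", 0 },
  { "color", "FOO", 1 },
  { "state", "IDLE", 0 },
  { "state", "FOO", 7 },
};
#define TOY_DICT_LEN (sizeof (toy_dict) / sizeof (toy_dict[0]))

/* Look NAME up across the whole table.  Without TOY_ENUM_UNIQUE the first
   match wins; with it, a second match is reported as a conflict.  */
static enum toy_status
toy_enumerator_value (const char *name, int flags, int64_t *valp)
{
  const struct toy_enumerator *found = NULL;

  assert (name != NULL && valp != NULL);
  for (size_t i = 0; i < TOY_DICT_LEN; i++)
    {
      if (strcmp (toy_dict[i].name, name) != 0)
        continue;
      if (found == NULL)
        found = &toy_dict[i];
      else if (flags & TOY_ENUM_UNIQUE)
        return TOY_NAME_CONFLICT;
    }
  if (found == NULL)
    return TOY_NOT_FOUND;
  *valp = found->value;
  return TOY_OK;
}
```

In this model a caller that passes no flags gets 1 for FOO, silently,
while a caller that passes TOY_ENUM_UNIQUE is told the name is ambiguous
and can fall back to iterating over the candidates.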


