This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: RFC: C/C++ preprocessor macro support for GDB



Neil Booth <neil@daikokuya.demon.co.uk> writes:
> What are the issues with using libcpp?  It would be a good test of its
> viability as an independent library to have it used somewhere else.

I think there are two issues.  Both might simply be my
misunderstanding of the libcpp header files and code I read; I'd love
to be set straight.

- GDB has commands like this:

        (gdb) break *ADDRESS if CONDITION

  This sets a conditional breakpoint at the address computed by
  evaluating the expression ADDRESS, whose condition is CONDITION.
  ADDRESS needs to be evaluated in the current scope --- the currently
  selected frame and its PC --- but CONDITION needs to be evaluated in
  the scope in force at the *breakpoint's* address.  So you can't just
  take the whole command and smoosh it through an expander all at
  once: ADDRESS and CONDITION might have totally different contexts,
  as far as the preprocessor is concerned.

  This means you've got to decide if there's an `if' in the command
  before you can macro-expand things.  Obviously, an `if' in a string,
  or as part of a larger identifier, doesn't count --- you really need
  to work in terms of tokens.

  (There's a similar situation involving commas: sometimes the parser
  is supposed to stop when it finds its first comma outside of any
  parens.)

  So my macro expander has the following function in its public
  interface:

    /* If the null-terminated string pointed to by *LEXPTR begins with a
       macro invocation, return the result of expanding that invocation as
       a null-terminated string, and set *LEXPTR to the next character
       after the invocation.  The result is completely expanded; it
       contains no further macro invocations.

       Otherwise, if *LEXPTR does not start with a macro invocation,
       return zero, and leave *LEXPTR unchanged.

       Use LOOKUP_FUNC and LOOKUP_BATON to find macro definitions.

       If this function returns a string, the caller is responsible for
       freeing it, using xfree.

       We need this expand-one-token-at-a-time interface in order to
       accommodate GDB's C expression parser, which may not consume the
       entire string.  When the user enters a command like

          (gdb) break *func+20 if x == 5

       the parser is expected to consume `func+20', and then stop when it
       sees the "if".  But of course, "if" appearing in a character string
       or as part of a larger identifier doesn't count.  So you pretty
       much have to do tokenization to find the end of the string that
       needs to be macro-expanded.  Our C/C++ tokenizer isn't really
       designed to be called by anything but the yacc parser engine.  */
    char *macro_expand_next (char **lexptr,
                             macro_lookup_ftype *lookup_func,
                             void *lookup_baton);

  I changed GDB's lexer to call macro_expand_next before carving out
  each token.  This means we don't have to worry about commas or `if's
  inside macro invocations being mistaken for the terminating comma or
  `if': the expander consumes them before the lexer ever sees them.

  As far as I can tell, libcpp doesn't provide an analogous
  token-by-token entry point.

  Another way to deal with this would be to lex the command string
  twice: once to find the `if' or comma, and then again to do the real
  parsing, after macro-expanding each of the various expressions
  properly.  The only difficulty here is that GDB's lexer expects to
  be called by a yacc-style parsing engine; it deposits tokens'
  semantic values in yylval, etc.  To work around this, we'd need to
  make the lexer independent of yacc --- give it some other way to
  return semantic values, mostly --- and hook that into both yacc and
  the code looking for `if's and commas.  But that approach wouldn't
  require any change to libcpp's interface.

  There's nothing too hard there.  But I wanted to put together a
  patch which actually worked, while disturbing the existing GDB code
  as little as possible.  And I think there's something unsatisfying
  about the two-pass approach; parsers ought to be able to leave input
  unconsumed if they want.  It's a common enough idiom.  Shouldn't
  libcpp support it?
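
To make the tokenization point concrete, here is a minimal sketch of
the first pass of the two-pass approach: finding an `if' that really is
a token of its own.  The name find_toplevel_if is hypothetical; it is
not GDB or libcpp code, and it assumes that skipping quoted literals,
identifiers, numbers, and parenthesized groups is enough, which a real
lexer would have to do more carefully.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GDB or libcpp: return a pointer to
   the first `if' in S that is a token of its own, outside any string
   or character literal and outside any parentheses, or NULL if there
   is none.  */
static const char *
find_toplevel_if (const char *s)
{
  int depth = 0;                /* current parenthesis nesting */

  while (*s)
    {
      if (*s == '"' || *s == '\'')
        {
          /* Skip a quoted literal, honoring backslash escapes.  */
          char quote = *s++;
          while (*s && *s != quote)
            {
              if (*s == '\\' && s[1])
                s++;
              s++;
            }
          if (*s)
            s++;
        }
      else if (isalpha ((unsigned char) *s) || *s == '_')
        {
          /* Scan a whole identifier, so `notif' or `if_count' is
             never mistaken for the keyword.  */
          const char *start = s;
          while (isalnum ((unsigned char) *s) || *s == '_')
            s++;
          if (depth == 0 && s - start == 2 && strncmp (start, "if", 2) == 0)
            return start;
        }
      else if (isdigit ((unsigned char) *s))
        {
          /* Skip a number along with any letters glued onto it.  */
          while (isalnum ((unsigned char) *s) || *s == '_' || *s == '.')
            s++;
        }
      else
        {
          if (*s == '(')
            depth++;
          else if (*s == ')' && depth > 0)
            depth--;
          s++;
        }
    }
  return NULL;
}
```

Given "*func+20 if x == 5", this returns a pointer to the `if' after
`func+20'; given "*notif + strlen(\"if\")", it returns NULL.  The text
before the returned pointer and the text after the `if' could then be
macro-expanded separately, each in its own scope.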

- GDB's macro data structures record all the macros that were ever
  #defined in a compilation unit, and the line numbers at which they
  were in force.  Given a name, an #inclusion, and a line number (or
  in libcpp's terminology, a logical line number?), they can find the
  #definition in scope at that point.

  This is a bit different from libcpp's data structures, which only
  record the macros currently in force as libcpp makes a pass through
  the file's text.  (At least, that's the impression I got.)

  My macro expander is completely ignorant of the lookup table's
  structure; you pass it a function and a data pointer that it uses
  blindly for lookups.  Here's the relevant typedef, and one of the
  prototypes, from the expander's public interface:

    /* A function for looking up preprocessor macro definitions.  Return
       the preprocessor definition of NAME in scope according to BATON, or
       zero if NAME is not defined as a preprocessor macro.

       The caller must not free or modify the definition returned.  It is
       probably unwise for the caller to hold pointers to it for very
       long; it probably lives in some objfile's obstacks.  */
    typedef struct macro_definition *(macro_lookup_ftype) (const char *name,
                                                           void *baton);

    /* Expand any preprocessor macros in SOURCE, and return the expanded
       text.  Use LOOKUP_FUNC and LOOKUP_FUNC_BATON to find identifiers'
       preprocessor definitions.  SOURCE is a null-terminated string.  The
       result is a null-terminated string, allocated using xmalloc; it is
       the caller's responsibility to free it.  */
    char *macro_expand (const char *source,
                        macro_lookup_ftype *lookup_func,
                        void *lookup_func_baton);

  When expanding an expression, GDB packages up the #inclusion and
  line number in the baton argument, and provides a lookup_func that
  takes those together with the macro name to search the macro table.
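
As an illustration of the baton arrangement, here is a minimal sketch
of a lookup function that matches the macro_lookup_ftype typedef
above.  Everything except that typedef is an assumption made for the
example: the layout of struct macro_definition, the macro_entry table,
the macro_scope baton, and lookup_at_line are all hypothetical.  The
sketch also handles a single source file only, so the #inclusion is
left out of the baton.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the real struct macro_definition is not shown
   in this message.  */
struct macro_definition
{
  const char *replacement;
};

/* The typedef from the expander's public interface.  */
typedef struct macro_definition *(macro_lookup_ftype) (const char *name,
                                                       void *baton);

/* Hypothetical table entry: NAME has DEFINITION from START_LINE up
   to, but not including, END_LINE.  An END_LINE of zero means the
   definition stays in force to the end of the file.  */
struct macro_entry
{
  const char *name;
  int start_line;
  int end_line;
  struct macro_definition definition;
};

/* Hypothetical baton: the table to search, and the line whose scope
   we want.  */
struct macro_scope
{
  struct macro_entry *table;
  size_t count;
  int line;
};

/* Return the definition of NAME in force at the line recorded in
   BATON, or NULL if NAME is not defined there.  */
static struct macro_definition *
lookup_at_line (const char *name, void *baton)
{
  struct macro_scope *scope = baton;
  size_t i;

  for (i = 0; i < scope->count; i++)
    {
      struct macro_entry *e = &scope->table[i];

      if (strcmp (e->name, name) == 0
          && e->start_line <= scope->line
          && (e->end_line == 0 || scope->line < e->end_line))
        return &e->definition;
    }
  return NULL;
}
```

A caller would fill in a macro_scope for the frame or breakpoint
location of interest and pass lookup_at_line and a pointer to that
scope as the lookup_func and baton arguments.  The same name can then
resolve to different definitions, or to none, depending on the line.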

