This is the mail archive of the dwarf2@corp.sgi.com mailing list for the dwarf2 project.



Line Number Table Issue


Hello,

I'm new to the group so bear with me if I break protocol.  My name is Brian
Nettleton and I work for Wind River Systems, Hardware Software Integration
Division (used to be EST Corp).  I'm responsible for the symbol reading
portion of the visionCLICK debugger.


Line Number Table Is_Stmt Issue:
-------------------------------

Recently I came across a situation where a compiler vendor was generating
something I thought unusual for DWARF 2 Line Number Information.  This
particular compiler had cleared the is_stmt boolean to false for every entry
in the line number table.  The visionCLICK symbol reader was ignoring any
entries that were not true and so threw away all the entries from this
compiler.  After reading the DWARF 2.0.0 spec for the is_stmt boolean, it
seemed to me that the meaning of this boolean is not very clear, and I'm
hoping this group can help shed some light on it in the upcoming revision.

Here's what the DWARF 2.0.0 spec says about the Is_Stmt boolean.

  6.2 Line Number Information

  ...
  If space were not a consideration, the
  information provided in the .debug_line
  section could be represented as a large
  matrix, with one row for each
  instruction in the emitted object code.
  The matrix would have columns for:
  ...
  - whether this instruction is the
    beginning of a source statement
  ...

  6.2.2 State Machine Registers
  ...
  is_stmt    A boolean indicating that the
             current instruction is the
             beginning of a statement.
  ...
  At the beginning of each sequence within
  a statement program, the state of the
  registers is:
  ...
  is_stmt    determined by default_is_stmt
             in the statement program
             prologue
  ...

  6.2.4 The Statement Program Prologue
  ...
   5. default_is_stmt (ubyte)
      The initial value of the is_stmt
      register.

      A simple code generator that emits
      machine instructions in the order
      implied by the source program would
      set this to "true," and every entry
      in the matrix would represent a
      statement boundary.  A pipeline
      scheduling code generator would set
      this to "false" and emit a specific
      statement program opcode for each
      instruction that represented a
      statement boundary.

  6.2.5.2 Standard Opcodes
  ...
   6. DW_LNS_negate_stmt
      Takes no arguments.  Set the is_stmt
      register of the state machine to the
      logical negation of its current
      value.


This seems straightforward enough, except for the part in section 6.2.4
about a pipeline scheduling code generator.  This is where the problem gets
interesting (without this case the boolean would be unnecessary anyway).
A pipeline optimizing compiler writer could potentially argue that no
instruction is a statement boundary!
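
To make the failure concrete, here is a minimal sketch of the is_stmt
register as the quoted sections describe it.  The opcode model is a
simplification made up for illustration: real statement programs are byte
streams with LEB128 operands and special opcodes, none of which are modeled
here, and the names run_program, statement_rows and PROGRAM are hypothetical.

```python
from dataclasses import dataclass

# Simplified stand-ins for the standard opcodes named in section 6.2.5.2.
COPY = "DW_LNS_copy"
ADVANCE_PC = "DW_LNS_advance_pc"
ADVANCE_LINE = "DW_LNS_advance_line"
NEGATE_STMT = "DW_LNS_negate_stmt"


@dataclass(frozen=True)
class Row:
    address: int
    line: int
    is_stmt: bool


def run_program(ops, default_is_stmt):
    """Run a simplified statement program and return the rows it emits."""
    # Initial register state: is_stmt comes from default_is_stmt.
    address, line, is_stmt = 0, 1, bool(default_is_stmt)
    rows = []
    for op, arg in ops:
        if op == COPY:
            rows.append(Row(address, line, is_stmt))
        elif op == ADVANCE_PC:
            address += arg
        elif op == ADVANCE_LINE:
            line += arg
        elif op == NEGATE_STMT:
            is_stmt = not is_stmt
        else:
            raise ValueError(f"unknown opcode: {op}")
    return rows


def statement_rows(rows):
    """Mimic a reader that keeps only rows whose is_stmt is true."""
    return [row for row in rows if row.is_stmt]


# A program that never emits DW_LNS_negate_stmt.
PROGRAM = [
    (COPY, None),
    (ADVANCE_PC, 4), (ADVANCE_LINE, 1), (COPY, None),
    (ADVANCE_PC, 8), (ADVANCE_LINE, 1), (COPY, None),
]
```

With default_is_stmt set to false and no DW_LNS_negate_stmt in the program,
statement_rows() returns an empty list, which is the situation the
visionCLICK reader ran into.  With default_is_stmt set to true, the same
program keeps all three rows.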

While I understand the meaning of the theoretical boolean in section 6.2, it
seems less clear in the context of the state machine and the actual is_stmt
boolean.  What would one expect a debugger to do with entries where the
is_stmt boolean is false?  This note has more discussion, ad nauseam, of the
issue after a proposal for change.


Proposal for DWARF 2.1 change:
------------------------------

This change modifies the is_stmt register of the state machine to have an
initialized value of "true", and clarifies the responsibility of a pipeline
scheduling code generator to identify some instruction as the "beginning" of
a source line.

Textual changes to the specification:


6.2 Line Number Information
...
Such a matrix, however, would be impractically large.  We shrink it with two
techniques.  First, we delete from the matrix each row whose file, line and
source column information is identical with that of its predecessors.  [new
text] Any deleted rows would never be the beginning of a source statement.
[end new text]
...

6.2.2 State Machine Registers
...
is_stmt   A boolean indicating that the current
          instruction is the beginning of a
          statement.

          [new text] Every distinct line number
          should always have one and only one
          instruction for which this boolean is
          true.  The exception is inlining or
          template expansion, where a line number
          is semantically repeated in a source
          file; in that case each expansion of
          the line number should always have one
          and only one instruction for which this
          boolean is true.

          A simple code generator that emits
          machine instructions in the order
          implied by the source program would
          never modify this register and every
          entry in the matrix would represent
          a statement boundary.  A pipeline
          scheduling code generator might mark
          some instructions as false when
          instructions from several source
          statements are intermixed.[end new text]
...
At the beginning of each sequence within a statement program, the state of
the registers is:
...
is_stmt   [modified text] "true" [end modified text]
basic_block ...

6.2.4 The Statement Program Prologue
...
5. [modified text] unused (ubyte)
   This byte is currently unused. [end modified text]

6. line_base (sbyte)
...
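
The row-deletion rule in the amended 6.2 text above can be sketched as
follows.  This is only one reading of the proposed wording: the Row fields
and the shrink() helper are hypothetical names, and treating a deleted row
with is_stmt set as an error is this sketch's interpretation of "any deleted
rows would never be the beginning of a source statement".

```python
from typing import NamedTuple


class Row(NamedTuple):
    address: int
    file: int
    line: int
    column: int
    is_stmt: bool


def shrink(matrix):
    """Drop each row whose file, line and column match its predecessor."""
    kept = []
    for row in matrix:
        if kept and (row.file, row.line, row.column) == (
            kept[-1].file, kept[-1].line, kept[-1].column
        ):
            # Under the proposed wording a deleted row is never a
            # statement beginning, so flag any input that violates it.
            if row.is_stmt:
                raise ValueError(f"deleted row at {row.address:#x} has is_stmt set")
            continue
        kept.append(row)
    return kept


# One row per instruction, as in the "large matrix" of section 6.2.
FULL_MATRIX = [
    Row(0x00, 1, 10, 0, True),
    Row(0x04, 1, 10, 0, False),
    Row(0x08, 1, 11, 0, True),
    Row(0x0C, 1, 11, 0, False),
    Row(0x10, 1, 12, 0, True),
]
```

Calling shrink(FULL_MATRIX) keeps only the rows at 0x00, 0x08 and 0x10, each
of which is the first instruction of its line.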


Further Discussion:
-------------------

The current spec allows for, and in fact says, that a pipeline scheduling
code generator should default the is_stmt boolean to "false".  This is wrong
in that the first instruction of any sequence would seem by definition to be
the beginning of a source line!  A compiler is allowed to generate
instructions which aren't associated with any line number, in which case the
line number is identified as 0.  A debugger would largely ignore these
instructions anyway (especially the is_stmt boolean for these).  So even if
an optimizing compiler generated instructions which aren't associated with a
line number, eventually the first instruction generated for an actual
source line would still seem to be the first instruction for that source
line.

So what might a debugger do with entries in the table where is_stmt is
false?  Debuggers use the line number tables for basically four things:

1 - To set a breakpoint at the beginning of a source line.

2 - When stepping at the source level to identify when a new source line has
been encountered.

3 - When displaying interspersed disassembled machine code with source code
the line number tables are used to identify where to insert source code into
the disassembly listing.

4 - When a hardware exception occurs, or when displaying a stack traceback,
the tables are used to identify the particular source line associated with
an instruction address.

Number 4 is probably the main situation where instructions with both "true"
and "false" is_stmt values are useful.  Certainly for number 1 only the
"true" is_stmt instructions are interesting.  It isn't clear whether the
"false" is_stmt instructions would or should be used for items 2 and 3.
Using them might be more technically accurate, but it would also add
significantly to the "noise" when debugging; stepping back and forth over
several lines is distracting.
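
As a rough illustration of uses 1 and 4, the sketch below filters on is_stmt
when choosing breakpoint addresses but consults every row when mapping an
address back to a line.  The Row type, the sample table and both function
names are assumptions made for this example, and the lookup ignores
end-of-sequence handling that a real reader would need.

```python
from bisect import bisect_right
from typing import NamedTuple


class Row(NamedTuple):
    address: int
    line: int
    is_stmt: bool


def breakpoint_addresses(rows, line):
    """Use 1: only rows marked as statement beginnings are candidates."""
    return [row.address for row in rows if row.line == line and row.is_stmt]


def line_for_address(rows, address):
    """Use 4: every row counts, whatever its is_stmt value.

    Assumes rows are sorted by ascending address.
    """
    addresses = [row.address for row in rows]
    index = bisect_right(addresses, address) - 1
    return rows[index].line if index >= 0 else None


# Hypothetical table from a scheduling compiler that interleaves
# instructions from lines 10 and 11.
ROWS = [
    Row(0x00, 10, True),
    Row(0x04, 11, True),
    Row(0x08, 10, False),  # later instruction belonging to line 10
    Row(0x0C, 11, False),  # later instruction belonging to line 11
    Row(0x10, 12, True),
]
```

Here breakpoint_addresses(ROWS, 10) yields only address 0x00, while
line_for_address(ROWS, 0x0A) still reports line 10 because the "false" row
at 0x08 is consulted.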

One might ask "Do we need an is_stmt boolean anyway?  Can't a debugger
simply identify the first instruction associated with a line number and use
this for setting breakpoints and then deal with the other situations as
needed?"  The answer is that yes, we do need the is_stmt boolean to handle
situations where a source line is expanded multiple times in a file.  For
example, an inline subroutine which was called twice would have its source
lines "begin" twice in the instruction sequence.  It's not clear that this
is why the DWARF 2 spec originally included this boolean, but this probably
does justify its existence.
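
The inline case can be sketched with a small made-up table in which line 5
is the body of an inline subroutine expanded at two call sites (lines 20 and
21).  The table contents and the two helper names are hypothetical; the
point is only to contrast "first instruction for the line" with the is_stmt
marking.

```python
from typing import NamedTuple


class Row(NamedTuple):
    address: int
    line: int
    is_stmt: bool


def first_address_only(rows, line):
    """The alternative asked about: lowest address carrying this line."""
    matches = [row.address for row in rows if row.line == line]
    return [min(matches)] if matches else []


def statement_starts(rows, line):
    """Every address the compiler marked as a beginning of this line."""
    return [row.address for row in rows if row.line == line and row.is_stmt]


ROWS = [
    Row(0x00, 20, True),   # first call site
    Row(0x04, 5, True),    # first expansion of inlined line 5
    Row(0x08, 21, True),   # second call site
    Row(0x0C, 5, True),    # second expansion of inlined line 5
    Row(0x10, 21, False),  # scheduled instruction from line 21
    Row(0x14, 5, False),   # rest of the second expansion, not a new start
]
```

In this table first_address_only(ROWS, 5) finds just 0x04, so a breakpoint
on line 5 would miss the second expansion, whereas statement_starts(ROWS, 5)
returns both 0x04 and 0x0C.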


-Brian Nettleton

