Trying to pin down the rationale for a change in the behaviour of 'print symbols' option

Andrew Dinn adinn@redhat.com
Wed Oct 19 10:31:47 GMT 2022


I have been using gdb 12.1-1 on fedora to debug Java application images 
generated by GraalVM Native. I have encountered a problem with printing 
of object field addresses which I think is due to a recent change to the 
print code. I'd like to understand why the change was made and question 
whether it is appropriate, at least in its current form.

Below I

  1) explain the problem
  2) explain why I think the current behaviour is inappropriate
  3) recommend a possible remedy

I'd be grateful if whoever is responsible for the relevant code could 
assess the problem and provide some feedback.

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


1) PROBLEM

Java binary images generated by GraalVM can optionally include DWARF 
info that allows gdb to debug them. Debug support is fairly well advanced 
and includes the option for Java objects to be printed field by field as 
shown by the following session log which records execution of a simple 
Hello World app:

----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
GNU gdb (GDB) Fedora 12.1-1.fc35
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
. . .
(gdb) info func ::println
All functions matching regular expression "::println":

File java/io/PrintStream.java:
1047:	void java.io.PrintStream::println(java.lang.Object*);

File java/io/PrintWriter.java:
835:	void java.io.PrintWriter::println(java.lang.Object*);
. . .
(gdb) b java.io.PrintStream::println
Breakpoint 1 at 0x40601e: java.io.PrintStream::println. (43 locations)
(gdb) run Andrew
Starting program: 
/home/adinn/redhat/openjdk/graal/graal/substratevm/hello Andrew
. . .
Thread 1 "hello" hit Breakpoint 1, 
java.io.PrintStream::println(java.lang.String*) (this=0xd11678, 
x=0x7ffff7a01d88) at java/io/PrintStream.java:1027
Missing separate debuginfos, use: dnf debuginfo-install 
glibc-2.34-42.fc35.x86_64 zlib-1.2.11-32.fc35.x86_64
(gdb) p *this
$1 = {
    <java.io.FilterOutputStream> = {
      <java.io.OutputStream> = {
        <java.lang.Object> = {
          <_objhdr> = {
            hub = 0xb7e350,
            idHash = 1158657982
          }, <No data fields>}, <No data fields>},
      members of java.io.FilterOutputStream:
      closed = false,
      out = 0xf37278,
      closeLock = 0xf37260
    },
    members of java.io.PrintStream:
    textOut = 0xf37228,
    charOut = 0xf37110,
    autoFlush = true,
    closing = false
}
. . .
----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---


So far so good. The address fields are printed as simple hex values even 
though 'print symbol' is on:


----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
(gdb) show print symbol
Printing of symbols when printing pointers is on.
. . .
----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---


However, a colleague of mine who has been using gdb 12.1-6 on a later 
fedora has noticed different print behaviour:


----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
GNU gdb (GDB) Fedora Linux 12.1-6.fc37
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
. . .
(gdb) p *this
$1 = {
    <java.io.FilterOutputStream> = {
      <java.io.OutputStream> = {
        <java.lang.Object> = {
          <_objhdr> = {
            hub = 0x4990c8 
<com.oracle.svm.core.jni.functions.JNIFunctions::RegisterNatives(com.oracle.svm.core.jni.headers.JNIEnvironment*, 
com.oracle.svm.core.jni.headers.JNIObjectHandle*, 
com.oracle.svm.core.jni.headers.JNINativeMethod*, int)+328>,
            idHash = 0
          }, <No data fields>}, <No data fields>},
      members of java.io.FilterOutputStream:
      closed = false,
      out = 0x6b81e0 
<java.math.MutableBigInteger::euclidModInverse(int)+384>,
      closeLock = 0xffffffffffe0a7c8
    },
    members of java.io.PrintStream:
    lock = 0xffffffffffe0ec78,
    charset = 0x736220 
<java.text.DecimalFormat::applyPattern(java.lang.String*, bool)+4624>,
    textOut = 0xffffffffffe0abb0,
    charOut = 0xffffffffffe0a7e0,
    autoFlush = true,
    closing = false
}
----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---



Note that my colleague was running a different app when he reported the 
problem -- which explains why we are seeing a slightly different data 
layout, including the 'lock' and 'charset' fields that were optimized 
away in the first program -- but that is not the cause of the disparity. 
We can observe the same disparity in print behaviour when running the 
Hello World app on gdb 12.1-6. Also note that in both cases 'print 
symbols' is set to ON. So, the disparity is this:

   Since 12.1-6 (or maybe earlier) field addresses are now being printed 
with attached <symbol + offset> annotations.

So what, you might ask? Unfortunately, these annotations do not appear to 
match the data being printed.

2) APPROPRIATENESS

These annotations are not at all appropriate given what the program is 
doing and what the DWARF info records. The symbols being displayed in 
these annotations are local function symbols for compiled methods. 
However, the data fields they label are *not* function pointers.

The addresses appearing as values in these structures identify final, 
constant Java objects located in the code section. For example the 'hub' 
field which holds address 0x4990c8 is of type java.lang.Class i.e. it 
identifies the Java class of this java.io.PrintStream instance. Indeed 
the DWARF info for the _objhdr struct inherited by java.lang.Object and 
all subtypes, including PrintStream, makes this very clear:

----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
(gdb) ptype _objhdr
type = struct _objhdr {
     java.lang.Class *hub;
     int idHash;
}
(gdb) ptype 'java.lang.Object'
type = class java.lang.Object : public _objhdr {
   public:
     void Object(void);
     boolean equals(java.lang.Object *);
   private:
     java.lang.Class * getClass(void);
. . .
----- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---


The Class instance at address 0x4990c8 has no symbol because it is 
neither a program variable nor a program function. It is just a piece of 
(meta)data. Obviously, the nearest symbol below this class object turned 
out to be a local function symbol for method 
JNIFunctions::RegisterNatives and that completely unrelated symbol is 
now being used to print a 'helpful' annotation for the address.
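
To make the effect concrete, here is a minimal Python sketch of a 
"nearest symbol at or below the address" lookup. It is only a model of 
the behaviour described above, not gdb's actual code. The function start 
addresses are derived from the session log (printed address minus 
printed offset), the method names are shortened, and the helper name 
annotate is purely illustrative.

```python
import bisect

# Function start addresses, computed from the annotations in the log above
# as (printed address - printed offset). Names are shortened for readability.
SYMBOLS = sorted([
    (0x498f80, "JNIFunctions::RegisterNatives"),
    (0x6b8060, "MutableBigInteger::euclidModInverse"),
    (0x735010, "DecimalFormat::applyPattern"),
])
STARTS = [start for start, _ in SYMBOLS]


def annotate(addr):
    """Return '<symbol+offset>' for the nearest symbol at or below addr.

    Returns None when no symbol precedes the address. Note that the lookup
    never consults the type of the value being printed, which is exactly
    why a java.lang.Class pointer ends up labelled with a function name.
    """
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    start, name = SYMBOLS[i]
    return f"<{name}+{addr - start}>"


# The 'hub' field value from the 12.1-6 session.
print(hex(0x4990c8), annotate(0x4990c8))
```

The lookup succeeds for any address that follows some symbol, so 
constant data placed after a method's code always picks up that method's 
name.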

This is not just happening for program metadata like java.lang.Class 
objects. Another example we can see above is the object data value in 
field 'charset' whose type is java.nio.charset.Charset. The printed 
address is being annotated with an offset from a function symbol that 
identifies code for method DecimalFormat::applyPattern.

Once again, this is a bare object value which does not have any 
associated program symbol to label it. The default charset gets encoded 
as a final constant object into the text section. So do the various 
component objects that this Charset references. There is no actual 
program reference to the default Charset so there is no symbol to label 
the address at which it is stored.

Note that this lack of any corresponding var symbol with which to label 
an object data value is a common occurrence with Java apps. The text 
section contains a lot of such constant object data. It can be constant 
objects that are referenced indirectly from static fields or it can be 
constant objects (like String instances) that are present for loading 
by compiled code as an inlined final constant value. These constant 
objects legitimately have no associated symbol because they are just 
constant data values.

There are also cases where a Java symbol is generated to identify an 
object value e.g. a non-final Java static field of Object type. However, 
in these cases the address of the object does not lie in the text 
section. Instead it is in an RW 'initial heap' section.

I am not sure whether exactly the same problem can happen with C++ data 
but I believe something similar can occur. For example, assume a program 
declares a final constant array of strings initialised with a 
sequence of string values. If one of those strings is placed into an 
object field then its address will not have an associated symbol. 
However, the current code will annotate it with an offset derived from 
whatever symbol happens to precede it in the text section.

So, to sum up, annotating addresses for what are clearly constant data 
object references of non-function type using an offset from a function 
symbol seems to be a rather bizarre thing to do. It is almost certainly 
never going to be of help to any user of gdb.

3) POSSIBLE REMEDIES

My colleague's session also had 'print symbols' set to ON. Resetting to 
OFF avoided printout of the <symbol + offset> annotations in the data 
displays. However, this is rather a sledgehammer approach as it also 
affects cases where printing symbol + offset annotations is legitimate 
and useful.

One obvious, simple response to this problem would be to provide an 
extra, independent setting 'print field-symbols' with setting ON, OFF 
and AUTO (the latter meaning use the setting for 'print symbols').
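
As a rough sketch of how such a tri-state setting might resolve, the 
following Python models the decision. The setting 'print field-symbols' 
does not exist in gdb; it and the function name below are hypothetical 
and serve only to illustrate the proposal.

```python
def field_symbols_enabled(field_symbols, print_symbol):
    """Resolve the hypothetical 'print field-symbols' setting.

    field_symbols: "on", "off" or "auto" (the proposed new setting).
    print_symbol:  True when the existing symbol-printing setting is on.
    """
    if field_symbols not in ("on", "off", "auto"):
        raise ValueError(f"unknown setting: {field_symbols!r}")
    if field_symbols == "auto":
        # AUTO defers to the existing setting.
        return print_symbol
    return field_symbols == "on"
```

With AUTO as the default, existing behaviour would be unchanged until a 
user explicitly turns field annotations off.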

A more sophisticated response would be to detect the type of the field 
value and the type of the symbol and avoid printing an annotation when 
the field type is not a function type and the symbol is a function, 
or vice versa.
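
The type-based check could be sketched in Python as follows. This is 
only a model of the rule stated above, under the assumption that the 
printer can tell whether the field's declared type is a function pointer 
and whether the candidate symbol names a function. Both parameter names 
are illustrative and are not part of any gdb API.

```python
def should_annotate(field_is_function_pointer, symbol_is_function):
    """Annotate only when the field type and the symbol kind agree.

    A data pointer labelled with a function symbol (or a function pointer
    labelled with a data symbol) is suppressed.
    """
    return field_is_function_pointer == symbol_is_function


# The 'hub' field is a java.lang.Class pointer (data), while the nearest
# symbol is a compiled method (function), so no annotation is printed.
print(should_annotate(False, True))
```

Under this rule the 'hub', 'out' and 'charset' fields in the session 
above would be printed as bare hex addresses, while genuine function 
pointers would keep their annotations.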


