Bug 24541

Summary:	Incorrect evaluation of systemtap probes due to register being signed and probe expression assuming unsigned
Product:	gdb	Reporter:	Andrew Burgess <aburgess>
Component:	breakpoints	Assignee:	Not yet assigned to anyone <unassigned>
Status:	RESOLVED FIXED
Severity:	normal	CC:	brobecker, fche, fweimer, jakub, sergiodj
Priority:	P2
Version:	HEAD
Target Milestone:	8.3.1
Host:		Target:
Build:		Last reconfirmed:
Attachments:	Test case.

Description Andrew Burgess 2019-05-08 23:59:12 UTC

Created attachment 11768 [details]
Test case.

Grab the attached tar file, then:

  $ tar -xf probe-bug.tar.xz
  cd probe-bug
  make
  make check

The included test program does throw and catch some exceptions, and the gdb script sets up a possible catch with 'catch throw blahblah'.  However, 'blahblah' doesn't match any of the exception types we throw or catch, so the expectation is that GDB will not stop at any of the throws, and instead the program will run to completion.

What I actually observe is the program stops at the 'throw' and GDB prints this error:

  could not find minimal symbol for typeinfo address 0xffff8804

Unfortunately this bug relies on the specific address that happens to be in a register, if the address doesn't have the "correct" bit set then the bug will not trigger.

What has happened is that GDB stops at the SystemTap probe point for "throw", it then tries to extract the arguments for this probe point.  For the particular problem argument the systemtap expression as understood by gdb is:

	    0  UNOP_CAST             Type @0x55ade9b8e430 (uint32_t)
	    3    OP_REGISTER           Register $di

The value of $di register when we stop is this:

    (gdb) p/x $edi
    $1 = 0x8048804
    (gdb) p/x $di
    $2 = 0x8804

And, critically, the type of $di is:

    (gdb) ptype $di
    type = int16_t

What's happening then is when GDB evaluates the SystemTap expression it is doing this:

    (gdb) p/x (unsigned int) $di
    $3 = 0xffff8804

Which gives us the '0xffff8804' problem value we see in the error.  However what SystemTap is expecting is something more like this:

    (gdb) p/x (unsigned int) ((unsigned short) $di)
    $4 = 0x8804

Or to put it another way, SystemTap is assuming the register is unsigned.

It's not clear to me if the problem is
   (a) SystemTaps expressions are wrong, and should not assume the register is signed / unsigned, 
   (b) GDB's register type is wrong, and we should change the registers to be unsigned, or
   (c) GDB's expression evaluation is wrong, and we should somehow be special casing SystemTap expressions to force registers unsigned in this case...

One final note, this bug is present on older and more recent versions of GDB (current HEAD 9-May-2019) however, in current HEAD a completely different bug will hit you first.  When GDB tries to print the error "could not find minimal symbol for typeinfo address 0xffff8804" a terminal ownership bug will cause GDB to receive SIGSTOP and go into the background.  I've included a GDB patch to work around this issue inside the probe-bug.tar.xz tar file (see probe-bug/sigstop-bug.patch).

Comment 1 Andrew Burgess 2019-05-09 09:56:12 UTC

After some additional investigation I was slightly wrong last night.  The probe does claim that the argument is in the $di register, this is the system tap entry from readelf on /lib/libstdc++.so.6:

  stapsdt              0x00000028       NT_STAPSDT (SystemTap probe descriptors)            Provider: libstdcxx
    Name: throw
    Location: 0x00072c96, Base: 0x00133d64, Semaphore: 0x00000000
    Arguments: 4@%si 4@%di

However, I suspect the issue is that the  probe information might be wrong.  The second argument is described as an unsigned 32-bit value in register $di, however, register $di is only 16-bits, so GDB currently extracts its value and casts it up to uint32_t.

If I look at the code around the trace point however, I see this:


    00072c80 <__cxa_throw@@CXXABI_1.3>:
       72c80:       57                      push   %edi
       72c81:       56                      push   %esi
       72c82:       53                      push   %ebx
       72c83:       8b 74 24 10             mov    0x10(%esp),%esi
       72c87:       e8 74 bf ff ff          call   6ec00 <__cxa_finalize@plt+0x980>
       72c8c:       81 c3 74 e3 10 00       add    $0x10e374,%ebx
       72c92:       8b 7c 24 14             mov    0x14(%esp),%edi
       72c96:       90                      nop                      <----------------- PROBE IS HERE
       72c97:       e8 d4 a2 ff ff          call   6cf70 <__cxa_get_globals@plt>
       72c9c:       83 40 04 01             addl   $0x1,0x4(%eax)
       72ca0:       83 ec 04                sub    $0x4,%esp
       72ca3:       ff 74 24 1c             pushl  0x1c(%esp)
       72ca7:       57                      push   %edi
       72ca8:       56                      push   %esi
       72ca9:       e8 62 a3 ff ff          call   6d010 <__cxa_init_primary_exception@plt>
       72cae:       8d 70 40                lea    0x40(%eax),%esi
       72cb1:       c7 00 01 00 00 00       movl   $0x1,(%eax)
       72cb7:       89 34 24                mov    %esi,(%esp)
       72cba:       e8 61 96 ff ff          call   6c320 <_Unwind_RaiseException@plt>
       72cbf:       89 34 24                mov    %esi,(%esp)
       72cc2:       e8 c9 84 ff ff          call   6b190 <__cxa_begin_catch@plt>
       72cc7:       e8 d4 b3 ff ff          call   6e0a0 <_ZSt9terminatev@plt>
       72ccc:       66 90                   xchg   %ax,%ax
       72cce:       66 90                   xchg   %ax,%ax

Notice that $di is never used, only the 32-bit $edi.  I suspect that the issue is incorrect generation of the probe arguments.  I'm taking a look at GCC to see if I can confirm this, but I'm posting this update in case anyone is interested.

Comment 2 Andrew Burgess 2019-05-09 11:23:25 UTC

So this issue seems to be relevant:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80115

which links to this change in SystemTap:

  https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=272146660f54786bb61d388f6d3a4eb20e7d9369

Which is where SystemTap changed to emit the narrowed register names by adding the 'w' prefix.  To see how GCC interprets the 'w' prefix, see:

  https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/i386.c;h=ab1524c3524846468185d01302d86f1ddbbc268c;hb=refs/heads/master#l12247

The implication seems to be that when SystemTap says the argument is this `4@%di` then we should interpret this to mean `((uint32_t) $edi)`.

Comment 3 Sergio Durigan Junior 2019-05-10 22:33:48 UTC

Thanks a lot for the throrough investigation, Andrew!

I can confirm that replacing "di" by "edi" works, i.e., the inferior finishes (with return code 1).

It's not clear to me whether we should special-case this scenario (doesn't seem like a very elegant solution); I'll look deeper into this and see if I can come up with a solution.  Thanks.

Comment 4 Frank Ch. Eigler 2019-05-12 15:51:06 UTC

We should work out why the sys/sdt.h machinery marks the %si operand as though it was unsigned.  It's not a property of the register, but of the gcc expression type at that point.

Comment 5 Sergio Durigan Junior 2019-05-12 19:59:37 UTC

(In reply to Frank Ch. Eigler from comment #4)
> We should work out why the sys/sdt.h machinery marks the %si operand as
> though it was unsigned.  It's not a property of the register, but of the gcc
> expression type at that point.

Thanks for the reply, Frank.

This is an important question, but the real problem here is the use of %di (or %si) instead of their extended versions (%edi and %esi), as far as I know.  Even if we have "-4@%di" (which is still wrong; should be "-2@%di"), GDB still can't proceed and errors out.

Comment 6 Andrew Burgess 2019-05-12 20:40:01 UTC

(In reply to Sergio Durigan Junior from comment #5)
> (In reply to Frank Ch. Eigler from comment #4)
> > We should work out why the sys/sdt.h machinery marks the %si operand as
> > though it was unsigned.  It's not a property of the register, but of the gcc
> > expression type at that point.
> 
> Thanks for the reply, Frank.
> 
> This is an important question, but the real problem here is the use of %di
> (or %si) instead of their extended versions (%edi and %esi), as far as I
> know.  Even if we have "-4@%di" (which is still wrong; should be "-2@%di"),
> GDB still can't proceed and errors out.

In this case changing to "4@edi" would be what GDB needs, the particular argument is a 32-bit address.  I'm trying to be careful not to blame either SystemTap or GDB here, it's not clear which needs to change.

What isn't clear to me from reading the original GCC issue, is why SystemTap changed to force the use of the 16-bit register name instead of the name that matches the size of the operand.

Comment 7 Frank Ch. Eigler 2019-05-13 13:45:50 UTC

The "%w[id]" syntax is meant to do nothing but permit gcc to offer a more specific (subregister) name for values that fit.  If the operand is 32 bits long, it seems to me that gcc shouldn't be describing it as %di.

Comment 8 Jakub Jelinek 2019-05-13 13:55:23 UTC

No, GCC does the documented thing.
%w[id] says take the argument of id and print it as if it was 16-bit.
It is like all the other size modifiers and is heavily used that way by both GCC itself and various projects in inline-asm.  If you want to auto-detect the size of the operand rather than override it, don't use any size modifiers.

Comment 9 Jakub Jelinek 2019-05-13 14:01:07 UTC

Note, in http://PR80115 you said systemtap doesn't care how the register is spelled and that the width is encoded differently, which is why I've suggested that w modifier, it could have been k modifier too and then everything would be printed as %edi etc.; the reason for w was that it is shorter.

Comment 10 Frank Ch. Eigler 2019-05-13 14:01:57 UTC

(In reply to Jakub Jelinek from comment #8)
> No, GCC does the documented thing.
> [...] If you want to auto-detect
> the size of the operand rather than override it, don't use any size
> modifiers.

OK, in that case it seems like we were wrong to take the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80115 patch.  We don't want gcc to give information known not to match the actual data.  I wonder why we opted for this instead of 

# define STAP_SDT_ARG_CONSTRAINT  norq

or such, so that the basic constraint includes the 1-byte subregister type.  Would we need another code for 2-byte subregisters, if those are also excluded from "r" class?

Comment 11 Jakub Jelinek 2019-05-13 14:28:08 UTC

(In reply to Frank Ch. Eigler from comment #10)
> (In reply to Jakub Jelinek from comment #8)
> > No, GCC does the documented thing.
> > [...] If you want to auto-detect
> > the size of the operand rather than override it, don't use any size
> > modifiers.
> 
> OK, in that case it seems like we were wrong to take the
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80115 patch.  We don't want gcc
> to give information known not to match the actual data.  I wonder why we
> opted for this instead of 
> 
> # define STAP_SDT_ARG_CONSTRAINT  norq

Not really sure what you want to achieve with that.  That means the argument can be either integer, or offsettable memory, or any GPR register, or any GPR register ok for 8-bit values, the union of the last two sets is the third set, so norq is the same thing as nor.  If you don't have the constraint strings dependent on the size of the argument (not really sure how you could do it except for requiring to use a different macro for 8-bit arguments, perhaps with some compile time verification), then you either use nor and then need to use some address override (w and all registers will be printed as %ax, %bx, %cx, %dx, %si, %di, %bp, %sp or k and all registers will be printed as %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp, %esp, regardless of the size), or use noq on ia32 and then that will penalize code quite a lot, because will force all those arguments into just %al, %bl, %cl or %dl.

From what I see, the size is encoded in the string too, so for 8-bit arguments you have there 1%ax, for 16-bit arguments 2%ax, for 32-bit arguments 4%ax and for 64-bit arguments 8%ax, all that can be interpreted by stap as %al, %ax, %eax and %eax + (%edx<<32) respectively.  So, I believe you have quite compact representation that doesn't omit any details and is easy to interpret.

Comment 12 Frank Ch. Eigler 2019-05-13 17:01:48 UTC

OK, trying to figure out where the mismatches are.

One one hand, one may interpret the register names as -a- name for the container that holds the value, and use the operand-size value to identify the N-bit bitstring payload within it.  In this case, the decoder must settle which names are deemed equivalent.  (Note that there is not a need to shorten register name strings, so this is not worth doing on that account only.)

On another hand, one may interpret the register names as a complete site for the value in question, so %al would refer to an at-most-8-bit value in AL, definitely not a 64-bit value in %rax.  In this case, the names don't have to be reprocessed by a smart decoder.  Register names would carry an implicit maximum size.

So on hand, 4@%si would mean the 4-byte value %esi, with the decoder put in charge of figuring this pairing out.  On the other hand, 4@%si would be an error because %si is only 2 bytes wide in the first place.

To my knowledge, we have not specified which of these interpretations we intend.  The sys/sdt.h patch from gcc bug# 80115 enshrined the first interpretation, but we did not fully grok/document this.

Does anyone have concerns about moving to case 2 - which in the i386 case would mean using  _SDT_ARGTMPL(id) = %k[id], so gcc would emit 4@%edi etc. for this case?

Comment 13 Jakub Jelinek 2019-05-13 17:08:55 UTC

(In reply to Frank Ch. Eigler from comment #12)
> Does anyone have concerns about moving to case 2 - which in the i386 case
> would mean using  _SDT_ARGTMPL(id) = %k[id], so gcc would emit 4@%edi etc.
> for this case?

Then the decoder still needs to know what does it mean when 8@%eax appears (how to find what is the next register and take endianity into account etc.).
But true, for 8/16/32 bit sizes it could just take the named register, take the lowest N*8 bits out of that and use that.
Transforming %di to %edi isn't that hard though and will mean you don't have to rebuild all the software which has the notes already created.

Comment 14 Frank Ch. Eigler 2019-05-13 17:46:02 UTC

Good point about 8@eax ... assuming that's how long-longs would show up.  At least if we could state the policy that if a stated width larger than the register's own width, then this is a signal that a decoder must be intelligent and match gcc's notion of register-pairings etc. This assumes that this is a fixed / context-free / stateless algorithm that we could document as a table over time.

In the mean time, we could switch to the %k[] variant to make this case less likely to trigger, requiring that documentation & logic.

Comment 15 Sergio Durigan Junior 2019-05-13 18:12:51 UTC

(In reply to Frank Ch. Eigler from comment #14)
> Good point about 8@eax ... assuming that's how long-longs would show up.  At
> least if we could state the policy that if a stated width larger than the
> register's own width, then this is a signal that a decoder must be
> intelligent and match gcc's notion of register-pairings etc. This assumes
> that this is a fixed / context-free / stateless algorithm that we could
> document as a table over time.
> 
> In the mean time, we could switch to the %k[] variant to make this case less
> likely to trigger, requiring that documentation & logic.

Thanks for taking a look into this, Frank and Jakub.

If I am understanding correctly, the proposal here is to make both modifications: use %k[] on sdt.h, and also make the decoder interpret when the informed size is actually larger than the actual register size.

I am more than fine with using %k[], of course, if it means that we will have the right pair of size-string@register there.

As for the second part, I think I can accomodate and expand the current parser to take that into account, even though my first impression is that the parser shouldn't need to have this knowledge built into it.

Comment 16 Frank Ch. Eigler 2019-05-13 18:16:39 UTC

> If I am understanding correctly, the proposal here is to make both
> modifications: use %k[] on sdt.h, and also make the decoder interpret when
> the informed size is actually larger than the actual register size.

I wish we didn't have to deal with the latter, but we may not have a choice
on the latter, for larger-than-any-register types.

> I am more than fine with using %k[], of course, if it means that we will
> have the right pair of size-string@register there.

Just to confirm, there is no sign/size problem in the sdt.h machinery?

> As for the second part, I think I can accomodate and expand the current
> parser to take that into account, even though my first impression is that
> the parser shouldn't need to have this knowledge built into it.

Indeed.  And there are multiple projects that consume sdt.h macro notes
(gdb, stap, perf, several other tracing tools), so this code would to be
replicated.  We can start assembling a table in one of the wikis, or the
header file itself.

Comment 17 Sergio Durigan Junior 2019-05-13 21:04:36 UTC

(In reply to Frank Ch. Eigler from comment #16)
> > I am more than fine with using %k[], of course, if it means that we will
> > have the right pair of size-string@register there.
> 
> Just to confirm, there is no sign/size problem in the sdt.h machinery?

Hm, not sure I understood the question.

The current arguments do have sign and size problems; "4@%di" should actually be "-2@di".  But maybe the question wasn't for me; sorry if that's the case.

> > As for the second part, I think I can accomodate and expand the current
> > parser to take that into account, even though my first impression is that
> > the parser shouldn't need to have this knowledge built into it.
> 
> Indeed.  And there are multiple projects that consume sdt.h macro notes
> (gdb, stap, perf, several other tracing tools), so this code would to be
> replicated.  We can start assembling a table in one of the wikis, or the
> header file itself.

Sounds like a good idea.

Comment 18 Frank Ch. Eigler 2019-05-13 21:21:06 UTC

> > Just to confirm, there is no sign/size problem in the sdt.h machinery?
> 
> Hm, not sure I understood the question.
> 
> The current arguments do have sign and size problems; "4@%di" should
> actually be "-2@di".  But maybe the question wasn't for me; sorry if that's
> the case.

That's almost exactly the question.
Here, by "%di", gcc meant "%edi".  So the question is whether the 4@ part is correct, if we understand that it's referring to the entire hypothetical register.  And I believe it is, being a 32-bit pointer.

Comment 19 Sergio Durigan Junior 2019-05-13 23:42:05 UTC

(In reply to Frank Ch. Eigler from comment #18)
> > > Just to confirm, there is no sign/size problem in the sdt.h machinery?
> > 
> > Hm, not sure I understood the question.
> > 
> > The current arguments do have sign and size problems; "4@%di" should
> > actually be "-2@di".  But maybe the question wasn't for me; sorry if that's
> > the case.
> 
> That's almost exactly the question.
> Here, by "%di", gcc meant "%edi".  So the question is whether the 4@ part is
> correct, if we understand that it's referring to the entire hypothetical
> register.  And I believe it is, being a 32-bit pointer.

Thanks for clarifying.  According to GDB:

(gdb) ptype $edi
type = int32_t

So it should be "-4@%edi", to account for the signedness.

Comment 20 Frank Ch. Eigler 2019-05-14 01:03:12 UTC

> (gdb) ptype $edi
> type = int32_t
> 
> So it should be "-4@%edi", to account for the signedness.

But the signedness is that of the value (the pointer), not that of the register holding the value (which in this case is an invention of gdb anyway, and not something the CPU considers).

Comment 21 Frank Ch. Eigler 2019-05-14 01:43:49 UTC

We're spinning a new stap rawhide build which includes the %k[] modification.

Comment 22 Frank Ch. Eigler 2019-05-21 14:09:43 UTC

https://koji.fedoraproject.org/koji/buildinfo?buildID=1267378

would appreciate a check that, at least for this test, it's good enough

Comment 23 Sergio Durigan Junior 2019-05-22 14:27:01 UTC

(In reply to Frank Ch. Eigler from comment #22)
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1267378
> 
> would appreciate a check that, at least for this test, it's good enough

I built a scratch GCC package on Fedora's Koji, which used the new systemtap package (systemtap-sdt-devel 4.2-0.20190513git8b868f3dd030.fc31), and then I installed the libstdc++.i686 RPM on my system:

  https://koji.fedoraproject.org/koji/taskinfo?taskID=34985489

Then, I ran the test again, and unfortunately I still see the same failure.

The new /usr/lib/libstdc++.so.6.0.26 still refers to %di:

$ readelf -n /usr/lib/libstdc++.so.6.0.26
[...]
Displaying notes found in: .note.stapsdt
  Owner                 Data size       Description
  stapsdt              0x0000002e       NT_STAPSDT (SystemTap probe descriptors)
    Provider: libstdcxx
    Name: catch
    Location: 0x0008abf3, Base: 0x0019648b, Semaphore: 0x00000000
    Arguments: 4@%dx 4@-48(%esi)
  stapsdt              0x00000028       NT_STAPSDT (SystemTap probe descriptors)
    Provider: libstdcxx
    Name: throw
    Location: 0x0008c13a, Base: 0x0019648b, Semaphore: 0x00000000
    Arguments: 4@%si 4@%di
  stapsdt              0x0000002a       NT_STAPSDT (SystemTap probe descriptors)
    Provider: libstdcxx
    Name: rethrow
    Location: 0x0008c1fd, Base: 0x0019648b, Semaphore: 0x00000000
    Arguments: 4@%bx 4@%ax

Comment 24 Sourceware Commits 2019-06-16 15:28:47 UTC

The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=93cb9841d68263174a600dc70af742a8e2eabfc6

commit 93cb9841d68263174a600dc70af742a8e2eabfc6
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Sun Jun 16 16:17:59 2019 +0100

    gdb/testsuite: Improve detection of bug gdb/24541
    
    In bug gdb/24686 a testsuite failure was reported, this failure was
    actually just another instance of bug gdb/24541, however, due to the
    non-deterministic nature of bug gdb/24541 the testsuite pattern that
    was intended to catch this bug failed.
    
    This commit adds a second pattern to help detect gdb/24541, which
    should change the FAIL reported in gdb/24686 into a KFAIL.
    
    gdb/testsuite/ChangeLog:
    
    	PR gdb/24686
    	* gdb.mi/mi-catch-cpp-exceptions.exp: Add an extra pattern to
    	improve detection of bug gdb/24541.

Comment 25 Sergio Durigan Junior 2019-06-16 17:44:57 UTC

Thanks, Andrew.  I haven't forgotten about this bug.  If I have the time, I'll work on it this next week.

Comment 26 Sergio Durigan Junior 2019-06-26 22:01:23 UTC

Patch posted: https://sourceware.org/ml/gdb-patches/2019-06/msg00607.html

Comment 27 Sourceware Commits 2019-06-28 20:29:46 UTC

The master branch has been updated by Sergio Durigan Junior <sergiodj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7d7571f0c14b4673ca95f6dc31d6f07d429e6697

commit 7d7571f0c14b4673ca95f6dc31d6f07d429e6697
Author: Sergio Durigan Junior <sergiodj@redhat.com>
Date:   Wed Jun 26 17:34:50 2019 -0400

    Adjust i386 registers on SystemTap probes' arguments (PR breakpoints/24541)
    
    This bug has been reported on PR breakpoints/24541, but it is possible
    to reproduce it easily by running:
    
      make check-gdb TESTS=gdb.base/stap-probe.exp RUNTESTFLAGS='--target_board unix/-m32'
    
    The underlying cause is kind of complex, and involves decisions made
    by GCC and the sys/sdt.h header file about how to represent a probe
    argument that lives in a register in 32-bit programs.  I'll use
    Andrew's example on the bug to illustrate the problem.
    
    libstdc++ has a probe named "throw" with two arguments.  On i386, the
    probe is:
    
      stapsdt              0x00000028       NT_STAPSDT (SystemTap probe descriptors)
        Provider: libstdcxx
        Name: throw
        Location: 0x00072c96, Base: 0x00133d64, Semaphore: 0x00000000
        Arguments: 4@%si 4@%di
    
    I.e., the first argument is an unsigned 32-bit value (represented by
    the "4@") that lives on %si, and the second argument is an unsigned
    32-bit value that lives on %di.  Note the discrepancy between the
    argument size reported by the probe (32-bit) and the register size
    being used to store the value (16-bit).
    
    However, if you take a look at the disassemble of a program that uses
    this probe, you will see:
    
        00072c80 <__cxa_throw@@CXXABI_1.3>:
           72c80:       57                      push   %edi
           72c81:       56                      push   %esi
           72c82:       53                      push   %ebx
           72c83:       8b 74 24 10             mov    0x10(%esp),%esi
           72c87:       e8 74 bf ff ff          call   6ec00 <__cxa_finalize@plt+0x980>
           72c8c:       81 c3 74 e3 10 00       add    $0x10e374,%ebx
           72c92:       8b 7c 24 14             mov    0x14(%esp),%edi
           72c96:       90                      nop                      <----------------- PROBE IS HERE
           72c97:       e8 d4 a2 ff ff          call   6cf70 <__cxa_get_globals@plt>
           72c9c:       83 40 04 01             addl   $0x1,0x4(%eax)
           72ca0:       83 ec 04                sub    $0x4,%esp
           72ca3:       ff 74 24 1c             pushl  0x1c(%esp)
           72ca7:       57                      push   %edi
           72ca8:       56                      push   %esi
           72ca9:       e8 62 a3 ff ff          call   6d010 <__cxa_init_primary_exception@plt>
           72cae:       8d 70 40                lea    0x40(%eax),%esi
           72cb1:       c7 00 01 00 00 00       movl   $0x1,(%eax)
           72cb7:       89 34 24                mov    %esi,(%esp)
           72cba:       e8 61 96 ff ff          call   6c320 <_Unwind_RaiseException@plt>
           72cbf:       89 34 24                mov    %esi,(%esp)
           72cc2:       e8 c9 84 ff ff          call   6b190 <__cxa_begin_catch@plt>
           72cc7:       e8 d4 b3 ff ff          call   6e0a0 <_ZSt9terminatev@plt>
           72ccc:       66 90                   xchg   %ax,%ax
           72cce:       66 90                   xchg   %ax,%ax
    
    Note how the program is actually using %edi, and not %di, to store the
    second argument.  This is the problem here.
    
    GDB will basically read the probe argument, then read the contents of
    %di, and then cast this value to uint32_t, which causes the wrong
    value to be obtained.  In the gdb.base/stap-probe.exp case, this makes
    GDB read the wrong memory location, and not be able to display a test
    string.  In Andrew's example, this causes GDB to actually stop at a
    "catch throw" when it should actually have *not* stopped.
    
    After some discussion with Frank Eigler and Jakub Jelinek, it was
    decided that this bug should be fixed on the client side (i.e., the
    program that actually reads the probes), and this is why I'm proposing
    this patch.
    
    The idea is simple: we will have a gdbarch method, which, for now, is
    only used by i386.  The generic code that deals with register operands
    on gdb/stap-probe.c will call this method if it exists, passing the
    current parse information, the register name and its number.
    
    The i386 method will then verify if the register size is greater or
    equal than the size reported by the stap probe (the "4@" part).  If it
    is, we're fine.  Otherwise, it will check if we're dealing with any of
    the "extendable" registers (like ax, bx, si, di, sp, etc.).  If we
    are, it will change the register name to include the "e" prefix.
    
    I have tested the patch here in many scenarios, and it fixes Andrew's
    bug and also the regressions I mentioned before, on
    gdb.base/stap-probe.exp.  No regressions where found on other tests.
    
    Comments?
    
    gdb/ChangeLog:
    2019-06-27  Sergio Durigan Junior  <sergiodj@redhat.com>
    
    	PR breakpoints/24541
    	* gdbarch.c: Regenerate.
    	* gdbarch.h: Regenerate.
    	* gdbarch.sh: Add 'stap_adjust_register'.
    	* i386-tdep.c: Include '<unordered_set>'.
    	(i386_stap_adjust_register): New function.
    	(i386_elf_init_abi): Register 'i386_stap_adjust_register'.
    	* stap-probe.c (stap_parse_register_operand): Call
    	'gdbarch_stap_adjust_register'.

Comment 28 Sergio Durigan Junior 2019-06-28 20:35:05 UTC

Fixed.

Comment 29 Sourceware Commits 2019-07-17 15:26:21 UTC

The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6b78c3f83c8bcbfa714aab7627ece9673b2d602a

commit 6b78c3f83c8bcbfa714aab7627ece9673b2d602a
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Tue Jul 2 12:06:06 2019 +0100

    gdb: Remove a non-const reference parameter
    
    Non-const reference parameter should be avoided according to the GDB
    coding standard:
    
      https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#Avoid_non-const_reference_parameters.2C_use_pointers_instead
    
    This commit updates the gdbarch method gdbarch_stap_adjust_register,
    and the one implementation i386_stap_adjust_register to avoid using a
    non-const reference parameter.
    
    I've also removed the kfail from the testsuite for bug 24541, as this
    issue is now resolved.
    
    gdb/ChangeLog:
    
    	PR breakpoints/24541
    	* gdbarch.c: Regenerate.
    	* gdbarch.h: Regenerate.
    	* gdbarch.sh: Adjust return type and parameter types for
    	'stap_adjust_register'.
    	(i386_stap_adjust_register): Adjust signature and return new
    	register name.
    	* stap-probe.c (stap_parse_register_operand): Adjust use of
    	'gdbarch_stap_adjust_register'.
    
    gdb/testsuite/ChangeLog:
    
    	PR breakpoints/24541
    	* gdb.mi/mi-catch-cpp-exceptions.exp: Remove kfail due to 24541.

Comment 30 Sourceware Commits 2019-09-11 20:04:24 UTC

The gdb-8.3-branch branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3b0e664e2fa479d66e7b39ba28739f6d6315f27c

commit 3b0e664e2fa479d66e7b39ba28739f6d6315f27c
Author: Sergio Durigan Junior <sergiodj@redhat.com>
Date:   Wed Sep 11 22:03:40 2019 +0200

    Adjust i386 registers on SystemTap probes' arguments (PR breakpoints/24541)
    
    [ Backport of master commit 7d7571f0c1. ]
    
    This bug has been reported on PR breakpoints/24541, but it is possible
    to reproduce it easily by running:
    
      make check-gdb TESTS=gdb.base/stap-probe.exp RUNTESTFLAGS='--target_board unix/-m32'
    
    The underlying cause is kind of complex, and involves decisions made
    by GCC and the sys/sdt.h header file about how to represent a probe
    argument that lives in a register in 32-bit programs.  I'll use
    Andrew's example on the bug to illustrate the problem.
    
    libstdc++ has a probe named "throw" with two arguments.  On i386, the
    probe is:
    
      stapsdt              0x00000028       NT_STAPSDT (SystemTap probe descriptors)
        Provider: libstdcxx
        Name: throw
        Location: 0x00072c96, Base: 0x00133d64, Semaphore: 0x00000000
        Arguments: 4@%si 4@%di
    
    I.e., the first argument is an unsigned 32-bit value (represented by
    the "4@") that lives on %si, and the second argument is an unsigned
    32-bit value that lives on %di.  Note the discrepancy between the
    argument size reported by the probe (32-bit) and the register size
    being used to store the value (16-bit).
    
    However, if you take a look at the disassemble of a program that uses
    this probe, you will see:
    
        00072c80 <__cxa_throw@@CXXABI_1.3>:
           72c80:       57                      push   %edi
           72c81:       56                      push   %esi
           72c82:       53                      push   %ebx
           72c83:       8b 74 24 10             mov    0x10(%esp),%esi
           72c87:       e8 74 bf ff ff          call   6ec00 <__cxa_finalize@plt+0x980>
           72c8c:       81 c3 74 e3 10 00       add    $0x10e374,%ebx
           72c92:       8b 7c 24 14             mov    0x14(%esp),%edi
           72c96:       90                      nop                      <----------------- PROBE IS HERE
           72c97:       e8 d4 a2 ff ff          call   6cf70 <__cxa_get_globals@plt>
           72c9c:       83 40 04 01             addl   $0x1,0x4(%eax)
           72ca0:       83 ec 04                sub    $0x4,%esp
           72ca3:       ff 74 24 1c             pushl  0x1c(%esp)
           72ca7:       57                      push   %edi
           72ca8:       56                      push   %esi
           72ca9:       e8 62 a3 ff ff          call   6d010 <__cxa_init_primary_exception@plt>
           72cae:       8d 70 40                lea    0x40(%eax),%esi
           72cb1:       c7 00 01 00 00 00       movl   $0x1,(%eax)
           72cb7:       89 34 24                mov    %esi,(%esp)
           72cba:       e8 61 96 ff ff          call   6c320 <_Unwind_RaiseException@plt>
           72cbf:       89 34 24                mov    %esi,(%esp)
           72cc2:       e8 c9 84 ff ff          call   6b190 <__cxa_begin_catch@plt>
           72cc7:       e8 d4 b3 ff ff          call   6e0a0 <_ZSt9terminatev@plt>
           72ccc:       66 90                   xchg   %ax,%ax
           72cce:       66 90                   xchg   %ax,%ax
    
    Note how the program is actually using %edi, and not %di, to store the
    second argument.  This is the problem here.
    
    GDB will basically read the probe argument, then read the contents of
    %di, and then cast this value to uint32_t, which causes the wrong
    value to be obtained.  In the gdb.base/stap-probe.exp case, this makes
    GDB read the wrong memory location, and not be able to display a test
    string.  In Andrew's example, this causes GDB to actually stop at a
    "catch throw" when it should actually have *not* stopped.
    
    After some discussion with Frank Eigler and Jakub Jelinek, it was
    decided that this bug should be fixed on the client side (i.e., the
    program that actually reads the probes), and this is why I'm proposing
    this patch.
    
    The idea is simple: we will have a gdbarch method, which, for now, is
    only used by i386.  The generic code that deals with register operands
    on gdb/stap-probe.c will call this method if it exists, passing the
    current parse information, the register name and its number.
    
    The i386 method will then verify if the register size is greater or
    equal than the size reported by the stap probe (the "4@" part).  If it
    is, we're fine.  Otherwise, it will check if we're dealing with any of
    the "extendable" registers (like ax, bx, si, di, sp, etc.).  If we
    are, it will change the register name to include the "e" prefix.
    
    I have tested the patch here in many scenarios, and it fixes Andrew's
    bug and also the regressions I mentioned before, on
    gdb.base/stap-probe.exp.  No regressions where found on other tests.
    
    Comments?
    
    gdb/ChangeLog:
    2019-06-27  Sergio Durigan Junior  <sergiodj@redhat.com>
    
    	PR breakpoints/24541
    	* gdbarch.c: Regenerate.
    	* gdbarch.h: Regenerate.
    	* gdbarch.sh: Add 'stap_adjust_register'.
    	* i386-tdep.c: Include '<unordered_set>'.
    	(i386_stap_adjust_register): New function.
    	(i386_elf_init_abi): Register 'i386_stap_adjust_register'.
    	* stap-probe.c (stap_parse_register_operand): Call
    	'gdbarch_stap_adjust_register'.

Comment 31 Joel Brobecker 2019-09-16 23:55:16 UTC

Tentatively setting the milestone to 8.3.1 in preparation for that release.
I say "tentatively" because not all patches have been backported, but
IIUC, the one that really matters has?

Comment 32 Sourceware Commits 2019-09-17 04:03:34 UTC

The gdb-8.3-branch branch has been updated by Sergio Durigan Junior <sergiodj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6f4f8f476a4e41cc7117a8e85087963c0ac3e95b

commit 6f4f8f476a4e41cc7117a8e85087963c0ac3e95b
Author: Sergio Durigan Junior <sergiodj@redhat.com>
Date:   Tue Sep 17 00:01:13 2019 -0400

    Update ChangeLog entry of commit e9224f6d80ea35e90a61d44575f12b26742eaaf3 and mention PR breakpoints/24541

Comment 33 Sourceware Commits 2019-09-18 21:50:00 UTC

The master branch has been updated by Joel Brobecker <brobecke@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a1726c38a9e8ac07b3fd6bbfac04da27c32c0d70

commit a1726c38a9e8ac07b3fd6bbfac04da27c32c0d70
Author: Joel Brobecker <brobecker@adacore.com>
Date:   Wed Sep 18 14:42:24 2019 -0700

    Update ChangeLog entry of commit 677052f2a5c67f1d9b2e6d1b2a2149b5f0c20cd0...
    
    ... to include PR breakpoints/24541 (for documentation purposes).
    
    For the reader's convenience, the commit in question was the following:
    
        commit 677052f2a5c67f1d9b2e6d1b2a2149b5f0c20cd0
        Author: Sergio Durigan Junior <sergiodj@redhat.com>
        Date:   Thu May 16 16:23:24 2019 -0400
        Subject: Make stap-probe.c:stap_parse_register_operand's "regname"
           an std::string