Re: Debugger support for __float128 type?
- From: Gabriel Paubert <paubert@iram.es>
- To: Mark Kettenis <mark.kettenis@xs4all.nl>
- Cc: uweigand@de.ibm.com, gcc@gcc.gnu.org, gdb@sourceware.org
- Date: Thu, 1 Oct 2015 11:45:36 +0200
- Subject: Re: Debugger support for __float128 type?
- References: <20150930173344.9D78513CC@oc7340732750.ibm.com> <201509302242.t8UMg5j1017796@glazunov.sibelius.xs4all.nl>
On Thu, Oct 01, 2015 at 12:42:05AM +0200, Mark Kettenis wrote:
> > Date: Wed, 30 Sep 2015 19:33:44 +0200 (CEST)
> > From: "Ulrich Weigand" <uweigand@de.ibm.com>
> >
> > Hello,
> >
> > I've been looking into supporting __float128 in the debugger, since we're
> > now introducing this type on PowerPC. Initially, I simply wanted to do
> > whatever GDB does on Intel, but it turns out debugging __float128 doesn't
> > work on Intel either ...
> >
> > The most obvious question is, how should the type be represented in
> > DWARF debug info in the first place? Currently, GCC generates on i386:
> >
> > .uleb128 0x3 # (DIE (0x2d) DW_TAG_base_type)
> > .byte 0xc # DW_AT_byte_size
> > .byte 0x4 # DW_AT_encoding
> > .long .LASF0 # DW_AT_name: "long double"
> >
> > and
> >
> > .uleb128 0x3 # (DIE (0x4c) DW_TAG_base_type)
> > .byte 0x10 # DW_AT_byte_size
> > .byte 0x4 # DW_AT_encoding
> > .long .LASF1 # DW_AT_name: "__float128"
> >
> > On x86_64, __float128 is encoded the same way, but long double is:
> >
> > .uleb128 0x3 # (DIE (0x31) DW_TAG_base_type)
> > .byte 0x10 # DW_AT_byte_size
> > .byte 0x4 # DW_AT_encoding
> > .long .LASF0 # DW_AT_name: "long double"
> >
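> > (For reference, a minimal reproducer: with a recent GCC, compiling
> > the following with "gcc -g -S" should emit base-type DIEs like the
> > ones above; the file and variable names are just for illustration.
> >
> > /* float128-dies.c */
> > long double ld = 1.0L;   /* "long double" base type DIE */
> > __float128 f128 = 1.0Q;  /* "__float128" base type DIE */
> > )
> >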
> > Now, GDB doesn't recognize __float128 on either platform, but on i386
> > it could at least in theory distinguish the two via DW_AT_byte_size
> > (12 vs. 16 bytes).
> >
> > But on x86_64 (and also on powerpc), long double and __float128 have
> > the identical DWARF encoding, except for the name.
> >
> > Looking at the current DWARF standard, it's not really clear how to
> > make a distinction, either. The standard has no way to specify any
> > particular floating-point format; the only attributes for a base type
> > of DW_ATE_float encoding are related to the size.
> >
> > (For the Intel case, one option might be to represent the fact that
> > for long double, there are only 80 data bits and the rest is padding, via
> > some combination of the DW_AT_bit_size and DW_AT_bit_offset or
> > DW_AT_data_bit_offset attributes. But that wouldn't help for PowerPC
> > since both long double and __float128 really use 128 data bits,
> > just different encodings.)
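> >
> > (In that scheme the i386 long double DIE might look something like
> > this -- purely a sketch of possible output, not what GCC emits today:
> >
> > .uleb128 0x4 # (DIE (...) DW_TAG_base_type)
> > .byte 0xc # DW_AT_byte_size
> > .byte 0x50 # DW_AT_bit_size: 80 data bits
> > .byte 0 # DW_AT_data_bit_offset: data starts at bit 0
> > .byte 0x4 # DW_AT_encoding
> > .long .LASF0 # DW_AT_name: "long double"
> > )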
> >
> > Some options might be:
> >
> > - Extend the official DWARF standard in some way
> >
> > - Use a private extension (e.g. from the platform-reserved
> > DW_AT_encoding value range)
> >
> > - Have the debugger just hard-code a special case based
> > on the __float128 name (a sketch of this follows below)
> >
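> > (A rough sketch of that last option in C; the hook shape and NULL
> > fallback are made up for illustration -- only floatformats_ia64_quad
> > is an existing GDB symbol, naming the IEEE quad format:
> >
> > static const struct floatformat **
> > i386_floatformat_for_type (struct gdbarch *gdbarch,
> >                            const char *name, int length)
> > {
> >   /* Hard-coded special case keyed on the type's name alone,
> >      since the DWARF encoding cannot distinguish the types.  */
> >   if (length == 16 && strcmp (name, "__float128") == 0)
> >     return floatformats_ia64_quad;
> >
> >   /* Anything else keeps the usual size-based format lookup.  */
> >   return NULL;
> > }
> > )
> >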
> > Am I missing something here? Any suggestions welcome ...
> >
> > B.t.w. is there interest in fixing this problem for Intel? I notice
> > there is a GDB bug open on the issue, but nothing seems to have happened
> > so far: https://sourceware.org/bugzilla/show_bug.cgi?id=14857
>
> Perhaps you should start with explaining what __float128 actually is
> on your specific platform? And what long double actually is.
>
> I'm guessing long double is what we sometimes call an IBM long
> double, which is essentially two IEEE double-precision floating point
> numbers packed together and that __float128 is an attempt to fix
> history and have a proper IEEE quad-precision floating point type ;).
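>
> (Concretely, an IBM long double holds its value as the exact sum of
> two doubles, a "head" and a "tail"; a minimal C illustration,
> assuming double-double semantics:
>
> long double x = 1.0L + 0x1p-80L;    /* head + tail, both exact */
> double head = (double) x;           /* == 1.0: x rounded to double */
> double tail = (double) (x - head);  /* == 0x1p-80: the remainder */
> /* x == head + tail exactly, for ~106 bits of precision but only
>    the exponent range of a plain double.  */
> )
>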
> And that __float128 isn't actually implemented in hardware.
An IBM mainframe might want to discuss this point with you :-).
See pages 24-25 of http://arith22.gforge.inria.fr/slides/s1-schwarz.pdf
Latencies are decent, if not extremely low, but we are speaking of a
processor clocked at 5GHz (0.2ns per cycle), so the latencies are 2.2ns
(11 cycles) for add/subtract, 4.6ns (23 cycles) for multiplication, and
~10ns (~50 cycles) for division.
To put things in perspective, how many cycles is a memory access which
misses in both L1 and L2 caches these days?
> The reason people haven't bothered to fix this is probably because
> nobody actually implements quad-precision floating point in hardware.
> And software implementations are so slow that people don't really use
> them unless they need to. Like I did to numerically calculate some
> asymptotic expansions for my thesis work...
Which would probably run much faster if ported to a z13.
Gabriel