This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Re: incorrect signed data

From: Mark Wielaard <mjw at redhat dot com>
To: elfutils-devel at lists dot fedorahosted dot org
Date: Wed, 05 Feb 2014 12:44:52 +0100
Subject: Re: incorrect signed data

On Tue, 2014-02-04 at 18:24 -0800, Josh Stone wrote:
> On 02/04/2014 03:12 PM, Josh Stone wrote:
> > There are only a few internal dwarf_formsdata calls: for the decls as I
> > mentioned, and in array_size() for DW_AT_lower/upper_bound.  AFAICS the
> > spec doesn't explicitly call bounds signed or unsigned, but only
> > unsigned makes sense to me, so these also ought to use dwarf_formudata.
> 
> http://www.dwarfstd.org/ShowIssue.php?issue=020702.1
> 
> So Fortran allows negative bounds, yay, and this is the origin of the
> standard's vague statements about data[1248] context.

Thanks for finding this, it explains the context nicely.

> Here's a little experiment with gcc-gfortran-4.8.2-7.fc20.x86_64:
> (and forgive my fortran ignorance, but at least this compiles)
> 
>      PROGRAM main
>        INTEGER A(10:199)
>        INTEGER B(-20:-10)
>        A(10) = B(-10)
>      END
> 
> yields:
> 
>  [    67]    array_type
>              type                 (ref4) [    7f]
>              sibling              (ref4) [    78]
>  [    70]      subrange_type
>                type                 (ref4) [    78]
>                lower_bound          (data1) 10
>                upper_bound          (data1) 199
>  [    78]    base_type
>              byte_size            (data1) 8
>              encoding             (data1) signed (5)
>              name                 (strp) "integer(kind=8)"
>  [    7f]    base_type
>              byte_size            (data1) 4
>              encoding             (data1) signed (5)
>              name                 (strp) "integer(kind=4)"
>  [    86]    array_type
>              type                 (ref4) [    7f]
>              sibling              (ref4) [    a5]
>  [    8f]      subrange_type
>                type                 (ref4) [    78]
>                lower_bound          (data8) 18446744073709551596
>                upper_bound          (data8) 18446744073709551606
> 
> Thus gfortran appears to support the current elfutils behavior - read it
> as unsigned and cast it without sign extension.  It happily put 199 in
> data1, and went all the way to data8 for negative values.  It could have
> been more compact with sdata instead of data8 though.
> 
> Also, apparently eu-readelf is not using dwarf_formsdata for bounds, and
> it should.

It doesn't, because it is very low-level and doesn't use any context. So
it just sees the DW_FORM_data8 and prints its value. But if I read the
DWARF issue correctly, a higher-level interface seeing a
DW_TAG_subrange_type would first look up the DW_AT_type of the DIE to
see whether it is signed or not, and use that to decide how to interpret
the DW_AT_lower_bound and DW_AT_upper_bound values. A bound can even be
a reference or an exprloc that represents the actual value. We might
want to introduce a dwarf_subrange_bounds () function that does that.
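
As a rough illustration of that idea, the sketch below interprets a
constant bound using the signedness and byte size of the subrange's base
type. It is only a model under stated assumptions: interpret_bound and
struct bound are hypothetical names, not libdw API, the raw value is
assumed to have been read zero-extended from its DW_FORM_dataN form, and
the reference and exprloc cases are not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type (not libdw): a bound tagged with how it
   should be read.  */
struct bound
{
  bool is_signed;
  union { int64_t s; uint64_t u; } v;
};

/* Hypothetical helper (not libdw): RAW is the constant as read
   zero-extended from its form; TYPE_IS_SIGNED and TYPE_SIZE describe
   the base type referenced by the subrange's DW_AT_type.  */
static struct bound
interpret_bound (uint64_t raw, bool type_is_signed, unsigned type_size)
{
  struct bound b;
  b.is_signed = type_is_signed;
  if (!type_is_signed)
    {
      b.v.u = raw;
      return b;
    }

  unsigned bits = type_size * 8;
  if (bits == 0 || bits >= 64)
    {
      /* Full-width reinterpretation; assumes two's complement.  */
      b.v.s = (int64_t) raw;
      return b;
    }

  /* Sign-extend from the width of the type, not the width of the form.  */
  uint64_t sign = UINT64_C (1) << (bits - 1);
  uint64_t low = raw & ((sign << 1) - 1);
  b.v.s = (int64_t) (low ^ sign) - (int64_t) sign;
  return b;
}
```

With the gfortran output quoted above (an 8-byte signed base type), this
reading keeps the data1 value 199 as 199 and turns the data8 value
18446744073709551596 into -20.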

>   Binutils readelf prints those as hex, no better.
> 
> FWIW, libdwarf's dwarfdump just reveals its indecision:
>   DW_AT_lower_bound           10
>   DW_AT_upper_bound           199(as signed = -57)
> and
>   DW_AT_lower_bound           18446744073709551596(as signed = -20)
>   DW_AT_upper_bound           18446744073709551606(as signed = -10)

Right, because dwarfdump is similar to eu-readelf: it doesn't use any
context, so it doesn't know how to represent the value encoded with
DW_FORM_data8. I actually like that it also prints the signed value if
different. Maybe we should make eu-readelf do the same? Printing in hex,
like binutils readelf does, is another way to mask the ambiguity at the
low level.
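
A minimal sketch of that printing style, mimicking the dwarfdump output
quoted above. The function names are hypothetical and this is not
eu-readelf code; it assumes the value fits in the number of bytes of its
form and that the narrowing casts behave as two's complement, which
holds for gcc and clang.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Sign-extend the low NBYTES bytes of VALUE (NBYTES is 1, 2, 4 or 8).  */
static int64_t
sign_extend (uint64_t value, unsigned nbytes)
{
  switch (nbytes)
    {
    case 1: return (int8_t) value;
    case 2: return (int16_t) value;
    case 4: return (int32_t) value;
    default: return (int64_t) value;
    }
}

/* Write the unsigned value into BUF, followed by the signed reading
   only when it differs (that is, when the sign bit of the form is set).
   Returns what snprintf returns.  */
static int
format_data (char *buf, size_t len, uint64_t value, unsigned nbytes)
{
  int64_t s = sign_extend (value, nbytes);
  if (s < 0)
    return snprintf (buf, len, "%" PRIu64 "(as signed = %" PRId64 ")",
                     value, s);
  return snprintf (buf, len, "%" PRIu64, value);
}
```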

> So now I'm not sure anything needs to change.  At least dwarf_formsdata
> should stay as-is for gcc.

Are you sure? I think your original analysis is correct that
dwarf_formsdata () is wrong and really should sign-extend.
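
To make the difference concrete, here is a small sketch of the two
behaviours being discussed. Both functions are hypothetical models
written for this illustration, not the libdw implementation:
sdata_plain_cast stands for "read as unsigned and cast", and
sdata_sign_extended for extending from the width of the DW_FORM_dataN
form.

```c
#include <stdint.h>

/* Model of the behaviour described as current: the zero-extended
   value is simply cast (assumes two's complement).  */
static int64_t
sdata_plain_cast (uint64_t raw)
{
  return (int64_t) raw;
}

/* Model of the proposed behaviour: sign-extend from the NBYTES-wide
   form (1, 2, 4 or 8 bytes).  */
static int64_t
sdata_sign_extended (uint64_t raw, unsigned nbytes)
{
  unsigned bits = nbytes * 8;
  if (bits == 0 || bits >= 64)
    return (int64_t) raw;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  uint64_t low = raw & ((sign << 1) - 1);
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

For data8 the two agree (18446744073709551596 reads as -20 either way).
They differ only for the narrower forms: the data1 upper bound 199 from
the gfortran example stays 199 with the plain cast and becomes -57 when
sign-extended.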

>   We could conceivably use dwarf_formudata for
> DW_AT_decl_file/line/column, since those really are specified unsigned,
> but this is unlikely to ever make a difference.  The values for
> dwarf_decl_line/column are asserted 0..INT_MAX, and people with more
> than INT64_MAX files are already insane.

You are right. Still, using dwarf_formudata () would be more correct
IMHO.

Cheers,

Mark
