This is the mail archive of the mailing list for the elfutils project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/3] Simplify and inline get_uleb128 and get_sleb128

On 12/12/2013 04:13 AM, Petr Machata wrote:
> Josh Stone <> writes:
>> +static inline uint64_t
>> +__libdw_get_uleb128 (const unsigned char **addrp)
>> +{
>> +  uint64_t acc = 0;
>> +  get_uleb128_step (acc, *addrp, 0);
>> +  for (unsigned int i = 1; i < len_leb128(acc); ++i)
>> +    get_uleb128_step (acc, *addrp, i);
> Is there a reason not to use for (i = 0; ...) instead of the pre-step
> followed by a for (i = 1; ...)?  The only difference would be for
> len_leb128 of 0, where your code makes one step anyway, but that can't
> happen.  Then you could just expand get_uleb128_step inline, and get rid
> of it altogether.

Ah, yes, I should explain that a little.  I found that the code was
actually faster with the first step unrolled.  And you may have noticed
I didn't do this for sleb128, because I found that slightly slower.

It seems to work like an extra hint to the compiler.  There's already a
likely-return in the loop, but writing it this way seems to convince gcc
that the first step in particular is *really* likely to return.  Then
that step also gets to const-prop the zero.

I can only guess that the sign-extension for sleb128 is complicated
enough to nullify the gain of unrolling.

>> +#define get_uleb128(var, addr)						      \
>>    do {									      \
>> +    (var) = __libdw_get_uleb128(&(addr));				      \
>> +    (void)(var);							      \
>>    } while (0)
> Space before (...).  Space after type cast operator.
> Actually, could this actually be defined like this?
>         #define get_uleb128(var, addr) ((var) = __libdw_get_uleb128 (&(addr)))
> Then you wouldn't need the cast as the only set-but-not-used warnings
> would come from sites that actually set but not use, and those could be
> changed to simply call __libdw_get_uleb128.  But that's probably a
> logically separate change.

Right, there are two reasons I kept it this way.  One is the cast for
set-but-not-used, as you note, which didn't trigger before because the
former macros did both read and write the value in place.  I think there
was only one caller affected though, so I could just change that.  The
do-while keeps the macro as a statement, vs your expression, but maybe
that doesn't matter much.

>> +#define get_sleb128_step(var, addr, nth)				      \
>>    do {									      \
>> +    unsigned char __b = *(addr)++;					      \
>> +    if (likely ((__b & 0x80) == 0))					      \
>>        {									      \
>> +	   struct { signed int i:7; } __s = { .i = __b };			      \
>> +	   (var) |= (typeof (var)) __s.i << ((nth) * 7);			      \
> Oh, the bitfield trick is clever!

I should give credit, I found that trick here:

The former code was trying to sign-extend after the value was shifted
and combined, which as a variable width is harder.  I really like that
this way the compiler is fully aware that this is a sign extension,
rather than being a side effect of ORing bits or left-right shifts.

> This all looks fine to me.  I find the new code much clearer than the
> original, which seemed to wind every which way and was hard to follow.

I agree the original was convoluted, and I think it was actually broken
for some edge cases, though you almost have to look at it preprocessed
to make sense of it.  In particular, __libdw_get_sleb128 ORed with acc
in the final step, but only _v was kept updated throughout the loop.
(But it's unlikely any real DWARF had a 10-byte sleb128 to hit this.)

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]