[RFC PATCH 1/1] sframe: Represent FP without RA on stack
Indu Bhagat
indu.bhagat@oracle.com
Thu Apr 25 06:59:16 GMT 2024
On 4/23/24 08:44, Jens Remus wrote:
> Am 23.04.2024 um 01:58 schrieb Indu Bhagat:
>> On 4/22/24 08:58, Jens Remus wrote:
>>> If an architecture uses both SFrame RA and FP tracking SFrame assumes
>>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
>>> following the SFrame FRE. An architecture does not need to store both on
>>> the stack. SFrame cannot represent a FP without RA on stack, since it
>>> cannot distinguish whether the 2nd offset is the RA or FP offset.
>>>
>>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
>>> represent the FP being saved on the stack when the RA is not saved on
>>> the stack.
>>>
>>> include/
>>> * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
>>> the invalid RA offset value used to represent a dummy padding
>>> offset.
>>>
>>> gas/
>>> * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
>>> padding RA offset if FP without RA on stack.
>>> (sframe_get_fre_offset_size): Likewise.
>>> (output_sframe_row_entry): Write a dummy padding RA offset
>>> if FP without RA needs to be represented.
>>>
>>> libsframe/
>>> * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
>>> offsets as if they were undefined. Display them as "u*" to
>>> distinguish them.
>>>
>>> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
>>> ---
>>>
>>> Notes (jremus):
>>> This patch eliminates 497 occurrences of the warning "skipping
>>> SFrame
>>> FDE due to FP without RA on stack" for a build of glibc on
>>> s390x. For
>>> libc.so this increases the number of FDEs by 166 and the number of
>>> FREs by 861, while adding 337 dummy padding RA offsets. With a
>>> total
>>> of 28157 offsets the dummy padding offsets account for ~1.20 %
>>> of the
>>> offsets.
>>
>> While this increase seems small, it does look wasteful.
>>
>> An orthogonal question below...
>>
>>> SFrame statistics without patch:
>>> VALUE TOTAL MIN MAX AVG
>>> FDEs: 3478 - - -
>>> FREs/FDE: 14441 1 15 4
>>> Offsets/FDE: 28157 1 31 8
>>> 8-bit: 0 0 0 0
>>> 16-bit: 28157 1 31 8
>>> 32-bit: 0 0 0 0
>>> Offsets/FRE: 28157 1 3 1
>>> 8-bit: - 0 0 0
>>> 16-bit: - 1 3 1
>>> 32-bit: - 0 0 0
>>> SFrame statistics with patch applied:
>>> VALUE TOTAL MIN MAX AVG
>>> FDEs: 3644 - - -
>>> FREs/FDE: 15302 1 20 4
>>> Offsets/FDE: 29944 1 38 8
>>> 8-bit: 0 0 0 0
>>> 16-bit: 29944 1 38 8
>>> 32-bit: 0 0 0 0
>>> Offsets/FRE: 29944 1 3 1
>>> 8-bit: - 0 0 0
>>> 16-bit: - 1 3 1
>>> 32-bit: - 0 0 0
>>> O_Padd/FDE: 337 - - 0
>>> 8-bit: 0
>>> 16-bit: 337
>>> 32-bit: 0
>>> Note that on s390x the offsets are at minimum 16-bits in size,
>>> due to
>>> the mandatory CFA offset being at least 160.
>>>
>>
>> IIUC, all stack layouts supported in the ABI use the offset 160. Is
>> that right ? I am wondering if adjusting the stored offsets in the
>> SFrame section (by decrementing 160 from it) will work ?
>>
>> If yes, we could encode this constant in SFrame aux hdr bytes for s390x.
>
> Thank you for the hint! Using a constant adjustment of -160 on s390x for
> the CFA offset from CFA base register should work to allow for 8-bit
> offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA)
> signed anyway? Thus applying a constant adjustment should work in any case?
>
Yes, these offsets are signed. Applying the constant should work for CFA.
> Couldn't it simply be an architecture specific constant in the code to
> begin with? For example a new macro, which is only applied when defined?
>
> #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160
> #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT
>
I think a macro should also work in this case.
> Implementing this in the SFrame auxiliary header would of course allow
> to implement this enhancement at a later stage and to change the
> adjustment value in the future, as the linker can then either reject or
> merge different adjustment values during link editing.
>
> I wonder whether it would make sense to store the FP register number in
> the SFrame auxiliary header for s390x as well. Register 11 is just a
> convention of the compilers and not defined by the ABI. That would
> enable us to choose a different register as frame pointer in the future.
>
I think the problem will remain that there is ATM no way to communicate
this information to the assembler (that compiler used r11 as fp role).
And even if there was a way, I am not so sure. Since the ABI doesnt
mandate r11 as fp, the compiler may pick another register for say a
different stack layout etc, in the future ? IOW, even it picking
different fp register across functions is a possibility, no? So what is
expected of the compiler then...
More information about the Binutils
mailing list