[RFC PATCH 1/1] sframe: Represent FP without RA on stack

Indu Bhagat indu.bhagat@oracle.com
Thu Apr 25 06:59:16 GMT 2024


On 4/23/24 08:44, Jens Remus wrote:
> Am 23.04.2024 um 01:58 schrieb Indu Bhagat:
>> On 4/22/24 08:58, Jens Remus wrote:
>>> If an architecture uses both SFrame RA and FP tracking SFrame assumes
>>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
>>> following the SFrame FRE. An architecture does not need to store both on
>>> the stack. SFrame cannot represent a FP without RA on stack, since it
>>> cannot distinguish whether the 2nd offset is the RA or FP offset.
>>>
>>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
>>> represent the FP being saved on the stack when the RA is not saved on
>>> the stack.
>>>
>>> include/
>>>     * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
>>>     the invalid RA offset value used to represent a dummy padding
>>>     offset.
>>>
>>> gas/
>>>     * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
>>>     padding RA offset if FP without RA on stack.
>>>     (sframe_get_fre_offset_size): Likewise.
>>>     (output_sframe_row_entry): Write a dummy padding RA offset
>>>     if FP without RA needs to be represented.
>>>
>>> libsframe/
>>>     * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
>>>     offsets as if they were undefined. Display them as "u*" to
>>>     distinguish them.
>>>
>>> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
>>> ---
>>>
>>> Notes (jremus):
>>>      This patch eliminates 497 occurrences of the warning "skipping 
>>> SFrame
>>>      FDE due to FP without RA on stack" for a build of glibc on 
>>> s390x. For
>>>      libc.so this increases the number of FDEs by 166 and the number of
>>>      FREs by 861, while adding 337 dummy padding RA offsets. With a 
>>> total
>>>      of 28157 offsets the dummy padding offsets account for ~1.20 % 
>>> of the
>>>      offsets.
>>
>> While this increase seems small, it does look wasteful.
>>
>> An orthogonal question below...
>>
>>>      SFrame statistics without patch:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3478       -          -          -
>>>          FREs/FDE:    14441      1          15         4
>>>          Offsets/FDE: 28157      1          31         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    28157      1          31         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 28157      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>      SFrame statistics with patch applied:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3644       -          -          -
>>>          FREs/FDE:    15302      1          20         4
>>>          Offsets/FDE: 29944      1          38         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    29944      1          38         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 29944      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>          O_Padd/FDE:  337        -          -          0
>>>             8-bit:    0
>>>            16-bit:    337
>>>            32-bit:    0
>>>      Note that on s390x the offsets are at minimum 16-bits in size, 
>>> due to
>>>      the mandatory CFA offset being at least 160.
>>>
>>
>> IIUC, all stack layouts supported in the ABI use the offset 160. Is 
>> that right ? I am wondering if adjusting the stored offsets in the 
>> SFrame section (by decrementing 160 from it) will work ?
>>
>> If yes, we could encode this constant in SFrame aux hdr bytes for s390x.
> 
> Thank you for the hint! Using a constant adjustment of -160 on s390x for 
> the CFA offset from CFA base register should work to allow for 8-bit 
> offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) 
> signed anyway? Thus applying a constant adjustment should work in any case?
> 

Yes, these offsets are signed.  Applying the constant should work for CFA.

> Couldn't it simply be an architecture specific constant in the code to 
> begin with? For example a new macro, which is only applied when defined?
> 
> #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160
> #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT
> 

I think a macro should also work in this case.

> Implementing this in the SFrame auxiliary header would of course allow 
> to implement this enhancement at a later stage and to change the 
> adjustment value in the future, as the linker can then either reject or 
> merge different adjustment values during link editing.
> 
> I wonder whether it would make sense to store the FP register number in 
> the SFrame auxiliary header for s390x as well. Register 11 is just a 
> convention of the compilers and not defined by the ABI. That would 
> enable us to choose a different register as frame pointer in the future.
> 

I think the problem will remain that there is ATM no way to communicate 
this information to the assembler (that compiler used r11 as fp role). 
And even if there was a way, I am not so sure.  Since the ABI doesnt 
mandate r11 as fp, the compiler may pick another register for say a 
different stack layout etc, in the future ?  IOW, even it picking 
different fp register across functions is a possibility, no? So what is 
expected of the compiler then...




More information about the Binutils mailing list