[RFC PATCH v3 0/2] s390: Initial support to generate SFrame info from CFI directives in assembler

Indu Bhagat indu.bhagat@oracle.com
Thu Apr 25 07:54:41 GMT 2024


On 4/22/24 17:26, Indu Bhagat wrote:
> On 4/12/24 09:36, Jens Remus wrote:
>> As suggested by Indu I have split my RFC V2 patch series to add initial
>> support to generate SFrame information on s390x into:
>>
>> 1. A separate series consisting of all the generic preparatory cleanups,
>>     enhancements, and fixes to the generation of SFrame information
>>     in the assembler: "[PATCH v3 00/15] sframe: Enhancements to SFrame
>>     info generation".
>>
>> 2. This RFC series on top with the actual "s390: Initial support
>>     to generate SFrame info from CFI directives in assembler"
>>
>> Since the original patch series was at V2 I am sending both as V3.
>>
>>
>> This patch series adds initial support to the assembler on s390x to
>> generate SFrame stack trace information from CFI directives. It is
>> largely based on the respective AArch64 support.
>>
>> Please be aware that the SFrame support for s390x provided by patches
>> 1 and 2 of this series still has some open issues, which need to be
>> addressed.
>> Any ideas or assistance to overcome the current SFrame limitations
>> listed below and in the patch description are very welcome.
>>
>> Patch 1 adds initial support to the assembler on s390x to generate
>> SFrame stack trace information from CFI directives. Due to differences
>> in the s390x ELF ABI [1] compared to the AArch64 and x68 AMD64 ELF ABIs
>> and the simplified assumptions of the current SFrame format V2 there
>> are several unresolved issues (see also patch description):
>>
>> - GCC on s390x may save the frame pointer (FP; r11) and/or return
>>    address (RA; r14) registers in other registers (e.g. floating-point
>>    registers) instead of saving them on the stack. SFrame currently only
>>    supports FP/RA tracking on the stack.
>>    GCC only does so in leaf-functions. That are functions that do not
>>    call any other functions. Therefore they are guaranteed to only ever
>>    be the topmost function on the call stack.
>>    This is addressed by patch 2, which extends SFrame with limited
>>    support to represent FP/RA saved in another register on s390x.
>>
>> - The s390x ELF ABI does not designate a frame pointer register number.
>>    Currently GCC and clang mostly use register r11.
>>    This could be addressed by designating r11 as frame-pointer register
>>    in the s390x ELF ABI in a SFrame compatibility section.
>>
>> - GCC GCC be observed to use r14 as frame-pointer register in the stack
>>    clash protector in the function prologue. This guarantees that the
>>    function is toplevel on the call stack during this non-default frame-
>>    pointer use.
>>    This could be addressed by changing GCC to use the "default" frame-
>>    pointer register r11. Another option would be to extend SFrame
>>    with limited support to track a non-SP/FP CFA base register number on
>>    s390x.
>>
>> - Glibc uses non-default frame-pointer register r12 in two instances.
>>    This most likely can be addressed by chaning glibc to use the default
>>    frame-pointer register r11 instead.
>>
>> - Glibc mcount() / fentry() use non-default return-adress register r0.
>>    This cannot be changed, but the use of Linux Kernel perf together
>>    with profiling can be documented as incompatible on s390x.
>>
>> - GCC on s390x may copy the return address (RA) from register r14 to
>>    another register and even use that to return to the caller.
>>    Effectively this is just a specialization of the first bullet and
>>    is addressed by patch 2.
>>
>> - GCC may produce code and CFI where the FP is saved on the stack
>>    while the RA is not. SFrame currently cannot represent this.
>>    See the respective patch in my separate preparatory patch series.
>>    This can be addressed by enhancing SFrame to represent this case.
>>    I will send two alternative options to achieve this as RFC soon.
>>
> 
> Hi Jens,
> 
> For this issue I have taken a look at your two proposals:
>    - #R1 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding)
>    - #R2 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)
> 
> (Thanks a lot for working out those options in detail and providing the 
> patches, and stats where applicable).
> 
> I think #R1 is good in the way it keeps the changes isolated to s390 
> support but it is a bit wasteful to force an offset.  The latter 
> indicates limitation in SFrame representation (pointing to the need to 
> improve and not suffer by sacrificing space). As for #R2, IMO it makes 
> the format less future-proof: it will only allow upto 4 offsets.
> 
> Now an option indeed is to:
> 
> 1. Use your #R2 approach but keep it confined to s390.
> 
> We can use additional bits in the SFrame auxiliary header to indicate 
> the fact that for s390, the 4-bits in sframe_fre_info are "flags for 
> offsets" and not "Number of offsets". Hence keeping this limitation 
> confined to the s390x incarnation of the format.
> 
> Adding additional information in the the auxiliary header area: Using 
> sfh_auxhdr_len to define a 32-bit area (32-bit size so that FDEs 
> accesses do not cause unaligned memory accesses; not sure myself yet if 
> unaligned accesses are OK on s390 though).
> 
> Although I dont like the additional complexity of "adding info in the 
> the aux hdr to allow a different interpretation of an existing field 
> differently"; but I don't have a strong opinion against this.
> 
> 2. There is an unused bit in sframe_fre_info.  We could consider using 
> it (for s390x) to tag those FREs where RA has not been saved on stack, 
> but FP is.
> 
> I dont think this information can be encoded in SFrame FDE.  There can 
> be a function that does not save r14 in prologue, but does so later, 
> right ? If not, we also have the following:
> 
> There are unused bits in SFrame FDE (see sfde_func_info) that can be 
> used to indicate that an FDE does not save RA on stack, hence the second 
> offset, if present, will be FP.  Then we can use it in conjunction with 
> "RA-tracking-enabled".
> 
> I was thinking of suggesting to use the bit [(((sfde_func_info) >> 5) & 
> 0x1)] (i.e. the one currently used for Pauth key on aarch64).
> 
> We could instead also consider using one of the upper two bits 
> (sfde_func_info), which are currently unused, if this feature will be 
> useful for other arches.
> 

And ...

3. Similar to the padding solution, using SFrame FRE offset to store 
some metadata is also an option.  A rough sketch provided in the other 
thread.

(I am already of two minds about the padding solution: I initially found 
it wasteful, but overall it does fit well with the SFrame philosophy in 
that we dont lose the flexibility.)

> 
>> Patch 2 (new) extends SFrame with limited support to represent FP/RA
>> saved in another register on s390x. Functions may use this only while
>> they are the toplevel function on the call stack. That is for instance
>> in leaf-functions and in the function prologue and epilogue.
>>
>> [1] ELF ABI s390x Supplement:
>>      https://github.com/IBM/s390x-abi/releases
>>
>>
>> Changes v3 -> v2:
>> - New patch as noted above and in the patch notes.
>>
>> Changes v1 -> v2:
>> - Resolved a regression reported by Linaro-TCWG-CI on AArch64 in one of
>>    the generic ld SFrame test cases. The test case contained a
>>    .cfi_def_cfa directive, specifying a CFA base register number that is
>>    not necessarily a SFrame SP/FP register number on all architectures.
>>    This caused the changes from patch 6 to skip SFrame FDE generation on
>>    AArch64 (and s390x aswell) with a warning, causing the test case to
>>    fail.
>>
>> Thanks and regards,
>> Jens
> 
> 



More information about the Binutils mailing list