[RFC PATCH v3 0/2] s390: Initial support to generate SFrame info from CFI directives in assembler

Indu Bhagat indu.bhagat@oracle.com
Tue Apr 23 00:26:31 GMT 2024


On 4/12/24 09:36, Jens Remus wrote:
> As suggested by Indu I have split my RFC V2 patch series to add initial
> support to generate SFrame information on s390x into:
> 
> 1. A separate series consisting of all the generic preparatory cleanups,
>     enhancements, and fixes to the generation of SFrame information
>     in the assembler: "[PATCH v3 00/15] sframe: Enhancements to SFrame
>     info generation".
> 
> 2. This RFC series on top with the actual "s390: Initial support
>     to generate SFrame info from CFI directives in assembler"
> 
> Since the original patch series was at V2 I am sending both as V3.
> 
> 
> This patch series adds initial support to the assembler on s390x to
> generate SFrame stack trace information from CFI directives. It is
> largely based on the respective AArch64 support.
> 
> Please be aware that the SFrame support for s390x provided by patches
> 1 and 2 of this series still has some open issues, which need to be
> addressed.
> Any ideas or assistance to overcome the current SFrame limitations
> listed below and in the patch description are very welcome.
> 
> Patch 1 adds initial support to the assembler on s390x to generate
> SFrame stack trace information from CFI directives. Due to differences
> in the s390x ELF ABI [1] compared to the AArch64 and x68 AMD64 ELF ABIs
> and the simplified assumptions of the current SFrame format V2 there
> are several unresolved issues (see also patch description):
> 
> - GCC on s390x may save the frame pointer (FP; r11) and/or return
>    address (RA; r14) registers in other registers (e.g. floating-point
>    registers) instead of saving them on the stack. SFrame currently only
>    supports FP/RA tracking on the stack.
>    GCC only does so in leaf-functions. That are functions that do not
>    call any other functions. Therefore they are guaranteed to only ever
>    be the topmost function on the call stack.
>    This is addressed by patch 2, which extends SFrame with limited
>    support to represent FP/RA saved in another register on s390x.
> 
> - The s390x ELF ABI does not designate a frame pointer register number.
>    Currently GCC and clang mostly use register r11.
>    This could be addressed by designating r11 as frame-pointer register
>    in the s390x ELF ABI in a SFrame compatibility section.
> 
> - GCC GCC be observed to use r14 as frame-pointer register in the stack
>    clash protector in the function prologue. This guarantees that the
>    function is toplevel on the call stack during this non-default frame-
>    pointer use.
>    This could be addressed by changing GCC to use the "default" frame-
>    pointer register r11. Another option would be to extend SFrame
>    with limited support to track a non-SP/FP CFA base register number on
>    s390x.
> 
> - Glibc uses non-default frame-pointer register r12 in two instances.
>    This most likely can be addressed by chaning glibc to use the default
>    frame-pointer register r11 instead.
> 
> - Glibc mcount() / fentry() use non-default return-adress register r0.
>    This cannot be changed, but the use of Linux Kernel perf together
>    with profiling can be documented as incompatible on s390x.
> 
> - GCC on s390x may copy the return address (RA) from register r14 to
>    another register and even use that to return to the caller.
>    Effectively this is just a specialization of the first bullet and
>    is addressed by patch 2.
> 
> - GCC may produce code and CFI where the FP is saved on the stack
>    while the RA is not. SFrame currently cannot represent this.
>    See the respective patch in my separate preparatory patch series.
>    This can be addressed by enhancing SFrame to represent this case.
>    I will send two alternative options to achieve this as RFC soon.
> 

Hi Jens,

For this issue I have taken a look at your two proposals:
   - #R1 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding)
   - #R2 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)

(Thanks a lot for working out those options in detail and providing the 
patches, and stats where applicable).

I think #R1 is good in the way it keeps the changes isolated to s390 
support but it is a bit wasteful to force an offset.  The latter 
indicates limitation in SFrame representation (pointing to the need to 
improve and not suffer by sacrificing space). As for #R2, IMO it makes 
the format less future-proof: it will only allow upto 4 offsets.

Now an option indeed is to:

1. Use your #R2 approach but keep it confined to s390.

We can use additional bits in the SFrame auxiliary header to indicate 
the fact that for s390, the 4-bits in sframe_fre_info are "flags for 
offsets" and not "Number of offsets". Hence keeping this limitation 
confined to the s390x incarnation of the format.

Adding additional information in the the auxiliary header area: Using 
sfh_auxhdr_len to define a 32-bit area (32-bit size so that FDEs 
accesses do not cause unaligned memory accesses; not sure myself yet if 
unaligned accesses are OK on s390 though).

Although I dont like the additional complexity of "adding info in the 
the aux hdr to allow a different interpretation of an existing field 
differently"; but I don't have a strong opinion against this.

2. There is an unused bit in sframe_fre_info.  We could consider using 
it (for s390x) to tag those FREs where RA has not been saved on stack, 
but FP is.

I dont think this information can be encoded in SFrame FDE.  There can 
be a function that does not save r14 in prologue, but does so later, 
right ? If not, we also have the following:

There are unused bits in SFrame FDE (see sfde_func_info) that can be 
used to indicate that an FDE does not save RA on stack, hence the second 
offset, if present, will be FP.  Then we can use it in conjunction with 
"RA-tracking-enabled".

I was thinking of suggesting to use the bit [(((sfde_func_info) >> 5) & 
0x1)] (i.e. the one currently used for Pauth key on aarch64).

We could instead also consider using one of the upper two bits 
(sfde_func_info), which are currently unused, if this feature will be 
useful for other arches.


> Patch 2 (new) extends SFrame with limited support to represent FP/RA
> saved in another register on s390x. Functions may use this only while
> they are the toplevel function on the call stack. That is for instance
> in leaf-functions and in the function prologue and epilogue.
> 
> [1] ELF ABI s390x Supplement:
>      https://github.com/IBM/s390x-abi/releases
> 
> 
> Changes v3 -> v2:
> - New patch as noted above and in the patch notes.
> 
> Changes v1 -> v2:
> - Resolved a regression reported by Linaro-TCWG-CI on AArch64 in one of
>    the generic ld SFrame test cases. The test case contained a
>    .cfi_def_cfa directive, specifying a CFA base register number that is
>    not necessarily a SFrame SP/FP register number on all architectures.
>    This caused the changes from patch 6 to skip SFrame FDE generation on
>    AArch64 (and s390x aswell) with a warning, causing the test case to
>    fail.
> 
> Thanks and regards,
> Jens




More information about the Binutils mailing list