Bug 29543

Summary: [gdb/tdep, ppc] inferior call with long double vararg not handled correctly
Product: gdb Reporter: Tom de Vries <vries>
Component: tdepAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: cel, uweigand
Priority: P2    
Version: HEAD   
Target Milestone: 13.1   
Host: Target:
Build: Last reconfirmed:
Attachments: gzipped executable
tentative patch

Description Tom de Vries 2022-09-02 13:00:45 UTC
[ Spinoff from PR29247 comment 23. ]

On a powerpc system with gcc 12 built to default to 128-bit IEEE long double, I run into:
...
(gdb) print find_max_long_double_real(4, ldc1, ldc2, ldc3, ldc4)^M
$8 = 0 + 0i^M
(gdb) FAIL: gdb.base/varargs.exp: print find_max_long_double_real(4, ldc1, ldc2, ldc3, ldc4)
...

I've minimized this to:
...
$ cat gdb/testsuite/gdb.base/varargs.c
#include <stdarg.h>
#include <stdlib.h>

long double _Complex orig = 4.0L + 4.0Li;
long double _Complex copy = 0L + 0Li;

int
test (void)
{
  return 0;
}

void
do_copy (int num_vals, ...)
{
#if 1
  va_list argp;
  va_start (argp, num_vals);
  copy = va_arg (argp, long double _Complex);
#else
  copy = orig;
#endif
}

void
_start (void)
{
  test ();

  while (1)
    ;
}
...
and:
...
$ cat gdb/testsuite/gdb.base/varargs.exp
standard_testfile .c

set additional_flags {debug}
lappend additional_flags "additional_flags=-nostdlib"

if  { [gdb_compile "${srcdir}/${subdir}/${srcfile}" "${binfile}" \
           executable ${additional_flags}] != "" } {
     untested "failed to compile"
     return -1
}

clean_restart ${binfile}

gdb_test_no_output "set print sevenbit-strings"
gdb_test_no_output "set print address off"
gdb_test_no_output "set width 0"

runto test

gdb_test "print orig" ".*= 4 \\+ 4i"
gdb_test "print copy" ".*= 0 \\+ 0i"

gdb_test "print do_copy (1, orig)"

with_test_prefix "after copy" {
    gdb_test "print orig" ".*= 4 \\+ 4i"
    gdb_test "print copy" ".*= 4 \\+ 4i"
}
...
Comment 1 Tom de Vries 2022-09-02 13:07:03 UTC
Created attachment 14309 [details]
gzipped executable

Static executable, so no dynamic libs needed, which should make it easier to run on other systems:
...
$ file varargs 
varargs: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, BuildID[sha1]=8792d24c47498f2eee792c53a0136c6249c84c21, with debug_info, not stripped
...

Reproduce using:
...
$ gdb -q -batch ./varargs -ex "tbreak test" -ex run -ex "print do_copy (1, orig)" -ex "print copy"
Temporary breakpoint 1 at 0x1000018c: file /home/abuild/rpmbuild/BUILD/gdb-12.1/build-ppc64le-suse-linux/gdb/testsuite/../../../gdb/testsuite/gdb.base/varargs.c, line 10.

Temporary breakpoint 1, test () at /home/abuild/rpmbuild/BUILD/gdb-12.1/build-ppc64le-suse-linux/gdb/testsuite/../../../gdb/testsuite/gdb.base/varargs.c:10
10      /home/abuild/rpmbuild/BUILD/gdb-12.1/build-ppc64le-suse-linux/gdb/testsuite/../../../gdb/testsuite/gdb.base/varargs.c: No such file or directory.
$1 = void
$2 = 0 + 0i
...
Comment 2 Tom de Vries 2022-09-02 13:16:53 UTC
System specifics, in case that matters:
...
$ cat /proc/cpuinfo 
processor       : 0
cpu             : POWER9 (architected), altivec supported
clock           : 3000.000000MHz
revision        : 2.2 (pvr 004e 0202)
....
Comment 3 Tom de Vries 2022-09-02 13:26:14 UTC
I simplified the example to strip the _Complex, and I still got the same problem, so it's really related to "long double".
Comment 4 Tom de Vries 2022-09-02 15:14:51 UTC
(In reply to Tom de Vries from comment #3)
> I simplified the example to strip the _Complex, and I still got the same
> problem, so it's really related to "long double".

Looking at that variant, I added the following code:
...
void __attribute__((noinline))
test2 (void)
{
  do_copy (1, orig);
}
...
to get an example how code is generated, and compiled it at -O2 (adding noinline attributes for all functions) to make it more readable.

I get this:
...
0000000010000210 <test2>:
    10000210:   02 10 40 3c     lis     r2,4098
    10000214:   00 80 42 38     addi    r2,r2,-32768
    10000218:   a6 02 08 7c     mflr    r0
    1000021c:   00 00 00 60     nop
    10000220:   01 00 60 38     li      r3,1
    10000224:   00 80 22 39     addi    r9,r2,-32768
    10000228:   00 00 a9 e8     ld      r5,0(r9)
    1000022c:   08 00 c9 e8     ld      r6,8(r9)
    10000230:   10 00 01 f8     std     r0,16(r1)
    10000234:   a1 ff 21 f8     stdu    r1,-96(r1)
    10000238:   81 ff ff 4b     bl      100001b8 <do_copy+0x8>
    1000023c:   60 00 21 38     addi    r1,r1,96
    10000240:   10 00 01 e8     ld      r0,16(r1)
    10000244:   a6 03 08 7c     mtlr    r0
    10000248:   20 00 80 4e     blr
    1000024c:   00 00 00 00     .long 0x0
    10000250:   00 00 00 01     .long 0x1000000
    10000254:   80 00 00 00     .long 0x80
    10000258:   00 00 00 60     nop
    1000025c:   00 00 42 60     ori     r2,r2,0
...

AFAICT, the first arg is loaded into r3, and the second into r5/r6.

So, for some reason r4 is skipped.

However, when stepping through ppc64_sysv_abi_push_val I find that gdb pushes the value to r4/r5.  The code in do_copy then proceeds to read it from r5/r6, and this explains the failure.
Comment 5 Tom de Vries 2022-09-04 07:49:37 UTC
This fixes it for me:
...
$ git diff
diff --git a/gdb/ppc-sysv-tdep.c b/gdb/ppc-sysv-tdep.c
index 1fe81b95f6c..8d68f90df51 100644
--- a/gdb/ppc-sysv-tdep.c
+++ b/gdb/ppc-sysv-tdep.c
@@ -1444,7 +1444,7 @@ ppc64_sysv_abi_push_param (struct gdbarch *gdbarch,
          == floatformats_ieee_quad))
     {
       /* IEEE FLOAT128, args in vector registers.  */
-      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 0, argpos);
+      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 16, argpos);
       ppc64_sysv_abi_push_vreg (gdbarch, val, argpos);
     }
   else if (type->code () == TYPE_CODE_FLT
...
Comment 6 Tom de Vries 2022-09-04 07:56:45 UTC
(In reply to Tom de Vries from comment #5)
> This fixes it for me:
> ...
> $ git diff
> diff --git a/gdb/ppc-sysv-tdep.c b/gdb/ppc-sysv-tdep.c
> index 1fe81b95f6c..8d68f90df51 100644
> --- a/gdb/ppc-sysv-tdep.c
> +++ b/gdb/ppc-sysv-tdep.c
> @@ -1444,7 +1444,7 @@ ppc64_sysv_abi_push_param (struct gdbarch *gdbarch,
>           == floatformats_ieee_quad))
>      {
>        /* IEEE FLOAT128, args in vector registers.  */
> -      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 0, argpos);
> +      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 16,
> argpos);
>        ppc64_sysv_abi_push_vreg (gdbarch, val, argpos);
>      }
>    else if (type->code () == TYPE_CODE_FLT
> ...

Hmm, we still had 16 in v2 ( PR29247 comment 16 ), but it became 0 in v3 ( PR29247 comment 18 ).
Comment 7 Ulrich Weigand 2022-09-05 16:23:44 UTC
You're right, this should specify a 16-byte alignment.

Note it would appear the same problem also exists when handling homogeneous (ELFv2) or single-element (ELFv1) aggregates: these also need 16-byte alignment if the base type is a vector or _Float128 type.
Comment 8 Tom de Vries 2022-09-06 05:44:11 UTC
(In reply to Ulrich Weigand from comment #7)
> Note it would appear the same problem also exists when handling homogeneous
> (ELFv2) or single-element (ELFv1) aggregates: these also need 16-byte
> alignment if the base type is a vector or _Float128 type.

Can you or Carl propose a patch for this additional problem?

Alternatively, can you show code examples that fail due this problem?
Comment 9 Ulrich Weigand 2022-09-07 12:10:24 UTC
(In reply to Tom de Vries from comment #8)
> (In reply to Ulrich Weigand from comment #7)
> > Note it would appear the same problem also exists when handling homogeneous
> > (ELFv2) or single-element (ELFv1) aggregates: these also need 16-byte
> > alignment if the base type is a vector or _Float128 type.
> 
> Can you or Carl propose a patch for this additional problem?

It's just a few lines further down in the same function, in the final else branch:

  else
    {
      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 0, argpos);
[...]

followed by either ppc64_sysv_abi_push_vreg or ppc64_sysv_abi_push_freg.

The problem is that alignment of 0 is only correct in those cases where ppc64_sysv_abi_push_freg is called - in those cases where ppc64_sysv_abi_push_vreg is called, we also need to use 16 byte alignment here.

> Alternatively, can you show code examples that fail due this problem?

I haven't tried, but the original test case in the bugzilla should show this problem if you replace the "long double _Complex" with a single-element struct with just this type (or just "long double") as element.
Comment 10 Tom de Vries 2022-09-07 13:46:09 UTC
Created attachment 14320 [details]
tentative patch

(In reply to Ulrich Weigand from comment #9)
> (In reply to Tom de Vries from comment #8)
> > (In reply to Ulrich Weigand from comment #7)
> > > Note it would appear the same problem also exists when handling homogeneous
> > > (ELFv2) or single-element (ELFv1) aggregates: these also need 16-byte
> > > alignment if the base type is a vector or _Float128 type.
> > 
> > Can you or Carl propose a patch for this additional problem?
> 
> It's just a few lines further down in the same function, in the final else
> branch:
> 
>   else
>     {
>       ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 0, argpos);
> [...]
> 
> followed by either ppc64_sysv_abi_push_vreg or ppc64_sysv_abi_push_freg.
> 
> The problem is that alignment of 0 is only correct in those cases where
> ppc64_sysv_abi_push_freg is called - in those cases where
> ppc64_sysv_abi_push_vreg is called, we also need to use 16 byte alignment
> here.
> 

Like this?

I've also added a test-case, but I haven't been able to try this on ppc64le yet.
Comment 11 Carl E Love 2022-09-07 15:33:56 UTC
I got back from vacation yesterday.  I started looking at this to see what Ulrich was talking about.  I was able to reproduce the initial error and verify the change that Tom proposed fixes the issue.  I can try the lasted patch from Tom to see if I can get the test case for the second issue that Ulrich mentioned to fail and then verify if Tom's changes fix it.
Comment 12 Tom de Vries 2022-09-07 17:16:05 UTC
(In reply to Carl E Love from comment #11)
> I got back from vacation yesterday. 

Welcome back :)

> I started looking at this to see what
> Ulrich was talking about.  I was able to reproduce the initial error and
> verify the change that Tom proposed fixes the issue.  I can try the lasted
> patch from Tom to see if I can get the test case for the second issue that
> Ulrich mentioned to fail and then verify if Tom's changes fix it.

Ok, waiting for your feedback then, thanks.
Comment 13 Carl E Love 2022-09-07 18:22:49 UTC
I ran the following on a Power 10 system with:
Fedora release 36 (Thirty Six)
gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-1)


Initially with the current gdb tree, the top commit was:

commit d647c797b7386dcaca5984a3fbe9fba469771737 (origin/master, origin/HEAD)
Author: Aaron Merey <amerey@redhat.com>
Date:   Fri Sep 2 19:01:56 2022 -0400

    Add debuginfod support for objdump -S
   ...

The   make check RUNTESTFLAGS='GDB=/home/carll/bin/gdb gdb.base/varargs.exp ' test was used.

The results from the test in gdb/testsuite/gdb.log were:

gdb) FAIL: gdb.base/varargs.exp: print find_max_long_double_real(4, ldc1, ldc2, ldc3, ldc4)
testcase /home/carll/GDB/binutils-gdb-bug-29247/gdb/testsuite/gdb.base/varargs.exp completed in 0 seconds

                === gdb Summary ===

# of expected passes            10
# of unexpected failures        1

gdb reported one failure as expected.

-----------------

From Tom's patch, I added just the gdb code change:

--- a/gdb/ppc-sysv-tdep.c
+++ b/gdb/ppc-sysv-tdep.c
@@ -1444,7 +1444,7 @@ ppc64_sysv_abi_push_param (struct gdbarch *gdbarch,
 	  == floatformats_ieee_quad))
     {
       /* IEEE FLOAT128, args in vector registers.  */
-      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 0, argpos);
+      ppc64_sysv_abi_push_val (gdbarch, val, TYPE_LENGTH (type), 16, argpos);
       ppc64_sysv_abi_push_vreg (gdbarch, val, argpos);
     }

The gdb test now passes with:

               === gdb Summary ===

# of expected passes            11

as expected.

---------------------

Add just the testcase changes for the long double _Complex case, i.e. no additional gdb functional changes.  The results of the run were:


(gdb) FAIL: gdb.base/varargs.exp: print find_max_struct_long_double_real(4, sldc1, sldc2, sldc3, sldc4)
testcase /home/carll/GDB/binutils-gdb-bug-29247/gdb/testsuite/gdb.base/varargs.exp completed in 1 seconds

                === gdb Summary ===

# of expected passes            11
# of unexpected failures        1

Which generates the error as Ulrich expected.

----------------------

Added the additional fixes from Tom's patch to address the complex case.  The 
results of the gdb test was:


               === gdb Summary ===

# of expected passes            12

With Tom's additional fixes, the updated testcase also passes.


Tom's patch works with elf v2 and the gcc version 12 built to default to 128-bit IEEE long double.
Comment 14 Tom de Vries 2022-09-12 09:05:17 UTC
(In reply to Carl E Love from comment #13)
> Tom's patch works with elf v2 and the gcc version 12 built to default to
> 128-bit IEEE long double.

Thanks for testing, I've submitted here ( https://sourceware.org/pipermail/gdb-patches/2022-September/191778.html ).
Comment 16 Carl E Love 2022-09-16 16:38:54 UTC
FYI, the test is now working fine on the IBM nightly gdb buildbots.  Everything looks good.  Thanks.
Comment 17 Carl E Love 2022-11-15 00:55:39 UTC
Tom:

I believe the issue is fixed and resolved.  Can the issue be closed?