How to tell PPC assembler VSX is available?

Sun Mar 11 03:44:00 GMT 2018

On 3/10/18 6:21 PM, Jeffrey Walton wrote:
> $ g++ -DTEST_MAIN -g2 -O3 -mcpu=power8 sha256-p8.cxx -o sha256-p8.exe
> /home/noloader/tmp/ccbDnfFr.s: Assembler messages:
> /home/noloader/tmp/ccbDnfFr.s:758: Error: operand out of range (32 is
> not between 0 and 31)
> /home/noloader/tmp/ccbDnfFr.s:983: Error: operand out of range (48 is
> not between 0 and 31)

Works for me on gcc112.  Is the test case you posted actually the code
that is failing?  If it is, can you compile with -S and attach the
entire assembly file?

[bergner@gcc2-power8 ~]$ which gcc as ld
/usr/bin/gcc
/usr/bin/as
/usr/bin/ld
[bergner@gcc2-power8 ~]$ cat sha256-p8.cxx
typedef unsigned char uint8_t;
typedef __vector unsigned int  uint32x4_p8;
uint32x4_p8 VEC_XL_BE(const uint8_t* data, int offset)
{
  uint32x4_p8 res;
  __asm(" lxvd2x  %x0, %1, %2    \n\t"
        : "=wa" (res)
        : "g" (data), "g" (offset));
  return res;
}
[bergner@gcc2-power8 ~]$ g++ -S -O3 -mcpu=power8 sha256-p8.cxx
[bergner@gcc2-power8 ~]$ cat sha256-p8.s
[snip]
_Z9VEC_XL_BEPKhi:
	 lxvd2x  34, 3, 4
	blr

> According to IBM's docs at [1], -mcpu=power8 is the correct option;
> and it enables other options, like -mvsx. Enabling other options in
> turn, like -maltivec and -mvsx, does not help.

Yes, -mcpu=power8 implicitly enables -maltivec and -mvsx, so you don't
need to add them.  That said, the error you're getting above is not a
compiler error, but an error from the assembler saying one of the
registers is out of range.  I'd like to see the instruction that it
is complaining about.

> typedef __vector unsigned int  uint32x4_p8;
> ...
> 
> uint32x4_p8 VEC_XL_BE(const uint8_t* data, int offset)
> {
> #if defined(__xlc__) || defined(__xlC__)
>   return (uint32x4_p8)vec_xl_be(offset, (uint8_t*)data);
> #else
>   uint32x4_p8 res;
>   __asm(" lxvd2x  %x0, %1, %2    \n\t"
>         : "=wa" (res)
>         : "g" (data), "g" (offset));
>   return res;
> 
> #endif
> }

Note that the "g" constraints you're using for (data) and (offset)
really should be: "b" (data), "r" (offset)
The "b" on (data) is the important one, because it tells the compiler
not to use r0 for %1.  That matters because "lxvd2x RD,RA,RB" treats
RA = r0 as the constant zero rather than the value contained in r0.
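
For reference, here is a rough sketch of the function with the constraints
changed as described above.  Only the "b"/"r" change comes from the
explanation; the "memory" clobber, the cast of offset to long, the assert,
and the memcpy fallback for non-POWER builds are assumptions added so the
sketch is self-contained, and they are not part of the original test case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__powerpc64__) && defined(__VSX__)
typedef __vector unsigned int uint32x4_p8;

// Load 16 bytes from data + offset using lxvd2x.
inline uint32x4_p8 VEC_XL_BE(const uint8_t* data, int offset)
{
  assert(data != nullptr);
  uint32x4_p8 res;
  // "b": base register, i.e. any GPR except r0, since r0 in the RA
  //      slot is read as the constant zero.
  // "r": any GPR; r0 is acceptable for the index operand.
  // Assumptions beyond the constraint fix: offset is widened to long
  // so the whole 64-bit register is defined, and "memory" tells the
  // compiler that the asm reads through the pointer.
  __asm__ ("lxvd2x %x0, %1, %2"
           : "=wa" (res)
           : "b" (data), "r" ((long)offset)
           : "memory");
  return res;
}
#else
// Hypothetical fallback so the sketch also builds on non-POWER hosts.
// It is a plain 16-byte copy and does not model lxvd2x element order.
struct uint32x4_p8 { uint32_t w[4]; };

inline uint32x4_p8 VEC_XL_BE(const uint8_t* data, int offset)
{
  assert(data != nullptr);
  uint32x4_p8 res;
  std::memcpy(&res, data + offset, sizeof(res));
  return res;
}
#endif
```

With "b" the compiler is free to pick any general register other than r0
for the base address, so the generated lxvd2x should never end up with 0 in
the RA slot by accident.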

Peter