This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/14] [x86_64] Vector math functions (vector cos)


On Thu, 30 Apr 2015, Andrew Senkevich wrote:

> +   Copyright (C) 2014 Free Software Foundation, Inc.

2014-2015 in all new files.

> +ENTRY(_ZGVbN2v_cos_sse4)
> +/* ALGORITHM DESCRIPTION:
> + *
> + *      ( low accuracy ( < 4ulp ) or enhanced performance ( half of correct
> + *      mantissa ) implementation )
> + *
> + *      Argument representation:
> + *      arg + Pi/2 = (N*Pi + R)
> + *
> + *      Result calculation:
> + *      cos(arg) = sin(arg+Pi/2) = sin(N*Pi + R) = (-1)^N * sin(R)
> + *      sin(R) is approximated by corresponding polynomial
> + */
> +        pushq     %rbp
> +        movq      %rsp, %rbp
> +        andq      $-64, %rsp
> +        subq      $320, %rsp

Stack adjustments should have CFI that accurately describes how to unwind 
the stack on any instruction boundary (try stepping by instruction in GDB 
and making sure you can get an accurate backtrace after every step).  
Probably something like

pushq %rbp
cfi_adjust_cfa_offset (8)
cfi_rel_offset (%rbp, 0)
movq %rsp, %rbp
cfi_def_cfa_register (%rbp)

(having defined %rbp to be the CFA, after that %rsp adjustments don't need 
further CFI)

...

movq %rbp, %rsp
cfi_def_cfa_register (%rsp)
popq %rbp
cfi_adjust_cfa_offset (-8)
cfi_restore (%rbp)

(but since you jump past code like that, you need to save and restore CFI 
state appropriately for that).

If you save, clobber and restore any call-preserved registers, CFI is also 
needed for that.  Again, you need to take care about jumps to ensure the 
CFI state is correct at each label that may be a jump target.

> +ENTRY(_ZGVcN4v_cos)
> +        pushq          %rbp
> +        movq           %rsp, %rbp
> +        andq           $-32, %rsp
> +        subq           $32, %rsp
> +        vextractf128   $1, %ymm0, (%rsp)
> +        vzeroupper
> +        call           _ZGVbN2v_cos@PLT
> +        vmovapd                %xmm0, 16(%rsp)
> +        vmovaps                (%rsp), %xmm0
> +        call           _ZGVbN2v_cos@PLT

Does this pass testing (specifically, the localplt test in the elf/ 
directory)?  I'd expect you to need to use HIDDEN_JUMPTARGET when calling 
any symbol exported from libmvec (together with libm_hidden_def / 
libm_hidden_weak on the definition of the function called, if not already 
present).

If it passes despite there being unnecessary local PLT references in 
libmvec, that points to something missing from the earlier patches: 
libmvec should be included in the libraries for which the localplt test 
runs.

> +/* Data table for vector implementations of function cos.
> + * The table may contain polynomial, reduction, lookup
> + * coefficients and other constants obtained through different
> + * methods of research and experimental work.

Comments in glibc don't use an initial '*' on each line after the first.

-- 
Joseph S. Myers
joseph@codesourcery.com


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]