This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/14] [x86_64] Vector math functions (vector cos)
- From: Joseph Myers <joseph at codesourcery dot com>
- To: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- Cc: libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 30 Apr 2015 16:06:10 +0000
- Subject: Re: [PATCH 1/14] [x86_64] Vector math functions (vector cos)
- Authentication-results: sourceware.org; auth=none
- References: <CAMXFM3t+v7zYEmvWQxV9aRmJB6Ag2hQgQ56cZJTDy=siRBszDw at mail dot gmail dot com>
On Thu, 30 Apr 2015, Andrew Senkevich wrote:
> + Copyright (C) 2014 Free Software Foundation, Inc.
2014-2015 in all new files.
> +ENTRY(_ZGVbN2v_cos_sse4)
> +/* ALGORITHM DESCRIPTION:
> + *
> + * ( low accuracy ( < 4ulp ) or enhanced performance ( half of correct
> + * mantissa ) implementation )
> + *
> + * Argument representation:
> + * arg + Pi/2 = (N*Pi + R)
> + *
> + * Result calculation:
> + * cos(arg) = sin(arg+Pi/2) = sin(N*Pi + R) = (-1)^N * sin(R)
> + * sin(R) is approximated by corresponding polynomial
> + */
> + pushq %rbp
> + movq %rsp, %rbp
> + andq $-64, %rsp
> + subq $320, %rsp
Stack adjustments should have CFI that accurately describes how to unwind
the stack on any instruction boundary (try stepping by instruction in GDB
and making sure you can get an accurate backtrace after every step).
Probably something like
pushq %rbp
cfi_adjust_cfa_offset (8)
cfi_rel_offset (%rbp, 0)
movq %rsp, %rbp
cfi_def_cfa_register (%rbp)
(having defined %rbp to be the CFA, after that %rsp adjustments don't need
further CFI)
...
movq %rbp, %rsp
cfi_def_cfa_register (%rsp)
popq %rbp
cfi_adjust_cfa_offset (-8)
cfi_restore (%rbp)
(but since you jump past code like that, you need to save and restore CFI
state appropriately for that).
If you save, clobber and restore any call-preserved registers, CFI is also
needed for that. Again, you need to take care about jumps to ensure the
CFI state is correct at each label that may be a jump target.
> +ENTRY(_ZGVcN4v_cos)
> + pushq %rbp
> + movq %rsp, %rbp
> + andq $-32, %rsp
> + subq $32, %rsp
> + vextractf128 $1, %ymm0, (%rsp)
> + vzeroupper
> + call _ZGVbN2v_cos@PLT
> + vmovapd %xmm0, 16(%rsp)
> + vmovaps (%rsp), %xmm0
> + call _ZGVbN2v_cos@PLT
Does this pass testing (specifically, the localplt test in the elf/
directory)? I'd expect you to need to use HIDDEN_JUMPTARGET when calling
any symbol exported from libmvec (together with libm_hidden_def /
libm_hidden_weak on the definition of the function called, if not already
present).
If it passes despite there being unnecessary local PLT references in
libmvec, that points to something missing from the earlier patches:
libmvec should be included in the libraries for which the localplt test
runs.
> +/* Data table for vector implementations of function cos.
> + * The table may contain polynomial, reduction, lookup
> + * coefficients and other constants obtained through different
> + * methods of research and experimental work.
Comments in glibc don't use an initial '*' on each line after the first.
--
Joseph S. Myers
joseph@codesourcery.com