This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Patch updated to reflect recommended changes. Okay for trunk? 2017-10-19 Michael Collison <michael.collison@arm.com> * sysdeps/aarch64/fpu/e_sqrt.c (ieee754_sqrt): Replace asm statements with __builtin_sqrt. * sysdeps/aarch64/fpu/e_sqrtf.c (ieee754_sqrtf): Replace asm statements with __builtin_sqrtf. * sysdeps/aarch64/fpu/s_ceil.c (__ceil): Replace asm statements with __builtin_ceil. * sysdeps/aarch64/fpu/s_ceilf.c (__ceilf): Replace asm statements with __builtin_ceilf. * sysdeps/aarch64/fpu/s_floor.c (__floor): Replace asm statements with __builtin_floor. * sysdeps/aarch64/fpu/s_floorf.c (__floorf): Replace asm statements with __builtin_floorf. * sysdeps/aarch64/fpu/s_fma.c (__fma): Replace asm statements with __builtin_fma. * sysdeps/aarch64/fpu/s_fmaf.c (__fmaf): Replace asm statements with __builtin_fmaf. * sysdeps/aarch64/fpu/s_fmax.c (__fmax): Replace asm statements with __builtin_fmax. * sysdeps/aarch64/fpu/s_fmaxf.c (__fmaxf): Replace asm statements with __builtin_fmaxf. * sysdeps/aarch64/fpu/s_fmin.c (__fmin): Replace asm statements with __builtin_fmin. * sysdeps/aarch64/fpu/s_fminf.c (__fminf): Replace asm statements with __builtin_fminf. * sysdeps/aarch64/fpu/s_frint.c: Delete file. * sysdeps/aarch64/fpu/s_frintf.c: Delete file. * sysdeps/aarch64/fpu/s_llrint.c (__llrint): Replace asm statements with builtin and neon intrinsic. * sysdeps/aarch64/fpu/s_llrintf.c (__llrintf): Ditto. * sysdeps/aarch64/fpu/s_llround.c (__llround): Replace asm statements with neon intrinsic. * sysdeps/aarch64/fpu/s_llroundf.c (__llroundf): Ditto. * sysdeps/aarch64/fpu/s_lrint.c (__lrint): Replace asm statements with builtin and neon intrinsic. * sysdeps/aarch64/fpu/s_lrintf.c (__lrintf): Ditto. * sysdeps/aarch64/fpu/s_lround.c (__lround): Replace asm statements with neon intrinsic vcvtad_s64_f64. * sysdeps/aarch64/fpu/s_lroundf.c (__lroundf): Replace asm statements with neon intrinsic vcvtas_s32_f32. * sysdeps/aarch64/fpu/s_nearbyint.c (__nearbyint): Replace asm statements with __builtin_nearbyint. * sysdeps/aarch64/fpu/s_nearbyintf.c (__nearbyintf): Replace asm statements with __builtin_nearbyintf. * sysdeps/aarch64/fpu/s_rint.c (__rint): Replace asm statements with __builtin_rint. * sysdeps/aarch64/fpu/s_rintf.c (__rintf): Replace asm statements with __builtin_rintf. * sysdeps/aarch64/fpu/s_round.c (__round): Replace asm statements with __builtin_round. * sysdeps/aarch64/fpu/s_roundf.c (__roundf): Replace asm statements with __builtin_roundf. * sysdeps/aarch64/fpu/s_trunc.c (__trunc): Replace asm statements with __builtin_trunc. * sysdeps/aarch64/fpu/s_truncf.c (__truncf): Replace asm statements with __builtin_truncf. * sysdeps/aarch64/fpu/Makefile: Build e_sqrt[f].c with -fno-math-errno. -----Original Message----- From: Szabolcs Nagy Sent: Tuesday, October 17, 2017 9:15 AM To: Adhemerval Zanella <adhemerval.zanella@linaro.org>; libc-alpha@sourceware.org; Michael Collison <Michael.Collison@arm.com> Cc: nd <nd@arm.com> Subject: Re: [PATCH] aarch64: Implement math acceleration via builtins On 13/10/17 23:36, Adhemerval Zanella wrote: > On 13/10/2017 18:26, Michael Collison wrote: >> This patch converts asm statements into builtin and neon intrinsics for AArch64. >> As an example for the file sysdeps/aarch64/fpu/_s_ceil.c, we convert >> the function from >> >> double >> __ceil (double x) >> { >> double result; >> asm ("frintp\t%d0, %d1" : >> "=w" (result) : "w" (x) ); >> return result; >> } >> >> into >> >> double >> __ceil (double x) >> { >> return __builtin_ceil (x); >> } >> ... > Looks good in general with some comments below. Is there any > advantage of the refactor besides avoid inline assembly? i think this is a code clean up patch (the generated code should not change other than temp register numbers), it avoids the current preprocessor hackery and inline asm. > Since you refactoring it, I would also suggest to use builtin for the > aarch64 match_private.h sqrt and get/set fpcr at fpu_control.h. > note that's no longer a code clean up patch: e.g. gcc does not know the cost of inline asm so it cannot schedule it optimally, but it has detailed model of the builtins so using them in headers should be an improvement. Wilco has some sqrt patches, so sqrt can be left alone (if those patches don't progress then it can be fixed later) https://sourceware.org/ml/libc-alpha/2017-09/msg00958.html https://sourceware.org/ml/libc-alpha/2017-09/msg00959.html https://sourceware.org/ml/libc-alpha/2017-09/msg00960.html i think using builtins for fpcr/fpsr is a good idea. __builtin_aarch64_get_fpcr __builtin_aarch64_set_fpcr __builtin_aarch64_get_fpsr __builtin_aarch64_set_fpsr please check if earliest supported gcc (4.9) have them, i think fixing this can be a separate patch. >> +long long int >> +__llrint (double x) >> +{ >> + double r = __builtin_rint (x); >> + >> + // Prevent gcc from calling lrint directly when compiled with -fno-math-errno >> + // by inserting a barrier >> + >> + math_opt_barrier (r); ... > > Also I think the comment line is too long and I am not sure if it is > usual to use '//' comments (usually we use default '/*' '*/' pairs). > please fix the comment style. >> +long int >> +__lrint (double x) >> { >> - OTYPE result; >> - ITYPE temp; >> >> #if IREG_SIZE == 64 && OREG_SIZE == 32 >> + long int result; >> + >> if (__builtin_fabs (x) > INT32_MAX) >> { >> /* Converting large values to a 32 bit int may cause the frintx/fcvtza >> @@ -96,10 +74,14 @@ __CONCATX(__,FUNC) (ITYPE x) >> return result; >> } >> #endif >> - asm ( "frintx" "\t%" IREGS "1, %" IREGS "2\n\t" >> - "fcvtzs" "\t%" OREGS "0, %" IREGS "1" >> - : "=r" (result), "=w" (temp) : "w" (x) ); >> - return result; >> + >> + double r = __builtin_rint (x); >> + >> + // Prevent gcc from calling lrintf directly when compiled with -fno-math-errno >> + // by inserting a barrier >> + >> + math_opt_barrier (r); >> + return r; >> } > > Same comment as before about math_opt_barrier and the comments seems off (it is > referring to lrintf instead of lrint). > please fix the comment.
Attachment:
gnutools-1925-rebase-v13.patch
Description: gnutools-1925-rebase-v13.patch
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |