[PATCH] Fixes tree-loop-distribute-patterns issues
Carlos O'Donell
carlos@redhat.com
Wed Jun 19 16:39:00 GMT 2013
On 06/19/2013 10:27 AM, Adhemerval Zanella wrote:
> On 18-06-2013 17:56, Roland McGrath wrote:
>> What was discussed before was using a configure check. In general it's
>> better to use empirical checks rather than version checks, even if in
>> practice a version check is going to suffice. In this case, it does not
>> seem difficult to write a configure check.
>>
>> Other macros of this sort we have are defined in lowercase
>> (internal_function, etc.). Some of those have names indicative of their
>> important semantics rather than echoing the syntax of the implementation.
>> I think we should use such a name for this case too. I haven't thought of
>> a particularly good name, but I don't think it should include "tree",
>> "distribute", or "patterns".
>>
>> Attributes can appear in function definitions, right before the return
>> type. There's no need for a separate forward declaration.
>>
>> The most obvious place we need these annotations is on the simple_*
>> functions in string/test-*.c, so I would start by adding it to all those.
>> That seems like the right initial set that goes in the change that
>> introduces the macro and serves as implicit documentation of its use.
>> Then we can take on each actual implementation that might be affected.
>>
>>
>> Thanks,
>> Roland
>>
> Thanks for the reviews, I change it to use a configure check instead and
> changed the compiler attribute name a simpler one. I also added the attribute
> on the string/test-*.c as you suggested. I put all the changes on one patch
> since the string/memset.c and string/memmove.c modification are required so
> loader can work correctly.
Thanks, this is looking better and is exactly what I was thinking of.
We check for the feature, set a feature macro, and avoid relying on the
compiler version.
> Tested on PPC64/PPC32 with -mcpu=powerpc and -mcpu=powerpc64.
>
> ---
>
> 2013-06-19 Adhemerval Zanella <azanella@linux.vnet.ibm.com>
>
> * config.h.in (HAVE_CC_PREDEF_DISTRIBUTE_PATTERNS): New define.
> * configure.in (libc_cv_predef_loop_distribute_patttern): Check if
> compiler plus optimization may generate memset/memmove library calls
> for some loop constructions.
> * include/libc-symbols.h (libc_disable_loop_to_calls): Define to
> disable loop transformation to library calls.
> * string/memmove.c (MEMMOVE): Disable loop transformation to avoid
> recursive call.
> * string/memset.c (memset): Likewise.
> * string/test-memmove.c (simple_memmove): Disable loop transformation
> to library calls.
> * string/test-memset.c (simple_memset): Likewise.
> * configure: Regenerated.
>
> --
>
> diff --git a/config.h.in b/config.h.in
> index 8c2479e..f598e82 100644
> --- a/config.h.in
> +++ b/config.h.in
> @@ -69,6 +69,9 @@
> /* Define if the compiler supports __builtin_memset. */
> #undef HAVE_BUILTIN_MEMSET
>
> +/* Define if compiler implicitly defines -ftree-loop-distribute-patterns. */
> +#undef HAVE_CC_PREDEF_DISTRIBUTE_PATTERNS
As Roland said, this name is about the implementation not the operation
that is going on.
Suggest: HAVE_CC_LOOP_TO_FUNCTION
> +
> /* Define if the regparm attribute shall be used for local functions
> (gcc on ix86 only). */
> #undef USE_REGPARMS
> diff --git a/configure b/configure
> index 09d8336..7a718b4 100755
> --- a/configure
> +++ b/configure
Please don't include generated files in the diff.
> --- a/configure.in
> +++ b/configure.in
> @@ -1964,6 +1964,29 @@ if test -n "$submachine"; then
> fi
> AC_SUBST(libc_cv_cc_submachine)
>
> +dnl Check whether the compiler may replace some loop constructions
> +dnl by memset/memmove calls.
Suggest:
Check wheter the compiler may replace some loop constructions
by function calls to memset or memmove. This replacement is
problematic since it may lead to recursive calls in the
implementation of memset or memmove. We use this check to
detect the optimization and disable it if required.
> +AC_CACHE_CHECK([whether $CC $CFLAGS implicitly replaces loops constructions \
> +by function calls],
> + libc_cv_predef_loop_distribute_pattterns, [
Suggest `libc_cv_cc_loop_to_function'
> + cat > conftest.c << EOF
> +void foo (int *vec, int size) {
> + int i=0;
> + int *p = vec;
> + for (i=0; i<size; ++i, ++p)
> + *p = 0;
> +}
> +EOF
I'm a little worried about the fragility of the test, but if it doesn't
optimize to memset, then it may not optimize the internal implementation
to memset, and therefore we need not avoid the benefit of the optimization.
I think this is a good approach, but we will have to see how it plays out
and if it is robust enough.
> +libc_cv_predef_loop_distribute_pattterns=no
> + if ${CC-cc} $CFLAGS -c -S conftest.c -o - | fgrep "memset" > /dev/null; then
> + libc_cv_predef_loop_distribute_pattterns=yes
> + fi
> + rm -f conftest*])
> +if test $libc_cv_predef_loop_distribute_pattterns = yes; then
> + AC_DEFINE(HAVE_CC_PREDEF_DISTRIBUTE_PATTERNS)
Fixup HAVE_* name per suggestion (mentioned once just to call attention to it).
> +fi
> +AC_SUBST(libc_cv_predef_loop_distribute_pattterns)
> +
Fixup libv_cv_* name per suggestion (mentioned once just to call attention to it).
> dnl Check whether we have the gd library available.
> AC_MSG_CHECKING(for libgd)
> if test "$with_gd" != "no"; then
> diff --git a/include/libc-symbols.h b/include/libc-symbols.h
> index f043ce0..0affa79 100644
> --- a/include/libc-symbols.h
> +++ b/include/libc-symbols.h
> @@ -782,4 +782,11 @@ for linking")
> #define libc_ifunc_hidden_def(name) \
> libc_ifunc_hidden_def1 (__GI_##name, name)
>
> +#ifdef HAVE_CC_PREDEF_DISTRIBUTE_PATTERNS
> +# define libc_disable_loop_to_calls \
Suggest consistency in naming e.g. libc_disable_cc_loop_to_function.
> + __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
OK. Since we only support compiling with gcc, this doesn't need a guard.
> +#else
> +# define libc_disable_loop_to_calls
Fixup name per suggestion (mentioned once just to call attention to it).
> +#endif
> +
> #endif /* libc-symbols.h */
> diff --git a/string/memmove.c b/string/memmove.c
> index 9dcd2f1..e377c2b 100644
> --- a/string/memmove.c
> +++ b/string/memmove.c
> @@ -40,7 +40,7 @@
> #define MEMMOVE memmove
> #endif
>
> -rettype
> +libc_disable_loop_to_calls rettype
OK (modulo name fixup).
> MEMMOVE (a1, a2, len)
> a1const void *a1;
> a2const void *a2;
> diff --git a/string/memset.c b/string/memset.c
> index 868be53..89d49d3 100644
> --- a/string/memset.c
> +++ b/string/memset.c
> @@ -20,7 +20,7 @@
>
> #undef memset
>
> -void *
> +libc_disable_loop_to_calls void *
OK (modulo name fixup).
> memset (dstpp, c, len)
> void *dstpp;
> int c;
> diff --git a/string/test-memmove.c b/string/test-memmove.c
> index 4ec55b2..78f415b 100644
> --- a/string/test-memmove.c
> +++ b/string/test-memmove.c
> @@ -46,7 +46,7 @@ IMPL (simple_memmove, 0)
> IMPL (memmove, 1)
> #endif
>
> -char *
> +libc_disable_loop_to_calls char *
OK (modulo name fixup).
> simple_memmove (char *dst, const char *src, size_t n)
> {
> char *ret = dst;
> diff --git a/string/test-memset.c b/string/test-memset.c
> index 9981fce..4d0f6d6 100644
> --- a/string/test-memset.c
> +++ b/string/test-memset.c
> @@ -63,7 +63,7 @@ builtin_memset (char *s, int c, size_t n)
> }
> #endif
>
> -char *
> +libc_disable_loop_to_calls char *
OK (modulo name fixup).
> simple_memset (char *s, int c, size_t n)
> {
> char *r = s, *end = s + n;
>
The benchtest still needs fixing:
benchtests/bench-memset.c
benchtests/bench-memmove.c
Otherwise the performance bench tests will also suffer the same problem.
OK for me with the suggestions and fixes in benchtests.
Cheers,
Carlos.
More information about the Libc-alpha
mailing list