Bug 21955 - Wrong alignment of L(SP_RANGE)/L(SP_INF_0) in sysdeps/x86_64/fpu/e_expf.S
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: math
Version: 2.26
Importance: P2 normal
Target Milestone: 2.27
Assignee: Not yet assigned to anyone
Reported: 2017-08-15 17:26 UTC by H.J. Lu
Modified: 2017-08-17 09:41 UTC
Target: x86-64
fweimer: security-


Description H.J. Lu 2017-08-15 17:26:35 UTC
sysdeps/x86_64/fpu/e_expf.S has

        /* Here if |x| is Inf */
        lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
        movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
        ret
...
         .section .rodata.cst8,"aM",@progbits,8
...
        .p2align 2
L(SP_INF_0):
        .long   0x7f800000      /* single precision Inf */
        .long   0               /* single precision zero */
        .type L(SP_INF_0), @object
        ASM_SIZE_DIRECTIVE(L(SP_INF_0))

Since L(SP_INF_0) is accessed as an array of 4-byte elements, it should not be
placed in a mergeable section whose entity size is 8 bytes:

.section .rodata.cst8,"aM",@progbits,8
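
For readers less familiar with the addressing mode, the indexed load can be modelled in C. This is only a sketch: the function name select_inf_or_zero is hypothetical, and it assumes that the index in %rax is the sign bit of x (0 selects Inf, 1 selects zero), which the quoted snippet suggests but does not show.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the two .long entries at L(SP_INF_0):
   index 0 is single precision Inf, index 1 is single precision zero. */
static const uint32_t sp_inf_0[2] = { 0x7f800000u, 0x00000000u };

/* Hypothetical model of `movss (%rdx,%rax,4), %xmm0`.
   Assumption: the index register holds the sign bit of x. */
float select_inf_or_zero(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);      /* reinterpret the float as raw bits */

    uint32_t index = bits >> 31;         /* 0 for positive x, 1 for negative x */

    float result;
    /* Each element is 4 bytes wide, matching the scale factor 4 in the load. */
    memcpy(&result, &sp_inf_0[index], sizeof result);
    return result;
}
```

Under that assumption a positive argument yields Inf and a negative argument yields zero, matching exp(+Inf) = Inf and exp(-Inf) = 0.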
Comment 1 Sourceware Commits 2017-08-15 17:44:41 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  a13f5e6e34a6160607c8ce9448c618b9ae024364 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a13f5e6e34a6160607c8ce9448c618b9ae024364

commit a13f5e6e34a6160607c8ce9448c618b9ae024364
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]
    
    	[BZ #21912]
    	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
    	* sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
    	* sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
    	* sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5c18dfae535d8dd308a034280176c771b4065664

commit 5c18dfae535d8dd308a034280176c771b4065664
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 10:34:22 2017 -0700

    x86-64: Put L(SP_INF_0) in .rodata.cst4 section [BZ #21955]
    
    sysdeps/x86_64/fpu/e_expf.S has
    
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
    ...
             .section .rodata.cst8,"aM",@progbits,8
    ...
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))
    
    Since L(SP_INF_0) is accessed as an array of 4-byte elements, it should
    be placed in
    
    	.section .rodata.cst4,"aM",@progbits,4
    
    	[BZ #21955]
    	* sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in
    	.rodata.cst4 section.

-----------------------------------------------------------------------
Comment 2 H.J. Lu 2017-08-15 18:00:22 UTC
L(SP_RANGE) has the same issue.
Comment 3 Sourceware Commits 2017-08-15 19:56:34 UTC

The branch, hjl/pr21955/master has been created
        at  25ccb7689da648a69a4da6957b6f62a09bcd5d76 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=25ccb7689da648a69a4da6957b6f62a09bcd5d76

commit 25ccb7689da648a69a4da6957b6f62a09bcd5d76
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 10:34:22 2017 -0700

    x86-64: Put L(SP_RANGE)/L(SP_INF_0) in .rodata.cst4 section [BZ #21955]
    
    sysdeps/x86_64/fpu/e_expf.S has
    
            lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
            cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
    ...
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
    ...
             .section .rodata.cst8,"aM",@progbits,8
    ...
            .p2align 2
    L(SP_RANGE): /* single precision overflow/underflow bounds */
            .long   0x42b17217      /* if x>this bound, then result overflows */
            .long   0x42cff1b4      /* if x<this bound, then result underflows */
            .type L(SP_RANGE), @object
            ASM_SIZE_DIRECTIVE(L(SP_RANGE))
    
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))
    
    Since L(SP_RANGE) and L(SP_INF_0) are accessed as arrays of 4-byte
    elements, they should be placed in .rodata.cst4 section.
    
    	[BZ #21955]
    	* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Place it in
    	.rodata.cst4 section.
    	(L(SP_INF_0)): Likewise.

-----------------------------------------------------------------------
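
As a reading aid, the constants in L(SP_RANGE) can be decoded and the bound check modelled in C. This is a sketch under stated assumptions, not the glibc code: the helper names are hypothetical, and it assumes that %ecx holds the bit pattern of |x| and that %rax is 0 for positive x and 1 for negative x. It relies on the fact that for non-negative floats, comparing the raw bit patterns as integers gives the same order as comparing the values.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the two .long entries at L(SP_RANGE). */
static const uint32_t sp_range[2] = {
    0x42b17217u,  /* about 88.72: overflow bound for positive x */
    0x42cff1b4u   /* about 103.97: underflow bound on |x| for negative x */
};

/* Reinterpret a 32-bit pattern as a single precision value. */
float bits_to_float(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical model of `cmpl (%rdx,%rax,4), %ecx`.
   Assumption: the compare uses the bits of |x| and a sign-selected bound. */
bool within_range(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    uint32_t sign = bits >> 31;              /* selects the table entry */
    uint32_t abs_bits = bits & 0x7fffffffu;  /* bit pattern of |x| */

    return abs_bits < sp_range[sign];
}
```

With this model, within_range(89.0f) and within_range(-104.0f) are false, while within_range(1.0f) and within_range(-100.0f) are true.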
Comment 4 H.J. Lu 2017-08-15 20:52:56 UTC
         .section .rodata.cst8,"aM",@progbits,8
...
        .p2align 2
L(SP_RANGE): /* single precision overflow/underflow bounds */
        .long   0x42b17217      /* if x>this bound, then result overflows */
        .long   0x42cff1b4      /* if x<this bound, then result underflows */
        .type L(SP_RANGE), @object
        ASM_SIZE_DIRECTIVE(L(SP_RANGE))

        .p2align 2
L(SP_INF_0):
        .long   0x7f800000      /* single precision Inf */
        .long   0               /* single precision zero */
        .type L(SP_INF_0), @object
        ASM_SIZE_DIRECTIVE(L(SP_INF_0))

Since L(SP_RANGE) and L(SP_INF_0) are in the .rodata.cst8 section, whose
entity size is 8 bytes, they must be aligned to 8 bytes; .p2align 2 only
guarantees 4-byte alignment.
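
The alignment arithmetic can be checked with a few lines of C. This is an illustrative sketch with hypothetical helper names; it only models offsets and says nothing about how the assembler or linker is implemented. An 8-byte table placed at a 4-byte-aligned offset such as 12 covers bytes 12..19 and so straddles two 8-byte entries, which is presumably why 4-byte alignment is not enough in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* `.p2align n` requests alignment to 2^n bytes. */
size_t p2align_bytes(unsigned n)
{
    return (size_t)1 << n;
}

/* True if offset is a multiple of align; align must be a power of two. */
bool is_aligned(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset & (align - 1)) == 0;
}

/* Round offset up to the next multiple of align (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

For example, is_aligned(12, p2align_bytes(2)) holds but is_aligned(12, p2align_bytes(3)) does not, and align_up(12, 8) gives 16.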
Comment 5 Sourceware Commits 2017-08-15 20:54:46 UTC

The branch, hjl/pr21955/master has been deleted
       was  25ccb7689da648a69a4da6957b6f62a09bcd5d76

- Log -----------------------------------------------------------------
25ccb7689da648a69a4da6957b6f62a09bcd5d76 x86-64: Put L(SP_RANGE)/L(SP_INF_0) in .rodata.cst4 section [BZ #21955]
-----------------------------------------------------------------------
Comment 6 Sourceware Commits 2017-08-15 20:55:07 UTC

The branch, hjl/pr21955/master has been created
        at  39245565fc0523eece29721c4590639ccebb6145 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=39245565fc0523eece29721c4590639ccebb6145

commit 39245565fc0523eece29721c4590639ccebb6145
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 10:34:22 2017 -0700

    x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
    
    sysdeps/x86_64/fpu/e_expf.S has
    
            lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
            cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
    ...
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
    ...
             .section .rodata.cst8,"aM",@progbits,8
    ...
            .p2align 2
    L(SP_RANGE): /* single precision overflow/underflow bounds */
            .long   0x42b17217      /* if x>this bound, then result overflows */
            .long   0x42cff1b4      /* if x<this bound, then result underflows */
            .type L(SP_RANGE), @object
            ASM_SIZE_DIRECTIVE(L(SP_RANGE))
    
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))
    
    Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they
    must be aligned to 8 bytes.
    
    	[BZ #21955]
    	* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
    	(L(SP_INF_0)): Likewise.

-----------------------------------------------------------------------
Comment 7 Sourceware Commits 2017-08-15 21:06:32 UTC

The branch, master has been updated
       via  f59f7adb4a00b7784cab1becdf257366104587b7 (commit)
      from  6b11a6ad714e7f2bb83556c77d2306e55a94ca54 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f59f7adb4a00b7784cab1becdf257366104587b7

commit f59f7adb4a00b7784cab1becdf257366104587b7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 14:04:59 2017 -0700

    x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
    
    sysdeps/x86_64/fpu/e_expf.S has
    
            lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
            cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
    ...
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
    ...
             .section .rodata.cst8,"aM",@progbits,8
    ...
            .p2align 2
    L(SP_RANGE): /* single precision overflow/underflow bounds */
            .long   0x42b17217      /* if x>this bound, then result overflows */
            .long   0x42cff1b4      /* if x<this bound, then result underflows */
            .type L(SP_RANGE), @object
            ASM_SIZE_DIRECTIVE(L(SP_RANGE))
    
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))
    
    Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they must
    be aligned to 8 bytes.
    
    	[BZ #21955]
    	* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
    	(L(SP_INF_0)): Likewise.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                   |    6 ++++++
 sysdeps/x86_64/fpu/e_expf.S |    4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)
Comment 8 H.J. Lu 2017-08-15 21:10:56 UTC
Fixed for 2.27.
Comment 9 Sourceware Commits 2017-08-16 13:23:25 UTC

The branch, hjl/pr21955/master has been deleted
       was  39245565fc0523eece29721c4590639ccebb6145

- Log -----------------------------------------------------------------
39245565fc0523eece29721c4590639ccebb6145 x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
-----------------------------------------------------------------------
Comment 10 Sourceware Commits 2017-08-16 15:56:44 UTC

The branch, hjl/fma/2.26 has been updated
       via  6d5f5b16bc4bd3945e138509d7986a5231ab5ee6 (commit)
       via  ce3e7f4136a9f5943328c74511542834ca05811b (commit)
      from  7e7b5de8ffc9ac8fda45b988cde5650004bdbca7 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6d5f5b16bc4bd3945e138509d7986a5231ab5ee6

commit 6d5f5b16bc4bd3945e138509d7986a5231ab5ee6
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Aug 16 08:43:35 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]
    
    FMA optimized e_expf improves performance by more than 50% on Skylake.
    
    	[BZ #21912]
    	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
    	Add e_expf-fma.
    	* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
    	* sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
    	* sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.
    
    (cherry picked from commit 24a2e6588d2e0c91b4003878b0625d4a9360e8f3)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ce3e7f4136a9f5943328c74511542834ca05811b

commit ce3e7f4136a9f5943328c74511542834ca05811b
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 14:04:59 2017 -0700

    x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
    
    sysdeps/x86_64/fpu/e_expf.S has
    
            lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
            cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
    ...
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
    ...
             .section .rodata.cst8,"aM",@progbits,8
    ...
            .p2align 2
    L(SP_RANGE): /* single precision overflow/underflow bounds */
            .long   0x42b17217      /* if x>this bound, then result overflows */
            .long   0x42cff1b4      /* if x<this bound, then result underflows */
            .type L(SP_RANGE), @object
            ASM_SIZE_DIRECTIVE(L(SP_RANGE))
    
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))
    
    Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they must
    be aligned to 8 bytes.
    
    	[BZ #21955]
    	* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
    	(L(SP_INF_0)): Likewise.
    
    (cherry picked from commit f59f7adb4a00b7784cab1becdf257366104587b7)

-----------------------------------------------------------------------

Summary of changes:
 sysdeps/x86_64/fpu/e_expf.S               |    4 +-
 sysdeps/x86_64/fpu/multiarch/Makefile     |    3 +
 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S |  182 +++++++++++++++++++++++++++++
 sysdeps/x86_64/fpu/multiarch/e_expf.c     |   26 ++++
 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h  |   34 ++++++
 5 files changed, 247 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h