This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 2/6] Optimize i386 syscall inlining
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Zack Weinberg <zackw at panix dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 21 Aug 2015 04:56:00 -0700
- Subject: Re: [PATCH 2/6] Optimize i386 syscall inlining
On Fri, Aug 14, 2015 at 10:33 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Aug 14, 2015 at 5:03 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Aug 12, 2015 at 07:37:20PM -0400, Zack Weinberg wrote:
>>> If I'm reading that right, it's still not quite optimal; there's an
>>> unnecessary register shuffle after the system call... better would be
>>>
>>> push %ebx
>>> mov $0x2d,%eax
>>> mov 0x8(%esp),%ebx
>>> call __x86.get_pc_thunk.cx
>>> add $_GLOBAL_OFFSET_TABLE_,%ecx
>>> call *%gs:0x10
>>> mov __curbrk(%ecx),%edx
>>> mov %eax,(%edx)
>>> cmp %eax,%ebx
>>> ja 1f
>>> xor %eax,%eax
>>> pop %ebx
>>> ret
>>> 1:
>>> ; set errno and return -1
>>>
>>
>> Here is the updated patch. OK for master?
>>
>> H.J.
>> --
>> Define INLINE_SYSCALL_RETURN and INLINE_SYSCALL_ERROR_RETURN so
>> that i386 can optimize setting errno by branching to the internal
>> __syscall_error without PLT.
>>
>> Since GCC 5 and above can properly spill %ebx when needed, we can inline
>> syscalls with 6 arguments if GCC 5 or above is used to compile glibc.
>> This patch rewrites the INTERNAL_SYSCALL macros and skips __libc_do_syscall
>> when compiling with GCC 5 or above.
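
As a rough illustration of the error-return idea described above, here is a minimal, portable sketch in C. It is not the code from the patch: sketch_syscall_error, sketch_is_error, SKETCH_SYSCALL_RETURN, SKETCH_SYSCALL_ERROR_RETURN and sketch_wrapper are hypothetical names, and the real macros in sysdeps/unix/sysv/linux/i386/sysdep.h may take different arguments. The only assumption is the Linux convention that a raw kernel return value in [-4095, -1] encodes -errno.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the internal __syscall_error helper:
   takes the raw negative kernel result, sets errno, returns -1.  */
static int
sketch_syscall_error (long raw)
{
  assert (raw < 0);
  errno = (int) -raw;
  return -1;
}

/* Linux convention: raw results in [-4095, -1] are errors.  */
static int
sketch_is_error (long raw)
{
  return (unsigned long) raw > -4096UL;
}

/* Return from the enclosing function through the error helper.  */
#define SKETCH_SYSCALL_ERROR_RETURN(raw, type) \
  return (type) sketch_syscall_error (raw)

/* Return the raw result, or go through the error helper on failure.  */
#define SKETCH_SYSCALL_RETURN(raw, type)                \
  do                                                    \
    {                                                   \
      long r_ = (raw);                                  \
      if (sketch_is_error (r_))                         \
        SKETCH_SYSCALL_ERROR_RETURN (r_, type);         \
      return (type) r_;                                 \
    }                                                   \
  while (0)

/* A wrapper that is handed an already-obtained raw result, so the
   sketch runs without issuing a real system call.  */
static int
sketch_wrapper (long raw_result)
{
  SKETCH_SYSCALL_RETURN (raw_result, int);
}
```

Because the macros expand to a return statement, the compiler is free to turn the error path into a direct branch to the helper, which is the effect the patch description is after. For example, sketch_wrapper (-EBADF) returns -1 with errno set to EBADF, while sketch_wrapper (3) returns 3.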
>>
>> For sysdeps/unix/sysv/linux/i386/brk.c, with -O2 -march=i686
>> -mtune=generic, GCC 5.2 now generates:
>>
>> <__brk>:
>>    0: push %ebx
>>    1: mov $0x2d,%eax
>>    6: mov 0x8(%esp),%ebx
>>    a: call b <__brk+0xb>
>>       b: R_386_PC32 __x86.get_pc_thunk.dx
>>    f: add $0x2,%edx
>>       11: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
>>   15: call *%gs:0x10
>>   1c: mov 0x0(%edx),%edx
>>       1e: R_386_GOT32 __curbrk
>>   22: cmp %eax,%ebx
>>   24: mov %eax,(%edx)
>>   26: ja 30 <__brk+0x30>
>>   28: xor %eax,%eax
>>   2a: pop %ebx
>>   2b: ret
>>
>> instead of
>>
>> <__brk>:
>>    0: push %ebx
>>    1: mov 0x8(%esp),%ecx
>>    5: call 6 <__brk+0x6>
>>       6: R_386_PC32 __x86.get_pc_thunk.bx
>>    a: add $0x2,%ebx
>>       c: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
>>   10: xchg %ecx,%ebx
>>   12: mov $0x2d,%eax
>>   17: call *%gs:0x10
>>   1e: xchg %ecx,%ebx
>>   20: mov %eax,%edx
>>   22: mov 0x0(%ebx),%eax
>>       24: R_386_GOT32 __curbrk
>>   28: mov %edx,(%eax)
>>   2a: xor %eax,%eax
>>   2c: cmp %edx,%ecx
>>   2e: ja 38 <__brk+0x38>
>>   30: pop %ebx
>>   31: ret
>>
>> On the success path shown, the new one is shorter by 3 instructions
>> (13 versus 16).
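
For readers who prefer C to disassembly, the logic both listings implement can be sketched roughly as follows. This is an assumption-laden model, not the contents of brk.c: fake_brk_syscall is a hypothetical stand-in for the kernel's brk system call (number 0x2d in the listings), sketch_curbrk stands in for __curbrk, and the fake break and limit values are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for __curbrk.  */
static uintptr_t sketch_curbrk;

/* State of the hypothetical fake kernel.  */
static uintptr_t fake_break = 0x1000;
static const uintptr_t fake_limit = 0x2000;

/* Fake brk system call: moves the break when the request is within
   the limit, and always returns the (possibly unchanged) break.  */
static uintptr_t
fake_brk_syscall (uintptr_t addr)
{
  assert (fake_break <= fake_limit);
  if (addr != 0 && addr <= fake_limit)
    fake_break = addr;
  return fake_break;
}

/* Mirrors the assembly: store the returned break, then compare it
   with the requested address (the cmp/ja pair).  */
static int
sketch_brk (uintptr_t addr)
{
  uintptr_t newbrk = fake_brk_syscall (addr);
  sketch_curbrk = newbrk;
  if (newbrk < addr)
    {
      /* The branch target not shown in the listings: set errno
         and return -1.  */
      errno = ENOMEM;
      return -1;
    }
  return 0;
}
```

With these fake values, sketch_brk (0x1800) returns 0 and records 0x1800, while sketch_brk (0x3000) leaves the recorded break unchanged and returns -1 with errno set to ENOMEM.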
>>
>> * sysdeps/unix/sysv/linux/i386/Makefile [$(subdir) == csu]
>> (sysdep-dl-routines): Add sysdep.
>> [$(subdir) == nptl] (libpthread-routines): Likewise.
>> [$(subdir) == rt] (librt-routines): Likewise.
>> * sysdeps/unix/sysv/linux/i386/brk.c (__brk): Add
>> INTERNAL_SYSCALL_DECL. Use INLINE_SYSCALL_ERROR_RETURN.
>> * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Don't check
>> PIC when branching to SYSCALL_ERROR_LABEL.
>> * sysdeps/unix/sysv/linux/i386/fcntl.c (__fcntl_nocancel): Use
>> INLINE_SYSCALL_RETURN and INLINE_SYSCALL_ERROR_RETURN.
>> (__libc_fcntl): Likewise.
>> * sysdeps/unix/sysv/linux/i386/fxstat.c (__fxstat): Likewise.
>> * sysdeps/unix/sysv/linux/i386/fxstatat.c (__fxstatat):
>> Likewise.
>> * sysdeps/unix/sysv/linux/i386/getmsg.c (getmsg): Likewise.
>> * sysdeps/unix/sysv/linux/i386/lockf64.c (lockf64): Likewise.
>> * sysdeps/unix/sysv/linux/i386/lxstat.c (__lxstat): Likewise.
>> * sysdeps/unix/sysv/linux/i386/msgctl.c (__old_msgctl):
>> Likewise.
>> (__new_msgctl): Likewise.
>> * sysdeps/unix/sysv/linux/i386/putmsg.c (putmsg): Likewise.
>> * sysdeps/unix/sysv/linux/i386/semctl.c (__old_semctl):
>> Likewise.
>> (__new_semctl): Likewise.
>> * sysdeps/unix/sysv/linux/i386/setegid.c (setegid): Likewise.
>> * sysdeps/unix/sysv/linux/i386/seteuid.c (seteuid): Likewise.
>> * sysdeps/unix/sysv/linux/i386/shmctl.c (__old_shmctl):
>> Likewise.
>> (__new_shmctl): Likewise.
>> * sysdeps/unix/sysv/linux/i386/sigaction.c (__libc_sigaction):
>> Likewise.
>> * sysdeps/unix/sysv/linux/i386/xstat.c (__xstat): Likewise.
>> * sysdeps/unix/sysv/linux/i386/libc-do-syscall.S
>> (__libc_do_syscall): Defined only if !__GNUC_PREREQ (5,0).
>> * sysdeps/unix/sysv/linux/i386/sysdep.S: Removed.
>> * sysdeps/unix/sysv/linux/i386/sysdep.c: New file.
>> * sysdeps/unix/sysv/linux/i386/sysdep.h: Define assembler macros
>> only if !__GNUC_PREREQ (5,0).
>> (SYSCALL_ERROR_LABEL): Changed to __syscall_error.
>> (SYSCALL_ERROR_HANDLER): Changed to empty.
>> (SYSCALL_ERROR_ERRNO): Removed.
>> (SYSCALL_ERROR_HANDLER_TLS_STORE): Likewise.
>> (__syscall_error): New prototype.
>> (INLINE_SYSCALL_RETURN): New.
>> (INLINE_SYSCALL_ERROR_RETURN): Likewise.
>> (LOADREGS_0): Likewise.
>> (ASMARGS_0): Likewise.
>> (LOADREGS_1): Likewise.
>> (ASMARGS_1): Likewise.
>> (LOADREGS_2): Likewise.
>> (ASMARGS_2): Likewise.
>> (LOADREGS_3): Likewise.
>> (ASMARGS_3): Likewise.
>> (LOADREGS_4): Likewise.
>> (ASMARGS_4): Likewise.
>> (LOADREGS_5): Likewise.
>> (ASMARGS_5): Likewise.
>> (LOADREGS_6): Likewise.
>> (ASMARGS_6): Likewise.
>> (INTERNAL_SYSCALL_MAIN_6): Optimize for GCC 5.
>> (INTERNAL_SYSCALL_MAIN_INLINE): Likewise.
>> (INTERNAL_SYSCALL_NCS): Likewise.
>
> Here is the patch for updated INLINE_SYSCALL_RETURN and
> INLINE_SYSCALL_ERROR_RETURN. OK for master?
I will check it in next Monday if there are no objections.
--
H.J.