This is the mail archive of the mailing list for the glibc project.

Re: Inefficient ia64 system call implementation in glibc

H.J. Lu wrote:

The inline ia64 system call assumes all values passed to the kernel are
signed 64-bit. It does sign extension if the incoming arg is not signed
64-bit. In the case of fxstat.c:

__fxstat (int vers, int fd, struct stat *buf)
return INLINE_SYSCALL (fstat, 2, fd, CHECK_1 (buf));
it leads to

0000000000000000 <__fxstat>:
   0:   00 20 39 0c 80 05       [MII]       alloc r36=ar.pfs,14,6,0
   6:   f0 e0 01 12 48 a0                   mov r15=1212
   c:   04 08 00 84                         mov r37=r1
  10:   01 38 01 44 00 21       [MII]       mov r39=r34
  16:   60 02 84 2c 00 60                   sxt4 r38=r33
  1c:   04 00 c4 00                         mov r35=b0;;
  20:   0a 00 00 00 00 02       [MMI]       break.m 0x100000;;
  26:   10 02 20 00 42 e0                   mov r33=r8

The real inefficiency here is the compiler output. Given the realities of the Itanium 2 implementation, the first two bundles will require 3 cycles to execute. A better coding would be:

	{	.mmi
		alloc	r36=ar.pfs,14,6,0
		mov	r15=1212
		mov	r35=b0
	}
	{	.mmi
		mov	r37=r1
		mov	r39=r34
		sxt4	r38=r33
	} ;;

which will execute in one cycle. The sign extension, although
"unnecessary", doesn't cost any cycles. Admittedly you could use the
mi;;i bundle to pack the break instruction in the second bundle if
you didn't have to sign-extend, but I'd rather see the 3 vs. 1 cycle
problem addressed first.
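
For readers unfamiliar with the sxt4 in the listings above: it is what a
compiler typically emits on ia64 when a 32-bit signed int (here, fd) is
widened to fill a 64-bit register. The following is only a minimal sketch
of that conversion in C. The helper names widen_signed and widen_unsigned
are hypothetical and are not part of glibc or of the INLINE_SYSCALL macro.

```c
#include <stdint.h>

/* Hypothetical helper (not glibc code): widen a 32-bit signed argument
   to the 64-bit value a syscall argument register would hold.  The cast
   replicates the sign bit into the upper 32 bits, which is the job
   sxt4 does on ia64.  */
static int64_t
widen_signed (int32_t arg)
{
  return (int64_t) arg;
}

/* For contrast, zero extension of an unsigned 32-bit argument; the
   upper 32 bits become zero (zxt4 on ia64).  */
static uint64_t
widen_unsigned (uint32_t arg)
{
  return (uint64_t) arg;
}
```

Under these assumptions, widen_signed (-1) yields a value with all 64 bits
set, while widen_unsigned (0xFFFFFFFFu) yields 0x00000000FFFFFFFF. For a
valid, non-negative file descriptor both conversions give the same result,
which is why the extension is "unnecessary" in the fxstat case.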

	John "I worry about this stuff way too much" Worley
