This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
On Wed, Aug 29, 2001 at 03:32:36PM -0700, Richard Henderson wrote:
> On Tue, Aug 21, 2001 at 10:46:30PM +0200, Jakub Jelinek wrote:
> > The attached strncpy.S beats them all for small strings (or sizes) or for
> > aligned strings, unfortunately for some strange reason strnlen+memcpy+memset
> > is faster for long unaligned strings (unaligned is ((dst^src) & 0x7) != 0).
> > Can any IA-64 assembly hacker look into it?
> >
> (p5) ld1 c = [src], 1 // c = *src++
> ;;
> st1 [dest] = c, 1 // *dest++ = c
>
> IA-64 won't write-combine smaller than 8 (or 4) bytes. You'll
> get something like a 6 cycle stall writing to sequential bytes
> like this.
>
> If you want to quickly copy unaligned data, you have to play
> word shifting games.

Probably. I took this part unmodified from strcpy.S.
There are really many ways how to speed this up (like not faulting in the
first chk.s failure, but instead just setting some predicate and segfaulting
only later - this way even the misaligned case can use much bigger read-ahead
and thus far more parallelism).

Jakub
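
The "word shifting games" mentioned in the thread, the quoted alignment test, and the strnlen+memcpy+memset composition can be sketched in portable C. This is only an illustration under stated assumptions, not the strncpy.S or strcpy.S code discussed above, which is not reproduced here. The names mutually_misaligned, strncpy_composed, copy_shifted and is_little_endian are hypothetical, memchr stands in for strnlen to stay within ISO C, and the caller of copy_shifted is assumed to supply one extra readable source word when the byte offset is nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The test quoted in the thread: if the low three address bits differ,
   no common prologue can bring both pointers to an 8-byte boundary. */
static int mutually_misaligned(const void *dst, const void *src)
{
    return (((uintptr_t)dst ^ (uintptr_t)src) & 0x7) != 0;
}

/* strncpy composed from the three primitives named in the thread.
   memchr is used here as an ISO C stand-in for strnlen. */
static char *strncpy_composed(char *dst, const char *src, size_t n)
{
    const char *nul = memchr(src, 0, n);
    size_t len = nul ? (size_t)(nul - src) : n;

    memcpy(dst, src, len);          /* copy the string bytes */
    memset(dst + len, 0, n - len);  /* zero-fill the remainder, as strncpy does */
    return dst;
}

/* Runtime byte-order probe, so the shift direction below is right on
   either kind of machine. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Copy nwords 8-byte words to dst from a byte stream that begins `offset`
   bytes into src_words[0]. Every load and store is a whole aligned word;
   each output word is spliced from two neighbouring input words.
   Assumption: when offset != 0, src_words[nwords] must be readable. */
static void copy_shifted(uint64_t *dst, const uint64_t *src_words,
                         unsigned offset, size_t nwords)
{
    assert(offset < 8);
    if (offset == 0) {              /* mutually aligned: plain word copy */
        memcpy(dst, src_words, nwords * sizeof *dst);
        return;
    }

    const unsigned drop = 8 * offset;   /* bits skipped in the current word */
    const unsigned take = 64 - drop;    /* bits borrowed from the next word */
    const int le = is_little_endian();
    uint64_t lo = src_words[0];

    for (size_t i = 0; i < nwords; i++) {
        uint64_t hi = src_words[i + 1];
        dst[i] = le ? (lo >> drop) | (hi << take)
                    : (lo << drop) | (hi >> take);
        lo = hi;                        /* reuse the word already loaded */
    }
}
```

Each source word is loaded once and each destination word is stored once, so the byte-at-a-time st1 sequence quoted above is replaced by full 8-byte stores. Whether that removes the stall on real IA-64 hardware is the claim made in the thread, not something this sketch measures.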