This is the mail archive of the newlib@sourceware.cygnus.com mailing list for the newlib project.
Watch out for those performance-minded RTEMS users. You will hear about
a wasted cycle for sure. :)

Here is Eric's feedback on what toolset/arguments he was using. FYI, he
ported the KA9Q and Linux TCP/IP stacks to RTEMS, contributed the FP trap
code required for the 68040, implemented the termios console support, and
wrote the 68360 BSP. He is pretty swift. :)

--joel

---------- Forwarded message ----------
Date: Tue, 9 Dec 97 16:09:55 -0600
From: Eric Norum <eric@skatter.usask.ca>
To: Joel Sherrill <joel@OARcorp.com>
Subject: Re: memcpy performance

You wrote:
> What args did you give to gcc for the case you reported on the
> list? One of the new Cygnus newlib maintainers wants to know. And
> before they ask, what version of gcc are you using?
>
> I am getting pretty good responses this week from the Cygnus side of
> the world.

m68k-rtems-gcc --version
egcs-2.90.04 970901 (gcc2-970802 experimental)

Here's how memcpy.c gets compiled.

/shareNeXT/OS4.2/RTEMS/src/tools-970904/build-m68k-tools/gcc/xgcc \
  -B/shareNeXT/OS4.2/RTEMS/src/tools-970904/build-m68k-tools/gcc/ \
  -idirafter /shareNeXT/OS4.2/RTEMS/src/tools-970904/build-m68k-tools/m68k-rtems/newlib/targ-include \
  -idirafter /shareNeXT/OS4.2/RTEMS/src/tools-970904/src/newlib/libc/include \
  -nostdinc -O2 -g -pipe -m68332 -O2 \
  -DHAVE_GETTIMEOFDAY -DMALLOC_PROVIDED -DEXIT_PROVIDED \
  -DMISSING_SYSCALL_NAMES -DSIGNAL_PROVIDED -DREENTRANT_SYSCALLS_PROVIDED \
  -fno-builtin \
  -I/shareNeXT/OS4.2/RTEMS/src/tools-970904/build-m68k-tools/m68k-rtems/newlib/./targ-include \
  -I/shareNeXT/OS4.2/RTEMS/src/tools-970904/src/newlib/./libc/include \
  -c ../../../../../../src/newlib/libc/string/memcpy.c

This produces the 5-instruction/byte copy:

    0xe2ea <memcpy+22>: moveb %a1@+,%a0@+
    0xe2ec <memcpy+24>: movel %d1,%d0
    0xe2ee <memcpy+26>: subql #1,%d1
    0xe2f0 <memcpy+28>: tstl %d0
    0xe2f2 <memcpy+30>: bnes 0xe2ea <memcpy+22>

Changing the memcpy source to:

    if (len) {
        do {
            *ap++ = *bp++;
        } while (--len);
    }

improves the loop to:

    .L9:
        move.b (%a0)+,(%a1)+
        subq.l #1,%d0
        jbne .L9

No loop mode, but certainly a lot faster!

The `memcpy turns into bcopy which calls memmove' problem is because of
the way the compiler was built. The -DTARGET_MEM_FUNCTIONS=1 flag should
be used (or set up when the compiler is configured). Perhaps this change
could make it into the next tools distribution.

---
Eric Norum                              eric@skatter.usask.ca
Saskatchewan Accelerator Laboratory     Phone: (306) 966-6308
University of Saskatchewan              FAX: (306) 966-6058
Saskatoon, Canada.
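
For anyone who wants to repeat the comparison on their own toolchain, here is a minimal sketch of the two loop shapes side by side. The message does not show the original memcpy.c loop, so the `while (len--)` form below is an assumption that merely fits the disassembly (copy the count, decrement it, test the old value). The function names `copy_postdec` and `copy_predec` are hypothetical and chosen only for this illustration. The second function uses the `if (len) { do ... while (--len); }` form quoted in the message.

```c
#include <stddef.h>

/* ASSUMED shape of the original loop; the message does not show it.
 * The post-decrement test needs the old value of len after the
 * decrement, which matches the movel/subql/tstl sequence reported. */
void *copy_postdec(void *dst, const void *src, size_t len)
{
    char *ap = dst;
    const char *bp = src;

    while (len--)
        *ap++ = *bp++;
    return dst;
}

/* Loop form quoted in the message: guard the zero-length case once,
 * then decrement and test the same value at the bottom of the loop. */
void *copy_predec(void *dst, const void *src, size_t len)
{
    char *ap = dst;
    const char *bp = src;

    if (len) {
        do {
            *ap++ = *bp++;
        } while (--len);
    }
    return dst;
}
```

Both functions copy the same bytes for every length, including zero. Compiling this file with `-S` and the same `-O2 -fno-builtin` options as above lets the two loops be compared directly. Whether a given compiler emits different code for them depends on its version and target, and a newer compiler may well generate identical loops for both.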