Porting glibc to Coldfire
Richard Sandiford
richard@codesourcery.com
Tue Aug 15 16:27:00 GMT 2006
This patch ports glibc to Coldfire. It was tested with the kernel
from Freescale's CWF-MCF547X-548X-2-6-KL BSP:
http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=CWB-MCF547X-548X-2-6-KL&srch=1
with this additional patch from Freescale applied:
http://www.codesourcery.com/archives/coldfire-gnu-discuss/msg00041.html
Everything was compiled with the csl/coldfire-4_1 branch of gcc:
svn://gcc.gnu.org/svn/gcc/branches/csl/coldfire-4_1
which we hope to merge into mainline after 4.2 has branched.
Coldfire vs. m680x0
===================
I suppose one fundamental question is: should Coldfire be treated as
a separate port from m68k, or as a subport? Although there are several
differences between Coldfire and m680x0, I think the architectures are
similar enough to justify treating them as variations of the same base
port. gcc, linux and uClinux have done the same thing.
The patch therefore adopts the following directory structure:
sysdeps/.../m68k/{,fpu/}
sysdeps/.../m68k/m680x0/{,fpu/}
sysdeps/.../m68k/m680x0/m68020/{,fpu/}
sysdeps/.../m68k/coldfire/{,fpu/}
so that files can be classified as m680x0-only, Coldfire-only, or suitable
for both. This involves moving a lot of files from m68k/ to m68k/m680x0/,
and because a patch to do that would be almost unreadable, I've attached
a shell script to do it instead. I've made the main patch relative to
the moved files.
If a file only needs small changes for Coldfire, I've kept it in
sysdeps/.../m68k/ and used __mcoldfire__ or __mcffpu__ to select
the Coldfire parts. Obviously it's a judgement call as to how much
variation can be treated as "small", but I hope the balance seems OK.
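As a rough illustration of the kind of small, macro-selected variation meant here, the sketch below keys one constant off __mcffpu__. The names FP_REG_BYTES and fp_save_area_bytes are hypothetical and do not come from the patch; the widths are the 64-bit and 96-bit FP register sizes quoted under "FPU differences" in this message. On a non-m68k host the non-Coldfire branch is taken purely so that the example compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant, not taken from the patch: bytes needed to
   store one FP register.  __mcffpu__ is predefined by gcc when
   compiling for a Coldfire FPU.  */
#if defined (__mcffpu__)
# define FP_REG_BYTES 8		/* Coldfire: 64-bit FP registers.  */
#else
# define FP_REG_BYTES 12	/* m680x0: 96-bit FP registers (also used
				   on non-m68k hosts for illustration).  */
#endif

/* Hypothetical helper: size of a save area for NREGS FP registers.
   m68k has eight FP registers, fp0-fp7.  */
size_t
fp_save_area_bytes (size_t nregs)
{
  assert (nregs <= 8);
  return nregs * FP_REG_BYTES;
}
```

A file shared between both variants would then contain one such conditional rather than being duplicated under m680x0/ and coldfire/.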
If no processor is specifically selected by the target triplet, the patch
to sysdeps/m68k/preconfigure will use the compiler to choose between m680x0
and Coldfire as appropriate.
Generic m68k fixes
==================
I came across a few problems with the existing m68k port. Because I've
got no way of testing the port without the Coldfire support, and because
one or two of the fixes are in code that is sensitive to the Coldfire/m680x0
distinction, I'm afraid everything's lumped together. The fixes are fairly
simple though. Specifically:
- sysdeps.h didn't guard against multiple inclusion.
- The definitions of feholdexcept and fesetround were missing
a libm_hidden_def().
- setjmp.c used hidden_def() rather than libc_hidden_def(), which led to:
error: '__EI___sigsetjmp' aliased to undefined symbol '__GI___sigsetjmp'
I've changed it to use libc_hidden_def() instead.
- In dl-trampoline.S:
- The code that rounds the frame size used "lsr" (implicitly "lsr.w")
rather than lsr.l. A word-sized shift only affects the low 16 bits of
%d1, so large frames were mishandled:
| Round framesize up to even
addq.l #1, %d1
lsr #1, %d1
sub.l %d1, %a0
sub.l %d1, %a0
- The code that calls _dl_call_pltexit() failed to initialize the
lrv_a0 field of the outregs parameter, which in turn meant that
the contents of lrv_fp0 were at the wrong offset. Also, the inregs
parameter pointed 4 bytes below the structure it was supposed to
point at.
I've fixed these problems and adjusted the stack offsets of other
data to account for the extra field.
- The port was missing ldsodefs.h and tst-audit.h. These files are
needed because upstream sources no longer provide the m68k definitions.
- struct fpregset was out of sync with linux. linux puts the
data registers after the control registers, but glibc had them
the other way round.
- The layout of struct ucontext was also out of sync with linux.
uc_sigmask should come after uc_filler, and uc_filler should
have 80 rather than 174 elements.
- m68k glibc was using the standard linux layout of struct siginfo, but
m68k linux uses a different layout. It appears that the uid fields
were once 16-bit fields on m68k linux, and that, to avoid breaking
backward compatibility, 32-bit versions were later tacked on to the
end of each substructure. I've therefore added an m68k linux-specific
siginfo.h file.
- The generic implementation of wcpcpy.c accesses the source string
using an offset from the destination string:
wchar_t *
__wcpcpy (dest, src)
wchar_t *dest;
const wchar_t *src;
{
wchar_t *wcp = (wchar_t *) dest - 1;
wint_t c;
const ptrdiff_t off = src - dest + 1;
do
{
c = wcp[off];
*++wcp = c;
}
while (c != L'\0');
return wcp;
}
which means that sizeof (wchar_t) must equal __alignof__ (wchar_t).
On m68k, the values are 4 and 2 respectively, so the routine won't
work if ((intptr_t) dest % 4) != ((intptr_t) src % 4).
wcscpy.c (which was written a year earlier) does check the alignment,
and so works out of the box on m68k. I don't think there's any chance
of getting the upstream version of wcpcpy.c changed in the same way,
so I've added a port-local version. I've also done the same for
wcpcpy-chk.c, which has the same problem.
- m68k/sysdep-cancel.h wrongly treated __librt_multiple_threads as
hidden, and the assembler version of SINGLE_THREAD_P used PC-relative
addressing to access it. I've removed the hidden attribute and made
librt's SINGLE_THREAD_P load the symbol from the GOT instead. The new
implementation of SINGLE_THREAD_P needs a temporary address register,
which is passed as an argument to the macro.
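To make the wcpcpy problem in the list above concrete, here is a small model of the "src - dest" offset calculation. It works on raw byte addresses rather than real wchar_t pointers, because subtracting pointers whose distance is not a whole number of elements is not well defined in C; truncating division is assumed here as one plausible outcome. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the pointer subtraction "src - dest" for
   elements of ELEM_SIZE bytes.  Truncating division is an assumption
   about what the compiled code does.  */
static intptr_t
element_offset (uintptr_t dest, uintptr_t src, intptr_t elem_size)
{
  assert (elem_size > 0);
  return ((intptr_t) src - (intptr_t) dest) / elem_size;
}

/* Nonzero if indexing DEST by the computed element offset lands
   exactly on SRC, which is what the generic wcpcpy relies on.  */
int
offset_reaches_source (uintptr_t dest, uintptr_t src, intptr_t elem_size)
{
  intptr_t off = element_offset (dest, src, elem_size);
  return (intptr_t) dest + off * elem_size == (intptr_t) src;
}
```

With 4-byte elements, addresses 0x1000 and 0x2000 round-trip correctly, while 0x1000 and 0x2002 (both legal on m68k, where wchar_t only needs 2-byte alignment) do not: the computed offset points 2 bytes short of the source string.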
Optimizations
=============
I had to change the implementation of the string and memory functions
for Coldfire, and noticed that some of them could be optimized slightly.
When trying to reach an alignment boundary, the current code moves the
address into a data register and "and"s it with 3 to see if it is
already aligned. If it isn't aligned, the code repeats the check
one byte later, and again for the byte after that. It would be simpler
to apply subq and addq to the first "and" result instead of re-testing
the address. (We can use addq.w and subq.w on m680x0.) From what I
remember of 68000, I think this is better for 680x0 targets too.
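In C terms, one reading of this idea looks like the sketch below: the address is masked with 3 once, and the result is turned into a byte count that is simply counted down, rather than re-testing the address after each byte. This is only an illustration of the approach, not the assembly from the patch; head_bytes and copy_aligned_dest are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes to copy before ADDR reaches a 4-byte
   boundary, derived from a single "and" with 3.  */
static size_t
head_bytes (uintptr_t addr)
{
  size_t misalign = addr & 3;
  return misalign ? 4 - misalign : 0;
}

/* Hypothetical copy routine that aligns the destination first.  */
void *
copy_aligned_dest (void *dest, const void *src, size_t n)
{
  assert (dest != NULL && src != NULL);

  unsigned char *d = dest;
  const unsigned char *s = src;
  size_t head = head_bytes ((uintptr_t) d);

  if (head > n)
    head = n;
  n -= head;

  /* Count the head bytes down; the address is not tested again.  */
  while (head-- > 0)
    *d++ = *s++;

  /* D is now 4-byte aligned (or N is 0); copy a longword at a time.  */
  for (; n >= 4; n -= 4, d += 4, s += 4)
    memcpy (d, s, 4);

  /* Copy any trailing bytes.  */
  while (n-- > 0)
    *d++ = *s++;

  return dest;
}
```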
Coldfire changes
================
The main differences between the Coldfire and m680x0 code are as follows:
- FPU differences:
- FP registers are 64 bits rather than 96 bits wide.
- Coldfire does not have the 68881's fmovem.l; we must save and restore
individual control registers.
- Long doubles are the same as doubles.
- The canonical NaN has all significand bits set. Some files in
ieee754/dbl-64 use hard-coded hex constants, so I've overridden
them (e_pow.c, s_sin.c and u_remainder.c).
- Unlike the 68881, the Coldfire FPU lets you raise exceptions by
setting the appropriate EXC bits of the FPSR and then executing
an arithmetic instruction. This makes the implementation of
fraiseexcpt.c easier.
- The Coldfire FPU has a much smaller set of instructions than the 68881.
The functions it does support directly are: fabs(), sqrt(), lrint(),
rint(), and their float and long double equivalents.
- ISA differences:
- 32-bit PC-relative offsets must be loaded into a register and then
applied using offset(%pc,reg). I've added a PCREL_OP macro to wrap up
this difference.
- Coldfire does not have jmp (%dN) and jsr (%dN). Those instructions are
used in dl-trampoline.S in cases where every address register is live,
so I've simulated them using push and rts instructions.
- Coldfire strongly prefers a 32-bit aligned stack pointer, so I've
rounded frame sizes up to longword rather than word alignment.
- Coldfire does not have dbra, exg or word-sized register operations.
- Kernel differences:
- FPU-related fields are often laid out differently.
- FP registers have different ptrace() numbers.
- sigcontext has fields for all registers, avoiding the need for the
real_catch_segfault hack in register-dump.h.
- Coldfire has no atomic compare-and-swap instruction and the kernel
does not yet have any userspace atomicity support. I've therefore
used the generic bits/atomic.h implementation, but with the addition
of the now-required atomic*_t types. (I don't think upstream would
allow these types to be added to the generic bits/atomic.h as none of
the core targets use that file.)
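The frame-size arithmetic mentioned above can be modelled in C. The sketch below assumes the m680x0 sequence quoted from dl-trampoline.S (add 1, shift right by 1, subtract the result twice) and contrasts a word-sized shift with a long-sized one, then shows longword rounding of the kind used for Coldfire. All function names are hypothetical, and (framesize + 3) & ~3 is just one common way to express "round up to a longword", not necessarily the form used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Models "lsr.w #1,%d1": only the low 16 bits of the register are
   shifted; the upper 16 bits are left unchanged.  */
uint32_t
lsr_w_1 (uint32_t d1)
{
  return (d1 & 0xffff0000u) | ((d1 & 0xffffu) >> 1);
}

/* Models "lsr.l #1,%d1": the whole 32-bit register is shifted.  */
uint32_t
lsr_l_1 (uint32_t d1)
{
  return d1 >> 1;
}

/* Bytes taken off the stack by the quoted sequence: add 1, halve
   with the given shift, then subtract the half twice.  */
uint32_t
frame_bytes_even (uint32_t framesize, uint32_t (*lsr) (uint32_t))
{
  assert (lsr != 0);
  uint32_t half = lsr (framesize + 1);
  return half + half;
}

/* Round FRAMESIZE up to a multiple of 4 (longword alignment).  */
uint32_t
frame_bytes_long (uint32_t framesize)
{
  return (framesize + 3) & ~UINT32_C (3);
}
```

For a small frame such as 7 bytes both shifts give 8. For a frame of 0x10001 bytes the long shift gives 0x10002, whereas the word shift leaves the upper half of the register unshifted and yields 0x20002.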
Compatibility
=============
Because Coldfire is a new port, we don't need to be compatible with
versions before 2.4. So:
- I've set the default version to GLIBC_2.4 in
sysdeps/m68k/coldfire/shlib-versions.
- I've moved oldgetrlimit and oldsetrlimit from
sysdeps/unix/sysv/linux/m68k/syscalls.list to the new
sysdeps/unix/sysv/linux/m68k/m680x0/syscalls.list.
Expected test failures
======================
As far as the testsuite goes, some tests failed for me because of the
usual environmental limitations. For example, the board had only 64MB
of RAM, which isn't enough for some tests, and the root fs was
NFS-mounted, which causes tests like tst-utmp and tst-utmpx to fail.
There are some expected non-environment failures too:
math/test-misc.out
misc/tst-efgcvt.out
stdio-common/tst-printf.out
- These tests require correct subnormal handling. The kernel does
not yet emulate subnormal operations.
build rt/tst-aio2.o
build rt/tst-aio3.o
- These tests should (but don't) include <pthread.h>, as they refer
to PTHREAD_BARRIER_SERIAL_THREAD. Changes are unlikely to be
accepted upstream because NPTL ports presumably work as-is.
rt/tst-aio10.out
rt/tst-aio9.out
- A LinuxThreads limitation. We implement lio_listio using
pthread_cond_wait, which does not stop and return EINTR when
a signal is raised. NPTL avoids this using aio_misc.h.
math/test-double.out
math/test-float.out
math/test-idouble.out
math/test-ifloat.out
- All four tests fail some llrint_upward and llrint_downward checks
because of a bug in the generic llrint.c code; see bug #2592.
test-float.out and test-double.out also fail because the Coldfire
FPU does not distinguish between quiet and signalling NaNs;
all NaN inputs raise an Invalid Operation exception.
As a sanity check, I've also built a 68020 glibc. I used
csl/coldfire-4_1 again, but with the attached mainline backports applied,
and with gcc configured using --with-cpu=68020 and --with-float=hard.
The patch is in three pieces: the initial move-only script, the main
ports patch, and a linuxthreads patch. There is talk of supporting
NPTL in future, but nothing definite yet.
Please install if OK.
Richard
Attachments:
Name: glibc-move.clog
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment.ksh>
Name: glibc-move.sh
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0001.ksh>
Name: glibc.clog
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0002.ksh>
Name: glibc.diff
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0003.ksh>
Name: glibc-linuxthreads.clog
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0004.ksh>
Name: glibc-linuxthreads.diff
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0005.ksh>
Name: gcc-backports.diff
URL: <http://sourceware.org/pipermail/libc-ports/attachments/20060815/02569106/attachment-0006.ksh>