Quite some time ago, I reported a regression in qemu-user on sh4 which caused an unhandled trap when running programs like autogen on qemu-sh4 while they worked fine on real hardware. Bi-secting the problem lead to this change in qemu: 61dedf2af79fb5866dc7a0f972093682f2185e17 is the first bad commit commit 61dedf2af79fb5866dc7a0f972093682f2185e17 Author: Richard Henderson <rth@twiddle.net> Date: Tue Jul 18 10:02:50 2017 -1000 target/sh4: Add missing FPSCR.PR == 0 checks Both frchg and fschg require PR == 0, otherwise undefined_operation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <20170718200255.31647-26-rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> According to Peter Maydell, however, the behavior of qemu here is according to spec and he suspects that glibc is handling the setting of the FPSCR.PR bit incorrectly. To quote him from the bug report: ============================================================================== I can reproduce this bug, but I'm not sure what QEMU is doing wrong. Looking at the "SH4 Software Manual", it definitely says that if the FPSCR.PR bit is 0 then the 'frchg' and 'fschg' instructions should both trap. The 'frchg' that autogen is hitting is the one in glibc's "getcontext" implementation: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sh/sh4/getcontext.S;hb=b6d2c4475d5abc05dd009575b90556bdd3c78ad0#l70 QEMU linux-user mode runs with FPSCR=0x000800000, which is "FPSCR.PR == 1", ie "double precision". This seems to match what the kernel has for its FPSCR_INIT value. ============================================================================== So maybe glibc should handle the FPSCR.PR bit the same way the kernel does? > [1] https://bugs.launchpad.net/qemu/+bug/1796520
Please don't use the ports component in Bugzilla for new bugs. The component description says "obsolete component, do not file new issues here"; the component should only have bugs that were filed and resolved long ago when a separate ports repository used to be in use.
Thorsten Glaser has suggested the following patch which fixes the bug for me: diff --git a/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S index 2545fe7870..ebe1b3fd4f 100644 --- a/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S +++ b/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S @@ -67,6 +67,8 @@ ENTRY(__getcontext) add #(oFPUL+4-124),r0 sts.l fpul, @-r0 sts.l fpscr, @-r0 + mov #0, r6 + lds r6, fpscr frchg fmov.s fr15, @-r0 fmov.s fr14, @-r0 @@ -101,6 +103,10 @@ ENTRY(__getcontext) fmov.s fr2, @-r0 fmov.s fr1, @-r0 fmov.s fr0, @-r0 + mov r4, r0 + add #124, r0 + add #(oFPSCR-124), r0 + lds.l @r0+, fpscr #endif /* __SH_FPU_ANY__ */ /* sigprocmask (SIG_BLOCK, NULL, &uc->uc_sigmask). */ diff --git a/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S index dd523a8cdb..098a305e80 100644 --- a/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S +++ b/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S @@ -50,6 +50,8 @@ ENTRY(__setcontext) .Lsetcontext_restore: #ifdef __SH_FPU_ANY__ + mov #0, r9 + lds r9, fpscr mov r8, r0 add #(oFR0),r0 fmov.s @r0+, fr0 diff --git a/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S index 1ff4fd612b..85cc4b1104 100644 --- a/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S +++ b/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S @@ -67,6 +67,8 @@ ENTRY(__swapcontext) add #(oFPUL+4-124),r0 sts.l fpul, @-r0 sts.l fpscr, @-r0 + mov #0, r9 + lds r9, fpscr frchg fmov.s fr15, @-r0 fmov.s fr14, @-r0
Some comments: 1. QEMU's behaviour matches my reading of the spec 2. Linux's arch/sh/kernel/cpu/sh4/fpu.c says "Executing frchg with PR set causes a trap on some SH4 implementations", so presumably some implementations allow it, including the processor on your boards 3. What happens if a signal is delivered whilst FPCSR.PR is temporarily 0? 4. For __getcontext, can you use a temporary register (various ones are about to be clobbered for the syscall arguments; r1, r2, r3, r5 and r6 all look free to me) instead of the slightly ugly approach of loading back from the context that was just written to?
(In reply to Jessica Clarke from comment #3) > Some comments: > > 1. QEMU's behaviour matches my reading of the spec > 2. Linux's arch/sh/kernel/cpu/sh4/fpu.c says "Executing frchg with PR set > causes a trap on some SH4 implementations", so presumably some > implementations allow it, including the processor on your boards > 3. What happens if a signal is delivered whilst FPCSR.PR is temporarily 0? > 4. For __getcontext, can you use a temporary register (various ones are > about to be clobbered for the syscall arguments; r1, r2, r3, r5 and r6 all > look free to me) instead of the slightly ugly approach of loading back from > the context that was just written to? According to official publicly available documentation from Hitachi/Renesas and STMicroelectronics (which was one of the bigger licensee / user of the SH4 IP) FRCHG insn is only well defined when FPSCR.PR = 0. The documentations are often copied over from one chip product to the next. Maybe some of the last available SH4A implementations (SH7785, SH7786, ..) lift this restriction, but I can't find any official documentation about it.
(In reply to Oleg Endo from comment #4) > > Maybe some of the last available SH4A implementations (SH7785, SH7786, ..) > lift this restriction, but I can't find any official documentation about it. In this sense, for best compatibility, it's better to stick to the original restriction and have FPSCR.PR = 0 when using or FSCHG, FRCHG to toggle the other bits. This seems to be what the patch in comment #2 does.