27543 – Inconsistent behavior when handling FPSCR on SH

Bug 27543 - Inconsistent behavior when handling FPSCR on SH

Summary: Inconsistent behavior when handling FPSCR on SH

Status:	UNCONFIRMED

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	ports (show other bugs)
Version:	2.31

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2021-03-08 10:05 UTC by John Paul Adrian Glaubitz
Modified:	2024-11-07 00:49 UTC (History)
CC List:	7 users (show)

See Also:
Host:
Target:	sh--*
Build:
Last reconfirmed:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description John Paul Adrian Glaubitz 2021-03-08 10:05:39 UTC

Quite some time ago, I reported a regression in qemu-user on sh4 which caused an unhandled trap when running programs like autogen on qemu-sh4 while they worked fine on real hardware.

Bi-secting the problem lead to this change in qemu:

61dedf2af79fb5866dc7a0f972093682f2185e17 is the first bad commit
commit 61dedf2af79fb5866dc7a0f972093682f2185e17
Author: Richard Henderson <rth@twiddle.net>
Date: Tue Jul 18 10:02:50 2017 -1000

    target/sh4: Add missing FPSCR.PR == 0 checks

    Both frchg and fschg require PR == 0, otherwise undefined_operation.

    Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
    Signed-off-by: Richard Henderson <rth@twiddle.net>
    Message-Id: <20170718200255.31647-26-rth@twiddle.net>
    Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

According to Peter Maydell, however, the behavior of qemu here is according to spec and he suspects that glibc is handling the setting of the FPSCR.PR bit incorrectly.

To quote him from the bug report:

==============================================================================

I can reproduce this bug, but I'm not sure what QEMU is doing wrong. Looking at the "SH4 Software Manual", it definitely says that if the FPSCR.PR bit is 0 then the 'frchg' and 'fschg' instructions should both trap.

The 'frchg' that autogen is hitting is the one in glibc's "getcontext" implementation:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sh/sh4/getcontext.S;hb=b6d2c4475d5abc05dd009575b90556bdd3c78ad0#l70

QEMU linux-user mode runs with FPSCR=0x000800000, which is "FPSCR.PR == 1", ie "double precision". This seems to match what the kernel has for its FPSCR_INIT value.

==============================================================================

So maybe glibc should handle the FPSCR.PR bit the same way the kernel does?

> [1] https://bugs.launchpad.net/qemu/+bug/1796520

Comment 1 jsm-csl@polyomino.org.uk 2021-03-08 23:12:37 UTC

Please don't use the ports component in Bugzilla for new bugs.  The 
component description says "obsolete component, do not file new issues 
here"; the component should only have bugs that were filed and resolved 
long ago when a separate ports repository used to be in use.

Comment 2 John Paul Adrian Glaubitz 2021-07-21 12:54:18 UTC

Thorsten Glaser has suggested the following patch which fixes the bug for me:

diff --git a/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S
index 2545fe7870..ebe1b3fd4f 100644
--- a/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S
+++ b/sysdeps/unix/sysv/linux/sh/sh4/getcontext.S
@@ -67,6 +67,8 @@ ENTRY(__getcontext)
 	add	#(oFPUL+4-124),r0
 	sts.l	fpul, @-r0
 	sts.l	fpscr, @-r0
+	mov	#0, r6
+	lds	r6, fpscr
 	frchg
 	fmov.s	fr15, @-r0
 	fmov.s	fr14, @-r0
@@ -101,6 +103,10 @@ ENTRY(__getcontext)
 	fmov.s	fr2, @-r0
 	fmov.s	fr1, @-r0
 	fmov.s	fr0, @-r0
+	mov	r4, r0
+	add	#124, r0
+	add	#(oFPSCR-124), r0
+	lds.l	@r0+, fpscr
 #endif /* __SH_FPU_ANY__ */
 
 	/* sigprocmask (SIG_BLOCK, NULL, &uc->uc_sigmask).  */
diff --git a/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S
index dd523a8cdb..098a305e80 100644
--- a/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S
+++ b/sysdeps/unix/sysv/linux/sh/sh4/setcontext.S
@@ -50,6 +50,8 @@ ENTRY(__setcontext)
 
 .Lsetcontext_restore:
 #ifdef __SH_FPU_ANY__
+	mov	#0, r9
+	lds	r9, fpscr
 	mov	r8, r0
 	add	#(oFR0),r0
 	fmov.s	@r0+, fr0
diff --git a/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S b/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S
index 1ff4fd612b..85cc4b1104 100644
--- a/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S
+++ b/sysdeps/unix/sysv/linux/sh/sh4/swapcontext.S
@@ -67,6 +67,8 @@ ENTRY(__swapcontext)
 	add	#(oFPUL+4-124),r0
 	sts.l	fpul, @-r0
 	sts.l	fpscr, @-r0
+	mov	#0, r9
+	lds	r9, fpscr
 	frchg
 	fmov.s	fr15, @-r0
 	fmov.s	fr14, @-r0

Comment 3 Jessica Clarke 2021-08-05 02:32:16 UTC

Some comments:

1. QEMU's behaviour matches my reading of the spec
2. Linux's arch/sh/kernel/cpu/sh4/fpu.c says "Executing frchg with PR set causes a trap on some SH4 implementations", so presumably some implementations allow it, including the processor on your boards
3. What happens if a signal is delivered whilst FPCSR.PR is temporarily 0?
4. For __getcontext, can you use a temporary register (various ones are about to be clobbered for the syscall arguments; r1, r2, r3, r5 and r6 all look free to me) instead of the slightly ugly approach of loading back from the context that was just written to?

Comment 4 Oleg Endo 2024-11-07 00:10:39 UTC

(In reply to Jessica Clarke from comment #3)
> Some comments:
> 
> 1. QEMU's behaviour matches my reading of the spec
> 2. Linux's arch/sh/kernel/cpu/sh4/fpu.c says "Executing frchg with PR set
> causes a trap on some SH4 implementations", so presumably some
> implementations allow it, including the processor on your boards
> 3. What happens if a signal is delivered whilst FPCSR.PR is temporarily 0?
> 4. For __getcontext, can you use a temporary register (various ones are
> about to be clobbered for the syscall arguments; r1, r2, r3, r5 and r6 all
> look free to me) instead of the slightly ugly approach of loading back from
> the context that was just written to?

According to official publicly available documentation from Hitachi/Renesas and STMicroelectronics (which was one of the bigger licensee / user of the SH4 IP) FRCHG insn is only well defined when FPSCR.PR = 0.

The documentations are often copied over from one chip product to the next.  Maybe some of the last available SH4A implementations (SH7785, SH7786, ..) lift this restriction, but I can't find any official documentation about it.

Comment 5 Oleg Endo 2024-11-07 00:49:57 UTC

(In reply to Oleg Endo from comment #4)
> 
> Maybe some of the last available SH4A implementations (SH7785, SH7786, ..)
> lift this restriction, but I can't find any official documentation about it.

In this sense, for best compatibility, it's better to stick to the original restriction and have FPSCR.PR = 0 when using or FSCHG, FRCHG to toggle the other bits.  This seems to be what the patch in comment #2 does.