28872 – Building glibc for MIPSel single float targets

Status:	RESOLVED FIXED

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	ports
Version:	2.30

Importance:	P2 critical
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2022-02-08 11:27 UTC by Den
Modified:	2022-03-18 15:01 UTC
CC List:	2 users

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
single float MIPSel targets doubtful workaround (1.32 KB, patch) 2022-02-08 11:27 UTC, Den
patch (1.05 KB, patch) 2022-02-23 11:57 UTC, Adhemerval Zanella

Description Den 2022-02-08 11:27:55 UTC

Created attachment 13962 [details]
single float MIPSel targets doubtful workaround

When building glibc for a MIPSel target with the following options (hardware single-float, ISA MIPS >= 2, ABI o32), these files fail to compile: sysdeps/mips/{__longjmp.c,setjmp_aux.c} and sysdeps/unix/sysv/linux/mips/{getcontext.S,setcontext.S,swapcontext.S}.

The problem is that a target such as the r5900 processor I am trying to build glibc for supports only single precision and has no support for double precision at all, so it does not have opcodes such as l.d, s.d and sqrt.d.

I made a workaround (attached), but it is very doubtful. Can you check whether it is correct or wrong? In any case, with glibc built this way, I get the warning "float register should be even, was (some odd value)" from time to time when compiling with the cross compiler based on that glibc.

Comment 1 jsm-csl@polyomino.org.uk 2022-02-08 17:55:34 UTC

As stated in the component description, please do not file new bugs in the 
obsolete ports component.  Bugs in architecture-specific code should be 
filed in the component appropriate to what that code does, independent of 
the architecture.

What ABI are you using for this port?  If it's not using the same ABI (for 
floating-point argument passing and return of all floating-point types) as 
either the existing hard-float ABIs (intended for cases where both single 
and double precision are supported in hardware) or the existing soft-float 
ABIs, I don't think we'd want to support another ABI just for this 
variation, given we already have 24 ABI variants for MIPS as listed at 
<https://sourceware.org/glibc/wiki/ABIList>.

Comment 2 Den 2022-02-08 19:19:40 UTC

> As stated in the component description, please do not file new bugs in the 
> obsolete ports component.
Sorry for that... then maybe you can direct me where to file it?

> What ABI are you using for this port?
> <https://sourceware.org/glibc/wiki/ABIList>.
Going by that list, it is
classic NaN, o32, hard-float, LE: /lib/ld.so.1
but the hardware, the r5900 processor, supports only single float.

Please correct me if I am mistaken: is Bugzilla only for accepting people's ready-made solutions, and not a place where the community works out glibc functionality? If so, please say so here and I'll close this "bug" request.

Comment 3 jsm-csl@polyomino.org.uk 2022-02-08 19:34:40 UTC

On Tue, 8 Feb 2022, archicharmer at mail dot ru via Glibc-bugs wrote:

> https://sourceware.org/bugzilla/show_bug.cgi?id=28872
> 
> --- Comment #2 from Den <archicharmer at mail dot ru> ---
> > As stated in the component description, please do not file new bugs in the 
> > obsolete ports component.
> Sorry for that... then maybe you can direct me where to fill it?

libc is probably appropriate.

> > What ABI are you using for this port?
> > <https://sourceware.org/glibc/wiki/ABIList>.
> If to note from the list, it's
> classic NaN, o32, hard-float, LE: /lib/ld.so.1
> but the hardware, r5900 processor, supports only single float.

But are double-precision values passed in exactly the same way as when 
building for a processor supporting double-precision in hardware, so that 
objects built for double-precision hardware could run against a shared 
libm built for single-precision hardware, passing double values to the 
libm functions that take double arguments and getting double return values 
from them, as long as this is run in double-precision hardware, for 
example?  (That's how the 32-bit Arm hard-float AAPCS variant works, for 
example - even processors with only single-precision float in hardware 
have double-precision registers and double-precision loads and stores, so 
there is no separate single-precision ABI.  Though such processors aren't 
actually relevant for glibc because they aren't 'A' variant.)

Comment 4 Den 2022-02-09 03:41:40 UTC

> But are double-precision values passed in exactly the same way as when 
> building for a processor supporting double-precision in hardware, so that 
> objects built for double-precision hardware could run against a shared 
> libm built for single-precision hardware, passing double values to the 
> libm functions that take double arguments and getting double return values 
> from them, as long as this is run in double-precision hardware, for 
> example?
> processors with only single-precision float in hardware 
> have double-precision registers and double-precision loads and stores, so 
> there is no separate single-precision ABI.  Though such processors aren't 
> actually relevant for glibc because they aren't 'A' variant.)
Double precision is emulated by GCC. For example, it "melts" the opcode s.d (8) into two sdc1 (4+4). But when inline assembly contains unsupported opcodes written out explicitly, the compiler will of course complain.
The question is not about reworking the whole glibc build for a single-float ABI, nor about creating a special ABI. Specific files fail to compile because their inline assembly uses unsupported opcodes directly. Let everything stay double precision as it is, but is there a way to rewrite this inline assembly correctly so that it compiles with the supported opcodes?
This is not an r5900-specific issue. If you build a cross compiler for the r6000 configured without double float, it appears there too. The r6000 has double precision and the related opcodes, but omitting the -mdouble-float flag makes opcodes such as s.d, l.d and sqrt.d unsupported.

Comment 5 Adhemerval Zanella 2022-02-09 14:16:56 UTC

(In reply to Den from comment #4)
> > But are double-precision values passed in exactly the same way as when 
> > building for a processor supporting double-precision in hardware, so that 
> > objects built for double-precision hardware could run against a shared 
> > libm built for single-precision hardware, passing double values to the 
> > libm functions that take double arguments and getting double return values 
> > from them, as long as this is run in double-precision hardware, for 
> > example?
> > processors with only single-precision float in hardware 
> > have double-precision registers and double-precision loads and stores, so 
> > there is no separate single-precision ABI.  Though such processors aren't 
> > actually relevant for glibc because they aren't 'A' variant.)
> Double precision are emulating by GCC. For example, it "melts" opcode s.d
> (8) into a two sdc1 (4+4). But in case when there are assembler inlines with
> unsupported opcodes set exactly in them then, of course, the compiler will
> be rumbling.
> The question is not to thought-out a whole line of rewritting the glibc
> building for a single float abi, neither to create the special abi. There
> are exact files are failing to be compiled because of their assembler
> inlines with an unsupported opcodes set directly. Let everything will be
> with double precision as it is, but is there a way to rewrite these
> assembler inlines corectly so it can be compiled with the supported opcodes?
> That is not r5900 specific issue. If to build a cross compiler for r6000
> with setting it have no double float, then it appears here too. r6000 have
> double precision either the related opcodes, but the omission of the
> -mdouble-float flag sets such opcodes like s.d,l.d and sqrt.d unsupported.

I am not sure this would count as a new ABI; at least libgcc seems to handle it by checking __mips_fpr and defining macros for float and double load/store.  Maybe you could do the same for the affected assembly in glibc.
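
A minimal sketch of that idea, written as ordinary C so that it runs on any host: it models the preprocessor decision as a function of the `__mips_fpr` value. The names `fp_access`, `select_fp_access`, `fp_load_mnemonic`, `fp_store_mnemonic` and `print_fp_access` are hypothetical. The mapping (0 or 64 selects ldc1/sdc1, anything else the l.d/s.d forms) is the one proposed in the patch in comment 14, not an interface that libgcc or glibc is known to export.

```c
#include <stdio.h>

/* Hypothetical model of the choice a "#if __mips_fpr ..." block makes.  */
enum fp_access
{
  FP_ACCESS_MACRO,  /* l.d / s.d: the assembler may expand these to word pairs */
  FP_ACCESS_COP1    /* ldc1 / sdc1: 64-bit coprocessor load/store */
};

/* fpr stands in for the compiler-defined __mips_fpr (0 for -mfpxx, 32, 64).  */
enum fp_access
select_fp_access (int fpr)
{
  return (fpr == 0 || fpr == 64) ? FP_ACCESS_COP1 : FP_ACCESS_MACRO;
}

const char *
fp_load_mnemonic (int fpr)
{
  return select_fp_access (fpr) == FP_ACCESS_COP1 ? "ldc1" : "l.d";
}

const char *
fp_store_mnemonic (int fpr)
{
  return select_fp_access (fpr) == FP_ACCESS_COP1 ? "sdc1" : "s.d";
}

/* Print the pair that would be selected for a given register mode.  */
void
print_fp_access (int fpr)
{
  printf ("__mips_fpr=%d: load=%s store=%s\n",
          fpr, fp_load_mnemonic (fpr), fp_store_mnemonic (fpr));
}
```

In real sources the same decision would be made with `#if` on `__mips_fpr`, guarded by `__mips_hard_float`, because an undefined macro evaluates to 0 in `#if` and would otherwise look like the FPXX case.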

Comment 6 Den 2022-02-09 17:03:31 UTC

(In reply to Adhemerval Zanella from comment #5)
> libgcc seems to handle it by checking __mips_fpr and setting macros for float
> and double load/store.
I believe that when code using double precision is written entirely in C, GCC chooses how to compile it according to the target it was built for, and if the target has single floats only, GCC replaces the double-precision operations with a worked-out sequence that runs on single-precision hardware.
But as I noted previously, GCC can do nothing about inline assembly (__asm__ __volatile__) when unsupported opcodes are written in it directly.

> Maybe you could use the same on the affected assembly in glibc.
I could, and I tried: the result is the patch for glibc-2.30 that I attached. I filed this "bug" request to ask your community to check whether it is reasonable or completely wrong. If I am wrong, I would like to ask you to make the proper workaround, if that is possible.
But, once again, if Bugzilla is only meant to keep, collect and approve people's completed patches and workarounds, then please say so here; I'll close the thread and look for help elsewhere.

Maybe if the files
sysdeps/mips/{__longjmp.c,setjmp_aux.c} and sysdeps/unix/sysv/linux/mips/{getcontext.S,setcontext.S,swapcontext.S} and
sysdeps/mips/fpu/e_sqrt.c
could be written completely in C, they would compile correctly and without any errors.

Comment 7 Adhemerval Zanella 2022-02-09 17:33:04 UTC

(In reply to Den from comment #6)
> (In reply to Adhemerval Zanella from comment #5)
> > libgcc seems to handle it by checking __mips_fpr and setting macros for float
> > and double load/store.
> I beleive that when the code with double precisions is written totally in C,
> GCC is choosing how to compile it according to the target it was built, and
> in case the target has the single floats only then GCC will compile it the
> way the double precisions will be replaced with the thought-out output as
> the single precisions.
> But as I previously noted, GCC can do nothing with the assembler inlines,
> __asm__ __volatile__, if there are opcodes in them are set directly, the
> unsupported ones.

I used the libgcc example because it seems to have the same constraint in
libgcc/config/mips/mips16.S, an assembly file that handles different
floating-point support depending on the compiler target selected.

> 
> > Maybe you could use the same on the affected assembly in glibc.
> I could and I tried - as a result I attached a patch for glibc-2.30. And
> made a "bug" request here to ask your community to check whenever it is
> reasonable or it is completely wrong. In case that I wrong then I would like
> to ask you to make the related workaround if it is possible.
> But, I'll note once again, if the bugzilla is a community which business is
> only to keep, collect and approve the people's completed solved patches and
> workarounds, then, please, note it for me here, I'll close the thread and
> will be searching for the help in other places.

Patches are not usually discussed on Bugzilla, and unless they are meant as
backports (which then come from an installed fix) they should be made against
the current master branch.

We have a contributor checklist [1] that you can follow to submit new
patches.

> 
> Maybe if the files
> sysdeps/mips/{__longjmp.c,setjmp_aux.c} and
> sysdeps/unix/sysv/linux/mips/{getcontext.S,setcontext.S,swapcontext.S} and
> sysdeps/mips/fpu/e_sqrt.c
> could be written completely in C, then they might be compiled both correctly
> and without any error.

The POSIX 2001 context functions are tricky to implement in C because they
need to control the process state directly, which would require additional
compiler support to get right (something like naked functions plus register
asm, and a way to avoid stack allocation and extra register spills).  It is
*way* simpler to implement them in assembly.

The sqrt implementation was refactored in 2.32 by 32c65b28f37fc6c, so it now
uses the compiler builtin.

About your patch, it does not seem fully correct, since you are replacing a
double load/store with a float load/store.  It will most likely trigger
failures on MIPS processors that fully implement the double instructions.  As
before, I think you will need to use this code path only if __mips_fpr is
equal to 32 (I am not sure about __mips_fpr being 0).

The setjmp.h/jmp_buf-macros.h change is also wrong: they are installed
headers that define the ABI, so by changing them you are de facto creating a
new ABI, or you need the whole dance of providing compatibility symbols (and
I think the type can't be inferred from compiler-defined preprocessor macros
like __mips_fpr).

In any case, I suggest you prepare a patch, even if incomplete, so we
can discuss it on libc-alpha.

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist

Comment 8 Den 2022-02-10 02:43:23 UTC

(In reply to Adhemerval Zanella from comment #7)
> About your patch, it does not seem fully correct since you replacing a double
> load/store with a float load/store.  It will most likely trigger failures
> in mips processors that fully implement double instructions.
On the other hand, since everything is meant to operate on doubles, I doubt that redefining the FP registers in the MIPS setjmp.h
from
double __fpregs[6];
to
float __fpregs[12];
is correct. I think it has to stay as it is, __fpregs[6].

The theory is that we take the incoming double, split it in half, and store the halves into the free registers and the higher-part registers respectively.
In practice, the MIPS file setjmp_aux.c has:

asm volatile ("s.d $f20, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[0]));

Compiling it to an object file, that line currently becomes:

s.d $f20, 56($4)

I think it needs to be rewritten as:

swc1 $f20, 56($4)
swc1 $f21, 60($4)

Then back to asm volatile: I could not work out how to rewrite it there. I'm sure it should look something like this (the ? marks are the parts I do not know):
asm volatile ("swc1 $f20, ?\n\t"
              "swc1 $f21, ?"
              :
              : "?" ( ? ));

That is why I redefined __fpregs from double to float: to receive the values in 4-byte pieces instead of 8-byte ones. I then wrote separate asm volatile statements, which might also be a mistake; perhaps they should be combined. In this particular example, can you rewrite it so that the lower and higher parts of the incoming __fpregs[0] correspond to $f20 and $f21 respectively for the store?
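
As an illustration of what "halving the double" means at the data level, here is a minimal sketch in portable C, assuming a 64-bit IEEE 754 `double`. The names `fp_words`, `split_double` and `join_double` are hypothetical. It only shows how one 8-byte value corresponds to two 4-byte words; it is not the inline assembly asked for above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the two 32-bit halves of one double.  */
struct fp_words
{
  uint32_t lo;  /* least significant word */
  uint32_t hi;  /* most significant word (sign, exponent, top of mantissa) */
};

/* Split a double into its low and high words without changing any bits.  */
struct fp_words
split_double (double d)
{
  uint64_t bits;
  struct fp_words w;

  memcpy (&bits, &d, sizeof bits);  /* well-defined way to read the bits */
  w.lo = (uint32_t) bits;
  w.hi = (uint32_t) (bits >> 32);
  return w;
}

/* Rebuild the double from the two words.  */
double
join_double (struct fp_words w)
{
  uint64_t bits = ((uint64_t) w.hi << 32) | w.lo;
  double d;

  memcpy (&d, &bits, sizeof d);
  return d;
}
```

On a little-endian o32 target the low word sits at the lower address, which matches the pair shown in this comment: $f20 at offset 56 and $f21 at offset 60. The 8-byte slot in `__fpregs` therefore keeps its size and type, and only the way it is written changes.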

> It will most likely trigger failures
> in mips processors that fully implement double instructions.  As before, 
> I think you will need to only use this code patch if __mips_fpr equal to
> 32 (I am not sure about __mips_fpr being 0).

> checking __mips_fpr and setting macros for float and double load/store
Yes, that might work; setjmp_aux.c "knows" the definition of __mips_fpr, which is 32. I only presented the patch to show the logic of the rewrite, which is doubtful...

> The sqrt implementation was refactor by 2.32 by 32c65b28f37fc6c, to it now
> uses compiler_builtin.
Yes, I saw that before and tried to implement it in 2.30. Had I known that it was approved in v2.32 and newer, I would simply have used the updated glibc. And I surely will.

> The setjmp.h/jmp_buf-macros.h change is also wrong: they are installed
> headers that do define the ABI, so you if you changing you de-facto
> creating a new ABI or you need all the dance to provide compatibility
> symbols
Understood and agreed. I think that is the hint that __fpregs should stay defined as 6 doubles. I just need to understand the concept of halving the double in the assembler...

> In any case I suggest you to prepare a patch, even if incomplete, so we
> can discuss on libc-alpha.
> 
> [1] https://sourceware.org/glibc/wiki/Contribution%20checklist
All right, I'll get into it. Should I attach it here or create a separate bug request as suggested?

Comment 9 Adhemerval Zanella 2022-02-10 11:25:35 UTC

(In reply to Den from comment #8)
> (In reply to Adhemerval Zanella from comment #7)
> > About your patch, it does not seem fully correct since you replacing a double
> > load/store with a float load/store.  It will most likely trigger failures
> > in mips processors that fully implement double instructions.
> From other side, since everything is planned to operate with doubles, I
> doubt that the redefinition of the fpregs in mips'es setjmp.h
> from
> double __fpregs[6];
> to
> float __fpregs[12];
> is correct. I think it have to be as it is, __fpregs[6], instead.
> 
> The theory is we'll be getting the incoming double, halfing it and storing
> into the free registers and the higher part registers respectively.
> The practice. Mips'es file setjmp_aux.c:
> 
> asm volatile ("s.d $f20, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[0]));
> 
> compiling it to get the object file. The appropriate to that line now is the
> next:
> 
> s.d $f20, 56($4)
> 
> I think that it is needed to be rewritten to the next:
> 
> swc1 $f20, 56($4)
> swc1 $f21, 60($4)
> 
> Then back to asm volatile. I could not achieve as to rewrite it here. I'm
> sure it should be looking like this:
> asm volatile ("swc1 $f20, ?\n\t
>                swc1 $f21, ?"
>               :
>               : "?" ( ? ));
> 
> That's why I redefined __fpregs from double to float - to get them incoming
> by 4 instead of by 8 and so on I wrote a separated asm volatiles which might
> be a mistake here too, and it should be united. Can you in this particular
> example rewrite it as to get the lower and higher parts of the incoming
> __fpregs[0] and respectively return those data to $f20 and $f21 to store?

You can't change the size unless you want to define this case as a new ABI
(so you would need to provide a new setjmp.h and all the machinery to select
it as a new ABI, probably a new triple, etc.).  Also, changing its size
is an ABI change, and you will need to consider all the implications of doing it.

Accessing the __fpregs members should be transparent to the application; setjmp
should store the information in a form from which the compiler-generated code can
retrieve it.  I am not sure whether you need to emit an extend or truncate operation.
I am trying to understand which options you use with gcc to target this chip;
-mabi=32 -march=mips2 -mhard-float -msingle-float -mfp32 seems to use 'sw/lw' to
load and store doubles.

> 
> > It will most likely trigger failures
> > in mips processors that fully implement double instructions.  As before, 
> > I think you will need to only use this code patch if __mips_fpr equal to
> > 32 (I am not sure about __mips_fpr being 0).
> 
> > checking __mips_fpr and setting macros for float and double load/store
> Yes it might work, setjmp_aux.c "knows" about the definition of the
> __mips_fpr which is 32. I just presented the patch to show the logic of the
> rewritting, which is doubtful...
> 
> > The sqrt implementation was refactor by 2.32 by 32c65b28f37fc6c, to it now
> > uses compiler_builtin.
> Yes, I saw it before and tried to implement it in 2.30. If I'd know that it
> is approved in the v2.32 and newer, I just used the updated glibc to
> compile. And I'm surely will.

I did not understand what should be 'approved' here.  If you want to get
this fixed upstream you will need to write the patch against master.

> 
> > The setjmp.h/jmp_buf-macros.h change is also wrong: they are installed
> > headers that do define the ABI, so you if you changing you de-facto
> > creating a new ABI or you need all the dance to provide compatibility
> > symbols
> Understood and agreed. And I think that's the hint that __fpregs should be
> stayed at [6] numbers of doubles definition. Just to understand the
> conception of halfing the double in the assembler...
> 
> > In any case I suggest you to prepare a patch, even if incomplete, so we
> > can discuss on libc-alpha.
> > 
> > [1] https://sourceware.org/glibc/wiki/Contribution%20checklist
> Alright, I'll involve into it. Should I attach it here or to create some
> separate bug request as suggested?

Just reference the Bugzilla number in the patch title, 'Title (BZ #XXXXX)'; patches
are reviewed only on the mailing list.

Comment 10 Den 2022-02-10 19:45:39 UTC

(In reply to Adhemerval Zanella from comment #9)
> I am trying to understand which options you use with gcc to target this
> chip, 
> -mabi=32 -march=mips2 -mhard-float -msingle-float -mfp32
That's correct. I use -march=r6000, but yes, it is mips2 anyway.

> seems to use
> 'sw/lw' to
> load store doubles.
I do not think that replacing all of the s.d/l.d with sw/lw respectively will do the trick.

Let's summarize.
There is no way to split a doubleword into two words, or to join them back;
An extra ABI would need to be specified and worked out. No one will handle that because of its rarity, even if the processor model itself is not rare;
Most programs built against glibc with my patch crash with a segfault (invalid write access). The patch is a waste; its concept is wrong.

Verdicts:
the patch is not worth accepting, or even existing;
glibc cannot be built for MIPSel targets with single float only;
thread closed;
invalid as a bug request.

Comment 11 Adhemerval Zanella 2022-02-10 20:57:52 UTC

(In reply to Den from comment #10)
> (In reply to Adhemerval Zanella from comment #9)
> > I am trying to understand which options you use with gcc to target this
> > chip, 
> > -mabi=32 -march=mips2 -mhard-float -msingle-float -mfp32
> That's correct. -march=r6000 but yes it is mips2 anyway.
> 
> > seems to use
> > 'sw/lw' to
> > load store doubles.
> I do not think that if to replace all of the s.d/l.d to sw/lw respectively
> will do the trick.

I am not really proposing that; I am in fact trying to understand what gcc emits in such a case.  For instance,

$ cat f.c
void foo (double *x, double y)
{
  *x = y;
}
$ mips64el-glibc-linux-gnu-gcc -O3 -mabi=32 -march=r5900 -mhard-float -msingle-float -mfpxx f.c -S -o -
	.file	1 "f.c"
	.section .mdebug.abi32
	.previous
	.nan	legacy
	.module	singlefloat
	.module	oddspreg
	.abicalls
	.option	pic0
	.text
	.align	2
	.globl	foo
	.set	nomips16
	.set	nomicromips
	.ent	foo
	.type	foo, @function
foo:
	.frame	$sp,0,$31		# vars= 0, regs= 0/0, args= 0, gp= 0
	.mask	0x00000000,0
	.fmask	0x00000000,0
	sw	$6,0($4)
	.set	noreorder
	.set	nomacro
	jr	$31
	sw	$7,4($4)
[...]

So I am trying to understand how/when gcc does the double emulation that replaces
s.d with two sdc1.

> 
> Let's summarize.
> There is no way to somehow split a doubleword to a two words and vice versa
> conversion/uniting;

My understanding is that you do not need to split anything; you just need to
save/restore the floating-point values at the correct offsets in jmp_buf.  The
issue is that an application trying to access the register values using a
different type would see a double value, which is different from a float (I think
you may need to float-extend the values before writing them to jmp_buf).
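
To make "the correct offsets in jmp_buf" concrete, here is a minimal sketch that computes them for a hypothetical mirror of an o32 jump buffer. The struct `o32_jmpbuf_model`, its field names and `fpreg_offset` are assumptions for illustration, not the glibc definition; the layout was chosen so that the first FP slot lands at offset 56, the value visible in `s.d $f20, 56($4)` in comment 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit (o32) jump buffer.  Fixed-width fields are
   used so the offsets are the same on any host; this is an assumption, not
   the layout from the installed glibc header.  */
struct o32_jmpbuf_model
{
  uint32_t pc;                    /* 32-bit pointer on o32 */
  uint32_t sp;
  int32_t regs[8];                /* callee-saved integer registers */
  uint32_t fp;
  uint32_t gp;
  int32_t reserved;
  _Alignas (8) double fpregs[6];  /* six 8-byte FP slots */
};

/* Byte offset of FP slot i from the start of the buffer.  */
size_t
fpreg_offset (int i)
{
  return offsetof (struct o32_jmpbuf_model, fpregs)
         + (size_t) i * sizeof (double);
}
```

With this model `fpreg_offset (0)` is 56 and the second word of that slot is at 60, so a pair of word stores can fill the same 8-byte slot that a single doubleword store would, leaving the buffer size and the slot positions unchanged.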

> An extra ABI is required to be specified as to be thought-out. No one will
> be handling that because of it's rarity; even if the processor itself, it's
> model, is not rare;

It is really unfortunate to add *another* MIPS ABI because of this idiosyncrasy.

> Most of the programs compiled basely on glibc with my patch are providing a
> SegFault, invalid write access. Patch is a waste, it's conception is wrong.

Is the segfault due to the patch, or is it in fact another issue?  As far as I
know there is no floating-point support in the loader, and it is localized to
specific cases in libc.so.

> 
> Verdicts are:
> patch is not worth to be accepated, even to be existing;

The patch is not acceptable *as-is*; I think it is still possible to fix the
context functions to work correctly on the chip you are using.

> glibc can not be built for MIPSel targets with single float only;
> thread closed;
> invalid (to be as a) bug request.

Comment 12 Den 2022-02-11 03:54:46 UTC

(In reply to Adhemerval Zanella from comment #11)
> I am not really proposing it, I am in fact trying to understand what gcc
> emits in such case.  For instance,
> [...]
> 
> So I am trying to understand how/when gcc does the double emulation that
> replaces
> s.d by a two sdc1.
Here are the relevant configure options with which binutils and gcc were built:
binutils-2.34 . --target=mipsel-unknown-linux-gnu --with-arch=r6000 --with-cpu=r6000
gcc-9.2.0 . --target=mipsel-unknown-linux-gnu --with-arch=r6000 --with-float=hard --with-fpu=single

It is not actually gcc that emulates s.d as two sdc1 (swc1, to be exact); gcc only complains when it meets an unsupported opcode in asm volatile. I got that idea from this:
echo 's.d $f0,($0)' >test.s
mipsel-unknown-linux-gnu-as test.s
mipsel-unknown-linux-gnu-objdump -d a.out
00000000 <.text>:
   0:	e4000000 	swc1	$f0,0(zero)
   4:	e4010004 	swc1	$f1,4(zero)

> Is the segfault due the patch or is it another issue in fact?
Well, I could not determine that exactly, because I cannot build it without the patch. If I build the compiler without FP, everything works fine and there are no such segfaults in those places.
Actually, no existing libc can be built for a MIPSel target with single floats only, because they all have the same files containing the same code to compile. Same file structure for the FPU.

> Patch is not acceptable *as-is*
I mean that the patch itself, its concept, is wrong.

> I think it still possible to fix the
> context functions to correctly work on the chip you are using.
Well, that is already beyond my abilities. The patch was all I had.

Comment 13 Den 2022-02-17 17:09:32 UTC

That issue was solved in newlib in the nineties
https://github.com/MIPS/newlib/blob/master/newlib/libc/machine/mips/setjmp.S

The author did not bother with determining an ABI first, or with counting how many of the problematic processors exist in the world in order to decide whether to implement the workaround. He just did it, simply and smartly.
I contacted him to ask for help, however. He answered that he would leave the solution to some other, younger enthusiast...

Comment 14 Adhemerval Zanella 2022-02-18 12:39:31 UTC

(In reply to Den from comment #13)
> That issue was solved in newlib in the nineties
> https://github.com/MIPS/newlib/blob/master/newlib/libc/machine/mips/setjmp.S
> 
> He was not in such trouble like determine an ABI first, either to calculate
> how much pieces of the problematic processors totally in the world to decide
> whenever to realize that workaround or to do that not. He just done it,
> simply, smartly.

In fact it does exactly what I suggested: it checks the float ABI you are building for (__mips_hard_float), and for _ABIO32 it also checks __mips_fpr (which I suggested in comment #5).  However, I don't think you will have the same alignment issue as newlib: glibc doesn't define jmp_buf as an opaque type as newlib does (an int array), so __fpregs is guaranteed to have double alignment.
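
A minimal sketch of the alignment point, using hypothetical stand-ins (`opaque_jmp_buf`, `typed_jmp_buf`) and not the real newlib or glibc types: an int array only promises int alignment, while a struct with a `double` member makes the compiler place and align that member for `double`.

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical opaque-style buffer: an int array, so only int alignment.  */
typedef int opaque_jmp_buf[32];

/* Hypothetical typed-style buffer: the double member drives the alignment.  */
struct typed_jmp_buf
{
  int header[13];      /* 52 bytes of integer state when int is 4 bytes */
  double fpregs[6];    /* padded as needed to start on a double boundary */
};

/* Nonzero if the FP slots start on a boundary suitable for double.  */
int
fpregs_slot_is_aligned (void)
{
  return offsetof (struct typed_jmp_buf, fpregs) % alignof (double) == 0;
}

/* Nonzero if the typed buffer is at least as strictly aligned as the
   opaque one.  */
int
typed_is_at_least_as_aligned (void)
{
  return alignof (struct typed_jmp_buf) >= alignof (opaque_jmp_buf);
}
```

With the opaque form, code that wants to use doubleword loads and stores has to check or fix up the address itself; with the typed form the compiler has already done so.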

> However, I contacted to him for asking the help. He answered that he rely
> the solution of that to some another younger enthusiast person...

Checking the newlib code, not much magic is required to fix glibc to work for o32 FPXX and FP64. Below is a fix that just uses the correct load/store FP instruction.  At least on qemu it seems to work fine:

$ cat t.c
#include <stdio.h>
#include <setjmp.h>

jmp_buf bfoo, bbar;

void bar ();            // forward declaration 

void
foo ()
{
  int r;

  printf ("(A1)\n");

  r = setjmp (bfoo);
  if (r == 0)
    bar ();

  printf ("(A2) r=%d\n", r);

  r = setjmp (bfoo);
  if (r == 0)
    longjmp (bbar, 21);

  printf ("(A3) r=%d\n", r);

  r = setjmp (bfoo);
  if (r == 0)
    longjmp (bbar, 22);

  printf ("(A4) r=%d\n", r);
}

void
bar ()
{
  int r;

  printf ("(B1)\n");

  r = setjmp (bbar);
  if (r == 0)
    longjmp (bfoo, 11);

  printf ("(B2) r=%d\n", r);

  r = setjmp (bbar);
  if (r == 0)
    longjmp (bfoo, 12);

  printf ("(B3) r=%d\n", r);

  r = setjmp (bbar);
  if (r == 0)
    longjmp (bfoo, 13);
}


int
main (int argc, char **argv)
{
  foo ();
  return 0;
}
$ mips64-linux-gnu/bin/mips64-glibc-linux-gnu-gcc -mabi=32 -mips2 -mhard-float -mfpxx t.c -o t
$ ./elf/ld.so --library-path . ./t
(A1)
(B1)
(A2) r=11
(B2) r=21
(A3) r=12
(B3) r=22
(A4) r=13
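
The test program above never touches floating-point values, so it cannot show whether the saved FP registers come back intact. A minimal sketch of a complementary check follows; the names `scale`, `jump_back` and `fp_survives_longjmp` are hypothetical. Whether the compiler actually keeps `a` and `b` in callee-saved registers such as $f20 depends on the target and the optimization level, so treat this as a smoke test and not as proof.

```c
#include <setjmp.h>

static jmp_buf env;

/* Produce a value at run time so the doubles are live across setjmp.  */
double
scale (double x)
{
  return x * 1.5 + 0.25;
}

/* Jump back from a separate function, as a real caller would.  */
void
jump_back (void)
{
  longjmp (env, 1);
}

/* Returns 1 if doubles computed before setjmp are unchanged after the
   longjmp.  a and b are not modified after setjmp, so their values remain
   well defined.  */
int
fp_survives_longjmp (void)
{
  double a = scale (2.0);  /* 3.25 */
  double b = scale (4.0);  /* 6.25 */

  if (setjmp (env) == 0)
    jump_back ();

  return a + b == 9.5;
}
```

Built with the same cross-compiler options as t.c and run under the new loader, a return value of 1 is consistent with the callee-saved FP state being saved and restored correctly.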

---
diff --git a/sysdeps/mips/__longjmp.c b/sysdeps/mips/__longjmp.c
index 319da1895f..38fe98044c 100644
--- a/sysdeps/mips/__longjmp.c
+++ b/sysdeps/mips/__longjmp.c
@@ -38,12 +38,17 @@ ____longjmp (__jmp_buf env_arg, int val_arg)
 
 #ifdef __mips_hard_float
   /* Pull back the floating point callee-saved registers.  */
-  asm volatile ("l.d $f20, %0" : : "m" (env[0].__fpregs[0]));
-  asm volatile ("l.d $f22, %0" : : "m" (env[0].__fpregs[1]));
-  asm volatile ("l.d $f24, %0" : : "m" (env[0].__fpregs[2]));
-  asm volatile ("l.d $f26, %0" : : "m" (env[0].__fpregs[3]));
-  asm volatile ("l.d $f28, %0" : : "m" (env[0].__fpregs[4]));
-  asm volatile ("l.d $f30, %0" : : "m" (env[0].__fpregs[5]));
+# if __mips_fpr == 0 || __mips_fpr == 64
+#  define LDFPR "ldc1 "
+# else
+#  define LDFPR "l.d "
+# endif
+  asm volatile (LDFPR "$f20, %0" : : "m" (env[0].__fpregs[0]));
+  asm volatile (LDFPR "$f22, %0" : : "m" (env[0].__fpregs[1]));
+  asm volatile (LDFPR "$f24, %0" : : "m" (env[0].__fpregs[2]));
+  asm volatile (LDFPR "$f26, %0" : : "m" (env[0].__fpregs[3]));
+  asm volatile (LDFPR "$f28, %0" : : "m" (env[0].__fpregs[4]));
+  asm volatile (LDFPR "$f30, %0" : : "m" (env[0].__fpregs[5]));
 #endif
 
   /* Get the GP. */
diff --git a/sysdeps/mips/setjmp_aux.c b/sysdeps/mips/setjmp_aux.c
index 2f618e437c..6afc5cca7f 100644
--- a/sysdeps/mips/setjmp_aux.c
+++ b/sysdeps/mips/setjmp_aux.c
@@ -31,13 +31,18 @@ inhibit_stack_protector
 __sigsetjmp_aux (jmp_buf env, int savemask, int sp, int fp)
 {
 #ifdef __mips_hard_float
+# if __mips_fpr == 0 || __mips_fpr == 64
+#  define STFPR "sdc1 "
+# else
+#  define STFPR "s.d "
+# endif
   /* Store the floating point callee-saved registers...  */
-  asm volatile ("s.d $f20, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[0]));
-  asm volatile ("s.d $f22, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[1]));
-  asm volatile ("s.d $f24, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[2]));
-  asm volatile ("s.d $f26, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[3]));
-  asm volatile ("s.d $f28, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[4]));
-  asm volatile ("s.d $f30, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[5]));
+  asm volatile (STFPR "$f20, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[0]));
+  asm volatile (STFPR "$f22, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[1]));
+  asm volatile (STFPR "$f24, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[2]));
+  asm volatile (STFPR "$f26, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[3]));
+  asm volatile (STFPR "$f28, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[4]));
+  asm volatile (STFPR "$f30, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[5]));
 #endif
 
   /* .. and the PC;  */
diff --git a/sysdeps/unix/sysv/linux/mips/setcontext.S b/sysdeps/unix/sysv/linux/mips/setcontext.S
index 81ac4fd936..3aa6b93f5c 100644
--- a/sysdeps/unix/sysv/linux/mips/setcontext.S
+++ b/sysdeps/unix/sysv/linux/mips/setcontext.S
@@ -102,13 +102,17 @@ NESTED (__setcontext, FRAMESZ, ra)
 	l.d	fs7, (31 * SZREG + MCONTEXT_FPREGS)(v0)
 
 # else  /* _MIPS_SIM != _ABI64 */
-	l.d	fs0, (20 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs1, (22 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs2, (24 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs3, (26 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs4, (28 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs5, (30 * SZREG + MCONTEXT_FPREGS)(v0)
-
+#  if __mips_fpr == 0 || __mips_fpr == 64
+#   define LDFPR ldc1
+#  else
+#   define LDFPR l.d
+#  endif
+	LDFPR	fs0, (20 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs1, (22 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs2, (24 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs3, (26 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs4, (28 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs5, (30 * SZREG + MCONTEXT_FPREGS)(v0)
 # endif /* _MIPS_SIM != _ABI64 */
 
 	lw	v1, MCONTEXT_FPC_CSR(v0)
diff --git a/sysdeps/unix/sysv/linux/mips/swapcontext.S b/sysdeps/unix/sysv/linux/mips/swapcontext.S
index 4710dce2c2..21b6572c87 100644
--- a/sysdeps/unix/sysv/linux/mips/swapcontext.S
+++ b/sysdeps/unix/sysv/linux/mips/swapcontext.S
@@ -114,13 +114,17 @@ NESTED (__swapcontext, FRAMESZ, ra)
 	s.d	fs7, (31 * SZREG + MCONTEXT_FPREGS)(a0)
 
 # else  /* _MIPS_SIM != _ABI64 */
-	s.d	fs0, (20 * SZREG + MCONTEXT_FPREGS)(a0)
-	s.d	fs1, (22 * SZREG + MCONTEXT_FPREGS)(a0)
-	s.d	fs2, (24 * SZREG + MCONTEXT_FPREGS)(a0)
-	s.d	fs3, (26 * SZREG + MCONTEXT_FPREGS)(a0)
-	s.d	fs4, (28 * SZREG + MCONTEXT_FPREGS)(a0)
-	s.d	fs5, (30 * SZREG + MCONTEXT_FPREGS)(a0)
-
+#  if __mips_fpr == 0 || __mips_fpr == 64
+#   define STFPR sdc1
+#  else
+#   define STFPR s.d
+#  endif
+	STFPR	fs0, (20 * SZREG + MCONTEXT_FPREGS)(a0)
+	STFPR	fs1, (22 * SZREG + MCONTEXT_FPREGS)(a0)
+	STFPR	fs2, (24 * SZREG + MCONTEXT_FPREGS)(a0)
+	STFPR	fs3, (26 * SZREG + MCONTEXT_FPREGS)(a0)
+	STFPR	fs4, (28 * SZREG + MCONTEXT_FPREGS)(a0)
+	STFPR	fs5, (30 * SZREG + MCONTEXT_FPREGS)(a0)
 # endif /* _MIPS_SIM != _ABI64 */
 
 	cfc1	v1, fcr31
@@ -153,12 +157,18 @@ NESTED (__swapcontext, FRAMESZ, ra)
 	l.d	fs7, (31 * SZREG + MCONTEXT_FPREGS)(v0)
 
 # else  /* _MIPS_SIM != _ABI64 */
-	l.d	fs0, (20 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs1, (22 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs2, (24 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs3, (26 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs4, (28 * SZREG + MCONTEXT_FPREGS)(v0)
-	l.d	fs5, (30 * SZREG + MCONTEXT_FPREGS)(v0)
+#  if __mips_fpr == 0 || __mips_fpr == 64
+#   define LDFPR ldc1
+#  else
+#   define LDFPR l.d
+#  endif
+
+	LDFPR	fs0, (20 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs1, (22 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs2, (24 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs3, (26 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs4, (28 * SZREG + MCONTEXT_FPREGS)(v0)
+	LDFPR	fs5, (30 * SZREG + MCONTEXT_FPREGS)(v0)
 
 # endif /* _MIPS_SIM != _ABI64 */

Comment 15 Den 2022-02-19 08:27:47 UTC

(In reply to Adhemerval Zanella from comment #14)

Zanella, thank you for your support.

> In fact it does exactly what I suggested ... on comment #5.

Well, your patch here just replaces l.d with ldc1 and s.d with sdc1, but as I replied to your comment #5, the sdc1/ldc1 opcodes (almost the same double load/store, but to/from coprocessor 1) are not supported on single-float targets either. What about lwc1/swc1 and BYTES_PER_WORD? Doubles will not be doubles after those instructions, right?

By the way, there is a mistake in setjmp_aux.c: STFPR is defined as ldc1/l.d there, but it should be sdc1/s.d. Moreover, the .c files (at least those) cannot be compiled with LDFPR/STFPR written like that inside the asm volatile statements. I suppose you just showed a quick sketch.

> $ mips64-linux-gnu/bin/mips64-glibc-linux-gnu-gcc -mabi=32 -mips2
> -mhard-float -mfpxx t.c -o t

Please, when testing, also use the -msingle-float flag and/or -mfp32 instead of -mfpxx.

Comment 16 Adhemerval Zanella 2022-02-22 20:05:12 UTC

(In reply to Den from comment #15)
> (In reply to Adhemerval Zanella from comment #14)
> 
> Zanella, thank you for your support.
> 
> > In fact it does exactly what I suggested ... on comment #5.
> 
> Well, your patches here just replaces l.d to ldc1 and s.d to sdc1, but as I
> replied to your comment#5, the opcodes sdc1/ldc1 (the almost same load/store
> double but from/to coprocessor1) are not supported for single float targets
> as well. What about lwc1/swc1 and BYTES_PER_WORD? - doubles will not be
> doubles after those instructions, right?
> 
> By the way, you made a mistake in setjmp_aux.c (s.d and sdc1 instead of l.d
> and ldc1). Moreover, the files.c (at least them) can not be compiled when
> there are such presentations of LDFPR/STFPR inside asm volatiles. I suppose
> you just showed a quick realization.
> 
> > $ mips64-linux-gnu/bin/mips64-glibc-linux-gnu-gcc -mabi=32 -mips2
> > -mhard-float -mfpxx t.c -o t
> 
> Please, while tests, additionally use flag -msingle-float and/or -mfp32
> instead of -mfpxx

I am confused because in some places you refer to r5900 and in others to r6000. AFAIK r6000 was never supported on Linux [1], and it seems that gcc also does not fully support the r6010 floating-point controller:

$ cat test.c
void foo (double *m, double v)
{
  *m = v;
}
$ mips64el-glibc-linux-gnu-gcc -O2 -mabi=32 -march=r6000 -mfp32 -msingle-float -mhard-float test.c -S -o -
        .file   1 "test.c"
        .section .mdebug.abi32
        .previous
        .nan    legacy
        .module singlefloat
        .module nooddspreg
        .abicalls
        .option pic0
        .text
        .align  2
        .globl  foo
        .set    nomips16
        .set    nomicromips
        .ent    foo
        .type   foo, @function
foo:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        sw      $6,0($4)
        jr      $31
        sw      $7,4($4)
[...]
 
So gcc does not turn double stores into 'swc1'; it seems to be using softfp in this case.

In any case, I think you can try replacing LDFPR and STFPR in my patch with 'lwc1' and 'swc1' if __mips_fpr is 32.  Something like:

  #if __mips_fpr == 0 || __mips_fpr == 64
  # define STFPR sdc1
  #elif __mips_fpr == 32
  # define STFPR swc1
  #else
  # define STFPR s.d
  #endif

This assumes that the MIPS machine you are targeting has the same set of floating-point registers.

[1] https://www.linux-mips.org/wiki/R6000

Comment 17 Den 2022-02-23 08:06:43 UTC

(In reply to Adhemerval Zanella from comment #16)
> I am confused because in some places you refers to r5900 and other r6000.
I am assuming that r6000 support is better and more stable across gcc releases than r5900 support, which was accepted only relatively recently (2013?). The r5900 has some opcodes from the MIPS III ISA, but only SOME of them, so we chose r6000 for the FP tests because it is MIPS II and therefore will not generate opcodes that the r5900 does not support. The only toolchain I know of that generates stable FP code for the Linux mipsr5900el target consists of binutils-2.23.1 and gcc-svn-20130804. I assume binutils accepted the l.d/s.d opcodes before v2.25; then something changed in v2.25 and those opcodes became unsupported for single-float targets.
Maybe gcc can be told to emulate 64-bit operations with softfp, while 32-bit single-float operations are still done in hardware, with hard single-float FP?

> AFAIK r6000 was never supported on Linux [1], and it seems that gcc also
> does not fully support the r6010 floating point controller:
> 
> $ cat test.c
> void foo (double *m, double v)
> {
>   *m = v;
> }
> $ mips64el-glibc-linux-gnu-gcc -O2 -mabi=32 -march=r6000 -mfp32
> -msingle-float -mhard-float test.c -S -o -
>
> [...]
>  
> So gcc does not transform double stores to 'swc1', it seems to be using
> softfp in this case.
Without the -O2 flag the output changes...
Yes, I also could not get a double in C code to be compiled into swc1.
$ cat asmtest.c
int main(){
asm volatile ( "s.d $f0,($0)" );}
$ mipsr5900el-unknown-linux-uclibc-gcc asmtest.c -o asmtest.o
$ mipsr5900el-unknown-linux-uclibc-objdump -d asmtest.o
- I only noticed sdc1 in the output.
(mipsr5900el-unknown-linux-uclibc is the target of that stable toolchain with gcc-svn I described above)

> In an case, I think you can try replace LDFPR and STFPR on my patch to the
> with 'lwc1' and 'swc1' if __mips_fpr is 32.  Something like:
> 
>   #if __mips_fpr == 0 || __mips_fpr == 64
>   # define STFPR sdc1
>   #else if __mips_fpr == 32
>   # define STFPR swc1
>   #else
>   # define STFPR s.d
>   #endif
> 
> Assuming that the mips machine ou are targetting have the same set of
> floating -point registers.
> 
> [1] https://www.linux-mips.org/wiki/R6000
That would result in the changes I started this thread with, which are supposedly wrong.

So, for single-float targets, the task comes down to finding some way to store doubles into, and load them from, 32-bit floating-point registers. Can you imagine a practical implementation of the following test:
-] x is a double,
-] y may be a double too,
-] let x = the maximum value of a double (which is surely beyond the maximum value of a float),
-] some (math?) operation whose result can be stored in FP regs,
-] a word-store opcode is used to put x into an FP register,
-] --a word-load opcode is used to read the FP register into y-- instead of that, some (math?) operation that uses the data previously stored in the FP regs to retrieve the original value of x and return it in y,
-] printf the value of y.
-] if y == x, then hooray! I found a way to implement this in libc without rethinking the current ABI logic!
-] if y != x, then damn it, keep searching for a workaround...


I was thinking about taking the hex value of x (as it is stored in RAM), "slicing" it, putting the "sliced" pieces into float variables (maybe converting them back to decimal, to data acceptable for a float variable), then storing those variables into 32-bit FP regs, and doing the reverse: joining the hex data back into the original hex/decimal value of x.

Comment 18 Adhemerval Zanella 2022-02-23 11:57:00 UTC

(In reply to Den from comment #17)
> (In reply to Adhemerval Zanella from comment #16)
> > I am confused because in some places you refers to r5900 and other r6000.
> I am assuming that r6000 is supported the best and stable way from each gcc
> revision rather than r5900 which was accepted in relatively nearest years
> (2013?..). r5900 has some opcodes from ISA MIPS-III, but only SOME of them,
> so we decided to choose r6000 for fp tests purposes because of it's ISA
> MIPS-II, so it will be not generating the unsupported for r5900 opcodes. The
> only stable fp generating logic toolchain for linux mipsr5900el target known
> by me consists from binutils-2.23.1 and gcc-svn-20130804 . binutils was
> processing l.d/s.d opcodes, as I assume, before v2.25. Then something
> changed since v2.25 and those opcodes becomes unsupported for single-float
> targets
> Maybe it is possible in gcc to selectively choose 64bit operations being
> processed emulately, with softfp? And rely 32bit single float operation
> being processed hardwarely, with hard- single-float fp.

You will need to check on the gcc side; the MIPS ABI is a complete mess.  But from an
ABI perspective, using a different code sequence for double precision is a
different ABI.

> 
> > AFAIK r6000 was never supported on Linux [1], and it seems that gcc also
> > does not fully support the r6010 floating point controller:
> > 
> > $ cat test.c
> > void foo (double *m, double v)
> > {
> >   *m = v;
> > }
> > $ mips64el-glibc-linux-gnu-gcc -O2 -mabi=32 -march=r6000 -mfp32
> > -msingle-float -mhard-float test.c -S -o -
> >
> > [...]
> >  
> > So gcc does not transform double stores to 'swc1', it seems to be using
> > softfp in this case.
> If to not use -O2 flag then the output changes...

Yes, but the ABI should not change depending on whether you use optimization flags or not.

> Yes, I actually could not obtain whenever double is transformed from some C
> code into swc1 too.
> $ cat asmtest.c
> int main(){
> asm volatile ( "s.d $f0,($0)" );}
> $ mipsr5900el-unknown-linux-uclibc-gcc asmtest.c -o asmtest.o
> $ mipsr5900el-unknown-linux-uclibc-objdump -d asmtest.o
> - I noticed sdc1 only among the output.
> (mipsr5900el-unknown-linux-uclibc is the target of that stable toolchain
> with gcc-svn I described above)

I am still learning the peculiarities of the MIPS ABI, and it seems that my initial example
does not exercise what I was trying to find out.  It seems that the o32 ABI always passes
floating-point arguments in $a0-$a3.  Here is another example that forces a floating-point
store:

$ cat test.c 
float foo_float (float *m, float x, float y)
{
  *m = x + y;
  return *m;
}

double foo_double (double *m, double x, double y)
{
  *m = x + y;
  return *m;
}

$ mips64el-linux-gnu/bin/mips64el-glibc-linux-gnu-gcc -O2 -mabi=32 -mfp32 -msingle-float -mhard-float test.c -S -o -
[...]
foo_float:
[...]
	mtc1	$5,$f0
	mtc1	$6,$f1
	nop
	add.s	$f0,$f0,$f1
	jr	$31
	swc1	$f0,0($4)
[...]
foo_double:
	.frame	$sp,32,$31		# vars= 0, regs= 2/0, args= 16, gp= 8
	.mask	0x80010000,-4
	.fmask	0x00000000,0
	.set	noreorder
	.set	nomacro
	addiu	$sp,$sp,-32
	move	$2,$6
	move	$3,$7
	lw	$6,48($sp)
	lw	$7,52($sp)
	sw	$16,24($sp)
	move	$5,$3
	move	$16,$4
	sw	$31,28($sp)
	jal	__adddf3
	move	$4,$2

	lw	$31,28($sp)
	sw	$2,0($16)
	sw	$3,4($16)
	lw	$16,24($sp)
	jr	$31
	addiu	$sp,$sp,32
[...]

So this confirms that with -mfp32 softfp is used for doubles.

> 
> > In an case, I think you can try replace LDFPR and STFPR on my patch to the
> > with 'lwc1' and 'swc1' if __mips_fpr is 32.  Something like:
> > 
> >   #if __mips_fpr == 0 || __mips_fpr == 64
> >   # define STFPR sdc1
> >   #else if __mips_fpr == 32
> >   # define STFPR swc1
> >   #else
> >   # define STFPR s.d
> >   #endif
> > 
> > Assuming that the mips machine ou are targetting have the same set of
> > floating -point registers.
> > 
> > [1] https://www.linux-mips.org/wiki/R6000
> That will result to the changes I began in this thread from. Which are
> supposedly wrong.

I think you misunderstood what I pointed out as wrong: the issue was changing the
jmp_buf size, not making the use of lwc1/swc1 conditional on the __mips_fpr
being used by the compiler.

glibc does not use an opaque type but rather a struct with the expected internal
types, to make sure that the internal field alignment is the expected one (as you noted,
newlib had to resort to a hack to make it work, since it defines jmp_buf as an
array of 'int').

> 
> So, for the single-float targets, the task comes down to a someway method of
> storing/loading doubles into/from 32bit floating point registers. Can you
> imagine the practice realization of the next test:
> -] x is double, 
> -] y is might double too,
> -] let x=maximum value of double (which is surely beyond the float maximum
> value),
> -]some (math?) operation which result to the output which is storable in fp
> regs,
> -]word store opcode use to fp register with x,
> -] --word load opcode use from fp register with y-- instead of it - some
> (math?) operation to use the previously stored data in fp regs to retreive
> the original value of x and return it to the y
> -] printf value of y.
> -] y=x then hooray! - I found a way to implement that in libc without
> current abi logic thinking-out!
> -] y!=x then damn it, searching for the workaround...
> 
> 
> I was thinking about getting the hex value of the x (how it is stored in
> RAM), "slice" it, return the "sliced" pieces to float variables (maybe with
> a conversion to the decimal back again, the acceptable for float variable
> data), then store those variables into 32bit fp regs, and the reverse hex
> data uniting into an original for the x hex/decimal value.

I think you are also misunderstanding what needs to be done here: jmp_buf is essentially
an opaque type with enough room to store the machine *state* that is saved
and restored.  If the machine register state does not contain double-precision
floating-point state, there is no sense in trying to save and restore it; you
just need to save and restore the single-precision state (the ABI expectations will
be fulfilled).

I attached a patch which should be a fix; at least the stdlib setcontext/makecontext
tests do not show any regression with qemu-user (although I don't think
qemu-user has an option to select a CPU with single precision only). The
double-precision tests use softfp, which should be handled by saving/restoring the
general registers.

Comment 19 Adhemerval Zanella 2022-02-23 11:57:38 UTC

Created attachment 13994 [details]
patch

Comment 20 Den 2022-03-05 17:43:05 UTC

(In reply to Adhemerval Zanella from comment #19)
> sysdeps/mips/__longjmp.c
> +# if __mips_fpr == 0 || __mips_fpr == 64
> +#  define LDFPR "ldc1 "
> +# elif __mips_fpr == 32
> +#  define LDFPR "lwc1 "
> +# else
> +#  define LDFPR "l.d "
> +# endif
> +  asm volatile (LDFPR "$f20, %0" : : "m" (env[0].__fpregs[0]));
> ...
Those LDFPR macros inside "asm volatile" are not replaced with their logic definition. That does not work for the .c files, so they cannot be compiled (the compiler "thinks" that ldfpr is an opcode). The trick might work with .s files.

The bad news for me is that I found there is currently no suitable toolchain with the newest binutils and gcc for building a stable, working cross-compiler for either r6000 with single-float (with double it is OK) or r5900; such a cross-compiler generates malfunctioning code somewhere...
So I cannot test whether your patch, that is, your assumption that the FP state can be saved to and restored from 32-bit registers just by inserting swc1/lwc1 into the current ABI implementation, is correct or wrong, because I have nothing stable to rely on and compare with.

Comment 21 Adhemerval Zanella 2022-03-07 11:07:44 UTC

(In reply to Den from comment #20)
> (In reply to Adhemerval Zanella from comment #19)
> > sysdeps/mips/__longjmp.c
> > +# if __mips_fpr == 0 || __mips_fpr == 64
> > +#  define LDFPR "ldc1 "
> > +# elif __mips_fpr == 32
> > +#  define LDFPR "lwc1 "
> > +# else
> > +#  define LDFPR "l.d "
> > +# endif
> > +  asm volatile (LDFPR "$f20, %0" : : "m" (env[0].__fpregs[0]));
> > ...
> Those LDFPR macroses inside "asm volatile" are not replaced with their logic
> definition. That will not do the trick for the .c files, so they can not be
> compiled (as the compiler "thinks" that it is ldfpr opcode). The trick might
> be working with .s files.

What do you mean by 'replaced with their logic definition' here? For '__mips_fpr == 32', lwc1 is used to restore the floating-point state, which AFAIU is the expected instruction for loading it on machines that do not support double precision.

> 
> The bad thing for me is that I obtained that currently there is no a
> suitable toolchain with the newest binutils and gcc to build a stable
> working cross-compiler for either r6000 with single-float (when using double
> it is OK) or r5900 - that kind of the cross-compiler generates malfunctions
> somewhere...

Can't you use the expected compiler-flags (-O2 -mabi=32 -mfp32 -msingle-float -mhard-float) along with a recent gcc/binutils?

> So, I can not check with a test whenever your patch, your suppose about
> saving/restoring fp state to/from 32bit registers just by using swc1/lwc1
> "intrusions" in the current ABI realization, is correct or wrong. Because I
> have nothing stable to be sure in, to compare with.

Comment 22 Den 2022-03-07 16:17:23 UTC

(In reply to Adhemerval Zanella from comment #21)
> What do you mean 'replaced with their logic definition' here?
I'm sorry for my English...
Just try to compile those edited .c files and you will see an error. Here is the test file:
$ cat asm_test.c
# if __mips_fpr == 0 || __mips_fpr == 64
#  define LDFPR "ldc1 "
# elif __mips_fpr == 32
#  define LDFPR "lwc1 "
# else
#  define LDFPR "l.d "
# endif

int main(){
	asm volatile ( "LDFPR $f0,($0)" );}
$ mipsel-unknown-linux-uclibc-gcc asm_test.c -o asm_test.o

Error: unrecognized opcode `ldfpr $f0,($0)'

See? That LDFPR is not replaced by any of the definitions, neither lwc1 nor l.d. The same error appears when trying to compile glibc with your patch applied.

> Can't you use the expected compiler-flags (-O2 -mabi=32 -mfp32
> -msingle-float -mhard-float) along with a recent gcc/binutils?

OK, gcc-9.2.0, binutils-2.34.
$ mipsel-unknown-linux-uclibc-gcc -Q --help=target
I see all of these flags enabled, except -O2. And some of the resulting binaries segfault (from a shallow test: magick, groupadd, native mipsel gdb). The same binaries work well when built with soft-float. I am not entirely sure, but I think hard-float with double-float could be stable too; I just do not know how to test it, and besides, a cross-compiler that uses doubles is not suitable for our goal.
To study the cause of those segfaults in depth I need a working gdb compiled with hard single-float. I need it either in QEMU or natively on the r5900. Since the cross-compiler for r5900 generates malfunctioning code, I need to build it statically for r6000 (MIPS II) in some QEMU mipsel target, but I tried and realized that it builds soooo looong...
The other debugging method is QEMU's gdbstub, but I have some problems understanding how to set it up... Anyway, hunting for the segfault trigger is not really a glibc-related concern.

Comment 23 Adhemerval Zanella 2022-03-07 16:59:47 UTC

(In reply to Den from comment #22)
> (In reply to Adhemerval Zanella from comment #21)
> > What do you mean 'replaced with their logic definition' here?
> I'm sorry for my English...
> Just try to compile those edited files .c and you see an error. Here is the
> test file:
> $ cat asm_test.c
> # if __mips_fpr == 0 || __mips_fpr == 64
> #  define LDFPR "ldc1 "
> # elif __mips_fpr == 32
> #  define LDFPR "lwc1 "
> # else
> #  define LDFPR "l.d "
> # endif
> 
> int main(){
> 	asm volatile ( "LDFPR $f0,($0)" );}

You need to put LDFPR *outside* the string, otherwise the preprocessor won't make the substitution:

  int main ()
  {
    asm volatile (LDFPR "$f0,($0)");
  }

> $ mipsel-unknown-linux-uclibc-gcc asm_test.c -o asm_test.o
> 
> Error: unrecognized opcode `ldfpr $f0,($0)'
> 
> - see? - that LDFPR not changes to any either defined lwc1 or l.d . The same
> error appears when trying to compile the glibc with your patch applied.

With the fix above:

  $ mips64-glibc-linux-gnu-gcc -O2 -mabi=32 -mfp32 -msingle-float -mhard-float test.c -c
  $ mips64-glibc-linux-gnu-objdump -d test.o
  [...]
  00000000 <foo>:
     0:   c4000000        lwc1    $f0,0(zero)
     4:   03e00008        jr      ra
     8:   00000000        nop
     c:   00000000        nop

> 
> > Can't you use the expected compiler-flags (-O2 -mabi=32 -mfp32
> > -msingle-float -mhard-float) along with a recent gcc/binutils?
> 
> OK, gcc-9.2.0, binutils-2.34.
> $ mipsel-unknown-linux-uclibc-gcc -Q --help=target
> I see all of these flags enabled, except -O2. And I'm getting some binaries
> provides SegFault (for a not deep test - magick, groupadd, native mipsel
> gdb). These binaries are working well when they were got using soft-float.
> I'm not sure well but I think using hard-float double-float could be stable
> too, I just do not know a way how to test it; moreover, that type of
> cross-compiler that uses doubles is not suitable for our goal.
> To study the reason of those SegFaults deeply I need a working gdb compiled
> using hard'n'single-float. I need it either in QEMU or for r5900's native.
> Since the cross-compiler for r5900 generates the malfunction results, then I
> need to build it statically for r6000 (MIPSII) in some QEMU mipsel target,
> but I tried and realized that it builds soooo looong...
> The other debug method is QEMU's gdbstub, but I have some problems
> understanding how to set it up... Anyway, the searches for the SegFault
> trigger is not really a glibc related problem concern.

Can't you dump a core file and analyze it on a different box? Or configure the box to use remote gdb?

Another option would be to add an strace dump to see where it fails and add some old-fashioned printf debugging ...

Comment 24 Den 2022-03-18 15:01:28 UTC

(In reply to Adhemerval Zanella from comment #23)
> I'm getting some binaries provides SegFault (for a not deep test - magick,
> groupadd, native mipsel gdb). These binaries are working well when they were
> got using soft-float.
> ... I just do not know a way how to test it ...
> To study the reason of those SegFaults deeply I need a working gdb compiled
> using hard'n'single-float. I need it either in QEMU or for r5900's native.
I finally understood how to set things up for debugging with gdb, and where to do the debugging itself.
The cross-compiler targets r6000 and is based on uClibc (because glibc refuses to build standalone static binaries); it is uClibc-ng-1.0.40 patched with your edits, Zanella.
I have a mipsel Debian system installed inside QEMU, with gdb-7.12-6 installed via apt-get install.
I statically compiled ImageMagick with the uClibc-based r6000 cross-compiler and got its main static binary, called "magick". Debugging through gdb:
$ gdb magick
$ run import scr.jpg
 Program received signal SIGSEGV, Segmentation fault.
 0x008c30e4 in __divdf3 (x=0, y=4) at ../../../libgcc/config/hardfp.c:37
 37	../../../libgcc/config/hardfp.c: No such file or directory.

I do not know whether this is again an issue with single-float targets or something else, but this was the cause of the malfunctions in the code generated by the cross-compilers I was building for r5900 and r6000! I am sure it is somehow related to this:
https://gcc.gnu.org/legacy-ml/gcc-patches/2014-02/msg00420.html
I know this is the wrong place to discuss that... I will just describe the workaround for my case:
file gcc-9.2.0/libgcc/config.host :
 	# All MIPS targets provide a full set of FP routines.
 	cpu_type=mips
 	tmake_file="mips/t-mips"
-	if test "${libgcc_cv_mips_hard_float}" = yes; then
-		tmake_file="${tmake_file} t-hardfp-sfdf t-hardfp"
-	else
+#	if test "${libgcc_cv_mips_hard_float}" = yes; then
+#		tmake_file="${tmake_file} t-hardfp-sfdf t-hardfp"
+#	else
 		tmake_file="${tmake_file} t-softfp-sfdf"
-	fi
+#	fi
 	if test "${ac_cv_sizeof_long_double}" = 16; then
 		tmake_file="${tmake_file} mips/t-softfp-tf"
 	fi

In the end, I built a cross-compiler for the r5900 target based on glibc-2.32 with your patch, Zanella, and with my gcc edit above, and recompiled the target system from scratch. Everything seems all right.
Zanella, I am ready to test saving/restoring those machine *states* on real hardware, if you know how to do that, to be sure that your edits are correct and compatible.
By the way, if you submit a patch to the glibc source, please do not forget to make the same edits to the file
sysdeps/unix/sysv/linux/mips/getcontext.S
as well.