This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH resend] MIPS: Allow FPU emulator to use non-stack area.

From: David Daney <ddaney at caviumnetworks dot com>
To: Rich Felker <dalias at libc dot org>
Cc: David Daney <ddaney dot cavm at gmail dot com>, <libc-alpha at sourceware dot org>, <linux-kernel at vger dot kernel dot org>, <linux-mips at linux-mips dot org>, David Daney <david dot daney at cavium dot com>
Date: Mon, 6 Oct 2014 14:18:19 -0700
Subject: Re: [PATCH resend] MIPS: Allow FPU emulator to use non-stack area.
Authentication-results: sourceware.org; auth=none
References: <1412627010-4311-1-git-send-email-ddaney dot cavm at gmail dot com> <20141006205459 dot GZ23797 at brightrain dot aerifal dot cx>

On 10/06/2014 01:54 PM, Rich Felker wrote:

On Mon, Oct 06, 2014 at 01:23:30PM -0700, David Daney wrote:

From: David Daney <david.daney@cavium.com>

In order for MIPS to be able to support a non-executable stack, we
need to supply a method to specify a userspace area that can be used
for executing emulated branch delay slot instructions.

We add a new system call, sys_set_fpuemul_xol_area so that userspace
threads that are using the FPU can specify the location of the FPU
emulation out of line execution area.

Background:

MIPS floating point support requires that any instruction that cannot
be directly executed by the FPU, be emulated by the kernel.  Part of
this emulation involves executing non-FPU instructions that fall in
the delay slots of FP branch instructions.  Since the beginning of
MIPS/Linux time, this has been done by placing the instructions on the
userspace thread stack, and executing them there, as the instructions
must be executed in the MM context of the thread receiving the
emulation.

Because of this, the de facto MIPS Linux userspace ABI requires that
the userspace thread have an executable stack.  It is de facto,
because it is not written anywhere that this must be the case, but it
is nevertheless a requirement.

Problem:

How do we get MIPS Linux to use a non-executable stack in the face of
the FPU emulation problem?

Since userspace desires to change the ABI, put some of the onus on the
userspace code.  Any userspace thread desiring a non-executable stack,
must allocate a 4-byte aligned area at least 8 bytes long that
has read/write/execute permissions and pass the address of that area
to the kernel with the new sys_set_fpuemul_xol_area system call.
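
The userspace side of this requirement can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `is_valid_xol_area` and `allocate_xol_area` are hypothetical, a whole anonymous page is mapped simply because that is the smallest unit mmap hands out, and the proposed sys_set_fpuemul_xol_area call itself is not invoked, since its system call number and wrapper are not given here.

```python
import ctypes
import mmap

XOL_MIN_BYTES = 8   # from the patch description: at least 8 bytes long
XOL_ALIGNMENT = 4   # from the patch description: 4-byte aligned


def is_valid_xol_area(address: int, length: int) -> bool:
    """Check the two constraints stated in the patch description (hypothetical helper)."""
    return address % XOL_ALIGNMENT == 0 and length >= XOL_MIN_BYTES


def allocate_xol_area():
    """Map one anonymous page with read/write/execute permissions.

    Returns the mmap object (keep it alive while the area is in use) and
    the address of its first byte. Passing that address to the kernel
    would be the job of the proposed system call, which is not shown.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    area = mmap.mmap(-1, mmap.PAGESIZE,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     prot=prot)
    address = ctypes.addressof(ctypes.c_char.from_buffer(area))
    return area, address
```

A page returned by mmap is page aligned, so it satisfies the 4-byte alignment trivially; the check matters mostly for an area carved out of a larger buffer.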

This is similar to how we require userspace to notify the kernel of
the value of the thread local pointer.


Userspace should play no part in this; requiring userspace to help
make special accommodations for fpu emulation largely defeats the
purpose of fpu emulation.

That is certainly one way of looking at it. Really it is opinion, rather than fact, though.

glibc is full of code (see ld.so) that in earlier incarnations of Unix/Linux was in kernel space, and was moved to userspace. Given that there is a partitioning of code between kernel space and userspace, I think it not totally unreasonable to consider doing some of this in userspace.

Even on systems with hardware FPU, the architecture specification allows for/requires emulation of certain cases (denormals, etc.). So it is already a requirement that userspace cooperate by always having free space below $SP for use by the kernel. So the current situation is that userspace is providing services for the kernel FPU emulator.

My suggestion is to change the nature of the way these services are provided by the userspace program.

The kernel is perfectly capable of mapping
an appropriate page. The mapping should happen at exec time, and at
clone time with CLONE_VM

Why? This adds overhead for threads that don't use the FPU. So this suggestion adds at least one page of memory overhead for each thread in the system (unless I misunderstand what you are saying).

unless the kernel is going to handle mutual
exclusion so that only one thread can be using the page at a time.
(Using one page for the whole process, and excluding simultaneous
execution of fpu emulation in multiple threads, may be the more
practical approach.)

As an alternative, if the space of possible instructions with a delay
slot is sufficiently small, all such instructions could be mapped as
immutable code in a shared mapping, each at a fixed offset in the
mapping. I suspect this would be borderline-impractical (multiple
megabytes?), but it is the cleanest solution otherwise.

Yes, there are 2^32 possible instructions. Each one is 4 bytes, plus you need a way to exit after the instruction has executed, which would require another instruction. So you would need 32 GB of memory to hold all those instructions, larger than the 32-bit virtual address space.
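
The arithmetic behind that figure can be written out as a short check. The variable names are only illustrative, and the one extra instruction per slot is the exit instruction mentioned above.

```python
INSTRUCTION_BYTES = 4                 # one MIPS instruction word
ENCODINGS = 2 ** 32                   # every possible 32-bit instruction
SLOT_BYTES = 2 * INSTRUCTION_BYTES    # the instruction plus one exit instruction

table_bytes = ENCODINGS * SLOT_BYTES  # size of the full fixed-offset table
address_space_bytes = 2 ** 32         # 32-bit virtual address space

print(table_bytes // 2 ** 30, "GiB needed")
print(address_space_bytes // 2 ** 30, "GiB addressable")
```

The table comes to 2^35 bytes, eight times the size of the entire 32-bit address space, before any room is left for the program itself.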

Rich
