This is the mail archive of the ecos-patches@sourceware.org mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug 1001607] Cortex-M4F architectural Floating Point Support


Please do not reply to this email. Use the web interface provided at:
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001607

Jonathan Larmour <jifl@ecoscentric.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO

--- Comment #12 from Jonathan Larmour <jifl@ecoscentric.com> 2012-07-01 05:23:09 BST ---
Hi Ilija,

(In reply to comment #10)
> (In reply to comment #9)
> > Architecture HAL:
> > 
> > - The CDL seems to be getting much more complex than it really needs to be.
> > There is now CDL for: CYGHWR_HAL_CORTEXM (== M3 or M4),
> > CYGINT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM3_CODE,
> > CYGINT_HAL_CORTEXM_FPU, CYGINT_HAL_CORTEXM_FPV4_SP_D16, CYGHWR_HAL_CORTEXM_FPU,
> > and to an extent CYGHWR_HAL_CORTEXM_FPU_SWITCH=="NONE", which all have
> > overlapping behaviour with respect to each other and effects on each other.
> > There is unnecessary redundancy here.
> > 
> > I think we can simply remove CYGINT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM4_CODE
> > and CYGOPT_HAL_CORTEXM3_CODE because even if building with -mcpu=cortex-m4, gcc
> > defaults to soft floating-point. So that means if CYGHWR_HAL_CORTEXM == "M4"
> > then it's safe to use -mcpu=cortex-m4. So CYGBLD_GLOBAL_XFLAGS_ARCH can go too.
> 
> I am aware of this excessive complexity, as it gave me some headache but I'm
> afraid that for the time being it is necesary. The reason for keeping
> -mcpu=cortex-m3 option even for Cortex-M4 cores, is compatibility with the
> actual eCos compiler, gcc-4.3.2, that doesn't support/recognize
> -mcpu=cortex-m4. We can reconsider once we switch to gcc-4.6.3 or newer. 

Okay. But can't we just let the user do this by changing the value of
CYGHWR_HAL_CORTEXM? We can just (temporarily) relax the CYGHWR_HAL_CORTEXM ==
"M4" in the kinetis variant HAL. Since we can't use M4 features without the
compiler support, in practice it's treating as an M3 anyway. Then we ensure
CFLAGS/LDFLAGS get set according to CYGHWR_HAL_CORTEXM.

> Provided that we keep selection I would be happier with radio buttons. I have
> seen some of them (such as schedulers) but haven't figured out how it's done.
> What's the trick?

The trick is for the two options to implement a CDL interface, and that CDL
interface has a requires statement on it which says it must be == 1.

> > - CYGHWR_HAL_CORTEXM_FPU should be just:
> >             active_if       CYGINT_HAL_CORTEXM_FPU
> >             default_value   1
> > 
> > You don't need to set the default value specially when its inactive anyway.
> 
> I know, only it doesn't look beautiful (in my eyes) if some checkbox is checked
> and inactive. But probably I could get used to.

I guess it's only my own preference, so if you prefer it that way, keep it like
that.

> > - Here's a possible outline of a simplification. What do you think?
> >   cdl_option CYGHWR_HAL_CORTEXM_FPU_FLAGS {
> >      display    "FPU build flags"
> >      calculated { (CYGHWR_HAL_CORTEXM_FPU ? " -mfloat-abi=hard" : "" ) .
> > (CYGINT_HAL_CORTEXM_FPV4_SP_D16 ? " -mfpu=fpv4-sp-d16" : "" ) }
> >    }
> >    requires is_substr(CYGBLD_GLOBAL_CFLAGS, CYGHWR_HAL_CORTEXM_FPU_FLAGS )
> >    requires is_substr(CYGBLD_GLOBAL_LDFLAGS, CYGHWR_HAL_CORTEXM_FPU_FLAGS )
> >   requires { !CYGHWR_HAL_CORTEXM_FPU implies !is_substr(CYGBLD_GLOBAL_CFLAGS,
> > "-mfpu=fpv4-sp-d16") && !is_substr(CYGBLD_GLOBAL_CFLAGS, "-mfloat-abi=hard") }
> >   requires { !CYGHWR_HAL_CORTEXM_FPU implies !is_substr(CYGBLD_GLOBAL_LDFLAGS,
> > "-mfpu=fpv4-sp-d16") && !is_substr(CYGBLD_GLOBAL_LDFLAGS, "-mfloat-abi=hard") }
> 
> Since you ask me, I would get rid of is_substr(). Something like this:
[snip]
>       calculated { CYGHWR_HAL_CORTEXM_FPU ? " -mfloat-abi=hard" : "" }
[snip similar]
> Then we concatenate all these in CYGBLD_GLOBAL_CFLAGS.

Sure you could do it like that, I was just trying to avoid the extra options,
but I see your argument about it being easier to deal with other FPUs later.
But you need to have appropriate "requires" lines still, otherwise there are
problems if the user has set a value, or one has been inferred, which means the
default_value is no longer being used. Imagine the user changes -O2 to -O0
because they're debugging (so CYGBLD_GLOBAL_CFLAGS no longer comes from the
default_value), and then disables the hard FPU support. The flags used wouldn't
change and there is no conflict, but it is a problem. Ditto the other way
round.

> > - There is a quite important design issue. A lot of things assume the only
> > processor with an FPU is the Cortex-M4 and that FPU has the fpv4-sp-d16
> > arrangement. This is unlikely to remain the case, and it's easy to anticipate.
> > The M4 FPU is a bit cut-down after all, so it's easy to see a full VFP4 unit
> > being added in a later Cortex-M.  So we should be wrapping up much more
> > functionality in macros that can be supplied according to the FPU design, the
> > most obvious alternative possibility being a full VFP4 unit, to give an idea of
> > what to abstract.
> > 
> > With that in mind, I think there needs to be a fpv4-sp-d16.h header file,
> > probably included from cortexm_fpu.h (which could in future then instead
> > include alternatives e.g. a vfp4.h header once that became relevant). That
> > FPU-specific header will define a number of FPU-model-specific properties and
> > macros used for the implementation, such as from cortexm_stub.h and context.S. 
> > 
> 
> Funny, usually I am warned for over-generalization, now it'oposite :-).

You can't win. :-)

> I will consuder this and try to construct something. I wonder if we will need
> cortexm_fpu.h (yet another level of indirection) or we can simply rename
> cortexm_fpu.h into fpv4-sp-d16.h.

I think it is better to have one single header to include which provides the
necessary definitions, even if other files need to be included. Otherwise we
get things like this at the top of multiple files:
#if defined(CYGINT_HAL_CORTEXM_FPV4_SP_D16)
# include <cyg/hal/fpv4-sp-d16.h>
#elif defined(CYGINT_HAL_CORTEXM_VFP4)
# include <cyg/hal/vfp4.h>
#elif defined(CYGINT_HAL_CORTEXM_VFP4_D16)
etc.etc.

It also keeps anything common in one place, and maybe certain defaults which
are only overriden by the FPU-implementation specific header if needed, which
may simplify those headers.

> Then we have cortexm_fpu.c and hal_cortexm_fpu.cdl...

We don't need separate versions of those files. The CDL file has very little
specific to the FPU implementation other than the build flags. At this point,
cortexm_fpu.c doesn't need to change, but if it turned out there were things in
future which depended on the FPU implementation, then the way to deal with it
is to provide definitions in fpv4-sp-d16.h (included via cortexm_fpu.h) which
allow cortexm_fpu.c to do the right thing.

> > - In cortexm_regs.h, there has been the addition of
> > CYGARC_DSB()/CYGARC_CORTEXM_DATA_SYNC_BARRIER() and
> > CYGARC_ISB()/CYGARC_CORTEXM_INSTR_SYNC_BARRIER(). I think you should choose one
> > name for each and stick with it.
> 
> I'll keep the short one.

Fine.

> > Also for consistency with some other architectural ports, there should be a
> > HAL_MEMORY_BARRIER define:
> > 
> > #define HAL_MEMORY_BARRIER() \
> >   CYGARC_CORTEXM_DATA_SYNC_BARRIER(); CYGARC_CORTEXM_INSTR_SYNC_BARRIER()
> 
> Or:
> #define HAL_MEMORY_BARRIER() CYGARC_DSB(); CYGARC(ISB)

Indeed :).

Although thinking about it, it probably does want a CYG_MACRO_START/END
wrapper, otherwise (although very unlikely) someone could theoretically do:
if (something)
  HAL_MEMORY_BARRIER();
and things might not behave as intended.

> > ( This could then be used in cortexm_fpu.h, hal_fpu.c etc, to replace existing
> > calls which do exactly that, although that isn't vital.)
> > 
> > OOI strictly that cortexm_regs.h doesn't need #ifndef __ASSEMBLER__ as its all
> > macro definitions, but you can keep it like that if you want.
> 
> What's OOI?

Out of interest. http://www.internetslang.com/OOI-meaning-definition.asp
English has too many idioms and sayings, so lazy people like me abbreviate
them. :-)

> > - You don't save the floating point context for interrupts, but do for
> > exceptions. There's an argument that since both interrupt and exception
> > handlers could call arbitrary code which therefore may use FP, the context
> > should be saved. There's also an argument that unless this is a ROM monitor and
> > including stubs, you don't need to save the exception info either. I think
> > there may be merit in having two config options to govern whether to save FP
> > state for exceptions and interrupts. The default value for the exceptions one
> > would be CYGSEM_HAL_ROM_MONITOR || CYGDBG_HAL_DEBUG_GDB_INCLUDE_STUBS, the
> > default for the interrupt one would be 0.
> 
> I expected some arguments here... I can't imagine myself using FPU in ISR or
> DSR, but if it is an option, disabled by default here are some considerations:
>   - ALL: Implementation is easy, we just skip disabling of automatic stack
> saving.
>   - LAZY: Lazy saving introduces variable frame length that requires handling
> that will produce worst case latency even worse than for ALL. It requires suble
> intervention in ISR and may unnecessarily delay the project.
> 
> Therefore I would consider immediate implementation for ALL, but for LAZY i
> would prefer to put "Not implemented in current release" for the time being.

I think that's ok. So, to be clear, there would be options for whether to save
FP state for interrupts and exceptions, and for the interrupt one, it would
require !LAZY?

[snip] 
> > - Again we should definitely aim to replace FPU-specific parts with macros,
> > rather than duplicating entire functions such as hal_default_exception_vsr.
> > Either with the C preprocessor, or GAS .macro. If the latter you could have an
> > include/fpu.inc file, which will #include fpv4-sp-d16.h again (which would then
> > need to be asm-include safe) to configure itself - up to you. You can again
> > look at the i386 arch HAL to see what it did in the same situation -
> > specifically arch.inc, although it is more complex than we would need to be in
> > cortex-m.
> > 
> > But I think something needs to be done otherwise it becomes much harder to
> > follow the asm.
> 
> My intention was to leave as much of present code, especially ASM verbatim.
> Hence the code duplication.
> But now we can go on with macros.

Nick has mentioned to me in passing that he thinks that although it doesn't
implement lazy saves, the MIPS HAL is a better example to look at for examples
of such macros - the I386 has other complex bits you wouldn't have to worry
about, like SSE, which just make things look harder than they actually are. So
have a look at the MIPS architecture HAL's arch.inc.

> > - Given what I suggested above about being able to save FP context from ISRs,
> > something would be needed with the hal_default_interrupt_vsr here, although
> > with some care, I would expect the same macros should be able to be reused for
> > both exceptions and interrupts.
> 
> The automatic stack frame is different than the one used in
> default_exception_vsr

Of course, yes, that slipped my mind.

> > - hal_usagefault_exception_vsr is almost identical to
> > hal_default_exception_vsr. I don't see why hal_default_exception_vsr can't be
> > used, and instead treated specially in hal_deliver_exception() instead, which
> > would be less code overall.
> > 
> 
> Cortex-M architecture, unlike pure ARM, provides us with individual
> exception/interrupt vectors and it would be waste not to use them. We get both
> simpler, cleaner and faster code. I would rather extend this approach to other
> other exception handlers.

For interrupts I might agree, but exceptions are as defined in the name -
exceptional, i.e. unusual. They aren't common so runtime performance is not the
same concern. Duplicating each one is just a waste of code space. The
efficiency saving is probably only about 3 instructions (a mov, comparison and
branch). And this for something which only ever really gets used in lazy FPU
mode, and then only if a thread uses FP, and then only once ever per thread.

Also as I mentioned, it also breaks the ability to properly debug an unhandled
usage fault with the stub.

> > Tests:
> > 
> > - For fptestf.c and fpinttest.c: The test for whether the test is NA should
> > also include <pkgconf/isoinfra.h> and check CYGINT_ISO_C_CLOCK_FUNCS is
> > defined, and also CYGPKG_LIBM for fabsf() (which should then not be
> > prototyped). Note this uses a libm implementation which hasn't been checked in
> > yet. 
> 
> FYI fabsf() is a GCC built-in function, no libm involved, hence declaration.

Okay. Unless someone uses -fno-builtin, which I've had cause to do in the past
when the compiler has misoptimised some things! Please use __builtin_fabsf()
instead (with a comment to say it comes from GCC).

> It
> is of course temporary until float libm gets committed (thanks for the
> reminder).

In which case at that point we'll be able to use:
#ifndef CYGPKG_LIBM
# define fabsf(_f) __builtin_fabsf(_f)
#endif

[snip stuff presumably you agree with :)] 

Regarding comment #11, can you send me an updated version of the CDL which uses
my approach to the FPU flags (which you say results in an unresolved conflict
when the user changes options)? It would save me having to duplicate by hand
something you've already done. Then I can hopefully see if there's a better
way, but still simpler than your original. I might be wrong :-). You can email
me direct rather than attaching it to the bug if you prefer, whichever is
easiest.

Cheers,

Jifl

-- 
Configure bugmail: http://bugs.ecos.sourceware.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]