This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

MIPS: Standalone/inline assembly issues (was: MIPS/o32: Fix internal_syscall5/6/7)


Hi Adhemerval,

 Following up on the issues with MIPS assembly, both standalone sources 
and (as you have observed) inline pieces.

On Thu, 24 Aug 2017, Adhemerval Zanella wrote:

> >  NB while looking at it I've noticed we do not pass any `-O' optimisation 
> > flag to the GCC driver while building .S files.  That in turn means no GAS 
> > branch optimisation is enabled and consequently branch delay slots are not 
> > scheduled in `reorder' code by the assembler (the MIPS/GCC `asm' spec has 
> > this `%{noasmopt:-O0; O0|fno-delayed-branch:-O1; O*:-O2; :-O1}'), causing 
> > delay slots wasted with a NOP where a preceding instruction could be moved 
> > instead and save some code space.  It can be easily observed by comparing 
> > code in the compiler-generated MIPS16 wrappers vs the new standalone 
> > assembly regular MIPS (and microMIPS) wrappers.
> > 
> >  Was this a deliberate choice made sometime to have greater control over 
> > code produced or just an accidental oversight?
> 
> I am not sure if it was deliberate, but my guess it is not really an issue
> for most architectures since afaik any '-O' optimization flag along with
> .S files usually does not turn any extra flags (SUBTARGET_ASM_SPEC seems
> to be define only in a handful architectures on gcc).
> 
> For MIPS I think we can set ASFLAGS to O1 since it should enable the
> required optimization, unless there is an specific gas option to enable
> it (which I couldn't find). Another option would be to filter out CFLAGS
> and extract the optimization level used for ASFLAGS, but I think for this
> specific issue it should extra non required complexity.

 I wonder if just passing base CFLAGS unchanged would do?

 Anyone please correct me if I am wrong, but I believe that when building 
an assembly source the GCC driver is supposed to take all the usual 
options it accepts for C, and just silently discard those which have no 
effect at the assembly stage.  This is in principle so that you can do 
say:

$ gcc $CFLAGS -S foo.c
$ gcc $CFLAGS -c foo.s

and get exactly the same result (sans having the foo.s byproduct) as with:

$ gcc $CFLAGS -c foo.c

We already have to pass flags to GAS which affect multilib selection, such 
as `-EL' or `-mabi=', so why bother filtering?

> Another thing I noticed is another possible optimization for size would
> to use ".set push", ".set noreorder", and ".set pop" instead of current
> code as below.
> 
[...]
> 
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> index dadfa18..96867de 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> @@ -159,10 +159,11 @@ union __mips_syscall_return
>         register long __v0 asm ("$2");                                  \
>         register long __a3 asm ("$7");                                  \
>         __asm__ volatile (                                              \
> -       ".set\tnoreorder\n\t"                                           \
> +       ".set push\n\t"                                                 \
> +       ".set noreorder\n\t"                                            \
>         v0_init                                                         \
>         "syscall\n\t"                                                   \
> -       ".set reorder"                                                  \
> +       ".set pop"                                                      \
>         : "=r" (__v0), "=r" (__a3)                                      \
>         : input                                                         \
>         : __SYSCALL_CLOBBERS);                                          \

 [Etc...] If this makes any difference (and your size change observations 
do not merely come from updated $ASFLAGS), then this is very weird and I 
think it needs investigating.

 The thing is for backwards compatibility at the place where an inline asm 
is expanded GCC sets the `reorder' mode (unless it's been set already), as 
this is what was the GCC's mode of assembly generation before support for 
explicit relocations was added back in early-mid 2000s.  If switching back 
to the `noreorder' is required by GCC at the conclusion of the asm, it has 
to do the switch itself.

 So there should actually be no difference in machine code generated 
between current code and your patched version of sysdep.h.  If there is, 
then something hairy is going on.  I'll try experimenting with your patch 
and will check what's happening here.

  Maciej


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]