This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.


Re: PATCH: Fix ll/sc for mips (take 3)


On Tue, Feb 05, 2002 at 01:54:07PM -0800, H . J . Lu wrote:

>   __asm__ __volatile__
>     ("/* Inline compare & swap */\n"
>      "1:\n\t"
>      "ll        %1,%5\n\t"
>      "move      %0,$0\n\t"
>      "bne       %1,%3,2f\n\t"
>      "move      %0,%4\n\t"
>      "sc        %0,%2\n\t"
>      "beqz      %0,1b\n\t"
>      "2:\n\t"
>      "/* End compare & swap */"
>      : "=&r" (ret), "=&r" (temp), "=m" (*p)
>      : "r" (oldval), "r" (newval), "m" (*p)
>      : "memory");
> 
> The assembler will do
> 
> 0xd724 <__pthread_alt_lock+212>:        ll      v1,0(s1)
> 0xd728 <__pthread_alt_lock+216>:        move    a1,zero
> 0xd72c <__pthread_alt_lock+220>:        bne     v1,s0,0xd744 <__pthread_alt_lock+244>
> 0xd730 <__pthread_alt_lock+224>:        nop
> 0xd734 <__pthread_alt_lock+228>:        move    a1,v0
> 0xd738 <__pthread_alt_lock+232>:        sc      a1,0(s1)
> 0xd73c <__pthread_alt_lock+236>:        beqz    a1,0xd724 <__pthread_alt_lock+212>
> 0xd740 <__pthread_alt_lock+240>:        nop
> 
> There is an extra "nop" in the delay slot. I don't think gas is smart
> enough to fill the delay slot. I will put back those ".set noreorder".

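The solution is to place the second move instruction in front of the branch
instruction.  The assembler will then move it into the delay slot.  Be
aware that a delay-slot instruction executes whether or not the branch is
taken, so on a mismatch %0 ends up holding newval rather than 0: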

   __asm__ __volatile__
     ("/* Inline compare & swap */\n"
      "1:\n\t"
      "ll        %1,%5\n\t"
      "move      %0,$0\n\t"
      "move      %0,%4\n\t"
      "bne       %1,%3,2f\n\t"
      "sc        %0,%2\n\t"
      "beqz      %0,1b\n\t"
      "2:\n\t"
      "/* End compare & swap */"
      : "=&r" (ret), "=&r" (temp), "=m" (*p)
      : "r" (oldval), "r" (newval), "m" (*p)
      : "memory");
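
To see what the reordering does to the result register, here is a minimal
sketch in plain C that models both instruction orders.  It is only an
illustration, not code from LinuxThreads: the function names cas_original
and cas_reordered are made up for this example, it is single-threaded, and
sc is assumed to always succeed and leave 1 in %0.  The point it shows is
that an instruction in a MIPS branch delay slot runs on both the taken and
the not-taken path.

```c
/* Model of the original ordering: %0 is cleared before the branch and
   only loaded with newval on the path that reaches sc.
   Assumption: single-threaded, so sc always succeeds. */
static long cas_original(long *p, long oldval, long newval)
{
    long temp = *p;                 /* ll   %1,%5                     */
    long ret = 0;                   /* move %0,$0                     */
    if (temp != oldval)             /* bne  %1,%3,2f (delay slot: nop) */
        return ret;                 /* mismatch: result is 0          */
    ret = newval;                   /* move %0,%4                     */
    *p = ret;                       /* sc   %0,%2 stores %0 ...       */
    ret = 1;                        /* ... then writes 1 on success   */
    return ret;
}

/* Model of the reordered sequence after the assembler has put
   "move %0,%4" into the bne delay slot: it runs on both paths. */
static long cas_reordered(long *p, long oldval, long newval)
{
    long temp = *p;                 /* ll   %1,%5                     */
    long ret = 0;                   /* move %0,$0                     */
    int taken = (temp != oldval);   /* bne  %1,%3,2f                  */
    ret = newval;                   /* delay slot: move %0,%4         */
    if (taken)
        return ret;                 /* mismatch: result is newval     */
    *p = ret;                       /* sc   %0,%2 stores %0 ...       */
    ret = 1;                        /* ... then writes 1 on success   */
    return ret;
}
```

Starting from *p == 5, cas_original(p, 4, 9) returns 0 while
cas_reordered(p, 4, 9) returns 9, and neither one stores.  With oldval == 5
both store 9 and return 1.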

Also, this function looks like a good candidate for inlining (is it actually
inlined?  I haven't checked ...).  Depending on its use, the address of
*p is loaded twice from the GOT, so changing the code to:

   __asm__ __volatile__
     ("/* Inline compare & swap */\n"
      "1:\n\t"
      "ll        %1,(%2)\n\t"
      "move      %0,$0\n\t"
      "move      %0,%4\n\t"
      "bne       %1,%3,2f\n\t"
      "sc        %0,(%2)\n\t"
      "beqz      %0,1b\n\t"
      "2:\n\t"
      "/* End compare & swap */"
      : "=&r" (ret), "=&r" (temp)
      /* p is only read by the asm, so it is passed once, as an input */
      : "r" (p), "r" (oldval), "r" (newval)
      : "memory");

will avoid paying that PIC overhead twice and gets you around gas's
inefficiency of inserting too many nops in PIC code.

  Ralf

