Help Translating Message from MIPS Simulator

Joel Sherrill joel.sherrill@oarcorp.com
Mon Apr 17 14:26:00 GMT 2017



On 4/16/2017 9:26 AM, Joel Sherrill wrote:
>
>
> On 4/15/2017 6:44 PM, Maciej W. Rozycki wrote:
>> On Sun, 16 Apr 2017, Joel Sherrill wrote:
>>
>>> We have some RTEMS tests failing on the mips simulator
>>> in gdb. We are using the jmr3904 configuration and the
>>> run ends with this message:
>>>
>>> HILO: MFHI: MF at 0x88015698 following OP at 0x88000464 corrupted by MT at
>>> 0x88003c68
>>>
>>> 0x88000464 does appear to be in a reasonable location
>>> inside the test.
>>>
>>> How do I translate the rest to get an idea about the fault?
>>
>>  I'll give some background information.
>>
>>  In MIPS architecture HILO or HI/LO is the integer multiply-divide (MD)
>> unit's accumulator aka the HI and LO register pair.  For widening
>> multiplication LO holds the low part of the product and HI holds the
>> corresponding high part.  For division LO holds the quotient and HI holds
>> the remainder.
>>
>>  These registers are not a part of the general ALU and therefore special
>> operations have been defined to retrieve and also to store data there:
>> MFHI and MFLO (Move From HI/LO) are the read instructions and MTHI and
>> MTLO (Move To HI/LO) are the write instructions for the HI and the LO
>> register respectively.
>>
>>  In older architecture revisions the MTHI and MTLO instructions are only
>> really useful for context switches as data placed there cannot be further
>> used, except to read it back with MFHI or MFLO.  Consequently MTHI and
>> MTLO are seldom used and those architecture revisions do not have hardware
>> interlocks implemented for them, requiring a sufficient number of other
>> instructions to be executed between a MFHI or MFLO and a following MTHI or
>> MTLO for predictable results to be produced.  Otherwise the MTHI/MTLO
>> operation may (and generally will) corrupt data retrieved with MFHI/MFLO.
>
> This is an old MIPS being simulated. It is the TX3904.
> So all that applies.
>
>
>>  NB later architecture revisions have integer multiply-accumulate
>> instructions which use HI/LO as one of their inputs, making MTHI and MTLO more
>> useful, and they do implement hardware interlocks for them.
>>
>>  So the message above means that the result of a MFHI or MFLO operation
>> (MF) at 0x88015698 executed after an MD unit operation (OP) at 0x88000464
>> has been corrupted by a MTHI or MTLO operation (MT) at 0x88003c68.  And I
>> believe that what I wrote above makes the HILO and MFHI prefixes obvious.
>
> Thanks for the explanation.
>
> 0x88000464 is in the test code.
>
> 0x88003c68 is in the device driver. It is executing in the same thread
> as the test code. This is a single threaded test with no context switches
> before the failure.
>
>     0x88003c54 <+88>:    beq     a2,a3,0x88003c7c <i2c_bus_transfer+128>
>     0x88003c58 <+92>:    addiu   v1,v1,12
>     0x88003c5c <+96>:    lhu     v0,2(v1)
>     0x88003c60 <+100>:   andi    t0,v0,0x4000
>     0x88003c64 <+104>:   bnez    t0,0x88003c2c <i2c_bus_transfer+48>
>     0x88003c68 <+108>:   mtlo    t3
>     0x88003c6c <+112>:   move    t1,a3
>
> The code for i2c_bus_transfer is in pure C and it looks like gcc generated
> that mtlo. I don't see a mthi or any mfhi/lo instructions in the method.
>
> 0x88015698 is the mfhi instruction in the outer level of the RTEMS
> interrupt processing code. This is the source:
>
>          mflo  t0
>          STREG t8, R_T8*R_SZ(sp)
>          STREG t0, R_MDLO*R_SZ(sp)
>          STREG t9, R_T9*R_SZ(sp)
>          mfhi  t0                 <===================
>          STREG gp, R_GP*R_SZ(sp)
>          STREG t0, R_MDHI*R_SZ(sp)
>          STREG fp, R_FP*R_SZ(sp)
>
>
> The first two are in C. The last is obviously in assembly.

Thinking about this more, I think this is a false positive. The
flagged code is saving hi/lo at the beginning of an interrupt
and will restore it at the end of the interrupt. This is
perfectly safe and proper.
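
To keep the three addresses straight, the simulator message can be
decoded mechanically. Below is a small Python sketch that follows
Maciej's reading of the fields (MF = the mfhi/mflo, OP = the MD unit
operation, MT = the mthi/mtlo). The function name and the regular
expression are my own assumptions, modeled only on the one message
quoted in this thread; the simulator may print other variants that
this does not cover.

```python
import re

# Pattern modeled on the single message quoted in this thread
# (an assumption, not taken from the simulator source).
HILO_RE = re.compile(
    r"HILO:\s*(?P<insn>\w+):\s*MF at (?P<mf>0x[0-9a-fA-F]+)\s+"
    r"following OP at (?P<op>0x[0-9a-fA-F]+)\s+"
    r"corrupted by MT at\s+(?P<mt>0x[0-9a-fA-F]+)"
)


def parse_hilo_message(text):
    """Return the instruction name and the MF/OP/MT addresses, or None."""
    match = HILO_RE.search(text)
    if match is None:
        return None
    return {
        "insn": match.group("insn"),     # e.g. "MFHI"
        "mf": int(match.group("mf"), 16),  # the read that was flagged
        "op": int(match.group("op"), 16),  # the multiply/divide before it
        "mt": int(match.group("mt"), 16),  # the write blamed for the damage
    }


if __name__ == "__main__":
    # The message as it appeared, including the line wrap before the
    # last address.
    msg = ("HILO: MFHI: MF at 0x88015698 following OP at 0x88000464 "
           "corrupted by MT at\n0x88003c68")
    for name, value in parse_hilo_message(msg).items():
        print(name, hex(value) if isinstance(value, int) else value)
```

The three addresses can then be handed to gdb or addr2line one at a
time to see which function each one lands in.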

I think the check is great for single-threaded (and likely
multi-threaded) code with no interrupts. But it doesn't know
that this is actually saving the CPU context and will later
restore it.
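
To illustrate why an order-only check cannot tell a context save from
a real use of the data, here is a toy model in Python. Everything in
it is an assumption made for illustration: the rule (flag a read of
HI or LO when the other register of the pair has been written by an
MT since the last MD unit operation) is a guess at the kind of check
involved, not the simulator's actual logic, and the event order in the
example trace (test code, then driver, then interrupt entry) is
likewise assumed.

```python
def first_flagged_read(trace):
    """Return (mf_addr, op_addr, mt_addr) for the first flagged MF, else None.

    trace is a list of (kind, reg, addr) tuples where kind is "OP", "MT"
    or "MF" and reg is "HI", "LO" or None (for OP).

    Toy rule (hypothetical): after an OP, an MT to one register of the
    pair makes a later MF of the *other* register suspect.
    """
    last_op = None      # address of the most recent multiply/divide
    mt_since_op = {}    # register -> address of an MT seen since that OP
    for kind, reg, addr in trace:
        if kind == "OP":
            last_op = addr
            mt_since_op = {}
        elif kind == "MT":
            mt_since_op[reg] = addr
        elif kind == "MF":
            peer = "LO" if reg == "HI" else "HI"
            if (last_op is not None
                    and peer in mt_since_op
                    and reg not in mt_since_op):
                return (addr, last_op, mt_since_op[peer])
    return None


if __name__ == "__main__":
    # Addresses from the message; the ordering is an assumption.
    trace = [
        ("OP", None, 0x88000464),   # MD unit operation in the test
        ("MT", "LO", 0x88003c68),   # mtlo in i2c_bus_transfer
        ("MF", "HI", 0x88015698),   # mfhi in the interrupt entry code
    ]
    hit = first_flagged_read(trace)
    print([hex(a) for a in hit] if hit else "nothing flagged")
```

Nothing in the trace says what the mfhi result is used for, so a rule
of this shape flags the interrupt entry code exactly as it would flag
code that went on to compute with the value.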

What do you think?

> Based on your description, I think gcc is using this as a scratch register
> and shouldn't. In case we are using the wrong compiler options, this is
> what we use:
>
> -march=r3900 -Wa,-xgot -G0
>
> I will file a gcc ticket and cc you on it if that looks like the explanation.
>
> Thanks.

Thanks again.

--joel
  
>>  HTH,
>>
>>   Maciej
>>
>
> --joel
>


