[ECOS] Debugging pc platform

Thu Mar 2 12:18:00 GMT 2006

On Wed, 2006-03-01 at 18:20 +0100, Andrew Lunn wrote:
> On Wed, Mar 01, 2006 at 04:06:11PM +0000, David Fernandez wrote:
> > On Tue, 2006-02-28 at 18:14 +0100, Andrew Lunn wrote:
> > > On Tue, Feb 28, 2006 at 05:08:32PM +0000, David Fernandez wrote:
> > > > 
> > > > 	Hi there,
> > > > 
> > > > 	I'm trying to run eCos on a couple of PC platforms... At the moment,
> > > > I've configured a pc_i82544 + redboot + FLOPPY_SMP, and am running it on
> > > > a 2 Xeon board and a 2 Pentium-S board, both SMP 1.4 and no ACPI.
> > > > 
> > > > 	I've found problems with the 2 Pentium-S board: unless I disable SMP
> > > > support in eCos HAL \ i386 Architecture, it hangs just after loading
> > > > from floppy.
> > > > 
> > > > 	I've enabled the SHOW_DIAGNOSTICS in pcmb_smp.c, and redirected Debug
> > > > and Diagnostics to Port 2 (PC Screen). It prints a lot of messages and
> > > > then stops with something similar to a GDB stream... None of the above
> > > > messages seem to indicate an error, but I can't see them all.
> > > 
> > > Well the output to gdb will be interesting. Set the diagnostics to go
> > > out the serial port and run gdb on the other end.
> > > 
> > > Something else to try is to enable INFRA_DEBUG and see if any asserts
> > > fail.
> > > 
> > >         Andrew
> > 
> > 	I've configured a minimal debugging environment at the moment.
> > 	
> > 	The debugger says that a SIGTRAP is the cause of the program stopping,
> > the address is 0x665eabeb, and the stack frame shows a function with no
> > arguments.
> 
> That address does not look very likely. I would guess the processor
> has jumped to a random address, possibly because of stack
> corruption/overflow, or a buffer overflow problem.
> 
> > 	I don't know how to configure the debugger to get more information,
> > I've never debugged hal-like things in Linux. Do you know the core file
> > that I should provide the debugger with, and what additional things need
> > to be done?
> 
> You don't need a core file, just the ELF of the image you are
> running. But since I think this address is outside of the image, I
> doubt it will be of much use.
> 
> > 
> > 	I've activated "Asserts and Traces" in "Infrastructure", but nothing
> > new is printed, only the SMP information, which looks as if everything
> > is OK there so far, but the SIGTRAP keeps appearing.
> > 
> > 	Are there more things to turn debugging information on?
> 
> Try fully enabling stack checking.
> 
> Is it SMP which is causing the problem? Maybe try running SMP but hack
> Cyg_Scheduler::start() so that it does not start the other CPUs. That
> might tell you more... It might also be interesting to find out which
> CPU has the problem. 
> 
>         Andrew

This is the function that RedBoot uses to do the same job as
Cyg_Scheduler::start(), with some debugging output of my own added...

__externC void cyg_hal_smp_cpu_start_all(void)
{
    HAL_SMP_CPU_TYPE cpu;

    diag_printf( "dfernandez - CPU Count   : %02d\n", HAL_SMP_CPU_COUNT() );
    diag_printf( "dfernandez - CPU Current : %02d\n", HAL_SMP_CPU_THIS()  );

    for( cpu = 0; cpu < HAL_SMP_CPU_COUNT(); cpu++ )
    {
        cyg_hal_smp_cpu_sync[cpu] = 0;
        cyg_hal_smp_cpu_sync_flag[cpu] = 0;
        cyg_hal_smp_cpu_running[cpu] = 0;
        cyg_hal_smp_cpu_entry[cpu] = 0;

        if( cpu != HAL_SMP_CPU_THIS() )
        {
            //cyg_hal_cpu_start( cpu );
            diag_printf( "dfernandez - Leaving out CPU %02d\n", cpu );
        }
        else
        {
            diag_printf( "dfernandez - Confirm OUR CPU %02d\n", cpu );
            cyg_hal_smp_cpu_running[cpu] = 1;
        }
    }
}

It turns out that HAL_SMP_CPU_THIS() gives the CPU IDENTIFICATION !,
and NOT the CPU ordinal number, which is what the loop seems to expect
and what HAL_SMP_CPU_COUNT() counts. I'm trying to locate some mechanism
that converts one into the other... If you know of one, let me know... :-)

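I have TWO (2) CPUs, identified as 00 and 03. The loop only counts 0 and
1, so on CPU 03 the comparison with HAL_SMP_CPU_THIS() never matches, and
03 never gets excluded from being started again !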
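
To make the mismatch concrete, here is a minimal stand-alone sketch (plain
C, not eCos code). Everything prefixed sketch_ is hypothetical and only for
illustration: the table of IDs is filled in by hand with the two values
seen on this board, and it is an assumption that the HAL could build such a
table while it discovers the CPUs. It shows how many CPUs the loop above
would try to start when it compares ordinals against a raw ID, and how many
once the ID is translated to an ordinal first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: CPU identifiers in discovery order.
 * The values 0 and 3 are the ones reported on the board above. */
static const unsigned sketch_cpu_ids[] = { 0u, 3u };
static const size_t sketch_cpu_count =
    sizeof sketch_cpu_ids / sizeof sketch_cpu_ids[0];

/* Translate a CPU identifier into its ordinal (0 .. count-1).
 * Returns -1 if the identifier is not in the table. */
static int sketch_cpu_id_to_index(unsigned id)
{
    for (size_t i = 0; i < sketch_cpu_count; i++) {
        if (sketch_cpu_ids[i] == id)
            return (int)i;
    }
    return -1;
}

/* Mimic the start-all loop: count the CPUs it would start when
 * "self" is the value compared against the loop ordinal. */
static int sketch_cpus_started(int self)
{
    int started = 0;

    assert(sketch_cpu_count > 0);
    for (int cpu = 0; cpu < (int)sketch_cpu_count; cpu++) {
        if (cpu != self)
            started++;   /* this is where cyg_hal_cpu_start(cpu) would go */
    }
    return started;
}
```

With the raw identifier 3 passed in as "self", both ordinals 0 and 1 are
started, so the running CPU is started again. With
sketch_cpu_id_to_index(3), which is 1, only ordinal 0 is started.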

So this looks like a bug, doesn't it?

David.
