[MI] core awareness

Mon Dec 21 14:48:00 GMT 2009

On Thursday 17 December 2009 18:33:21 Pedro Alves wrote:

> (resending in us-ascii)
> 
> On Wednesday 16 December 2009 20:52:38, Vladimir Prus wrote:
> > The attached patch implements the core awareness, as discussed at:
> > 
> >         http://thread.gmane.org/gmane.comp.gdb.devel/27468/
> > 
> > The most current spec can be found at:
> > 
> >         http://article.gmane.org/gmane.comp.gdb.devel/27548
> > 
> > To summarize, with this patch, when debugging against gdbserver:
> > 
> > * The -list-thread-groups command will report the cores that each
> > process runs on (with or without --available)
> > * It's possible to pass several thread groups to -list-thread-groups
> > * The *stopped output includes core number
> > 
> > For native debugging on Linux, roughly the same is also possible.
> > 
> 
> > This is mostly MI change, but includes some remote protocol tweaks,
> > and documentation. I'd appreciate review of those parts.
> 
> More like 80% not-MI changes.  :-)

Uhm, well, it accidentally happened.

> > +@item qXfer:threads:read::@var{offset},@var{length}
> > +@anchor{qXfer threads read}
> > +Access the list of threads on target.  @xref{Thread List Format}.  The
> > +annex part of the generic @samp{qXfer} packet must be empty
> > +(@pxref{qXfer read}).
> > +
> > +This packet is not probed by default; the remote stub must request it,
> 
> What's the advantage of this, rather than probing it?  I would
> see it nice for GDB to know in advance that the target can
> report core data, but that's not that reporting support
> for this packet means, since the "core" field is optional.

Every other qXfer packet works this way. And it does not seem like much burden
on the stub to announce that this packet is available.

> > diff --git a/gdb/features/osdata.dtd b/gdb/features/osdata.dtd
> > index a04506b..67de837 100644
> > --- a/gdb/features/osdata.dtd
> > +++ b/gdb/features/osdata.dtd
> > @@ -10,7 +10,10 @@
> >  <!ATTLIST osdata version CDATA #FIXED "1.0">
> >  <!ATTLIST osdata type CDATA #REQUIRED>
> >  
> > -<!ELEMENT item (column*)>
> > +<!ELEMENT item (column*, threads?)>
> >  
> >  <!ELEMENT column (#PCDATA)>
> >  <!ATTLIST column name CDATA #REQUIRED>
> > +
> > +<!ELEMENT threads (item*)>
> > +
> 
> This dtd change isn't right.  "osdata" isn't specific to
> processes, it is just a generic table.  A "threads" element
> doesn't fit in, it wouldn't make sense for anything other
> than listing processes.  Two alternative solutions:
> 
> - Allow multiple sub-table elements in the osdata schema, and
>   allow a "name" attribute on this element.  In the listing
>   processes case, you'd have a sub-table whose type/name
>   attribute would be "threads".  You should change
>   info_osdata_command to descend into subtables somehow.
> 
> - Don't nest "threads" within processes osdata.  Treat it as
>   a different request / table (e.g., <osdata type="threads">).
>   That is, on the CLI, "info os processes" works as is, unmodified, and,
>   you'd add support for "info os threads".  No schema changes
>   are necessary with this option.  At the MI implementation level, you'd
>   do two requests to get the same data: get_osdata("processes"), and
>   get_osdata("threads").

I have implemented the second approach.

> > +             if (core != -1)
> > +               {
> > +                 char s[11];
> > +                 sprintf (s, "%u", core);
> > +
> > +                 if (count == allocated) 
> > +                   core_numbers = realloc (core_numbers, 
> > +                                           sizeof (int ) * (allocated *= 2));
> 
> No space after int.  Please move the 'allocated *= 2' in its own
> statement.  My GM hat tells me that I should point out that there are a
> bunch of hardcoded buffer sizes in the patch, that the GNU conventions
> tells us we should avoid.

I have replaced '11' with  sizeof ("4294967295") + 1.

> > +       char number[11];
> > +       sprintf (number, "%u", *b);
> > +       buffer_xml_printf (&buffer2, "%s%s", 
> > +                          (b == core_numbers) ? "" : ",", number);
> > +      }
> > +      buffer_grow_str0 (&buffer2, "");
> 
> 
> There's a bug here somewhere in the duplicates detection that
> we talked about off-list, that I know you've fixed
> already.  :-)

Yeah, in attempt to reimplement std::unique in C I've ended up implementing
something else. Of course, if gdb were just *compiled* by a C++ compiler,
I would be able to call std::unique, and that would just work. 

> > @@ -2857,11 +2968,29 @@ linux_qxfer_osdata (const char *annex,
> >                            "<item>"
> >                            "<column name=\"pid\">%s</column>"
> >                            "<column name=\"user\">%s</column>"
> > -                          "<column name=\"command\">%s</column>"
> > -                          "</item>",
> > +                          "<column name=\"command\">%s</column>",
> >                            dp->d_name,
> >                            entry ? entry->pw_name : "?",
> >                            cmd);
> > +
> > +                        buffer_init (&thread_buffer);
> > +                        list_threads (pid, &thread_buffer, &cores);
> > +
> > +                        if (cores)
> > +                          {
> > +                            buffer_xml_printf (
> > +                                &buffer, 
> > +                                "<column name=\"cores\">%s</column>", cores);
> > +                            free (cores);
> > +                          }
> > +                       
> > +                        buffer_grow (&buffer, 
> > +                                     thread_buffer.buffer, 
> > +                                     thread_buffer.used_size);
> > +                        free (buffer_finish (&thread_buffer));
> > +
> > +                        
> > +                        buffer_xml_printf (&buffer, "</item>");
> 
> 
> Any chance linux-nat.c also gets support for this (however it
> ends up looking like)?

It is surely possible, but the original motivation here is mostly embedded
systems, so I'd prefer to have gdbserver-only version checked in, and then
anybody interested in native linux can fill the bits for linux.

> > +  char filename[sizeof ("/proc//task//stat") 
> 
> s,//,/

I don't think so, actually. The final path will look like

	/proc/NNN/task/MMM/stat

so it will indeed have two slashes between 'proc' and 'task'

> > +       case RECURSE_OPT:
> > +         if (strcmp (optarg, "1") != 0)
> > +           error ("only '1' is permitted value for the '--recurse' option");
> 
> Seems strange not to allow '--recurse 0'.  

I've made it allowed.

> What future values are you thinking of?  

'2', 'inf', for example.

> Wouldn't it make more sense to call this
> "--threads"?  We may add other level-1 sublists to thread-groups
> in the future, and _not_ want --recurse 1 to always show
> them, no?

I would say we rather want them. If frontend wishes to grab entire
thread group hierarchy, he should be able to pass --recurse 1 or
in future '--recurse inf' to get absolutely everything.

> > +      VEC_safe_push (int, ids, inf);
> >      }
> > -  else if (available)
> > +  qsort (VEC_address (int, ids), 
> > +        VEC_length (int, ids),
> > +        sizeof (int), compare_positive_ints);
> > +
> > +  back_to = make_cleanup ((void (*)(void *))VEC_OP (int, free), &ids);
> 
> Please don't cast the function's type: instead write a
> small function that has the proper cleanup prototype, and
> call free there.

Is casting function type going to cause practical problems on some targets?

> > +      /* Update the core associated with a thread when we process stop
> > +        event in that thread.  */
> > +      info = find_thread_ptid (ptid);
> > +      if (info && info->private)
> > +       info->private->core = stop_reply->core;
> 
> Consider all-stop, multi-threaded applications:
> 
>  What about the core of the other threads?
>  Shouldn't they all be invalidated?
>  Won't then be stale?
> 
> (see comment below)

Yes, it will be stale. However, it's approximation only, and the only two
places where cores number is used is *stopped response (which uses core of
the stopped thread, updated above) and -list-thread-groups (which updates
this information anyway). It does not seem right to additionally update
this even if nothing will use it.

> > +    /* Return the core that thread PTID is on, or -1 if such information
> > +       is not available. For a stopped thread, this is supposed to return
> > +       the core the thread was last running on.  For running threads, it
> > +       should return one of the cores that the thread was running between
> > +       the call to this function and return -- and if it was running on
> > +       several cores, any other may be returned.  */
> > +    int (*to_core_of_thread) (struct target_ops *, ptid_t ptid);
> 
> I guess that on some targets, it will be impossible to know which core
> a running thread was running on.  Should it be simply documented as
> undefined instead?

I am not sure I understand. On such target, this method will return -1. 

> Other:
> 
>  - did you consider being able to list the available
>    cores?  A candidate for "info os cores", I guess.

Unfortunately, I do not know a suitable interface to get that information.
Parsing /proc/cpuinfo is not really attractive.

I attach a revision with yours and Eli's comments addressed, expect as
mentioned above, and a delta.

- Volodya
-------------- next part --------------
A non-text attachment was scrubbed...
Name: core2.diff
Type: text/x-patch
Size: 60105 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/gdb-patches/attachments/20091221/3f54a855/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: delta.diff
Type: text/x-patch
Size: 39728 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/gdb-patches/attachments/20091221/3f54a855/attachment-0001.bin>