This is the mail archive of the gdb@sourceware.cygnus.com mailing list for the GDB project.

Re: Standard GDB Remote Protocol

To: jtc at redback dot com
Subject: Re: Standard GDB Remote Protocol
From: Steven Johnson <sbjohnson at ozemail dot com dot au>
Date: Thu, 02 Dec 1999 14:23:54 +1000
CC: gdb at sourceware dot cygnus dot com
References: <199911090706.CAA13120@zwingli.cygnus.com> <199911102246.RAA01846@mescaline.gnu.org> <npr9hi321d.fsf@zwingli.cygnus.com> <199911231303.IAA01523@mescaline.gnu.org> <npr9hg2a9t.fsf@zwingli.cygnus.com> <199911251715.MAA09225@mescaline.gnu.org> <npzovvc04o.fsf@zwingli.cygnus.com> <199912010821.DAA27130@mescaline.gnu.org> <npogca9tb8.fsf@zwingli.cygnus.com> <3845AB0E.3795D99E@ozemail.com.au> <5md7sql00o.fsf@jtc.redbacknetworks.com>

"J.T. Conklin" wrote:
> 
> Since you're putting up your hand, would you be willing to review the
> protocol spec and point out areas that are ambiguous, confusing, need
> revising, etc?  
> 

Following is a hopefully constructive critique of the GDB Remote
Protocol.

It is based on my first read of the current online version of the
protocol specification at:
http://sourceware.cygnus.com/gdb/onlinedocs/gdb_14.html

In my critique, I am not posing real questions when I discuss subjects.
What I am doing is highlighting areas where I have questions in my own
mind, where I find the description of the protocol lacking. The answers
will probably be present in the current implemented code and stubs, and
I have not yet looked for those answers. Nor do I wish to, until my
initial analysis of the protocol is complete. I do not wish to taint my
understanding of the written words of the protocol with Black Knowledge
gleaned from the source. Further, my critique is not a criticism of the
hard work that people have already done to get the
documentation/GDB/protocol to this state. It is also obvious from my
first read that the protocol has undergone extensive evolution, and I
have taken this into consideration.

Any comments I make on ways to fix things are simply my attempt at
understanding the problem. They do not represent a request or proposal
to change anything in the protocol; they are presented as part of the
thought process I underwent when analysing the protocol. They also
indicate areas where I have concerns with my understanding of the
protocol as documented.

Packet Structure:

Simple structure, obviously originally designed to be able to be driven
manually from a TTY. (Hence its ASCII nature.) However, the protocol
has evolved quite significantly and I doubt it could still be used very
efficiently from a TTY. That said, it still delimits frames effectively.
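
As a rough illustration of that framing, here is a minimal Python sketch.
It assumes the usual layout of `$`, packet data, `#`, and a two-hex-digit
checksum that is the modulo-256 sum of the packet data bytes. The function
names `checksum`, `frame` and `unframe` are made up for this example and
are not taken from GDB or any stub.

```python
from typing import Optional


def checksum(payload: bytes) -> int:
    """Modulo-256 sum of the packet data bytes."""
    return sum(payload) % 256


def frame(payload: bytes) -> bytes:
    """Wrap packet data as $<data>#<two hex digit checksum>."""
    return b"$" + payload + b"#" + b"%02x" % checksum(payload)


def unframe(packet: bytes) -> Optional[bytes]:
    """Return the packet data if the frame is well formed, else None."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        return None
    payload = packet[1:-3]
    try:
        received = int(packet[-2:], 16)
    except ValueError:
        return None
    return payload if received == checksum(payload) else None
```

For example, `frame(b"g")` gives `b"$g#67"`, which is short enough to type
by hand on a terminal, in keeping with the TTY origins noted above.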

Sequence Numbers:

The definition of sequence-ids needs work. Are they necessary? Are they
deprecated? What purpose do they currently serve within GDB? One would
imagine that they are used to allow GDB to handle retransmits from a
remote system. Reading between the lines, this is done to allow error
recovery when a transmission from target to host fails. A possible
sequence being:

<- $packet-data#checksum
-> +
-> $sequence-id:packet-data#checksum
      (checksum fails or receive timeout halfway through packet)
<- -sequence-id
-> $sequence-id:packet-data#checksum
<- +sequence-id

When do the sequence-ids increment? Presumably on the successful
receipt of the +sequence-id acknowledgement.

If they increment on the successful acknowledgement, what happens if the
acknowledgement is in error? For example a framing error on the '+'. The
target would never see the successful acknowledgement and would not
increment its sequence number.

So what if it doesn't? The +/- Ack/Nak mechanism should be amply
sufficient to allow retransmits of missed responses. 

I can see little practical benefit in a sequence-id in the responses, as
it is currently documented. This is supported by the comment within the
document: "Beyond that its meaning is poorly defined. GDB is not known
to output sequence-ids". This tends to indicate that the mechanism has
fallen out of use, probably because it doesn't actually achieve
anything. If this is the case, it could be deprecated. However, I would
advocate not deprecating it from the protocol, because if sequence-ids
were sent by GDB, a hole that I believe currently exists in the protocol
could be plugged. (I will discuss this hole later in this critique.)

Ack/Nak Mechanism:

Simple Ack/Nak Mechanism, using + and - Respectively. Also reflects the
simple ASCII basis of the protocol. My main concern with this system is
there is no documentation of timing. Usually Ack/Nak must be received
within a certain time frame, otherwise a Nak is assumed and a retransmit
proceeds. This is necessary, because it is possible for the Ack/Nak
character to be lost (however unlikely) on the line due to a data error.
I think there should be a general timing basis to the entire protocol to
tie up some potential communications/implementation problems.

The 2 primary timing constraints I see that are missing are:

Inter character times during a message transmission, and Ack/Nak
response times.

If a message is only half received, the receiver cannot, without a
timeout mechanism, generate a NAK to signal the failed receipt. If this
occurs, and there is no timeout on ACK/NAK reception, the entire comms
stream could hang: the transmitter is hung waiting for an ACK/NAK and
the receiver is hung waiting for the rest of the message.

I would propose that something needs to be defined along the lines of:

Once the $ character for the start of a packet is transmitted, each
subsequent byte must be received within "n" byte transmission times.
(This would allow for varying comms line speeds.) Or alternatively a
global timeout on the whole message could be defined: once "$" (the
start sentinel) is sent, the complete message must be received within
"X" time. I personally favour the inter character time as opposed to
the complete message time, as it will work with any size of message,
whereas the complete message time restricts the maximum size of any one
message (to how many bytes can be sent at the maximum rate for the
period). These timeouts do not need to be very tight, as they are merely
for complete failure recovery and a little delay there does not hurt
much.

One possible timeout that would be easy to work with could be: Timeout
occurs 1 second after the last received byte.
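
A minimal Python sketch of such an inter character timeout follows. It is
only an illustration of the idea: the `Receiver` class, its `feed` and
`poll` methods, and the 1 second default are assumptions made for this
example, not anything the specification defines. Timestamps are passed in
explicitly so the logic can be exercised without a real serial line.

```python
INTER_CHAR_TIMEOUT = 1.0  # seconds; the figure floated above, not a spec value


class Receiver:
    """Collects one $<data>#xx frame from timestamped bytes (hypothetical)."""

    def __init__(self, timeout=INTER_CHAR_TIMEOUT):
        self.timeout = timeout
        self.buf = bytearray()
        self.last_rx = None  # time of the most recent byte of a partial frame

    def poll(self, now):
        """Call periodically; returns "-" when a partial frame has gone stale."""
        if self.buf and now - self.last_rx > self.timeout:
            self.buf.clear()  # abandon the half-received packet
            return "-"        # caller should transmit a NAK
        return None

    def feed(self, byte, now):
        """Feed one received byte; returns the complete frame or None."""
        if not self.buf and byte != ord("$"):
            return None  # ignore anything outside a frame
        self.buf.append(byte)
        self.last_rx = now
        # A frame is complete once '#' and two checksum digits have arrived.
        if len(self.buf) >= 4 and self.buf[-3] == ord("#"):
            frame = bytes(self.buf)
            self.buf.clear()
            return frame
        return None
```

With this shape, a packet that stops arriving halfway produces a NAK one
timeout period after its last byte, regardless of how long the packet was
meant to be.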

For ACK/NAK I propose that something needs to be defined along the
lines: ACK/NAK must be received within X Seconds from transmission of
the end of the message, otherwise a NAK must be assumed.
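
The sender side of that rule could look roughly like the Python sketch
below. Everything here is an assumption for illustration: the function
name `send_with_ack`, the callback signatures, the 2 second default and
the retry limit are invented, and the specification says nothing about
them. The sketch deliberately raises an error on any byte other than
`+` or `-`, since what should happen in that case is an open question.

```python
def send_with_ack(packet, send, wait_for_byte, ack_timeout=2.0, max_tries=3):
    """Transmit packet until it is ACKed; return the number of attempts.

    send(packet) transmits the frame (hypothetical callback).
    wait_for_byte(timeout) returns one received byte as bytes, or None
    if nothing arrived within timeout seconds (hypothetical callback).
    """
    for attempt in range(1, max_tries + 1):
        send(packet)
        reply = wait_for_byte(ack_timeout)
        if reply == b"+":
            return attempt
        if reply is None or reply == b"-":
            continue  # silence is assumed to be a NAK, so retransmit
        raise ValueError("unexpected byte %r while waiting for ACK/NAK" % reply)
    raise TimeoutError("no ACK after %d attempts" % max_tries)
```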

There is no documentation of the recovery procedure. Does GDB retransmit
if its message is responded to with a NAK? If not, what does it do? How
is the target supposed to identify and handle retransmits from GDB?

What happens if something other than + or - is received when ACK/NAK is
expected? (For example $.)

Identified Protocol Hole:

Let's look at the following abstract scenario (text in brackets is
supporting commentary):

<- $packet-data#checksum
      (Run Target Command)
-> +
      (Response is lost due to a line error)
(Target runs for a very short period of time and then breaks.)
-> $sequence-id:packet-data#checksum
      (Break Response - GDB takes as a NAK, expecting a +, got a $)
<- $packet-data#checksum
      (GDB retransmits its Run Target Command, target restarts)
-> +
      (Response received OK by GDB)
(Target again starts running.)

This scenario shows that, with the currently documented mechanisms, it
is possible for transmission errors to occur that interfere with
debugging. There was no mechanism for the target to identify that GDB
was retransmitting, so it executed the same operation twice when GDB
really only wanted to execute the command once.
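
The scenario can be replayed in a few lines of Python. This is a toy
model only: `NaiveTarget` and `replay_lost_ack` are names invented for
the example, the `c` (continue) packet stands in for the "Run Target
Command", and `S05` stands in for the break response.

```python
class NaiveTarget:
    """Toy target with no way to recognise a retransmitted command."""

    def __init__(self):
        self.run_count = 0

    def handle(self, payload: bytes) -> bytes:
        if payload == b"c":       # continue: stands in for "Run Target"
            self.run_count += 1   # target runs, then breaks almost at once
            return b"S05"         # break response
        return b""                # anything else: not handled in this toy


def replay_lost_ack(target) -> int:
    """Replay the lost-ACK scenario; return how often the target ran."""
    command = b"c"
    target.handle(command)        # target ACKs (lost on the line) and runs
    first_byte_seen = b"$"        # GDB expected '+', sees the break response
    if first_byte_seen != b"+":   # GDB takes this as a NAK...
        target.handle(command)    # ...and retransmits: the target restarts
    return target.run_count
```

Here `replay_lost_ack(NaiveTarget())` returns 2: one command from the
user's point of view, two executions on the target.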

It's this sort of scenario that I imagine the sequence-ids were created
for.

If GDB sent sequence-ids then the scenario would be much different:

<- $sequence-id:packet-data#checksum
      (Run Target Command)
-> +
      (Response is lost due to a line error)
(Target runs for a very short period of time and then breaks.)
-> $sequence-id:packet-data#checksum
      (Break Response - GDB takes as a NAK, expecting a +, got a $)
<- $sequence-id:packet-data#checksum
      (GDB retransmits its Run Target Command, with the same
      sequence-id as in the original command)
(Target identifies the sequence-id as a retransmit.)
(Instead of performing the operation again, it simply re-responds with
the results obtained from the last command.)
-> +
      (Response received OK by GDB)
-> $sequence-id:packet-data#checksum
      (Break Response - GDB processes as expected)
(GDB then increments its sequence-id in preparation for the next
command.)

As an extra integrity check, the response sequence-id should be
identical to the request sequence-id. This would allow GDB to verify
that the response it is processing is properly paired with its request.
Further, the target shouldn't require either ACK or NAK. It should
process them properly if received, but otherwise process the received
packet, even if ACK/NAK was expected.

If this is the intent of sequence-id and it has fallen into disuse, then
to allow its re-introduction at a later date, it could be documented
that if GDB sends a sequence-id, then the retransmit processing I've
documented here operates, otherwise the currently defined behaviour
operates, and that a sequence-id is only sent by the target in responses
where one is present in the original GDB message. This would allow GDB
to probe whether the target supports secure and recoverable message
delivery or not.
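
A target-side sketch of the retransmit handling described above, in
Python. It is hedged in several ways: `DedupTarget` is an invented name,
the sequence-id is assumed to be two hex characters followed by `:`
(the specification as I read it does not pin this down), and `c`/`S05`
again stand in for the run command and break response. When no
sequence-id is present the sketch falls back to the existing behaviour.

```python
import re

# Assumption: a sequence-id is two hex characters followed by ':'.
SEQ_PREFIX = re.compile(rb"^([0-9a-fA-F]{2}):(.*)$", re.DOTALL)


class DedupTarget:
    """Toy target that recognises a retransmit by its sequence-id."""

    def __init__(self):
        self.run_count = 0
        self.last_seq = None
        self.last_reply = None

    def _execute(self, payload: bytes) -> bytes:
        if payload == b"c":       # continue: stands in for "Run Target"
            self.run_count += 1
            return b"S05"         # break response
        return b""

    def handle(self, body: bytes) -> bytes:
        match = SEQ_PREFIX.match(body)
        if match is None:
            return self._execute(body)   # no sequence-id: current behaviour
        seq, payload = match.group(1), match.group(2)
        if seq == self.last_seq:
            return self.last_reply       # retransmit: re-send cached reply
        reply = seq + b":" + self._execute(payload)  # echo the request's id
        self.last_seq, self.last_reply = seq, reply
        return reply
```

Sending `01:c` twice runs the target once and returns `01:S05` both
times; a following `02:c` is treated as a new command.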

Run Length Encoding:

Is run length encoding supported in all packets, or just some packets?
(For example, not binary packets.)
Why not allow lengths greater than 126? Or does this mean lengths
greater than 97 (as in 126-29)?
If binary packets with 8 bit data can be sent, why not allow RLE to use
lengths greater than 97 as well? If the maximum length really is 126,
then this yields the character 0x9B (126+29), which needs 8 bits;
wouldn't the maximum length in this case be 226 (255-29)? Or is this a
misprint?
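
The arithmetic behind those questions can be checked with a short Python
sketch. It assumes the count is sent as the character with code
count + 29, which is what the "126-29" figure above implies. The decoder
further assumes that the count means "this many additional copies of the
preceding character", so that `0* ` expands to `0000`; that reading is an
assumption here and the names `rle_count_char` and `rle_decode` are
invented for the example.

```python
RLE_OFFSET = 29  # a count n is sent as the character with code n + 29


def rle_count_char(count: int) -> int:
    """Character code used on the wire for a given repeat count."""
    return count + RLE_OFFSET


def rle_decode(data: bytes) -> bytes:
    """Expand <char>*<count char> runs (assumed semantics, see above)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ord("*") and out and i + 1 < len(data):
            repeat = data[i + 1] - RLE_OFFSET
            out.extend(out[-1:] * repeat)  # repeat the previous character
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


# The three figures discussed above:
assert 126 - RLE_OFFSET == 97           # count if 126 is the largest char
assert rle_count_char(126) == 0x9B      # char if 126 is the largest count
assert 255 - RLE_OFFSET == 226          # largest count with an 8 bit char
```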

Why are there 2 methods of RLE? Is it important for a Remote Target to
understand and process both, or is the "cisco encoding" a proprietary
extension of the GDB Remote protocol, and not part of the standard
implementation? The documentation of "cisco encoding" is confusing and
seems to conflict with standard RLE encoding. They appear to be mutually
exclusive. If they are both part of the protocol, how are they
distinguished when used?

Deprecated Messages:

Should an implementation of the protocol implement the deprecated
messages or not? What is the significance of the deprecated messages to
the current implementation?

Character Escaping:

The mechanism of escaping the characters is not defined. Further, it is
only defined as used by write mem binary. Wouldn't it be useful for
future expansion of the protocol to define character escaping as a
global feature of the protocol, so that if any control characters were
required to be sent, they could be escaped in a consistent manner across
all messages? Also, wouldn't the full list of escape characters be
$,#,+,-,*,0x7d? Otherwise, + and - might be processed inadvertently as
ACK or NAK. If this can't happen, then why must they be avoided in RLE?
If they are escaped across all messages, then they could be used in RLE
and not treated specially.
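
To make the idea of a global escaping layer concrete, here is a Python
sketch. The mechanism shown is an assumption, since the document under
review does not define one: each character in the set is replaced by
0x7d followed by the character XORed with 0x20. The escape set is the
full list suggested above, and `escape`/`unescape` are names invented
for this example.

```python
ESCAPE = 0x7D
# The full list suggested above: $ # + - * and the escape byte itself.
MUST_ESCAPE = frozenset(b"$#+-*") | {ESCAPE}


def escape(data: bytes) -> bytes:
    """Escape every byte in MUST_ESCAPE as 0x7d, byte ^ 0x20 (assumed scheme)."""
    out = bytearray()
    for b in data:
        if b in MUST_ESCAPE:
            out += bytes([ESCAPE, b ^ 0x20])
        else:
            out.append(b)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Inverse of escape(); rejects a trailing, unpaired escape byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            if i + 1 >= len(data):
                raise ValueError("dangling escape byte")
            out.append(data[i + 1] ^ 0x20)
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

Applied just before framing and just after unframing, a layer like this
would keep `$`, `#`, `+`, `-` and `*` out of the packet data entirely,
whatever the message type.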

8/7 Bit Protocol:

With the documentation of RAW binary transfers, the protocol moves from
being a strictly 7 bit affair into being an 8 bit capable protocol. If
this is so, then shouldn't all the restrictions that date from the 7 bit
protocol days be lifted to take advantage of the capabilities of an
8 bit message stream (RLE limitations, for example)? Would anyone
seriously be using a computer that had a 7 bit limitation anymore
anyway? (At least a computer that would run GDB with remote debugging.)

Thoughts on consistency and future growth:

Apply RLE as a feature of All messages. (Including binary messages, as
these can probably benefit significantly from it).

Apply the Binary Escaping mechanism as a feature of the packet that is
performed on all messages prior to transmission and immediately after
reception. Define an exhaustive set of "Characters to be escaped".

Introduce message timing constraints.

Properly define sequence-id and allow it to be used from GDB to make
communications secure and reliable.

Steven Johnson
Managing Director
Neurizon Pty Ltd
