This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2] Optimize memory_xfer_partial for remote


On 6/24/2016 3:23 PM, Pedro Alves wrote:
> On 06/24/2016 10:21 PM, Don Breazeal wrote:
>>
>> and with commit hash: 67c059c29e1fb0cdeacdd2005f955514d8d1fb34
>>
> 
> Write:
> 
>  ... with commit 67c059c29e1f ("Improve performance of large restore
>  commands") ...
> 
> so the reader has a clue what the commit is about without having
> to check.

Hi Pedro,
Thanks for your comments; sorry about the coding convention and clarity
issues. The one above and the others are fixed in the attached patch below.

> 
> 
>> gdb/ChangeLog:
>> 2016-06-24  Don Breazeal  <donb@codesourcery.com>
>>
>> 	* remote.c (remote_get_memory_xfer_limit): New function.
>> 	* target-delegates.c (delegate_get_memory_xfer_limit,
>> 	debug_get_memory_xfer_limit, install_delegators,
>> 	install_dummy_methods, init_debug_target): New functions
>> 	and target_ops initialization from regenerating the file.
> 
> The standard practice is to just say:
> 
>  	* target-delegates.c: Regenerate.

Fixed in the attached patch.

> 
>> 	* target.c (default_get_memory_xfer_limit): New function and
>> 	forward declaration.
>> 	(memory_xfer_partial): Call target_ops.to_get_memory_xfer_limit.
>> 	* target.h (struct target_ops)<to_get_memory_xfer_limit>: New
>> 	member.
> 
> Space between ")<".

Fixed in the attached patch.

> 
>>
>> diff --git a/gdb/remote.c b/gdb/remote.c
>> index 501f3c6..03c7ab7 100644
>> --- a/gdb/remote.c
>> +++ b/gdb/remote.c
>> @@ -10160,6 +10160,12 @@ remote_xfer_partial (struct target_ops *ops, enum target_object object,
>>    return TARGET_XFER_OK;
>>  }
>>  
>> +static ULONGEST
>> +remote_get_memory_xfer_limit (struct target_ops *ops)
> 
> Intro comment, something like "Implementation of ... method.".

Fixed in the attached patch.

> 
>>  
>> +/* The default implementation for the to_get_memory_xfer_limit method.
>> +   The hard-coded limit here was determined to be a reasonable default
>> +   that eliminated exponential slowdown on very large transfers without
>> +   unduly compromising performance on smaller transfers.  */
> 
> Where's this coming from?  Is this new experimentation you did,
> or are you talking about Anton's patch?

Both.  I did some experimentation to verify that things were significantly
slower without any memory transfer limit, which they were, although I never
reproduced the extreme scenario Anton had reported.  Presumably the
performance differences were due to hardware and environment differences.
Regarding the comment, I thought some explanation of the hard-coded number
was appropriate.  Is there a better or more preferable way to do this, e.g.
refer to the commit hash, or does it just seem superfluous?

> 
>> @@ -1301,8 +1314,9 @@ memory_xfer_partial (struct target_ops *ops, enum target_object object,
>>  	 by memory_xfer_partial_1.  We will continually malloc
>>  	 and free a copy of the entire write request for breakpoint
>>  	 shadow handling even though we only end up writing a small
>> -	 subset of it.  Cap writes to 4KB to mitigate this.  */
>> -      len = min (4096, len);
>> +	 subset of it.  Cap writes to a limit specified by the target
>> +	 to mitigate this.  */
>> +      len = min (ops->to_get_memory_xfer_limit (ops), len);
>>  
> 
> Does this still work if remote is not the top-most target?

Yes, see next comment.

> 
> E.g., what happens if you do "record" to push a record_statum
> target on top?  Do we get the 4KB default limit, or the
> remote limit?

I ran this experiment and verified that the target delegation mechanism
worked as expected in this case. It walks down the target stack, skipping
targets that don't implement to_get_memory_xfer_limit, and returning after
calling the function in the first target that does implement it. I also
verified that it walks all the way down the target stack and calls the
default function if no target has defined its own version of
to_get_memory_xfer_limit.

Output of the experiment with record and remote is below, followed by the
revised patch.

Thanks
--Don

[build5-trusty-cs(5.68) 694] bin$ gdb ./gdb
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
... GDB banner stuff...
Reading symbols from ./gdb...done.
(gdb) set prompt (top)
(top) b target.c:1319
Breakpoint 1 at 0x65985b: file ../../binutils-gdb/gdb/target.c, line 1319.
(top) run ~/test/big
Starting program: /scratch/dbreazea/gdb-5301/install/bin/gdb ~/test/big
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
GNU gdb (GDB) 7.11.50.20160624-git
...GDB banner stuff...
Reading symbols from /home/dbreazea/test/big...done.
(gdb) tar rem localhost:51111
Remote debugging using localhost:51111
...Reading symbols msgs...
0x00007ffff7ddb2d0 in ?? () from target:/lib64/ld-linux-x86-64.so.2
(gdb) b main
Breakpoint 1 at 0x40051e: file /home/dbreazea/test/big.c, line 9.
(gdb) c
Continuing.
Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target...
Reading /lib/x86_64-linux-gnu/libc-2.19.so from remote target...
Reading /lib/x86_64-linux-gnu/.debug/libc-2.19.so from remote target...

Breakpoint 1, main () at /home/dbreazea/test/big.c:9
9 printf ("starting...\n");
(gdb) record full <---<<< push the target on top of the remote target
(gdb) b 12
Breakpoint 2 at 0x40054d: file /home/dbreazea/test/big.c, line 12.
(gdb) c
Continuing.

Breakpoint 2, main () at /home/dbreazea/test/big.c:12
12 printf ("done, a[%d] is %d\n", 42, a[42]);
(gdb) restore ../../build/gdb/a.srec <---<<< 'restore' for big writes
Restoring section .sec1 (0x6009a0 to 0x6089a0)

Breakpoint 1, memory_xfer_partial (ops=0xe07e20 <record_full_ops>,
object=TARGET_OBJECT_MEMORY, readbuf=0x0, writebuf=0x212c480 "",
memaddr=6293920, len=32768, xfered_len=0x7fffffffdea8)
at ../../binutils-gdb/gdb/target.c:1319
1319 len = min (ops->to_get_memory_xfer_limit (ops), len);
(top) p ops->to_longname
$1 = 0x9ddc00 "Process record and replay target"
(top) s
delegate_get_memory_xfer_limit (self=0xe07e20 <record_full_ops>)
at ../../binutils-gdb/gdb/target-delegates.c:2070
2070 self = self->beneath;
(top) display self->to_longname
1: self->to_longname = 0x9ddc00 "Process record and replay target"
(top) s
2071 return self->to_get_memory_xfer_limit (self);
1: self->to_longname = 0x8fcd00 "Remote serial target in gdb-specific protocol"
(top) s
remote_get_memory_xfer_limit (ops=0xdf0360 <remote_ops>)
at ../../binutils-gdb/gdb/remote.c:10166
10166 return get_memory_write_packet_size ();
(top) n
10167 }
(top) n
delegate_get_memory_xfer_limit (self=0xdf0360 <remote_ops>)
at ../../binutils-gdb/gdb/target-delegates.c:2072
2072 }
1: self->to_longname = 0x8fcd00 "Remote serial target in gdb-specific protocol"
(top) n
memory_xfer_partial (ops=0xe07e20 <record_full_ops>,
object=TARGET_OBJECT_MEMORY, readbuf=0x0, writebuf=0x212c480 "",
memaddr=6293920, len=16383, xfered_len=0x7fffffffdea8)
at ../../binutils-gdb/gdb/target.c:1321
1321 buf = (gdb_byte *) xmalloc (len);
(top) p len
$2 = 16383 <---<<< 16K (approx) len from remote target function

---------- revised patch -----------
Some analysis we did here showed that increasing the cap on the
transfer size in target.c:memory_xfer_partial could give 20% or more
improvement in remote load across JTAG.  Transfer sizes are capped
to 4K bytes because of performance problems encountered with the
restore command, documented here:

https://sourceware.org/ml/gdb-patches/2013-07/msg00611.html

and in commit 67c059c29e1f ("Improve performance of large restore
commands").

The 4K cap was introduced because in a case where the restore command
requested a 100MB transfer, memory_xfer_partial would repeatedy
allocate and copy an entire 100MB buffer in order to properly handle
breakpoint shadow instructions, even though memory_xfer_partial would
actually only write a small portion of the buffer contents.

A couple of alternative solutions were suggested:
* change the algorithm for handling the breakpoint shadow instructions
* throttle the transfer size up or down based on the previous actual
  transfer size

I tried implementing the throttling approach, and my implementation
reduced the performance in some cases.

This patch implements a new target function that returns that target's
limit on memory transfer size.  It defaults to 4K bytes as before, but
for remote it uses remote.c:get_memory_write_packet_size.

The performance differences that I saw were as follows (in seconds),
using an artificially large application and a 100MB srec file:

USB  load:     orig   53.2 patch  18.9
USB  restore:  orig 1522.4 patch 543.6
Enet load:     orig   12.2 patch  10.0
Enet restore:  orig  368.0 patch 294.3

Tested on x86_64 Linux with native and native-gdbserver, and manually
tested 'load' and 'restore' on a Windows 7 host with a bare-metal ARM
board.

gdb/ChangeLog:
2016-06-27  Don Breazeal  <donb@codesourcery.com>

	* remote.c (remote_get_memory_xfer_limit): New function.
	* target-delegates.c: Regenerate.
	* target.c (default_get_memory_xfer_limit): New function and
	forward declaration.
	(memory_xfer_partial): Call target_ops.to_get_memory_xfer_limit.
	* target.h (struct target_ops) <to_get_memory_xfer_limit>: New
	member.

---
 gdb/remote.c           |  9 +++++++++
 gdb/target-delegates.c | 25 +++++++++++++++++++++++++
 gdb/target.c           | 18 ++++++++++++++++--
 gdb/target.h           |  6 ++++++
 4 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index 501f3c6..dfa41ef 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -10160,6 +10160,14 @@ remote_xfer_partial (struct target_ops *ops, enum target_object object,
   return TARGET_XFER_OK;
 }
 
+/* Implementation of to_get_memory_xfer_limit.  */
+
+static ULONGEST
+remote_get_memory_xfer_limit (struct target_ops *ops)
+{
+  return get_memory_write_packet_size ();
+}
+
 static int
 remote_search_memory (struct target_ops* ops,
 		      CORE_ADDR start_addr, ULONGEST search_space_len,
@@ -13073,6 +13081,7 @@ Specify the serial device it is connected to\n\
   remote_ops.to_interrupt = remote_interrupt;
   remote_ops.to_pass_ctrlc = remote_pass_ctrlc;
   remote_ops.to_xfer_partial = remote_xfer_partial;
+  remote_ops.to_get_memory_xfer_limit = remote_get_memory_xfer_limit;
   remote_ops.to_rcmd = remote_rcmd;
   remote_ops.to_pid_to_exec_file = remote_pid_to_exec_file;
   remote_ops.to_log_command = serial_log_command;
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index 03aa2cc..18f22b5 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -2064,6 +2064,27 @@ debug_xfer_partial (struct target_ops *self, enum target_object arg1, const char
   return result;
 }
 
+static ULONGEST
+delegate_get_memory_xfer_limit (struct target_ops *self)
+{
+  self = self->beneath;
+  return self->to_get_memory_xfer_limit (self);
+}
+
+static ULONGEST
+debug_get_memory_xfer_limit (struct target_ops *self)
+{
+  ULONGEST result;
+  fprintf_unfiltered (gdb_stdlog, "-> %s->to_get_memory_xfer_limit (...)\n", debug_target.to_shortname);
+  result = debug_target.to_get_memory_xfer_limit (&debug_target);
+  fprintf_unfiltered (gdb_stdlog, "<- %s->to_get_memory_xfer_limit (", debug_target.to_shortname);
+  target_debug_print_struct_target_ops_p (&debug_target);
+  fputs_unfiltered (") = ", gdb_stdlog);
+  target_debug_print_ULONGEST (result);
+  fputs_unfiltered ("\n", gdb_stdlog);
+  return result;
+}
+
 static VEC(mem_region_s) *
 delegate_memory_map (struct target_ops *self)
 {
@@ -4223,6 +4244,8 @@ install_delegators (struct target_ops *ops)
     ops->to_get_thread_local_address = delegate_get_thread_local_address;
   if (ops->to_xfer_partial == NULL)
     ops->to_xfer_partial = delegate_xfer_partial;
+  if (ops->to_get_memory_xfer_limit == NULL)
+    ops->to_get_memory_xfer_limit = delegate_get_memory_xfer_limit;
   if (ops->to_memory_map == NULL)
     ops->to_memory_map = delegate_memory_map;
   if (ops->to_flash_erase == NULL)
@@ -4454,6 +4477,7 @@ install_dummy_methods (struct target_ops *ops)
   ops->to_goto_bookmark = tdefault_goto_bookmark;
   ops->to_get_thread_local_address = tdefault_get_thread_local_address;
   ops->to_xfer_partial = tdefault_xfer_partial;
+  ops->to_get_memory_xfer_limit = default_get_memory_xfer_limit;
   ops->to_memory_map = tdefault_memory_map;
   ops->to_flash_erase = tdefault_flash_erase;
   ops->to_flash_done = tdefault_flash_done;
@@ -4610,6 +4634,7 @@ init_debug_target (struct target_ops *ops)
   ops->to_goto_bookmark = debug_goto_bookmark;
   ops->to_get_thread_local_address = debug_get_thread_local_address;
   ops->to_xfer_partial = debug_xfer_partial;
+  ops->to_get_memory_xfer_limit = debug_get_memory_xfer_limit;
   ops->to_memory_map = debug_memory_map;
   ops->to_flash_erase = debug_flash_erase;
   ops->to_flash_done = debug_flash_done;
diff --git a/gdb/target.c b/gdb/target.c
index 6f69ac3..57202b4 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -60,6 +60,8 @@ static int default_region_ok_for_hw_watchpoint (struct target_ops *,
 
 static void default_rcmd (struct target_ops *, const char *, struct ui_file *);
 
+static ULONGEST default_get_memory_xfer_limit (struct target_ops *self);
+
 static ptid_t default_get_ada_task_ptid (struct target_ops *self,
 					 long lwp, long tid);
 
@@ -623,6 +625,17 @@ default_terminal_info (struct target_ops *self, const char *args, int from_tty)
   printf_unfiltered (_("No saved terminal information.\n"));
 }
 
+/* The default implementation for the to_get_memory_xfer_limit method.
+   The hard-coded limit here was determined to be a reasonable default
+   that eliminated exponential slowdown on very large transfers without
+   unduly compromising performance on smaller transfers.  */
+
+static ULONGEST
+default_get_memory_xfer_limit (struct target_ops *self)
+{
+  return 4096;
+}
+
 /* A default implementation for the to_get_ada_task_ptid target method.
 
    This function builds the PTID by using both LWP and TID as part of
@@ -1301,8 +1314,9 @@ memory_xfer_partial (struct target_ops *ops, enum target_object object,
 	 by memory_xfer_partial_1.  We will continually malloc
 	 and free a copy of the entire write request for breakpoint
 	 shadow handling even though we only end up writing a small
-	 subset of it.  Cap writes to 4KB to mitigate this.  */
-      len = min (4096, len);
+	 subset of it.  Cap writes to a limit specified by the target
+	 to mitigate this.  */
+      len = min (ops->to_get_memory_xfer_limit (ops), len);
 
       buf = (gdb_byte *) xmalloc (len);
       old_chain = make_cleanup (xfree, buf);
diff --git a/gdb/target.h b/gdb/target.h
index 6b5b6e0..84f12a9 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -745,6 +745,12 @@ struct target_ops
 						ULONGEST *xfered_len)
       TARGET_DEFAULT_RETURN (TARGET_XFER_E_IO);
 
+    /* Return the limit on the size of any single memory transfer
+       for the target.  */
+
+    ULONGEST (*to_get_memory_xfer_limit) (struct target_ops *)
+      TARGET_DEFAULT_FUNC (default_get_memory_xfer_limit);
+
     /* Returns the memory map for the target.  A return value of NULL
        means that no memory map is available.  If a memory address
        does not fall within any returned regions, it's assumed to be
-- 
2.8.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]