This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH] fix some bugs in python completion code

From: Sergio Durigan Junior <sergiodj at redhat dot com>
To: wapiflapi <wapiflapi at yahoo dot fr>
Cc: gdb-patches at sourceware dot org
Date: Sun, 09 Nov 2014 02:17:20 -0500
Subject: Re: [PATCH] fix some bugs in python completion code
Authentication-results: sourceware.org; auth=none
References: <545D458E dot 1010405 at yahoo dot fr>
On Friday, November 07 2014, wapiflapi@yahoo.fr wrote:

> The attached patch fixes several issues with auto-completion for gdb
> python plugins. I cc-ed sergiodj because he contributed most of the
> stuff this is touching.

Thanks for the patch, Wannes.  This code is a mess, it takes a bit of
craziness to hack it :-P.

I do have a few comments to make about the code, and few requests too.

The first and perhaps most important one is that you break your patch
into logical patches; this way, it makes the review process much easier!
In this case, you could have submitted 3 separate patches, each one with
its own explanation.

The second one is that it would be very good if you could send updates
to the testcase too.  I am attaching a patch in this e-mail that updates
the gdb.python/py-completion.exp test in order to catch the second and
third issues pointed by you.  Feel free to extend the test if you want.

> First issue: sizeof/strlen confusion.
>
> The first issue is a simple confusion between sizeof and strlen when
> converting the current word to Unicode. This causes the python plugin
> to only see those first bytes that fit into the size of a pointer and
> returns junks if the word happens to be shorter.
>
> This wasn't picked up in the testsuite because we never use the `word`
> argument and always rely on `text`.

This one is obvious enough that it can go in without waiting for your
copyright assignment process to be completed (we have a rule that allows
small contributions to be accepted even if the person doesn't have a
copyright assignment on file).  Thanks for catching this.  I will commit
it for you ASAP.

> Second issue: incorrect value passed to python as word argument.
>
> The second issue this patch addresses is harder to understand. The way
> auto-completion works requires to first determine the break-characters
> that will be used and only then run the actual completion
> code. Because a completer implemented in python can request that an
> existing gdb completer be used instead, we don't know for sure which
> characters should be considered breaking before running the actual
> completion. To handle this the python completer is called in the very
> beginning and its result cached for when the actual completion should
> be returned.
>
> The issue is that the python completer is given the arguments received
> by the function that is supposed to set the breaking characters. At
> that point the `word` parameter isn't supposed to be used and is
> always the same as `text`. This is normal because it can't yet be
> known. Because of this the python completer sees the wrong value for
> word`. Actually it always receives the same value as for `text`
> (basically the whole line).
>
> To fix this we choose a set of default breaking characters that will
> be used id they are not overwritten which happens if another completer
> ends up being used. Since we set those default breaking characters we
> are able to anticipate and compute the word our-self so that we are
> able to pass it on to python.
>
> This wasn't picked up in the testsuite because we never use the `word`
> argument and always rely on `text`.

Heh, this code is rocket science.  I don't even know where to start.

You are right, the "word" argument is not being passed correctly to the
Python completer.  On the other hand, I am not sure we should use
filename_completer every time in this case.  Your suggestion of letting
the user decide is good, but it will increase the complexity too
much...  Oh, well.  It seems your code is the best we can do for now, so
I say we should use it.  I have a few comments below, anyway.

> Third issue: case-specific duplicate code not repeated enough times.
>
> There is some code handling the specific cases of filename and
> location completion that changes the `text` pointer before calling
> those completers. It is repeated twice, in both cases when a direct
> call to a completer is about to be made. The problem is this isn't
> done in the case it is the python completer delegating to one of
> those. The proposed patch moves this code to the filename and location
> completers themselves. This removes the duplicate code and ensures
> this should work no matter how the completers are called.
>
> This can be tested using testsuite/gdb.python/py-completion.py and the
> completefilemethod testcase. Completing the first argument as a file
> works fine but the second argument fails to auto-complete. This is
> because the filename completer received the whole line instead of just
> the last word in this case. With the patch this works fine.

This one was my fault; I forgot to replicate the code in the new code I
made.  Comments below.

> Using python2.7 or python3.4 this patch doesn't break any of the tests
> in the testsuite/gdb.python;

Since you are touching code that is not Python-specific, you should also
run a full regression test.

> ChangeLog:
>
> 2014-11-07  Wannes Rombouts  <wapiflapi@yahoo.fr>
>
> 	* gdb/completer.c (refactor): Move code preparing for

Your lines here are short; you can use longer lines (76 chars is a good
soft limit).  You also need to mention the functions that you touched in
the file, i.e., the "refactor" text is wrong.

> 	completers to the completers themselves to ensure it is
> 	always called.
> 	* gdb/python/py-cmd.c (cmdpy_completer_helper): Fix bug
> 	caused by call to sizeof instead of strlen.

You can write the text like:

  Call strlen instead of sizeof when passing arguments to
  PyUnicode_Decode.

"Fix bug" is too vague, IMO.

> 	(cmdpy_completer_handle_brkchars): Compute current word
> 	and pass it to python as specified in the documentation.

s/python/Python/

> diff --git a/gdb/completer.c b/gdb/completer.c
> index a0f3fa3..95fb6d8 100644
> --- a/gdb/completer.c
> +++ b/gdb/completer.c
> @@ -118,6 +118,23 @@ filename_completer (struct cmd_list_element *ignore,
>    int subsequent_name;
>    VEC (char_ptr) *return_val = NULL;
>  
> +  const char *start;

No need for newline here.  The declaration can go just below the other
variables.

> +
> +  start = text;

You can initialize "start" in the declaration.

> +  /* Many commands which want to complete on
> +     file names accept several file names, as
> +     in "run foo bar >>baz".  So we don't want
> +     to complete the entire text after the
> +     command, just the last word.  To this
> +     end, we need to find the beginning of the
> +     file name by starting at `word' and going
> +     backwards.  */

This comment is using short lines, but it can be expanded.  If you are
using Emacs, just M-q on it.

> +  for (text = word;
> +       text > start
> +	 && strchr (gdb_completer_file_name_break_characters, text[-1]) == NULL;
> +       text--)
> +    ;
> +
>    subsequent_name = 0;
>    while (1)
>      {
> @@ -197,6 +214,17 @@ location_completer (struct cmd_list_element *ignore,
>    const char *orig_text = text;
>    size_t text_len;
>  
> +  const char *start;
> +
> +  start = text;
> +  /* Commands which complete on locations want to
> +     see the entire argument.  */
> +  for (text = word;
> +       text > start
> +	 && text[-1] != ' ' && text[-1] != '\t';
> +       text--)
> +    ;

Likewise (both comments above).

> +
>    /* Do we have an unquoted colon, as in "break foo.c:bar"?  */
>    for (p = text; *p != '\0'; ++p)
>      {
> @@ -659,42 +687,17 @@ complete_line_internal (const char *text,
>  		  rl_completer_word_break_characters =
>  		    gdb_completer_command_word_break_characters;
>  		}
> -	      else
> +	      else if (reason == handle_brkchars)
>  		{
> -		  /* It is a normal command; what comes after it is
> -		     completed by the command's completer function.  */
>  		  if (c->completer == filename_completer)
> -		    {
> -		      /* Many commands which want to complete on
> -			 file names accept several file names, as
> -			 in "run foo bar >>baz".  So we don't want
> -			 to complete the entire text after the
> -			 command, just the last word.  To this
> -			 end, we need to find the beginning of the
> -			 file name by starting at `word' and going
> -			 backwards.  */
> -		      for (p = word;
> -			   p > tmp_command
> -			     && strchr (gdb_completer_file_name_break_characters, p[-1]) == NULL;
> -			   p--)
> -			;
> -		      rl_completer_word_break_characters =
> -			gdb_completer_file_name_break_characters;
> -		    }
> -		  else if (c->completer == location_completer)
> -		    {
> -		      /* Commands which complete on locations want to
> -			 see the entire argument.  */
> -		      for (p = word;
> -			   p > tmp_command
> -			     && p[-1] != ' ' && p[-1] != '\t';
> -			   p--)
> -			;
> -		    }
> -		  if (reason == handle_brkchars
> -		      && c->completer_handle_brkchars != NULL)
> +		    rl_completer_word_break_characters =
> +		      gdb_completer_file_name_break_characters;
> +		  if (c->completer_handle_brkchars != NULL)
>  		    (*c->completer_handle_brkchars) (c, p, word);
> -		  if (reason != handle_brkchars && c->completer != NULL)
> +		}
> +	      else if (reason == handle_completions)
> +		{
> +		  if (c->completer != NULL)
>  		    list = (*c->completer) (c, p, word);
>  		}
>  	    }
> @@ -743,34 +746,18 @@ complete_line_internal (const char *text,
>  	      if (reason != handle_brkchars)
>  		list = complete_on_enum (c->enums, p, word);
>  	    }
> -	  else
> +
> +	  else if (reason == handle_brkchars)
>  	    {
> -	      /* It is a normal command.  */
>  	      if (c->completer == filename_completer)
> -		{
> -		  /* See the commentary above about the specifics
> -		     of file-name completion.  */
> -		  for (p = word;
> -		       p > tmp_command
> -			 && strchr (gdb_completer_file_name_break_characters, 
> -				    p[-1]) == NULL;
> -		       p--)
> -		    ;
> -		  rl_completer_word_break_characters =
> -		    gdb_completer_file_name_break_characters;
> -		}
> -	      else if (c->completer == location_completer)
> -		{
> -		  for (p = word;
> -		       p > tmp_command
> -			 && p[-1] != ' ' && p[-1] != '\t';
> -		       p--)
> -		    ;
> -		}
> -	      if (reason == handle_brkchars
> -		  && c->completer_handle_brkchars != NULL)
> +		rl_completer_word_break_characters =
> +		  gdb_completer_file_name_break_characters;
> +	      if (c->completer_handle_brkchars != NULL)
>  		(*c->completer_handle_brkchars) (c, p, word);
> -	      if (reason != handle_brkchars && c->completer != NULL)
> +	    }
> +	  else if (reason == handle_completions)
> +	    {
> +	      if (c->completer != NULL)
>  		list = (*c->completer) (c, p, word);

I looked at the code, and your modifications seemed like the right thing
to do, indeed.  Also, they make the code better to read, which is a
blessing for this specific portion of GDB...

>  	    }
>  	}
> diff --git a/gdb/python/py-cmd.c b/gdb/python/py-cmd.c
> index 5fc656e..6ebfa5b 100644
> --- a/gdb/python/py-cmd.c
> +++ b/gdb/python/py-cmd.c
> @@ -28,6 +28,9 @@
>  #include "completer.h"
>  #include "language.h"
>  
> +/* Needed for rl_completer_word_break_characters */
> +#include "readline/readline.h"
> +
>  /* Struct representing built-in completion types.  */
>  struct cmdpy_completer
>  {
> @@ -269,7 +272,7 @@ cmdpy_completer_helper (struct cmd_list_element *command,
>        textobj = PyUnicode_Decode (text, strlen (text), host_charset (), NULL);
>        if (textobj == NULL)
>  	error (_("Could not convert argument to Python string."));
> -      wordobj = PyUnicode_Decode (word, sizeof (word), host_charset (), NULL);
> +      wordobj = PyUnicode_Decode (word, strlen (word), host_charset (), NULL);

As explained above, I will prepare a patch and check this in as obvious.

>        if (wordobj == NULL)
>  	{
>  	  Py_DECREF (textobj);
> @@ -306,6 +309,28 @@ cmdpy_completer_handle_brkchars (struct cmd_list_element *command,
>  
>    cleanup = ensure_python_env (get_current_arch (), current_language);
>  
> +  /* The following line sets the default break characters in case the
> +     python complete method chooses to do the completion itself.
> +     This is optional and can be left out to keep the language defaults.
> +
> +     TODO: We should let the user be able to choose and/or know this.
> +
> +     Because if a python completer chooses to do the completion instead
> +     of relying on for example COMPLETE_EXPRESSION, chances are it is
> +     doing something which expects more <shell> like arguments. We use
> +     the same reduced set of breaking characters for this as for files.
> +  */

What about:

  /* We have to decide which word break characters to use in the code
     below, because the user might want to do the completion by
     herself, and we have to adjust "word" accordingly.  Since there is
     no easy way to determine which kind of completion the user might
     do, we decide to use the set of word break characters from
     filename_completer, which provides a good flexibility to set the
     boundaries of "word" here.

     FIXME: We should allow the user to choose this setting.  */

If you decide to tweak your comment: you have to put two spaces after
the period in your sentences.

> +
> +  set_gdb_completion_word_break_characters (filename_completer);

Hm, I don't like this.  Here, you are basically interested in obtaining
the word break chars related to filename_completer, in order to adjust
"word" properly below.  Therefore, I don't think it's a good idea to
mess with rl_completer_word_break_characters just for that; this code is
fragile enough already...

The code below already exists in completer.c, but it is part of the
filename_completer function.  What if you moved this code to a function
there, and exported the function, so that you could call it from here?
It would be cleaner, self-contained, and would avoid code duplication.

Or, as you mentioned in the comment already, we could remove this call
to set_gdb_completion_word_break_characters and stick with what was set
before this function was called.  But I guess I would prefer the
approach above.

> +
> +  /* Because this is the first time we are called we received a dummy
> +     value for word (we aren't supposed to need it). Since we use it
> +     we need to compute its correct value ourselves. */

What about:

  /* We receive a dummy "word" here, which has the same value as "text".
     However, in order to define which kind of word break characters we
     will use, we call the Python "completer" method early.  This means
     that we need to adjust "word" here, because if the user wants to do
     her own completion using the Python "completer" method (i.e.,
     without using GDB's completers), she needs to receive the proper
     "word" now.  */

?

> +  word = text + strlen (text);
> +  while (word > text
> +	 && strchr (rl_completer_word_break_characters, word[-1]) == NULL)
> +    word--;
> +
>    /* Calling our helper to obtain the PyObject of the Python
>       function.  */
>    resultobj = cmdpy_completer_helper (command, text, word, 1);

Below is the patch that tests the issues you pointed.  I would
appreciate if you could give it a try.  You can also include it when you
resubmit your patches.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

2014-11-09  Sergio Durigan Junior  <sergiodj@redhat.com>

	* gdb.python/py-completion.exp: Adjust matching variables to
	include regex operands.  Add new tests for completions with
	more than one directory in the same line, and for completions made
	by the Python "completer" method itself.
	* gdb.python/py-completion.py (CompleteFileCommandCond2): New
	class, which completes commands by itself without relying on GDB.
	Instantiate the class.

diff --git a/gdb/testsuite/gdb.python/py-completion.exp b/gdb/testsuite/gdb.python/py-completion.exp
index a3bac8b..7e37a66 100644
--- a/gdb/testsuite/gdb.python/py-completion.exp
+++ b/gdb/testsuite/gdb.python/py-completion.exp
@@ -27,14 +27,15 @@ gdb_test_no_output "source ${srcdir}/${subdir}/${testfile}.py"
 
 # Create a temporary directory
 set testdir "[standard_output_file "py-completion-testdir"]/"
-set testdir_regex [string_to_regexp $testdir]
+set testdir_regex [string_to_regexp "$testdir"]
 set testdir_complete [standard_output_file "py-completion-test"]
+set testdir_complete_regex [string_to_regexp "$testdir_complete"]
 file mkdir $testdir
 
 # This one should always pass.
 send_gdb "completefileinit ${testdir_complete}\t"
 gdb_test_multiple "" "completefileinit completion" {
-    -re "^completefileinit ${testdir_regex}$" {
+    -re "^completefileinit ${testdir}$" {
         pass "completefileinit completion"
     }
 }
@@ -60,10 +61,35 @@ gdb_test " " ".*" "discard #2"
 set completion_regex "[string_to_regexp [standard_output_file "py-completion-t"]]"
 send_gdb "completefilecommandcond [standard_output_file "py-completion-t\t"]"
 gdb_test_multiple "" "completefilecommandcond completion" {
-    -re "^completefilecommandcond ${testdir}$" {
+    -re "^completefilecommandcond ${testdir_regex}$" {
 	fail "completefilecommandcond completion (completed filename instead of command)"
     }
     -re "^completefilecommandcond ${completion_regex}\007$" {
 	pass "completefilecommandcond completion"
     }
 }
+
+gdb_test " " ".*" "discard #3"
+
+# Completing two directories consecutively
+send_gdb "completefilemethod ${testdir} ${testdir_complete}\t"
+gdb_test_multiple "" "completefilemethod completion for two dirs" {
+    -re "^completefilemethod ${testdir_regex} ${testdir_complete_regex}\007$" {
+	fail "completefilemethod completion (did not complete two filenames consecutively)"
+    }
+    -re "^completefilemethod ${testdir_regex} ${testdir_regex}$" {
+	pass "completefilemethod completion (two filenames consecutively)"
+    }
+}
+
+gdb_test " " ".*" "discard #4"
+
+send_gdb "completefilecommandcond2 testcompl\t"
+gdb_test_multiple "" "completefilecommandcond2 completion" {
+    -re "^completefilecommandcond2 testcompl\007$" {
+	fail "completefilecommandcond2 complete command based on word"
+    }
+    -re "^completefilecommandcond2 testcompletion $" {
+	pass "completefilecommandcond2 complete command based on word"
+    }
+}
diff --git a/gdb/testsuite/gdb.python/py-completion.py b/gdb/testsuite/gdb.python/py-completion.py
index 23592d0..cdb58ac 100644
--- a/gdb/testsuite/gdb.python/py-completion.py
+++ b/gdb/testsuite/gdb.python/py-completion.py
@@ -53,6 +53,21 @@ class CompleteFileCommandCond(gdb.Command):
 		else:
 			return gdb.COMPLETE_FILENAME
 
+class CompleteFileCommandCond2(gdb.Command):
+	def __init__(self):
+		gdb.Command.__init__(self,'completefilecommandcond2',gdb.COMMAND_USER)
+
+	def invoke(self,argument,from_tty):
+		raise gdb.GdbError('not implemented')
+
+	def complete(self,text,word):
+		if word == "testcompl":
+			return [ "testcompletion" ]
+		else:
+			return false
+
+
 CompleteFileInit()
 CompleteFileMethod()
 CompleteFileCommandCond()
+CompleteFileCommandCond2()
References:
- [PATCH] fix some bugs in python completion code
  - From: wapiflapi
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]