Bug 10705 - python traceback on bad string character decoding
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: python
Version: archer
Importance: P2 normal
Target Milestone: 7.1
Assignee: Phil Muldoon
Reported: 2009-09-29 20:00 UTC by Mark Wielaard
Modified: 2010-01-15 23:44 UTC
CC List: 4 users

Last reconfirmed: 2009-10-19 20:06:39


Description Mark Wielaard 2009-09-29 20:00:39 UTC
GNU gdb (GDB) Fedora (6.8.50.20090302-38.fc11)

While printing a C++ class that contained an (uninitialized/bad)
string value, I got the following Python traceback:
          module_name = Traceback (most recent call last):
            File "/usr/lib/python2.6/site-packages/gdb/libstdcxx/v6/printers.py", line 453, in to_string
              return self.val['_M_dataplus']['_M_p'].string(encoding)
            File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
              return codecs.utf_8_decode(input, errors, True)
          UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
It would be much nicer if it would just say <can't decode string> or something
similar.
Comment 1 Tom Tromey 2009-10-19 20:06:39 UTC
We could try playing with the "errors" parameter to Value.string.
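
As a rough illustration of what the "errors" parameter changes, the sketch below uses plain Python 3 decoding rather than gdb itself. It assumes that Value.string hands its errors argument to the codec in the same way bytes.decode does. The sample bytes and the helper name decode_with_policy are made up for the example.

```python
# Hypothetical stand-in for bytes read from the inferior; 0x80 is the
# byte that triggered the UnicodeDecodeError in the report.
raw = b"\x80abc"


def decode_with_policy(data, encoding="utf-8", errors="strict"):
    """Decode bytes, returning a placeholder instead of raising."""
    try:
        return data.decode(encoding, errors)
    except UnicodeDecodeError:
        return "<can't decode string>"


# "strict" raises inside the helper, so the placeholder comes back.
strict_result = decode_with_policy(raw)
# "replace" substitutes U+FFFD for the undecodable byte.
replaced = decode_with_policy(raw, errors="replace")
# "ignore" silently drops the undecodable byte.
ignored = decode_with_policy(raw, errors="ignore")

# Inside gdb, the analogous call (assumed, not taken from this report)
# would look something like:
#     val['_M_dataplus']['_M_p'].string(encoding, errors='replace')
```

Either non-strict policy avoids the traceback, but both still decode in Python, so neither honors gdb's own print settings.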

Long term I think we want to have Value.string (or a replacement if
this cannot be done compatibly) return a Value that wraps a "lazy string".
This would require a small extension to struct value to let us
pass in the encoding (when known).

Additionally we'd want the pretty-printer string-printing code to
then respect "set print repeat", and generally work like the C string printer
when it comes to undecodable characters.
Comment 2 Tom Tromey 2009-11-24 17:17:00 UTC
Phil looked at changing struct value, but the val_print code makes
that change too big.  We should probably get rid of val_print in favor of
value_print everywhere, but ...

Another approach to this problem would be to make a new Value.lazy_string
method.  This would return a Python object holding a pointer, type,
optional length, and optional encoding.  Then, change the code in
py-prettyprint.c to recognize these objects and print them via the
language string-printing function.  This would solve the immediate problems.
A new method may not be needed, if the new object can be made to look
sufficiently string-like to Python.
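
The object described above might be pictured roughly as follows. This is only a sketch in plain Python, not gdb's implementation: LazyStringSketch, is_lazy_string, print_result and the language_printstr callback are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LazyStringSketch:
    """Hypothetical stand-in for the proposed object.

    It records where the string lives but never reads or decodes it.
    """
    address: int                    # pointer into inferior memory
    type_name: str                  # element type, e.g. "char"
    length: Optional[int] = None    # None: the printer decides where to stop
    encoding: Optional[str] = None  # None: the printer picks one for the type


def is_lazy_string(obj):
    """Printer-side check (hypothetical) for the new object kind."""
    return isinstance(obj, LazyStringSketch)


def print_result(obj, language_printstr):
    """Route lazy strings to the language printer; str() everything else."""
    if is_lazy_string(obj):
        return language_printstr(obj.address, obj.type_name,
                                 obj.length, obj.encoding)
    return str(obj)
```

Because decoding is deferred to the language's string-printing code, undecodable bytes would be handled the same way as for an ordinary C string rather than raising in Python.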
Comment 3 Sourceware Commits 2010-01-14 08:03:58 UTC
Subject: Bug 10705

CVSROOT:	/cvs/src
Module name:	src
Changes by:	pmuldoon@sourceware.org	2010-01-14 08:03:38

Modified files:
	gdb            : ChangeLog Makefile.in ada-lang.h ada-valprint.c 
	                 c-lang.c c-lang.h c-valprint.c expprint.c 
	                 f-lang.c f-valprint.c language.c language.h 
	                 m2-lang.c m2-valprint.c objc-lang.c p-lang.c 
	                 p-lang.h p-valprint.c scm-lang.c valprint.c 
	                 varobj.c 
	gdb/doc        : ChangeLog gdb.texinfo 
	gdb/python     : py-prettyprint.c py-value.c python-internal.h 
	                 python.c 
	gdb/testsuite  : ChangeLog 
	gdb/testsuite/gdb.python: py-mi.exp py-prettyprint.c 
	                          py-prettyprint.exp py-prettyprint.py 
	                          py-value.c py-value.exp 
Added files:
	gdb/python     : py-lazy-string.c 

Log message:
	2010-01-13  Phil Muldoon  <pmuldoon@redhat.com>
	
	PR python/10705
	
	* python/python-internal.h: Add lazy_string_object_type
	definition.
	(create_lazy_string_object, gdbpy_initialize_lazy_string)
	(gdbpy_is_lazystring, gdbpy_extract_lazy_string): Define.
	* python/py-value.c (valpy_lazy_string): New function.
	(convert_value_from_python): Add lazy string conversion.
	* python/py-prettyprint.c (pretty_print_one_value): Check if
	return is also a lazy string.
	(print_string_repr): Add lazy string printing branch.
	(print_children): Likewise.
	* python/py-lazy-string.c: New file. Implement lazy strings.
	* python/python.c (_initialize_python): Call
	gdbpy_initialize_lazy_string.
	* varobj.c (value_get_print_value): Add lazy string printing
	branch.  Account for encoding.
	* c-lang.c (c_printstr): Account for new encoding argument.  If
	encoding is NULL, find encoding suited for type, otherwise use
	user encoding.
	* language.h (language_defn): Add encoding argument.
	(LA_PRINT_STRING): Likewise.
	* language.c (unk_lang_printstr): Update to reflect new encoding
	argument to language_defn.
	* ada-lang.h (ada_printstr): Likewise.
	* c-lang.h (c_printstr): Likewise.
	* p-lang.h (pascal_printstr): Likewise.
	* f-lang.c (f_printstr): Likewise.
	* m2-lang.c (m2_printstr): Likewise.
	* objc-lang.c (objc_printstr): Likewise.
	* p-lang.c (pascal_printstr): Likewise.
	* scm-lang.c (scm_printstr): Likewise.
	* c-valprint.c (c_val_print): Update LA_PRINT_STRING call for
	encoding argument.
	* ada-valprint.c (ada_printstr): Likewise.
	* f-valprint.c (f_val_print): Likewise.
	* m2-valprint.c (m2_val_print): Likewise.
	* p-valprint.c (pascal_val_print): Likewise.
	* expprint.c (print_subexp_standard): Likewise.
	* valprint.c (val_print_string): Likewise.
	* Makefile.in (SUBDIR_PYTHON_OBS): Add py-lazy-string.
	(SUBDIR_PYTHON_SRCS): Likewise.
	(py-lazy-string.o): New rule.
	
	2010-01-13  Phil Muldoon  <pmuldoon@redhat.com>
	
	* gdb.texinfo (Values From Inferior): Document lazy_string value
	method.
	(Python API): Add Lazy strings menu item.
	(Lazy Strings In Python): New node.
	
	2010-01-13  Phil Muldoon  <pmuldoon@redhat.com>
	
	* gdb.python/py-value.exp (test_lazy_strings): Add lazy string test.
	* gdb.python/py-prettyprint.py (pp_ls): New printer.
	* gdb.python/py-prettyprint.exp (run_lang_tests): Add lazy string
	test.
	* gdb.python/py-prettyprint.c: Define lazystring test structure.
	* gdb.python/py-mi.exp: Add lazy string test.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/ChangeLog.diff?cvsroot=src&r1=1.11242&r2=1.11243
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/Makefile.in.diff?cvsroot=src&r1=1.1108&r2=1.1109
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/ada-lang.h.diff?cvsroot=src&r1=1.50&r2=1.51
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/ada-valprint.c.diff?cvsroot=src&r1=1.60&r2=1.61
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/c-lang.c.diff?cvsroot=src&r1=1.79&r2=1.80
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/c-lang.h.diff?cvsroot=src&r1=1.25&r2=1.26
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/c-valprint.c.diff?cvsroot=src&r1=1.64&r2=1.65
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/expprint.c.diff?cvsroot=src&r1=1.40&r2=1.41
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/f-lang.c.diff?cvsroot=src&r1=1.58&r2=1.59
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/f-valprint.c.diff?cvsroot=src&r1=1.54&r2=1.55
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/language.c.diff?cvsroot=src&r1=1.93&r2=1.94
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/language.h.diff?cvsroot=src&r1=1.61&r2=1.62
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/m2-lang.c.diff?cvsroot=src&r1=1.53&r2=1.54
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/m2-valprint.c.diff?cvsroot=src&r1=1.25&r2=1.26
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/objc-lang.c.diff?cvsroot=src&r1=1.85&r2=1.86
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/p-lang.c.diff?cvsroot=src&r1=1.49&r2=1.50
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/p-lang.h.diff?cvsroot=src&r1=1.20&r2=1.21
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/p-valprint.c.diff?cvsroot=src&r1=1.66&r2=1.67
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/scm-lang.c.diff?cvsroot=src&r1=1.58&r2=1.59
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/valprint.c.diff?cvsroot=src&r1=1.89&r2=1.90
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/varobj.c.diff?cvsroot=src&r1=1.152&r2=1.153
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/doc/ChangeLog.diff?cvsroot=src&r1=1.993&r2=1.994
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/doc/gdb.texinfo.diff?cvsroot=src&r1=1.657&r2=1.658
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/python/py-lazy-string.c.diff?cvsroot=src&r1=NONE&r2=1.1
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/python/py-prettyprint.c.diff?cvsroot=src&r1=1.3&r2=1.4
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/python/py-value.c.diff?cvsroot=src&r1=1.6&r2=1.7
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/python/python-internal.h.diff?cvsroot=src&r1=1.18&r2=1.19
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/python/python.c.diff?cvsroot=src&r1=1.22&r2=1.23
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/ChangeLog.diff?cvsroot=src&r1=1.2081&r2=1.2082
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-mi.exp.diff?cvsroot=src&r1=1.4&r2=1.5
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-prettyprint.c.diff?cvsroot=src&r1=1.3&r2=1.4
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-prettyprint.exp.diff?cvsroot=src&r1=1.5&r2=1.6
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-prettyprint.py.diff?cvsroot=src&r1=1.3&r2=1.4
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-value.c.diff?cvsroot=src&r1=1.3&r2=1.4
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/testsuite/gdb.python/py-value.exp.diff?cvsroot=src&r1=1.4&r2=1.5

Comment 4 Tom Tromey 2010-01-14 19:47:49 UTC
Phil checked in the fix.
Comment 5 Phil Muldoon 2010-01-15 23:43:24 UTC
Adding a link to the GCC libstdc++ printers.py changes.

http://gcc.gnu.org/viewcvs?view=revision&revision=155951
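
The linked revision is not reproduced here. As a rough sketch of how a pretty-printer might use the new Value.lazy_string method in place of Value.string, assuming the same member names as in the traceback from the description (the class name StdStringPrinterSketch is hypothetical):

```python
class StdStringPrinterSketch:
    """Illustrative printer; not the code from the linked revision."""

    def __init__(self, val):
        # val is expected to be a gdb.Value for a std::string object.
        self.val = val

    def to_string(self):
        # Return the lazy string unmodified: gdb then prints it with its
        # own string-printing code, so Python never decodes the bytes.
        return self.val['_M_dataplus']['_M_p'].lazy_string()

    def display_hint(self):
        # Tell gdb to format the result as a string.
        return 'string'
```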
Comment 6 Phil Muldoon 2010-01-15 23:44:28 UTC
Re-closing after adding the libstdc++ link.