[patch][rfc] Implement Python lazy strings (PR 10705)
Tom Tromey
tromey@redhat.com
Mon Dec 21 20:06:00 GMT 2009
>>>>> "Phil" == Phil Muldoon <pmuldoon@redhat.com> writes:
Phil> I wanted to send this patch out before the holidays for initial
Phil> review. This patch implements Python lazy strings, and also
Phil> alters the Python pretty-printing process to handle them
Phil> accordingly.
This looks quite nice.
I think you should submit the next revision of this patch upstream. In
general I think that fixes and modifications to Python code should go
upstream if the underlying functionality is already upstream.
Phil> +If the optional @var{encoding} argument is given, it must be a string
Phil> +naming the encoding of the string in the @code{gdb.LazyString}, such as
Phil> +@code{"ascii"}, @code{"iso-8859-6"} or @code{"utf-8"}. When a lazy
Phil> +string is printed, the @value{GDBN} codec machinery is used to convert
Phil> +the string during printing. If @var{encoding} is not given, or if
Phil> +@var{encoding} is an empty string, then @var{encoding} is set to
Phil> +NULL. When @value{GDBN} attempts to print a @code{gdb.LazyString}
Phil> +where @var{encoding} is set to NULL, @value{GDBN} will automatically
Phil> +select the encoding most suitable for the string type.
I think this paragraph should be reorganized a little. The "When a lazy
string is printed" sentence should probably be in its own paragraph.
Phil> +If the optional @var{length} argument is given, the string will be
Phil> +fetched and encoded to the given length.
This should mention whether the length is in chars or bytes, and what
happens if no length is given.
Phil> +@defivar LazyString length
Phil> +This attribute holds the length of the string. This attribute is not
Phil> +writable.
Likewise.
Phil> +@defivar LazyString encoding
Phil> +This attribute holds the encoding that will be applied to the string
Phil> +when the string is printed by @value{GDBN}. This attribute is not
Phil> +writable.
This should be None if there is no encoding.
Phil> +@defivar LazyString type
Phil> +This attribute holds the type that is associated with the string.
This should mention if it is the string's type or the underlying
character type.
Phil> +typedef struct {
Phil> + PyObject_HEAD
Phil> + CORE_ADDR address;
Phil> + char *encoding;
Phil> + long length;
Phil> + struct type *type;
Phil> +} lazy_string_object;
I try to put comments on each local field (not HEAD) indicating the
meaning of the field. This could mirror the docs, more or less.
Phil> +static PyObject *
Phil> +stpy_get_encoding (PyObject *self, void *closure)
Phil> +{
Phil> + lazy_string_object *self_string = (lazy_string_object *) self;
Phil> +
Phil> + /* An encoding can be set to NULL by the user, so check before
Phil> + attempting a Python FromString call. */
Phil> + if (self_string->encoding)
Phil> + return PyString_FromString (self_string->encoding);
Phil> +
Phil> + return NULL;
I think this should return None if there is no encoding.
Phil> +PyObject *
Phil> +stpy_get_type (PyObject *self, void *closure)
Phil> +{
Phil> + lazy_string_object *obj = (lazy_string_object *) self;
Phil> + PyObject *type;
Phil> +
Phil> + if (obj->type)
Phil> + type = type_to_type_object (obj->type);
Phil> + else
Phil> + return NULL;
I think the lack of a type should be detected and rejected when the
object is constructed.
BTW it isn't generally valid to just return NULL in a Python method.
You must set the error first.
Phil> +PyObject *
Phil> +create_lazy_string_object (CORE_ADDR address, long length,
Phil> + const char *encoding, struct type *type)
I think we usually use some py_ name for exported APIs.
Phil> + str_obj = PyObject_New (lazy_string_object, &lazy_string_object_type);
Phil> + if (!str_obj)
Phil> + {
Phil> + PyErr_SetString (PyExc_MemoryError,
Phil> + "Could not allocate lazy string object.");
Phil> + return NULL;
Phil> + }
I think if PyObject_New returns NULL, then it has probably already set
the error, and you can simply return NULL without setting it again.
Phil> + if (address == 0)
Phil> + {
Phil> + PyErr_SetString (PyExc_MemoryError,
Phil> + "Cannot create a lazy string from a GDB-side string.");
Phil> + return NULL;
Phil> + }
Do this check first, before making a new object.
Otherwise, this leaks the new object.
Phil> + str_obj->address = address;
Phil> + str_obj->length = length;
Phil> + if (encoding == NULL || strcmp (encoding, ""))
You probably meant !strcmp here.
Phil> + type_incref (type);
Upstream doesn't have this stuff FWIW.
Hopefully we'll fix it up next year...
Phil> + val = value_at_lazy (self_string->type, self_string->address);
I guess this means the lazy string holds the string type, not the
character type.
Phil> + xfree (((lazy_string_object *) self)->encoding);
Phil> + type_decref (((lazy_string_object *) self)->type);
Introduce a temporary to avoid duplicating the ugly cast.
Phil> +/* Determine whether the printer object pointed to by OBJ is a
Phil> + Python lazy string. */
Phil> +int is_lazy_string (PyObject *result)
Newline after "int".
Probably should have a py_ name for this and for extract_lazy_string.
Phil> +gdb_byte *
Phil> +extract_lazy_string (PyObject *string, struct type
Phil> + **str_type, long *length,
Don't line break before "**str_type", that looks weird.
Phil> + errcode = read_string (value_address (output), *length, width,
Phil> + *length, byte_order, &buffer,
Phil> + &bytes_read);
This seems fishy. Doesn't this mean we end up reading the full contents
of the string?
I looked a little and it seems this is what gdb already does.
I guess I am confused about something here.
I suppose this does solve the decoding problem nicely, just not the full
laziness problem.
Also, note that read_string will allocate 'buffer' even on error (not on
exception though). So this should free buffer if read_string returns an
error.
And, read_string can throw, so you need a try-catch here.
Phil> + if (is_lazy_string (py_v))
Phil> + {
Phil> + output = extract_lazy_string (py_v, &type, &length, &encoding);
Phil> + xfree (encoding);
Phil> + }
[...]
Phil> + fputs_filtered (output, stream);
It isn't ok to print bytes from the target like this.
They have to be converted to the host encoding.
Phil> +static PyObject *
Phil> +valpy_lazy_string (PyObject *self, PyObject *args, PyObject *kw)
[...]
Phil> + str_obj = create_lazy_string_object (value_address (value), length,
Phil> + user_encoding, value_type (value));
Phil> + if (str_obj)
Phil> + Py_INCREF (str_obj);
I think create_lazy_string_object returns a new reference.
That means this incref is wrong.
Tom
More information about the Archer
mailing list