Bug 26901 - Array subscript fails with flexible array member without size
Summary: Array subscript fails with flexible array member without size
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb (show other bugs)
Version: 10.1
: P2 normal
Target Milestone: 10.2
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-15 02:47 UTC by Simon Marchi
Modified: 2021-11-25 13:58 UTC (History)
12 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Marchi 2020-11-15 02:47:00 UTC
Consider this program:

---
#include <stdlib.h>

struct vectorinox
{
  int size;
  int data[];
};

int main (void)
{
    /* Make a vector of three elements.  */
    struct vectorinox *vector = malloc (sizeof (struct vectorinox) + sizeof(int) * 3);
    vector->size = 3;
    vector->data[0] = 11;
    vector->data[1] = 22;
    vector->data[2] = 33;

    return 0;
}
---

Trying to access an element of the `data` array in Python yields:

>>> vec = gdb.parse_and_eval('vector')
>>> data = vec['data']
>>> print(data[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
gdb.MemoryError: Cannot access memory at address 0x5535555592a0

And with GDB 9.2:

>>> vec = gdb.parse_and_eval('vector')
>>> data = vec['data']
>>> print(data[0])
11

This failure was introduced by commit 7c6f27129631 ("gdb: make get_discrete_bounds check for non-constant range bounds").  Unfortunately, this commit doesn't build, but it's trivial to fix if you want to try it: just remove the parenthesis after `kind` that it introduces.

val_subscript passes the array's index type (of type code TYPE_CODE_RANGE) to get_discrete_bounds.  The index type has the low bound set to constant 0 and the high bound unknown.  Before the commit, get_discrete_bounds would return "success" and set the low and high bound to 0.  Although it's a bit by chance that it returned 0 for the high bound, since the bound was "unknown".  It doesn't really matter in that case because the high bound doesn't get used by the caller.

After the commit, the new check in get_discrete_bounds sees that the high bound isn't a constant, so returns "failure".  However, val_subscript doesn't check the return value, and uses the uninitialized values of low and high bounds, and it goes downhill from there.
Comment 1 Simon Marchi 2020-11-15 02:50:10 UTC
Actually, it doesn't to be from Python, as value_subscript is used by everything.  It fails on the command line too:

(gdb) p vector.data [1]
Cannot access memory at address 0x5545555592a4
Comment 2 Simon Marchi 2020-12-03 19:01:16 UTC
Proposed patch here: https://sourceware.org/pipermail/gdb-patches/2020-November/173496.html
Comment 3 Sourceware Commits 2020-12-09 18:53:07 UTC
The master branch has been updated by Simon Marchi <simark@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=5b56203a7cadd545b33713e98e274e582242e090

commit 5b56203a7cadd545b33713e98e274e582242e090
Author: Simon Marchi <simon.marchi@efficios.com>
Date:   Wed Dec 9 13:52:12 2020 -0500

    gdb: fix value_subscript when array upper bound is not known
    
    Since commit 7c6f27129631 ("gdb: make get_discrete_bounds check for
    non-constant range bounds"), subscripting  flexible array member fails:
    
        struct no_size
        {
          int n;
          int items[];
        };
    
        (gdb) p *ns
        $1 = {n = 3, items = 0x5555555592a4}
        (gdb) p ns->items[0]
        Cannot access memory at address 0xfffe555b733a0164
        (gdb) p *((int *) 0x5555555592a4)
        $2 = 101  <--- we would expect that
        (gdb) p &ns->items[0]
        $3 = (int *) 0xfffe5559ee829a24  <--- wrong address
    
    Since the flexible array member (items) has an unspecified size, the array type
    created for it in the DWARF doesn't have dimensions (this is with gcc 9.3.0,
    Ubuntu 20.04):
    
        0x000000a4:   DW_TAG_array_type
                        DW_AT_type [DW_FORM_ref4]       (0x00000038 "int")
                        DW_AT_sibling [DW_FORM_ref4]    (0x000000b3)
    
        0x000000ad:     DW_TAG_subrange_type
                          DW_AT_type [DW_FORM_ref4]     (0x00000031 "long unsigned int")
    
    This causes GDB to create a range type (TYPE_CODE_RANGE) with a defined
    constant low bound (dynamic_prop with kind PROP_CONST) and an undefined
    high bound (dynamic_prop with kind PROP_UNDEFINED).
    
    value_subscript gets both bounds of that range using
    get_discrete_bounds.  Before commit 7c6f27129631, get_discrete_bounds
    didn't check the kind of the dynamic_props and would just blindly read
    them as if they were PROP_CONST.  It would return 0 for the high bound,
    because we zero-initialize the range_bounds structure.  And it didn't
    really matter in this case, because the returned high bound wasn't used
    in the end.
    
    Commit 7c6f27129631 changed get_discrete_bounds to return a failure if
    either the low or high bound is not a constant, to make sure we don't
    read a dynamic prop that isn't a PROP_CONST as a PROP_CONST.  This
    change made get_discrete_bounds start to return a failure for that
    range, and as a result would not set *lowp and *highp.  And since
    value_subscript doesn't check get_discrete_bounds' return value, it just
    carries on an uses an uninitialized value for the low bound.  If
    value_subscript did check the return value of get_discrete_bounds, we
    would get an error message instead of a bogus value.  But it would still
    be a bug, as we wouldn't be able to print the flexible array member's
    elements.
    
    Looking at value_subscript, we see that the low bound is always needed,
    but the high bound is only needed if !c_style.  So, change
    value_subscript to use get_discrete_low_bound and
    get_discrete_high_bound separately.  This fixes the case described
    above, where the low bound is known but the high bound isn't (and is not
    needed).  This restores the original behavior without accessing a
    dynamic_prop in a wrong way.
    
    A test is added.  In addition to the case described above, a case with
    an array member of size 0 is added, which is a GNU C extension that
    existed before flexible array members were introduced.  That case
    currently fails when compiled with gcc <= 8.  gcc <= 8 produces DWARF
    similar to the one shown above, while gcc 9 adds a DW_AT_count of 0 in
    there, which makes the high bound known.  A case where an array member
    of size 0 is the only member of the struct is also added, as that was
    how PR 28675 was originally reported, and it's an interesting corner
    case that I think could trigger other funny bugs.
    
    Question about the implementation: in value_subscript, I made it such
    that if the low or high bound is unknown, we fall back to zero.  That
    effectively makes it the same as it was before 7c6f27129631.  But should
    we instead error() out?
    
    gdb/ChangeLog:
    
            PR 26875, PR 26901
            * gdbtypes.c (get_discrete_low_bound): Make non-static.
            (get_discrete_high_bound): Make non-static.
            * gdbtypes.h (get_discrete_low_bound): New declaration.
            (get_discrete_high_bound): New declaration.
            * valarith.c (value_subscript): Only fetch high bound if
            necessary.
    
    gdb/testsuite/ChangeLog:
    
            PR 26875, PR 26901
            * gdb.base/flexible-array-member.c: New test.
            * gdb.base/flexible-array-member.exp: New test.
    
    Change-Id: I832056f80e6c56f621f398b4780d55a3a1e299d7
Comment 4 Sourceware Commits 2020-12-09 21:34:25 UTC
The gdb-10-branch branch has been updated by Simon Marchi <simark@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=138084b24811c191a85c919116d839a8bd89d57c

commit 138084b24811c191a85c919116d839a8bd89d57c
Author: Simon Marchi <simon.marchi@efficios.com>
Date:   Wed Dec 9 13:52:12 2020 -0500

    gdb: fix value_subscript when array upper bound is not known
    
    Since commit 7c6f27129631 ("gdb: make get_discrete_bounds check for
    non-constant range bounds"), subscripting  flexible array member fails:
    
        struct no_size
        {
          int n;
          int items[];
        };
    
        (gdb) p *ns
        $1 = {n = 3, items = 0x5555555592a4}
        (gdb) p ns->items[0]
        Cannot access memory at address 0xfffe555b733a0164
        (gdb) p *((int *) 0x5555555592a4)
        $2 = 101  <--- we would expect that
        (gdb) p &ns->items[0]
        $3 = (int *) 0xfffe5559ee829a24  <--- wrong address
    
    Since the flexible array member (items) has an unspecified size, the array type
    created for it in the DWARF doesn't have dimensions (this is with gcc 9.3.0,
    Ubuntu 20.04):
    
        0x000000a4:   DW_TAG_array_type
                        DW_AT_type [DW_FORM_ref4]       (0x00000038 "int")
                        DW_AT_sibling [DW_FORM_ref4]    (0x000000b3)
    
        0x000000ad:     DW_TAG_subrange_type
                          DW_AT_type [DW_FORM_ref4]     (0x00000031 "long unsigned int")
    
    This causes GDB to create a range type (TYPE_CODE_RANGE) with a defined
    constant low bound (dynamic_prop with kind PROP_CONST) and an undefined
    high bound (dynamic_prop with kind PROP_UNDEFINED).
    
    value_subscript gets both bounds of that range using
    get_discrete_bounds.  Before commit 7c6f27129631, get_discrete_bounds
    didn't check the kind of the dynamic_props and would just blindly read
    them as if they were PROP_CONST.  It would return 0 for the high bound,
    because we zero-initialize the range_bounds structure.  And it didn't
    really matter in this case, because the returned high bound wasn't used
    in the end.
    
    Commit 7c6f27129631 changed get_discrete_bounds to return a failure if
    either the low or high bound is not a constant, to make sure we don't
    read a dynamic prop that isn't a PROP_CONST as a PROP_CONST.  This
    change made get_discrete_bounds start to return a failure for that
    range, and as a result would not set *lowp and *highp.  And since
    value_subscript doesn't check get_discrete_bounds' return value, it just
    carries on an uses an uninitialized value for the low bound.  If
    value_subscript did check the return value of get_discrete_bounds, we
    would get an error message instead of a bogus value.  But it would still
    be a bug, as we wouldn't be able to print the flexible array member's
    elements.
    
    Looking at value_subscript, we see that the low bound is always needed,
    but the high bound is only needed if !c_style.  So, change
    value_subscript to use get_discrete_low_bound and
    get_discrete_high_bound separately.  This fixes the case described
    above, where the low bound is known but the high bound isn't (and is not
    needed).  This restores the original behavior without accessing a
    dynamic_prop in a wrong way.
    
    A test is added.  In addition to the case described above, a case with
    an array member of size 0 is added, which is a GNU C extension that
    existed before flexible array members were introduced.  That case
    currently fails when compiled with gcc <= 8.  gcc <= 8 produces DWARF
    similar to the one shown above, while gcc 9 adds a DW_AT_count of 0 in
    there, which makes the high bound known.  A case where an array member
    of size 0 is the only member of the struct is also added, as that was
    how PR 28675 was originally reported, and it's an interesting corner
    case that I think could trigger other funny bugs.
    
    Question about the implementation: in value_subscript, I made it such
    that if the low or high bound is unknown, we fall back to zero.  That
    effectively makes it the same as it was before 7c6f27129631.  But should
    we instead error() out?
    
    gdb/ChangeLog:
    
            PR 26875, PR 26901
            * gdbtypes.c (get_discrete_low_bound): Make non-static.
            (get_discrete_high_bound): Make non-static.
            * gdbtypes.h (get_discrete_low_bound): New declaration.
            (get_discrete_high_bound): New declaration.
            * valarith.c (value_subscript): Only fetch high bound if
            necessary.
    
    gdb/testsuite/ChangeLog:
    
            PR 26875, PR 26901
            * gdb.base/flexible-array-member.c: New test.
            * gdb.base/flexible-array-member.exp: New test.
    
    Change-Id: I832056f80e6c56f621f398b4780d55a3a1e299d7
Comment 5 Simon Marchi 2020-12-09 22:11:55 UTC
This is now fixed.
Comment 6 Ahmed Sayeed 2021-06-27 18:00:22 UTC Comment hidden (spam)
Comment 7 Ucel Sani 2021-08-10 12:45:42 UTC Comment hidden (spam)
Comment 8 james rohan 2021-09-02 11:06:52 UTC Comment hidden (spam)
Comment 9 james robin 2021-09-06 09:09:07 UTC Comment hidden (spam)
Comment 10 Mehmet gelisin 2021-09-10 19:39:16 UTC Comment hidden (spam)
Comment 11 diheto 2021-09-22 10:19:09 UTC Comment hidden (spam)
Comment 12 Gulsen Engin 2021-10-09 11:00:39 UTC Comment hidden (spam)
Comment 14 progonsaytu 2021-10-19 07:15:22 UTC Comment hidden (spam)
Comment 15 glassmtech 2021-10-24 10:02:47 UTC Comment hidden (spam)
Comment 16 Adamson 2021-11-25 13:58:25 UTC Comment hidden (spam)