This is the mail archive of the archer@sourceware.org mailing list for the Archer project.

Re: Calculating array length

From: Joost van der Sluis <joost at cnoc dot nl>
To: archer at sourceware dot org
Date: Sun, 07 Jun 2009 18:06:12 +0200
Subject: Re: Calculating array length
References: <1244370173.22994.14.camel@wsjoost> <20090607144745.GA21154@host0.dyn.jankratochvil.net>

On Sunday 07-06-2009 at 16:47 [timezone +0200], Jan Kratochvil
wrote:
> On Sun, 07 Jun 2009 12:22:53 +0200, Joost van der Sluis wrote:

> > First question is the calculation of the length of an array-type in
> > type_length_get(), gdbtypes.c. (I'm using the archer-jankratochvil-sla
> > branch to improve the handling of arrays)
> > 
> > The length is calculated as follows: (count-1)*byte_stride+element_size.
> > 
> > Ok, so why 'count-1'? Count is the actual item count, so why subtract
> > one from it? And why is the element_size added? Shouldn't byte_stride
> > (when defined) replace the element_size?
> > 
> > It doesn't make sense to me. Shouldn't it be 'count*byte_stride'?
> 
> It depends on which length you want to calculate.  From what you describe,
> you want the FULL_SPAN parameter of type_length_get to be set to true.
> Then it really will return 'count*byte_stride'.

My problem is that val_print_array_elements (in valprint.c) derives the
number of elements by dividing TYPE_LENGTH(type) by
TYPE_LENGTH(eltype). (Both lengths are calculated by a call to
check_typedef, which calls type_length_get.) This is clearly wrong when
those lengths are calculated as described above.
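
To make the arithmetic concrete, here is a minimal sketch of the two length formulas from this thread and of the division that goes wrong. The function names are made up for this illustration and are not GDB's; the numbers in the note below assume a 4-byte element with an 8-byte stride.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest contiguous range that still covers every element: the last
   element is counted at its own size, not at a full stride.  */
size_t
array_length_minimal (size_t count, size_t byte_stride, size_t element_size)
{
  assert (count > 0);
  return (count - 1) * byte_stride + element_size;
}

/* Full span: every element, including the last, occupies one stride.  */
size_t
array_length_full_span (size_t count, size_t byte_stride)
{
  return count * byte_stride;
}

/* Element count obtained by dividing an array length by a per-element
   length, as in the division described above.  */
size_t
count_by_division (size_t array_length, size_t element_length)
{
  assert (element_length > 0);
  return array_length / element_length;
}
```

With count = 2, byte_stride = 8 and element_size = 4, the minimal length is 12 and the full span is 16. Dividing the minimal length by the element size gives 3 rather than 2; only the full span divided by the stride gives back the real count.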

It's solved very easily by removing that calculation and letting the bounds
determine the size (that code is already there in an else branch).
But I was wondering why the type_length was wrong.

How can I make it use FULL_SPAN in this case?

> Still the function uses FULL_SPAN as true only internally.  When GDB wants to
> know the type length it uses it for transferring data from the debuggee memory
> into a local GDB copy (for its printing to the user etc.).  For such case we
> want a _minimal_ (but still complete) contiguous memory range.

Yes, I understand that. (Took some days but finally I found this out)

> Fortran example:
> 
> subroutine sub (p)
>   integer :: p (2, 1)
>   print *, p (1, 1)
>   print *, p (2, 1)
> end subroutine sub
> program subarray
>   integer :: a (2, 2)
>   a (1, 1) = 1
>   a (1, 2) = 2
>   a (2, 1) = 3
>   a (2, 2) = 4
>   call sub (a (1:2, 2:2))
> end
> 
> Array `a' in the main program has layout (here the first index is row, second
> one is column):
>   1 2
>   3 4
>   X Y (these are uninitialized / nonexisting / unused memory locations after
>        the end of array)
> Subroutine `sub' will print:
>            2
>            4
> Subroutine `sub' knows only about a table with 2 rows and 1 column.  To make
> it work with the original array `a' memory layout without any copy, the
> pointers to the array are set up as:
>   array start: row 1 column 2 (element content `2')
>   rows, therefore number of elements of p: 2
>   columns, therefore number of elements of p row: 1
>   element size of p (one row byte length): sizeof (integer) * 1
>   element size of p row (one element byte length): sizeof (integer)
>   byte stride of p (offset to the next row): sizeof (integer) * 2
>   byte stride of p row: sizeof (integer)

This is not true. As I understand the Dwarf-3 specs, the stride
defines the size that is used to store an entry in the array when that
size is not the same as its element's length.

I.e. it is not the offset to the next row, it is the size of each row. So
the last entry should have this size as well.

> Now if you in `sub' do `print p' GDB has to transfer the `p' memory from
> inferior.  Currently it will transfer contiguous block with content {2,3,4}.
> 
> If we always used FULL_SPAN as true then GDB would transfer in this case
> a contiguous memory block with content {2,3,4,X}.  But X is after the end of
> the array, and for very large arrays (thousands of elements, or elements of
> size in kilobytes) the memory for X may no longer be mapped, so GDB would
> fail to retrieve the memory of the variable to be printed.  (It would also
> be less efficient.)
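
The strided access in the Fortran example can be sketched in C as follows. This is only an illustration under the layout described in the quoted text (start at the element holding `2', stride of two integers, two elements); the helper names are hypothetical and nothing here is taken from GDB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read element INDEX of a strided array of int that
   starts at BASE, with consecutive elements BYTE_STRIDE bytes apart.  */
int
strided_int_at (const unsigned char *base, size_t byte_stride, size_t index)
{
  int value;

  assert (base != NULL);
  memcpy (&value, base + index * byte_stride, sizeof value);
  return value;
}

/* Bytes that must be fetched to cover COUNT elements and nothing after
   the last one.  */
size_t
strided_minimal_bytes (size_t count, size_t byte_stride, size_t element_size)
{
  return count == 0 ? 0 : (count - 1) * byte_stride + element_size;
}
```

For a buffer holding {1, 2, 3, 4}, starting at the second integer with a stride of two integers yields 2 and 4, and the minimal range is three integers wide, i.e. {2,3,4} without X.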

I don't know anything about Fortran, but as far as I can see it has to
define a new 1-dimensional array with 2 items, which is passed to sub.
Then it has to generate new debug information that describes that
array.

> GDB has to transfer only the memory it knows that belongs to a variable.

Yes, but now it does not.

Consider a (static, so fixed-size) Pascal/fpc array whose elements are
ansistrings. Ansistrings in Pascal are stored as pointers to an array of
chars. In Dwarf-3 these ansistrings are defined using the dw_at_location
attribute, which points to the real data in the array.

So, the length of the Ansistring-type has to return the length of the
actual stored string. This is _not_ the length of the pointer which
points to the actual data. But when you calculate the array-size, you
have to use the size of a pointer. That's why you have to store that
size in the stride...

Example: 

array[0..2] of string;
0x00: pointer1 -> 0x534643: 'string 1'      (size=8)
0x08: pointer2 -> 0x734644: 'str 2'         (size=5)
0x10: pointer3 -> 0x334554: 'long string 3' (size=13)

This reveals a few problems of the current implementation. First of all,
the size of the elements... It's not constant. That's perfectly normal
using Dwarf-3 and dwarf-blocks, but it doesn't work when you calculate
the length of the array as above, because you do not know which element
to use when you calculate the array-size. (In your example it should be
the last element, gdb now uses the first)

But this is a fundamental problem. At the moment it is assumed everywhere
that TYPE_LENGTH(type) is constant, but it is not. So before asking for the
actual length of an instance of a type, you have to use
set_object_addres, clear the type and call check_typedef(type) again.
(Quite cumbersome, but I almost have it working.)
This is also related to another question I had: in copy_type_recursive_1
the length of the elements is calculated by a call to
dwarf_locexpr_baton_eval, which is ok, but not while the object-address
is still set to the object address of the array. It should first be set
to the address of the particular element.

The second problem is: which data should be copied from the inferior when
you read this array? Well, the answer is simple: only the pointers. So the
compiler adds the stride option to the debug information, and gdb simply
has to copy count*stride bytes from the inferior.

Thereafter val_print_array_elements has to evaluate each element by
setting the object-address to the right pointer, then evaluate the length
of the string and copy that string from the inferior...
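
The two steps can be sketched like this: first size the array by its pointer slots only, then follow each slot to measure its string. This is a rough model, not GDB code: the function names are invented, and strlen merely stands in for whatever length the debug information would actually describe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Step 1: bytes occupied by the array itself, i.e. only the pointer
   slots, each one stride wide.  */
size_t
pointer_array_bytes (size_t count, size_t byte_stride)
{
  return count * byte_stride;
}

/* Step 2: follow slot INDEX and measure the string it points to.
   Assumption: strlen is a stand-in for the real length evaluation.  */
size_t
string_length_at (const char *const *slots, size_t index)
{
  assert (slots != NULL && slots[index] != NULL);
  return strlen (slots[index]);
}
```

For the three strings in the example above, the array itself is three pointers wide regardless of the string contents, while the per-element lengths come out as 8, 5 and 13.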

I have all of this working; the only problems I still have are with the
calls to all sorts of properties of the type that are made while the
object-address is pointing to something different, so that the sizes
don't match.

Do you understand my problem? It's hard to explain.

Joost

Follow-Ups:
- Re: Calculating array length
  - From: Jan Kratochvil

References:
- Calculating array length
  - From: Joost van der Sluis
- Re: Calculating array length
  - From: Jan Kratochvil
