This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH] Read corrrect auxiliary entry in AIX.

From: "Sangamesh Mallayya" <sangamesh dot swamy at in dot ibm dot com>
To: "Ulrich Weigand" <uweigand at de dot ibm dot com>
Cc: gdb-patches at sourceware dot org, Ulrich dot Weigand at de dot ibm dot com (Ulrich Weigand)
Date: Tue, 22 Nov 2016 15:39:32 +0530
Subject: Re: [PATCH] Read corrrect auxiliary entry in AIX.
Authentication-results: sourceware.org; auth=none
References: <OF50C7B827.6DE568D2-ON65258059.0048D2F2-65258059.004B36E0@notes.na.collabserv.com> from "Sangamesh Mallayya" at Oct 27, 2016 01:41:32 PM <20161028121034.67B9F10B91A@oc8523832656.ibm.com>
Hi Ulrich,

Thanks for review and sorry for the long delay in responding to your 
comments.

> 
> > Attaching the modified patch which i sent earlier for reading correct
> > auxiliary entry in AIX.
> > Description is almost same, with the output of test case results 
added.
> > 
> > 
> > When we use -qfuncsect xlc or -ffunction-sections gcc option,
> > it places instructions for each function in a separate control section 
or
> > CSECT.
> > 
> > One more extra symbol entry with one or more auxiliary entries will be
> > created as shown below.
> > So in below output, symbol 104 is having two auxiliary entries to 
indicate
> > a separate csect.
> > 
> > We were missing to read SD/PR and information about a each separate
> > functions csect.
> > If no csect entry is present, we would just be going through and 
looking
> > for LD/PR and reading function start address
> > by doing jump to function_entry_point: label.
> > 
> > 
> > [104]   m   0x10000500     .text     2  unamex                    .baz
> > [105]   a1           0         0    92       0       0
> > [106]   a4  0x0000005c       0    0     SD       PR    0    0
> > [107]   m   0x10000500     .text     2  extern                    .baz
> > [108]   a2           0              92    3808     119
> > [109]   a4  0x00000068       0    0     LD       PR    0    0
> > [110]   m   0x00000000     debug     0     fun baz:F-1
> > [111]   m   0x10000518     .text     1     fcn                    .bf
> > [112]   a1           0         4     0       0       0
> > [113]   m   0x00000068     debug     0    psym a:p-1
> > [114]   m   0x000000a4     debug     0   bstat                    .bs
> > [115]   m   0x00000008     debug     0   stsym __func__:V13
> > [116]   m   0x00000000     debug     0   estat                    .es
> > [117]   m   0x10000540     .text     1     fcn                    .ef
> > 
> > And in case of gcc compiled binaries we were having symbol table 
entries
> > as.
> > 
> > [149]   m   0x1000045c     .text     1  unamex                    .baz
> > [150]   a4  0x0000005c       0    0     SD       PR    0    0
> > [151]   m   0x1000045c     .text     2  extern                    .baz
> > [152]   a2           0              92   29522     160
> > [153]   a4  0x00000095       0    0     LD       PR    0    0
> > [154]   m   0x00000000     debug     0     fun baz:F-1
> > [155]   m   0x100003d8     .text     1     fcn                    .bf
> > 
> > So the below changes to read correct auxiliary entry works for both 
gcc &
> > xlc compiled binaries even if we have 1 entry.
> > 
> > As the offical xcoff documentation say about csect auxilirary entry.
> > 
> > And we always need to read last auxiliary entry as symbols having a
> > storage class of C_EXT, C_WEAKEXT or C_HIDEXT
> > and more than one auxiliary entry must have the csect auxiliary entry 
as
> > the last auxiliary entry.
> > 
> > So we need to read the last auxiliary entry which is having the SD/PR 
in
> > case of qfuncsect or LD/PR is case of no csects.
> 
> 
> The current code for handling this is already confusing to me.  Code
> at the function_entry_point label copies the current main_aux to the
> fcn_aux_saved variable, which is later accessed to retrieve the function
> size (from fcn_aux_saved.x_sym.x_misc.x_fsize).
> 
> Now in current code, there are two paths to this label.  The first is
> if we have a C_EXT / C_HIDEXT symbol with a single aux entry.  According
> to the XCOFF spec, this must then be a csext aux entry, which contains
> the SCLAS / SMTYP fields.  If those are LD/PR, then we end up at the
> function_entry_point label, and *this* aux entry is copied into the
> fcn_aux_saved variable.  This seems problematic, since this is at this
> point a *csect* aux entry, but fcn_aux_saved is later assumed to be
> a function aux entry that holds a fsize field -- but for csect aux
> entries this is not true, right?

For C_EXT/C_HIDEEXT/C_WEAKEXT, csect auxiliary entries will be created 
when we specifically use compiler flags (-qfuncsect for xlc and 
ffunction-sections for gcc).
And their can be more than one CSECT auxiliary entries. Otherwise, 
normally we would be just having function auxiliary entries.
Current code assumes that their will be only one auxiliary entry for 
csect.
By convection, csect auxiliary entry in XCOFF32 must be the last auxiliary 
entry for any
external symbol that has more than one auxiliary entry. Looks like same 
holds good for XCOFF64 too.
So, for csect we end up in SD/PR part of code.
And next we go and read function auxiliary entries and end up at 
function_entry_point code when we read LD/PR symbol.
 
For example.

gcc we can see this.

[149]   m   0x100004cc     .text     1  unamex                    .baz
[150]   a4  0x00000060       0    0     SD       PR    0    0 <= This 
would be in SD/PR part
[151]   m   0x100004cc     .text     2  extern                    .baz
[152]   a2           0              96   30612     160 
[153]   a4  0x00000095       0    0     LD       PR    0    0 <= This 
woulb be in LD/PR part


For xlc 

[94]    m   0x10000580     .text     2  unamex                    .baz
[95]    a1           0         0    96       0       0 
[96]    a4  0x00000060       0    0     SD       PR    0    0
[97]    m   0x10000580     .text     2  extern                    .baz
[98]    a2           0              96    3900     110 
[99]    a4  0x0000005e       0    0     LD       PR    0    0

> 
> On the other hand, if I understand the XCOFF standard correctly, actual
> functions will always have at least *two* aux entries, and therefore in
> current code we could never actually run into that bad case?
> 

Yes. For actual functions their will be a two aux entries.
One to have the function size and other to have the smtype/smclass 
information.
So in this case we end up in the ISFCN check in original code as you 
mentioned below
without reading any information about the csect in case of xlc which has 
two aux entries, 
and proceeding to read the function entries, which could be a problem 
later. 
 
So in xlc case we are seeing the messages like this 
 
warning: (Internal error: pc 0x10000380 in read in psymtab, but not in 
symtab.)

As we aren't able to allocate symtab(compunit_symtab) for each csect and 
not properly setting start and end address for a csect.

> Still in the current code base, C_EXT / C_HIDEXT symbols with more than
> one aux entry (which supposedly includes all function symbols) will
> therefore always fall through to the next if statement:
> 
>      /* If explicitly specified as a function, treat is as one.  This 
check
>          evaluates to true for @FIX* bigtoc CSECT symbols, so it must 
occur
>          after the above CSECT check.  */
>       if (ISFCN (cs->c_type) && cs->c_sclass != C_TPDEF)
>         {
>           bfd_coff_swap_aux_in (abfd, raw_auxptr, cs->c_type, 
cs->c_sclass,
>                                 0, cs->c_naux, &main_aux);
>           goto function_entry_point;
>         }
> 
> Now this ISFCN check is somewhat unclear to me.  It checks the c_type
> field in the symbol entry itself.  At least according to the XCOFF
> standard here:
> http://www.ibm.com/support/knowledgecenter/ssw_aix_71/
> com.ibm.aix.files/XCOFF.htm
> this field is only ever supposed to be nonzero on C_EXT / C_HIDEXT /
> C_WEAKEXT symbols, so I don't quite understand the C_TPDEF check.
> Also, it is not quite clear whether ISFCN is set on the SD/PR symbol,
> on the LD/PR symbol, or on both.

type value of main symbol table is set only in case of 
C_EXT/C_HIDEEXT/C_WEAKEXT.
bit 10 of type is set if symbol is a function.

So for example.
 
If we see the symbol table and it's entries below, symbol 94 is SD/PR and 
it's type value isn't being set.
But for symbol 97 we can clearly see that it's main symbol table entry 
with the c_type set to 32(0x20), ie bit number 10 is set.

 [94]    m   0x10000580     .text     2  unamex                    .baz
 [95]    a1           0         0    96       0       0
 [96]    a4  0x00000060       0    0     SD       PR    0    0
 [97]    m   0x10000580     .text     2  extern                    .baz
 [98]    a2           0              96    3900     110
 [99]    a4  0x0000005e       0    0     LD       PR    0    0


$15 = {c_name = 0x11028dafc ".baz", c_symnum = 94, c_naux = 2, c_value = 
268436864, 
  c_sclass = 107 'k', c_secnum = 1, c_type = 0}

$16 = {c_name = 0x11028db32 ".baz", c_symnum = 97, c_naux = 2, c_value = 
268436864, 
  c_sclass = 2 '�', c_secnum = 1, c_type = 32}
 
So ISFCN checks to see if bit 10 is set.
 
I too still not able to confirm what is the real purpose of C_TPDEF here.
In one of the gdb file i see the comment like this "Typedefs should not be 
treated as symbol definitions."
But compiler doesn't support typedefs for function definition and can't 
confirm if this C_TPDEF evaluates to true for functions at all.

> 
> In any case, current code then simply assumes that the symbol must have
> a first aux entry that contains the function size.  (If ISFCN were true
> on SD/PR, this assumption would already be wrong, I guess.)
> 

Symbol table and it's auxiliary entry would be something like this.

                           ***Symbol Table Information***
[Index] m        Value       Scn   Aux  Sclass    Type            Name
[Index] a0                                                        Fname
[Index] a1      Tagndx      Lnno  Size  Lnoptr    Endndx
[Index] a2      Tagndx            Fsiz  Lnoptr    Endndx
[Index] a3      Tagndx      Lnno  Size  Dimensions
[Index] a4   CSlen     PARMhsh SNhash SMtype SMclass Stab SNstab
[Index] a5      SECTlen    #RELent    #LINnums

So for functions atleast two entries are required, one for function size 
and one more for class/type.
So auxiliary entry a1 would come first which has a size and then the a4 
with class/type values. 

Also, we would never come to ISFCN check for SD/PR, as c_type is set only 
on main symbol table entry for functions
not on csects.

> Even more confusing, the code that copies an aux entry from the external
> storage format into the internal representation (_bfd_xcoff_swap_aux_in)
> only even initializes the fsize field if the ISFCN check is true:
> 
>   if (ISFCN (type))
>     {
>       in->x_sym.x_misc.x_fsize = H_GET_32 (abfd, 
ext->x_sym.x_misc.x_fsize);
>     }
> 

This should be fine i think, as type is set only for functions (LD/PR) 
which will have the size information.

> so it is unclear what would happen if the LD/PR check and the ISFCN
> check would ever yield different results ...   Do you understand
> exactly when the c_type is set to indicate a function?
> 
> 

As mentioned above c_type would be set in case of functions.
So this ISFCN check should be true only for LD/PR. 

> Given this, I do have a couple of questions about your proposed patch:
> 
>      /* XCOFF can have more than 1 auxent. */
>      if ((cs->c_naux > 1) &&
>          (ISFCN (cs->c_type) && cs->c_sclass != C_TPDEF))
>        {
>          /* We need to read the size if we have LD/PR, and with more 
than
>             one auxiliary entry.  */
> 
> - The c_sclass check here is redundant since we're already in a
>   if (cs->c_sclass == C_EXT || cs->c_sclass == C_HIDEXT)
>   block.

Yes, thanks. cs->c_sclass check can be removed.

> 
> - B.t.w. do we need to handle C_WEAKEXT?

 I checked the dbx behaviour in case of C_WEAKEXT.
 It seems to be handling C_WEAKEXT symbols too.
 So i think we might need to think of handling C_WEAKEXT too.
 But one question comes to mind is, why this case wasn't consider at the 
first place itself.

> 
> - Given the check above, can it *ever* happen that this check is false
>   and we still reach the LD/PR case?  If so, we'd still have the problem
>   that the record is the wrong one and doesn't actually contain a size,
>   right?  But if not, shouldn't we just have the function_entry_point
>   handling in one place and avoid the goto?
> 
> - If there is just one place, should it be where you added the check
>   above, or rather in the LD/PR switch case?
> 
> - Finally, given the check added above, can it still ever happen that
>   the original ISFCN check after the if C_EXT block is true?  If so,
>   are we sure that the symbol will always have a function size in the
>   first aux entry?  (I understand that if a symbol has just one aux
>   entry, it must be a csect one which doesn't have the function size.)
>   If not, this block should be removed as well.
> 

Thanks for catching this.
 
With current modification looks like we mayn't come to the original ISFCN 
check, as we aren't restricted on only one auxiliary entry
and ISFCN check result to true for the functions with two auxiliary entry.
 
Yes. As mentioned below the order of auxiliary entry is important and for 
functions first auxiliary entry is having the size information.

Let me know your view.
Follow-Ups:
- Re: [PATCH] Read corrrect auxiliary entry in AIX.
  - From: Ulrich Weigand
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]