When a structure member's offset is 1/4 GB or greater from the start of the structure, GDB computes the wrong value. When computing the offset of a structure member, GDB computes first the BITS from the start of the structure, returns that as a 32 bit signed quantity and then divides the result by 8. While this can show up on 32 bit systems, it is especially bad on 64 bit systems. We have systems with over 500 GB of memory and have run into the problem. Here's a short program that illustrates the bug. The actual struct where the problem was observed has over 250 members and is over 1/2 GB in size. #include <stdio.h> struct big_struct { char first[0x10000000 + 16]; long second; } big_struct; int main (int argc, char *argv[]) { printf ("sizeof (big_struct) 0x%lx,\n&big_struct %p,\n&big_struct.second %p\noffset %lx\n", sizeof (big_struct), (void *)&big_struct, (void *)&big_struct.second, __builtin_offsetof(struct big_struct, second)); return (0); } If I run it, the output is: sizeof (big_struct) 0x10000018, &big_struct 0x601040, &big_struct.second 0x10601050 offset 10000010 But, in GDB 7.8: (gdb) p &big_struct $1 = (struct big_struct *) 0x601040 <big_struct> (gdb) p &big_struct.second $2 = (long *) 0xfffffffff0601050
I Fedora carries for it the patchset: http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-20of25.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-21of25.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-22of25.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-23of25.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-25of25-test.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-25of25.patch http://pkgs.fedoraproject.org/cgit/gdb.git/tree/gdb-rhbz795424-bitpos-lazyvalue.patch It was concluded that patching it before GDB is switched to C++ would be fragile as there cannot be done easy sanity checking if the inferior types width has not regressed. With C++ one can wrap the offset types into sanity-checked classes with overriden operators. But the GDB switch to C++ takes too many years.
Fedora has been carrying those patches for quite some time now, and I don't remember seeing regressions because of them. Maybe we could try to revive the patches upstream...
I do not know how to verify they are correct. That is there is no accidental 32bitvar=64bitvar; assignment or similar. While that may be possible with some LLVM static analysis I have been waiting for C++ which would make it easier. But then we are (were) all waiting on that approval of C++ switch.
jan.kratochvil at redhat dot com <sourceware-bugzilla@sourceware.org> wrote: > https://sourceware.org/bugzilla/show_bug.cgi?id=17520 > > --- Comment #3 from Jan Kratochvil <jan.kratochvil at redhat dot com> --- > I do not know how to verify they are correct. That is there is no accidental > 32bitvar=64bitvar; assignment or similar. While that may be possible with some > LLVM static analysis I have been waiting for C++ which would make it easier. > But then we are (were) all waiting on that approval of C++ switch. > > -- > You are receiving this mail because: > You reported the bug. I don't care whether it's my patch or Red Hat's or someone else's, so long as the bug is fixed. The bug affects us and we want it fixed. I would rather not have to reapply the patch for each release. If problems are found with the patch, generally they are easy to fix -- find the ooffending variable and change it from int to LONGEST. I configured GDB with --enable-werror. I started by changing everything that I cou8ld find that was going to cause problems and then iterated until everything built and ran with no regressions. There a couple of places where I wasn't certain that it needed to be converted, but I figured better safe than sorry. If people are truly worried about it destabilizing GDB, then the right approach is to put it in early -- to get as much mileage as possible on it before the next release. With regard to GDB moving to C++ -- it might happen, it has certainly been talked about on and off for enough years, but as whether it will ever happen -- who knows. But, I am not going hold my breath.
The master branch has been updated by David Taylor <taylor@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=6b8505468e64c2be8d0eea1f2b8db86fa3897600 commit 6b8505468e64c2be8d0eea1f2b8db86fa3897600 Author: David Taylor <dtaylor@emc.com> Date: Tue Apr 12 15:02:57 2016 -0400 Support structure offsets that are 512K or larger. GDB computes structure byte offsets using a 32 bit integer. And, first it computes the offset in bits and then converts to bytes. The result is that any offset that if 512K bytes or larger overflows. This patch changes GDB to use LONGEST for such calculations. PR gdb/17520 Structure offset wrong when 1/4 GB or greater. * c-lang.h: Change all parameters, variables, and struct or union members used as struct or union fie3ld offsets from int to LONGEST. * c-valprint.c: Likewise. * cp-abi.c: Likewise. * cp-abi.h: Likewise. * cp-valprint.c: Likewise. * d-valprint.c: Likewise. * dwarf2loc.c: Likewise. * eval.c: Likewise. * extension-priv.h: Likewise. * extension.c: Likewise. * extension.h: Likewise. * findvar.c: Likewise. * gdbtypes.h: Likewise. * gnu-v2-abi.c: Likewise. * gnu-v3-abi.c: Likewise. * go-valprint.c: Likewise. * guile/guile-internal.h: Likewise. * guile/scm-pretty-print.c: Likewise. * jv-valprint.c Likewise. * opencl-lang.c: Likewise. * p-lang.h: Likewise. * python/py-prettyprint.c: Likewise. * python/python-internal.h: Likewise. * spu-tdep.c: Likewise. * typeprint.c: Likewise. * valarith.c: Likewise. * valops.c: Likewise. * valprint.c: Likewise. * valprint.h: Likewise. * value.c: Likewise. * value.h: Likewise. * p-valprint.c: Likewise. * c-typeprint.c (c_type_print_base): When printing offset, use plongest, not %d. * gdbtypes.c (recursive_dump_type): Ditto.
Fixed by commit 6b8505468e64c2be8d0eea1f2b8db86fa3897600 on Jun 24th, 2016.