Bug 21324 - gdb hangs when 'thread apply all bt full' is used
Summary: gdb hangs when 'thread apply all bt full' is used
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: threads
Version: unknown
Importance: P2 critical
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-28 20:35 UTC by brian@ubuntu.com
Modified: 2021-09-15 07:03 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
output from gdb when attached to gdb hang (6.12 KB, text/plain)
2017-06-01 21:31 UTC, brian@ubuntu.com

Description brian@ubuntu.com 2017-03-28 20:35:19 UTC
I run a service which does automated retracing of crashes from Ubuntu systems and discovered that gdb was hanging (using 100% CPU and not returning anything) when retracing crashes from vim on Ubuntu 16.10.

I can recreate this by running 'thread apply all bt full' (or 'thread apply 3 bt full' in this case) which then prints only the following:

  Thread 3 (Thread 0x7ff4636a8700 (LWP 7203)):

I tried using strace on the gdb process but only received:

 $ sudo strace -p 16418                                                         
 strace: Process 16418 attached
 strace: [ Process PID=16418 runs in x32 mode. ]

I've seen this behavior with these versions of gdb:

GNU gdb (Ubuntu 7.12.50.20170314-0ubuntu1) 7.12.50.20170314-git
GNU gdb (Ubuntu 7.11.90.20161005-0ubuntu1) 7.11.90.20161005-git
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1

I also tried a 7.9 version of gdb and that produced a backtrace.

GNU gdb (Ubuntu 7.9-1ubuntu1) 7.9

The backtrace:

Thread 2 (LWP 3015):
#0  0x00007fe23ecc9ea3 in select () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x00007fe23f27ae48 in time_sleep () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#2  0x00007fe23f353427 in PyEval_EvalFrameEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#3  0x00007fe23f413a74 in _PyEval_EvalCodeWithName.lto_priv.1712 () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#4  0x00007fe23f413b53 in PyEval_EvalCodeEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#5  0x00007fe23f295e25 in function_call.lto_priv () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#6  0x00007fe23f383457 in PyObject_Call () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#7  0x00007fe23f34c3c7 in PyEval_EvalFrameEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#8  0x00007fe23f35364b in PyEval_EvalFrameEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#9  0x00007fe23f35364b in PyEval_EvalFrameEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#10 0x00007fe23f413a74 in _PyEval_EvalCodeWithName.lto_priv.1712 () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#11 0x00007fe23f413b53 in PyEval_EvalCodeEx () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#12 0x00007fe23f295d28 in function_call.lto_priv () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#13 0x00007fe23f383457 in PyObject_Call () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#14 0x00007fe23f3cfb0c in method_call.lto_priv () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#15 0x00007fe23f383457 in PyObject_Call () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#16 0x00007fe23f412577 in PyEval_CallObjectWithKeywords () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#17 0x00007fe23f32c0e2 in t_bootstrap () from /tmp/apport_sandbox_kH9R1m/usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#18 0x00007fe23ef9a6ca in start_thread (arg=0x7fe23a74f700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fe23a74f700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140609620080384, -1451514129212077889, 0, 140725579832159, 140609620081088, 140609620080384, 1449978205402990783, 1449968195707373759}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0,
              0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#19 0x00007fe23ecd40af in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
No locals.

While I have many different core files that produce this behavior, I don't want to make them publicly available since it isn't my data. I am happy to do any required debugging.

For completeness here is the full gdb command:

/usr/bin/gdb --ex 'set debug-file-directory /mnt/sec-machines/apport-sandbox-dir/Ubuntu 16.10/amd64/report-sandbox/usr/lib/debug' --ex 'set solib-absolute-prefix /mnt/sec-machines/apport-sandbox-dir/Ubuntu 16.10/amd64/report-sandbox' --ex 'add-auto-load-safe-path /mnt/sec-machines/apport-sandbox-dir/Ubuntu 16.10/amd64/report-sandbox' --ex 'set solib-search-path /mnt/sec-machines/apport-sandbox-dir/Ubuntu 16.10/amd64/report-sandbox/lib/x86_64-linux-gnu' --ex 'file "/mnt/sec-machines/apport-sandbox-dir/Ubuntu 16.10/amd64/report-sandbox//usr/bin/vim.gtk"' --ex 'core-file /tmp/apport_core_vx39d9bm'

I used "Critical" for the severity since "Serious" wasn't available.
Comment 1 brian@ubuntu.com 2017-03-29 02:07:02 UTC
I forgot to mention that the first version of gdb with which I received this hang was GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10 from Ubuntu 15.10.
Comment 2 brian@ubuntu.com 2017-05-03 16:55:35 UTC
Ubuntu was recently updated with a new version of gdb:

GNU gdb (Ubuntu 7.99.90.20170502-0ubuntu1) 7.99.90.20170502-git

I repeated the same test with the same crash and I'm still observing the same behavior of gdb hanging and using 100% CPU.
Comment 3 brian@ubuntu.com 2017-05-05 17:17:01 UTC
As I mentioned previously, the service I manage retraces crashes from any application on Ubuntu systems. I've run across another crash, from kodi on Ubuntu 16.10, that also manages to hang gdb.  The behavior is different, though, as some information from the thread is displayed:

(gdb) thread apply 43 bt full

Thread 43 (Thread 0x7fc54c7e8700 (LWP 9950)):
#0  0x00007fc61a3e40bd in poll () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x00007fc621393247 in poll () at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
No locals.
#2  internal_select_ex.isra.0 (writing=1, interval=30) at ../Modules/socketmodule.c:730
        pollfd = {fd = 55, events = 4, revents = 0}
#3  internal_select () at ../Modules/socketmodule.c:760
No locals.
#4  internal_connect (timeoutp=<synthetic pointer>, addrlen=<optimized out>, addr=0x7fc54c7e50d0, s=0x7fc59d494ea0) at ../Modules/socketmodule.c:2113
        res = -1
        timeout = 0
#5  sock_connect (s=<optimized out>, addro=<optimized out>) at ../Modules/socketmodule.c:2156
        _save = 0x7fc5843d16f0
        addrbuf = {in = {sin_family = 2, sin_port = 20480, sin_addr = {s_addr = 2566652611}, sin_zero = "\000\000\000\000\000\000\000"}, un = {sun_family = 2,
            sun_path = "\000P\303\002\374\230\000\000\000\000\000\000\000\000\004", '\000' <repeats 15 times>, "@\211\017\235\305\177\070\060P\347\034\244\305\177\000\000H\022\034\244\305\177\000\000K9G!\306\177\000\000`Q~L\305\177\000\000 Q~L\305\177\000\000\360\026=\204\305\177\000\000\030U\353\365\305\177\000\000\320hg\244\305\177\000\000\270\061\024\244\305\177"}, nl = {nl_family = 2, nl_pad = 20480, nl_pid = 2566652611, nl_groups = 0}, in6 = {sin6_family = 2, sin6_port = 20480,
            sin6_flowinfo = 2566652611, sin6_addr = {__in6_u = {__u6_addr8 = "\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000", __u6_addr16 = {0, 0,
                  0, 0, 4, 0, 0, 0}, __u6_addr32 = {0, 0, 4, 0}}}, sin6_scope_id = 0}, storage = {ss_family = 2,
            __ss_padding = "\000P\303\002\374\230\000\000\000\000\000\000\000\000\004", '\000' <repeats 15 times>, "@\211\017\235\305\177\070\060P\347\034\244\305\177\000\000H\022\034\244\305\177\000\000K9G!\306\177\000\000`Q~L\305\177\000\000 Q~L\305\177\000\000\360\026=\204\305\177\000\000\030U\353\365\305\177\000\000\320hg\244\305\177\000\000\270\061\024\244\305\177\000\000\000\000\000\000\000\000\000", __ss_align = 140486724472931}, bt_l2 = {l2_family = 2, l2_psm = 20480, l2_bdaddr = {
              b = "\303\002\374\230\000"}, l2_cid = 0, l2_bdaddr_type = 0 '\000'}, bt_rc = {rc_family = 2, rc_bdaddr = {
              b = "\000P\303", <incomplete sequence \374\230>}, rc_channel = 0 '\000'}, bt_sco = {sco_family = 2, sco_bdaddr = {
              b = "\000P\303", <incomplete sequence \374\230>}}, bt_hci = {hci_family = 2, hci_dev = 20480, hci_channel = 707}, ll = {sll_family = 2,
            sll_protocol = 20480, sll_ifindex = -1728314685, sll_hatype = 0, sll_pkttype = 0 '\000', sll_halen = 0 '\000',
            sll_addr = "\000\000\000\000\004\000\000"}}
        addrlen = 16
        timeout = <optimized out>
#6  0x00007fc6213c77cf in ext_do_call (nk=<optimized out>, na=0, flags=<optimized out>, pp_stack=0x7fc54c7e5248, func=0x7fc5c4087170) at ../Python/ceval.c:4661
        tstate = <optimized out>
        kwdict = <optimized out>
        nstar = <optimized out>
        callargs = <optimized out>
        stararg = 0x7fc5a415b650
        result = 0x0
#7  PyEval_EvalFrameEx (f=0x7fc59d4acd00, throwflag=<optimized out>) at ../Python/ceval.c:3026
        flags = <optimized out>
        func = 0x7fc5c4087170
        na = 0
        nk = <optimized out>
        n = <optimized out>
        pfunc = 0x7fc59d4ace90
        sp = 0x7fc59d4ace98
        opcode_targets = {0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c4b37 <PyEval_EvalFrameEx+23399>, 0x7fc6213c4b3c <PyEval_EvalFrameEx+23404>,
          0x7fc6213c4b41 <PyEval_EvalFrameEx+23409>, 0x7fc6213c4b4b <PyEval_EvalFrameEx+23419>, 0x7fc6213c4b46 <PyEval_EvalFrameEx+23414>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213c4adc <PyEval_EvalFrameEx+23308>, 0x7fc6213c4b6c <PyEval_EvalFrameEx+23452>, 0x7fc6213c4b71 <PyEval_EvalFrameEx+23457>,
          0x7fc6213c4b76 <PyEval_EvalFrameEx+23462>, 0x7fc6213c4b7b <PyEval_EvalFrameEx+23467>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213c4b80 <PyEval_EvalFrameEx+23472>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c4b85 <PyEval_EvalFrameEx+23477>, 0x7fc6213c4b8a <PyEval_EvalFrameEx+23482>,
          0x7fc6213c4b8f <PyEval_EvalFrameEx+23487>, 0x7fc6213c4b9e <PyEval_EvalFrameEx+23502>, 0x7fc6213c4ba3 <PyEval_EvalFrameEx+23507>,
          0x7fc6213c4ba8 <PyEval_EvalFrameEx+23512>, 0x7fc6213c4bad <PyEval_EvalFrameEx+23517>, 0x7fc6213c4b99 <PyEval_EvalFrameEx+23497>,
          0x7fc6213c4b94 <PyEval_EvalFrameEx+23492>, 0x7fc6213c5c4b <PyEval_EvalFrameEx+27771>, 0x7fc6213c5c46 <PyEval_EvalFrameEx+27766>,
          0x7fc6213c5c78 <PyEval_EvalFrameEx+27816>, 0x7fc6213c5c88 <PyEval_EvalFrameEx+27832>, 0x7fc6213c5c95 <PyEval_EvalFrameEx+27845>,
          0x7fc6213c5ca2 <PyEval_EvalFrameEx+27858>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c5cb2 <PyEval_EvalFrameEx+27874>, 0x7fc6213c5cc5 <PyEval_EvalFrameEx+27893>,
          0x7fc6213c48f3 <PyEval_EvalFrameEx+22819>, 0x7fc6213c5cd5 <PyEval_EvalFrameEx+27909>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
         0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c456f <PyEval_EvalFrameEx+21919>,
          0x7fc6213c5ce5 <PyEval_EvalFrameEx+27925>, 0x7fc6213c45bd <PyEval_EvalFrameEx+21997>, 0x7fc6213c5cf5 <PyEval_EvalFrameEx+27941>,
          0x7fc6213c592a <PyEval_EvalFrameEx+26970>, 0x7fc6213c5c55 <PyEval_EvalFrameEx+27781>, 0x7fc6213c5c5a <PyEval_EvalFrameEx+27786>,
          0x7fc6213c5c3c <PyEval_EvalFrameEx+27756>, 0x7fc6213c5c41 <PyEval_EvalFrameEx+27761>, 0x7fc6213c5c50 <PyEval_EvalFrameEx+27776>,
          0x7fc6213c5d05 <PyEval_EvalFrameEx+27957>, 0x7fc6213c5d0a <PyEval_EvalFrameEx+27962>, 0x7fc6213c4bb2 <PyEval_EvalFrameEx+23522>,
          0x7fc6213c4bb7 <PyEval_EvalFrameEx+23527>, 0x7fc6213c4bbc <PyEval_EvalFrameEx+23532>, 0x7fc6213c4bc1 <PyEval_EvalFrameEx+23537>,
          0x7fc6213c5bfa <PyEval_EvalFrameEx+27690>, 0x7fc6213c5c37 <PyEval_EvalFrameEx+27751>, 0x7fc6213c5a47 <PyEval_EvalFrameEx+27255>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c5d0f <PyEval_EvalFrameEx+27967>, 0x7fc6213bfa98 <PyEval_EvalFrameEx+2760>,
          0x7fc6213c0141 <PyEval_EvalFrameEx+4465>, 0x7fc6213bfa86 <PyEval_EvalFrameEx+2742>, 0x7fc6213c012f <PyEval_EvalFrameEx+4447>,
          0x7fc6213c5c5f <PyEval_EvalFrameEx+27791>, 0x7fc6213c5c64 <PyEval_EvalFrameEx+27796>, 0x7fc6213c5c69 <PyEval_EvalFrameEx+27801>,
          0x7fc6213c5c6e <PyEval_EvalFrameEx+27806>, 0x7fc6213c5c73 <PyEval_EvalFrameEx+27811>, 0x7fc6213c2185 <PyEval_EvalFrameEx+12725>,
          0x7fc6213c5afc <PyEval_EvalFrameEx+27436>, 0x7fc6213c7590 <PyEval_EvalFrameEx+34240>, 0x7fc6213c3082 <PyEval_EvalFrameEx+16562>,
          0x7fc6213c59a3 <PyEval_EvalFrameEx+27091>, 0x7fc6213c7595 <PyEval_EvalFrameEx+34245>, 0x7fc6213c32f3 <PyEval_EvalFrameEx+17187>,
          0x7fc6213c759a <PyEval_EvalFrameEx+34250>, 0x7fc6213c613a <PyEval_EvalFrameEx+29034>, 0x7fc6213c6142 <PyEval_EvalFrameEx+29042>,
          0x7fc6213c614c <PyEval_EvalFrameEx+29052>, 0x7fc6213c6169 <PyEval_EvalFrameEx+29081>, 0x7fc6213c6183 <PyEval_EvalFrameEx+29107>,
          0x7fc6213c5a4c <PyEval_EvalFrameEx+27260>, 0x7fc6213c5bff <PyEval_EvalFrameEx+27695>, 0x7fc6213c619f <PyEval_EvalFrameEx+29135>,
          0x7fc6213c61b8 <PyEval_EvalFrameEx+29160>, 0x7fc6213c61d4 <PyEval_EvalFrameEx+29188>, 0x7fc6213c61f0 <PyEval_EvalFrameEx+29216>,
          0x7fc6213c4b50 <PyEval_EvalFrameEx+23424>, 0x7fc6213c4afe <PyEval_EvalFrameEx+23342>, 0x7fc6213c6209 <PyEval_EvalFrameEx+29241>,
          0x7fc6213c62af <PyEval_EvalFrameEx+29407>, 0x7fc6213c58d6 <PyEval_EvalFrameEx+26886>, 0x7fc6213c58f3 <PyEval_EvalFrameEx+26915>,
          0x7fc6213c590d <PyEval_EvalFrameEx+26941>, 0x7fc6213c5948 <PyEval_EvalFrameEx+27000>, 0x7fc6213c5962 <PyEval_EvalFrameEx+27026>,
          0x7fc6213c5984 <PyEval_EvalFrameEx+27060>, 0x7fc6213c59a8 <PyEval_EvalFrameEx+27096>, 0x7fc6213c59c5 <PyEval_EvalFrameEx+27125>,
          0x7fc6213c5a15 <PyEval_EvalFrameEx+27205>, 0x7fc6213c5a2e <PyEval_EvalFrameEx+27230>, 0x7fc6213c1f65 <PyEval_EvalFrameEx+12181>,
          0x7fc6213c59dd <PyEval_EvalFrameEx+27149>, 0x7fc6213c59f9 <PyEval_EvalFrameEx+27177>, 0x7fc6213c6227 <PyEval_EvalFrameEx+29271>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c5a69 <PyEval_EvalFrameEx+27289>,
          0x7fc6213c5a85 <PyEval_EvalFrameEx+27317>, 0x7fc6213c5aa4 <PyEval_EvalFrameEx+27348>, 0x7fc6213c5ac3 <PyEval_EvalFrameEx+27379>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c4ae1 <PyEval_EvalFrameEx+23313>, 0x7fc6213c4b1b <PyEval_EvalFrameEx+23371>,
          0x7fc6213c6241 <PyEval_EvalFrameEx+29297>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c7572 <PyEval_EvalFrameEx+34210>, 0x7fc6213c2b0f <PyEval_EvalFrameEx+15167>,
          0x7fc6213c5ba6 <PyEval_EvalFrameEx+27606>, 0x7fc6213c5be0 <PyEval_EvalFrameEx+27664>, 0x7fc6213c5bc3 <PyEval_EvalFrameEx+27635>,
          0x7fc6213c625b <PyEval_EvalFrameEx+29323>, 0x7fc6213c6278 <PyEval_EvalFrameEx+29352>, 0x7fc6213c6292 <PyEval_EvalFrameEx+29378>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c5b04 <PyEval_EvalFrameEx+27444>,
          0x7fc6213c5b3c <PyEval_EvalFrameEx+27500>, 0x7fc6213c5b82 <PyEval_EvalFrameEx+27570>, 0x7fc6213c5ae2 <PyEval_EvalFrameEx+27410>,
          0x7fc6213bfc0d <PyEval_EvalFrameEx+3133>, 0x7fc6213c2d62 <PyEval_EvalFrameEx+15762>, 0x7fc6213c5c1b <PyEval_EvalFrameEx+27723>,
          0x7fc6213c592f <PyEval_EvalFrameEx+26975>, 0x7fc6213bfc0d <PyEval_EvalFrameEx+3133> <repeats 108 times>}
        stack_pointer = <optimized out>
        next_instr = <optimized out>
        opcode = <optimized out>
        oparg = <optimized out>
        why = WHY_NOT
        err = 0
        x = 0x7fc5a46db850
        v = <optimized out>
        w = <optimized out>
        u = <optimized out>
        t = <optimized out>
        stream = 0x0
        fastlocals = 0x7fc59d4ace78
        freevars = <optimized out>
        retval = <optimized out>
        tstate = <optimized out>
        co = <optimized out>
        instr_ub = <optimized out>
        instr_lb = <optimized out>
        instr_prev = <optimized out>
        first_instr = <optimized out>
        names = <optimized out>
        consts = <optimized out>
#8  0x00007fc62151f99c in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=3, kws=0x0,
    kwcount=0, defs=0x0, defcount=0, closure=0x0) at ../Python/ceval.c:3582
        f = 0x7fc59d4acd00
        retval = 0x0
        fastlocals = 0x7fc59d4ace78
---Type <return> to continue, or q <return> to quit---
        freevars = 0x7fc59d4ace90
        tstate = 0x7fc5843d16f0
        x = <optimized out>
        u = <optimized out>
#9  0x00007fc6214cd170 in function_call.lto_priv.355 (func=0x7fc59d5ba5f0, arg=0x7fc5a47c0c80, kw=0x0) at ../Objects/funcobject.c:523
        result = <optimized out>
        argdefs = <optimized out>
        kwtuple = 0x0
        d = 0x0
        k = 0x0
        nk = 0
        nd = 0

And that was the last bit of output before the hang. Again this was with gdb version:

GNU gdb (Ubuntu 7.99.90.20170502-0ubuntu1) 7.99.90.20170502-git
Comment 4 Simon Marchi 2017-05-31 20:02:25 UTC
Hi Brian,

My guess is that GDB is stuck in an infinite loop trying to unwind the stacks/generate the backtraces.  One would need to step into GDB itself to understand what's going wrong.  I'll try to rebuild the Ubuntu package for vim, generate some core and see if I can reproduce the issue.

Simon
Comment 5 Simon Marchi 2017-05-31 21:32:43 UTC
(In reply to Simon Marchi from comment #4)

I wasn't able to reproduce the problem using GNU gdb (GDB) 8.0.50.20170531-git.  I suppose it needs a particular situation for it to trigger.  If you find a way to reproduce easily, please share it.

Simon
Comment 6 Pedro Alves 2017-06-01 09:21:30 UTC
>  strace: [ Process PID=16418 runs in x32 mode. ]

Does this mean that the GDB that hangs is built as a x32 process?
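
One way to check this without relying on strace's message is to read the ELF header of the gdb binary directly. The sketch below is only an illustration: `elf_abi` is a hypothetical helper name, and it assumes a little-endian ELF file. An x32 executable is a 32-bit ELF object whose machine type is x86-64.

```shell
#!/bin/sh
# elf_abi FILE: classify an ELF file from two header fields.
# Hypothetical helper; assumes a little-endian ELF file.
elf_abi() {
    # Byte 4 is EI_CLASS: 1 = 32-bit object, 2 = 64-bit object.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    # Byte 18 is the low byte of e_machine: 62 = EM_X86_64.
    machine=$(od -An -tu1 -j18 -N1 "$1" | tr -d ' \n')
    case "$class/$machine" in
        1/62) echo "x32" ;;
        2/62) echo "x86-64" ;;
        *)    echo "other (class=$class machine=$machine)" ;;
    esac
}
```

It could be pointed at the installed binary with `elf_abi "$(command -v gdb)"`, or at the running process with `elf_abi /proc/16418/exe` (reading that link may need the same privileges as the strace call above).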
Comment 7 Pedro Alves 2017-06-01 09:22:04 UTC
Brian, I assume you're detecting the hangs and killing GDB.  Any chance you can attach another GDB to the hung GDB before killing it, and get a "thread apply all bt" so we can see where GDB is stuck?
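
A retracing service could automate that request roughly as follows. This is a sketch, not part of any existing tool: `dump_then_kill` and the `GDB` override variable are hypothetical names, and how the service decides that a gdb is hung is left out.

```shell
#!/bin/sh
# dump_then_kill PID OUTFILE
# Attach a second gdb to a (presumably hung) gdb, save where it is stuck,
# then kill it. GDB is a hypothetical override for the debugger binary.
dump_then_kill() {
    pid=$1
    out=$2
    # -p attaches to a running process; -batch exits (and detaches)
    # once the -ex commands have run.
    "${GDB:-gdb}" -p "$pid" -batch -ex 'thread apply all bt' > "$out" 2>&1
    # A process spinning at 100% CPU may never act on SIGTERM, so use SIGKILL.
    kill -KILL "$pid"
}
```

Usage would be along the lines of `dump_then_kill 16418 /tmp/hung-gdb-bt.txt`. Attaching to another process may require root or a relaxed `kernel.yama.ptrace_scope` setting, and the backtrace is only readable if debug symbols for gdb are installed.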
Comment 8 brian@ubuntu.com 2017-06-01 21:09:40 UTC
I attached to the hung gdb process which was trying to retrace a crash from kodi. I then ran thread apply all bt and observed the following:

(gdb) thread apply all bt

Thread 1 (Thread 0x7fd6d1f41780 (LWP 8736)):
#0  0x000055e6d9b0503d in ?? ()
#1  0x000055e6d9b0547f in ?? ()
#2  0x000055e6d9afe451 in ?? ()
#3  0x000055e6d9b041da in ?? ()
#4  0x000055e6d9b01221 in ?? ()
#5  0x000055e6d9b01f8a in ?? ()
#6  0x000055e6d9ae62bf in ?? ()
#7  0x000055e6d9aee9d1 in ?? ()
#8  0x000055e6d9ae60d8 in ?? ()
#9  0x000055e6d9ae8265 in ?? ()
#10 0x000055e6d9ae60a8 in ?? ()
#11 0x000055e6d9ae58cc in ?? ()
#12 0x000055e6d9ae467d in ?? ()
#13 0x000055e6d9ad81d2 in ?? ()
#14 0x000055e6d9ae4979 in ?? ()
#15 0x000055e6d9ae42e6 in ?? ()
#16 0x000055e6d9bdbbae in ?? ()
#17 0x000055e6d9bdae0a in ?? ()
#18 0x000055e6d9c2eea1 in ?? ()
#19 0x000055e6d9c2f0d1 in ?? ()
#20 0x000055e6d9a4dfaf in ?? ()
#21 0x000055e6d9ac7bb4 in ?? ()
#22 0x000055e6d9ac7ff6 in ?? ()
#23 0x000055e6d9ac2416 in ?? ()
#24 0x000055e6d9abe95c in ?? ()
#25 0x000055e6d9b30576 in ?? ()
#26 0x000055e6d9b300c0 in ?? ()
#27 0x000055e6d9b31c4b in ?? ()
#28 0x000055e6d9b3233d in ?? ()
#29 0x000055e6d9b324d3 in ?? ()
#30 0x000055e6d9b32b26 in ?? ()
#31 0x000055e6d9ac88a7 in ?? ()
#32 0x000055e6d9ac8a43 in ?? ()
#33 0x000055e6d9acf3ba in ?? ()
#34 0x000055e6d9c1532c in ?? ()
#35 0x000055e6d9c15ef7 in ?? ()
#36 0x000055e6d9c1713c in ?? ()
#37 0x000055e6d9c165a5 in ?? ()
#38 0x000055e6d9c189c2 in ?? ()
#39 0x000055e6d9c18ddb in ?? ()
#40 0x000055e6d99744d4 in ?? ()
#41 0x000055e6d9977861 in ?? ()
#42 0x000055e6d9c5dc50 in ?? ()
#43 0x000055e6d9c592a7 in ?? ()
#44 0x000055e6d99744d4 in ?? ()
#45 0x000055e6d9977861 in ?? ()
#46 0x000055e6d9c5dc50 in ?? ()
#47 0x000055e6d9b1df2a in ?? ()
#48 0x000055e6d9b1e337 in ?? ()
#49 0x000055e6d9b1d80e in ?? ()
#50 0x00007fd6d1b46d73 in rl_callback_read_char () from /lib/x86_64-linux-gnu/libreadline.so.7
#51 0x000055e6d9b1d6fa in ?? ()
#52 0x000055e6d9b1d779 in ?? ()
#53 0x000055e6d9b1ddc1 in ?? ()
#54 0x000055e6d9b1c097 in ?? ()
#55 0x000055e6d9b1c660 in ?? ()
#56 0x000055e6d9b1b3d6 in ?? ()
#57 0x000055e6d9b1b41d in ?? ()
#58 0x000055e6d9b998ac in ?? ()
#59 0x000055e6d9b1f460 in ?? ()
---Type <return> to continue, or q <return> to quit---
#60 0x000055e6d9b9af03 in ?? ()
#61 0x000055e6d9b9af3b in ?? ()
#62 0x000055e6d98a9d39 in ?? ()
#63 0x00007fd6cfaa43f1 in __libc_start_main (main=0x55e6d98a9ce0, argc=13, argv=0x7ffe07075828, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe07075818) at ../csu/libc-start.c:291
#64 0x000055e6d98a9bda in ?? ()
Comment 9 brian@ubuntu.com 2017-06-01 21:20:21 UTC
I tried the same process, attaching gdb to the hung gdb which was this time retracing a core file from vim, and the output was similar to that with the kodi core file.

Here's a snippet:

Thread 1 (Thread 0x7f6ab8dd8780 (LWP 10437)):
#0  0x0000559fa47219ce in ?? ()
...
#59 0x00007f6ab89ddd73 in rl_callback_read_char () from /lib/x86_64-linux-gnu/libreadline.so.7
#60 0x0000559fa475a6fa in ?? ()
#61 0x0000559fa475a779 in ?? ()
#62 0x0000559fa475adc1 in ?? ()
#63 0x0000559fa4759097 in ?? ()
#64 0x0000559fa4759660 in ?? ()
#65 0x0000559fa47583d6 in ?? ()
#66 0x0000559fa475841d in ?? ()
#67 0x0000559fa47d68ac in ?? ()
#68 0x0000559fa475c460 in ?? ()
#69 0x0000559fa47d7f03 in ?? ()
#70 0x0000559fa47d7f3b in ?? ()
#71 0x0000559fa44e6d39 in ?? ()
#72 0x00007f6ab693b3f1 in __libc_start_main (main=0x559fa44e6ce0, argc=13, argv=0x7ffe65d3fe68, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7ffe65d3fe58) at ../csu/libc-start.c:291
#73 0x0000559fa44e6bda in ?? ()
Comment 10 brian@ubuntu.com 2017-06-01 21:31:19 UTC
Created attachment 10080
output from gdb when attached to gdb hang

I installed debug symbols for gdb and received more detailed information when running 'thread apply all bt'. I've added it as an attachment since it is rather lengthy.

Please let me know if there is anything else you need.
Comment 11 Pedro Alves 2017-06-02 09:34:39 UTC
Is that a pristine upstream GDB, or an Ubuntu GDB?  The reason I ask is that Ubuntu's gdb may have local patches that, who knows, might cause this.  It'd be nice to test with current upstream master.  That'd also let us know whether the issue might be fixed already.

Looking at the backtrace, we're in inline_frame_sniffer -> frame_unwind_register -> dwarf2_tailcall_sniffer_first, and then end up in the dwarf reader.

Both the inline and tailcall sniffers rely on accurate debug/unwind info.
I can imagine cycles appearing with e.g., bad unwind info.
I'd suspect something odd along those lines.

Also, seeing the two sniffers in the same backtrace is suspicious.

I think the next step would be to determine where exactly the cycle is.  I.e., once you get a hung gdb, attach to it, and step through the code until you figure out which loop never terminates.
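
Before stepping by hand, it may help to narrow the search by sampling the hung process a few times. Note that this is a different technique from the stepping described above: frames that appear unchanged in every sample are callers of the loop, while the frames that keep changing are inside it. The sketch below assumes debug symbols for gdb are installed; `sample_stacks` and the `GDB` override variable are hypothetical names.

```shell
#!/bin/sh
# sample_stacks PID [COUNT] [INTERVAL]
# Attach to PID COUNT times, INTERVAL seconds apart, printing one
# backtrace per sample. GDB is a hypothetical override for the binary.
sample_stacks() {
    pid=$1
    count=${2:-5}
    interval=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i ==="
        # -batch detaches after the backtrace, so the process keeps running
        # between samples.
        "${GDB:-gdb}" -p "$pid" -batch -ex 'bt' 2>&1
        sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, `sample_stacks 16418 10 2 > samples.txt` would collect ten backtraces two seconds apart for comparison.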

I'd disable the frame filters and whatever other scripts gdb may be loading, to confirm that the problem isn't being triggered by something they're doing.  If that "fixes" the problem, then we have a better idea where to look next.
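
A minimal way to run the retrace with scripts out of the picture might look like the sketch below. It assumes that any frame filters in play are registered by auto-loaded scripts, so turning auto-loading off keeps them from being installed; that assumption has not been checked against the Ubuntu packages. `retrace_no_scripts` and the `GDB` override variable are hypothetical names.

```shell
#!/bin/sh
# retrace_no_scripts EXE CORE
# Produce the full backtrace with init files and auto-loaded scripts disabled.
# GDB is a hypothetical override for the debugger binary.
retrace_no_scripts() {
    exe=$1
    core=$2
    # -nx skips .gdbinit files; -iex runs its command before the executable
    # and core are loaded, so auto-loading is already off at that point.
    "${GDB:-gdb}" -nx -iex 'set auto-load off' -batch \
        -ex 'thread apply all bt full' "$exe" "$core"
}
```

To match the original invocation, the sandbox settings (`set debug-file-directory`, `set solib-absolute-prefix`, `set solib-search-path`) would still have to be passed as additional `-iex` options, while the `add-auto-load-safe-path` option would be left out. If the hang disappears in this configuration, the scripts become the first suspect.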