In applications linked against musl libc, stepping over calls to the 'setrlimit' library function results in an error from gdb:
Thread 1 "myprog" received signal ?, Unknown signal.
__cp_end () at src/thread/x86_64/syscall_cp.s:29
After that, it's not possible to continue the debug session beyond that point.
This issue was seen on Alpine Linux, but it may also apply to apps linked against musl libc on other Linux distros. Specifically, it is seen when debugging certain versions of OpenJDK and CoreCLR (links 1, 2), where it is especially troubling because setrlimit is called during VM startup.
The issue stems from the musl libc implementation of setrlimit (link 3). It updates threads in a synchronized manner by calling __synccall (link 4), which signals the threads with a SIGSYNCCALL signal:
r = -__syscall(SYS_tgkill, pid, tid, SIGSYNCCALL);
SIGSYNCCALL is internal to musl and doesn't seem to be recognized by gdb. When stepping over this code line, gdb intercepts the SIGSYNCCALL signal and reports the "Unknown signal" error.
This signal is defined as follows in musl pthread_impl.h (link 5), along with two other signal types:
#define SIGTIMER 32
#define SIGCANCEL 33
#define SIGSYNCCALL 34
Adding support for these signal types in gdb, or at least avoiding the error above, would enable better debugging of OpenJDK and other apps on Alpine Linux.
- Alpine Linux V3.8
- OpenJDK (openjdk7 / openjdk8 package, but any OpenJDK version should be applicable)
- gdb versions 8.0.1-r6, 8.0.1-r3, 7.12.1-r1.
- gdb <path/to/java>
- r -version
I've implemented a quick and dirty patch for musl signals, for reference:
The trouble is that the three musl signals mentioned above are internal and not defined in the user-facing <signal.h>, so I had to define them locally. There's also no __MUSL__ define, so another difficulty is guarding these defines with an ifdef so that they apply to musl builds only.
Rather than hard-coding implementation internals (which will change; SIGTIMER is slated to be removed at some point and the others moved to free up a slot), a clean patch should just handle "unknown" signals in some safe way (i.e. not get stuck on them). Do you understand the mechanism of how this problem is even happening? Presumably it's a weird mixup between host and target -- it should be possible to debug local glibc-linked inferiors with a musl-hosted gdb, or vice versa, so I don't understand how ideas of the semantics of implementation-internal signals are coming into this.
Reportedly this fixes the issue. I don't entirely understand the mechanism, but it seems plausible that this is closer to correct.