GNU make losing jobserver tokens
Wed Apr 27 14:13:40 GMT 2022
On Fri, 1 Apr 2022 17:45:51 +0900
Takashi Yano wrote:
> On Mon, 21 Mar 2022 15:28:17 +0100
> Magnus Ihse Bursie wrote:
> > Hi,
> > I'm working for Oracle on the OpenJDK build team. We're using GNU make
> > to build the JDK on all supported platforms. For Windows, we use Cygwin
> > as our build environment, including the Cygwin version of GNU make.
> > We have had a long-standing issue with make losing jobserver tokens.
> > ("long-standing" here means for years, and years, at least since GNU
> > make 4.0, up to and including the current latest version in Cygwin.)
> > Most runs end with something like:
> > make: INTERNAL: Exiting with 11 jobserver tokens available; should be
> > 12!
> > Since the build still succeeds, and it just affects performance (and
> > typically not that much), we have not spent too much time getting to the
> > bottom of this.
> > Now, however, I've come across a machine where this happens repeatedly,
> > and on a much worse scale:
> > make: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
> > This effectively turns the highly parallelized builds into
> > single-threaded builds, and is absolutely detrimental for performance.
> > On the flip side, this also makes for the perfect testing environment to
> > really get to the bottom of this issue.
> > I started out by sending a question to firstname.lastname@example.org. The folks over
> > there reported that this was not a known problem with GNU make on
> > Windows in general, and that as far as they knew, the mingw port did not
> > suffer from this problem.
> > Instead, they suggested that it was a Cygwin-specific problem, possibly
> > related to issues with emulating POSIX pipes and/or signals in Cygwin.
> > So, my first question is: Is this a known problem in Cygwin GNU make?
> > Are there any workarounds/fixes to get around it?
> > Otherwise: Any suggestions on how to go on and debug this? I am willing
> > to build and test an instrumented debug build of make, but I will need
> > assistance to find my way around the source and spot likely candidates
> > for the source of the problem.
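For readers unfamiliar with the token count in the message above: as far as I understand the POSIX jobserver, it is a pipe pre-loaded with one byte per job slot minus one, because the top-level make holds one implicit token itself. A job reads a byte before it starts and writes it back when it finishes, and at exit make counts what is left. The following is only a rough model of that accounting, not GNU make's code; `Jobserver`, `simulate_build` and the `lost` parameter are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdexcept>
#include <unistd.h>

// Hypothetical model of a POSIX-style jobserver: a pipe holding one byte
// per token. This is an illustration, not GNU make's implementation.
struct Jobserver {
  int rfd = -1, wfd = -1;

  explicit Jobserver(unsigned job_slots) {
    assert(job_slots >= 1);
    int fds[2];
    if (pipe(fds) != 0) throw std::runtime_error("pipe failed");
    rfd = fds[0];
    wfd = fds[1];
    // The owner keeps one implicit token, so only N-1 go into the pipe.
    for (unsigned i = 1; i < job_slots; ++i) release();
  }
  ~Jobserver() { close(rfd); close(wfd); }
  Jobserver(const Jobserver&) = delete;
  Jobserver& operator=(const Jobserver&) = delete;

  // Take one token; blocks while the pipe is empty.
  bool acquire() { char t; return read(rfd, &t, 1) == 1; }

  // Give one token back.
  void release() {
    char t = '+';
    if (write(wfd, &t, 1) != 1) throw std::runtime_error("write failed");
  }

  // Drain the pipe without blocking and count tokens, including the
  // implicit one held by the owner.
  unsigned tokens_at_exit() {
    fcntl(rfd, F_SETFL, fcntl(rfd, F_GETFL) | O_NONBLOCK);
    unsigned count = 1;
    char t;
    while (read(rfd, &t, 1) == 1) ++count;
    return count;
  }
};

// Start as many jobs as there are tokens, then let `lost` of them finish
// without returning their token (the failure mode being modelled).
unsigned simulate_build(unsigned job_slots, unsigned lost) {
  Jobserver js(job_slots);
  unsigned held = 0;
  while (held + 1 < job_slots && js.acquire()) ++held;
  assert(lost <= held);
  for (unsigned i = lost; i < held; ++i) js.release();
  return js.tokens_at_exit();
}
```

In this model, `simulate_build(12, 1)` corresponds to the "11 available; should be 12" case, and `simulate_build(24, 23)` to the machine where only the implicit token survives. It says nothing about why a token goes missing on Cygwin; it only shows what the final count measures.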
> I tried to reproduce the issue by building OpenJDK from source,
> but could not.
> Instead, I encountered another issue.
> Building OpenJDK sometimes (rarely) failed with errors such as:
> 0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> 124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
> make: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
> make: *** Waiting for unfinished jobs....
> I looked into this new problem and found that the wait_sig() thread
> crashes with a segfault. It seems that accessing _main_tls causes an
> access violation if a signal is sent just after the process is started:
> static void WINAPI
> wait_sig (VOID *)
> {
>   [...]
>       if (!pack.mask)
>         {
>           tl_entry = cygheap->find_tls (_main_tls);
>           dummy_mask = _main_tls->sigmask; // <--- Segfault here
>           cygheap->unlock_tls (tl_entry);
>           pack.mask = &dummy_mask;
>         }
>   [...]
> }
> I also found that the following patch resolves the issue:
> diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> index 62df96652..3824af199 100644
> --- a/winsup/cygwin/sigproc.cc
> +++ b/winsup/cygwin/sigproc.cc
> @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
>    _sig_tls = &_my_tls;
>    bool sig_held = false;
>
> +  /* Wait for _main_tls initialization. */
> +  while (!cygwin_finished_initializing)
> +    Sleep (10);
> +
>    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
>                    my_readsig, my_sendsig);
>
> I guess _main_tls may not be initialized correctly until
> cygwin_finished_initializing is set.
> Any comments would be appreciated.
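To make the suspected race easier to discuss, here is a small standalone sketch of the pattern the patch relies on: a helper thread that is started early must not touch the main thread's state until an "initialization finished" flag has been set. This is portable C++ and not Cygwin code; `Process`, `MainState`, `read_mask_after_init` and `run_startup` are hypothetical names, and the 10 ms poll only mirrors the `Sleep (10)` in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the main thread's per-thread state.
struct MainState {
  unsigned sigmask;
};

// Hypothetical stand-in for process-wide startup state.
struct Process {
  std::atomic<bool> finished_initializing{false};
  MainState* main_state = nullptr;  // only valid once the flag is set
};

// Plays the role of the signal thread: poll until startup is done, then
// read the mask. Without the wait, main_state could still be null here.
unsigned read_mask_after_init(Process& p) {
  while (!p.finished_initializing.load(std::memory_order_acquire))
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  assert(p.main_state != nullptr);
  return p.main_state->sigmask;
}

// Start the helper thread first, finish initialization after a delay,
// and return the mask the helper thread observed.
unsigned run_startup(unsigned mask, std::chrono::milliseconds init_delay) {
  Process p;
  MainState state{0};
  unsigned seen = 0;

  std::thread sig([&] { seen = read_mask_after_init(p); });

  std::this_thread::sleep_for(init_delay);  // simulate slow startup
  state.sigmask = mask;
  p.main_state = &state;
  // The release store publishes main_state and sigmask to the helper.
  p.finished_initializing.store(true, std::memory_order_release);

  sig.join();
  return seen;
}
```

Calling `run_startup(0x5u, std::chrono::milliseconds(30))` should return the mask that was set during the delayed initialization, because the helper thread does not read it earlier. Polling is the simplest way to express this; an event or condition variable would avoid the wake-ups, but I have not checked whether that fits the real startup path.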
Takashi Yano <email@example.com>