[PATCH] Fix multifarious bad pinfo accesses [was Re: Expect goes crazy... spinning cpu in kill_pgrp]

Dave Korn dave.korn.cygwin@googlemail.com
Sat Jan 9 11:14:00 GMT 2010


[ ref: http://cygwin.com/ml/cygwin-developers/2009-10/threads.html#00225 ]

    Hi all,

  This fixes the bug I was discussing in the referenced thread.  To summarize:

STC(*): Bootstrap gcc, then run make check at -j8 (or so).  At some point,
generally not too long into the run, one of the running expect instances will
go crazy and process creation will start failing everywhere else.

Bug: Expect hits a segfault while trying to abort.  The segfault leads to an
abort, which leads back to the same segfault, which tries again to abort... ad
infinitum.

Cause: The segfault happens when we try and look up the controlling tty of an
exec()ed program stub.  Here's the relevant class definition (comments,
function members and whitespace trimmed for clarity):

> class _pinfo
> {
> public:
>   pid_t pid;
>   DWORD process_state;
>   DWORD exitcode;	/* set when process exits */
> 
> #define PINFO_REDIR_SIZE ((char *) &myself.procinfo->exitcode - (char *) myself.procinfo)
> 
>   DWORD cygstarted;
>   pid_t ppid;
>   DWORD dwProcessId;
>   char progname[NT_MAX_PATH];
>   __uid32_t uid;	/* User ID */
>   __gid32_t gid;	/* Group ID */
>   pid_t pgid;		/* Process group ID */
>   pid_t sid;		/* Session ID */
>   int ctty;		/* Control tty */
>   bool has_pgid_children;/* True if we've forked or spawned children with our GID. */
>   long start_time;
>   struct rusage rusage_self;
>   struct rusage rusage_children;
>   int nice;
>   char stopsig;
 [ ... snip ... ]
>   /* signals */
>   HANDLE sendsig;
>   HANDLE exec_sendsig;
>   DWORD exec_dwProcessId;
> public:
>   HANDLE wr_proc_pipe;
>   DWORD wr_proc_pipe_owner;
>   friend class pinfo;
> };

  The salient point is that PINFO_REDIR_SIZE definition.  This is used in
pinfo::init() like so:

>       DWORD mapsize;
>       if (flag & PID_EXECED)
> 	mapsize = PINFO_REDIR_SIZE;
>       else
> 	mapsize = sizeof (_pinfo);
> 
>       procinfo = (_pinfo *) open_shared (L"cygpid", n, h0, mapsize, shloc,
> 					 sec_attribs, access);

  So, the idea is (iiuc) that if the current pinfo is for an exec()ed program
stub, there's no need for most of the data, so let's save some memory space
and PTEs by only mapping a reduced view of the start of the pinfo object,
containing only the pertinent data members.

  This is where the problem arises, because the segfault happens when trying
to access the 'ctty' member of an exec()ed pinfo.  As the second post in the
thread demonstrates, there are several places that access members of the
_pinfo beyond the end of the short mapping.  And even guarding those accesses
with tests against PID_EXECED state doesn't always help, because there's
always a TOCTOU race there; it would need proper inter-process locking to be
completely safe.

  That's rather top-heavy and undesirable though, but there's a simpler
answer.  Since this code was first written, a lot of the big data structs have
been moved out of _pinfo.  And whatever size of mapping you request with
PINFO_REDIR_SIZE, you've got a 4kB page granularity with a minimum of one
page, anyway.  If the ctty member had been before the big progname[] array
rather than after it, it would have been fallen within that one page.

  So, the attached patch just rearranges the order of data members so that all
the small data members go at the start, and are hence always available in both
sizes of _pinfo view mapping; just for correctness' sake, it also redefines
PINFO_REDIR_SIZE to match the actual end of the small data members.  This
should mean that no more system resources of any kind are used than was
originally the case, but all the required data is available when needed.

winsup/cygwin/ChangeLog:

	* pinfo.h (_pinfo::progname[]): Move to end.
	(PINFO_REDIR_SIZE): Redefine to point up to start of
	progname[] member.

  I've now had a gcc testrun going at -j8 all night with no sign of the old
race condition; it's never made it a fraction this far before without falling
down^W^Wgetting splatted by a falling hippo.  I'm pretty sure that I've got it
this time; OK for head?

    cheers,
      DaveK
-- 
(*) - For sufficiently complex values of simple.  Sorry I don't have a real
testcase.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pinfo-race-fix.diff
Type: text/x-c
Size: 1266 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20100109/320596b5/attachment.bin>


More information about the Cygwin-patches mailing list