This is the mail archive of the cygwin-patches mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Fix multifarious bad pinfo accesses [was Re: Expect goes crazy... spinning cpu in kill_pgrp]


[ ref: http://cygwin.com/ml/cygwin-developers/2009-10/threads.html#00225 ]

    Hi all,

  This fixes the bug I was discussing in the referenced thread.  To summarize:

STC(*): Bootstrap gcc, then run make check at -j8 (or so).  At some point,
generally not too long into the run, one of the running expect instances will
go crazy and process creation will start failing everywhere else.

Bug: Expect hits a segfault while trying to abort.  The segfault leads to an
abort, which leads back to the same segfault, which tries again to abort... ad
infinitum.

Cause: The segfault happens when we try and look up the controlling tty of an
exec()ed program stub.  Here's the relevant class definition (comments,
function members and whitespace trimmed for clarity):

> class _pinfo
> {
> public:
>   pid_t pid;
>   DWORD process_state;
>   DWORD exitcode;	/* set when process exits */
> 
> #define PINFO_REDIR_SIZE ((char *) &myself.procinfo->exitcode - (char *) myself.procinfo)
> 
>   DWORD cygstarted;
>   pid_t ppid;
>   DWORD dwProcessId;
>   char progname[NT_MAX_PATH];
>   __uid32_t uid;	/* User ID */
>   __gid32_t gid;	/* Group ID */
>   pid_t pgid;		/* Process group ID */
>   pid_t sid;		/* Session ID */
>   int ctty;		/* Control tty */
>   bool has_pgid_children;/* True if we've forked or spawned children with our GID. */
>   long start_time;
>   struct rusage rusage_self;
>   struct rusage rusage_children;
>   int nice;
>   char stopsig;
 [ ... snip ... ]
>   /* signals */
>   HANDLE sendsig;
>   HANDLE exec_sendsig;
>   DWORD exec_dwProcessId;
> public:
>   HANDLE wr_proc_pipe;
>   DWORD wr_proc_pipe_owner;
>   friend class pinfo;
> };

  The salient point is that PINFO_REDIR_SIZE definition.  This is used in
pinfo::init() like so:

>       DWORD mapsize;
>       if (flag & PID_EXECED)
> 	mapsize = PINFO_REDIR_SIZE;
>       else
> 	mapsize = sizeof (_pinfo);
> 
>       procinfo = (_pinfo *) open_shared (L"cygpid", n, h0, mapsize, shloc,
> 					 sec_attribs, access);

  So, the idea is (iiuc) that if the current pinfo is for an exec()ed program
stub, there's no need for most of the data, so let's save some memory space
and PTEs by only mapping a reduced view of the start of the pinfo object,
containing only the pertinent data members.

  This is where the problem arises, because the segfault happens when trying
to access the 'ctty' member of an exec()ed pinfo.  As the second post in the
thread demonstrates, there are several places that access members of the
_pinfo beyond the end of the short mapping.  And even guarding those accesses
with tests against PID_EXECED state doesn't always help, because there's
always a TOCTOU race there; it would need proper inter-process locking to be
completely safe.

  That's rather top-heavy and undesirable though, but there's a simpler
answer.  Since this code was first written, a lot of the big data structs have
been moved out of _pinfo.  And whatever size of mapping you request with
PINFO_REDIR_SIZE, you've got a 4kB page granularity with a minimum of one
page, anyway.  If the ctty member had been before the big progname[] array
rather than after it, it would have been fallen within that one page.

  So, the attached patch just rearranges the order of data members so that all
the small data members go at the start, and are hence always available in both
sizes of _pinfo view mapping; just for correctness' sake, it also redefines
PINFO_REDIR_SIZE to match the actual end of the small data members.  This
should mean that no more system resources of any kind are used than was
originally the case, but all the required data is available when needed.

winsup/cygwin/ChangeLog:

	* pinfo.h (_pinfo::progname[]): Move to end.
	(PINFO_REDIR_SIZE): Redefine to point up to start of
	progname[] member.

  I've now had a gcc testrun going at -j8 all night with no sign of the old
race condition; it's never made it a fraction this far before without falling
down^W^Wgetting splatted by a falling hippo.  I'm pretty sure that I've got it
this time; OK for head?

    cheers,
      DaveK
-- 
(*) - For sufficiently complex values of simple.  Sorry I don't have a real
testcase.
Index: winsup/cygwin/pinfo.h
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/pinfo.h,v
retrieving revision 1.109
diff -p -u -r1.109 pinfo.h
--- winsup/cygwin/pinfo.h	6 Oct 2009 21:51:17 -0000	1.109
+++ winsup/cygwin/pinfo.h	9 Jan 2010 05:28:30 -0000
@@ -50,8 +50,6 @@ public:
 
   DWORD exitcode;	/* set when process exits */
 
-#define PINFO_REDIR_SIZE ((char *) &myself.procinfo->exitcode - (char *) myself.procinfo)
-
   /* > 0 if started by a cygwin process */
   DWORD cygstarted;
 
@@ -63,9 +61,6 @@ public:
     signals.  */
   DWORD dwProcessId;
 
-  /* Used to spawn a child for fork(), among other things. */
-  char progname[NT_MAX_PATH];
-
   /* User information.
      The information is derived from the GetUserName system call,
      with the name looked up in /etc/passwd and assigned a default value
@@ -120,6 +115,12 @@ public:
   HANDLE wr_proc_pipe;
   DWORD wr_proc_pipe_owner;
   friend class pinfo;
+
+  /* Used to spawn a child for fork(), among other things. */
+  char progname[NT_MAX_PATH];
+  /* Truncate it after execed process exits. */
+#define PINFO_REDIR_SIZE ((char *) &myself.procinfo->progname[0] - (char *) myself.procinfo)
+
 };
 
 DWORD WINAPI commune_process (void *);

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]