[PATCH] winsup/cygwin: Protect fork() against dll- and exe-updates.

Michael Haubenwallner michael.haubenwallner@ssi-schaefer.com
Fri Jul 31 11:32:00 GMT 2015


Hi Corinna,

On 2015-07-29 at 15:22, Corinna Vinschen wrote:
> On Jul 28 18:40, Michael Haubenwallner wrote:
>> On 07/27/2015 09:50 AM, Corinna Vinschen wrote:
>>> On Jul 24 17:43, Michael Haubenwallner wrote:
>>>> When starting to port Gentoo Prefix to Cygwin, the first real problem
>>>> discovered is that fork() does use the original executable's location

>>> Unfortunately there's some red tape to get over with, first.  We need a
>>> copyright assignment from you before we can go much further.

Copyright assignment submitted.

>>> - /proc is already available as virtual filesystem as on Linux.
>>>   [blah]
>>>   Also, using the Windows PID as dir name seems a bit weird, given that
>>>   the virtual /proc obviously uses the Cygwin PID.  This sounds like a
>>>   source for confusion.

For the moment, using the Windows PID as directory name is necessary: the
Cygwin PID may be shared by multiple Windows processes, so keying on it feels
like it would require more sophisticated setup/cleanup logic.

>> There's no particular reason for /proc/ actually - just came to my mind
>> first. I've also seen /run/ on recent Linux boxes...
> 
> Yeah, /run might be a good option, albeit there may be installations
> out there already using this path for their own dubious purposes.
> Reusing a path existing in a cygwin installation by default would
> avoid collisions.  /var/run perhaps.

I've updated the patch to use /var/run/wproc/<sid>/<ntpid>/ now; the
feature is enabled only if /var/run/wproc/ has been created manually.
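
As a rough illustration of that layout, here is a minimal C sketch of how
such a per-process directory name could be assembled. The helper name
wproc_dir and its signature are made up for this example and are not taken
from the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from the patch: format the per-process
   directory "/var/run/wproc/<sid>/<ntpid>/" into buf.
   Returns 0 on success, -1 if buf is too small.  */
static int
wproc_dir (char *buf, size_t len, const char *sid, unsigned long ntpid)
{
  assert (buf != NULL && sid != NULL);
  int n = snprintf (buf, len, "/var/run/wproc/%s/%lu/", sid, ntpid);
  return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Keying the last component on the Windows PID keeps each directory private
to exactly one Windows process, which is what makes cleanup simple.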

>> This is the functional reason to keep these hardlinks optional:
>> I don't want Cygwin itself to require NTFS, but Gentoo Prefix only - which
>> IMHO is a corner use-case for Cygwin, but requires an updates-protected fork.
> 
> Some people use Cygwin from a USB stick.

Cool idea - makes sense, of course. But to avoid slowing down Cygwin,
one should simply not create /var/run/wproc/ on the USB stick...

>> However, I've been using Interix before - and Cygwin feels faster even
>> with hardlinks enabled.
> 
> FTR: Me too, and I have not the faintest idea why, given that Interix
> can fork natively while Cygwin has to go to great lengths to emulate it.

I wouldn't be surprised if Interix does something similar - just hidden as "native".

<snip>
>>>> *) dll-redirection for LoadLibrary using "app.exe.local" file does operate on
>>>>    the dll's basename only, breaking perl's Hash::Util and List::Util at least.
>>>>    So creating hardlinks for dynamically loaded dlls is disabled for now.
>>>>    Eventually, manifests and/or app.exe.config could help here, but I'm still
>>>>    failing to really grok them...
>>>
>>> Hmm.  The DLLs are loaded dynamically anyway, so they will be loaded
>>> dynamically in the child as well in dll_list::load_after_fork_impl.  Why
>>> not simply hardlinking them using a unique filename (e.g. using the
>>> inode number), storing the unique number or name in the dll struct and
>>> then calling LoadLibrary on this name?
>>
>> This might be necessary in the initial dlopen() already: I've tried hardlinks
>> for loaded dlls mangling the full path into the hardlink's filename, but
>> encountered different load addresses in the child - most likely due to the
>> now different dll's filename.
> 
> Huh?  That shouldn't happen.  The address is determined by the file's
> PE/COFF header, not by the name.  However, did you reuse the name field
> in the dll structure or did you create another name field for the
> mangled name?  In the first case there may be some checks in dll_init.cc
> not working.  That's why I said to use an extra field for the mangled
> name.

Fixed now. The basename of the loaded dll has to be preserved, so it can
be found in the child as "already loaded" link-dep of another dll that
is loaded afterwards.
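
One conceivable way to combine a unique hardlink location with an unchanged
basename is to put the uniqueness into a directory component. The following
C sketch only illustrates that idea; the helper name link_name and the
"<dir>/<inode>/<basename>" layout are assumptions for this example, not
necessarily what the patch does:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<dir>/<inode in hex>/<basename of dllpath>".
   The inode makes the location unique, the basename stays untouched.
   Returns 0 on success, -1 if buf is too small.  */
static int
link_name (char *buf, size_t len, const char *dir,
           unsigned long long ino, const char *dllpath)
{
  assert (buf != NULL && dir != NULL && dllpath != NULL);
  const char *base = strrchr (dllpath, '/');
  base = base ? base + 1 : dllpath;     /* strip any leading directories */
  int n = snprintf (buf, len, "%s/%llx/%s", dir, ino, base);
  return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

With such a scheme, two dlls that share a basename but differ in inode
(like the Util.dll of both Hash::Util and List::Util) end up in different
directories while keeping the name other dlls look for.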

>>> - What if an EXE/DLL is replaced more than once during the lifetime of
>>>   a process?
>>
>> This wouldn't make any difference: The hardlinks are created upon the first
>> use of some exe/dll in parent (even if that process won't ever use fork),
> 
> So, here's a question.  What if the directory is only created on
> first fork?  Given that only few processes actually call fork, shouldn't
> that speed up typical usage profiles a lot?  Even with `configure' or
> `make', at least half of the involved processes don't fork.

Yeah - but how would I then create a hardlink (in another directory) that
still refers to the original inode, once the original file name has been
renamed/unlinked during the upgrade? This is why I create the hardlink at
load time already, while the original file name is still available.
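
To make the argument concrete, here is a small sketch in plain POSIX C
(not the Windows/NT calls the patch itself would use). It pins a file with
a hardlink, replaces the original name the way an update would, and checks
that the hardlink still names the old inode. The function names and the
file contents are invented for this demonstration:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write a small file; returns 0 on success, -1 on failure.  */
static int
write_file (const char *path, const char *text)
{
  FILE *f = fopen (path, "w");
  if (!f)
    return -1;
  int ok = fputs (text, f) >= 0;
  return (fclose (f) == 0 && ok) ? 0 : -1;
}

/* Hypothetical demo: returns 1 if the hardlink `keep` still names the
   inode `orig` had before `orig` was replaced, 0 if not, -1 on error.  */
static int
link_survives_update (const char *orig, const char *keep)
{
  struct stat before, kept, replaced;
  int result = -1;

  assert (orig != NULL && keep != NULL);
  unlink (orig);                         /* start from a clean slate */
  unlink (keep);

  if (write_file (orig, "v1\n") == 0
      && link (orig, keep) == 0          /* "load time": pin the inode */
      && stat (orig, &before) == 0
      && unlink (orig) == 0              /* "upgrade": old name goes away */
      && write_file (orig, "v2\n") == 0  /* new file under the same name */
      && stat (keep, &kept) == 0
      && stat (orig, &replaced) == 0)
    result = kept.st_ino == before.st_ino
             && replaced.st_ino != before.st_ino;

  unlink (orig);                         /* remove the demo files again */
  unlink (keep);
  return result;
}
```

The link() call has to happen before the unlink(): afterwards the old
inode is no longer reachable through the original name, which is the
reason for linking at load time rather than at first fork.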

And WTF is ReFS? Is NTFS the next dead horse I'm gonna ride after Interix?

>> and the forked child gets the parent's first-use versions. Still there is
>> a short timeframe between process startup and hardlink creation, but that
>> is not a real problem (yet).
> 
> This may even be academic, but it's something to keep in mind.
> 
>>> - What about reducing the overhead by implementing some kind of generic
>>>   exe/dll cache used by all processes?  It would reduce the requirement
>>>   to cleanup, reduce the footprint of the cache, speed up subsequent
>>>   forks.
>>
>> I'm all for it, but I've no idea of currently available cross-process
>> mechanisms in cygwin/windows that could help here ...
> 
> Yeah, scratching my head myself, but we might want to discuss it
> nevertheless.  Maybe somebody has a good idea?

Just found NtCreateFile (CreateOptions=FILE_OPEN_BY_FILE_ID) - but I've not
found a mechanism yet to re-create a fully useable filesystem entry out of
the FILE_ID and/or the HMODULE only.

<snip>
>>> - The heretical question of course:  Is the underlying problem really
>>>   worth the additional overhead?  The patch is pretty intrusive.
>>
>> The underlying problem is:
>> Gentoo Prefix breaks on Cygwin with current fork implementation. OTOH,
>> both - enabling the hardlink creation plus the performance overhead - are
>> acceptable to me (and for now) to allow for Gentoo Prefix on Cygwin.
>> The alternative - to not have some POSIX-like buildsystem on Windows (since
>> Interix is gone) for our otherwise portable application - is... an issue.
> 
> Here's the catch.  What you're doing is a deviation from how Cygwin
> is trying to operate.  If at all possible, Cygwin applications should
> run in any environment.  Cygwin is just some "operating system", and
> despite striving for POSIX compatibility, we can't manage it under all
> circumstances.
> 
> This in turn usually requires porting.  Any application running under
> multiple OSes has code to make sure differences in the various OSes
> (and there are lots of them, even between the supposedly POSIX compatible
> ones) are handled gracefully.
> 
> So you'd usually port gentoo prefix to Cygwin, not vice versa.  And
> to close the loop, your change to Cygwin requires to change the users'
> environment, plus a noticeable slowdown of the entire installation, just
> to be able to run your application.
> 
> I'd expect that gentoo prefix, if there *is* an interest to port it to
> Cygwin, would try to run under Cygwin as is.  And it should preferably
> run under Cygwin in any environment, not only in the environment adding
> the exe/dll hardlinks.
> 
> Do you understand what bugs me?

I think I do understand - and I agree with your point of view!

But still, before trying to work around any issues with the underlying OS,
I prefer to fix them if possible.

OTOH: For the moment, I'm in an evaluation phase whether Gentoo Prefix can
be ported to Windows at all - even if restricted to NTFS only, as FAT is
ancient, albeit still used on USB sticks.

Now for Windows as the underlying OS, an existing workaround for the missing
fork() is already available in the newlib-cygwin package, even if that one
needs some patching to allow for Gentoo Prefix. Patching is business as usual
in Gentoo Prefix anyway, though upstream-acceptable patches are preferred.

Still, for Gentoo Prefix I do prefer to run on "Cygwin" rather than on
"Windows", even if that feels like nitpicking. But probably I have to
mirror my own cygwin distro anyway for complete long term support...:
http://video.fosdem.org/2015/devroom-distributions/providing_an_lts_distro_with_gentoo_prefix.mp4

>>>   Is there a simpler way to achieve the same or, at least, a similar
>>>   result?
>>
>> Hmm - most likely there is a faster way than the current patch,
>> but I doubt there is a simpler way...
> 
> Your patch is rather intrusive.  It's not "simple" as I understand it.

Was it really me who called this patch "simple"? ;)

Thanks!
/haubi/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Protect-fork-against-dll-and-exe-updates.patch
Type: text/x-patch
Size: 31396 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin-developers/attachments/20150731/ea912c44/attachment.bin>