Why text=binary mounts
Thu Jan 8 08:31:00 GMT 1998
Jeff Fried writes:
> Porting code from Unix to the PC should NOT require the same line
> termination mode since most Unix code which reads text uses fread/getc
> which automatically handle the end-of-line. And from the replies of most
> people i would argue that most of us would prefer to work in the native
> mode of the operating system in which we are running rather than having to
> constantly convert files between the two models simply because we use tools
> from both operating systems under NT/95. For examples of this
> compatibility look at many of the GNU tools which handle text, the file
> handling will work under both operating systems without any change because
> they use text mode I/O which is platform independent once all files have
> been converted to the form of the native OS.
This is true as long as you are considering text files only. The problem
comes in when you also want to deal with binary files. On Unix systems,
of course, there is no difference in operations on either, so most Unix
programs open all files using the same open() or fopen() calls. On systems
that differentiate between these files, it is important to add O_BINARY or
O_TEXT to the second argument of open(), and "b" for binary files to the
second argument of fopen(). This tells the underlying routines whether to
apply any translation to the file. If nothing is specified, the OS must
choose whether or not to make translations, and that is where the text=/!=
binary mounting comes in, as this specifies the default mode.
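As a rough sketch of what making the choice explicit looks like in C: the two helpers below (count_bytes_stream and count_bytes_fd are names made up for this illustration) open a file in binary mode through fopen() and open() and count the bytes they see. The "#ifndef O_BINARY" fallback is an assumption about how one might keep the same source compiling on Unix systems whose headers do not define the flag; it is not taken from any particular port.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: on systems with no text/binary distinction the flag may be
   missing from <fcntl.h>, so define it as a no-op there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: count the bytes in a file through fopen(), asking
   for binary mode with "b". Returns -1 if the file cannot be opened. */
long count_bytes_stream(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long n = 0;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Hypothetical helper: the same count through open(), asking for binary
   mode with O_BINARY in the second argument. Returns -1 on failure. */
long count_bytes_fd(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY | O_BINARY);
    if (fd < 0)
        return -1;
    unsigned char buf[4096];
    long n = 0;
    ssize_t got;
    while ((got = read(fd, buf, sizeof buf)) > 0)
        n += (long)got;
    close(fd);
    return n;
}
```

Because both helpers state the mode themselves, the count should not depend on whether the file lives on a text=binary or a text!=binary mount. Only calls that leave the mode out fall back to the mount default.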
Now, there are some difficulties in this implementation. First, since there
is no "t" that can be passed to fopen(), it is impossible to tell if a call
to fopen() wants a text mode open, or the default (blame POSIX/ANSI for that,
I guess). If we knew that every program had consciously made this choice,
there would be no need for a default, so we could assume that
the fopen() without a "b" wants a text mode open and mount things as
text!=binary. However, if there exist Unix programs that call fopen() without
the "b" for binary files (since it isn't needed on Unix and was added to the
standard much later than the program may have been written), then these
programs won't run correctly without some additional porting effort. The
same goes for programs that call open() without the O_BINARY bit set in the
second argument when opening binary files.
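To make the failure concrete, here is a small simulation of what a text-mode write does to data that is really binary. This is only a sketch under the usual assumption that text mode turns each LF into CR LF on output; simulate_text_write is a made-up name, and the code is not how cygwin32 itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated write-side text translation: every LF (0x0A) in the input is
   written as CR LF (0x0D 0x0A). The out buffer must have room for 2 * n
   bytes. Returns the number of bytes placed in out. */
size_t simulate_text_write(const unsigned char *in, size_t n,
                           unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0x0A)
            out[j++] = 0x0D;   /* inserted CR: harmless in text, corrupting in binary */
        out[j++] = in[i];
    }
    return j;
}
```

Fed the three bytes 00 0A FF, this returns 4 and leaves 00 0D 0A FF in the output buffer. In a text file that extra byte is just a line ending; in an object file or an image it shifts every offset that follows, which is why a binary file written through a default-mode fopen() on a text!=binary mount comes out damaged.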
To compound this, there are times when it is extremely difficult to impossible
to tell if a file should be opened as text or binary. For instance, should
TAR open the files that it is writing to an archive as binary or text files?
How can it determine which to use?
So, to avoid these issues, many people on this list try to avoid using anything
from the Microsoft world (except for NT/95 itself) and use only cygwin32
programs with text=binary, so that any file is just like any other file, as
on Unix systems. As a result, their text files are only marginally
exchangeable with other NT/95 users (or other NT/95 applications). So, it
seems to me that this gives a slow, incomplete, and buggy (well, it is a Beta
release!) emulation of Unix with no advantages over Linux except that their
boss has declared that they must run NT (in true pointy-haired boss fashion).
Sure, it's fun to play with cygwin32, but to me it doesn't seem reasonable to
try to develop it as a Linux replacement. I think that if it is to be truly
useful, cygwin32 must encourage interoperating with the native world that it
exists in. Part of that is running well in a text!=binary mounted world.
Sure, that means that porting programs to cygwin32 requires you to instill
an awareness of binary vs. text files, and that does mean more work
to port the programs, but it also produces more useful programs.
This discussion keeps coming up, which I believe supports my feeling that it
is a major issue with cygwin32. I know that in the previous iteration I ended
by just agreeing to disagree and said that I wouldn't say any more on it,
but I just wanted to give some support to this side in this iteration and
that'll be it (this time around, at least).