This is the mail archive of the cygwin@sourceware.cygnus.com mailing list for the Cygwin project.



Re: ASCII and BINARY files. Why?


Jim Balter wrote:
> 
> Fergus Henderson wrote:
> 
> >           (^Z at the console should be EOF iff `stty eof ^Z'.)
> 
> What sort of implementation do you have in mind?

I didn't have any particular implementation in mind.
I'm sure it is doable, though.

> > For example of the difference, in the C preprocessor,
> > 
> >         #define foo \<carriage return><newline>
> >         bar
> > 
> > is different from
> > 
> >         #define foo \<newline>
> >         bar
> > 
> > Now, in this particular case, it is implementation-defined what
> > constitutes the end of a line, and so the GNU C preprocessor could
> > define the end of a line as either "\r\n" or "\n".  However, the ANSI
> > standard requires that the implementation document this choice, and so
> > if this change were made, the documentation would need to be changed.
> 
> I challenge you to show me where the standard requires this.

Since I happen to have the C++ draft on hand but not the C standard,
I'll show you where the C++ draft requires it.  I believe the wording
in the C standard is pretty similar.

 |   2.1  Phases of translation                                [lex.phases]
 | 
 | 1 The  precedence  among the syntax rules of translation is specified by
 |   the following phases.1)
 | 
 |     1 Physical  source file characters are mapped, in an implementation-
 |       defined manner, to the source character set (introducing  new-line
 |       characters  for  end-of-line  indicators)  if necessary.

 |   1.4  Definitions                                          [intro.defs]
 | 
 | 1 For the purposes of this International Standard, the definitions given
 |   in ISO/IEC 2382 and the following definitions apply.
 | 
 |   --implementation-defined behavior: Behavior, for a well-formed program
 |     construct  and  correct data, that depends on the implementation and
 |     that each implementation shall document.

The above quotes mean that the implementation is required to document
the way "physical source file characters" are mapped to the source
character set.  If the implementation maps the physical source file
characters "\r\n" to the single source character "\n", then that must be
documented.
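
To make that mapping concrete, here is a rough sketch of what a phase 1
translation that folds "\r\n" into "\n" might look like.  The name
map_eol and its interface are made up for illustration; this is not
how the GNU C preprocessor (or any particular implementation) does it.

```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes from in to out, replacing each
   "\r\n" pair with a single "\n".  A lone '\r' that is not followed
   by '\n' is copied unchanged.  out must have room for n bytes.
   Returns the number of bytes written to out. */
size_t map_eol(const char *in, size_t n, char *out)
{
    size_t i, j = 0;

    for (i = 0; i < n; i++) {
        if (in[i] == '\r' && i + 1 < n && in[i + 1] == '\n')
            continue;   /* drop the <cr>; the <nl> is copied next */
        out[j++] = in[i];
    }
    return j;
}
```

With a mapping like this in place, `\<cr><nl>' in the physical file
becomes `\<nl>' in the source character set, and later phases never
see the <cr>.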

> In any case, since \<cr> at the end of a line isn't syntactically
> legal, it's a rather minor matter how the compiler treats it.

No, `\<cr>' at the end of a *logical* source character line is
legal in some contexts, e.g. inside comments.  But that is indeed
a minor matter.

However, the more important point is that it is implementation-defined
whether `\<cr>' at the end of a *physical* source character line is legal.
If you're using Windows editors that save files in <cr><lf> format,
then it is very important that the implementation treat it the right way.
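
As an illustration of how this goes wrong, consider the kind of
continuation test a tool might use if it only knows about <nl>.  This
is a deliberately naive sketch; ends_with_splice is a hypothetical
name, not a function from any real preprocessor.

```c
#include <string.h>

/* Hypothetical check: does this physical line end in
   backslash-newline, i.e. should it be spliced onto the next line?
   Only <nl> is treated as the end-of-line indicator here. */
int ends_with_splice(const char *line)
{
    size_t len = strlen(line);

    return len >= 2 && line[len - 2] == '\\' && line[len - 1] == '\n';
}
```

Given "#define foo \\\n" this returns 1, but given "#define foo \\\r\n"
it returns 0, because the character before the <nl> is the <cr> rather
than the backslash.  The two `#define's therefore end up meaning
different things.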

> > I think that cases like this are very common.
> 
> You've managed to name one

OK, let me name some others.

	make
	lex
	yacc
	flex
	bison
	as
	awk
	sed

That's just starting with the usual Unix programmer's development tools.
Is that enough for you?  It might well be easier for me to name the
programs that *don't* care whether there are <cr>s at the end of lines.

> > So, I think the problem with your suggestion is that even though these
> > changes might well be worthy enhancements, the sheer number of changes
> > required would be overwhelming.
> 
> I offered a *path* to a goal.

Yes, the question is which path to the goal is easier.

Patching the above applications to use "t" in fopen() or O_TEXT in open()
is going to be a lot easier, IMHO, than patching them to treat <cr><nl>
the same way they treat <nl>.
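
For the open() case, the patch might look roughly like the sketch
below.  The open_text wrapper is a hypothetical name, and defining
O_TEXT as 0 where the system headers don't provide it is an assumption
made so that the same source still compiles on plain Unix.  (The "t"
flag for fopen() is likewise an extension rather than part of ISO C,
so it is not used here.)

```c
#include <fcntl.h>

/* Assumption: on systems with no text/binary distinction the headers
   don't define O_TEXT, so make it a no-op there. */
#ifndef O_TEXT
#define O_TEXT 0
#endif

/* Hypothetical wrapper: open a file for reading in text mode, so that
   on systems which support it the library does the <cr><nl> to <nl>
   translation and the application only ever sees <nl>. */
int open_text(const char *path)
{
    return open(path, O_RDONLY | O_TEXT);
}
```

The point is that the change is confined to the place where the file
is opened; none of the application's line-handling code has to know
about <cr>.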

-- 
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.