This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: CR-LF handling behavior of SED changed recently - this breaks a lot of MinGW cross build scripts


On 6/9/2017 4:06 AM, Soegtrop, Michael wrote:
> Dear Blake,
> 
>> External products were being lazy by relying on cygwin to strip CR when they
>> should have stripped it themselves.  But 'sed -b' does NOT strip CR (it is the
>> exact opposite, of keeping CR unstripped).
> 
> I think that there are more people who use sed for text processing in a MinGW cygwin cross environment than there are people using sed for binary data - but looking at the mailing list this might be a subjective view. The maintainers of the Linux centric SW could insert dos2unix in all stuff piped into sed, but this would only be required in this very specific build configuration, and Linux SW maintainers frequently argue why they should care about Windows at all. It can take all sorts of philosophical discussions to get such a change in. And I cannot blame them for a "we are free SW people, if you Windowers can make use of our SW without bothering us - go ahead, but please don't drag us into your mud" attitude. In the end one usually gets it done, but the effort is not negligible.
> 

They have every right, and I would most likely do the same if I were in
their position.  As already explained by Erik, the CR needs to be
preserved when data is piped or redirected.  Otherwise you corrupt
the data being used.

> I think that sed, grep and awk are in the end text processing tools, and that they should at least have an environment option to behave like text processing tools in a mixed cygwin MinGW environment. With sed I have several hundred issues building a single application; with grep it was just one in my scripts, which I fixed. With awk, no issues so far.
> 

They may have grown out of the need to process text, but they were
also written as *nix software, where the CR wasn't an issue.  Dealing
with the data is the end user's responsibility, and if the data contains
a character that isn't required then the end user needs to remove it
with other tools.  This isn't anything new: dealing with Windows-produced
files in any *nix environment requires converting from and to
Windows formats.
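
As an illustration of removing the CR "with other tools", here is a
minimal sketch of a filter that can sit in a pipe in front of sed.  The
function name crlf_to_lf and the script name used below are made up for
this example, and it assumes a Python 3 interpreter is available; it
only rewrites CR-LF pairs and leaves a lone CR alone.

```python
import sys


def crlf_to_lf(data: bytes) -> bytes:
    """Convert CR-LF line endings to LF; lone CR bytes are left untouched."""
    return data.replace(b"\r\n", b"\n")


def main() -> None:
    # Read and write raw bytes so that no further newline translation
    # is applied on either side of the filter.
    sys.stdout.buffer.write(crlf_to_lf(sys.stdin.buffer.read()))
    sys.stdout.buffer.flush()


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as crlf_to_lf.py, it would be used
as `some-mingw-tool.exe | python crlf_to_lf.py | sed -e 's/foo/bar/'`.
Where dos2unix or `tr -d '\r'` is available, those do much the same job
(note that tr drops every CR, not only those in front of an LF).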

> Btw.: I don't think that it will be easy or even possible to build detection for MinGW generated output into the cygwin dll as you suggested in a previous post. How should the receiving part of a pipe know what kind of DLLs the sending part has loaded? One could have obscure heuristics to detect "text with cr-lf line endings", but this sounds more like a nightmare than a solution. The only entity who could detect this is the shell, but then you have more than one shell (and more philosophers). As I said, I think on cygwin sed, grep and awk should have an environment option to be MinGW friendly text processors (as they used to be). Other less text centric SW should be unaffected.
> 

No, there should be no such option.

> Honestly my solution to the problem is to build sed from sources with CR stripping. I thought about it a day and came to the conclusion that everything else is a waste of time.
> 

That is your choice; you could even build sed as a Windows binary
instead of a Cygwin binary.  But it would be most beneficial if you
made your Windows applications open their stdio in binary mode
instead of text mode.  Then the CR wouldn't be an issue anywhere in the
pipeline.  Why does your application's stdio need to be in text
mode instead of binary mode?
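
To make the text/binary distinction concrete, here is a small sketch
that models both behaviours in memory.  The helper name emit is made up
for this example; it is only a model of the newline translation, not
code taken from any of the tools discussed in this thread.  A stream
with newline="\r\n" stands in for Windows text-mode stdio, and
newline="\n" stands in for binary mode.

```python
import io


def emit(lines, newline):
    """Write each line plus "\\n" through a text layer; return the raw bytes."""
    raw = io.BytesIO()
    # newline="\r\n" rewrites every "\n" written to "\r\n" (text mode);
    # newline="\n" passes "\n" through unchanged (binary-like mode).
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    for line in lines:
        text.write(line + "\n")
    text.flush()
    return raw.getvalue()


text_mode = emit(["a", "b"], "\r\n")   # what a text-mode stdout produces
binary_mode = emit(["a", "b"], "\n")   # what sed on Cygwin expects to see
```

In a C program built with MinGW, the usual way to get the second
behaviour is to call `_setmode(_fileno(stdout), _O_BINARY)` early in
main(); in a Python 3.7+ program, `sys.stdout.reconfigure(newline="\n")`
has a similar effect.  Whether either fits your applications is for you
to judge.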

-- 
cyg Simple

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

