This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?

Corinna Vinschen wrote:
> On Mar 30 13:48, Michael Moser wrote:
>> I need to mangle a file containing "8-bit ASCII" characters (i.e. the
>> file also contains characters in the upper 8-bit range, namely a few
>> umlauts as well as some French accented characters).
>> Strangely enough, the sed version that came as part of Cygwin emits the
>> result of the mangling using 16-bit characters (I believe those are
>> Unicode-16 characters, but I'm not sure; the hex editor shows every
>> second byte as 00, except for the first two bytes, which read FF FE).
> This is very likely not Cygwin's sed.  Do you have another sed in $PATH
> by any chance?  I tried with input files containing German umlauts, and
> sed does not convert to wide characters and does not produce a BOM
> at the start of the file.
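
One way to check for a second sed on the search path is sketched below in Python. The helper name `find_all_on_path` is hypothetical and not from the thread; it simply lists every matching executable in lookup order, so the first entry is the one the shell would normally run.

```python
import os

def find_all_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        # Also try a ".exe" suffix, since Windows builds of sed are named sed.exe.
        for candidate in (name, name + ".exe"):
            full = os.path.join(directory, candidate)
            if os.path.isfile(full) and os.access(full, os.X_OK):
                hits.append(full)
                break  # report each directory at most once
    return hits

if __name__ == "__main__":
    for hit in find_all_on_path("sed"):
        print(hit)
```

If more than one path is printed, the first one is probably the sed that produced the output, and it may not be the one shipped with Cygwin.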

  Another possibility is that WordPad or Notepad has tried to be clever and
gone and unexpectedly saved the original source file in UTF-16.  Did you verify
the original source file in a hex editor too, Michael?
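
For anyone without a hex editor to hand, the same check can be sketched in a few lines of Python. This is only an illustration: the function names `sniff_bom` and `sniff_file` are made up for this example, and the check looks only at the leading bytes of the file (the FF FE pair mentioned above is the UTF-16 little-endian byte order mark).

```python
import codecs

def sniff_bom(first_bytes):
    """Name the byte order mark at the start of `first_bytes`, or return None."""
    # UTF-32 BOMs are not checked here; FF FE 00 00 would be reported as UTF-16 LE.
    known = (
        (codecs.BOM_UTF8, "UTF-8"),
        (codecs.BOM_UTF16_LE, "UTF-16 LE"),  # bytes FF FE
        (codecs.BOM_UTF16_BE, "UTF-16 BE"),  # bytes FE FF
    )
    for bom, label in known:
        if first_bytes.startswith(bom):
            return label
    return None

def sniff_file(path):
    """Read the first few bytes of the file at `path` and check them for a BOM."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))
```

Running `sniff_file` on both the original input and the sed output shows where the BOM first appears. If the input already reports "UTF-16 LE", the editor that saved it is the likelier culprit, not sed.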

