This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: grep treating my text files as binary!

On 25.12.2014 at 00:16, zzapper wrote:
Eric Blake wrote:

You upgraded grep.  This is an intentional change in behavior in the
newest grep.  Work around it by using 'grep -a' or 'LC_ALL=C grep'.
Eric had further written:
Basically, the POSIX definition of a binary file includes any file that
is encoded incorrectly for the current locale, and since your current
locale is (probably) UTF-8 encoding, any file (such as note.html) that
assumes some other encoding (probably Latin-1 8-bit encoding) will be
treated as binary unless you request -a or change locales.
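To illustrate the point about encodings, here is a minimal sketch in Python of why a Latin-1 file can look "binary" under a UTF-8 locale. The helper name `looks_binary_in_utf8` and the sample text are hypothetical, and the check is only a rough stand-in for what grep does; it is not grep's actual implementation (grep also considers other things, such as NUL bytes).

```python
def looks_binary_in_utf8(data: bytes) -> bool:
    """Return True if data cannot be decoded as UTF-8.

    Hypothetical helper: a simplified approximation of the
    "encoded incorrectly for the current locale" test, assuming
    the current locale uses UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# The same text saved in two encodings (sample text is made up).
text = "Grüße aus note.html\n"
latin1_line = text.encode("latin-1")  # 'ü' becomes the single byte 0xFC
utf8_line = text.encode("utf-8")      # 'ü' becomes the two bytes 0xC3 0xBC

print(looks_binary_in_utf8(latin1_line))  # True: 0xFC is not valid UTF-8
print(looks_binary_in_utf8(utf8_line))    # False: decodes cleanly
```

Pure ASCII content passes this check in either case, which is why only files with 8-bit characters in a non-UTF-8 encoding are affected.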
Thanks Eric, just surprised not to see more people bleating about this
- it resisted my Googling skills!
I had actually complained about this nonsense on the grep bug channel (a
mailing list), and Eric responded there. My further reply is still
pending, so let me put it here for now:
I've already read the POSIX definition of "binary file" that was quoted
in the grep bug, and if I remember correctly (or however that is
abbreviated here...) it does not mention character encoding or locale.
In any case, the argument is quite artificial, since the new behaviour
hits many files that are in fact text files.
It is therefore very undesirable from any reasonable user's point of
view, which should be the guideline for software design rather than
dogmatic locale theories. I maintain that this is a serious flaw in grep
and I hope it will be reverted.
