This is the mail archive of the cygwin mailing list for the Cygwin project.



RE: Grep and matching end of line (anchoring)


At Friday, November 19, 2004 1:30 PM, Dave Korn wrote:
>> -----Original Message-----
>> From: cygwin-owner On Behalf Of Buchbinder, Barry (NIH/NIAID)
>> Sent: 19 November 2004 15:17
> 
>> This should work whether or not one is on a text mount and whether
>> the file has DOS or Unix line endings:
>> 
>> 	cat files.txt | grep -E '\.h^M?$'
>
>   Always test before posting.  Even a one liner.  That doesn't work,
> or at least NFM:
> 
> dk@mace /test/grep-test> od -c test.dos.txt
> 0000000   H   e   l   l   o       w   o   r   l   d  \r  \n
> 0000015
> dk@mace /test/grep-test> od -c test.unix.txt
> 0000000   H   e   l   l   o       w   o   r   l   d  \n
> 0000014
> dk@mace /test/grep-test> grep -E 'ld^M?$' *
> dk@mace /test/grep-test> grep -E 'd^M?$' *
> dk@mace /test/grep-test> grep -E '.^M?$' *
> dk@mace /test/grep-test>
> 
>   Grep knows there's a char there, but it won't match it with ^M.
> 
> dk@mace /test/grep-test> grep -E '.$' *
> test.dos.txt:Hello world
> test.unix.txt:Hello world
> dk@mace /test/grep-test> grep -E 'd.$' *
> test.dos.txt:Hello world
> dk@mace /test/grep-test> grep -E 'd^M$' *
> dk@mace /test/grep-test> grep -E 'd^m$' *
> dk@mace /test/grep-test>
> 
> 
>   What makes you think grep understands ^ notation to indicate control
> chars?  It doesn't say so in the info page.  (It doesn't recognize
> [\r] either.)
> 
>   Actually, it seems that grep
> 
>     cheers,
>       DaveK

I tested by cat-ing a batch file, and it worked for me.  I did not type the
two characters "^" and "M".  In bash I entered a literal control-M by
pressing control-V and then <enter>.  The console displayed it as the
two-character sequence ^M, and I simply copied the console screen into the
email.  (Display of \r as ^M might be due to $CYGWIN containing tty -- I
don't know.)
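
For anyone who wants to reproduce this without typing control-V, here is a
minimal sketch that builds the same pattern with a real carriage return in
it.  The file names and sample contents are made up for illustration, and I
have only reasoned about the case where grep sees the \r as an ordinary
character; treat it as a starting point rather than a tested recipe for
every mount type.

```shell
# Scratch directory with two hypothetical sample files.
dir=$(mktemp -d)
printf 'foo.h\nbar.c\n' > "$dir/unix.txt"       # Unix line endings
printf 'foo.h\r\nbar.c\r\n' > "$dir/dos.txt"    # DOS line endings

# Capture a literal carriage return; command substitution strips only
# trailing newlines, so the \r survives.  In bash, cr=$'\r' is equivalent.
cr=$(printf '\r')

# ERE: a literal ".h", then an optional carriage return, then end of line.
pattern='\.h'"$cr"'?$'

# Count matching lines in each file.
grep -c -E "$pattern" "$dir/unix.txt"
grep -c -E "$pattern" "$dir/dos.txt"
```

Each grep should report one matching line (the foo.h entry).  The "?" is
what lets the same pattern cover both cases: it matches when the \r is
present and also when it is absent or has already been stripped.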

During my testing I also discovered that grep does not understand \r.

I used the word "should" because I did not test in all combinations of text
and binary mounts and line endings.  I'm sorry if that choice of word was
too ambiguous or subtle.

I did not think that grep understood ^M -- I assumed that the readers of
this list would recognize it as a literal carriage return.  Personally, I've
never seen the two-character ^M used for inputting a \r.  In my experience
it has always been used to indicate a \r in output or when viewing a file in
a hex editor, so I thought that it would be understood.  I apologize for not
being explicit.
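
As a small illustration of ^M being an output convention, cat -v prints a
carriage return as the two characters ^M.  The sample line below is just the
one from the od dumps quoted above; this is a sketch, not a claim about how
the console itself renders \r.

```shell
# cat -v shows non-printing characters in caret notation,
# so the \r before the newline is displayed as ^M.
printf 'Hello world\r\n' | cat -v
```

That ^M is only how the byte is displayed; the stream itself still contains
a single \r, which is why a typed "^" followed by "M" in a grep pattern does
not match it.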
 

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

