This is the mail archive of the cygwin mailing list for the Cygwin project.

Re: Cygwin bash regexp matching doesn't treat "\b" properly

Dave Korn <dave.korn.cygwin <at>> writes:

> $ [[ "foo" =~ [[:\<:]]foo[[:\>:]] ]]; echo $?
> 0
>   (Note that I had to backslash-escape the < and > there.  In other contexts
> that might not be needed.)

But here's something weird with how bash manages quoting inside [[ ]].  If you 
add a subexpression, you no longer need to quote < or >:

$ [[ foo =~ ([[:<:]]foo[[:>:]]) ]]; echo $?
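One way to sidestep the quoting question altogether is to keep the pattern in a variable and expand it unquoted on the right-hand side of =~. The following is only a rough sketch: probe_boundary is a made-up helper name, and what it prints for the [[:<:]] patterns depends entirely on the regex(3) that bash was linked against, so no particular result is claimed here.

```shell
#!/usr/bin/env bash
# probe_boundary is a hypothetical helper, not something bash provides.
# It reports how the local regex(3) treats a pattern against a subject.
probe_boundary() {
    local re=$1 subject=$2
    # An unquoted variable on the right of =~ is used as a regex, so the
    # < and > characters inside it never need shell-level escaping.
    [[ $subject =~ $re ]] 2>/dev/null
    case $? in
        0) echo match ;;
        1) echo no-match ;;
        *) echo unsupported ;;   # status 2: the pattern did not compile
    esac
}

probe_boundary '[[:<:]]foo[[:>:]]' 'foo'
probe_boundary '([[:<:]]foo[[:>:]])' 'foo'
probe_boundary '[[:<:][:>:]]foo' 'foo'
```

The third call is the combined bracket form discussed next; comparing its result with the first two on a given machine shows whether that form is accepted there.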

With further experimentation, it turns out that cygwin's regex(3) does not 
understand [[:<:][:>:]] as a character class that accepts either direction of 
word boundary (for shame).  So, modulo the difference in the number of 
subexpressions, the closest representation of \b becomes:

([[:<:]]|[[:>:]])

and an expression to match words that either end in a or begin with b would be:

$ [[ ' b ' =~ ([a ]([[:<:]]|[[:>:]])[b ]) ]]; echo $?
$ [[ ' ab '  =~ ([a ]([[:<:]]|[[:>:]])[b ]) ]]; echo $?

which looks so much shorter as ([a ]\b[b ])
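
To keep writing the short form, the substitution can be done mechanically before the pattern reaches =~. This is a naive sketch under stated assumptions: expand_b is a hypothetical helper, it replaces every literal \b it sees (including one that follows an escaped backslash), and the result is only useful where regex(3) understands [[:<:]] and [[:>:]].

```shell
#!/usr/bin/env bash
# expand_b is a hypothetical helper: it rewrites each literal \b in a
# pattern into the alternation of the two word-boundary assertions.
expand_b() {
    local wb='([[:<:]]|[[:>:]])'
    # In the unquoted expansion, \\ is one literal backslash, so the
    # search pattern is the two characters \b.
    local out=${1//\\b/$wb}
    printf '%s\n' "$out"
}

re=$(expand_b '([a ]\b[b ])')
printf '%s\n' "$re"    # prints ([a ]([[:<:]]|[[:>:]])[b ])
```

The expanded pattern can then be used as [[ $subject =~ $re ]], with the caveat from above that each \b adds one more subexpression to the match.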

Eric Blake
