This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: length in gawk returns wrong value
- From: Corinna Vinschen <corinna-cygwin at cygwin dot com>
- To: cygwin at cygwin dot com
- Date: Thu, 19 Jul 2012 11:20:24 +0200
- Subject: Re: length in gawk returns wrong value
- References: <loom.20120719T103849-659@post.gmane.org>
- Reply-to: cygwin at cygwin dot com
On Jul 19 08:50, Ralf wrote:
> The following lines create a file named ttt.txt. The file ttt.txt contains
> exactly what I want (oct 374 for the umlaut u). But if you look at the output of
> these lines you can see that the function length() of gawk can not handle this
> character:
>
> uname -a
> echo "Rücken" > ttt.txt
> od -c ttt.txt
> gawk '{print "Length: " length($0)}' ttt.txt
>
> Output:
> CYGWIN_NT-6.0-WOW64 WIESWEG 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin
Uh oh. 1.7.9 is old. Please update.
> 0000000 R 374 c k e n \r \n
> 0000010
> Length: 1
>
> What can I do to get the correct length in gawk without changing the contents of
> ttt.txt?
Dunno. This is not what I see. What did you have $LANG and $LC_CTYPE
set to? Here's what I see:
$ uname -a
CYGWIN_NT-6.1 vmbert7 1.7.16(0.261/5/3) 2012-07-09 14:51 i686 Cygwin
$ echo $LANG
C.UTF-8
$ echo "Rücken" > ttt.txt
$ od -c ttt.txt
0000000 R 303 274 c k e n \n
0000010
$ gawk '{print "Length: " length($0)}' ttt.txt
Length: 6
$ gawk --version | head -1
GNU Awk 4.0.1
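The difference between the two runs comes down to the locale: in a UTF-8 locale, length() counts multibyte characters, and a lone byte 374 is not a valid UTF-8 sequence. If the file really has to keep its single-byte encoding, one possible workaround is to run the awk command in the C locale, where every byte counts as one character. The following is only a sketch under that assumption. The `byte_len` variable, the fallback to plain `awk` when `gawk` is not installed, and the use of printf to produce the 374 byte are illustrative choices, not something taken from the original report.

```shell
#!/bin/sh
# Prefer gawk, but fall back to the system awk if gawk is not installed
# (assumption: either one counts bytes when run in the C locale).
if command -v gawk >/dev/null 2>&1; then
    AWK=gawk
else
    AWK=awk
fi

# Emit "R", the single byte 0374 (u-umlaut in ISO-8859-1), then "cken",
# and measure the line in the C locale so that each byte is one character.
byte_len=$(printf 'R\374cken\n' | LC_ALL=C "$AWK" '{ print length($0) }')

echo "Length: $byte_len"
```

Note that this only changes how the bytes are counted; it does not make the tools treat the file as ISO-8859-1. If the file has \r\n line endings, as in the od output of the original report, the \r is counted as well.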
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat