This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: file conversion utility sought: from isolatin (8859-1) to utf8


Ralf Hauser wrote:
> Hi,
>
> Are there any tools like d2u or u2d for UTF-8 for cygwin?
> ...
>
> A starting point might be http://userpage.fu-berlin.de/~ram/pub/pub_kfd8tk88g/perl_unicode_en ?

Not particularly Cygwin-related, but anyway... This is a better start: http://www.perldoc.com/perl5.8.0/lib/Encode.html

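  #!/usr/bin/perl
  # iso2utf8.pl: convert ISO-8859-1 on stdin to UTF-8 on stdout
  use strict;
  use warnings;
  use Encode;

  # Work on raw bytes so no I/O layer touches the data.
  binmode(STDIN);
  binmode(STDOUT);

  while (<STDIN>) {
    print encode("UTF-8", decode("iso-8859-1", $_));
  }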

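Then, with iso2utf8.pl executable and on your PATH:

  #!/bin/sh
  # Convert each file named on the command line into utf8/
  mkdir -p utf8
  for FILE in "$@" ; do iso2utf8.pl < "$FILE" > "utf8/$FILE" ; done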

If you're sure you want in-place, finish off with

  mv utf8/* .

If you need to handle a hierarchy of files, you will need to fiddle with find -print0 | xargs -0, or keep it all in Perl. I'm not a Perl wiz.
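
For what it's worth, the find -print0 | xargs -0 route could look roughly like the sketch below. Treat it as a starting point rather than a tested tool: the function name convert_tree and the CONVERTER variable are made up for this example, and it assumes the converter is a single command (such as iso2utf8.pl above) that reads stdin and writes stdout.

```shell
#!/bin/sh
# Sketch: convert every regular file under SRC, mirroring the
# directory layout under DEST. DEST should live outside SRC.
# CONVERTER is an assumption: any stdin-to-stdout filter on PATH.
CONVERTER=${CONVERTER:-iso2utf8.pl}

convert_tree() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # -print0 / -0 keep file names with spaces or newlines intact.
  (cd "$src" && find . -type f -print0) |
    xargs -0 sh -c '
      src=$1; dest=$2; conv=$3; shift 3
      for f in "$@"; do
        # Recreate the subdirectory, then run the filter on the file.
        mkdir -p "$dest/$(dirname "$f")" &&
        "$conv" < "$src/$f" > "$dest/$f"
      done
    ' sh "$src" "$dest" "$CONVERTER"
}
```

Something like `convert_tree . ../utf8` would then leave the originals alone and write the converted copies next door, so you can check them before moving anything back.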


Cheers, Rob



