This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Re: i18n; wide characters; Guile

To: drepper@ipd.info.uni-karlsruhe.de (Ulrich Drepper)
Subject: Re: i18n; wide characters; Guile
From: Jim Blandy <jimb@red-bean.com>
Date: Sat, 18 Oct 1997 22:27:54 -0400
Cc: Erik Naggum <nobody@naggum.no>, Guile Discussion <guile@cygnus.com>
References: <199710190030.UAA00834@totoro.red-bean.com><x7ra9i375f.fsf@myware.rz.uni-karlsruhe.de>


>The answer can only be UCS4.  It's no surprise that all reasonable
>i18n developers (this excludes those at IBM) use a 32bit type for
>wchar_t.

*ugh*

Can you be more specific about why 32 bits are needed?  Which
character sets does Unicode not accommodate?  Or is that the wrong
question for me to ask?

>This may sound like a big waste of space but if used correctly it
>isn't.  Normally strings are not meant to contain whole text books but
>instead are rather short.  This means there is not that much
>redundancy.  If you need to store large texts you can still fall back
>on a multibyte encoding, perhaps offer several of them so that the
>most effective can be chosen.

This argument is not entirely reassuring to me.  If one thinks mostly
about processing text streams, sure, this is fine.  However, I am more
interested in interactive applications like Emacs, and related things
with wider audiences.  In such applications there are no clear
boundaries at which it is convenient to convert between a dense form,
like UTF-8, and a sparse but consistent form, like UCS2.  An Emacs
buffer must hold large amounts of text, and must also serve as the
operand to editing and searching commands.  It is terribly clumsy to
use a variable-length encoding in buffers.  Since the buffer
representation must be the foundation of all other i18n support, it's
important to get it right.  Doubling the text storage required isn't
so unreasonable; quadrupling it is.
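
To make the storage and indexing trade-off concrete, here is a minimal
sketch in Python.  It is an illustration only, not anything from Guile or
Emacs: the function names are hypothetical, and Python's "utf-16-le" and
"utf-32-le" codecs stand in for UCS2 and UCS4, which matches only when
every character fits in 16 bits.

```python
def storage_bytes(text):
    """Return the number of bytes needed to hold `text` in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # Stand-in for UCS2: identical only if all characters fit in 16 bits.
        "ucs-2": len(text.encode("utf-16-le")),
        # Stand-in for UCS4: always four bytes per character.
        "ucs-4": len(text.encode("utf-32-le")),
    }


def utf8_offset_of_char(data, index):
    """Find the byte offset of character `index` in UTF-8 bytes.

    Characters vary in length, so the only general way is a linear scan
    that counts every byte which is not a continuation byte (10xxxxxx).
    """
    count = 0
    for offset, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # a new character starts at this byte
            if count == index:
                return offset
            count += 1
    raise IndexError(index)


def ucs2_offset_of_char(index):
    """With a fixed two-byte width, the offset is plain arithmetic."""
    return 2 * index


if __name__ == "__main__":
    print(storage_bytes("hello"))
    print(utf8_offset_of_char("a\u00e9 b".encode("utf-8"), 2))
    print(ucs2_offset_of_char(2))
```

For plain ASCII text the three sizes come out in the ratio 1:2:4, which is
the doubling versus quadrupling above.  The scan in utf8_offset_of_char
is the kind of bookkeeping a buffer built on a variable-length encoding
would need for every character-indexed operation.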

Follow-Ups:
- Re: i18n; wide characters; Guile
  - From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
- Re: i18n; wide characters; Guile
  - From: Per Bothner <bothner@cygnus.com>

References:
- i18n; wide characters; Guile
  - From: Jim Blandy <jimb@red-bean.com>
- Re: i18n; wide characters; Guile
  - From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
- Re: i18n; wide characters; Guile
  - From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
