problem with ISO-2022-KR encoder
Ulrich Drepper
drepper@cygnus.com
Mon Dec 20 22:24:00 GMT 1999
Bruno Haible <haible@ilog.fr> writes:
> The glibc-2.1.1 iconv ISO-2022-KR encoder puts an "Esc $ ) C" sequence
> only once, at the beginning of its output, not in every line.
>
> [...] but RFC 1557 says it must appear once in every line
> containing SO characters (rationale: so that if some lines of the text get
> lost, the remaining are still recognizable as Korean).
Where do you read this? The formal description in RFC 1557 is:
    body       = *e-line *1( designator *( e-line / h-line ))
    designator = ESC "$" ")" "C"
    e-line     = *text CRLF
    h-line     = *text 1*( segment *text ) CRLF
    segment    = SO 1*(one-of-94 one-of-94) SI
(Closing parenthesis in `segment' added by me).
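
To make the grammar concrete, here is a minimal sketch that transcribes the
five rules into a Python regular expression and checks a byte string against
`body'. It is an illustration only, not glibc code. The names (TEXT,
is_valid_body, ...) are made up for this example, and `text' is simplified to
"any byte except ESC, SO, SI, CR and LF", which is narrower than the RFC's
own definition.

```python
import re

# Simplifying assumption: 'text' is any byte except ESC, SO, SI, CR, LF.
TEXT = rb"[^\x1b\x0e\x0f\r\n]"
# segment = SO 1*(one-of-94 one-of-94) SI, with one-of-94 taken as 0x21..0x7E
SEGMENT = rb"\x0e(?:[\x21-\x7e]{2})+\x0f"
# e-line = *text CRLF
E_LINE = rb"(?:" + TEXT + rb")*\r\n"
# h-line = *text 1*( segment *text ) CRLF
H_LINE = rb"(?:" + TEXT + rb")*(?:" + SEGMENT + rb"(?:" + TEXT + rb")*)+\r\n"
# designator = ESC "$" ")" "C"
DESIGNATOR = rb"\x1b\$\)C"
# body = *e-line *1( designator *( e-line / h-line ))
BODY = re.compile(
    rb"(?:" + E_LINE + rb")*"
    rb"(?:" + DESIGNATOR + rb"(?:" + E_LINE + rb"|" + H_LINE + rb")*)?"
)


def is_valid_body(data: bytes) -> bool:
    """Return True if data matches the (simplified) 'body' rule as a whole."""
    return BODY.fullmatch(data) is not None
```

Under these assumptions, a body with one leading designator followed by
several h-lines is accepted, while a body that repeats the designator on a
later line is rejected, because neither `text' nor any other rule allows a
second ESC once the optional group has started.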
Of interest is the `body' rule.  It is not recursive.  And since the
*1 in front of
    ( designator *( e-line / h-line ))
means "zero or one time", the designator can occur at most once.  A body
that uses SO therefore contains exactly one designator, and it must come
before the first use of SO, because h-lines are only allowed after it.
Therefore I think the glibc implementation is just fine and RFC 1557
does *not* contradict Ken Lunde.
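
As a rough illustration of the behaviour being defended, the following sketch
writes the designator once at the start of the output and then only switches
with SO/SI. It is not the glibc converter. The function name
encode_iso2022kr is hypothetical, and the sketch assumes that Python's
"euc_kr" codec yields the KS C 5601 code with the high bit set on both bytes,
so clearing that bit gives the 7-bit form used between SO and SI.

```python
def encode_iso2022kr(text: str) -> bytes:
    """Encode text with a single leading designator (illustrative sketch)."""
    out = bytearray(b"\x1b$)C")   # designator, emitted exactly once
    shifted = False               # True while between SO and SI
    for ch in text:
        if ord(ch) < 0x80:
            if shifted:           # return to ASCII before any ASCII byte,
                out += b"\x0f"    # which also covers line ends
                shifted = False
            out += ch.encode("ascii")
        else:
            raw = ch.encode("euc_kr")   # raises UnicodeEncodeError if unmappable
            if len(raw) != 2:
                raise ValueError(f"not a two-byte KS C 5601 character: {ch!r}")
            if not shifted:
                out += b"\x0e"
                shifted = True
            out += bytes(b & 0x7F for b in raw)   # clear the high bit
    if shifted:
        out += b"\x0f"
    return bytes(out)
```

For two lines that each contain Korean text, the output of this sketch holds
the "Esc $ ) C" sequence once, at the very beginning, and an SO ... SI pair on
each line.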
--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------