Bug 20756

Summary:	[PATCH] Use Unicode wise thousands separator
Product:	glibc	Reporter:	Stanislav Brabec <sbrabec>
Component:	localedata	Assignee:	Mike FABIAN <maiku.fabian>
Status:	RESOLVED FIXED
Severity:	minor	CC:	carlos, libc-locales, maiku.fabian
Priority:	P2	Flags:	fweimer: security-
Version:	unspecified
Target Milestone:	2.27
Host:		Target:
Build:		Last reconfirmed:	2016-11-01 00:00:00
Attachments:	Proposed changes

Description Stanislav Brabec 2016-11-01 19:52:26 UTC

Created attachment 9605
Proposed changes

Many languages use a small gap as the thousands separator.

The thousands separator should be a narrow space, not a plain space. Additionally, a number must not be broken in the middle when a line is wrapped.

The locale data were created back in the age of 8-bit encodings, so most locales use a plain space (incorrect: it allows a line break in the middle of the number) or NBSP (better, but typographically incorrect: the space between groups is too wide).

Now that Unicode is widely supported, we should drop the legacy characters in favor of the correct Unicode character.

Unicode has a dedicated character for this purpose:

NNBSP
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space, typically the width of a thin space or a mid space
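
To make the proposal concrete, here is a minimal sketch of digit grouping with U+202F as the separator. It is only an illustration: `group_thousands` and `NNBSP_UTF8` are hypothetical names, not part of glibc or of the attached patch, and the code assumes UTF-8 output and an input string that contains only decimal digits.

```c
#include <stddef.h>
#include <string.h>

/* UTF-8 encoding of U+202F NARROW NO-BREAK SPACE (NNBSP). */
#define NNBSP_UTF8 "\xE2\x80\xAF"

/*
 * Hypothetical helper, not part of glibc: copy the decimal digit string
 * `digits` into `out`, inserting `sep` between groups of three digits
 * counted from the right.  Returns the number of bytes written (excluding
 * the terminating NUL), or -1 if `out` is too small.
 */
static int group_thousands(const char *digits, const char *sep,
                           char *out, size_t out_size)
{
    size_t n = strlen(digits);
    size_t sep_len = strlen(sep);
    size_t pos = 0;

    for (size_t i = 0; i < n; i++) {
        /* A separator goes before digit i when the digits remaining
           (including digit i) form a whole number of groups of three. */
        if (i > 0 && (n - i) % 3 == 0) {
            if (pos + sep_len >= out_size)
                return -1;
            memcpy(out + pos, sep, sep_len);
            pos += sep_len;
        }
        if (pos + 1 >= out_size)
            return -1;
        out[pos++] = digits[i];
    }
    if (pos >= out_size)
        return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

Calling `group_thousands("1234567", NNBSP_UTF8, buf, sizeof buf)` fills `buf` with the digits 1, 234 and 567 separated by NNBSP. Each separator takes three bytes in UTF-8, so code that assumes a single-byte separator will miscount. A real program would normally take the separator from the locale, for example from `localeconv()->thousands_sep`, instead of hard-coding it.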

Comment 1 Carlos O'Donell 2016-11-01 22:17:49 UTC

(In reply to Stanislav Brabec from comment #0)
> Created attachment 9605 [details]
> Proposed changes
> 
> Many languages use a small gap as the thousands separator.
> 
> The thousands separator should be a narrow space, not a plain space.
> Additionally, a number must not be broken in the middle when a line is
> wrapped.

Agreed.
 
> The locale data were created back in the age of 8-bit encodings, so most
> locales use a plain space (incorrect: it allows a line break in the middle
> of the number) or NBSP (better, but typographically incorrect: the space
> between groups is too wide).
> 
> Now that Unicode is widely supported, we should drop the legacy characters
> in favor of the correct Unicode character.
> 
> Unicode has a dedicated character for this purpose:
> 
> NNBSP
> U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space, typically
> the width of a thin space or a mid space

I would support this change.

NNBSP has been around since Unicode 3.0, so we support it across the board.

Can you please post this to libc-alpha following:
https://sourceware.org/glibc/wiki/Contribution%20checklist

Note that you don't need a copyright assignment for locale data changes as your patch proposes.

Comment 2 Stanislav Brabec 2016-11-02 15:54:34 UTC

Sent to libc-alpha: https://sourceware.org/ml/libc-alpha/2016-11/msg00062.html

Comment 3 Joseph Myers 2017-08-28 16:29:43 UTC

Restoring changes lost in a system crash and the subsequent restore from backup.

https://sourceware.org/ml/glibc-bugs/2017-08/msg00358.html
https://sourceware.org/ml/glibc-bugs/2017-08/msg00359.html