20756 – [PATCH] Use Unicode wise thousands separator

Summary: [PATCH] Use Unicode wise thousands separator

Status:	RESOLVED FIXED

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	localedata
Version:	unspecified

Importance:	P2 minor
Target Milestone:	2.27
Assignee:	Mike FABIAN

URL:
Keywords:

Depends on:
Blocks:

Reported:	2016-11-01 19:52 UTC by Stanislav Brabec
Modified:	2017-08-28 16:29 UTC
CC List:	3 users

See Also:
Host:
Target:
Build:
Last reconfirmed:	2016-11-01 00:00:00

Flags:	fweimer: security-

Attachments
Proposed changes (3.66 KB, patch) 2016-11-01 19:52 UTC, Stanislav Brabec

Description Stanislav Brabec 2016-11-01 19:52:26 UTC

Created attachment 9605
Proposed changes

Many languages use a small gap as the thousands separator.

The thousands separator should be a narrow space rather than a plain space. In addition, a number must not be broken across lines when text is wrapped.

The locale data were created in the era of 8-bit encodings, so most locales use either a plain space (incorrect: it allows a line break in the middle of the number) or NBSP (better, but typographically incorrect: the gap between groups is too wide).

Unicode is now widely supported, so we should drop the legacy characters in favor of the correct Unicode character.

Unicode has a dedicated character for this purpose:

NNBSP
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space, typically the width of a thin space or a mid space
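
To illustrate the effect on formatted output, here is a minimal sketch in C. It is not taken from the attached patch and does not use the locale machinery: group_thousands is a hypothetical helper that inserts a caller-supplied separator between groups of three digits, and NNBSP_UTF8 is an assumed macro name for the UTF-8 encoding of U+202F (bytes E2 80 AF).

```c
#include <stdio.h>
#include <string.h>

/* UTF-8 encoding of U+202F NARROW NO-BREAK SPACE (assumed macro name). */
#define NNBSP_UTF8 "\xE2\x80\xAF"

/* Hypothetical helper, not part of glibc: write the decimal form of
   `value` into `out`, inserting `sep` between groups of three digits.
   Returns 0 on success, -1 if formatting fails or `out` is too small. */
static int group_thousands(unsigned long long value, const char *sep,
                           char *out, size_t out_size)
{
    char digits[32];
    int n = snprintf(digits, sizeof digits, "%llu", value);
    if (n < 0 || (size_t)n >= sizeof digits)
        return -1;

    size_t sep_len = strlen(sep);
    size_t groups = (size_t)(n - 1) / 3;   /* number of separators needed */
    size_t needed = (size_t)n + groups * sep_len + 1;
    if (needed > out_size)
        return -1;

    size_t pos = 0;
    for (int i = 0; i < n; i++) {
        /* A separator precedes digit i when the number of digits still
           to be written is a multiple of three. */
        if (i > 0 && (n - i) % 3 == 0) {
            memcpy(out + pos, sep, sep_len);
            pos += sep_len;
        }
        out[pos++] = digits[i];
    }
    out[pos] = '\0';
    return 0;
}
```

Calling group_thousands(1234567, NNBSP_UTF8, buf, sizeof buf) yields the digits 1, 234 and 567 joined by U+202F, so a renderer that honors the character shows narrow gaps and does not break the number across lines. Passing " " instead reproduces the plain-space behavior criticized above.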

Comment 1 Carlos O'Donell 2016-11-01 22:17:49 UTC

(In reply to Stanislav Brabec from comment #0)
> Created attachment 9605 [details]
> Proposed changes
> 
> Many languages use a small gap as the thousands separator.
> 
> The thousands separator should be a narrow space rather than a plain space.
> In addition, a number must not be broken across lines when text is
> wrapped.

Agreed.
 
> The locale data were created in the era of 8-bit encodings, so most locales
> use either a plain space (incorrect: it allows a line break in the middle of
> the number) or NBSP (better, but typographically incorrect: the gap between
> groups is too wide).
> 
> Unicode is now widely supported, so we should drop the legacy characters in
> favor of the correct Unicode character.
> 
> Unicode has a dedicated character for this purpose:
> 
> NNBSP
> U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space, typically
> the width of a thin space or a mid space

I would support this change.

NNBSP has been around since Unicode 3.0, so we support it across the board.

Can you please post this to libc-alpha, following the contribution checklist:
https://sourceware.org/glibc/wiki/Contribution%20checklist

Note that you don't need a copyright assignment for locale data changes such as the ones your patch proposes.

Comment 2 Stanislav Brabec 2016-11-02 15:54:34 UTC

Sent to libc-alpha: https://sourceware.org/ml/libc-alpha/2016-11/msg00062.html

Comment 3 Joseph Myers 2017-08-28 16:29:43 UTC

Restoring changes lost in a system crash and the subsequent restore from backup.

https://sourceware.org/ml/glibc-bugs/2017-08/msg00358.html
https://sourceware.org/ml/glibc-bugs/2017-08/msg00359.html