This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: bz1311954 - multilib variations in LC_COLLATE files, with fixes

From: Rafal Luzynski <digitalfreak at lingonborough dot com>
To: DJ Delorie <dj at redhat dot com>, libc-alpha at sourceware dot org
Date: Wed, 20 Mar 2019 21:36:00 +0100 (CET)
Subject: Re: bz1311954 - multilib variations in LC_COLLATE files, with fixes
References: <xnlg19v1wb.fsf@greed.delorie.com>

20.03.2019 19:25 DJ Delorie <dj@redhat.com> wrote:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1311954
> 
> Fedora BZ 1311954 saw that the locale archives were not the same
> across all builds; partly that was due to big vs little endian, but
> there were differences between 32 and 64-bit on both s390 and x86.

When this bug is fixed, will the locale archives be the same across
all builds?

> In locale/programs/ld-collate.c we see this:
> 
>   /* Add 40% and find the next prime number.  */
>   elem_size = next_prime (elem_size * 1.4);
> 
> After debugging this for a week, it turned out that when elem_size is
> 120, "elem_size * 1.4" is 168 - but not exactly 168.  Since 1.4 isn't
> exactly representable in IEEE, the result was either 168.00000001 or
> 167.9999999 - and it turns out that 167 *is prime* so the next_prime()
> call returned completely different results depending on the FPU and
> rounding.

Thanks for finding this, I appreciate it.

Well, many decimal fractions (1.4 among them) have an infinite expansion
in binary, so binary floating point can only approximate them.

> The solution is to avoid floating point math.
> 
> Since elem_size is limited by locale limits, overflow isn't much of a
> problem.  There are a couple of simple changes, but which to choose?
> 
> /* Same value as before, but "/10" isn't exact */
>   elem_size = next_prime (elem_size * 14/10);

What about:

  elem_size = next_prime (elem_size * 7/5);

or

  elem_size = next_prime (elem_size + elem_size * 2/5);

> /* Slightly different value (37.5%) but now it's exact */
>   elem_size = next_prime (elem_size * 11/8);
> 
> /* Same as above, but without extra chance of overflow */
>   elem_size = next_prime (elem_size + (elem_size>>2) + (elem_size>>3));

If it does not have to be exactly 1.4 then what about:

  elem_size = next_prime (elem_size + elem_size / 2);

which is 1.5?

> Note this math also happens in ./iconv/iconvconfig.c:
>   hash_size = next_prime (nnames * 1.4); 

I guess this will need the same update.

Regards,

Rafal

Follow-Ups:
- Re: bz1311954 - multilib variations in LC_COLLATE files, with fixes
  - From: Carlos O'Donell
- Re: bz1311954 - multilib variations in LC_COLLATE files, with fixes
  - From: DJ Delorie

References:
- bz1311954 - multilib variations in LC_COLLATE files, with fixes
  - From: DJ Delorie
