This is sources Bugzilla
Bugzilla Version 2.17.5
Bugzilla Bug 4403
  strfry() gives skewed distributions Last modified: 2009-05-10 08:03:52
     Query page      Enter new bug
Bug#: 4403   Hardware:   Reporter: Aurelien Jarno <aurelien@aurel32.net>
Host: Target: Build:
Product:     Add CC:
Component:   Version:   CC:
Remove selected CCs
Status: RESOLVED   Priority:  
Resolution: FIXED   Severity:  
Assigned To: Ulrich Drepper <drepper@redhat.com>   Target Milestone:  
Flags: Requestee:
  backport ()
  examined ()
  testsuite ()
Summary:
Keywords:

Attachment Description Type Created Actions
Create a New Attachment (proposed patch, testcase, etc.) View All

Bug 4403 depends on: Show dependency tree
Show dependency graph
Bug 4403 blocks:

Additional Comments:


Leave as RESOLVED FIXED
Reopen bug
Mark bug as VERIFIED

View Bug Activity   |   Format For Printing


Description:   Last confirmed: 0000-00-00 00:00 Opened: 2007-04-21 22:36
strfry() tries to shuffle its string using random swaps, but it uses the
wrong strategy, and thus not all permutations are equally likely. The
code doing the shuffling itself looks like this:

  len = strlen (string);
  for (i = 0; i < len; ++i)
    {
      int32_t j;
      char c;

      __random_r (&rdata, &j);
      j %= len;

      c = string[i];
      string[i] = string[j];
      string[j] = c;
    }

In other words, for the string 'abc' j will always be between 0 and 2
(inclusive), giving the following possibilities (all equally likely):

  j0 j1 j2  result
   0  0  0  cab
   0  0  1  bca
   0  0  2  bac
   0  1  0  cba
   0  1  1  acb
   0  1  2  abc
   0  2  0  bca
   0  2  1  abc
   0  2  2  acb
   1  0  0  cba
   1  0  1  acb
   1  0  2  abc
   1  1  0  cab
   1  1  1  bca
   1  1  2  bac
   1  2  0  acb
   1  2  1  bac
   1  2  2  bca
   2  0  0  acb
   2  0  1  bac
   2  0  2  bca
   2  1  0  abc
   2  1  1  cab
   2  1  2  cba
   2  2  0  bac
   2  2  1  cba
   2  2  2  cab

Sorting and counting gives us the following distribution:

   4  abc
   5  acb
   5  bac
   5  bca
   4  cab
   4  cba

In other words, this is clearly skewed; some strings will appear 25% more often
than others.

Steinar H. Gunderson <sgunderson@bigfoot.com> proposed the following patch:

--- string/strfry.c     2007-04-21 23:12:47.000000000 +0200
+++ string/strfry.c     2007-04-21 23:22:46.000000000 +0200
@@ -38,17 +38,17 @@
     }

   len = strlen (string);
-  for (i = 0; i < len; ++i)
+  for (i = 0; i < len - 1; ++i)
     {
       int32_t j;
       char c;

       __random_r (&rdata, &j);
-      j %= len;
+      j %= (len - i);

       c = string[i];
-      string[i] = string[j];
-      string[j] = c;
+      string[i] = string[j + i];
+      string[j + i] = c;
     }

   return string;

It turns strfry() into a proper Fisher-Yates shuffle. This gives exactly
n! paths for a string with N characters, and since there are n! possible
permutations, this means a one-to-one mapping, and all possibilities are
equally likely.

------- Additional Comment #1 From Ulrich Drepper 2007-05-08 04:28 -------
This function is a joke.  Don't you have better things to do?  It's not worth
arguing about and so to safe me time I added a modifeed versson of the patch.

------- Additional Comment #2 From Aurelien Jarno 2007-05-20 21:48 -------
Your version is still not right, as now some strings could not appear anymore.

For example for the string "abc" the strings "abc" and "acb" could never 
appear.

The version already attached to this bug report returns all cases with a 
correct distribution.

------- Additional Comment #3 From Ulrich Drepper 2007-08-22 07:29 -------
Stop wasting people's time!  Nobody cares about this crap.  I made a last change
but it's just too embarasing to even admit that.

------- Additional Comment #4 From Evan Carroll 2009-05-09 17:00 -------
(In reply to comment #3)
> Stop wasting people's time!  Nobody cares about this crap.  I made a last change
> but it's just too embarasing to even admit that.

Refusing someone else's better free code without reason? You're a dick. You--

------- Additional Comment #5 From Petr Baudis 2009-05-10 08:03 -------
Please do not reopen bugs spuriously, thank you.

     Query page      Enter new bug
Actions: New | Query | bug # | Reports | Requests   New Account | Log In