Bug 28904 - libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Summary: libbzip2: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-unini...
Status: RESOLVED FIXED
Alias: None
Product: bzip2
Classification: Unclassified
Component: bzip2 (show other bugs)
Version: unspecified
: P2 enhancement
Target Milestone: ---
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-18 03:18 UTC by demerphq
Modified: 2022-05-31 19:06 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2022-04-19 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description demerphq 2022-02-18 03:18:52 UTC
Hi,

I am a contributor to the Perl project. https://github.com/Perl/perl5

We bundle your library libbzip2 via the perl Module Compress-Raw-Bzip2, which is hosted at:

https://github.com/pmqs/Compress-Raw-Bzip2

We are trying to ensure our builds are warning clean, however when building this package your library produces "may be used uninitialized" warnings on some compilers, you can see these warnings below. 

While this is almost certainly a false positive warning, it is also trivial to silence the warnings with essentially no performance hit by initializing the cost structure with 0's at the start.

We have an open bug report for this for perl here:
https://github.com/Perl/perl5/issues/19432

And an open bug report for this for the module here:
https://github.com/pmqs/Compress-Raw-Bzip2/issues/4

And a patch for it here:
https://github.com/pmqs/Compress-Raw-Bzip2/pull/5/commits/641a440ec6229c1d368b9ead48f4968b955c0115

The maintainer of Compress-Raw-Bzip2 (on CC) is known to prefer not to apply changes to this library code that are not upstream. (A reasonable policy.)

I was wondering if you could apply this patch, or something equivalent that would fix this so that the perl project could have warning free builds. Your library is the last module we ship that produces warnings during the build process. We target a wide range of compilers and try to keep our code as clean as possible, and while again I accept that your code likely is not buggy as you initialize it in a loop at line 357, 

357:         for (t = 0; t < nGroups; t++) cost[t] = 0;

as far as I can tell nGroups is not a constant, and the compiler reasonably cannot be certain that all BZ_N_GROUPS entries in the array are initialized before use.  

We in the perl development community would appreciate it if you would release a patch that would silence these warnings for us. Then Paul could roll a new release of his module, and we in the Perl could then use that and be happy with a completely warning free build.

gcc -c  -I. -D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Werror=pointer-arith -Werror=vla -Wextra -Wno-long-long -Wno-declaration-after-statement -Wc++-compat -Wwrite-strings -O3   -DVERSION=\"2.101\" -DXS_VERSION=\"2.101\" -fPIC "-I../.."  -Wall -Wno-comment  -DBZ_NO_STDIO  compress.c
compress.c: In function ‘BZ2_compressBlock’:
compress.c:392:54: warning: ‘cost[5]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
                for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv];
                                                      ^~
compress.c:256:11: note: ‘cost[5]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
compress.c:402:21: warning: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             if (cost[t] < bc) { bc = cost[t]; bt = t; };
                 ~~~~^~~
compress.c:256:11: note: ‘cost[3]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
compress.c:402:21: warning: ‘cost[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             if (cost[t] < bc) { bc = cost[t]; bt = t; };
                 ~~~~^~~
compress.c:256:11: note: ‘cost[2]’ was declared here
    UInt16 cost[BZ_N_GROUPS];
           ^~~~
Your consideration on this matter would be much appreciated. 

Thank you very much for your time in producing this library, I personally have made use of it many times in my career.

cheers,
Yves
Comment 1 Mark Wielaard 2022-04-19 22:04:52 UTC
Sorry for the late reply.

I think you are right that initializing these two small (6 ints) arrays at the start of the function is fine if it helps the compiler.

I haven't been able to reproduce the compiler warning myself though.
My proposed patch would be:

diff --git a/compress.c b/compress.c
index 5dfa002..c825c78 100644
--- a/compress.c
+++ b/compress.c
@@ -253,8 +253,8 @@ void sendMTFValues ( EState* s )
    --*/
 
 
-   UInt16 cost[BZ_N_GROUPS];
-   Int32  fave[BZ_N_GROUPS];
+   UInt16 cost[BZ_N_GROUPS] = { 0 };
+   Int32  fave[BZ_N_GROUPS] = { 0 };
 
    UInt16* mtfv = s->mtfv;
 

{ 0 } initializes the whole array with zero, so we don't have to know the exact number of elements.

Does the above work for you?
Comment 2 Mark Wielaard 2022-04-19 22:23:49 UTC
A different solution might be to use the constant BZ_N_GROUPS instead of the variable nGroups when resetting the arrays:

diff --git a/compress.c b/compress.c
index 5dfa002..2dc5dc1 100644
--- a/compress.c
+++ b/compress.c
@@ -321,7 +321,7 @@ void sendMTFValues ( EState* s )
    ---*/
    for (iter = 0; iter < BZ_N_ITERS; iter++) {
 
-      for (t = 0; t < nGroups; t++) fave[t] = 0;
+      for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0;
 
       for (t = 0; t < nGroups; t++)
          for (v = 0; v < alphaSize; v++)
@@ -353,7 +353,7 @@ void sendMTFValues ( EState* s )
             Calculate the cost of this group as coded
             by each of the coding tables.
          --*/
-         for (t = 0; t < nGroups; t++) cost[t] = 0;
+         for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0;
 
          if (nGroups == 6 && 50 == ge-gs+1) {
             /*--- fast track the common case ---*/

Which might also help the compiler to simply zero the whole (6 element) arrays in one go (the common case), instead of having to check whether nGroups is smaller in this particular case.
Comment 3 demerphq 2022-04-20 01:57:12 UTC
On Wed, 20 Apr 2022 at 00:23, mark at klomp dot org
<sourceware-bugzilla@sourceware.org> wrote:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=28904
>
> --- Comment #2 from Mark Wielaard <mark at klomp dot org> ---
> A different solution might be to use the constant BZ_N_GROUPS instead of the
> variable nGroups when resetting the arrays:
>
> diff --git a/compress.c b/compress.c
> index 5dfa002..2dc5dc1 100644
> --- a/compress.c
> +++ b/compress.c
> @@ -321,7 +321,7 @@ void sendMTFValues ( EState* s )
>     ---*/
>     for (iter = 0; iter < BZ_N_ITERS; iter++) {
>
> -      for (t = 0; t < nGroups; t++) fave[t] = 0;
> +      for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0;
>
>        for (t = 0; t < nGroups; t++)
>           for (v = 0; v < alphaSize; v++)
> @@ -353,7 +353,7 @@ void sendMTFValues ( EState* s )
>              Calculate the cost of this group as coded
>              by each of the coding tables.
>           --*/
> -         for (t = 0; t < nGroups; t++) cost[t] = 0;
> +         for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0;
>
>           if (nGroups == 6 && 50 == ge-gs+1) {
>              /*--- fast track the common case ---*/
>
> Which might also help the compiler to simply zero the whole (6 element) arrays
> in one go (the common case), instead of having to check whether nGroups is
> smaller in this particular case.

I am fine with both of your proposals. I was wondering why you don't
use memset() instead of the explicit loop? From what I understand it
is faster.

Yves
Comment 4 Mark Wielaard 2022-05-26 20:53:44 UTC
(In reply to demerphq from comment #3)
> I am fine with both of your proposals. I was wondering why you don't
> use memset() instead of the explicit loop? From what I understand it
> is faster.

I went with the second option. I am not using memset but let the compiler pick the most efficient way of initializing. With just 6 elements calling memcpy might just be extra overhead. Thanks for the report.

commit 9de658d248f9fd304afa3321dd7a9de1280356ec
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu May 26 22:38:01 2022 +0200

    Initialize the fave and cost arrays fully
    
    We try to be smart in sendMTFValues by initializing just nGroups
    number of elements instead of all BZ_N_GROUPS elements. But this means
    the compiler doesn't know all elements are correctly initialized and
    might warn. The arrays are really small, BZ_N_GROUPS, 6 elements. And
    nGroups == BZ_N_GROUPS is the common case. So just initialize them all
    always. Using a constant loop might also help the compiler to optimize
    the initialization.
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=28904
Comment 5 email 2022-05-31 19:06:56 UTC
   Need to check the code
   Wait...
   Thanks
   Em 19 de abr. de 2022 19:04, mark at klomp dot org via Bzip2-devel
   <bzip2-devel@sourceware.org> escreveu:

     https://sourceware.org/bugzilla/show_bug.cgi?id=28904

     Mark Wielaard <mark at klomp dot org> changed:

     What |Removed |Added
     ----------------------------------------------------------------------------
     Status|UNCONFIRMED |NEW
     Last reconfirmed| |2022-04-19
     Ever confirmed|0 |1
     CC| |mark at klomp dot org

     --- Comment #1 from Mark Wielaard <mark at klomp dot org> ---
     Sorry for the late reply.

     I think you are right that initializing these two small (6 ints)
     arrays at the
     start of the function is fine if it helps the compiler.

     I haven't been able to reproduce the compiler warning myself
     though.
     My proposed patch would be:

     diff --git a/compress.c b/compress.c
     index 5dfa002..c825c78 100644
     --- a/compress.c
     +++ b/compress.c
     @@ -253,8 +253,8 @@ void sendMTFValues ( EState* s )
     --*/


     - UInt16 cost[BZ_N_GROUPS];
     - Int32 fave[BZ_N_GROUPS];
     + UInt16 cost[BZ_N_GROUPS] = { 0 };
     + Int32 fave[BZ_N_GROUPS] = { 0 };

     UInt16* mtfv = s->mtfv;


     { 0 } initializes the whole array with zero, so we don't have to
     know the exact
     number of elements.

     Does the above work for you?

     --
     You are receiving this mail because:
     You are on the CC list for the bug.