Hi, I am a contributor to the Perl project. https://github.com/Perl/perl5 We bundle your library libbzip2 via the perl Module Compress-Raw-Bzip2, which is hosted at: https://github.com/pmqs/Compress-Raw-Bzip2 We are trying to ensure our builds are warning clean, however when building this package your library produces "may be used uninitialized" warnings on some compilers, you can see these warnings below. While this is almost certainly a false positive warning, it is also trivial to silence the warnings with essentially no performance hit by initializing the cost structure with 0's at the start. We have an open bug report for this for perl here: https://github.com/Perl/perl5/issues/19432 And an open bug report for this for the module here: https://github.com/pmqs/Compress-Raw-Bzip2/issues/4 And a patch for it here: https://github.com/pmqs/Compress-Raw-Bzip2/pull/5/commits/641a440ec6229c1d368b9ead48f4968b955c0115 The maintainer of Compress-Raw-Bzip2 (on CC) is known to prefer not to apply changes to this library code that are not upstream. (A reasonable policy.) I was wondering if you could apply this patch, or something equivalent that would fix this so that the perl project could have warning free builds. Your library is the last module we ship that produces warnings during the build process. We target a wide range of compilers and try to keep our code as clean as possible, and while again I accept that your code likely is not buggy as you initialize it in a loop at line 357, 357: for (t = 0; t < nGroups; t++) cost[t] = 0; as far as I can tell nGroups is not a constant, and the compiler reasonably cannot be certain that all BZ_N_GROUPS entries in the array are initialized before use. We in the perl development community would appreciate it if you would release a patch that would silence these warnings for us. Then Paul could roll a new release of his module, and we in the Perl could then use that and be happy with a completely warning free build. gcc -c -I. -D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Werror=pointer-arith -Werror=vla -Wextra -Wno-long-long -Wno-declaration-after-statement -Wc++-compat -Wwrite-strings -O3 -DVERSION=\"2.101\" -DXS_VERSION=\"2.101\" -fPIC "-I../.." -Wall -Wno-comment -DBZ_NO_STDIO compress.c compress.c: In function ‘BZ2_compressBlock’: compress.c:392:54: warning: ‘cost[5]’ may be used uninitialized in this function [-Wmaybe-uninitialized] for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv]; ^~ compress.c:256:11: note: ‘cost[5]’ was declared here UInt16 cost[BZ_N_GROUPS]; ^~~~ compress.c:402:21: warning: ‘cost[3]’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (cost[t] < bc) { bc = cost[t]; bt = t; }; ~~~~^~~ compress.c:256:11: note: ‘cost[3]’ was declared here UInt16 cost[BZ_N_GROUPS]; ^~~~ compress.c:402:21: warning: ‘cost[2]’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (cost[t] < bc) { bc = cost[t]; bt = t; }; ~~~~^~~ compress.c:256:11: note: ‘cost[2]’ was declared here UInt16 cost[BZ_N_GROUPS]; ^~~~ Your consideration on this matter would be much appreciated. Thank you very much for your time in producing this library, I personally have made use of it many times in my career. cheers, Yves
Sorry for the late reply. I think you are right that initializing these two small (6 ints) arrays at the start of the function is fine if it helps the compiler. I haven't been able to reproduce the compiler warning myself though. My proposed patch would be: diff --git a/compress.c b/compress.c index 5dfa002..c825c78 100644 --- a/compress.c +++ b/compress.c @@ -253,8 +253,8 @@ void sendMTFValues ( EState* s ) --*/ - UInt16 cost[BZ_N_GROUPS]; - Int32 fave[BZ_N_GROUPS]; + UInt16 cost[BZ_N_GROUPS] = { 0 }; + Int32 fave[BZ_N_GROUPS] = { 0 }; UInt16* mtfv = s->mtfv; { 0 } initializes the whole array with zero, so we don't have to know the exact number of elements. Does the above work for you?
A different solution might be to use the constant BZ_N_GROUPS instead of the variable nGroups when resetting the arrays: diff --git a/compress.c b/compress.c index 5dfa002..2dc5dc1 100644 --- a/compress.c +++ b/compress.c @@ -321,7 +321,7 @@ void sendMTFValues ( EState* s ) ---*/ for (iter = 0; iter < BZ_N_ITERS; iter++) { - for (t = 0; t < nGroups; t++) fave[t] = 0; + for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0; for (t = 0; t < nGroups; t++) for (v = 0; v < alphaSize; v++) @@ -353,7 +353,7 @@ void sendMTFValues ( EState* s ) Calculate the cost of this group as coded by each of the coding tables. --*/ - for (t = 0; t < nGroups; t++) cost[t] = 0; + for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0; if (nGroups == 6 && 50 == ge-gs+1) { /*--- fast track the common case ---*/ Which might also help the compiler to simply zero the whole (6 element) arrays in one go (the common case), instead of having to check whether nGroups is smaller in this particular case.
On Wed, 20 Apr 2022 at 00:23, mark at klomp dot org <sourceware-bugzilla@sourceware.org> wrote: > > https://sourceware.org/bugzilla/show_bug.cgi?id=28904 > > --- Comment #2 from Mark Wielaard <mark at klomp dot org> --- > A different solution might be to use the constant BZ_N_GROUPS instead of the > variable nGroups when resetting the arrays: > > diff --git a/compress.c b/compress.c > index 5dfa002..2dc5dc1 100644 > --- a/compress.c > +++ b/compress.c > @@ -321,7 +321,7 @@ void sendMTFValues ( EState* s ) > ---*/ > for (iter = 0; iter < BZ_N_ITERS; iter++) { > > - for (t = 0; t < nGroups; t++) fave[t] = 0; > + for (t = 0; t < BZ_N_GROUPS; t++) fave[t] = 0; > > for (t = 0; t < nGroups; t++) > for (v = 0; v < alphaSize; v++) > @@ -353,7 +353,7 @@ void sendMTFValues ( EState* s ) > Calculate the cost of this group as coded > by each of the coding tables. > --*/ > - for (t = 0; t < nGroups; t++) cost[t] = 0; > + for (t = 0; t < BZ_N_GROUPS; t++) cost[t] = 0; > > if (nGroups == 6 && 50 == ge-gs+1) { > /*--- fast track the common case ---*/ > > Which might also help the compiler to simply zero the whole (6 element) arrays > in one go (the common case), instead of having to check whether nGroups is > smaller in this particular case. I am fine with both of your proposals. I was wondering why you don't use memset() instead of the explicit loop? From what I understand it is faster. Yves
(In reply to demerphq from comment #3) > I am fine with both of your proposals. I was wondering why you don't > use memset() instead of the explicit loop? From what I understand it > is faster. I went with the second option. I am not using memset but let the compiler pick the most efficient way of initializing. With just 6 elements calling memcpy might just be extra overhead. Thanks for the report. commit 9de658d248f9fd304afa3321dd7a9de1280356ec Author: Mark Wielaard <mark@klomp.org> Date: Thu May 26 22:38:01 2022 +0200 Initialize the fave and cost arrays fully We try to be smart in sendMTFValues by initializing just nGroups number of elements instead of all BZ_N_GROUPS elements. But this means the compiler doesn't know all elements are correctly initialized and might warn. The arrays are really small, BZ_N_GROUPS, 6 elements. And nGroups == BZ_N_GROUPS is the common case. So just initialize them all always. Using a constant loop might also help the compiler to optimize the initialization. https://sourceware.org/bugzilla/show_bug.cgi?id=28904
Need to check the code Wait... Thanks Em 19 de abr. de 2022 19:04, mark at klomp dot org via Bzip2-devel <bzip2-devel@sourceware.org> escreveu: https://sourceware.org/bugzilla/show_bug.cgi?id=28904 Mark Wielaard <mark at klomp dot org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |NEW Last reconfirmed| |2022-04-19 Ever confirmed|0 |1 CC| |mark at klomp dot org --- Comment #1 from Mark Wielaard <mark at klomp dot org> --- Sorry for the late reply. I think you are right that initializing these two small (6 ints) arrays at the start of the function is fine if it helps the compiler. I haven't been able to reproduce the compiler warning myself though. My proposed patch would be: diff --git a/compress.c b/compress.c index 5dfa002..c825c78 100644 --- a/compress.c +++ b/compress.c @@ -253,8 +253,8 @@ void sendMTFValues ( EState* s ) --*/ - UInt16 cost[BZ_N_GROUPS]; - Int32 fave[BZ_N_GROUPS]; + UInt16 cost[BZ_N_GROUPS] = { 0 }; + Int32 fave[BZ_N_GROUPS] = { 0 }; UInt16* mtfv = s->mtfv; { 0 } initializes the whole array with zero, so we don't have to know the exact number of elements. Does the above work for you? -- You are receiving this mail because: You are on the CC list for the bug.