In the past, it was common in Linux kernel code to have a "fake flex array" at the end of a structure. Like this: struct foo { int x; int y; int end[1]; }; In recent years, with improved compiler support, real flex arrays have been preferred. It's common to see patches like this: https://github.com/torvalds/linux/commit/c6f2e6b6eaaf883df482cb94f302acad9b80a2a4 Basically, this takes the struct above, and changes it to: struct foo { int x; int y; int end[]; }; abidiff flags this change with: [C] 'struct foo' changed: type size changed from 96 to 64 (in bits) 1 data member change: type of 'int end[1]' changed: type name changed from 'int[1]' to 'int[]' array type size changed from 32 to 'unknown' array type subrange 1 changed length from 1 to 'unknown' It would be good to have a suppression to filter out this kind of change. For example: [suppress_type] type_kind = struct has_size_change = true has_strict_flexible_array_data_member_conversion = true
I have started a patch in the branch https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/PR31017 to support this. It's still dry at the moment, but I believe it should support the example provided in this feature request. I'll be polishing it and prepare it for submission. Please feel free to try it and tell me what rough edges you think I should polish more.
Thanks for the implementation! I think this looks great. I tested it and it seems to be working properly for our use case in the kernel! One question about this block: + // Support for the + // "has_strict_flexible_array_data_member_conversion = true" + // clause. + if (has_strict_fam_conversion()) + { + // Let's detect if the first class of the diff has a fake + // flexible array data member that got turned into a real + // flexible array data member. + if (!( + (has_fake_flexible_array_data_member(first_class) + && has_flexible_array_data_member(second_class)) + // A fake flexible array member has been changed into + // a real flexible array ... + && + ((first_class->get_size_in_bits() + == second_class->get_size_in_bits()) + || get_has_size_change()) + // There was no size change or the suppression has a + // "has_size_change = true" clause. + )) + return false; + } Is it possible for a structure to meet the first condition (fake flex -> flex) *without* a size change? I'd think not, but may be missing something. Basically, I think you can get rid of the first_class->get_size_in_bits() == second_class->get_size_in_bits() check. Also, if we know a size change is a tautology, you could move the get_has_size_change() check to be first and save a few CPU cycles. Other than that, LGTM! I went ahead and made that change and added (at least a start) on the documentation/tests. I don't have access to make a branch, so I just sent a patch separately. We can continue discussion there as needed.
"quic_johmoo at quicinc dot com" <sourceware-bugzilla@sourceware.org> a écrit: > https://sourceware.org/bugzilla/show_bug.cgi?id=31017 > > --- Comment #2 from John Moon <quic_johmoo at quicinc dot com> --- > > Thanks for the implementation! I think this looks great. I tested it and it > seems to be working properly for our use case in the kernel! > > One question about this block: > > + // Support for the > + // "has_strict_flexible_array_data_member_conversion = true" > + // clause. > + if (has_strict_fam_conversion()) > + { > + // Let's detect if the first class of the diff has a fake > + // flexible array data member that got turned into a real > + // flexible array data member. > + if (!( > + (has_fake_flexible_array_data_member(first_class) > + && has_flexible_array_data_member(second_class)) > + // A fake flexible array member has been changed into > + // a real flexible array ... > + && > + ((first_class->get_size_in_bits() > + == second_class->get_size_in_bits()) > + || get_has_size_change()) > + // There was no size change or the suppression has a > + // "has_size_change = true" clause. > + )) > + return false; > + } > > Is it possible for a structure to meet the first condition (fake flex -> flex) > *without* a size change? As a general rule, suppression specifications (aka supprspecs) don't apply to a type which size has changed, unless the user /really/ wants the supprspec to apply. If she really wants it, then she has to explicitly say "has_size_change = yes". This is prevents "too eager" supprspecs to be applied without the user noticing the supprspecs is too eager. Basically, if a type's size changed, more often than not, we don't want to suppress its change report, for obvious reasons. That is why, throughout type_suppression::suppresses_diff you see the careful attention to the size change condition, when evaluating supprspecs. In this particular case, I think that we can have "fake flex -> flex" changes without a size change because there can other /additional/ changes that counter the size change we would have expected. That change could have been introduced, on purpose, to keep the ABI stable. For instance: $ diff -u test-PR31017-2-v0.c test-PR31017-2-v1.c --- test-PR31017-2-v0.c 2023-11-05 10:04:36.433000539 +0100 +++ test-PR31017-2-v1.c 2023-11-05 10:07:00.579810832 +0100 @@ -2,7 +2,8 @@ { int x; int y; - int end[1]; + int padding; + int end[]; }; $ /home/dodji/git/libabigail/PR31017/build/tools/abidiff test-PR31017-2-v0.o test-PR31017-2-v1.o Functions changes summary: 0 Removed, 1 Changed, 0 Added function Variables changes summary: 0 Removed, 0 Changed, 0 Added variable 1 function with some indirect sub-type change: [C] 'function void fun(foo*)' at test-PR31017-2-v0.c:9:1 has some indirect sub-type changes: parameter 1 of type 'foo*' has sub-type changes: in pointed to type 'struct foo' at test-PR31017-2-v1.c:1:1: type size hasn't changed 1 data member insertion: 'int padding', at offset 64 (in bits) at test-PR31017-2-v1.c:5:1 1 data member change: type of 'int end[1]' changed: type name changed from 'int[1]' to 'int[]' array type size changed from 32 to 'unknown' array type subrange 1 changed length from 1 to 'unknown' and offset changed from 64 to 96 (in bits) (by +32 bits) $ One reason why I think it's important to keep this "rigour" with the type size change thing is that abidiff actually returns a code that is a bit field that tells callers about the categories of the changes it encountered. From https://sourceware.org/libabigail/manual/abidiff.html#return-values, we see that if the ABIDIFF_ABI_CHANGE bit is set, it means there were some ABI changes. But then if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit is set, it means abidiff is 100% sure that at least of the changes causes an ABI incompatibility. In a continuous integration context, for instance, if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit is set, it means we are sure the change is incompatible, whereas if only the ABIDIFF_ABI_CHANGE bit is set, it means the change might or might not incompatible and thus needs a human review to decide. To wraps this all up, I'd say that only changes that would NOT set the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit should be able to be suppressed, unless the user really knows what she is doing. Thinking about this, maybe check-uapi.sh could use the return code of abidiff rather than grepping its output. check-uapi.sh would then only reject changes categorically only if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit is set. If only ABIDIFF_ABI_CHANGE bit is set, check-uapi.sh would just propose the user to review the changes detected and possibly waive them. One result of the waiving process would thus be a new supprspec written and added to the stock of supprspecs to make sure the same kind of reviews is not requested in the future. With time, if a supprspec is recognized to be needed for this particular project, libabigail can even integrate it and install it by default for that project to use. We do this for various projects and their default supprspecs are included in the default.abignore file that is installed libabigail. You can browse it at https://sourceware.org/git/?p=libabigail.git;a=blob;f=default.abignore and learn about what project requested default supprspecs. > I'd think not, but may be missing something. I hope my explanation above helps shed some light in this apparently weird way of doing things. > Basically, I think you can get rid of the first_class->get_size_in_bits() == > second_class->get_size_in_bits() check. I would rather keep it, at least for the sake of consistency in the behaviour supprspecs evaluation in general, especially with the unwritten rule: "only changes that would NOT set the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit should be able to be suppressed, unless the user really knows what she is doing." I guess we need (better) documentation about all this :-( > Also, if we know a size change is a tautology, you could move the > get_has_size_change() check to be first and save a few CPU cycles. > > Other than that, LGTM! > > I went ahead and made that change and added (at least a start) on the > documentation/tests. I don't have access to make a branch, so I just sent a > patch separately. We can continue discussion there as needed. Thanks a lot for moving forward on this! I'll wait for your feedback on these comments and we can proceed with merging your patch accordingly. In the mean time, if I have comments, I'll follow-up on the patch thread, indeed.
On 11/5/2023 1:43 AM, Dodji Seketeli wrote: > "quic_johmoo at quicinc dot com" <sourceware-bugzilla@sourceware.org> a > écrit: > >> https://sourceware.org/bugzilla/show_bug.cgi?id=31017 >> >> --- Comment #2 from John Moon <quic_johmoo at quicinc dot com> --- >> >> Thanks for the implementation! I think this looks great. I tested it and it >> seems to be working properly for our use case in the kernel! >> >> One question about this block: >> >> + // Support for the >> + // "has_strict_flexible_array_data_member_conversion = true" >> + // clause. >> + if (has_strict_fam_conversion()) >> + { >> + // Let's detect if the first class of the diff has a fake >> + // flexible array data member that got turned into a real >> + // flexible array data member. >> + if (!( >> + (has_fake_flexible_array_data_member(first_class) >> + && has_flexible_array_data_member(second_class)) >> + // A fake flexible array member has been changed into >> + // a real flexible array ... >> + && >> + ((first_class->get_size_in_bits() >> + == second_class->get_size_in_bits()) >> + || get_has_size_change()) >> + // There was no size change or the suppression has a >> + // "has_size_change = true" clause. >> + )) >> + return false; >> + } >> >> Is it possible for a structure to meet the first condition (fake flex -> flex) >> *without* a size change? > > As a general rule, suppression specifications (aka supprspecs) don't apply to a type > which size has changed, unless the user /really/ wants the supprspec to > apply. If she really wants it, then she has to explicitly say > "has_size_change = yes". This is prevents "too eager" supprspecs to be > applied without the user noticing the supprspecs is too eager. > > Basically, if a type's size changed, more often than not, we don't want > to suppress its change report, for obvious reasons. > > That is why, throughout type_suppression::suppresses_diff you see the > careful attention to the size change condition, when evaluating > supprspecs. Right, I wasn't suggesting to apply the suppression even without the "has_size_change = true" clause. I just thought we could avoid the check as it was always true. > > In this particular case, I think that we can have "fake flex -> flex" > changes without a size change because there can other /additional/ > changes that counter the size change we would have expected. That > change could have been introduced, on purpose, to keep the ABI stable. > For instance: > > $ diff -u test-PR31017-2-v0.c test-PR31017-2-v1.c > --- test-PR31017-2-v0.c 2023-11-05 10:04:36.433000539 +0100 > +++ test-PR31017-2-v1.c 2023-11-05 10:07:00.579810832 +0100 > @@ -2,7 +2,8 @@ > { > int x; > int y; > - int end[1]; > + int padding; > + int end[]; > }; > > $ /home/dodji/git/libabigail/PR31017/build/tools/abidiff test-PR31017-2-v0.o test-PR31017-2-v1.o > Functions changes summary: 0 Removed, 1 Changed, 0 Added function > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable > > 1 function with some indirect sub-type change: > > [C] 'function void fun(foo*)' at test-PR31017-2-v0.c:9:1 has some indirect sub-type changes: > parameter 1 of type 'foo*' has sub-type changes: > in pointed to type 'struct foo' at test-PR31017-2-v1.c:1:1: > type size hasn't changed > 1 data member insertion: > 'int padding', at offset 64 (in bits) at test-PR31017-2-v1.c:5:1 > 1 data member change: > type of 'int end[1]' changed: > type name changed from 'int[1]' to 'int[]' > array type size changed from 32 to 'unknown' > array type subrange 1 changed length from 1 to 'unknown' > and offset changed from 64 to 96 (in bits) (by +32 bits) > > $ > And this proves I was wrong! :) I thought libabigail would consider the whole structure size as "unknown" if there was a flex array at the end, but this is clearly not the case. It just takes the size of the known structs (as the compiler does). > One reason why I think it's important to keep this "rigour" with the > type size change thing is that abidiff actually returns a code that is a > bit field that tells callers about the categories of the changes it > encountered. From > https://sourceware.org/libabigail/manual/abidiff.html#return-values, we > see that if the ABIDIFF_ABI_CHANGE bit is set, it means there were some > ABI changes. But then if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit is > set, it means abidiff is 100% sure that at least of the changes causes > an ABI incompatibility. In a continuous integration context, for > instance, if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit is set, it means we > are sure the change is incompatible, whereas if only the > ABIDIFF_ABI_CHANGE bit is set, it means the change might or might not > incompatible and thus needs a human review to decide. > > To wraps this all up, I'd say that only changes that would NOT set the > ABIDIFF_ABI_INCOMPATIBLE_CHANGE bit should be able to be suppressed, > unless the user really knows what she is doing. > > Thinking about this, maybe check-uapi.sh could use the return code of > abidiff rather than grepping its output. check-uapi.sh would then only > reject changes categorically only if the ABIDIFF_ABI_INCOMPATIBLE_CHANGE > bit is set. If only ABIDIFF_ABI_CHANGE bit is set, check-uapi.sh would > just propose the user to review the changes detected and possibly waive > them. One result of the waiving process would thus be a new supprspec > written and added to the stock of supprspecs to make sure the same kind > of reviews is not requested in the future. Agreed, and in v6 of the script, we do this! If you pass the flag "-i" to the script, it will ignore abidiff results when return code is 4 (ABIDIFF_ABI_CHANGE, but not ABIDIFF_ABI_INCOMPATIBLE_CHANGE). > > With time, if a supprspec is recognized to be needed for this particular > project, libabigail can even integrate it and install it by default for > that project to use. We do this for various projects and their default > supprspecs are included in the default.abignore file that is installed > libabigail. You can browse it at > https://sourceware.org/git/?p=libabigail.git;a=blob;f=default.abignore > and learn about what project requested default supprspecs. > >> I'd think not, but may be missing something. > > I hope my explanation above helps shed some light in this apparently > weird way of doing things. It certainly did, thank you! > >> Basically, I think you can get rid of the first_class->get_size_in_bits() == >> second_class->get_size_in_bits() check. > > I would rather keep it, at least for the sake of consistency in the > behaviour supprspecs evaluation in general, especially with the > unwritten rule: > > "only changes that would NOT set the ABIDIFF_ABI_INCOMPATIBLE_CHANGE > bit should be able to be suppressed, unless the user really knows > what she is doing." > > I guess we need (better) documentation about all this :-( I think the documentation is clear, I just made an incorrect assumption. > >> Also, if we know a size change is a tautology, you could move the >> get_has_size_change() check to be first and save a few CPU cycles. >> >> Other than that, LGTM! >> >> I went ahead and made that change and added (at least a start) on the >> documentation/tests. I don't have access to make a branch, so I just sent a >> patch separately. We can continue discussion there as needed. > > Thanks a lot for moving forward on this! > > I'll wait for your feedback on these comments and we can proceed with > merging your patch accordingly. In the mean time, if I have comments, > I'll follow-up on the patch thread, indeed. > Sounds good, thank you! - John
I have reviewed and amended the patch at https://inbox.sourceware.org/libabigail/8734xh3j1o.fsf@seketeli.org/. It works for me, but I am waiting for your feedback to apply it to the master branch of the Git repository. Many thanks!
Thanks for pushing it across the finish line! I just tested it and seems to be working correctly for me. 👍
The patch was applied to the master branch of the git repository at https://sourceware.org/git/?p=libabigail.git;a=commit;h=89ab39de785db5a06e050a3e62bd88e980ab3346 and should be available in libabigail 2.5. Thank you for filling this enhancement request!