Summary: | Support conservative use of a-z or A-Z. | ||
---|---|---|---|
Product: | glibc | Reporter: | Zorro Lang <zlang> |
Component: | glob | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED DUPLICATE | ||
Severity: | normal | CC: | adhemerval.zanella, carlos, fweimer, rjones |
Priority: | P2 | Flags: | fweimer: security- |
Version: | 2.27 | ||
Target Milestone: | --- | ||
See Also: |
https://bugzilla.redhat.com/show_bug.cgi?id=1601681 https://sourceware.org/bugzilla/show_bug.cgi?id=23393 |
||
Host: | Target: | ||
Build: | Last reconfirmed: |
Description
Zorro Lang
2018-07-17 07:20:37 UTC
Comment 1
Adhemerval Zanella
I can't reproduce it with GNU Make 4.2.1 on either master (ba2ea23) or the 2.27 release branch (8623cfe). The only difference from the described steps to reproduce is that I configured with a prefix path (--prefix=/tmp/xfstests-dev/install).

We recently fixed some glob issues (BZ #1062, BZ #19971, BZ #866, BZ #1062, BZ #22320, BZ #22332), which might be related, and we also provide a compat symbol with the old semantics related to BZ #866. So I tried removing the compat version on master to check whether the glob implementation is involved, but even when forcing GNU make to use the newer implementation I couldn't reproduce the issue.

I am not sure which commit Fedora 28 used for its deployment, or which additional changes, if any, it carries. If you could tell me which version Fedora 28 is using, or check whether the master branch helps, that would be useful.

Comment 2
(In reply to Adhemerval Zanella from comment #1)

We sync from 2.27 here:

- Auto-sync with upstream branch release/2.27/master, commit 56170e064e2b21ce204f0817733e92f1730541ea.
...
with a few patches applied on top, but not much, since we've started to backport and track release branches as actively as we can.

I haven't verified this issue yet either, but it looks suspiciously like an interaction between the toolchain and make or some other component that is part of the build. So far nobody has actually debugged the root cause.

I'm marking this RESOLVED/INVALID for upstream glibc, given your quick testing.

Zorro, if you want to track this for Fedora 28, please open a Fedora bug against glibc for the Fedora release you're having problems with.

Comment 3
Zorro Lang
Thanks for confirming. But my colleague mentioned that they've reported this bug before:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393

And the change looks radical. Before, we could do this:

# echo abcd > testfile
# echo ABCD >> testfile
# egrep [a-z] testfile
abcd

But now it becomes this:

# echo abcd > testfile
# echo ABCD >> testfile
# egrep [a-z] testfile
abcd
ABCD

I'm afraid that it will break many customers' scripts and tools...

Comment 4
Carlos O'Donell
(In reply to Zorro Lang from comment #3)

It appears you are asking for the following:

* Support using a-z and A-Z to mean [:lower:] and [:upper:]
* For all locales, even non-English ones.

This defeats the purpose of the regular expression statement, and is the reason why [:lower:] and [:upper:] were specified.
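A minimal sketch of the distinction between a range and a character class, assuming a POSIX shell and a grep that accepts -E (used here in place of egrep). The temporary file and the variable names are illustrative and not part of the report. Pinning LC_ALL=C makes the range follow byte order, while [[:lower:]] states the intent without depending on collation order.

```shell
# Recreate the two-line test file from the report in a temporary location.
testfile=$(mktemp)
printf 'abcd\nABCD\n' > "$testfile"

# Locale pinned to C: the range follows byte order, so only lowercase matches.
c_range=$(LC_ALL=C grep -E '[a-z]' "$testfile")

# Character class: asks for lowercase letters directly.
c_class=$(LC_ALL=C grep -E '[[:lower:]]' "$testfile")

# Range under the current locale: the result depends on the collation data
# in use, so it is printed for inspection and not assumed here.
cur_range=$(grep -E '[a-z]' "$testfile" || true)

printf 'C locale, [a-z]:        %s\n' "$c_range"
printf 'C locale, [[:lower:]]:  %s\n' "$c_class"
printf 'current locale, [a-z]:\n%s\n' "$cur_range"

rm -f "$testfile"
```

A script that depends on the old [a-z] behavior could pin the locale for just that one command in this way, or switch to the character class.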
We could perhaps make changes in the English-speaking locales, argued rationally on grounds of conservatism, reverting some ISO 14651 changes so that a-z and A-Z do not interleave, which would likely fix your use cases (until you had files with more esoteric names).

Do we think that kind of conservative fix might work?

Comment 5
Carlos O'Donell
(In reply to Carlos O'Donell from comment #4)

I guess if we make this change it would be better if we uniformly deviate in all languages from ISO 14651 to solve the issue. However, really this is a bug in all the applications using '[a-z]' etc. because it will cause problems in all sorts of locales that collate other characters in that range, but I understand the problem.

Comment 6
(In reply to Carlos O'Donell from comment #5)

... however in making this deviation, we take a step away from our stated goals of harmonization with Unicode and CLDR data, and again add back subtle sorting differences against other systems using Unicode collation or libICU.

My opinion is that the scripts in question should just get fixed, and that the differences in collation among systems introduce subtle differences that are more insidious than just the fixes we need for [a-z] and [A-Z].

Comment 7
Florian Weimer
Rich Felker proposed to use codepoint order only for ranges:

https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c13

I think this would be a workable solution. It is POSIX-compliant.

(And this bug is really the same as bug 23393.)

Comment 8
Carlos O'Donell
(In reply to Florian Weimer from comment #7)

Moving to 23393, closing this as duplicate.

*** This bug has been marked as a duplicate of bug 23393 ***

Comment 9
Adhemerval Zanella
(In reply to Carlos O'Donell from comment #8)

Just to confirm: to actually trigger the issue described on xfstests-dev install, one needs to set a specific system locale, right? I am trying to understand why I couldn't reproduce it in my environment.

Comment 10
Carlos O'Donell
On 07/19/2018 01:42 PM, adhemerval.zanella at linaro dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=23420
>
> --- Comment #9 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
> (In reply to Carlos O'Donell from comment #8)
>> (In reply to Florian Weimer from comment #7)
>>> Rich Felker proposed to use codepoint order only for ranges:
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c13
>>>
>>> I think this would be a workable solution. It is POSIX-compliant.
>>>
>>> (And this bug is really the same as bug 23393.)
>>
>> Moving to 23393, closing this as duplicate.
>>
>> *** This bug has been marked as a duplicate of bug 23393 ***
>
> Just to confirm: to actually trigger the issue described on xfstests-dev install,
> one needs to set a specific system locale, right? I am trying to understand why
> I couldn't reproduce it in my environment.
Any locale but C/POSIX.
Any locale which uses iso14651_t1_common went from [a-z] == a-z, to [a-z] == aA-zZ.
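
One way to see which ordering a system uses is to sort a few letters. This is a sketch that treats sort(1) as a visible proxy for the locale's collation order; whether a given regex engine resolves ranges from the same data is an assumption. The variable names are illustrative. In the C locale all uppercase ASCII letters come before lowercase ones, while a locale that interleaves cases would typically print something like "a A b B".

```shell
# Four sample letters, one per line.
letters=$(printf 'b\nA\na\nB\n')

# C/POSIX locale: plain byte order, uppercase ASCII before lowercase.
c_order=$(printf '%s\n' "$letters" | LC_ALL=C sort | paste -s -d ' ' -)

# Current locale: the order depends on the installed collation data,
# so it is printed for inspection and not assumed here.
cur_order=$(printf '%s\n' "$letters" | sort | paste -s -d ' ' -)

printf 'C locale order:       %s\n' "$c_order"
printf 'current locale order: %s\n' "$cur_order"
```

If the two lines differ, ranges such as [a-z] may cover more than the lowercase ASCII letters in the current locale.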
Comment 11
Adhemerval Zanella
(In reply to Carlos O'Donell from comment #10)

Right, I tested with LANG=en_US.UTF-8 and LANGUAGE=en_US but couldn't reproduce it.

Comment 12
Florian Weimer
(In reply to Adhemerval Zanella from comment #11)

Did you test with glibc from the master branch? We backported the collation data update into Fedora's glibc 2.27, but it's not in upstream glibc 2.27.

Comment 13
(In reply to Florian Weimer from comment #12)

I tested with master with my system make (executed through testrun.sh). I am not sure whether using DESTDIR along with the install rule interferes with the testing. |