This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Fix i686 memchr overflow calculation (BZ#21182)
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 19 May 2017 14:24:09 -0300
- Subject: Re: [PATCH] Fix i686 memchr overflow calculation (BZ#21182)
- Authentication-results: sourceware.org; auth=none
- References: <1489512487-24860-1-git-send-email-adhemerval.zanella@linaro.org> <40ea40ff-763f-e5ff-4bf1-1d24db403df3@linaro.org> <CAMe9rOoyTNL6jmmpSQPSG-8cQGYo4eL78=z+kQkvgPw8ZjFC9A@mail.gmail.com> <CAMe9rOrxU79HmdvLDv=hq9Pkb7+Dg8iB2ozWr4QVs_ZduT4yYA@mail.gmail.com> <5d284d70-b6e5-b92e-185a-bfede9777ad8@linaro.org> <5e09e1e1-a4cf-772a-68d7-a532eca8d5be@linaro.org> <CAMe9rOr0UFKouWki2VwybG=k9dxe8ZoEJu=TzqrRP9TpLDfybA@mail.gmail.com>
On 19/05/2017 13:14, H.J. Lu wrote:
> On Fri, May 19, 2017 at 8:10 AM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>>
>>
>> On 19/05/2017 11:45, Adhemerval Zanella wrote:
>>>
>>>
>>> On 19/05/2017 10:44, H.J. Lu wrote:
>>>> On Tue, Mar 28, 2017 at 2:47 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Tue, Mar 28, 2017 at 12:22 PM, Adhemerval Zanella
>>>>> <adhemerval.zanella@linaro.org> wrote:
>>>>>> Ping (with the page_size/2 fix included).
>>>>>>
>>>>>> On 14/03/2017 14:28, Adhemerval Zanella wrote:
>>>>>>> This patch fixes the regression added by 23d2770 for final address
>>>>>>> overflow calculation. The subtraction of the considered size (16)
>>>>>>> at line 120 is at wrong place, for sizes less than 16 subsequent
>>>>>>> overflow check will not take in consideration an invalid size (since
>>>>>>> the subtraction will be negative). Also, the lea instruction also
>>>>>>> does not raise the carry flag (CF) that is used in subsequent jbe
>>>>>>> to check for overflow.
>>>>>>>
>>>>>>> The fix is to follow x86_64 logic from 3daef2c where the overflow
>>>>>>> is first check and a sub instruction is issued. In case of resulting
>>>>>>> negative size, CF will be set by the sub instruction and a NULL
>>>>>>> result will be returned. The patch also add similar tests reported
>>>>>>> in bug report.
>>>>>>>
>>>>>>> Checked on i686-linux-gnu and x86_64-linux-gnu.
>>>>>>>
>>>>>>> [BZ# 21182]
>>>>>>> * string/test-memchr.c (do_test): Add BZ#21182 checks for address
>>>>>>> near end of a page.
>>>>>>> * sysdeps/i386/i686/multiarch/memchr-sse2.S (__memchr): Fix
>>>>>>> overflow calculation.
>>>>>>> ---
>>>>>>> ChangeLog | 7 +++++++
>>>>>>> string/test-memchr.c | 6 ++++++
>>>>>>> sysdeps/i386/i686/multiarch/memchr-sse2.S | 2 +-
>>>>>>> 3 files changed, 14 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/string/test-memchr.c b/string/test-memchr.c
>>>>>>> index d64d10c..87a077b 100644
>>>>>>> --- a/string/test-memchr.c
>>>>>>> +++ b/string/test-memchr.c
>>>>>>> @@ -210,6 +210,12 @@ test_main (void)
>>>>>>> do_test (0, i, i + 1, i + 1, 0);
>>>>>>> }
>>>>>>>
>>>>>>> + /* BZ#21182 - wrong overflow calculation for i686 implementation
>>>>>>> + with address near end of the page. */
>>>>>>> + for (i = 2; i < 16; ++i)
>>>>>>> + /* page_size is in fact getpagesize() * 2. */
>>>>>>> + do_test (page_size/2 - i, i, i, 1, 0x9B);
>>>>>>> +
>>>>>>> do_random_tests ();
>>>>>>> return ret;
>>>>>>> }
>>>>>>> diff --git a/sysdeps/i386/i686/multiarch/memchr-sse2.S b/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>>>> index 910679c..e41f324 100644
>>>>>>> --- a/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>>>> +++ b/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>>>> @@ -117,7 +117,6 @@ L(crosscache):
>>>>>>>
>>>>>>> # ifndef USE_AS_RAWMEMCHR
>>>>>>> jnz L(match_case2_prolog1)
>>>>>>> - lea -16(%edx), %edx
>>>>>>> /* Calculate the last acceptable address and check for possible
>>>>>>> addition overflow by using satured math:
>>>>>>> edx = ecx + edx
>>>>>>> @@ -125,6 +124,7 @@ L(crosscache):
>>>>>>> add %ecx, %edx
>>>>>>> sbb %eax, %eax
>>>>>>> or %eax, %edx
>>>>>>> + sub $16, %edx
>>>>>>> jbe L(return_null)
>>>>>>> lea 16(%edi), %edi
>>>>>>> # else
>>>>>>>
>>>>>
>>>>> Looks good.
>>>>>
>>>>>
>>>>
>>>> Just a thought. Is this approach
>>>>
>>>> /* Calculate "edx + ecx - 16" by using "edx - (16 - ecx)"
>>>> instead of "(edx + ecx) - 16" to void possible addition
>>>> overflow. */
>>>> neg %ecx
>>>> add $16, %ecx
>>>> sub %ecx, %edx
>>>> jbe L(return_null)
>>>>
>>>> a little bit better?
>>>
>>> I do not have a strong preference here, although imho it is simpler to
>>> understand why the saturated math would work here and it is the same
>>> strategy used on other arch implementations. May you might outline in
>>> comment that ecx is always in the range of [0,64) so '-ecx + 16' can
>>> possible underflow. Is is this change just for micro-optimization?
>>>
>>
>> Sorry, my bad here. I made some confusion, in fact what I meant was ecx
>> was indeed in range [0,16) (is the and instruction for alignment that
>> uses 64) and your suggestion *can't* overflow. I would recommend
>> you to just outline the possible value of 'edx' in comments.
>
> Here is a patch for SSE2 memchr. OK for master?
>
LGTM, thanks.