This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix i686 memchr overflow calculation (BZ#21182)



On 19/05/2017 11:45, Adhemerval Zanella wrote:
> 
> 
> On 19/05/2017 10:44, H.J. Lu wrote:
>> On Tue, Mar 28, 2017 at 2:47 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Mar 28, 2017 at 12:22 PM, Adhemerval Zanella
>>> <adhemerval.zanella@linaro.org> wrote:
>>>> Ping (with the page_size/2 fix included).
>>>>
>>>> On 14/03/2017 14:28, Adhemerval Zanella wrote:
>>>>> This patch fixes the regression added by 23d2770 for final address
>>>>> overflow calculation.  The subtraction of the considered size (16)
>>>>> at line 120 is at wrong place, for sizes less than 16 subsequent
>>>>> overflow check will not take in consideration an invalid size (since
>>>>> the subtraction will be negative).  Also, the lea instruction also
>>>>> does not raise the carry flag (CF) that is used in subsequent jbe
>>>>> to check for overflow.
>>>>>
>>>>> The fix is to follow x86_64 logic from 3daef2c where the overflow
>>>>> is first check and a sub instruction is issued.  In case of resulting
>>>>> negative size, CF will be set by the sub instruction and a NULL
>>>>> result will be returned.  The patch also add similar tests reported
>>>>> in bug report.
>>>>>
>>>>> Checked on i686-linux-gnu and x86_64-linux-gnu.
>>>>>
>>>>>       [BZ# 21182]
>>>>>       * string/test-memchr.c (do_test): Add BZ#21182 checks for address
>>>>>       near end of a page.
>>>>>       * sysdeps/i386/i686/multiarch/memchr-sse2.S (__memchr): Fix
>>>>>       overflow calculation.
>>>>> ---
>>>>>  ChangeLog                                 | 7 +++++++
>>>>>  string/test-memchr.c                      | 6 ++++++
>>>>>  sysdeps/i386/i686/multiarch/memchr-sse2.S | 2 +-
>>>>>  3 files changed, 14 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/string/test-memchr.c b/string/test-memchr.c
>>>>> index d64d10c..87a077b 100644
>>>>> --- a/string/test-memchr.c
>>>>> +++ b/string/test-memchr.c
>>>>> @@ -210,6 +210,12 @@ test_main (void)
>>>>>        do_test (0, i, i + 1, i + 1, 0);
>>>>>      }
>>>>>
>>>>> +  /* BZ#21182 - wrong overflow calculation for i686 implementation
>>>>> +     with address near end of the page.  */
>>>>> +  for (i = 2; i < 16; ++i)
>>>>> +    /* page_size is in fact getpagesize() * 2.  */
>>>>> +    do_test (page_size/2 - i, i, i, 1, 0x9B);
>>>>> +
>>>>>    do_random_tests ();
>>>>>    return ret;
>>>>>  }
>>>>> diff --git a/sysdeps/i386/i686/multiarch/memchr-sse2.S b/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>> index 910679c..e41f324 100644
>>>>> --- a/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>> +++ b/sysdeps/i386/i686/multiarch/memchr-sse2.S
>>>>> @@ -117,7 +117,6 @@ L(crosscache):
>>>>>
>>>>>  # ifndef USE_AS_RAWMEMCHR
>>>>>       jnz     L(match_case2_prolog1)
>>>>> -     lea     -16(%edx), %edx
>>>>>          /* Calculate the last acceptable address and check for possible
>>>>>             addition overflow by using satured math:
>>>>>             edx = ecx + edx
>>>>> @@ -125,6 +124,7 @@ L(crosscache):
>>>>>       add     %ecx, %edx
>>>>>       sbb     %eax, %eax
>>>>>       or      %eax, %edx
>>>>> +     sub     $16, %edx
>>>>>       jbe     L(return_null)
>>>>>       lea     16(%edi), %edi
>>>>>  # else
>>>>>
>>>
>>> Looks good.
>>>
>>>
>>
>> Just a thought.  Is this approach
>>
>>         /* Calculate "edx + ecx - 16" by using "edx - (16 - ecx)"
>>            instead of "(edx + ecx) - 16" to void possible addition
>>            overflow.  */
>>         neg     %ecx
>>         add     $16, %ecx
>>         sub     %ecx, %edx
>>         jbe     L(return_null)
>>
>> a little bit better?
> 
> I do not have a strong preference here, although imho it is simpler to
> understand why the saturated math would work here and it is the same
> strategy used on other arch implementations.  May you might outline in
> comment that ecx is always in the range of [0,64) so '-ecx + 16' can
> possible underflow.  Is is this change just for micro-optimization?
> 

Sorry, my bad here.  I made some confusion, in fact what I meant was ecx
was indeed in range [0,16) (is the and instruction for alignment that 
uses 64) and your suggestion *can't* overflow.  I would recommend
you to just outline the possible value of 'edx' in comments. 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]