This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 06/11] Improve generic memrchr



On 20/12/2016 15:19, Richard Henderson wrote:
> On 12/20/2016 04:30 AM, Adhemerval Zanella wrote:
>> This implementation has a lot of issues, checking on both x86_64 and i386
>> (by manually removing the assembly implementation to use the new default
>> one) I am seeing a lot of issues with string/testers and others. 
>> Unfortunately the test-memrchr itself does not trigger most of the issues.
> 
> I did notice problems by eye when going through the patch set for the second
> time.  It is unfortunate that the tester doesn't trigger them.
> 
>>> +  align = (uintptr_t)char_ptr % sizeof(longword);
>>> +  for (i = 0; i < align; ++i)
>>>      if (*--char_ptr == c)
>>> -      return (__ptr_t) char_ptr;
>>> +      return (void *) char_ptr;
>>
>> We need to take care of 'n' while interacting over the string to align
>> it.  It might the case where 's' is unaligned and 'n' is less than
>> the aligned value.
> 
> As for memchr, I had intended
> 
>   if (align > n)
>     align = n;
> 
>>> + found:
>>> +      i = whichzero (longword);
>>> +      if (sizeof (longword) - 1 - i < n)
>>> +	return (char *) longword_ptr + i;
>>
>> We need to check longword in reverse word since we are checking for last
>> occurrence. 
> 
> Well, yes and no.  We check from reverse but want the last match.  So it's
> still a forward search.  What's needed is to mask out any match that is
> excluded by N.
> 
> I currently have
> 
>   /* Handle any remaining portion of a word.  */
>   if (n > 0)
>     {
>       word = *--word_ptr ^ repeated_c;
> 
>       /* Mask out the garbage bytes.  */
>       op_t m = -1;
>       if (__BYTE_ORDER == __LITTLE_ENDIAN)
>         m <<= n * CHAR_BIT;
>       else
>         m >>= n * CHAR_BIT;
>       word |= ~m;
> 
>       if (haszero (word))
>         {
>         found:
>           return (char *) word_ptr + findzero (word);
>         }
>     }

I do not have a strong preference here, neither I am sure which strategy
would yield better performance for mostly architectures.  I would be usually
one instruction if the architecture supports direct swap instruction (like
x86 and aarch64), but the ones that do not it might generate some long
sequences.  Anyway, In such cases I would go either for code simplicity.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]