This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC PATCH 4/3] Fix previous patch and add header.
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Thu, 28 May 2015 19:25:13 +0200
- Subject: Re: [RFC PATCH 4/3] Fix previous patch and add header.
- Authentication-results: sourceware.org; auth=none
- References: <20150526173150 dot GA26817 at domone> <20150526181035 dot GA27596 at domone> <20150526185534 dot GA28344 at domone> <alpine dot DEB dot 2 dot 10 dot 1505281706050 dot 16930 at digraph dot polyomino dot org dot uk>
On Thu, May 28, 2015 at 05:17:22PM +0000, Joseph Myers wrote:
> On Tue, 26 May 2015, OndÅej BÃlka wrote:
>
> > diff --git a/string/common.h b/string/common.h
> > new file mode 100644
> > index 0000000..84f9767
> > --- /dev/null
> > +++ b/string/common.h
> > @@ -0,0 +1,35 @@
> > +#include <stdint.h>
>
> All files should start with a comment that first describes the file, then
> has the copyright and license notice.
>
> > +/* Use vector arithmetic tricks. Idea is to take expression works on
> > + unsigned byte and evaluates 0 for nozero byte and nonzero on zero byte.
> > + Our expression is ((s + 127) ^ (~s)) & 128
> > + Now we evaluate this expression on each byte in parallel and on first
> > + nonzero byte our expression will have nonzero value. */
>
> I think a more detailed, multi-paragraph comment carefully explaining the
> algorithm used (and why and how it works) would be better. Like the one
> discussed in bug 5806, but of course without the mistake discussed there.
>
> The principle of a common header with this logic is a good one - hopefully
> if it gets used everywhere this can resolve bug 5806 by eliminating all
> copies of the old comment.
>
> (The patch submission should carefully discuss how the algorithm relates
> to the ones discussed in bug 5806 / 5807 or used in existing code. But
> that's part of justifying the patches rather than for the comments in the
> new code.)
>
> > +static __always_inline
> > +unsigned long int
> > +contains_zero (unsigned long int s)
> > +{
> > + return ((s + add) ^ ~s) & high_bits;
>
> Instead of using unsigned long int, architectures should be able to
> configure the type used. I expect that for ilp32 configurations on 64-bit
> architectures, where registers are 64-bit but unsigned long int is 32-bit,
> such as x32 and MIPS n32, it will be better to use unsigned long long int.
>
in next iteration I would add typedef and archs would use it from
header. Perhaps something like.
#ifndef VECTOR_INT
# define VECTOR_INT unsigned long int
#endif
typedef VECTOR_INT vector_int;
> > +#define CROSS_PAGE(x, n) (((uintptr_t)x) % 4096 >= 4096 - n)
>
> It might also make sense for the use of 4096 here to be configurable by
> architectures as well.
>
Could but it won't likely make noticable difference as you cross page
with probability 0.8% for uniformly random address.