This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
[Bug runtime/19802] bad hash value distribution and horrible performance for large arrays with multiple small-integer indices
- From: "raeburn at permabit dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sourceware dot org
- Date: Thu, 10 Mar 2016 22:50:30 +0000
- Subject: [Bug runtime/19802] bad hash value distribution and horrible performance for large arrays with multiple small-integer indices
- Auto-submitted: auto-generated
- References: <bug-19802-6586 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=19802
--- Comment #2 from Ken Raeburn <raeburn at permabit dot com> ---
> I took a bit of a look at this today. We're using the kernel's hash_{64,32}
> function. (We do have a copy of it down src/runtime/dyninst/linux_hash.c, but
> that's just for the use of "stap --dyninst".)
>
> The advantage the kernel has when using hash_{64,32} on multiple values is that
> it knows what data it is mixing, so it can mix the data semi-intelligently.
Yes, that's how I'm working around the problem for now. But you really have to
understand how hash_64 works for your use case.
> In systemtap's case, we don't know if we're going to be mixing small integers
> or pointer values, which makes things tricky when it comes to mixing the data
Right. But having either stage work well would probably save us; it's the two
working poorly in conjunction that gives a bad outcome in cases like mine.
Really, if the second stage (combining hashes of individual keys) worked really
well, we could probably forego hash_64 and just use the supplied integer value
directly, maybe after XOR with the seed.
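
To illustrate the idea, here is a minimal sketch in Python. It is not the systemtap runtime code: the function names, the seed handling, and the choice of the MurmurHash3 64-bit finalizer as the mixing step are all assumptions made for the example. It contrasts a deliberately naive combiner (plain XOR of the raw keys) with a combining stage that mixes thoroughly after folding in each key, so that the raw integers can be fed in directly.

```python
MASK64 = (1 << 64) - 1


def fmix64(x):
    """MurmurHash3 64-bit finalizer, used here as a stand-in mixing step."""
    x &= MASK64
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    x = (x * 0xC4CEB9FE1A85EC53) & MASK64
    x ^= x >> 33
    return x


def naive_combine(keys, seed=0):
    """Hypothetical weak second stage: XOR the raw keys together."""
    h = seed & MASK64
    for k in keys:
        h ^= k & MASK64
    return h


def mixed_combine(keys, seed=0):
    """Hypothetical strong second stage: fold in each raw key, then mix."""
    h = seed & MASK64
    for k in keys:
        h = fmix64(h ^ (k & MASK64))
    return h


def bucket(h, bits):
    """Take the top `bits` bits of a 64-bit hash as the bucket index."""
    return h >> (64 - bits)


def distinct_hashes(combine, n=64):
    """Count distinct 64-bit hashes over all index pairs (i, j), 0 <= i, j < n."""
    return len({combine((i, j)) for i in range(n) for j in range(n)})


def distinct_buckets(combine, n=64, bits=12):
    """Count distinct buckets hit by the same index pairs."""
    return len({bucket(combine((i, j)), bits) for i in range(n) for j in range(n)})
```

With two indices in the range 0..63, the naive combiner can produce only 64 distinct hash values for 4096 distinct key pairs, because XOR of two 6-bit numbers is still a 6-bit number. The mixed combiner does not rely on any per-key hash being good; whether it would be fast enough for the runtime is a separate question that this sketch does not address.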