This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/19802] bad hash value distribution and horrible performance for large arrays with multiple small-integer indices


https://sourceware.org/bugzilla/show_bug.cgi?id=19802

--- Comment #2 from Ken Raeburn <raeburn at permabit dot com> ---
> I took a bit of a look at this today. We're using the kernel's hash_{64,32}
> function. (We do have a copy of it down in src/runtime/dyninst/linux_hash.c, but
> that's just for the use of "stap --dyninst".)
> 
> The advantage the kernel has when using hash_{64,32} on multiple values is that
> it knows what data it is mixing, so it can mix the data semi-intelligently.

Yes, that's how I'm working around the problem for now. But you really have to
understand how hash_64 works for your use case.

> In systemtap's case, we don't know if we're going to be mixing small integers
> or pointer values, which makes things tricky when it comes to mixing the data

Right. But having either stage work well would probably save us; it's the two
working poorly in conjunction that gives a bad outcome in cases like mine.
Really, if the second stage (combining hashes of individual keys) worked really
well, we could probably forgo hash_64 and just use the supplied integer value
directly, maybe after XOR with the seed.
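
To make that idea concrete, here is a rough sketch of what such a scheme might
look like. It is not systemtap's code: the names mix64, combine_keys and
bucket_of are hypothetical, and the mixing step is borrowed from the splitmix64
finalizer purely as an example of a combiner that avalanches well. Stage one
does nothing but XOR each integer key with the seed; all of the real mixing
happens when the keys are folded together.

```c
#include <stddef.h>
#include <stdint.h>

/* Avalanche step taken from the splitmix64 finalizer. Using it here is an
 * assumption of this sketch, not something systemtap is known to do. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ull;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBull;
    x ^= x >> 31;
    return x;
}

/* Hypothetical combiner for an array index made of several integer keys. */
static uint64_t combine_keys(const int64_t *keys, size_t nkeys, uint64_t seed)
{
    /* Nonzero starting state so that all-zero inputs do not stay at zero. */
    uint64_t h = 0x9E3779B97F4A7C15ull;

    for (size_t i = 0; i < nkeys; i++) {
        /* Stage 1: no per-key hash, just the raw value XOR the seed. */
        uint64_t k = (uint64_t)keys[i] ^ seed;

        /* Stage 2: fold the key into the running state and re-mix.
         * Re-mixing after every key makes the result order-sensitive. */
        h = mix64(h ^ k);
    }
    return h;
}

/* Reduce to a table index by keeping the top bits (requires 1 <= bits <= 32). */
static unsigned bucket_of(uint64_t h, unsigned bits)
{
    return (unsigned)(h >> (64 - bits));
}
```

With this shape, an index such as (1, 2) and its permutation (2, 1) go through
different intermediate states, so small-integer keys are not expected to pile
up in the same buckets the way they can when both stages mix weakly.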

