This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

[Bug runtime/19802] bad hash value distribution and horrible performance for large arrays with multiple small-integer indices

From: "raeburn at permabit dot com" <sourceware-bugzilla at sourceware dot org>
To: systemtap at sourceware dot org
Date: Thu, 10 Mar 2016 22:50:30 +0000
Subject: [Bug runtime/19802] bad hash value distribution and horrible performance for large arrays with multiple small-integer indices
Auto-submitted: auto-generated
References: <bug-19802-6586 at http dot sourceware dot org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=19802

--- Comment #2 from Ken Raeburn <raeburn at permabit dot com> ---
> I took a bit of a look at this today. We're using the kernel's hash_{64,32}
> function. (We do have a copy of it down in src/runtime/dyninst/linux_hash.c,
> but that's just for the use of "stap --dyninst".)
> 
> The advantage the kernel has when using hash_{64,32} on multiple values is that
> it knows what data it is mixing, so it can mix the data semi-intelligently.

Yes, that's how I'm working around the problem for now. But you really have to
understand how hash_64 works for your use case.

> In systemtap's case, we don't know if we're going to be mixing small integers
> or pointer values, which makes things tricky when it comes to mixing the data

Right. But having either stage work well would probably save us; it's the two
working poorly in conjunction that gives a bad outcome in cases like mine.
Really, if the second stage (combining hashes of individual keys) worked really
well, we could probably forgo hash_64 and just use the supplied integer value
directly, maybe after XOR with the seed.
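
To make the idea concrete, here is a minimal sketch of a combining stage that
is strong enough to take the raw integer keys directly, starting from the seed.
This is not the systemtap runtime code: hash_keys and count_used_buckets are
hypothetical names, and the mixer is the 64-bit finalizer from MurmurHash3,
swapped in purely as an example of a well-mixing step.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* 64-bit finalizer from MurmurHash3 (fmix64). It is a bijection on
 * 64-bit values, so mixing alone never merges two distinct states. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Hypothetical helper (not a systemtap function): hash a tuple of n
 * integer keys into 2^bits buckets. Each raw key is XORed into the
 * running state, which starts at the seed, and the state is then
 * mixed. No per-key hash_64 call is made. Requires 1 <= bits <= 32. */
static unsigned hash_keys(const int64_t *keys, size_t n,
                          uint64_t seed, unsigned bits)
{
    uint64_t h = seed;
    size_t i;

    for (i = 0; i < n; i++)
        h = mix64(h ^ (uint64_t)keys[i]);

    /* Take the top bits, which depend on every input bit. */
    return (unsigned)(h >> (64 - bits));
}

/* Hypothetical test aid: count how many buckets are hit by all pairs
 * (i, j) with 0 <= i, j < side. Requires 1 <= bits <= 16. */
static unsigned count_used_buckets(unsigned side, unsigned bits,
                                   uint64_t seed)
{
    static unsigned char seen[1u << 16];
    unsigned used = 0;
    unsigned i, j;

    memset(seen, 0, sizeof seen);
    for (i = 0; i < side; i++) {
        for (j = 0; j < side; j++) {
            int64_t keys[2];
            unsigned b;

            keys[0] = (int64_t)i;
            keys[1] = (int64_t)j;
            b = hash_keys(keys, 2, seed, bits);
            if (!seen[b]) {
                seen[b] = 1;
                used++;
            }
        }
    }
    return used;
}
```

Calling count_used_buckets(64, 12, seed) gives a rough measure of how evenly
4096 small-integer pairs spread over 4096 buckets. An ideal random hash would
be expected to touch roughly 63% of the buckets in that setup.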

-- 
You are receiving this mail because:
You are the assignee for the bug.

References:
- [Bug runtime/19802] New: bad hash value distribution and horrible performance for large arrays with multiple small-integer indices
  - From: raeburn at permabit dot com
