This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Scheme style auto-resizing hashtable


>    5 % slower if I use my own hashing function (I've found the current
>    Guile hasher to be inadequate for my needs).
> 
> Would anyone knowledgeable about bit twiddling be willing to write a
> Guile hasher that doesn't mod out by a given number?

This is my C++ code to do hashing:

typedef unsigned long ULONG;

ULONG hashing_bit_matrix[] = 
{
	0xC28568FD, 0x1F2030B6, 0xF03875BA, 0x7621987E,
// ...
// a total of 256 random 32 bit values (not listed here)
// ...
	0x06ED9210, 0x2F98CD21, 0xF7180B5E, 0x9DE9E56E
};

ULONG StringHash( const char *s )
{
	ULONG a = 0;
	while( *s )
	{
		a ^= hashing_bit_matrix[ *s ];
		a = ( a << 1 ) | ( a >> 31 );
		s++;
	}
	return( a );
}

ULONG GenericHash( void *HO, ULONG size = 1, ULONG seed = 0 )
{
	unsigned char *p = ( unsigned char *)HO;
	while( size-- )
	{
		seed ^= hashing_bit_matrix[ *p++ ];
		seed = ( seed << 1 ) | ( seed >> 31 );
	}
	return( seed );
}

Some systems have sneaky ways to speed up the bitwise rotate,
you may want to look into that... If you want to improve the
randomness of the results then rather than lookup each byte
into hashing_bit_matrix[] you can do the lookup each of the
bit-shifted versions (i.e. from 0 to 7) so that you are doing
the lookup on overlapping groups of 8 bits. I found that to be
overkill.

This method uses only shifts, rotates, xor and lookups which are usually
fast operations. It uses one iteration for every byte in the key if you
use the easier method or 8 iterations for every byte in the key if you
use the more thorough method.

	- Tel