This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: DT_GNU_HASH: ~ 50% dynamic linking improvement

From: "djamel anonymous" <djam8193ah at hotmail dot com>
To: djam8193ah at hotmail dot com, drepper at redhat dot com, michael dot meeks at novell dot com
Cc: jakub at redhat dot com, binutils at sources dot redhat dot com, libc-alpha at sources dot redhat dot com
Date: Mon, 03 Jul 2006 06:38:09 +0000
Subject: Re: DT_GNU_HASH: ~ 50% dynamic linking improvement
Bcc:

Hello. after the last post on the mailing list i found that my suggestion was too complicated and suboptimal.i think that i have found a simple and efficient method to implement the additonal sparse hash table.the proposed hash table is of size approximately m*8 bits were m is the nearest power of two to nysmbols. for example for nsymbols=3071 m will be 2048, for nsymbols=3072 m will be 4096.the index into the hash table will be simply hash%(m*8) which will be computed quickly because it is a power of two. a bit of index x into the hash table will be zero if and only if no symbol has hash%(m*8)==x. the size of the hash table will be between 5.3333*nsymbols and 10.6666*nsymobls which is a very small space overhead. to look for a symbol with hash value x we have to compute i=x%(m*8) then check if the bit i is set or not. if it is cleared then the symbol is not present otherwise the search will continue in the hash table that is currently implemented.I think this suggestion which is in the litterature called a bloom filter with parameter k=1 has several advantage: 1-it has a very small space overhead: between 5.3333*nsymbols bits and 10.6666*nsymbols bits . 2-it is very easy to implement. 3-it will incure a very small cache misses if the search is unsuccesfull there will be only one memory access approximately in 7 cases from 8. 4-the index into the hash table is very cheap to cumpute as it is done with a binary and.compared to a real modulo which is very expensive; on an athlon or pentium it takes between 20 and 40 cycles.

although the index into the hash function is simple to compute, it may however be quite efficient because the lower bit of the chosen hash function are affected by every character in the symbol. static uint_fast32_t dl_new_hash (const char *s) { uint_fast32_t h = 5381; for (unsigned char c = *s; c != '\0'; c = *++s) h = h * 33 + c; return h & 0xffffffff; } in conclusion i think that this method will improve the speed for unsuccesfull searches because it needs to compute a power of two modulo instead of expensive normal modulo, and that it will do one memory access most of the time.the space overhead will be of around nsymbols bytes (between nsymobls*0.6666 and nsumbols*1.3333). best regards.

_________________________________________________________________
Testez Windows Live Mail Beta ! http://www.ideas.live.com/

Follow-Ups:
- Re: DT_GNU_HASH: ~ 50% dynamic linking improvement
  - From: djamel anonymous

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]