In bfd/elflink.c, in compute_bucket_count(), there is a $\Theta(n^2)$ loop (with n=nsyms) to find a good hash table size. This is okay for n=50, but we have links taking 15 minutes for a library with n=162099 symbols. The code in compute_bucket_count() is a bit pointless for large values of n. Perhaps a different strategy for n > 1024 is needed, e.g., just return size nsyms/2, or choose the best of (say) sizes n0...n0+k where n0 is the target size (with k=4 for example, just to eliminate the possibility that the first choice is astronomically unlucky).
Can you provide a testcase?
Here is a C++ program that will create a bunch of .c files, compile and then link them, hitting the bug. It links very quickly without the -O1 option, and with the -O1 option it takes a very long time. ---==== cut-here #include <stdlib.h> #include <stdio.h> #include <string> #include <map> int main() { const int num_symbols = 199933; const int ref_symbol_delta = 37199; const int num_files = 25; char tbuf[128]; srand(1); std::string objfiles; int ref_symbol = 0; int i = 0; for (int j=0; j < num_files; ++j) { int symbols_ceil = (num_symbols*(j+1))/num_files; sprintf(tbuf, "many_symbols%03d.c", j+1); FILE* dst = fopen(tbuf, "w"); std::map<int,int> sites; for (; i < symbols_ceil; ++i) { ref_symbol = (ref_symbol + ref_symbol_delta) % num_symbols; sites[i] = ref_symbol; fprintf(dst, "void foo%d();\n", ref_symbol); } for (std::map<int,int>::iterator iter = sites.begin(); iter != sites.end(); ++iter) { fprintf(dst, "void foo%d() { foo%d(); }\n", iter->first, iter->second); } fclose(dst); sprintf(tbuf, "gcc -c -fPIC many_symbols%03d.c -o many_symbols%03d.o", j+1, j+1); printf("%s\n", tbuf); system(tbuf); sprintf(tbuf, "many_symbols%03d.o", j+1); objfiles += ((j > 0) ? " " : "") + std::string(tbuf); } system("time ld -O1 many_symbols???.o -shared -o libxxx.so"); } ---==== cut-here
Here some tracing for compute_bucket_count() that illustrates the futility of searching large numbers of hash table sizes. Index: bfd/elflink.c =================================================================== RCS file: /cvs/src/src/bfd/elflink.c,v retrieving revision 1.372 diff -r1.372 elflink.c 31a32 > #include <sys/time.h> 5345a5347,5354 > > static double wallTime(void) > { > struct timeval t; > gettimeofday(&t, 0); > return t.tv_sec + t.tv_usec/1000000.0; > } > 5361a5371,5373 > double startTime; > > startTime = wallTime(); 5368a5381 > long insert_count; 5377a5391,5392 > insert_count=0; > 5419a5435,5436 > insert_count += nsyms; > 5458a5476,5481 > printf("elapsed=%7.4lfs size=%ld max=%lf (%lf%%) after %ld inserts\n", > wallTime() - startTime, > (long)i, > (double)max, ((double)max-best_chlen)/best_chlen*100.0, > insert_count); > insert_count=0; 5461a5485,5491 > else if (insert_count > 1000000000L) > { > printf("elapsed=%7.4lfs No improvement after %ld inserts\n", > wallTime() - startTime, > insert_count); > insert_count=0; > } Its output for the test case, on my machine, shows that after trying the first three hash table sizes, no improvements of more than 1% occur, and after the first 200 hash table sizes, no improvements whatsoever occur. -bash-3.2$ ld -O1 many_symbols*.o -shared -o libxxx.so elapsed= 0.0045s size=99984 max=43186095512.000000 (-100.000000%) after 399936 inserts elapsed= 0.0087s size=99985 max=34095909512.000000 (-21.048872%) after 399936 inserts elapsed= 0.0129s size=99986 max=33926687032.000000 (-0.496313%) after 399936 inserts elapsed= 0.0170s size=99987 max=33836351808.000000 (-0.266266%) after 399936 inserts elapsed= 0.0586s size=99997 max=33830992776.000000 (-0.015838%) after 3999360 inserts elapsed= 0.0628s size=99998 max=33828995144.000000 (-0.005905%) after 399936 inserts elapsed= 0.0836s size=100003 max=33823059872.000000 (-0.017545%) after 1999680 inserts elapsed= 0.1293s size=100014 max=33802046320.000000 (-0.062128%) after 4399296 inserts elapsed= 0.1625s size=100022 max=33791424296.000000 (-0.031424%) after 3199488 inserts elapsed= 0.1833s size=100027 max=33740062104.000000 (-0.151998%) after 1999680 inserts elapsed= 0.2167s size=100035 max=33725848184.000000 (-0.042128%) after 3199488 inserts elapsed= 0.4254s size=100085 max=33716109728.000000 (-0.028875%) after 19996800 inserts elapsed= 0.5258s size=100109 max=33682995136.000000 (-0.098216%) after 9598464 inserts elapsed= 0.6515s size=100139 max=33669972112.000000 (-0.038663%) after 11998080 inserts elapsed= 0.7270s size=100157 max=33644252600.000000 (-0.076387%) after 7198848 inserts elapsed=11.1804s No improvement after 1000239936 inserts elapsed=21.6568s No improvement after 1000239936 inserts elapsed=32.1744s No improvement after 1000239936 inserts elapsed=42.7346s No improvement after 1000239936 inserts elapsed=53.3344s No improvement after 1000239936 inserts elapsed=63.9556s No improvement after 1000239936 inserts elapsed=74.6303s No improvement after 1000239936 inserts elapsed=85.3426s No improvement after 1000239936 inserts elapsed=96.0896s No improvement after 1000239936 inserts elapsed=106.8647s No improvement after 1000239936 inserts elapsed=117.6750s No improvement after 1000239936 inserts elapsed=128.5248s No improvement after 1000239936 inserts elapsed=139.4285s No improvement after 1000239936 inserts elapsed=150.3337s No improvement after 1000239936 inserts elapsed=161.2662s No improvement after 1000239936 inserts elapsed=172.2233s No improvement after 1000239936 inserts elapsed=183.2224s No improvement after 1000239936 inserts elapsed=194.2606s No improvement after 1000239936 inserts elapsed=205.3407s No improvement after 1000239936 inserts elapsed=216.4314s No improvement after 1000239936 inserts elapsed=227.5811s No improvement after 1000239936 inserts elapsed=238.7598s No improvement after 1000239936 inserts elapsed=249.9598s No improvement after 1000239936 inserts etc.
Here is a conservative patch. Index: bfd/elflink.c =================================================================== RCS file: /cvs/src/src/bfd/elflink.c,v retrieving revision 1.372 diff -u -r1.372 elflink.c --- bfd/elflink.c 14 Apr 2010 08:29:33 -0000 1.372 +++ bfd/elflink.c 28 Jul 2010 22:21:53 -0000 @@ -5374,6 +5374,7 @@ const struct elf_backend_data *bed = get_elf_backend_data (dynobj); unsigned long int *counts; bfd_size_type amt; + unsigned int no_improvement_count = 0; /* Possible optimization parameters: if we have NSYMS symbols we say that the hashing table must at least have NSYMS/4 and at most @@ -5458,7 +5459,10 @@ { best_chlen = max; best_size = i; + no_improvement_count = 0; } + else if (++no_improvement_count == 100) + break; } free (counts);
(In reply to comment #4) > Here is a conservative patch. > > Index: bfd/elflink.c > =================================================================== > RCS file: /cvs/src/src/bfd/elflink.c,v > retrieving revision 1.372 > diff -u -r1.372 elflink.c > --- bfd/elflink.c 14 Apr 2010 08:29:33 -0000 1.372 > +++ bfd/elflink.c 28 Jul 2010 22:21:53 -0000 > @@ -5374,6 +5374,7 @@ > const struct elf_backend_data *bed = get_elf_backend_data (dynobj); > unsigned long int *counts; > bfd_size_type amt; > + unsigned int no_improvement_count = 0; > > /* Possible optimization parameters: if we have NSYMS symbols we say > that the hashing table must at least have NSYMS/4 and at most > @@ -5458,7 +5459,10 @@ > { > best_chlen = max; > best_size = i; > + no_improvement_count = 0; > } > + else if (++no_improvement_count == 100) > + break; > } > > free (counts); > It looks good to me. Please post it to binutils mailing list. Thanks.
(In reply to comment #5) > (In reply to comment #4) > > Here is a conservative patch. > > > > Index: bfd/elflink.c > > =================================================================== > > RCS file: /cvs/src/src/bfd/elflink.c,v > > retrieving revision 1.372 > > diff -u -r1.372 elflink.c > > --- bfd/elflink.c 14 Apr 2010 08:29:33 -0000 1.372 > > +++ bfd/elflink.c 28 Jul 2010 22:21:53 -0000 > > @@ -5374,6 +5374,7 @@ > > const struct elf_backend_data *bed = get_elf_backend_data (dynobj); > > unsigned long int *counts; > > bfd_size_type amt; > > + unsigned int no_improvement_count = 0; > > > > /* Possible optimization parameters: if we have NSYMS symbols we say > > that the hashing table must at least have NSYMS/4 and at most > > @@ -5458,7 +5459,10 @@ > > { > > best_chlen = max; > > best_size = i; > > + no_improvement_count = 0; > > } > > + else if (++no_improvement_count == 100) > > + break; > > } > > > > free (counts); > > > > It looks good to me. Please post it to binutils mailing list. Thanks. My second though. This patch gives up way too early.
(In reply to comment #6) > > > > It looks good to me. Please post it to binutils mailing list. Thanks. > > My second though. This patch gives up way too early. 100 is good enough. We may spend more time with little gain.
Subject: Bug 11843 CVSROOT: /cvs/src Module name: src Changes by: nickc@sourceware.org 2010-08-12 16:23:33 Modified files: bfd : ChangeLog elflink.c Log message: PR ld/11843 * elflink.c (compute_bucket_count): Avoid futile long searches for the best bucket size. Patches: http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.5101&r2=1.5102 http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/elflink.c.diff?cvsroot=src&r1=1.373&r2=1.374
Hi Todd, I have checked in your patch - thanks very much for creating it. Cheers Nick bfd/ChangeLog 2010-08-12 Todd Veldhuizen <todd.veldhuizen@logicblox.com> PR ld/11843 * elflink.c (compute_bucket_count): Avoid futile long searches for the best bucket size.