[GOLD] add new method for computing a build ID (take 2)

Cary Coutant ccoutant@google.com
Fri Mar 22 22:06:00 GMT 2013

+  DEFINE_uint64(build_id_chunk_size_for_treehash,
+                options::TWO_DASHES, '\0', 2 << 20,
+                N_("Chunk size for '--build-id=tree'"), N_("SIZE"));
+  DEFINE_uint64(build_id_min_file_size_for_treehash, options::TWO_DASHES,
+                '\0', 200 << 20,
+                N_("Minimum output file size for '--build-id=tree' to work"
+                   " differently than '--build-id=sha1'"), N_("SIZE"));

These are easily the longest options gold would support, and they
have a lot of dashes in them. Given the help text, do you think
"for-treehash" needs to be in the option name? Maybe
--build-id-chunk-size and --build-id-tree-threshold?  Just a
suggestion -- if you and Ian are happy with the long option
names, that's fine with me.

+Layout::start_asynchronous_build_id_if_needed(Workqueue* workqueue,
+                                              const General_options* options,
+                                              Output_file* of) const

Need a comment for this function.

+      workqueue->queue(new Task_function(new Close_task_runner(options, this,
+                                                               of),
+                                         blocker,
+                                         "Task_function Close_task_runner"));
+      return true;

The idea that the Close_task_runner schedules a new copy of
itself, and bases what it does on whether or not
array_of_hashes_ == NULL, strikes me as somewhat kludgy.

It seems to me that you could call this function from near the
end of queue_final_tasks, just before where it schedules
Close_task_runner, with final_blocker as a parameter. If you need
to run the hash tasks, they would all depend on final_blocker,
and you would return the new blocker. If you don't need to run
the hash tasks, you would simply return final_blocker. Then
queue_final_tasks would schedule Close_task_runner using the
returned blocker.

(Also, in queue_final_tasks, your Layout* isn't const, so you
wouldn't need to make your new Layout fields mutable.)

Do you think --thread-count-final is good enough for how many
threads to use here, or do you think another parameter would
be useful? I think it's probably good for this purpose, but
I just wanted to make sure you considered it.

I would think that you'd want to (a) constrain the number of
tasks to the thread count and compute chunk size based on that,
or (b) set the thread count to match the number of chunks.

+      // Parallel computation of build ID is mostly done: all substrings
+      // of the output file have been hashed.  Compute SHA-1 hash of
the hashes.
+      sha1_buffer(reinterpret_cast<const char*>(this->array_of_hashes_),
+                  this->size_of_array_of_hashes_, ov);
+      delete[] this->array_of_hashes_;

Oh, I guess array_of_hashes_ will need to be mutable in order to
keep Layout::write_build_id const. This is at the end of the
link, so deleting this array here isn't really going to matter.
You could put the delete instead in Layout::~Layout.

+Layout::must_use_chunked_build_id_computation() const
+  uint64_t filesize = (this->output_file_size_ <= 0 ? 0
+                       : static_cast<uint64_t>(this->output_file_size_));
+  return this->build_id_note_ != NULL
+      && (strcmp(parameters->options().build_id(), "tree") == 0)
+      && (parameters->options().build_id_chunk_size_for_treehash() > 0)
+      && (filesize >=
+          parameters->options().build_id_min_file_size_for_treehash());

There's also no point in using the tree hash if the linker isn't running
multi-threaded, right?

+bootstrap-test-treehash: ld1 ld3
+ rm -f $@
+ echo "#!/bin/sh" > $@
+ echo "cmp ld1 ld3" > $@

You should use ">>" for the second echo (likewise immediately below).


More information about the Binutils mailing list