Hi! I've been seeing an issue when re-building the libvirt package from Ubuntu 20.04 on a 48-core system where it frequently crashes in dtrace. I created a simple reproducer for it: ------------------------------ user@host:~$ cat foo/Makefile all: ../foo/foo.out foo.out %.out: dtrace -o foo.out -G -s /dev/null clean: rm -f foo.out user@host:~$ cd foo user@host:~/foo$ make dtrace -o foo.out -G -s /dev/null user@host:~/foo$ make clean rm -f foo.out user@host:~/foo$ make -j2 dtrace -o foo.out -G -s /dev/null dtrace -o foo.out -G -s /dev/null Traceback (most recent call last): File "/usr/bin/dtrace", line 455, in <module> sys.exit(main()) File "/usr/bin/dtrace", line 440, in main os.remove(fname) FileNotFoundError: [Errno 2] No such file or directory: 'foo.out.dtrace-temp.c' make: *** [Makefile:4: ../foo/foo.out] Error 1 ------------------------------ Here we have 2 targets that actually resolve to the same file. It seems like something about the way pattern matching (%) works prevents make from realizing that. So, a parallel make will execute 2 dtrace processes to build the same file. Because dtrace uses the same temporary filename for both - foo.out.dtrace-temp.c - they race, and one dtrace delete the file while the other is still processing it. Calling dtrace w/ -k works around the problem for me because it prevents the first dtrace process from trying to delete the file. But that likely just leaves a harder-to-hit race where one dtrace process tries to re-write the process while the other is processing it. The reason dtrace uses the same temp filename is for reproducible builds: https://sourceware.org/git/?p=systemtap.git;a=commitdiff;h=c245153ca193c471a8c8a2650834dc2f0b801bc1 The reproducible builds project suggests an alternative method of overriding the .file directive: https://reproducible-builds.org/docs/randomness/ Unfortunately I haven't found a good way to override .file outside of an assembly file. -dann
What do you think about using fcntl.flock() on the temp file, so as to lock out other concurrent scripts? (Additionally, it would be possible to salt the temp file name with a function of the input .d content, so that it's reproducible but not just a single common name.
Thanks for the response Frank! I suspect the flock() approach could have unintended side-effects depending on the underlying filesystem. I don't have a specific example. Much like a toddler's birthday party, it's just something I've learned is better to avoid over the years. I do like the idea of providing a salt to the mkstemp function. Although, probably not with the content of the input file alone because, in the libvirt case, that input is actually the same. The passed-in path is unique though, so that would seem to work. Just to make sure we're not reinventing the wheel, I reached out to the reproducible builds folks to see if they know of a way to override .file when generating C files: https://lists.reproducible-builds.org/pipermail/rb-general/2022-February/002489.html I suggest we see if they have a better idea. Meanwhile, I'll try to PoC a filename-seeded-mkstemp patch.
Created attachment 13997 [details] Hash inputs to create predictable temporary name for .c file I didn't find a good way to alter the random seed used by mkstemp(), so instead I experimented with a sha hash using the user-supplied input/output file paths. See attached. This works for my use case, though I'd prefer a solution that overrides the .file directive if we can figure out how to do that and continue to use mkstemp().
Can I interest you in a hash not of the file name, but of the file *contents*? (If I had a hash to seed a single-use rng from, I wouldn't bother use the rng at all.)
A hash of the contents - at least alone - would have the original problem I think. In the libvirt case (and in my simple reproducer), the problem is that we have 2 racing processes that are building the same target from the same source in the same directory. So a hash of the contents of each would be the same and still collide. We could add the contents into the hash *in addition to* the filenames, but I don't think that would improve uniqueness.
I didn't get any better suggestion from the reproducible builds folks: https://lists.reproducible-builds.org/pipermail/rb-general/2022-February/002490.html I'll therefore go ahead and submit the attached patch to the mailing list, as that appears to be the correct process, and we can continue the conversation there.
merged, thanks (I think I'll still follow up with an explicit flock.)