As a developer, I need the ability to activate/use gmon (gprof monitor) for my DSO (shared library).

Yes, of course, I know that gmon is not intended for this. However, this restriction can be easily circumvented, roughly as shown below. As a result, gmon.out will be created, which unfortunately cannot be processed correctly by gprof due to a problem similar to the one with PIE in 2017 (Bug 21189, Bug 22284).

The reason is that the fix made back then is only intended to work for a PIE, not for a DSO. In particular, the `load_address` is gathered strictly for the executable only (https://sourceware.org/git/?p=glibc.git;a=blob;f=gmon/gmon.c;h=dee64803ada583d79ca9adc281beffce4300255f;hb=d165ca64980f90ccace088670652cc203d1b5411#l55).

So I think this can be easily fixed if we take the `load_address` not from the executable but from the first DSO whose text segment intersects the range of addresses passed to `monstartup()`.

Will such a patch be accepted if I prepare it, or are there other ideas/suggestions?

Regards,
Leo.

---

/* It is worth noting that the code below works for DSO
 * when `-pg` is not enabled in the main executable
 * but does not create problems if `-pg` is enabled. */

extern void _mcleanup (void);
extern void monstartup (unsigned long, unsigned long);
extern void __gmon_start__ (void) __attribute__((__weak__));
extern void _init (void);
extern void _fini (void);

__attribute__((__constructor__, __no_instrument_function__,
               __no_profile_instrument_function__))
static void mydso_global_constructor (void)
{
#ifdef ENABLE_GPROF_FOR_THIS_DSO
  /* Start the monitor only if the main executable has not done so. */
  if (!&__gmon_start__)
    monstartup ((unsigned long) &_init, (unsigned long) &_fini);
#endif
}

__attribute__((__destructor__, __no_instrument_function__,
               __no_profile_instrument_function__))
static void mydso_global_destructor (void)
{
#ifdef ENABLE_GPROF_FOR_THIS_DSO
  if (!&__gmon_start__)
    _mcleanup ();
#endif
}
I have a working draft of the patch ready, along with the tests. However, during testing, a nasty bug was found in the calculation of the size of the buffer allocated inside `__monstartup()`. Because of it, `_mcount()` may corrupt memory, and `write_call_graph()` does not output the part of the collected data that falls outside the buffer boundary.

So I'll file one more bug with a patch, and then come back here.
Created attachment 14254
The patch with implementation including test

The added `gmon/tst-gmon-dso` test requires Bug 29444 to be fixed (i.e. the corresponding patch should be applied).

I also didn't figure out how the tests check the generated data, in particular the gprof output. Nonetheless, I manually checked the gprof output for both `make test t=gmon/tst-gmon-pie` and `make test t=gmon/tst-gmon-dso`. Surprisingly, it was during this check that Bug 29444 was revealed.
i dont know if the patch is acceptable just want to note that there is an LD_PROFILE feature for something similar (i haven't used it though). is that not suitable for your use case?
(In reply to Szabolcs Nagy from comment #3)
> LD_PROFILE feature for something similar (i haven't used it though).
> is that not suitable for your use case?

I agree that I should have mentioned LD_PROFILE. I was focused on my own case and left it out so as not to pile up more information.

Briefly: in my case LD_PROFILE is not suitable, because the profiling took place on a customer system where, due to specific requirements, it was not possible to change the environment settings. I solved all the problems there, but nevertheless decided to contribute back the improvements that might be useful, rather than throw them away.

--
In addition to all of the above, Bug 29444 is a real bug that needs to be fixed.
Useless.
The master branch has been updated by DJ Delorie <dj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=801af9fafd4689337ebf27260aa115335a0cb2bc

commit 801af9fafd4689337ebf27260aa115335a0cb2bc
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)

    The `__monstartup()` allocates a buffer used to store all the data
    accumulated by the monitor.

    The size of this buffer depends on the size of the internal structures
    used and the address range for which the monitor is activated, as well
    as on the maximum density of call instructions and/or callable functions
    that could be potentially on a segment of executable code.

    In particular a hash table of arcs is placed at the end of this buffer.
    The size of this hash table is calculated in bytes as
       p->fromssize = p->textsize / HASHFRACTION;

    but actually should be
       p->fromssize = ROUNDUP(p->textsize / HASHFRACTION, sizeof(*p->froms));

    This results in writing beyond the end of the allocated buffer when an
    added arc corresponds to a call near from the end of the monitored
    address range, since `_mcount()` check the incoming caller address for
    monitored range but not the intermediate result hash-like index that
    uses to write into the table.

    It should be noted that when the results are output to `gmon.out`, the
    table is read to the last element calculated from the allocated size in
    bytes, so the arcs stored outside the buffer boundary did not fall into
    `gprof` for analysis. Thus this "feature" help me to found this bug
    during working with https://sourceware.org/bugzilla/show_bug.cgi?id=29438

    Just in case, I will explicitly note that the problem breaks the
    `make test t=gmon/tst-gmon-dso` added for Bug 29438.
    There, the arc of the `f3()` call disappears from the output, since in
    the DSO case, the call to `f3` is located close to the end of the
    monitored range.

    Signed-off-by: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>

    Another minor error seems a related typo in the calculation of
    `kcountsize`, but since kcounts are smaller than froms, this is
    actually to align the p->froms data.

    Co-authored-by: DJ Delorie <dj@redhat.com>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The release/2.37/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5d750495db62eeb3c5a62b80a1747b552db664fb

commit 5d750495db62eeb3c5a62b80a1747b552db664fb
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)

    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
The release/2.36/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=33909e5abc8ad24c52d48bd005e102b6b56537c4

commit 33909e5abc8ad24c52d48bd005e102b6b56537c4
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)

    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
The release/2.35/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f2820e478c68a73a38f81512cc38beeee220212a

commit f2820e478c68a73a38f81512cc38beeee220212a
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)

    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
The release/2.34/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8e1a8e04b153739a77289e6fc07cbfc252d87e02

commit 8e1a8e04b153739a77289e6fc07cbfc252d87e02
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)

    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)