Bug 29438 - enabling gmon for arbitrary DSO
Summary: enabling gmon for arbitrary DSO
Status: RESOLVED WONTFIX
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: 2.38
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2022-08-02 20:17 UTC by account disabled by myself since useless
Modified: 2023-04-28 17:23 UTC

Attachments
The patch with implementation including test (2.82 KB, patch)
2022-08-03 19:33 UTC, account disabled by myself since useless

Description account disabled by myself since useless 2022-08-02 20:17:07 UTC
As a developer, I need the ability to activate/use gmon (gprof monitor) for my DSO (shared library).

Yes, of course, I know that gmon is not intended for this. However, this restriction can be easily circumvented, roughly as shown below.

As a result, gmon.out will be created, but unfortunately gprof cannot process it correctly because of a problem similar to the one PIE had in 2017 (Bug 21189, Bug 22284).
The reason is that the fix made back then only works for a PIE, not for a DSO. In particular, the `load_address` is gathered strictly for the executable only (https://sourceware.org/git/?p=glibc.git;a=blob;f=gmon/gmon.c;h=dee64803ada583d79ca9adc281beffce4300255f;hb=d165ca64980f90ccace088670652cc203d1b5411#l55).

So I think this can be easily fixed if we take the `load_address` not from the executable but from the first DSO whose text segment intersects the address range passed to `monstartup()`.

Will such a patch be accepted if I prepare it, or will there be other ideas/suggestions?

Regards,
Leo.

---

/* Note: the code below enables gmon for the DSO when `-pg` is not
 * enabled in the main executable, and does no harm when `-pg` is enabled.
 */
extern void _mcleanup(void);
extern void monstartup(unsigned long, unsigned long);
extern void __gmon_start__(void) __attribute__((__weak__));
extern void _init(void);
extern void _fini(void);

/* Start profiling the DSO's code between _init and _fini, unless the
 * weak __gmon_start__ symbol is resolved (non-null). */
__attribute__((__constructor__, __no_instrument_function__, __no_profile_instrument_function__))
static void mydso_global_constructor(void) {
#ifdef ENABLE_GPROF_FOR_THIS_DSO
  if (!&__gmon_start__)
    monstartup((unsigned long)&_init, (unsigned long)&_fini);
#endif
}

/* Stop profiling and write gmon.out under the same condition. */
__attribute__((__destructor__, __no_instrument_function__, __no_profile_instrument_function__))
static void mydso_global_destructor(void) {
#ifdef ENABLE_GPROF_FOR_THIS_DSO
  if (!&__gmon_start__)
    _mcleanup();
#endif
}
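
The idea from the description (take the `load_address` from the first loaded object whose text segment intersects the range passed to `monstartup()`) could be sketched roughly as follows. This is only an illustration and not the attached patch: it uses the public `dl_iterate_phdr()` interface, while code inside glibc may well use its internal data structures instead. The names `range_query`, `match_text_segment` and `find_load_address` are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper state: the monitored range and the lookup result. */
struct range_query {
  uintptr_t lowpc, highpc;  /* range that would be passed to monstartup() */
  uintptr_t load_address;   /* dlpi_addr of the first matching object */
  int found;
};

/* Callback for dl_iterate_phdr(): stop at the first object that has an
 * executable PT_LOAD segment overlapping [lowpc, highpc). */
static int match_text_segment(struct dl_phdr_info *info, size_t size, void *data) {
  struct range_query *q = data;
  (void)size;
  for (unsigned i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type != PT_LOAD || !(ph->p_flags & PF_X))
      continue;
    uintptr_t start = info->dlpi_addr + ph->p_vaddr;
    uintptr_t end = start + ph->p_memsz;
    if (q->lowpc < end && start < q->highpc) {
      q->load_address = info->dlpi_addr;
      q->found = 1;
      return 1; /* a non-zero return value stops the iteration */
    }
  }
  return 0;
}

/* Returns 1 and stores the load address in *out if some loaded object
 * covers part of the range, otherwise returns 0 and leaves *out alone. */
int find_load_address(uintptr_t lowpc, uintptr_t highpc, uintptr_t *out) {
  struct range_query q = { lowpc, highpc, 0, 0 };
  dl_iterate_phdr(match_text_segment, &q);
  if (q.found)
    *out = q.load_address;
  return q.found;
}
```

Under these assumptions, a DSO constructor like the one above could call `find_load_address((uintptr_t)&_init, (uintptr_t)&_fini, &base)` to learn its own load address, which is the value gprof needs in order to map sampled addresses back to symbols.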
Comment 1 account disabled by myself since useless 2022-08-03 15:30:39 UTC
I have a working draft of the patch ready, along with the tests.

However, during testing, a nasty bug was found in the calculation of the size of the buffer allocated inside `__monstartup()`, because of which `_mcount()` may corrupt memory and `write_call_graph()` does not output part of the collected data (the part that falls outside the buffer boundary).

So I'll file one more bug with a patch, and then come back here.
Comment 2 account disabled by myself since useless 2022-08-03 19:33:37 UTC
Created attachment 14254 [details]
The patch with implementation including test

The added `gmon/tst-gmon-dso` test requires Bug 29444 to be fixed (i.e. the corresponding patch should be applied).

I also didn't figure out how the tests check the generated data, in particular the gprof output.

Nonetheless, I manually checked the gprof output for both `make test t=gmon/tst-gmon-pie` and `make test t=gmon/tst-gmon-dso`. Surprisingly, it was during this check that Bug 29444 was revealed.
Comment 3 Szabolcs Nagy 2022-10-21 13:39:16 UTC
I don't know if the patch is acceptable; I just want to note that there is an LD_PROFILE feature for something similar (I haven't used it though).
Is that not suitable for your use case?
Comment 4 account disabled by myself since useless 2023-01-30 12:32:43 UTC
(In reply to Szabolcs Nagy from comment #3)
> LD_PROFILE feature for something similar (i haven't used it though).
> is that not suitable for your use case?

I agree that I should have mentioned LD_PROFILE.
I was focused on my own case and left LD_PROFILE out so as not to pile up more information.

Briefly:
In my case, LD_PROFILE is not suitable, since profiling was done on a customer system where, due to specific requirements, it was not possible to change the environment settings.

I solved all the problems there, but nevertheless decided to contribute back the improvements that might be useful (i.e. not to throw them away).

--

In addition to all of the above, Bug 29444, found along the way, is a real bug that needs to be fixed.
Comment 5 account disabled by myself since useless 2023-02-08 12:07:46 UTC
Useless.
Comment 6 Sourceware Commits 2023-02-22 22:24:26 UTC
The master branch has been updated by DJ Delorie <dj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=801af9fafd4689337ebf27260aa115335a0cb2bc

commit 801af9fafd4689337ebf27260aa115335a0cb2bc
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)
    
    `__monstartup()` allocates a buffer used to store all the data
    accumulated by the monitor.
    
    The size of this buffer depends on the size of the internal structures
    used and the address range for which the monitor is activated, as well
    as on the maximum density of call instructions and/or callable functions
    that could potentially be in a segment of executable code.
    
    In particular, a hash table of arcs is placed at the end of this buffer.
    The size of this hash table is calculated in bytes as
       p->fromssize = p->textsize / HASHFRACTION;
    
    but actually should be
       p->fromssize = ROUNDUP(p->textsize / HASHFRACTION, sizeof(*p->froms));
    
    This results in writing beyond the end of the allocated buffer when an
    added arc corresponds to a call near the end of the monitored
    address range, since `_mcount()` checks the incoming caller address
    against the monitored range but not the intermediate hash-like index
    that is used to write into the table.
    
    It should be noted that when the results are output to `gmon.out`, the
    table is read up to the last element calculated from the allocated size
    in bytes, so the arcs stored outside the buffer boundary did not reach
    `gprof` for analysis. This "feature" helped me find the bug while
    working on https://sourceware.org/bugzilla/show_bug.cgi?id=29438
    
    Just in case, I will explicitly note that the problem breaks the
    `make test t=gmon/tst-gmon-dso` added for Bug 29438.
    There, the arc of the `f3()` call disappears from the output, since in
    the DSO case, the call to `f3` is located close to the end of the
    monitored range.
    
    Signed-off-by: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
    
    Another minor error appears to be a related typo in the calculation of
    `kcountsize`; since kcounts are smaller than froms, the rounding there
    actually serves to align the p->froms data.
    
    Co-authored-by: DJ Delorie <dj@redhat.com>
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
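
The sizing problem described in the commit message above can be illustrated with a small arithmetic sketch. It is not the glibc code: `HASHFRACTION` is assumed to be 2, the table element is assumed to be 8 bytes wide (`arc_slot_t` is a hypothetical stand-in for the element type of `p->froms`), `ROUNDUP` is defined locally, and the slot index formula in `last_slot()` is an assumption modeled on the "hash-like index" mentioned in the message.

```c
#include <stddef.h>
#include <stdint.h>

#define HASHFRACTION 2                           /* assumed value */
#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

typedef uint64_t arc_slot_t;                     /* hypothetical element type */

/* Table size in bytes as computed before the fix. */
size_t froms_bytes_old(size_t textsize) {
  return textsize / HASHFRACTION;
}

/* Table size in bytes as computed after the fix: a whole number of slots. */
size_t froms_bytes_new(size_t textsize) {
  return ROUNDUP(textsize / HASHFRACTION, sizeof(arc_slot_t));
}

/* Slot index used for a caller at the last byte of the monitored range,
 * assuming one slot per HASHFRACTION * sizeof(arc_slot_t) bytes of text. */
size_t last_slot(size_t textsize) {
  return (textsize - 1) / (HASHFRACTION * sizeof(arc_slot_t));
}

/* Does the given slot lie entirely inside a table of table_bytes bytes? */
int slot_fits(size_t slot, size_t table_bytes) {
  return (slot + 1) * sizeof(arc_slot_t) <= table_bytes;
}
```

With these assumptions and a text size of 100 bytes, the old formula yields a 50-byte table while the last slot (index 6) ends at byte 56, so a write to it runs past the table. The rounded-up size is 56 bytes, which holds that slot completely.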
Comment 7 Sourceware Commits 2023-04-28 12:13:15 UTC
The release/2.37/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5d750495db62eeb3c5a62b80a1747b552db664fb

commit 5d750495db62eeb3c5a62b80a1747b552db664fb
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)
    
    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
Comment 8 Sourceware Commits 2023-04-28 14:35:18 UTC
The release/2.36/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=33909e5abc8ad24c52d48bd005e102b6b56537c4

commit 33909e5abc8ad24c52d48bd005e102b6b56537c4
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)
    
    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
Comment 9 Sourceware Commits 2023-04-28 14:35:46 UTC
The release/2.35/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f2820e478c68a73a38f81512cc38beeee220212a

commit f2820e478c68a73a38f81512cc38beeee220212a
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)
    
    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)
Comment 10 Sourceware Commits 2023-04-28 17:23:32 UTC
The release/2.34/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8e1a8e04b153739a77289e6fc07cbfc252d87e02

commit 8e1a8e04b153739a77289e6fc07cbfc252d87e02
Author: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Date:   Sat Feb 4 14:41:38 2023 +0300

    gmon: Fix allocated buffer overflow (bug 29444)
    
    (cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)