Bug 27576 - gmon.out not consistently created
Status: UNCONFIRMED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: 2.31
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2021-03-13 19:07 UTC by Wilson Snyder
Modified: 2023-04-28 17:23 UTC (History)
Attachments
Testcase (674.22 KB, application/x-compressed)
2021-03-13 19:07 UTC, Wilson Snyder
proposed patch (also sent to mailing list) (5.67 KB, patch)
2023-02-11 09:15 UTC, Simon Kissane

Description Wilson Snyder 2021-03-13 19:07:47 UTC
Created attachment 13307
Testcase

== Description

Type "make" in this tarball's directory.

This will show (extra lines omitted):

  g++  -pg -O0  -c -o Vt_case_huge_prof__main.o Vt_case_huge_prof__main.ii
  @@@@@ Did we get a gmon?
  make: [Makefile:15: x] Error 1 (ignored)

  g++  -pg -Os  -c -o Vt_case_huge_prof__main.o Vt_case_huge_prof__main.ii
  @@@@@ Did we get a gmon?
    4554816     28 -rw-rw-r--   1 user  user     26943 Mar 13 11:33 ./gmon.out

  g++  -pg -O0  -c -o Vt_case_huge_prof__main.o Vt_case_huge_prof__main__ok.ii
  @@@@@ Did we get a gmon?
    4554816    884 -rw-rw-r--   1 user  user    897414 Mar 13 11:33 ./gmon.out


In short, gcc -pg with -O0 does not reliably create gmon.out.

The problem goes away using -Os.

The problem goes away if line 60549 of ~/d/case/Vt_case_huge_prof__main.ii

    std::array<CData , 16384> __Vtablechg1;

is changed into

    CData __Vtablechg1[16384];

Others have reported that the problem appears timing-related and that gmon.out is sometimes produced, but on my system it fails every time.
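
To put a number on "sometimes", one option is to run the binary repeatedly and count how often gmon.out appears. This is only a sketch: count_gmon_runs is a hypothetical helper that is not part of the attached test case, and it assumes the built binary is ./Vt_case_huge_prof in the current directory.

```shell
#!/bin/sh
# count_gmon_runs N CMD [ARGS...]
# Run CMD N times in the current directory and print how many of those
# runs left a gmon.out behind. Hypothetical helper for illustration.
count_gmon_runs() {
    runs=$1; shift
    made=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        rm -f gmon.out                 # start each run from a clean state
        "$@" >/dev/null 2>&1
        if [ -f gmon.out ]; then made=$((made + 1)); fi
        i=$((i + 1))
    done
    echo "$made"
}

# Ten runs of the test case binary, only if it has been built.
if [ -x ./Vt_case_huge_prof ]; then
    echo "gmon.out written in $(count_gmon_runs 10 ./Vt_case_huge_prof) of 10 runs"
fi
```

A count of 0 would match the "always fails" behaviour on the reporter's machine, while a count strictly between 0 and 10 would match the intermittent behaviour others reported.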

Experiments show that clang++ (with the same glibc on the same system) misbehaves in the same way.

While this "feels" like a memory-corruption bug, the program is sanitizer- and warning-clean. This was originally reported as a GCC bug at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99579, but it appears the compiler is generating reasonable code.

== gcc --version

gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

== glibc version

glibc-source/focal-updates 2.31-0ubuntu9.2 all

== cat /etc/*release

DISTRIB_DESCRIPTION="Ubuntu 20.04 LTS"

== cat /proc/cpuinfo

... AMD Ryzen 9 3950X 16-Core
Comment 1 Wilson Snyder 2021-03-13 20:22:24 UTC
Pasting additional info from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99579

"GCC is emitting the mcount call correctly for each function.
Glibc is the library where the mcount is located and enabling of the timer (SIGPROF) is enabled and it controls the outputting of the gmon.out.
From what I can tell if the timer does not happen, then there will be no gmon.out outputted.
With a slower/older machine (AMD Athlon(tm) II X4 640 Processor), I sometimes get gmon.out and sometimes don't."
Comment 2 account disabled by myself since useless 2022-08-03 19:58:36 UTC
This may be a manifestation of the bug described and fixed (patch attached) in Bug 29444.
Comment 3 Simon Kissane 2023-02-10 10:56:45 UTC
I'm pretty sure this is an overflow in mcount.

Here's how I can tell:

$ gdb ./Vt_case_huge_prof 
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./Vt_case_huge_prof...
(No debugging symbols found in ./Vt_case_huge_prof)
(gdb) break _mcleanup
Function "_mcleanup" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_mcleanup) pending.
(gdb) run
Starting program: /home/skissane/bz27576/case/Vt_case_huge_prof 
- t/t_case_huge.v:206: Verilog $finish

Breakpoint 1, _mcleanup () at gmon.c:440
440     gmon.c: No such file or directory.
(gdb) print _gmonparam
$1 = {state = 2, kcount = 0x7ffff7972a98, kcountsize = 508200, froms = 0x7ffff79eebc0, fromssize = 508194, tos = 0x7ffff78c0010, tossize = 731784, tolimit = 30491, lowpc = 93824992247264, 
  highpc = 93824993263652, textsize = 1016388, hashfraction = 2, log_hashfraction = 4}

Note that state=2. If you look at /usr/include/sys/gmon.h you will see:

#define GMON_PROF_ERROR 2

So gmon has gone into its error state.

Per the source code, there are only two ways this can happen:

1) monstartup runs out of memory - not the case here, since tos would then be NULL, but tos is non-NULL
2) overflow in mcount

So this is clearly an overflow in mcount: there is too much profiling data, and mcount runs out of space in its buffer to store it. Changing the optimisation options probably makes the issue go away because it changes how the code is laid out (how many functions are called versus inlined, and so on), so less buffer space is consumed.
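
The same check can be scripted. The sketch below assumes the four state values in sys/gmon.h are numbered 0 to 3 (only GMON_PROF_ERROR = 2 is quoted above; the other three are my reading of the header and should be verified locally). gmon_state_name is a hypothetical helper name. The gdb part needs glibc debug information to resolve _gmonparam and is skipped if gdb or the binary is missing.

```shell
#!/bin/sh
# gmon_state_name N: map a numeric _gmonparam.state to its macro name.
# Assumed numbering from sys/gmon.h; check against your installed header.
gmon_state_name() {
    case "$1" in
        0) echo GMON_PROF_ON ;;
        1) echo GMON_PROF_BUSY ;;
        2) echo GMON_PROF_ERROR ;;
        3) echo GMON_PROF_OFF ;;
        *) echo unknown ;;
    esac
}

gmon_state_name 2    # the value seen in the session above

# Non-interactive version of the gdb session.
prog=./Vt_case_huge_prof
if command -v gdb >/dev/null 2>&1 && [ -x "$prog" ]; then
    gdb -batch \
        -ex 'set breakpoint pending on' \
        -ex 'break _mcleanup' \
        -ex 'run' \
        -ex 'print _gmonparam.state' \
        "$prog"
fi
```

Setting "breakpoint pending on" stands in for answering "y" to the pending-breakpoint prompt, which batch mode cannot do interactively.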

Contrary to the comment by "account removed" (who I understand is someone who had their Bugzilla access revoked due to unpleasant behaviour), I don't believe this is related to bug 29444. That bug complains that kcountsize and fromssize are calculated incorrectly, causing a buffer overflow. Even if that's true, the overflow condition here is based on exceeding tolimit, and the patch for that bug doesn't change the calculation of tolimit.

One problem here is that mcount doesn't print any error message when the overflow happens, leaving the user confused as to what went wrong, as happened in this case.

Another problem is that there is no way to increase the buffer size beyond MAXARCS entries without recompiling glibc. If MAXARCS (and maybe MINARCS too, in case the buffer sizing heuristic is malfunctioning) were tunables, the user could respond to the overflow error message (were it printed) by increasing the relevant tunable.
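
As a rough check of the sizing heuristic, the arithmetic below reproduces tolimit from textsize in the gdb output. The constants ARCDENSITY = 3, MINARCS = 50 and MAXARCS = 1 << 20 are my reading of sys/gmon.h and should be treated as assumptions, and arc_limit is a hypothetical helper name rather than anything in glibc.

```shell
#!/bin/sh
# Assumed constants from sys/gmon.h (verify against your glibc version).
ARCDENSITY=3        # percent of the text size reserved for arc records
MINARCS=50          # floor on the number of arc records
MAXARCS=1048576     # ceiling on the number of arc records (1 << 20)

# arc_limit TEXTSIZE: print the arc-table size after clamping the
# heuristic value to the range [MINARCS, MAXARCS].
arc_limit() {
    limit=$(( $1 * ARCDENSITY / 100 ))
    if [ "$limit" -lt "$MINARCS" ]; then limit=$MINARCS; fi
    if [ "$limit" -gt "$MAXARCS" ]; then limit=$MAXARCS; fi
    echo "$limit"
}

arc_limit 1016388   # textsize from the gdb output; prints 30491
```

If these constants are right, the result matches tolimit = 30491 in the gdb output. That is the heuristic value rather than the MAXARCS ceiling, so for this binary the floor (MINARCS) looks like the setting that would actually enlarge the table.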
Comment 4 Simon Kissane 2023-02-11 09:15:32 UTC
Created attachment 14676
proposed patch (also sent to mailing list)
Comment 5 Sourceware Commits 2023-02-23 02:01:17 UTC
The master branch has been updated by DJ Delorie <dj@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=31be941e4367c001b2009308839db5c67bf9dcbc

commit 31be941e4367c001b2009308839db5c67bf9dcbc
Author: Simon Kissane <skissane@gmail.com>
Date:   Sat Feb 11 20:12:13 2023 +1100

    gmon: improve mcount overflow handling [BZ# 27576]
    
    When mcount overflows, no gmon.out file is generated, but no message is printed
    to the user, leaving the user with no idea why, and thinking maybe there is
    some bug - which is how BZ 27576 ended up being logged. Print a message to
    stderr in this case so the user knows what is going on.
    
    As a comment in sys/gmon.h acknowledges, the hardcoded MAXARCS value is too
    small for some large applications, including the test case in that BZ. Rather
    than increase it, add tunables to enable MINARCS and MAXARCS to be overridden
    at runtime (glibc.gmon.minarcs and glibc.gmon.maxarcs). So if a user gets the
    mcount overflow error, they can try increasing maxarcs (they might need to
    increase minarcs too if the heuristic is wrong in their case.)
    
    Note setting minarcs/maxarcs too large can cause monstartup to fail with an
    out of memory error. If you set them large enough, it can cause an integer
    overflow in calculating the buffer size. I haven't done anything to defend
    against that - it would not generally be a security vulnerability, since these
    tunables will be ignored in suid/sgid programs (due to the SXID_ERASE default),
    and if you can set GLIBC_TUNABLES in the environment of a process, you can take
    it over anyway (LD_PRELOAD, LD_LIBRARY_PATH, etc). I thought about modifying
    the code of monstartup to defend against integer overflows, but doing so is
    complicated, and I realise the existing code is susceptible to them even prior
    to this change (e.g. try passing a pathologically large highpc argument to
    monstartup), so I decided just to leave that possibility in-place.
    
    Add a test case which demonstrates mcount overflow and the tunables.
    
    Document the new tunables in the manual.
    
    Signed-off-by: Simon Kissane <skissane@gmail.com>
    Reviewed-by: DJ Delorie <dj@redhat.com>
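
A minimal sketch of using the two tunables named in the commit message, assuming a glibc build that contains this commit. The numeric values are placeholders chosen for illustration, not recommendations, and the binary path is the one from the gdb session earlier in this bug.

```shell
#!/bin/sh
# Illustrative values only; suitable numbers depend on the program.
minarcs=200000
maxarcs=2000000

# GLIBC_TUNABLES takes colon-separated name=value pairs.
tunables="glibc.gmon.minarcs=${minarcs}:glibc.gmon.maxarcs=${maxarcs}"
echo "GLIBC_TUNABLES=${tunables}"

# Run the profiled binary with the tunables set, if it has been built.
prog=./Vt_case_huge_prof
if [ -x "$prog" ]; then
    rm -f gmon.out
    GLIBC_TUNABLES="$tunables" "$prog"
    ls -l gmon.out
fi
```

As the commit message warns, very large values can make monstartup fail with an out-of-memory error, so it is probably safer to raise them gradually.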
Comment 6 Sourceware Commits 2023-04-28 12:13:21 UTC
The release/2.37/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=d230623264e300ac1c827cb83ad7818f122a6a98

commit d230623264e300ac1c827cb83ad7818f122a6a98
Author: Simon Kissane <skissane@gmail.com>
Date:   Sat Feb 11 20:12:13 2023 +1100

    gmon: improve mcount overflow handling [BZ# 27576]
    
    Signed-off-by: Simon Kissane <skissane@gmail.com>
    Reviewed-by: DJ Delorie <dj@redhat.com>
    (cherry picked from commit 31be941e4367c001b2009308839db5c67bf9dcbc)
Comment 7 Sourceware Commits 2023-04-28 14:35:23 UTC
The release/2.36/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8920855c4568fa99c67cb1c7d6b29465c25f51c2

commit 8920855c4568fa99c67cb1c7d6b29465c25f51c2
Author: Simon Kissane <skissane@gmail.com>
Date:   Sat Feb 11 20:12:13 2023 +1100

    gmon: improve mcount overflow handling [BZ# 27576]
    
    Signed-off-by: Simon Kissane <skissane@gmail.com>
    Reviewed-by: DJ Delorie <dj@redhat.com>
    (cherry picked from commit 31be941e4367c001b2009308839db5c67bf9dcbc)
Comment 8 Sourceware Commits 2023-04-28 14:35:51 UTC
The release/2.35/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9f81b8fa65798233dff2794ff0d8d2d5d8062e8b

commit 9f81b8fa65798233dff2794ff0d8d2d5d8062e8b
Author: Simon Kissane <skissane@gmail.com>
Date:   Sat Feb 11 20:12:13 2023 +1100

    gmon: improve mcount overflow handling [BZ# 27576]
    
    Signed-off-by: Simon Kissane <skissane@gmail.com>
    Reviewed-by: DJ Delorie <dj@redhat.com>
    (cherry picked from commit 31be941e4367c001b2009308839db5c67bf9dcbc)
Comment 9 Sourceware Commits 2023-04-28 17:23:37 UTC
The release/2.34/master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4dd89b2a8fc91bc74ea85a442ae4c672b6dda113

commit 4dd89b2a8fc91bc74ea85a442ae4c672b6dda113
Author: Simon Kissane <skissane@gmail.com>
Date:   Sat Feb 11 20:12:13 2023 +1100

    gmon: improve mcount overflow handling [BZ# 27576]
    
    Signed-off-by: Simon Kissane <skissane@gmail.com>
    Reviewed-by: DJ Delorie <dj@redhat.com>
    (cherry picked from commit 31be941e4367c001b2009308839db5c67bf9dcbc)