Bug 13570 - Controlling gmon.out file creation in HPC environment
Summary: Controlling gmon.out file creation in HPC environment
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-06 18:32 UTC by Pidad
Modified: 2014-06-27 11:14 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Description Pidad 2012-01-06 18:32:09 UTC
In High Performance Computing (HPC), parallel Message Passing Interface (MPI) applications run as a large number of tasks (hundreds, thousands, or up to a million). These tasks may run on different nodes of an HPC cluster. Each task is a separate process, and the tasks execute in parallel using the same binary (SPMD or MPMD models).

When these MPI applications are profiled using the -pg option, every task writes its own gmon.out file, so as many separate gmon.out files are created on the same shared file system as there are tasks. This becomes a bottleneck for the application, both in file system space and in the large number of disk operations during process exit.

HPC users may be interested only in a limited set of gmon.out files that meet certain defined criteria, not in all the gmon.out files created. However, that set is not known until the application has completed execution and the gmon.out files are about to be written.

To address this bottleneck and give the user only the gmon.out files of interest, the following approach could be considered:
1. Disable the generation of the gmon.out file at exit().
2. Obtain access to all gmon data in memory before the application exits, using public interfaces, so that the data can be processed for the subset of tasks that are of interest.

This approach was tried on Red Hat 6.1 (Santiago), 2.6.32-131.0.15.el6.ppc64.

Calling moncontrol(0) could not stop the creation of the gmon.out file during exit; it only switches profiling off. The atexit handler gmon.c:_mcleanup() creates gmon.out, and there is no way to stop it from doing so.
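
The behaviour described above can be reproduced with a small sketch like the following. It is illustrative only: run_with_profiling_paused and busy_work are hypothetical names, and moncontrol is declared by hand on the assumption that the C library exports it, as glibc does.

```c
/* Declared by hand; assumption: libc exports moncontrol (glibc does).  */
extern void moncontrol (int mode);

/* Hypothetical workload: returns 0 + 1 + ... + (n - 1).  */
static unsigned long busy_work (unsigned long n)
{
  unsigned long sum = 0;
  for (unsigned long i = 0; i < n; i++)
    sum += i;
  return sum;
}

/* Switches sampling off, then runs the workload.  In a program built
   with -pg, the atexit handler registered by the profiling startup
   code still runs at exit and still writes gmon.out.  */
unsigned long run_with_profiling_paused (unsigned long n)
{
  moncontrol (0);
  return busy_work (n);
}
```

Calling run_with_profiling_paused() from main in a binary built with -pg is expected to leave a gmon.out behind on exit, which is the problem reported here.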

Even if one succeeds in stopping the creation of the gmon.out file, the variable that holds the profile data is declared hidden.
Ref - gmon.c:struct gmonparam _gmonparam attribute_hidden = { GMON_PROF_OFF };

Because of this, applications cannot read the in-memory data collected by the tasks in order to apply the filtering criteria.

We need mechanisms to stop the creation of the gmon.out file at will and to read the collected gmon data from which the gmon.out file is generated.
Comment 1 Ondrej Bilka 2013-10-26 06:03:08 UTC
There is already functionality to support this. You need to interpose _mcleanup by compiling a shared library containing a function like the one below and then LD_PRELOAD it for your program.

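#define _GNU_SOURCE             /* needed for RTLD_NEXT */
#include <dlfcn.h>

/* Placeholder for the user condition; to be defined by the user.  */
extern int user_condition (void);

/* Interposes glibc's _mcleanup so gmon.out is written only on demand.  */
void _mcleanup (void)
{
  if (user_condition ())
    {
      void (*p) (void) = (void (*) (void)) dlsym (RTLD_NEXT, "_mcleanup");
      if (p != NULL)
        p ();
    }
}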
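
The user condition in the interposer above is left open. As one possible sketch, the check could select tasks by MPI rank. Everything here is an assumption for illustration: the helper names rank_in_list and gmon_wanted are hypothetical, GMON_KEEP_RANKS is an invented environment variable, and the rank is read from OMPI_COMM_WORLD_RANK or PMI_RANK, which Open MPI and MPICH-style launchers typically export. A real condition could equally consult data computed by the application at run time.

```c
#include <stdlib.h>

/* Returns 1 if rank appears in the comma-separated list, else 0.  */
int rank_in_list (const char *list, long rank)
{
  if (list == NULL)
    return 0;
  const char *s = list;
  while (*s != '\0')
    {
      char *end;
      long v = strtol (s, &end, 10);
      if (end == s)
        {
          s++;                  /* not a number: skip one character */
          continue;
        }
      if (v == rank)
        return 1;
      s = end;
      if (*s == ',')
        s++;                    /* step over the separator */
    }
  return 0;
}

/* Hypothetical user condition: keep gmon.out only for listed ranks.  */
int gmon_wanted (void)
{
  const char *rank_str = getenv ("OMPI_COMM_WORLD_RANK");
  if (rank_str == NULL)
    rank_str = getenv ("PMI_RANK");
  if (rank_str == NULL)
    return 1;                   /* no known launcher: default behaviour */
  return rank_in_list (getenv ("GMON_KEEP_RANKS"),
                       strtol (rank_str, NULL, 10));
}
```

With a check like this compiled into the preloaded shared object alongside the interposer (for example with gcc -shared -fPIC, adding -ldl on older glibc versions), only the listed ranks would go on to call the real _mcleanup and write gmon.out.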