This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: mtrace perl tool reporting "+ 0x000001003c393dc0 Alloc 3183 duplicate: 0x3fffaa678d08 /lib64/libntirpc.so.1.3:[0x3fffaa678d08]"
- From: Malahal Naineni <malahal at gmail dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: DJ Delorie <dj at redhat dot com>, libc-alpha at sourceware dot org
- Date: Fri, 22 Feb 2019 02:04:05 +0530
- Subject: Re: mtrace perl tool reporting "+ 0x000001003c393dc0 Alloc 3183 duplicate: 0x3fffaa678d08 /lib64/libntirpc.so.1.3:[0x3fffaa678d08]"
- References: <xn4l8zbca4.fsf@greed.delorie.com> <1def8a74-a7a0-ecc8-dad7-fdc076de414f@redhat.com>
>> The serializing lock in all the tr_*hook functions should ensure that you see the correct order.
I looked at the tr_freehook() source. It changes __free_hook just before
calling the real free(). So if two threads call free() at the same time,
one free() may call tr_freehook(), which acquires the lock and changes
__free_hook to tr_old_free_hook, which is NULL. The second thread's
free() call may then see this NULL hook, bypass the lock entirely, and
never log anything to the trace file. I think the mtrace() hooks don't
work with multithreaded applications and are pretty useless there.
Am I wrong?
Regards, Malahal.
On Wed, Feb 20, 2019 at 12:36 PM Carlos O'Donell <carlos@redhat.com> wrote:
>
> On 2/19/19 6:15 PM, DJ Delorie wrote:
> >
> > Malahal Naineni <malahal@gmail.com> writes:
> >> I am using mtrace to find a memory leak in a long-running server
> >> application. Why do I get the above warning/error? I am using the
> >> glibc-2.17-222.el7 package from Red Hat, if that matters. The
> >> application is multi-threaded.
> >
> > This basically says "you've already allocated this, and I didn't see a
> > free()" which *seems* like malloc() is returning the same address twice.
> > In reality, it's more likely (assuming something hasn't actually broken)
> > that threads are doing malloc operations in one order, but reporting
> > them in a different order, so the mtrace tool sees a malloc() return a
> > pointer just before the free() that had free'd it, instead of
> > free-then-malloc.
> >
> > i.e. the program does this:
> >
> > x = malloc(sz);
> > free (x);
> > x = malloc(sz); /* returns same value */
> >
> > but the *logs* say something like this:
> >
> > + MALLOC 0x1234
> > + MALLOC 0x1234
> > - FREE 0x1234
> >
> > Having said that, something could be broken for any number of weird
> > reasons (example: a lost record, data corruption, etc), and knowing for
> > sure what the problem is would require poring over your logs for hours
> > trying to figure out what actually happened.
>
> That is odd. The serializing lock in all the tr_*hook functions should
> ensure that you see the correct order. One possibility is a thread crash
> or hang in free/tr_freehook.
>
> Only in the modern tracer we wrote (not mtrace) can you get trace
> inversion because we don't force a total global order, we just dump
> as fast as we can into the trace buffer using atomics and sort it
> later.
>
> The only real solution is to go through your trace event by event and
> figure out what's wrong.
>
> --
> Cheers,
> Carlos.