This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Duplicated messages on libc-alpha Was: [PATCH 4/5] Refactor tst-strtod-round.c
- From: "Tulio Magno Quites Machado Filho" <tuliom at linux dot vnet dot ibm dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org, fche at redhat dot com, "Paul E. Murphy" <murphyp at linux dot vnet dot ibm dot com>
- Date: Fri, 20 May 2016 18:01:35 -0300
- Subject: Duplicated messages on libc-alpha Was: [PATCH 4/5] Refactor tst-strtod-round.c
- Authentication-results: sourceware.org; auth=none
- References: <cover dot 1463433826 dot git dot murphyp at linux dot vnet dot ibm dot com> <8968b370018788e6fb7d7249118faa96f2e2ba90 dot 1463433827 dot git dot murphyp at linux dot vnet dot ibm dot com> <alpine dot DEB dot 2 dot 20 dot 1605162242570 dot 12314 at digraph dot polyomino dot org dot uk>
Hi Joseph and libc-alpha,
Joseph Myers <joseph@codesourcery.com> writes:
> So far this message has reached the libc-alpha list twice (you don't see
> it twice in the archives because it has the same message-id each time).
> I've seen this issue before with large messages coming to libc-alpha from
> IBM (many duplicate copies of them arrive on the list), as if there is
> some problem with IBM's mail server timing out before sourceware accepts
> the message, or something like that. Please try to get that mail system
> problem fixed.
TL;DR
We suspect this was caused by a load spike on the relay at
sourceware.org. Frank will try to tweak the server to minimize this.
We'll have to continue monitoring it to see if it happens again.
Complete explanation:
We found out that IBM's relay timed out while sending the message.
It did send the entire message, but didn't receive a reply in time and
re-added the message to the queue.
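To make the sequence concrete, here is a minimal simulation of that failure mode. It is only a sketch: the function name, the timeout, and the delay values are hypothetical, and it does not model the real relay or the qmail/ezmlm setup on sourceware.org.

```python
def deliver(message_id, ack_delay, sender_timeout, max_attempts=3):
    """Simulate a sender that requeues a message when the reply is late.

    ack_delay is a hypothetical callable returning, for a given attempt
    number, how many seconds the receiver takes to reply.
    """
    list_copies = []   # copies that reach list subscribers
    archive = {}       # archive keyed by Message-ID, so repeats collapse
    for attempt in range(1, max_attempts + 1):
        # The receiver gets the whole message on every attempt.
        list_copies.append(message_id)
        archive[message_id] = "body"
        if ack_delay(attempt) <= sender_timeout:
            break      # reply arrived in time; message leaves the queue
        # Otherwise the sender gave up waiting and requeues the message.
    return list_copies, archive


# First reply is slow (receiver under load), second one is fast.
copies, archive = deliver(
    "<example@invalid>",
    ack_delay=lambda attempt: 600 if attempt == 1 else 5,
    sender_timeout=300,
)
print(len(copies), len(archive))
```

With these made-up numbers the list receives two copies while the archive keeps a single entry, which matches what Joseph described.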
We aren't completely sure what caused the delay in delivering this
message, or why IBM didn't receive the reply, but we did find out that
sourceware.org had a spike in CPU load at exactly the time this message
was being transmitted.
Frank kindly showed me that this server is usually lightly loaded and that
this load spike was very high (the 1-minute load average went from 2 to 176)
and lasted ~5 minutes.
I tried to reproduce this issue myself using my own alias and sending bigger
emails, but it didn't happen again.
Frank suspects that the load spike may be related to ezmlm / qmail and will
try to tweak priorities a little.
Frank, feel free to correct me if I'm saying something wrong. ;-)
Meanwhile, I'll continue testing it from time to time.
Even if this was the right cause, there is still a chance this issue
may happen again in the future. If you notice it, please let me
know.
--
Tulio Magno