This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Backend doesn't catch the next command, after SIGUSR2
- From: Larry Hall <cygwin-lh at cygwin dot com>
- To: Patrick Samson <p_samson at yahoo dot com>, cygwin at cygwin dot com
- Date: Tue, 09 Mar 2004 12:15:31 -0500
- Subject: Re: Backend doesn't catch the next command, after SIGUSR2
- References: <20040309152842.40194.qmail@web60303.mail.yahoo.com>
- Reply-to: Cygwin List <cygwin at cygwin dot com>
From the information provided, I can't tell whether the problem occurs with
both Cygwin 1.5.5 and 1.5.7 or only with 1.5.7. If it's only the latter, try
the most recent snapshot and see whether that helps.
Larry
At 10:28 AM 3/9/2004, you wrote:
>If I run a test script enough times, it eventually
>freezes in this deadlock situation:
>
>The client sends a command to a backend and waits
>for an answer. It will wait forever because the
>backend is not aware that the request has arrived
>and keeps waiting for the next command.
>
>What happens in the loop is:
> SIInsertDataEntry: table is 70% full,
> signaling postmaster
>
> In reaction, the postmaster sends to its children:
> SignalChildren: sending signal 31 to process <pid>
>
>Most of the time, it works. But at an unpredictable
>iteration, it freezes.
>
>This problem appeared first in a replication
>machinery, so I reduced the number of components
>involved, to get a simpler test case:
>A pgtcl script, running a loop with:
> create table from another-table
> copy table to file
> drop table
>
>The 'create table' regularly fires the '70% full'
>event, and at some point, the 'copy' never gets
>answered.
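
The loop described above can be sketched as follows. This is only a minimal
illustration, not the attached test.tcl: the exact SQL is an assumption
(the post says only "create table from another-table", so CREATE TABLE ... AS
SELECT is a guess), and the names iteration_statements, run_loop, worktable,
outfile and execute are hypothetical. The execute callback stands in for
whatever sends one statement to the backend, such as pg_exec in pgtcl.

```python
def iteration_statements(srctable, worktable, outfile):
    """Return the three SQL statements that one loop iteration would send.

    The exact SQL is an assumption; the original post gives only an outline.
    """
    return [
        # Assumed form of "create table from another-table".
        f"CREATE TABLE {worktable} AS SELECT * FROM {srctable}",
        # "copy table to file": this is the step that reportedly never
        # gets answered when the freeze occurs.
        f"COPY {worktable} TO '{outfile}'",
        # "drop table"
        f"DROP TABLE {worktable}",
    ]


def run_loop(execute, srctable, runs,
             worktable="tmp_copy", outfile="/tmp/tmp_copy.out"):
    """Send the three statements `runs` times through `execute`.

    `execute` is a hypothetical callback taking one SQL string.
    Returns the number of completed iterations.
    """
    completed = 0
    for _ in range(runs):
        for sql in iteration_statements(srctable, worktable, outfile):
            execute(sql)
        completed += 1
    return completed
```

Passing print as the execute callback, for example
run_loop(print, "pgr_qryengine_log", 2), lists the statements without
touching a database.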
>
>I attached these files:
>- test.tcl: the script to run.
> Change these values to meet your context:
>
> set srctable pgr_qryengine_log
> set dbname euronetUsers
>
> The source table can be any empty table.
> In my case, it's:
>CREATE TABLE public.pgr_qryengine_log
>(
> pgr_sid int4 NOT NULL,
> tablename varchar(50),
> pgr_gfid int8 NOT NULL,
> pgr_grid int8 NOT NULL,
> pgr_optype varchar(2),
> pgr_when timestamp,
> pgr_username varchar(30),
> qry_result text
>) WITH OIDS;
>
>- postmaster-ok.log
> The traces of a successful iteration.
>- postmaster-ko.log
> The traces of the forever waiting iteration.
> EOF is received when Ctrl-C is pressed on the client side.
>
>Comparison of the traces shows that the signals
>are processed, but the backend doesn't start a
>StartTransactionCommand for the expected 'copy'.
>
>I don't know the exact conditions under which the
>freeze arises. I just noticed that the chances are
>higher if there are a lot of postgres.exe processes
>alive. I could complete 10000 runs when there were
>no extra backends. So I opened a pgAdmin III session
>to have many connections (to multiple databases,
>with different accounts). With 7 to 10 processes,
>I reached the freeze at 3392, 2027, 6729, 272 and
>1871 runs.
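
As a rough way to read the run counts above, here is a small sketch that
treats every run as freezing independently with the same probability. That
model is an assumption made for illustration only and is not something the
original post claims; the function names are hypothetical.

```python
# Run counts at which the freeze was reached, taken from the report.
runs_until_freeze = [3392, 2027, 6729, 272, 1871]


def per_run_freeze_rate(samples):
    """Estimate the per-run freeze probability as freezes / total runs.

    Assumes independent runs with a constant freeze probability.
    """
    return len(samples) / sum(samples)


def chance_of_clean_runs(rate, n):
    """Probability of n consecutive runs without a freeze at that rate."""
    return (1.0 - rate) ** n


rate = per_run_freeze_rate(runs_until_freeze)
clean = chance_of_clean_runs(rate, 10000)
```

Under that assumption the five freezes over 14291 runs give roughly one
freeze per 2858 runs, and 10000 clean runs at the same rate would happen by
chance only about 3% of the time. So the clean run without extra backends is
suggestive of a real difference, but it is not conclusive on its own.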
>
>I tried to strace the postmaster, but never managed
>to reproduce the problem that way. I guess strace
>slows down the system too much.
>I only have an strace of a correct iteration.
>
>Done on:
>- postgres 7.3.5, W2000 SP2, cygwin 1.5.5-1
>- postgres 7.3.5, NT SP6, cygwin 1.5.7-1
>
>I can't tell whether the source of the problem is in
>cygwin or in postgres, so I am posting to both lists.
>
>It would be helpful if anybody could reproduce the
>problem or offer advice on how to make progress
>with the debugging.
>
>Patrick
>--
>Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
>Problem reports: http://cygwin.com/problems.html
>Documentation: http://cygwin.com/docs.html
>FAQ: http://cygwin.com/faq/