Fwd: Re: headache on build repeatability: octave vs BLODA ?

Wed Jan 29 15:39:00 GMT 2020

[Ooops, sent this to Takashi instead of the list, originally.]

On 29.01.2020 at 14:46, Takashi Yano wrote:
> On Wed, 29 Jan 2020 13:19:11 +0100
> Marco Atzeri wrote:

>> As Octave uses gnulib, it is possible that the changes in MS are causing
>> a different subset of gnulib to be used than before, may be exposing
>> a latent bug or race.
>>
>> Unfortunately my old build tree was polluted by mistake, so I can
>> not directly compare a good build tree versus a failing one.
> 
> I found suspicious difference between the working build and the
> not-working build.
> 
> The not-working build has fflush.o, fseek.o and fseeko.o in
> build/libgnu/.libs directory, while the working build does not.
> 
> Also, cygoctave-7.dll of not-working build exports rpl_fflush,
> rpl_fseek and rpl_fseeko, while that of the working build does
> not.

That's very interesting, since one of those, rpl_fseeko, is indeed on
the code path to the crash:

=============================
#0  0x0000000000000000 in ?? ()
#1  0x000000018019b9c7 in __sflush_r (ptr=ptr@entry=0xffffd680, fp=fp@entry=0x800080ae8) at /usr/src/debug/cygwin-3.1.2-1/newlib/libc/stdio/fflush.c:179
#2  0x000000018019baeb in _fflush_r (ptr=ptr@entry=0xffffd680, fp=fp@entry=0x800080ae8) at /usr/src/debug/cygwin-3.1.2-1/newlib/libc/stdio/fflush.c:278
#3  0x000000018019fd67 in _fseeko_r (ptr=0xffffd680, fp=0x800080ae8, offset=4, whence=0) at /usr/src/debug/cygwin-3.1.2-1/newlib/libc/stdio/fseeko.c:314
#4  0x00000001801346bb in _sigfe () at sigfe.s:35
#5  0x000000042cdc77d9 in c_file_ptr_buf::seekoff (this=0x800223dc0, offset=<optimized out>, dir=<optimized out>) at /usr/src/debug/octave-5.1.0-1/libinterp/corefcn/c-file-ptr-stream.cc:118
#6  0x00000003d7fd72b3 in cygstdc++-6!_ZNSi5tellgEv () from /usr/bin/cygstdc++-6.dll
#7  0x000000042d0881da in octave::textscan::scan (this=this@entry=0xffffb470, isp=..., fmt=..., ntimes=ntimes@entry=2, options=..., count=@0xffffb6f8: 0)
=============================

(Yes, neither fseeko nor rpl_fseeko can be seen here, but they were
passed through as part of executing #5: seekoff.)
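
As a side note on how a plain fseeko call ends up in rpl_fseeko: gnulib
renames the call sites with a macro, and the replacement then calls the
real function. A minimal sketch of that pattern follows. sketch_seek,
rpl_sketch_seek and caller are made-up names for illustration, not
gnulib or newlib identifiers. The wrapper's last statement is a tail
call, which an optimizing compiler may turn into a plain jump; that
would be one possible reason the wrapper frame is missing from the
backtrace, but it is a guess and not something verified here.

```c
/* Call counters, only here so the redirection can be observed. */
static int real_calls = 0;
static int wrapper_calls = 0;

/* Stand-in for the libc function (hypothetical name). */
static int sketch_seek(long offset)
{
    real_calls++;
    return offset < 0 ? -1 : 0;
}

/* Stand-in for the replacement: count the call, then defer to the
   real function defined above. */
static int rpl_sketch_seek(long offset)
{
    wrapper_calls++;
    return sketch_seek(offset);   /* tail call into the real function */
}

/* From here on, every call site that says sketch_seek gets the wrapper. */
#define sketch_seek rpl_sketch_seek

static int caller(long offset)
{
    return sketch_seek(offset);   /* expands to rpl_sketch_seek(offset) */
}
```

Calling caller() once bumps both counters, i.e. the call site never
reaches the real function except through the replacement.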

Here's me stepping into that seekoff() call, in a later gdb session:

=============================
Thread 1 "doctave-cli" hit Breakpoint 1, c_file_ptr_buf::seekoff (this=0x800220eb0, offset=0, dir=std::_S_cur) at /usr/src/debug/octave-5.1.0-1/libinterp/corefcn/c-file-ptr-stream.cc:115
115     {
(gdb) s
116       if (f)
(gdb)
118           octave_fseeko_wrapper (f, offset, seekdir_to_whence (dir));
(gdb)
octave_fseeko_wrapper (fp=0x800080ae8, offset=0, whence=1) at /usr/src/debug/octave-5.1.0-1/liboctave/wrappers/filepos-wrappers.c:40
40      }
(gdb)
rpl_fseeko (fp=0x800080ae8, offset=0, whence=1) at /usr/src/debug/octave-5.1.0-1/libgnu/fseeko.c:42
42      {
(gdb)
58        if ((fp->_flags & __SL64) == 0)
(gdb)
42      {
(gdb)
58        if ((fp->_flags & __SL64) == 0)
(gdb)
70        if (fp_->_p == fp_->_bf._base
(gdb)
163       return fseeko (fp, offset, whence);
(gdb)
164     }
(gdb)
163       return fseeko (fp, offset, whence);
(gdb)
/wip/cygport-git/gdb/gdb-8.2.1-1.x86_64/src/gdb-8.2.1/gdb/infrun.c:2723: internal-error: void resume_1(gdb_signal): Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.
=============================

Hm, so stepping's no good.  But the actual trigger is clear enough in 
the segfault backtrace:

=============================
(gdb) frame 1
#1  0x000000018019b9c7 in __sflush_r (ptr=ptr@entry=0xffffd680, fp=fp@entry=0x800080ae8) at /usr/src/debug/cygwin-3.1.2-1/newlib/libc/stdio/fflush.c:179
179                 curoff = fp->_seek64 (ptr, fp->_cookie, curoff, SEEK_SET);
(gdb) l
174                     curoff -= fp->_ur;
175                 }
176               /* Now physically seek to after byte last read.  */
177     #ifdef __LARGE64_FILES
178               if (fp->_flags & __SL64)
179                 curoff = fp->_seek64 (ptr, fp->_cookie, curoff, SEEK_SET);
180               else
181     #endif
182                 curoff = fp->_seek (ptr, fp->_cookie, curoff, SEEK_SET);
183               if (curoff != -1 || ptr->_errno == 0
(gdb) p *fp
$3 = {_p = 0x8004b1883 "3\n4\n5\n66", _r = 7, _w = 0, _flags = -17260, _file = 3, _bf = {_base = 0x8004b1880 "\n2\n3\n4\n5\n66", _size = 65536}, _lbfsize = 0, _data = 0x0, _cookie = 0x800080ae8,
   _read = 0x1801accd0 <__sread>, _write = 0x1801acd80 <__swrite>, _seek = 0x1801ace40 <__sseek>, _close = 0x1801ace80 <__sclose>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, _ubuf = "\000\000",
   _nbuf = "", _lb = {_base = 0x0, _size = 0}, _blksize = 65536, _flags2 = 0, _offset = 11, _seek64 = 0x0, _lock = 0x800220e10, _mbstate = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}}
=============================

Note that fp->_seek64 is actually null, so line 179 calls through a
null pointer, which matches frame #0 sitting at address 0.  Is this
possibly caused by gnulib interfering with the call sequence?
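
To double-check the dump: _flags is printed as a signed short, so the
bits are easier to read after converting to unsigned. The sketch below
does that and tests for the large-file bit that selects the _seek64
branch. The value 0x8000 for __SL64 is an assumption based on my reading
of newlib's stdio.h, so check your own headers. SKETCH_SL64, struct
sketch_file and the function names are hypothetical and only mirror the
two fields that matter here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value of newlib's __SL64 flag on Cygwin; verify against the
   installed stdio.h before relying on it. */
#define SKETCH_SL64 0x8000u

/* Hypothetical stand-in for the two FILE fields involved in the crash. */
struct sketch_file {
    short flags;
    long long (*seek64)(void *cookie, long long offset, int whence);
};

/* Reinterpret the signed 16-bit flags word as unsigned bits. */
static unsigned flags_bits(short flags)
{
    return (unsigned)(uint16_t)flags;
}

/* Nonzero if the large-file bit is set in flags. */
static int has_sl64(short flags)
{
    return (flags_bits(flags) & SKETCH_SL64) != 0;
}

/* Returns 1 when a dispatch like the one at fflush.c:179 would go
   through a null seek64 pointer, 0 otherwise. */
static int would_call_null_seek64(const struct sketch_file *fp)
{
    return has_sl64(fp->flags) && fp->seek64 == NULL;
}
```

With the values from the dump (_flags = -17260, _seek64 = 0x0),
flags_bits gives 0xBC94, so under the assumption above the 0x8000 bit is
set while the seek64 slot is empty, and would_call_null_seek64 reports
1. That is exactly the combination that sends line 179 through a null
pointer.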
