Crash in g_file_monitor on 32-bit Cygwin

Ken Brown kbrown@cornell.edu
Tue Oct 14 18:30:00 GMT 2014


On 10/14/2014 12:26 PM, Ken Brown wrote:
> On 6/28/2014 7:08 AM, Ken Brown wrote:
>> On 6/27/2014 1:52 PM, Yaakov Selkowitz wrote:
>>> On 2014-06-27 12:11, Ken Brown wrote:
>>>> On 6/25/2014 10:17 PM, Ken Brown wrote:
>>>>> This is a followup to
>>>>> https://cygwin.com/ml/cygwin/2014-06/msg00324.html, from which I
>>>>> extracted the following test case:
>>>>>
>>>>> $ cat gfile-test.c
>>>>> #include <stdio.h>
>>>>> #include <gio/gio.h>
>>>>>
>>>>> void
>>>>> gfile_add_watch (const char *file)
>>>>> {
>>>>>    GFile *gfile = g_file_new_for_path (file);
>>>>>    GFileMonitor *monitor;
>>>>>    GFileMonitorFlags gflags = G_FILE_MONITOR_NONE;
>>>>>    monitor = g_file_monitor (gfile, gflags, NULL, NULL);
>>>>>    if (! monitor)
>>>>>      printf ("Can't watch file %s\n", file);
>>>>>    else
>>>>>      printf ("Watching file %s\n", file);
>>>>> }
>>>>>
>>>>> int
>>>>> main ()
>>>>> {
>>>>>    const char *file = "gfile-test.c";
>>>>>    gfile_add_watch (file);
>>>>> }
>>>>>
>>>>> $ gcc -g -O0 -o gfile-test $(pkg-config --cflags gio-2.0) gfile-test.c
>>>>> $(pkg-config --libs gio-2.0)
>>>>>
>>>>> In the 64-bit case, this behaves as expected:
>>>>>
>>>>> $ ./gfile-test.exe
>>>>> Watching file gfile-test.c
>>>>>
>>>>> In the 32-bit case, however, it crashes.  Running it under gdb shows
>>>>> that the call to g_file_monitor leads to a SEGV, but I can't tell
>>>>> exactly where; when I try to single step through the Glib code, I
>>>>> eventually hit an assertion violation in gdb.  strace shows lots of
>>>>> exceptions, but I can't make much sense out of it otherwise.
>>>>
>>>> I rebuilt glib and gamin without optimization so that I could step
>>>> through the code in gdb.  But stepping through the code turned out to be
>>>> unnecessary, because the bug was gone after the rebuilds.  I don't know
>>>> if optimization was really the issue or whether just rebuilding with the
>>>> latest tools is what fixed it.
>>>>
>>>> My builds can be obtained from
>>>>
>>>>    http://sanibeltranquility.com/cygwin/
>>>>
>>>> if anyone else wants to try to reproduce this without rebuilding the
>>>> packages themselves.
>>>>
>>>> Yaakov, could you take a look?
>>>
>>> Sure.  Can you narrow this down to only one of glib or gamin?
>>
>> The culprit is gamin, and optimization *is* relevant.  What's strange, though,
>> is that when I rebuild it with optimization, my test case hangs instead of
>> crashing.  Summary:
>>
>> - With gamin-0.1.10-14 (and its subpackages), my test case crashes.  The outward
>> symptom is that there's no output, but running the test case under gdb shows the
>> SEGV.
>>
>> - If I rebuild gamin without optimization, I don't see any bug.  More precisely,
>> I build it using your gamin.cygport with the following line added:
>>
>>    CFLAGS+=" -O0 -g3"
>>
>> - If I rebuild gamin with optimization (i.e., just using your gamin.cygport with
>> no changes), my test case hangs.
>
> I made another attempt to debug this, and I found the problem, but I don't know
> how to fix it.  First, I have to correct the last assertion I made above about
> my test case hanging; I just didn't wait long enough for it to finish.  What
> happens is that there is a retry loop in
> libgamin/gam_api.c:gamin_connect_unix_socket that gives up after 25 seconds. And
> the reason it fails is that /usr/libexec/gam_server.exe has crashed.  In fact,
> the latter always crashes on 32-bit Cygwin if it's built with optimization and
> if the directory /tmp/fam-<username> exists before it is run.  [And this
> directory will always exist after one run of gam_server.exe.]
>
> The crash occurs in a call to g_free at server/gam_channel.c:525 because the
> pointer 'dir' that is being freed has been clobbered by a call to
> gam_check_not_fat on line 497.  Here are some details, based on a build using
> Yaakov's gamin.cygport file with the added line
>
>    CFLAGS+=" -O1 -g3"
>
> I've appended at the end of this message a transcript of a gdb session that
> illustrates some of the assertions I'll be making.
>
> At line 447 of server/gam_channel.c, g_strconcat is called to get a pointer to
> the directory name "/tmp/fam-<username>".  The value of this pointer is assigned
> to the variable 'dir' at line 473, and in my run it is 0x8005c068.  Although
> 'dir' is optimized out, I can see from a disassembly that the pointer is stored
> on the stack at -0x510(%ebp):
>
>     0x004058fc <+266>:    call   0x408bf8 <g_strconcat>
>     0x00405901 <+271>:    mov    %eax,-0x510(%ebp)
>
> And I verified in my gdb session that this stack location does indeed contain
> 0x8005c068.  After the call to gam_check_not_fat a little later, that stack
> location contains the value 0x00000104.  Then when g_free attempts to free the
> bogus pointer 0x00000104, we get a crash.
>
> I can't tell from the disassembly why the call to gam_check_not_fat clobbers the
> stack.  My best guess is that it happens as a result of calls to some Windows
> functions.  I hope someone more knowledgeable can take this further and fix it.

I stepped into gam_check_not_fat (which I should have done to begin with) and 
narrowed this down further.  The stack location in question gets clobbered by 
the call to GetVolumeInformation:

(gdb) s
gam_check_not_fat (path=0x8005c068 "/tmp/fam-kbrown")
     at /usr/src/debug/gamin-0.1.10-16/server/gam_channel.c:35
35        cygwin_conv_path(CCP_POSIX_TO_WIN_A, path, winpath, MAX_PATH);
(gdb) x/x $ebp-0x510
0x28a6a8:       0x8005c068
(gdb) n
37        pGVPN = GetProcAddress(LoadLibrary("kernel32"), "GetVolumePathNameA");
(gdb) x/x $ebp-0x510
0x28a6a8:       0x8005c068
(gdb) n
38        if (!pGVPN || !(pGVPN)(winpath, root, MAX_PATH))
(gdb) x/x $ebp-0x510
0x28a6a8:       0x8005c068
(gdb) n
52        if (!GetVolumeInformation (root, volname, MAX_PATH, NULL,
(gdb) x/x $ebp-0x510
0x28a6a8:       0x8005c068
(gdb) n
58        if (!strncmp(fsname, "FAT", 3))       /* FAT, FAT32 */
(gdb) x/x $ebp-0x510
0x28a6a8:       0x00000104

Here's the code near the call to GetVolumeInformation, followed by what I think 
is the relevant disassembly:

   if (!GetVolumeInformation (root, volname, MAX_PATH, NULL,
                              NULL, NULL, fsname, MAX_PATH))
     {
       fprintf (stderr, "GetVolumeInformation: %d\n", GetLastError ());
       return 0;
     }

    0x00405b3a <+840>:	movl   $0x104,0x1c(%esp) <<<<<<<<<<<<<<<<
    0x00405b42 <+848>:	lea    -0x120(%ebp),%eax
    0x00405b48 <+854>:	mov    %eax,0x18(%esp)
    0x00405b4c <+858>:	movl   $0x0,0x14(%esp)
    0x00405b54 <+866>:	movl   $0x0,0x10(%esp)
    0x00405b5c <+874>:	movl   $0x0,0xc(%esp)
    0x00405b64 <+882>:	movl   $0x104,0x8(%esp)  <<<<<<<<<<<<<<<<
    0x00405b6c <+890>:	lea    -0x224(%ebp),%eax
    0x00405b72 <+896>:	mov    %eax,0x4(%esp)
    0x00405b76 <+900>:	lea    -0x328(%ebp),%eax
    0x00405b7c <+906>:	mov    %eax,(%esp)
    0x00405b7f <+909>:	call   *0x41248c    <----- GetVolumeInformation?
    0x00405b85 <+915>:	sub    $0x20,%esp
    0x00405b88 <+918>:	test   %eax,%eax
    0x00405b8a <+920>:	jne    0x405bb5 <gam_server_create+963>
    0x00405b8c <+922>:	call   *0x412480    <----- GetLastError?
    0x00405b92 <+928>:	mov    %eax,%esi
    0x00405b94 <+930>:	call   0x408df0 <__getreent>
    0x00405b99 <+935>:	mov    %esi,0x8(%esp)
    0x00405b9d <+939>:	movl   $0x40c70f,0x4(%esp)
    0x00405ba5 <+947>:	mov    0xc(%eax),%eax
    0x00405ba8 <+950>:	mov    %eax,(%esp)
    0x00405bab <+953>:	call   0x408df8 <fprintf>
    0x00405bb0 <+958>:	jmp    0x406073 <gam_server_create+2177>

Note the two marked movl instructions, which store 0x104 (260, i.e. MAX_PATH)
as the two buffer-size arguments; I guess one of these stores is what clobbers
the stack slot, but I don't really know why it would land there.

Ken



