This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug libc/322] New: Calling dcgettext(3) simultaneously sometimes causes segmentation fault.


Overview Description:
Calling dcgettext(3) simultaneously in multi-thread application
sometimes causes segmentation fault. However, it doesn't occur in
C or POSIX locale.

Steps to Reproduce:

1) Compile the following test.c with -lpthread and
   generate a.out. (gcc test.c -lpthread)

2) Run test.sh
   Segmentation fault occurs in several minutes.

---------- test.c
#include <string.h>
#include <pthread.h>
#include <locale.h>
static void *xxx(void *arg)
{
  dcgettext("libc", "No such process", LC_MESSAGES);
}
main()
{
  pthread_t tid1, tid2;
  setlocale(LC_ALL, "");
  pthread_create(&tid1, NULL, xxx, (void *)2);
  pthread_create(&tid2, NULL, xxx, (void *)3);
  pthread_join(tid1, NULL);
  pthread_join(tid2, NULL);
}
---------

--------- test.sh
#!/bin/bash

ulimit -c unlimited
# Problem doesn't occur in C locale
export LANG=ja_JP.eucJP

while :
do
	./a.out
	if [ $? -ne 0 ]
	then
		echo "segmentation fault occured"
		exit
	fi
done
---------

Actual Results:
The test program crashed. 

Expected Results:
The test program never cause segmentation fault.

Build Data & Platform:
glibc 2.3.2 on Linux 2.4.21

Additional Information: 

dcgettext(3) and related subroutines _nl_find_domain() and _nl_load_domain()
have the following three race conditions.

1) dcgettext(3) uses tfind() and tsearch() to manage translated cache, but
   it is not locked. Although dcgettext(3) hold _nl_state_lock reader lock,
   it can be run concurrently. When one dcgettext() is looking up the 
   translated cache with tfind(), it is possible that another dcgettext()
   register new cache entry with tsearch(). This condition leads SIGSEGV.

   Attached patch1 fix the problem by reader-writer lock, and calling tfind()
   holds a reader lock, calling tsearch() holds a writer lock.

2) _nl_make_l10nflist() may return partially initialized entry that is
   initializing by another _nl_make_l10nflist(). If caller uses the partially
   initialized entry, it leads SIGSEGV.

   _nl_make_l10nflist() has two mode, do_allocate or not, in other words,
   create mode or look up mode. When one _nl_find_domain() calls 
   _nl_make_l10nflist() on create mode, it is possible that another
   _nl_find_domain() calls _nl_make_l10nflist() on look up mode. 
   However in look up mode, it only check whether the entry's filename member
   is matched or not. 

   On the other hand, in create mode it assigns filename member for newly
   allocated entry before whole members are initialized by the recursive call 
   of _nl_make_l10nflist(). 

   Therefore _nl_make_l10nflist() on look up mode may return partially
   initialized entries, and if caller access one of these entry, 
   it leads SIGSEGV.

   Attached patch2 fix the problem by reader-writer lock, and calling 
   _nl_make_l10nflist() on look up mode holds a reader lock, calling one
   on create mode holds a writer lock.

3) A caller of _nl_load_domain() assumes that if domain's decided member,
   is non zero and data member is not NULL, that of domain can be used.
   
   However, this assumption is not correct, because _nl_load_domain() assigned
   both of the members before domain's whole data structure has been
   initialized yet.

   Therefore it is possible that a caller of _nl_load_domain assumes
   the domain is usable, although domain has not been fully initialized yet.
   This condition leads SIGSEGV by the caller.

   Attached patch3 fix the problem by serializing _nl_load_domain() and 
   embedding the condition clause for deciding to call _nl_load_domain().

All patches are made against glibc-2.3.3, and test program survives at least 
a few hours running test. 

---- patch1 BEGIN
diff -Naur glibc-2.3.3/intl/dcigettext.c glibc-2.3.3-
gettext_fix/intl/dcigettext.c
--- glibc-2.3.3/intl/dcigettext.c	2003-06-12 06:45:34.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/dcigettext.c	2004-08-11 10:03:33.000000000 
+0900
@@ -88,6 +88,7 @@
 # define __libc_lock_unlock(NAME)
 # define __libc_rwlock_define_initialized(CLASS, NAME)
 # define __libc_rwlock_rdlock(NAME)
+# define __libc_rwlock_wrlock(NAME)
 # define __libc_rwlock_unlock(NAME)
 #endif
 
@@ -406,6 +407,7 @@
   size_t msgid_len;
 #endif
   size_t domainname_len;
+  __libc_rwlock_define_initialized (static, _nl_cache_lock);
 
   /* If no real MSGID is given return NULL.  */
   if (msgid1 == NULL)
@@ -439,7 +441,11 @@
   search->domainname = (char *) domainname;
   search->category = category;
 
+  /* Later tsearch may alter the search tree, so read lock
+     must be held to find the translation */
+  __libc_rwlock_rdlock (_nl_cache_lock);
   foundp = (struct known_translation_t **) tfind (search, &root, transcmp);
+  __libc_rwlock_unlock (_nl_cache_lock);
   freea (search);
   if (foundp != NULL && (*foundp)->counter == _nl_msg_cat_cntr)
     {
@@ -633,9 +639,13 @@
 		      newp->translation = retval;
 		      newp->translation_length = retlen;
 
+		      /* tsearch may alter the search tree, so write lock
+			 must be held */
+		      __libc_rwlock_wrlock (_nl_cache_lock);
 		      /* Insert the entry in the search tree.  */
 		      foundp = (struct known_translation_t **)
 			tsearch (newp, &root, transcmp);
+		      __libc_rwlock_unlock (_nl_cache_lock);
 		      if (foundp == NULL
 			  || __builtin_expect (*foundp != newp, 0))
 			/* The insert failed.  */
----- patch1 END

----- patch2 BEGIN
diff -Naur glibc-2.3.3/intl/finddomain.c glibc-2.3.3-
gettext_fix/intl/finddomain.c
--- glibc-2.3.3/intl/finddomain.c	2002-11-02 05:43:55.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/finddomain.c	2004-08-11 10:33:14.000000000 
+0900
@@ -38,6 +38,17 @@
 # include "libgnuintl.h"
 #endif
 
+/* Thread safetyness.  */
+#ifdef _LIBC
+# include <bits/libc-lock.h>
+#else
+/* Provide dummy implementation if this is outside glibc.  */
+# define __libc_rwlock_define_initialized(CLASS, NAME)
+# define __libc_rwlock_rdlock(NAME)
+# define __libc_rwlock_wrlock(NAME)
+# define __libc_rwlock_unlock(NAME)
+#endif
+
 /* @@ end of prolog @@ */
 /* List of already loaded domains.  */
 static struct loaded_l10nfile *_nl_loaded_domains;
@@ -62,6 +73,7 @@
   const char *normalized_codeset;
   const char *alias_value;
   int mask;
+  __libc_rwlock_define_initialized (static, _nl_loaded_domains_lock);
 
   /* LOCALE can consist of up to four recognized parts for the XPG syntax:
 
@@ -77,11 +89,15 @@
 		(4) modifier
    */
 
+  /* Later _nl_make_l10nflist may alter the _nl_loaded_domains list,
+     so read lock must be held */
+  __libc_rwlock_rdlock (_nl_loaded_domains_lock);
   /* If we have already tested for this locale entry there has to
      be one data set in the list of loaded domains.  */
   retval = _nl_make_l10nflist (&_nl_loaded_domains, dirname,
 			       strlen (dirname) + 1, 0, locale, NULL, NULL,
 			       NULL, NULL, domainname, 0);
+  __libc_rwlock_unlock (_nl_loaded_domains_lock);
   if (retval != NULL)
     {
       /* We know something about this locale.  */
@@ -131,12 +147,16 @@
   mask = _nl_explode_name (locale, &language, &modifier, &territory,
 			   &codeset, &normalized_codeset);
 
+  /* _nl_make_l10nflist may alter the _nl_loaded_domains list,
+     so write lock must be held */
+  __libc_rwlock_wrlock (_nl_loaded_domains_lock);
   /* Create all possible locale entries which might be interested in
      generalization.  */
   retval = _nl_make_l10nflist (&_nl_loaded_domains, dirname,
 			       strlen (dirname) + 1, mask, language, 
territory,
 			       codeset, normalized_codeset, modifier,
 			       domainname, 1);
+  __libc_rwlock_unlock (_nl_loaded_domains_lock);
   if (retval == NULL)
     /* This means we are out of core.  */
     return NULL;
---- patch2 END

---- patch3 BEGIN
diff -Naur glibc-2.3.3/intl/dcigettext.c glibc-2.3.3-
gettext_fix/intl/dcigettext.c
--- glibc-2.3.3/intl/dcigettext.c	2004-08-11 10:14:13.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/dcigettext.c	2004-08-11 10:45:24.000000000 
+0900
@@ -690,8 +690,7 @@
   char *result;
   size_t resultlen;
 
-  if (domain_file->decided == 0)
-    _nl_load_domain (domain_file, domainbinding);
+  _nl_load_domain (domain_file, domainbinding);
 
   if (domain_file->data == NULL)
     return NULL;
diff -Naur glibc-2.3.3/intl/finddomain.c glibc-2.3.3-
gettext_fix/intl/finddomain.c
--- glibc-2.3.3/intl/finddomain.c	2004-08-11 10:36:03.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/finddomain.c	2004-08-11 10:46:02.000000000 
+0900
@@ -103,16 +103,14 @@
       /* We know something about this locale.  */
       int cnt;
 
-      if (retval->decided == 0)
-	_nl_load_domain (retval, domainbinding);
+      _nl_load_domain (retval, domainbinding);
 
       if (retval->data != NULL)
 	return retval;
 
       for (cnt = 0; retval->successor[cnt] != NULL; ++cnt)
 	{
-	  if (retval->successor[cnt]->decided == 0)
-	    _nl_load_domain (retval->successor[cnt], domainbinding);
+	  _nl_load_domain (retval->successor[cnt], domainbinding);
 
 	  if (retval->successor[cnt]->data != NULL)
 	    break;
@@ -161,15 +159,13 @@
     /* This means we are out of core.  */
     return NULL;
 
-  if (retval->decided == 0)
-    _nl_load_domain (retval, domainbinding);
+  _nl_load_domain (retval, domainbinding);
   if (retval->data == NULL)
     {
       int cnt;
       for (cnt = 0; retval->successor[cnt] != NULL; ++cnt)
 	{
-	  if (retval->successor[cnt]->decided == 0)
-	    _nl_load_domain (retval->successor[cnt], domainbinding);
+	  _nl_load_domain (retval->successor[cnt], domainbinding);
 	  if (retval->successor[cnt]->data != NULL)
 	    break;
 	}
diff -Naur glibc-2.3.3/intl/loadmsgcat.c glibc-2.3.3-
gettext_fix/intl/loadmsgcat.c
--- glibc-2.3.3/intl/loadmsgcat.c	2003-09-04 02:44:46.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/loadmsgcat.c	2004-08-12 14:24:41.000000000 
+0900
@@ -471,6 +471,16 @@
 # define freea(p) free (p)
 #endif
 
+/* Thread safetyness.  */
+#ifdef _LIBC
+# include <bits/libc-lock.h>
+#else
+/* Provide dummy implementation if this is outside glibc.  */
+# define __libc_lock_define_recursive(CLASS, NAME)
+# define __libc_lock_lock_recursive(NAME)
+# define __libc_lock_unlock_recursive(NAME)
+#endif
+
 
 /* Prototypes for local functions.  Needed to ensure compiler checking of
    function argument counts despite of K&R C function definition syntax.  */
@@ -899,6 +909,16 @@
   struct loaded_domain *domain;
   int revision;
   const char *nullentry;
+  __libc_lock_define_recursive (static, lock);
+
+  /* Recursive lock must be used because _nl_load_domain is called
+     recursively as following.
+      _nl_load_domain -> _nl_init_domain_conv -> _nl_find_msg ->
+      _nl_load_domain */
+  __libc_lock_lock_recursive (lock);
+
+  if (domain_file->decided == 1)
+    goto end;
 
   domain_file->decided = 1;
   domain_file->data = NULL;
@@ -912,12 +932,12 @@
      specification the locale file name is different for XPG and CEN
      syntax.  */
   if (domain_file->filename == NULL)
-    return;
+    goto end;
 
   /* Try to open the addressed file.  */
   fd = open (domain_file->filename, O_RDONLY);
   if (fd == -1)
-    return;
+    goto end;
 
   /* We must know about the size of the file.  */
   if (
@@ -931,7 +951,7 @@
     {
       /* Something went wrong.  */
       close (fd);
-      return;
+      goto end;
     }
 
 #ifdef HAVE_MMAP
@@ -957,7 +977,7 @@
 
       data = (struct mo_file_header *) malloc (size);
       if (data == NULL)
-	return;
+	goto end;
 
       to_read = size;
       read_ptr = (char *) data;
@@ -971,7 +991,7 @@
 		continue;
 #endif
 	      close (fd);
-	      return;
+	      goto end;
 	    }
 	  read_ptr += nb;
 	  to_read -= nb;
@@ -993,12 +1013,12 @@
       else
 #endif
 	free (data);
-      return;
+      goto end;
     }
 
   domain = (struct loaded_domain *) malloc (sizeof (struct loaded_domain));
   if (domain == NULL)
-    return;
+    goto end;
   domain_file->data = domain;
 
   domain->data = (char *) data;
@@ -1264,7 +1284,7 @@
 	free (data);
       free (domain);
       domain_file->data = NULL;
-      return;
+      goto end;
     }
 
   /* Now initialize the character set converter from the character set
@@ -1274,6 +1294,10 @@
 
   /* Also look for a plural specification.  */
   EXTRACT_PLURAL_EXPRESSION (nullentry, &domain->plural, &domain->nplurals);
+
+end:
+  __libc_lock_unlock_recursive (lock);
+  return;
 }
 
 
---- patch3 END

-- 
           Summary: Calling dcgettext(3) simultaneously sometimes causes
                    segmentation fault.
           Product: glibc
           Version: 2.3.3
            Status: NEW
          Severity: critical
          Priority: P2
         Component: libc
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: maeda dot naoaki at jp dot fujitsu dot com
                CC: glibc-bugs at sources dot redhat dot com,maeda dot
                    naoaki at jp dot fujitsu dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=322

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]