This is the mail archive of the
glibc-bugs@sources.redhat.com
mailing list for the glibc project.
[Bug libc/322] New: Calling dcgettext(3) simultaneously sometimes causes segmentation fault.
- From: "maeda dot naoaki at jp dot fujitsu dot com" <sourceware-bugzilla at sources dot redhat dot com>
- To: glibc-bugs at sources dot redhat dot com
- Date: 12 Aug 2004 09:35:13 -0000
- Subject: [Bug libc/322] New: Calling dcgettext(3) simultaneously sometimes causes segmentation fault.
- Reply-to: sourceware-bugzilla at sources dot redhat dot com
Overview Description:
Calling dcgettext(3) simultaneously in multi-thread application
sometimes causes segmentation fault. However, it doesn't occur in
C or POSIX locale.
Steps to Reproduce:
1) Compile the following test.c with -lpthread and
generate a.out. (gcc test.c -lpthread)
2) Run test.sh
Segmentation fault occurs in several minutes.
---------- test.c
#include <string.h>
#include <pthread.h>
#include <locale.h>
static void *xxx(void *arg)
{
dcgettext("libc", "No such process", LC_MESSAGES);
}
main()
{
pthread_t tid1, tid2;
setlocale(LC_ALL, "");
pthread_create(&tid1, NULL, xxx, (void *)2);
pthread_create(&tid2, NULL, xxx, (void *)3);
pthread_join(tid1, NULL);
pthread_join(tid2, NULL);
}
---------
--------- test.sh
#!/bin/bash
ulimit -c unlimited
# Problem doesn't occur in C locale
export LANG=ja_JP.eucJP
while :
do
./a.out
if [ $? -ne 0 ]
then
echo "segmentation fault occured"
exit
fi
done
---------
Actual Results:
The test program crashed.
Expected Results:
The test program never cause segmentation fault.
Build Data & Platform:
glibc 2.3.2 on Linux 2.4.21
Additional Information:
dcgettext(3) and related subroutines _nl_find_domain() and _nl_load_domain()
have the following three race conditions.
1) dcgettext(3) uses tfind() and tsearch() to manage translated cache, but
it is not locked. Although dcgettext(3) hold _nl_state_lock reader lock,
it can be run concurrently. When one dcgettext() is looking up the
translated cache with tfind(), it is possible that another dcgettext()
register new cache entry with tsearch(). This condition leads SIGSEGV.
Attached patch1 fix the problem by reader-writer lock, and calling tfind()
holds a reader lock, calling tsearch() holds a writer lock.
2) _nl_make_l10nflist() may return partially initialized entry that is
initializing by another _nl_make_l10nflist(). If caller uses the partially
initialized entry, it leads SIGSEGV.
_nl_make_l10nflist() has two mode, do_allocate or not, in other words,
create mode or look up mode. When one _nl_find_domain() calls
_nl_make_l10nflist() on create mode, it is possible that another
_nl_find_domain() calls _nl_make_l10nflist() on look up mode.
However in look up mode, it only check whether the entry's filename member
is matched or not.
On the other hand, in create mode it assigns filename member for newly
allocated entry before whole members are initialized by the recursive call
of _nl_make_l10nflist().
Therefore _nl_make_l10nflist() on look up mode may return partially
initialized entries, and if caller access one of these entry,
it leads SIGSEGV.
Attached patch2 fix the problem by reader-writer lock, and calling
_nl_make_l10nflist() on look up mode holds a reader lock, calling one
on create mode holds a writer lock.
3) A caller of _nl_load_domain() assumes that if domain's decided member,
is non zero and data member is not NULL, that of domain can be used.
However, this assumption is not correct, because _nl_load_domain() assigned
both of the members before domain's whole data structure has been
initialized yet.
Therefore it is possible that a caller of _nl_load_domain assumes
the domain is usable, although domain has not been fully initialized yet.
This condition leads SIGSEGV by the caller.
Attached patch3 fix the problem by serializing _nl_load_domain() and
embedding the condition clause for deciding to call _nl_load_domain().
All patches are made against glibc-2.3.3, and test program survives at least
a few hours running test.
---- patch1 BEGIN
diff -Naur glibc-2.3.3/intl/dcigettext.c glibc-2.3.3-
gettext_fix/intl/dcigettext.c
--- glibc-2.3.3/intl/dcigettext.c 2003-06-12 06:45:34.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/dcigettext.c 2004-08-11 10:03:33.000000000
+0900
@@ -88,6 +88,7 @@
# define __libc_lock_unlock(NAME)
# define __libc_rwlock_define_initialized(CLASS, NAME)
# define __libc_rwlock_rdlock(NAME)
+# define __libc_rwlock_wrlock(NAME)
# define __libc_rwlock_unlock(NAME)
#endif
@@ -406,6 +407,7 @@
size_t msgid_len;
#endif
size_t domainname_len;
+ __libc_rwlock_define_initialized (static, _nl_cache_lock);
/* If no real MSGID is given return NULL. */
if (msgid1 == NULL)
@@ -439,7 +441,11 @@
search->domainname = (char *) domainname;
search->category = category;
+ /* Later tsearch may alter the search tree, so read lock
+ must be held to find the translation */
+ __libc_rwlock_rdlock (_nl_cache_lock);
foundp = (struct known_translation_t **) tfind (search, &root, transcmp);
+ __libc_rwlock_unlock (_nl_cache_lock);
freea (search);
if (foundp != NULL && (*foundp)->counter == _nl_msg_cat_cntr)
{
@@ -633,9 +639,13 @@
newp->translation = retval;
newp->translation_length = retlen;
+ /* tsearch may alter the search tree, so write lock
+ must be held */
+ __libc_rwlock_wrlock (_nl_cache_lock);
/* Insert the entry in the search tree. */
foundp = (struct known_translation_t **)
tsearch (newp, &root, transcmp);
+ __libc_rwlock_unlock (_nl_cache_lock);
if (foundp == NULL
|| __builtin_expect (*foundp != newp, 0))
/* The insert failed. */
----- patch1 END
----- patch2 BEGIN
diff -Naur glibc-2.3.3/intl/finddomain.c glibc-2.3.3-
gettext_fix/intl/finddomain.c
--- glibc-2.3.3/intl/finddomain.c 2002-11-02 05:43:55.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/finddomain.c 2004-08-11 10:33:14.000000000
+0900
@@ -38,6 +38,17 @@
# include "libgnuintl.h"
#endif
+/* Thread safetyness. */
+#ifdef _LIBC
+# include <bits/libc-lock.h>
+#else
+/* Provide dummy implementation if this is outside glibc. */
+# define __libc_rwlock_define_initialized(CLASS, NAME)
+# define __libc_rwlock_rdlock(NAME)
+# define __libc_rwlock_wrlock(NAME)
+# define __libc_rwlock_unlock(NAME)
+#endif
+
/* @@ end of prolog @@ */
/* List of already loaded domains. */
static struct loaded_l10nfile *_nl_loaded_domains;
@@ -62,6 +73,7 @@
const char *normalized_codeset;
const char *alias_value;
int mask;
+ __libc_rwlock_define_initialized (static, _nl_loaded_domains_lock);
/* LOCALE can consist of up to four recognized parts for the XPG syntax:
@@ -77,11 +89,15 @@
(4) modifier
*/
+ /* Later _nl_make_l10nflist may alter the _nl_loaded_domains list,
+ so read lock must be held */
+ __libc_rwlock_rdlock (_nl_loaded_domains_lock);
/* If we have already tested for this locale entry there has to
be one data set in the list of loaded domains. */
retval = _nl_make_l10nflist (&_nl_loaded_domains, dirname,
strlen (dirname) + 1, 0, locale, NULL, NULL,
NULL, NULL, domainname, 0);
+ __libc_rwlock_unlock (_nl_loaded_domains_lock);
if (retval != NULL)
{
/* We know something about this locale. */
@@ -131,12 +147,16 @@
mask = _nl_explode_name (locale, &language, &modifier, &territory,
&codeset, &normalized_codeset);
+ /* _nl_make_l10nflist may alter the _nl_loaded_domains list,
+ so write lock must be held */
+ __libc_rwlock_wrlock (_nl_loaded_domains_lock);
/* Create all possible locale entries which might be interested in
generalization. */
retval = _nl_make_l10nflist (&_nl_loaded_domains, dirname,
strlen (dirname) + 1, mask, language,
territory,
codeset, normalized_codeset, modifier,
domainname, 1);
+ __libc_rwlock_unlock (_nl_loaded_domains_lock);
if (retval == NULL)
/* This means we are out of core. */
return NULL;
---- patch2 END
---- patch3 BEGIN
diff -Naur glibc-2.3.3/intl/dcigettext.c glibc-2.3.3-
gettext_fix/intl/dcigettext.c
--- glibc-2.3.3/intl/dcigettext.c 2004-08-11 10:14:13.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/dcigettext.c 2004-08-11 10:45:24.000000000
+0900
@@ -690,8 +690,7 @@
char *result;
size_t resultlen;
- if (domain_file->decided == 0)
- _nl_load_domain (domain_file, domainbinding);
+ _nl_load_domain (domain_file, domainbinding);
if (domain_file->data == NULL)
return NULL;
diff -Naur glibc-2.3.3/intl/finddomain.c glibc-2.3.3-
gettext_fix/intl/finddomain.c
--- glibc-2.3.3/intl/finddomain.c 2004-08-11 10:36:03.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/finddomain.c 2004-08-11 10:46:02.000000000
+0900
@@ -103,16 +103,14 @@
/* We know something about this locale. */
int cnt;
- if (retval->decided == 0)
- _nl_load_domain (retval, domainbinding);
+ _nl_load_domain (retval, domainbinding);
if (retval->data != NULL)
return retval;
for (cnt = 0; retval->successor[cnt] != NULL; ++cnt)
{
- if (retval->successor[cnt]->decided == 0)
- _nl_load_domain (retval->successor[cnt], domainbinding);
+ _nl_load_domain (retval->successor[cnt], domainbinding);
if (retval->successor[cnt]->data != NULL)
break;
@@ -161,15 +159,13 @@
/* This means we are out of core. */
return NULL;
- if (retval->decided == 0)
- _nl_load_domain (retval, domainbinding);
+ _nl_load_domain (retval, domainbinding);
if (retval->data == NULL)
{
int cnt;
for (cnt = 0; retval->successor[cnt] != NULL; ++cnt)
{
- if (retval->successor[cnt]->decided == 0)
- _nl_load_domain (retval->successor[cnt], domainbinding);
+ _nl_load_domain (retval->successor[cnt], domainbinding);
if (retval->successor[cnt]->data != NULL)
break;
}
diff -Naur glibc-2.3.3/intl/loadmsgcat.c glibc-2.3.3-
gettext_fix/intl/loadmsgcat.c
--- glibc-2.3.3/intl/loadmsgcat.c 2003-09-04 02:44:46.000000000 +0900
+++ glibc-2.3.3-gettext_fix/intl/loadmsgcat.c 2004-08-12 14:24:41.000000000
+0900
@@ -471,6 +471,16 @@
# define freea(p) free (p)
#endif
+/* Thread safetyness. */
+#ifdef _LIBC
+# include <bits/libc-lock.h>
+#else
+/* Provide dummy implementation if this is outside glibc. */
+# define __libc_lock_define_recursive(CLASS, NAME)
+# define __libc_lock_lock_recursive(NAME)
+# define __libc_lock_unlock_recursive(NAME)
+#endif
+
/* Prototypes for local functions. Needed to ensure compiler checking of
function argument counts despite of K&R C function definition syntax. */
@@ -899,6 +909,16 @@
struct loaded_domain *domain;
int revision;
const char *nullentry;
+ __libc_lock_define_recursive (static, lock);
+
+ /* Recursive lock must be used because _nl_load_domain is called
+ recursively as following.
+ _nl_load_domain -> _nl_init_domain_conv -> _nl_find_msg ->
+ _nl_load_domain */
+ __libc_lock_lock_recursive (lock);
+
+ if (domain_file->decided == 1)
+ goto end;
domain_file->decided = 1;
domain_file->data = NULL;
@@ -912,12 +932,12 @@
specification the locale file name is different for XPG and CEN
syntax. */
if (domain_file->filename == NULL)
- return;
+ goto end;
/* Try to open the addressed file. */
fd = open (domain_file->filename, O_RDONLY);
if (fd == -1)
- return;
+ goto end;
/* We must know about the size of the file. */
if (
@@ -931,7 +951,7 @@
{
/* Something went wrong. */
close (fd);
- return;
+ goto end;
}
#ifdef HAVE_MMAP
@@ -957,7 +977,7 @@
data = (struct mo_file_header *) malloc (size);
if (data == NULL)
- return;
+ goto end;
to_read = size;
read_ptr = (char *) data;
@@ -971,7 +991,7 @@
continue;
#endif
close (fd);
- return;
+ goto end;
}
read_ptr += nb;
to_read -= nb;
@@ -993,12 +1013,12 @@
else
#endif
free (data);
- return;
+ goto end;
}
domain = (struct loaded_domain *) malloc (sizeof (struct loaded_domain));
if (domain == NULL)
- return;
+ goto end;
domain_file->data = domain;
domain->data = (char *) data;
@@ -1264,7 +1284,7 @@
free (data);
free (domain);
domain_file->data = NULL;
- return;
+ goto end;
}
/* Now initialize the character set converter from the character set
@@ -1274,6 +1294,10 @@
/* Also look for a plural specification. */
EXTRACT_PLURAL_EXPRESSION (nullentry, &domain->plural, &domain->nplurals);
+
+end:
+ __libc_lock_unlock_recursive (lock);
+ return;
}
---- patch3 END
--
Summary: Calling dcgettext(3) simultaneously sometimes causes
segmentation fault.
Product: glibc
Version: 2.3.3
Status: NEW
Severity: critical
Priority: P2
Component: libc
AssignedTo: gotom at debian dot or dot jp
ReportedBy: maeda dot naoaki at jp dot fujitsu dot com
CC: glibc-bugs at sources dot redhat dot com,maeda dot
naoaki at jp dot fujitsu dot com
http://sources.redhat.com/bugzilla/show_bug.cgi?id=322
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.