This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro

From: Alexandre Oliva <aoliva at redhat dot com>
To: Torvald Riegel <triegel at redhat dot com>
Cc: libc-alpha at sourceware dot org, carlos at redhat dot com, mtk dot manpages at gmail dot com
Date: Thu, 21 Nov 2013 08:02:39 -0200
Subject: Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
Authentication-results: sourceware.org; auth=none
References: <20131113081059 dot 3464 dot 51385 dot stgit at frit dot home> <20131113081132 dot 3464 dot 30409 dot stgit at frit dot home> <1384859432 dot 32326 dot 364 dot camel at triegel dot csb> <orsiurva0g dot fsf at livre dot home> <1384956325 dot 3152 dot 591 dot camel at triegel dot csb>

On Nov 20, 2013, Torvald Riegel <triegel@redhat.com> wrote:

> What about linking statically and using LTO?

That's something I'd considered at some point, before concluding that a
lot of code in glibc just wasn't suited for this sort of use.  A number
of very small functions (feof comes to mind) are safe in spite of
avoiding synchronization primitives because we assume they won't be
inlined.  What makes them safe currently is their single access to a
single word; if we were to inline it along with other calls that take
such shortcuts, this would no longer make for consistent executions,
because of the very reordering possibilities you bring up.

> Such a scenario might not be common today, but I'd prefer to just be
> clear about what using MT-Unsafe functions in multi-threaded code can
> result in.  It doesn't hurt us to be clear, and we don't gain anything
> when we promise more than we can guarantee.

How about this:

  ...  Calling them within such contexts invokes undefined behavior.

?

> However, sometimes cached local information is used whereas in other
> cases, the locale object is accessed anew; thus, concurrent changes to
> the global locale object may cause these functions to behave in
> unexpected ways (i.e., behavior that is not possible if @code{setlocale}
> and the so-annotated functions would have been executed in some
> sequential order).

Thanks, taken.

> I don't quite know what you mean by "or even should @code{setlocale}
> alone be atomic" or if that's necessary, so I didn't add it to the
> sentence above.

It's implied by "sequential order"; I liked this phrasing of yours.
What I meant was that, even if setlocale alone was made atomic, this
still wouldn't solve the problem that these users could access two
different locale objects during one execution that ought to access a
single locale object.

>> >> +@item @code{envromt}
>> >> +@cindex envromt

>> > Why didn't
>> > you use a different MT-Safe tag (eg, MT-Safe-Env) for those functions,
>> > but instead merged them with the "true" MT-Safe?

>> Same reason as glocale, really: once all methods to modify something are
>> decreed unsafe to call in MT programs, this something becomes constant
>> in conformant MT programs, and therefore the functions are indeed
>> perfectly safe to call.

> That's true, but then you're still merging categories, namely functions
> "MT-Safe under constraint X", where X could be the environment stuff, or
> something else like glocale.  Or did I misunderstand your categories?

I don't know what you understood, but there appears to be some
disconnect indeed.  I wouldn't have said "categories", for example; what
each keyword stands for is a (mis)feature detected in the implementation
of certain functions.  Multiple such features may be present in each
function.

Most such features make functions unsafe to call in certain contexts
(i.e., when multiple threads are running, or within an asynchronous
signal handler, or when async cancellation is enabled), but there are
some kinds of exceptions:

a) some functions exhibit a misfeature, but the misfeature does not make
them unsafe to call, e.g., a function returns a newly-allocated piece of
memory or a file descriptor, and there's no way to stop that return
value from leaking should a thread be canceled between the moment the
function returns and the moment the result is stored in a location that
a cleanup handler could get to;

b) some of the misfeatures can be worked around by taking some
additional care in user code;

c) some of the misfeatures cause all functions that exhibit a certain
behavior to be unsafe to call, which in turn ensures the harmful
behavior won't ever be exercised in conforming programs, which thus
makes complementary misfeatures exhibited by other functions harmless.
We bind the complementary misfeatures together when some workaround is
possible that could make the unsafe functions safe to call, at the
expense of coordinating calls to the unsafe functions and to those that
were only regarded as safe because the unsafe behavior had been ruled
out.  This is the case of glocale and envromt.

> Also, from a readers perspective, mentioning "global" constraints like
> environment first before mentioning MT-Safe would be safer even if
> TL;DR.

Sorry, I don't follow.  Can you expand on what you have in mind?

> Well, what about "onceunsafe"?  Do you dislike this option?

Yeah.  For some reason it sounds to me like some native-Canadian city
name ;-) I don't like the additional length much, and I wouldn't want to
have to face that alliteration in a public speech ;-)

>> >> +Functions marked with @code{uunguard} modify non-atomically arguments or
>> >> +global objects that other functions access without synchronization.  To
>> >> +ensure MT- and AS-Safe behavior, callers should refrain from calling
>> >> +so-marked functions concurrently with readers of those objects.  A

> Just readers or also other "writers"?

Yeah, writers too, of course ;-)  I moved that sentence down to a new
paragraph and rewrote it altogether:

  In order to avoid the safety issues posed by these unguarded accesses,
  callers must guard calls to @emph{all} readers and writers of the
  affected objects with something equivalent to a @code{rwlock}, i.e.,
  writers must get exclusive access, while multiple concurrent readers
  may be allowed.

How's that?

> If that's the case, why do you have uunguard at all, given that it's
> just the negation of MT-Safe (or, be equal to MT-Unsafe)?

It documents a *reason* for a so-marked function to be MT-Unsafe.

> Second, assuming that uunguard is not unlike MT-Unsafe, why is it listed
> as a constraint of MT-Safe functions?

It's not a constraint; it's a (mis)feature that can be avoided by
constraining the program in certain ways.  In this case, by explicitly
guarding all potential readers and writers.

> What I'm saying is that the definition of MT-Safe at the very top should
> link to the listing of constraints.

Hmm, is this a result of the disconnect/misunderstanding we've detected?
I don't think linking the definition of MT-Safe to misfeatures that may
cause functions to be MT-Unsafe makes much sense.

And then, while some constraints may imply only MT-Unsafe, others may
imply other kinds of unsafety, and others still may imply multiple kinds
of unsafety (e.g., one function could be AS-Unsafe and AC-Unsafe, while
another could be just AC-Unsafe, because of the same misfeature).

> It's true that if you call an
> MT-Safe function A and a function B that is MT-Safe-under-constraint-X
> concurrently, and X doesn't hold, then this could be considered as
> MT-Safe A being mixed with MT-Unsafe B.  But readers could also
> interpret the first definition as "MT-Safe regardless".

I think I understand your concern.  The idea of marking both readers and
writers with the same (or complementary) keywords ought to indicate that
even the readers are not MT-Safe regardless, right?  E.g., consider the
documentation of setlocale:

     | MT-Unsafe uunguard, envromt || AS-Unsafe oncesafe, selfdeadlock,
     asmalloc, asynconsist || AC-Unsafe oncesafe, incansist, lockleak,
     memleak, fdleak |

while iswdigit says:

     | MT-Safe glocale || AS-Safe || AC-Safe |

It *should* be possible to infer from the docs that setlocale has
uunguard because of glocale (the documentation of uunguard says so), but
maybe making it uunguard:glocale would make it easier to correlate
readers (with just glocale) and writers (with uunguard:glocale).  Then,
if you decide you really need to call setlocale in a MT program, you set
up and take a global rw lock for writing around setlocale and any
uunguard:glocale-marked functions, and for reading around iswdigit and
any other glocale-marked functions.
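
A minimal sketch of that locking scheme, for illustration only.  The
names locale_lock, locked_setlocale and locked_iswdigit are
hypothetical; glibc provides no such lock, and every reader and writer
in the program would have to go through wrappers like these for the
scheme to help:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <stddef.h>
#include <wctype.h>

/* Hypothetical application-wide lock guarding the global locale. */
static pthread_rwlock_t locale_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Writer side: setlocale modifies the global locale, so it needs
   exclusive access.  Returns 1 on success, 0 if setlocale failed. */
static int locked_setlocale(int category, const char *name)
{
    int rc = pthread_rwlock_wrlock(&locale_lock);
    assert(rc == 0);
    int ok = setlocale(category, name) != NULL;
    rc = pthread_rwlock_unlock(&locale_lock);
    assert(rc == 0);
    (void)rc;
    return ok;
}

/* Reader side: iswdigit only reads the global locale, so several
   threads may hold the read lock at once.  Returns 1 or 0. */
static int locked_iswdigit(wint_t wc)
{
    int rc = pthread_rwlock_rdlock(&locale_lock);
    assert(rc == 0);
    int is_digit = iswdigit(wc) != 0;
    rc = pthread_rwlock_unlock(&locale_lock);
    assert(rc == 0);
    (void)rc;
    return is_digit;
}
```

The wrapper returns a success flag rather than setlocale's string
result, since that string may be overwritten by a later setlocale call
made after the lock is released.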

>> >> +Functions marked with @code{tempsig} may temporarily install signal
>> >> +handlers for internal purposes, which may interfere with other uses of
>> >> +those signals.  

>> >> +This makes such functions MT-Unsafe and AS-Unsafe to begin with.

>> > Why does it make such functions MT-Unsafe in every case?

>> Err...  Because they may interfere with other uses of those signals, and
>> there's not much one can do to avoid this interference.  Calling the
function concurrently from two different threads is a no-no.  Heck, since
>> signals are not thread-aware, it's not even safe to call the function in
>> a single thread; the signal it expects may be delivered to a different
>> thread, failing to fulfill its purpose.

> But that doesn't mean that the signal handler must lead to bad behavior.
> I was mainly wondering why installing a temporary signal handler would
> always lead to this being MT-Unsafe.  I suppose that there are ways to
> not make that MT-Unsafe, at least as far as glibc is concerned.

Ok, two comments on this.

1. the reasoning is not sets-temporary-signals-p => MT-Unsafe.  It goes
like: does this function set temporary signals in a way that causes
thread-unsafe behavior?  If so, mark it with tempsig, because that's
the keyword that documents this misfeature.  If some function
could set temporary signals without causing a safety problem as
described in tempsig, it just wouldn't get this note.

2. setting a signal temporarily means it ought to be restored before
returning.  But signal handlers are a process-wide property; there
aren't per-thread handlers.  So, when you set a signal handler in a
thread and get the previous handler back, three kinds of problems may
occur:

  a. you may get canceled before you get the return value from the
  signal call, so you can't restore the original handler => AC-Unsafe

  b. another thread may modify the signal handler, removing the handler
  you'll later restore, or overriding your newly-installed handler; if
  they ought to restore later, how do they synchronize so that they
  restore yours before you restore theirs?  there's just no
  standard-defined way to coordinate => MT-Unsafe

  c. you set a temp handler that restores its original handler; you get
  another signal and its handler installs the temp handler for its own
  uses, and then it gets the signal meant for you => you lose the signal
  if the other temp handler doesn't call you back, and you mess with the
  other handler if they do and you then restore the original handler =>
  AS-Unsafe
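
The install/restore pattern behind these three cases can be sketched as
follows.  This is an illustration under assumptions, not code from
glibc: with_temp_handler and temp_handler are hypothetical names, and
sigaction is used in place of signal.  The comments mark where the
windows described above open up:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_signal;

/* Temporary handler: only records that the signal arrived. */
static void temp_handler(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical function showing the tempsig pattern.  Returns 1 if the
   temporary handler ran, 0 if it did not, -1 on error. */
static int with_temp_handler(int signo)
{
    struct sigaction temp, saved;

    temp.sa_handler = temp_handler;
    sigemptyset(&temp.sa_mask);
    temp.sa_flags = 0;
    got_signal = 0;

    /* The disposition is process-wide, not per-thread. */
    if (sigaction(signo, &temp, &saved) != 0)
        return -1;
    /* Case a: a cancellation from here on leaves temp_handler
       installed, since nothing else knows about `saved`. */

    /* Case b: another thread may replace the handler at this point,
       and the restore below would then silently undo its change. */
    raise(signo);  /* deliver the signal to the calling thread */

    /* Restore what was installed on entry, even if it is stale. */
    if (sigaction(signo, &saved, NULL) != 0)
        return -1;
    return got_signal;
}
```

In a single-threaded program with no cancellation and no nested
handlers the function behaves as expected; the trouble only shows up
once any of the three cases above comes into play.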

>> > Given that we control the allocator, I believe that it can be made
>> > atomic (without disabling signals or reentrancy).

>> That's not enough; we'd have to make *storing* the address of the
>> allocated memory into its destination an atomic part of the allocation
>> atomic unit, otherwise memory would still leak after we returned it, if
>> cancellation occurred before the returned value was stored in the
>> destination.

> One can build wait-free atomic updates.

How would that help?

  foo = malloc(bar);

if the thread is canceled just after malloc returns but before the
returned value is stored in foo, how would you clean it up?
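
For illustration, one kind of additional care user code could take is
to register a cleanup handler for the destination first, and then
disable cancellation around the allocation and the store.  This is only
a sketch of a user-level workaround, not something proposed in this
thread; free_slot, alloc_uncancelable and use_buffer are hypothetical
names:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Cleanup handler: frees whatever the slot currently holds. */
static void free_slot(void *arg)
{
    void **slot = arg;
    free(*slot);
    *slot = NULL;
}

/* Allocate and store into *slot with cancellation disabled, so there
   is no point at which the block exists but *slot does not hold it
   and the thread can be canceled.  Returns 0 on success, -1 on error. */
static int alloc_uncancelable(void **slot, size_t size)
{
    int old_state, ignored;

    if (pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state) != 0)
        return -1;
    *slot = malloc(size);
    pthread_setcancelstate(old_state, &ignored);
    return *slot != NULL ? 0 : -1;
}

/* Returns 1 if the buffer was allocated and then released again. */
static int use_buffer(size_t size)
{
    void *buf = NULL;
    int ok;

    /* Register the handler before allocating: free(NULL) is harmless. */
    pthread_cleanup_push(free_slot, &buf);
    ok = alloc_uncancelable(&buf, size) == 0;
    /* ... cancellable work using buf would go here ... */
    pthread_cleanup_pop(1);  /* run free_slot on the normal path too */
    return ok && buf == NULL;
}
```

This does not make the plain foo = malloc(bar) statement any safer; it
merely moves the burden of closing the window onto the caller.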

>> They take a lock and don't release it, so the taken lock leaks.  I
>> understand the distinction you make, but I fail to see what good would
>> come out of this sort of hair splitting.  Like, what else might lockleak
>> be expected to mean?

> That the lock object itself leaks (eg, that it's not destroyed).

That would be a memleak, no?

> Which, I guess, might matter even for a robust lock (where an
> acquisition could "leak" and we might still have some way to reclaim
> it).

Oh, you mean something like running a destructor for the lock object,
not just leaking its memory?  Well, I guess lockleak could have been
defined so as to mean that.  But it wasn't ;-)

>> It's a partial overlap of async (signal) and inconsist (for
>> inconsistency).  Doubling the c wouldn't make it any less misleading ;-)

> Alright, why not async-inconsist, or asyncinconsist then?

See the 3 lines of safety notes for setlocale?  That's hardly the
longest set of keywords :-( Since the terms are arbitrary keywords, I'm
optimizing for size, some mnemonic value, and pronounceability, but
definitely not for self-explanatory meanings (see below).

> Just longer keywords?  I'd prefer something that has a somewhat
> descriptive name over a number.

Intuitive meanings are not necessarily a plus; they may induce users to
refrain from looking them up and to assume they mean something they
don't, just because it sounds like they do ;-)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

Follow-Ups:
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Torvald Riegel

References:
- [MTASCsft PATCH WIP5 00/33] MT-, AS- and AC-Safety docs
  - From: Alexandre Oliva
- [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Torvald Riegel
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Alexandre Oliva
- Re: [MTASCsft PATCH WIP5 01/33] Multi Thread, Async Signal and Async Cancel safety documentation: intro
  - From: Torvald Riegel
