This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: ctermid: return string literal, document MT-Safety pitfall
- From: Alexandre Oliva <aoliva at redhat dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Florian Weimer <fweimer at redhat dot com>, libc-alpha at sourceware dot org
- Date: Fri, 21 Nov 2014 07:30:45 -0200
- Subject: Re: ctermid: return string literal, document MT-Safety pitfall
- References: <ortx2b8l2k dot fsf at free dot home> <54620F80 dot 3030001 at redhat dot com> <ortx22ak52 dot fsf at free dot home> <5465EF19 dot 5040802 at redhat dot com> <1415971681 dot 4535 dot 326 dot camel at triegel dot csb> <54660803 dot 1050809 at redhat dot com> <1415973961 dot 4535 dot 331 dot camel at triegel dot csb> <or4mu1iv0p dot fsf at free dot home> <1416217460 dot 4535 dot 361 dot camel at triegel dot csb> <or8uj8kv1m dot fsf at free dot home> <1416435060 dot 3539 dot 104 dot camel at triegel dot csb>
On Nov 19, 2014, Torvald Riegel <triegel@redhat.com> wrote:
> On Tue, 2014-11-18 at 20:23 -0200, Alexandre Oliva wrote:
>> On Nov 17, 2014, Torvald Riegel <triegel@redhat.com> wrote:
>>
>> > On Fri, 2014-11-14 at 14:53 -0200, Alexandre Oliva wrote:
>> >> On Nov 14, 2014, Torvald Riegel <triegel@redhat.com> wrote:
>> >>
>> >> > AFAICT memset_s is still a sequentially-specified function.
>> >>
>> >> How can you tell? It's not like the standard explicitly says so, is it?
>> I'm asking how do you tell in general.
> If the function is doing something, for example a store, and the
> function does not specify what it does internally and which
> inter-thread happens-before relations this creates, then there's
> nothing specified that makes this not a data race if you try to look
> at an intermediate state from another thread.
Which is why I've resorted to non-threaded means of inspection of
intermediate states. I think we differ in whether “the function does
not specify what it does internally”. If the definition of the function
said “copy n chars from src[0..n-1] to dest[0..n-1] respectively”,
besides any pre- and post- conditions, then it *does* specify what it
does internally. Not the order in which the chars are copied, for sure,
but still, it says the function should copy each and every one of those
chars. It doesn't state how to copy a char, but anything other than
load from src[i] and store the loaded value in dest[i] is hardly a copy.
So while this makes room for an interrupted copy to leave dest[i] in an
unspecified state that could be its earlier value or the newly-copied
one, it would be hard to argue that anything else complies with the
behavior specification enclosed in quotes above.
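To make that reading concrete, here is a minimal sketch in C of a copy
that is stopped partway through. The names copy_some_steps and
each_char_old_or_new, and the explicit `order' array standing in for the
unspecified copy order, are hypothetical; nothing here is taken from
glibc or from any standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: carry out only the first `steps` of the n
   per-char copies, visiting indices in the caller-supplied `order`
   (assumed to be a permutation of 0..n-1). */
void copy_some_steps(char *dest, const char *src, size_t n,
                     const size_t *order, size_t steps)
{
    assert(steps <= n);
    for (size_t k = 0; k < steps; k++) {
        size_t i = order[k];
        assert(i < n);
        dest[i] = src[i];   /* load from src[i], store into dest[i] */
    }
}

/* Returns 1 if every dest[i] holds either its earlier value (old[i])
   or the newly-copied one (src[i]), and 0 otherwise. */
int each_char_old_or_new(const char *dest, const char *old,
                         const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (dest[i] != old[i] && dest[i] != src[i])
            return 0;
    return 1;
}
```

With dest = "abcd", src = "wxyz", order = {2, 0, 3, 1} and steps = 2,
dest ends up as "wbyd": dest[2] is already new while dest[1] is still
old, yet every char is one of the two values the quoted specification
allows.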
While I can see value in making simplifying assumptions to reason about
behavior in the presence of multiple threads, and I realize that the
no-data-race requirements can enable *reasoning* about sequential
functions in such contexts as if only the pre- and post-conditions
mattered, I do not agree that applying similar reasoning to go backwards
is logically sound.
I mean, “I perceive this as a sequential function, which enables
simplifying assumptions about internal behavior in multi-threaded
contexts, therefore I can disregard the explicit behavior specification
and only look at explicit or inferred (pre- and?) post-conditions to
reason in any context whatsoever, or to implement the function however I
like, even deviating from the specification, as long as it still
satisfies the post-conditions when given the pre-conditions” doesn't
hold, because there are issues that arise besides those that come up in
multi-threaded contexts, to which the simplifying assumptions for
reasoning about multi-threaded contexts do not apply.
> Well, the comparison callbacks can't just look at will at every piece
> of intermediate state.
Why is that? I mean, what, if any, part of the relevant standards says
so?
> So, I agree that these *specific* memory locations are intermediate
> states, but the comparison functions are not guaranteed to be able to
> look at other elements of the arrays and find sensible information in
> those.
The important question here IMHO is whether looking at them invokes
undefined behavior, or just yields unspecified values, possibly narrowed
to a subset of all values that might be held by the types of the objects
in those locations, if there can even be valid assumptions about the
types of those memory locations.
>> I'm just trying to figure out what the
>> heck you mean by “sequential function”, and by “sequential
>> specification”.
> What I mean is that they are not concurrent specifications that make
> guarantees about states of an unfinished execution as visible to
> concurrent observers. They only make guarantees about the state after a
> function has finished executing. (Sorry if I'm using shared-memory
> synchronization terminology here, but given that we want to distinguish
> between concurrent and non-concurrent, that seems to make sense.)
Thanks. That definitely makes sense, when the goal is to reason about
shared-memory multi-threaded (henceforth SMMT) issues. But there are
other issues for which this distinction, or the simplifications in SMMT
reasoning that follow from it, don't apply, and may even contradict
other standard-imposed requirements. So please take the “sequential
function” claims with a grain of salt, and don't use them to discard
parts of the specification you don't generally have to worry about when
you're thinking of SMMT, when the context is not limited to SMMT.
>> I had understood the latter had to do with
>> specifications limited to pre- and post-conditions, but the standards
>> we've been talking about do not limit function specifications to that.
> Why do you think that is the case?
What does “that” mean? That I had understood it in a certain way? Or
that the standards do not limit specs to pre- and post-conditions?
> The callback, or composition of functions in general, is one thing you
> mentioned, and I hope was able to convince you that this doesn't give
> guarantees about the caller (e.g., qsort) to the callee (e.g.,
> comparison function), except when those guarantees overlap with
> preconditions for the callee.
I'm afraid you haven't, but you've helped me understand our differences
in reasoning, because I won't turn specifications of behavior into pre-
and post-conditions and label a function as sequential to then pretend
the original specifications did not exist and did not impose any other
requirements that are not necessarily relevant for SMMT contexts, but
that might be in other contexts.
> "A function that may be safely invoked by an application while the
> asynchronous form of cancellation is enabled."
> That doesn't really tell me a lot :) I can interpret "safely invoked"
> to at least mean that the mere act of cancellation will not break
> anything. But it doesn't tell me which state one can expect after
> cancellation.
Yup. Again, the important question is: is it undefined or unspecified?
> One way to define safety would be to say that a cancelled function
> should either take effect or not, but never partially take effect. IOW,
> it's either just the precondition or the postcondition that holds.
This would be a way to extend the simplifying assumptions of sequential
functions to some other contexts. Sequential functions would
essentially be regarded as, and required to behave as, atomic.
> Another option would be to allow specified intermediate steps to take
> effect. For 2), we could say that cancellation happens anywhere between
> the steps the abstract machine would do, but not within a step. This
> would be satisfied under the requirement you assumed for memset and
> strcpy implementations, I believe.
Yeah, with the caveat that the order of steps of the abstract machine
that may be used to carry out the required behavior is not specified.
So, interrupting memset, you might observe that dest[i+1] is modified
while dest[i] wasn't yet, or vice-versa.
> For 3), it could be cancellation between any of the atomic steps, unless
> otherwise specified. For condvar wait, for example, this could be one
> of the three parts: lock release, wakeup, lock acquisition.
Eeek, it would be Really Bad (TM) IMHO if a condvar wait could be
canceled while the lock is not held: this could mess with enclosing
cleanup handlers that, among other things, release the lock.
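The usual cleanup-handler idiom shows what is at stake. Below is a
sketch in C, assuming deferred cancellation and a handler that
unconditionally unlocks the mutex; the names unlock_cleanup, waiter and
cancel_waiter_and_check_lock are hypothetical. The handler is only
correct if the mutex is held whenever it runs, which is what POSIX
specifies for a thread canceled while blocked in pthread_cond_wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;   /* predicate; never set in this sketch */

/* Cleanup handler: assumes the mutex is held when it runs. */
void unlock_cleanup(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

void *waiter(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_cleanup, &lock);
    while (!ready)
        pthread_cond_wait(&cond, &lock);   /* cancellation point */
    pthread_cleanup_pop(1);                /* unlock on the normal path too */
    return NULL;
}

/* Cancel a waiter, join it, and report whether the mutex is free again. */
bool cancel_waiter_and_check_lock(void)
{
    pthread_t t;
    void *result = NULL;

    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return false;
    if (pthread_cancel(t) != 0)
        return false;
    if (pthread_join(t, &result) != 0)
        return false;
    assert(result == PTHREAD_CANCELED);

    /* If the handler ran with the mutex held, it is unlocked by now. */
    if (pthread_mutex_trylock(&lock) != 0)
        return false;
    pthread_mutex_unlock(&lock);
    return true;
}
```

If the wait could be canceled between the lock release and the lock
re-acquisition, unlock_cleanup would unlock a mutex its thread does not
hold, and the final trylock check would no longer tell us anything.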
What states can cancellation cleanup handlers reliably inspect, anyway?
Are they to be regarded as running in async signal context, so that they
can't reliably access local state and are very limited in global state?
Or are they allowed to access local state, plus any global state that
could be accessed after pthread_join()ing the canceled thread?
>> Does this mean it is safe to access the variables
>> that were partially modified by the interrupted memcpy/strcpy/whatever,
>> and that this provides means to safely inspect intermediate states?
> For normal memcpy, strcpy, and other functions in group 1), the
> intermediate states aren't defined
Again, not defined or not specified?
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer