This is the mail archive of the mailing list for the glibc project.

Re: [PATCH 07/10] Add __pthread_set_abort_hook export

> (1) This is not necessary for existing and correctly synchronized code
> because assertions will also fail in nontransactional executions (and
> the failure will be reported as expected).

It's useful for existing code too because a transaction may see a different
state (due to races with other threads) than the re-execution.

Of course, if the code is correctly synchronized it shouldn't have any
such races, but not all code is.

> (2) For explicitly transactional code (ie, code in which some programmer
> explicitly used TSX), you want a facility to communicate some
> information out of transactions without having to finish execution of
> these transactions.

I want it for all transactional code, both implicit and explicit.

> For (2), if the explicitly transactional code is correct and it's just
> performance issues we want to debug, we could do without terminating the
> transaction (unless we have to write so much data out that we're hitting

Transactional performance issues usually lead to aborts. When an abort
happens, all the information (except what we can get from the profiler)
is lost.

> HTM capacity limits, etc.).  That is, I'm wondering whether assertions
> are the right tool for this.

I have done a lot of transactional debugging and I think assertions are
quite useful for hard problems. For easy to medium problems the
information from the profiler is usually enough.

> For (3) and also (2) if it's not just a performance problem, we need to
> terminate the current transaction to be able to get information out of
> it when we can't continue to execute it.  With TSX, we can either use
> the 8 bits that we can communicate via abort, or we could commit the
> transaction early, and then abort.

Hmm, you mean _xend(); assert()? That doesn't work for nested locking.

Ok one could do while (_xtest()) _xend(); assert(). 
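
To illustrate why a single _xend() is not enough under nesting, here is a
minimal sketch. It assumes nothing about the patch itself: sim_xbegin(),
sim_xtest() and sim_xend() are hypothetical stand-ins that model only the
RTM nesting depth, so the example runs without TSX hardware. Real code
would use _xtest() and _xend() from <immintrin.h>.

```c
#include <assert.h>

/* Hypothetical stand-ins for the RTM intrinsics. They model only the
   nesting depth; nothing here is actually transactional. */
static int sim_depth = 0;              /* current nesting depth */

static void sim_xbegin(void) { sim_depth++; }
static int  sim_xtest(void)  { return sim_depth > 0; }
static void sim_xend(void)   { assert(sim_depth > 0); sim_depth--; }

/* Each _xend() only closes one nesting level; the transaction commits
   when the depth reaches zero. Looping until _xtest() is false makes
   sure a following assert() runs outside any transaction. */
static int commit_all_levels(void)
{
    int closed = 0;
    while (sim_xtest()) {
        sim_xend();
        closed++;
    }
    return closed;                     /* number of levels closed */
}
```

With two nested levels open, one sim_xend() still leaves sim_xtest()
true, while commit_all_levels() closes both.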

HLE is not really supported by the abort hooks; they require RTM.

> Early commit would work if just RTM is used (ie, while (_xtest())
> _xcommit(); ).  But I guess it would fail if xacquire/xrelease is mixed
> in, or does TSX not complain about replacing xrelease with an RTM
> commit?

RTM inside HLE aborts.

> If TSX complains, we get a fault, IIRC, so when this fault happened
> within the code with the loop above, we'd still know that some assertion
> fired.  If we inline this code, or add other hints regarding what called
> it, I guess we could find out which assertion triggered the fault by
> looking at the code around where the fault happened?  Thoughts?

When inlined, the only way to identify the code is to use XABORT and
encode it in the abort code. That is essentially what TXN_ASSERT() does,
just in a more user-friendly way.
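
As a rough illustration of the encoding idea (this is not the actual
TXN_ASSERT() implementation, which is not shown in this thread): the
abort status word carries the XABORT immediate in bits 31:24 and sets
bit 0 for an explicit abort. The helper names, the macros and the site
table below are hypothetical, and sim_abort_status() only models the
status that a real _xabort() with a constant immediate would produce.

```c
#include <stddef.h>

#define STATUS_EXPLICIT  (1u << 0)              /* abort came from XABORT */
#define STATUS_CODE(s)   (((s) >> 24) & 0xffu)  /* XABORT imm8, bits 31:24 */

/* Hypothetical table: one 8-bit id per assertion site. */
static const char *const assert_sites[] = {
    "example.c:10: head != NULL",
    "example.c:42: len <= cap",
};
#define N_SITES (sizeof assert_sites / sizeof assert_sites[0])

/* Models the status word an XABORT with this immediate would yield. */
static unsigned sim_abort_status(unsigned char id)
{
    return ((unsigned)id << 24) | STATUS_EXPLICIT;
}

/* Abort-handler side: map a status word back to the failed assertion.
   Returns NULL for non-explicit aborts and for unknown ids. */
static const char *failed_assertion(unsigned status)
{
    if (!(status & STATUS_EXPLICIT))
        return NULL;                   /* conflict, capacity, etc. */
    unsigned id = STATUS_CODE(status);
    return id < N_SITES ? assert_sites[id] : NULL;
}
```

Since only 8 bits are available, the ids have to be allocated
process-wide and kept clear of any reserved codes.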

For many cases the profiler works too, but not for all. That is why
we added the abort hook mechanism.

> only if the outermost transaction was not started with xacquire.  But
> with TSX we just have <255 values that we can get out (ie, without the
> values reserved for hold locks etc.).  And when we abort, we jump to
> whatever started the outermost transaction, which could be code in
> applications (programmers using transactions explicitly), glibc (e.g.,
> lock elision), libstdc++ (if it doesn't use glibc locks), boost
> (likewise), libitm (__transaction { }), and so on.  So to make this work
> in general, all those components would have to support the special
> assertions.

Correct, they all would need to call the abort hook.

However, a common case is just using pthread locking, and with that it
just works.

> To actually support the assertions, abort codes need to be interpreted
> consistently, and all assertions in a process need to be encoded using
> <255 values.
> Who is supposed to be the consumer of the abort codes? (I've asked this
> previously, but you haven't answered.)  Is this code in the program, or
> something else?  This matters because it's the other end of the
> assertion, obviously.

For TXN_ASSERT() it's just the assert facility inside the program.

For some common cases that are in the standard lock library we have a
few reserved codes that can be observed in the profiler.

0xff = lock busy
0xfe = lock is locked (not in pthread)
0xfd = nested trylock (just added)

The profiler doesn't need the abort hook of course.
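
For a consumer inside the program, the reserved codes above could be
decoded along these lines. This is only a sketch: the enum and function
names are hypothetical and are not taken from the patch; only the three
numeric values come from the list above.

```c
/* Reserved abort codes as listed above; the names are hypothetical. */
enum {
    ABORT_LOCK_BUSY      = 0xff,
    ABORT_LOCK_LOCKED    = 0xfe,   /* not used by pthread */
    ABORT_NESTED_TRYLOCK = 0xfd,
};

/* Map an 8-bit abort code to a short description. */
static const char *describe_abort_code(unsigned char code)
{
    switch (code) {
    case ABORT_LOCK_BUSY:      return "lock busy";
    case ABORT_LOCK_LOCKED:    return "lock is locked";
    case ABORT_NESTED_TRYLOCK: return "nested trylock";
    default:                   return "unreserved";
    }
}
```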

> What do you think?  Are there any other alternatives?

while (_xtest()) _xend(); assert() may work.

Don't know of any other alternatives.


-- Speaking for myself only.
