Bug 15329 - _dl_sort_fini and _dl_sort_init don't keep SCCs contiguous
Summary: _dl_sort_fini and _dl_sort_init don't keep SCCs contiguous
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link (show other bugs)
Version: unspecified
: P2 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-02 09:09 UTC by Don Hatch
Modified: 2021-10-27 14:59 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Don Hatch 2013-04-02 09:09:41 UTC
Found this by surprise while unit/stress testing _dl_sort_fini
and the corresponding init sorting code
(specifically, testing the init code on all graphs of up to 4 nodes).
Previously, it wasn't clear to me whether the existing topsorting algorithm
keeps SCCs (strongly connected components) contiguous--
it seemed to, for the most part, but
I didn't have a proof that it did,
nor a counterexample showing it didn't.
Here's a small counterexample showing it doesn't.

The static dependency graph has 4 nodes 0,1,2,3, 
and 5 edges:
    0 depends on 1,2,3
    1 depends on 0
    2 doesn't depend on anything
    3 depends on 0
In this case the dag of SCCs has two nodes and one edge:
    {0,1,3} depends on {2}.
So it would be good if the sort would put 2 last.
But it doesn't-- it produces [0 3 2 1].

To put it another way, 1 depends (transitively) on 2,
but 2 doesn't depend (transitively) on 1,
so it's desireable for 1 to come before 2 in the output order.

The following probably isn't very interesting,
but here are the steps the sorting code takes:
    initial order: [0 1 2 3]
    move 0 after 3 -> [1 2 3 0], 0 has been seen once
    move 1 after 0 -> [2 3 0 1], 1 has been seen once
    move 2 after 0 -> [3 0 2 1], 2 has been seen once
    move 3 after 0 -> [0 3 2 1], 3 has been seen once
    move 0 after 1 -> [3 2 1 0], 0 has been seen twice
    move 3 after 0 -> [2 1 0 3], 3 has been seen twice
    move 2 after 0 -> [1 0 2 3], 2 has been seen twice
    move 1 after 0 -> [0 1 2 3], 1 has been seen twice
        (original order!)
    move 0 after 3 -> [1 2 3 0], 0 has been seen 3 times
    move 1 after 0 -> [2 3 0 1], 1 has been seen 3 times
    move 2 after 0 -> [3 0 2 1], 2 has been seen 3 times
    move 3 after 0 -> [0 3 2 1], 3 has been seen 3 times
    move 0 after 1 -> [3 2 1 0], 0 has been seen 4 times
    move 3 after 0 -> [2 1 0 3], 3 has been seen 4 times
    move 2 after 0 -> [1 0 2 3], 2 has been seen 4 times
    move 1 after 0 -> [0 1 2 3], 1 has been seen 4 times
        (original order!)
    move 0 after 3 -> [1 2 3 0], 0 has been seen 5 times
    move 1 after 0 -> [2 3 0 1], 1 has been seen 5 times
    move 2 after 0 -> [3 0 2 1], 2 has been seen 5 times
    move 3 after 0 -> [0 3 2 1], 3 has been seen 5 times
Now object 3 has been seen 5 times which exceeds its limit,
so 0 gets locked in place, all seen counts get cleared,
and we go on to sort the last three items,
which have no order conflicts within themselves and so the output order is [0 3 2 1].

This is something to keep in mind when overhauling the sorting code,
which already needs to be done for bug 15310 (sorting is O(n^3))
and bug 15311 (static deps can be violated by dynamic ones).
It should be possible to make a fast, robust,
easier-to-analyze implementation that keeps SCCs contiguous.