This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 3/3] Mutex: Avoid useless spinning
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Thu, 5 Apr 2018 17:59:19 -0300
- Subject: Re: [PATCH 3/3] Mutex: Avoid useless spinning
- Autocrypt: addr=adhemerval dot zanella at linaro dot org; prefer-encrypt=mutual; keydata= xsFNBFcVGkoBEADiQU2x/cBBmAVf5C2d1xgz6zCnlCefbqaflUBw4hB/bEME40QsrVzWZ5Nq 8kxkEczZzAOKkkvv4pRVLlLn/zDtFXhlcvQRJ3yFMGqzBjofucOrmdYkOGo0uCaoJKPT186L NWp53SACXguFJpnw4ODI64ziInzXQs/rUJqrFoVIlrPDmNv/LUv1OVPKz20ETjgfpg8MNwG6 iMizMefCl+RbtXbIEZ3TE/IaDT/jcOirjv96lBKrc/pAL0h/O71Kwbbp43fimW80GhjiaN2y WGByepnkAVP7FyNarhdDpJhoDmUk9yfwNuIuESaCQtfd3vgKKuo6grcKZ8bHy7IXX1XJj2X/ BgRVhVgMHAnDPFIkXtP+SiarkUaLjGzCz7XkUn4XAGDskBNfbizFqYUQCaL2FdbW3DeZqNIa nSzKAZK7Dm9+0VVSRZXP89w71Y7JUV56xL/PlOE+YKKFdEw+gQjQi0e+DZILAtFjJLoCrkEX w4LluMhYX/X8XP6/C3xW0yOZhvHYyn72sV4yJ1uyc/qz3OY32CRy+bwPzAMAkhdwcORA3JPb kPTlimhQqVgvca8m+MQ/JFZ6D+K7QPyvEv7bQ7M+IzFmTkOCwCJ3xqOD6GjX3aphk8Sr0dq3 4Awlf5xFDAG8dn8Uuutb7naGBd/fEv6t8dfkNyzj6yvc4jpVxwARAQABzUlBZGhlbWVydmFs IFphbmVsbGEgTmV0dG8gKExpbmFybyBWUE4gS2V5KSA8YWRoZW1lcnZhbC56YW5lbGxhQGxp bmFyby5vcmc+wsF3BBMBCAAhBQJXFRpKAhsDBQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJ EKqx7BSnlIjv0e8P/1YOYoNkvJ+AJcNUaM5a2SA9oAKjSJ/M/EN4Id5Ow41ZJS4lUA0apSXW NjQg3VeVc2RiHab2LIB4MxdJhaWTuzfLkYnBeoy4u6njYcaoSwf3g9dSsvsl3mhtuzm6aXFH /Qsauav77enJh99tI4T+58rp0EuLhDsQbnBic/ukYNv7sQV8dy9KxA54yLnYUFqH6pfH8Lly sTVAMyi5Fg5O5/hVV+Z0Kpr+ZocC1YFJkTsNLAW5EIYSP9ftniqaVsim7MNmodv/zqK0IyDB GLLH1kjhvb5+6ySGlWbMTomt/or/uvMgulz0bRS+LUyOmlfXDdT+t38VPKBBVwFMarNuREU2 69M3a3jdTfScboDd2ck1u7l+QbaGoHZQ8ZNUrzgObltjohiIsazqkgYDQzXIMrD9H19E+8fw kCNUlXxjEgH/Kg8DlpoYJXSJCX0fjMWfXywL6ZXc2xyG/hbl5hvsLNmqDpLpc1CfKcA0BkK+ k8R57fr91mTCppSwwKJYO9T+8J+o4ho/CJnK/jBy1pWKMYJPvvrpdBCWq3MfzVpXYdahRKHI ypk8m4QlRlbOXWJ3TDd/SKNfSSrWgwRSg7XCjSlR7PNzNFXTULLB34sZhjrN6Q8NQZsZnMNs TX8nlGOVrKolnQPjKCLwCyu8PhllU8OwbSMKskcD1PSkG6h3r0AqzsFNBFcVGkoBEACgAdbR Ck+fsfOVwT8zowMiL3l9a2DP3Eeak23ifdZG+8Avb/SImpv0UMSbRfnw/N81IWwlbjkjbGTu oT37iZHLRwYUFmA8fZX0wNDNKQUUTjN6XalJmvhdz9l71H3WnE0wneEM5ahu5V1L1utUWTyh VUwzX1lwJeV3vyrNgI1kYOaeuNVvq7npNR6t6XxEpqPsNc6O77I12XELic2+36YibyqlTJIQ V1SZEbIy26AbC2zH9WqaKyGyQnr/IPbTJ2Lv0dM3RaXoVf+CeK7gB2B+w1hZummD21c1Laua +VIMPCUQ+EM8W9EtX+0iJXxI+wsztLT6vltQcm+5Q7tY+HFUucizJkAOAz98YFucwKefbkTp eKvCfCwiM1bGatZEFFKIlvJ2QNMQNiUrqJBlW9nZp/k7pbG3oStOjvawD9ZbP9e0fnlWJIsj 6c7pX354Yi7kxIk/6gREidHLLqEb/otuwt1aoMPg97iUgDV5mlNef77lWE8vxmlY0FBWIXuZ yv0XYxf1WF6dRizwFFbxvUZzIJp3spAao7jLsQj1DbD2s5+S1BW09A0mI/1DjB6EhNN+4bDB SJCOv/ReK3tFJXuj/HbyDrOdoMt8aIFbe7YFLEExHpSk+HgN05Lg5TyTro8oW7TSMTk+8a5M kzaH4UGXTTBDP/g5cfL3RFPl79ubXwARAQABwsFfBBgBCAAJBQJXFRpKAhsMAAoJEKqx7BSn lIjvI/8P/jg0jl4Tbvg3B5kT6PxJOXHYu9OoyaHLcay6Cd+ZrOd1VQQCbOcgLFbf4Yr+rE9l mYsY67AUgq2QKmVVbn9pjvGsEaz8UmfDnz5epUhDxC6yRRvY4hreMXZhPZ1pbMa6A0a/WOSt AgFj5V6Z4dXGTM/lNManr0HjXxbUYv2WfbNt3/07Db9T+GZkpUotC6iknsTA4rJi6u2ls0W9 1UIvW4o01vb4nZRCj4rni0g6eWoQCGoVDk/xFfy7ZliR5B+3Z3EWRJcQskip/QAHjbLa3pml xAZ484fVxgeESOoaeC9TiBIp0NfH8akWOI0HpBCiBD5xaCTvR7ujUWMvhsX2n881r/hNlR9g fcE6q00qHSPAEgGr1bnFv74/1vbKtjeXLCcRKk3Ulw0bY1OoDxWQr86T2fZGJ/HIZuVVBf3+ gaYJF92GXFynHnea14nFFuFgOni0Mi1zDxYH/8yGGBXvo14KWd8JOW0NJPaCDFJkdS5hu0VY 7vJwKcyHJGxsCLU+Et0mryX8qZwqibJIzu7kUJQdQDljbRPDFd/xmGUFCQiQAncSilYOcxNU EMVCXPAQTteqkvA+gNqSaK1NM9tY0eQ4iJpo+aoX8HAcn4sZzt2pfUB9vQMTBJ2d4+m/qO6+ cFTAceXmIoFsN8+gFN3i8Is3u12u8xGudcBPvpoy4OoG
- Openpgp: preference=signencrypt
- References: <1522394093-9835-1-git-send-email-kemi.wang@intel.com> <1522394093-9835-3-git-send-email-kemi.wang@intel.com>
On 30/03/2018 04:14, Kemi Wang wrote:
> Usually, we can't set too short time out while spinning on the lock, that
> probably makes a thread which is trying to acquire the lock go to sleep
> quickly, thus weakens the benefit of pthread adaptive spin lock.
>
> However, there is also a problem if we set the time out large in
> case of protecting a small critical section with severe lock contention.
> As we can see the test result in the last patch, the performance is highly
> effected by the spin count tunables, smaller spin count, better performance
> improvement. This is because the thread probably spins on the lock until
> timeout in severe lock contention before going to sleep.
>
> In this patch, we avoid the useless spin by making the spinner sleep
> if it fails to acquire the lock when the lock is available, as suggested
> by Tim Chen.
>
> nr_threads base COUNT=1000(head~1) COUNT=1000(head)
> 1 51644585 51323778(-0.6%) 51378551(-0.5%)
> 2 7914789 9867343(+24.7%) 11503559(+45.3%)
> 7 1687620 3430504(+103.3%) 7817383(+363.2%)
> 14 1026555 1843458(+79.6%) 7360883(+617.0%)
> 28 962001 681965(-29.1%) 5681945(+490.6%)
> 56 883770 364879(-58.7%) 3416068(+286.5%)
> 112 1150589 415261(-63.9%) 3255700(+183.0%)
As before [2], I checked the change on a 64 cores aarch64 machine, but
differently than previous patch this one seems to show improvements:
nr_threads base head(SPIN_COUNT=10) head(SPIN_COUNT=1000)
1 27566206 28776779 (4.206770) 28778073 (4.211078)
2 8498813 9129102 (6.904173) 7042975 (-20.670782)
7 5019434 5832195 (13.935765) 5098511 (1.550982)
14 4379155 6507212 (32.703053) 5200018 (15.785772)
28 4397464 4584480 (4.079329) 4456767 (1.330628)
56 4020956 3534899 (-13.750237) 4096197 (1.836850)
I would suggest you to squash both patch in only one for version 2.
>
> Suggested-by: Tim Chen <tim.c.chen@intel.com>
> Signed-off-by: Kemi Wang <kemi.wang@intel.com>
> ---
> nptl/pthread_mutex_lock.c | 25 +++++++++++--------------
> 1 file changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index c3aca93..0faee1a 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -127,22 +127,19 @@ __pthread_mutex_lock (pthread_mutex_t *mutex)
> int cnt = 0;
> int max_cnt = MIN (__mutex_aconf.spin_count,
> mutex->__data.__spins * 2 + 100);
> +
> + /* MO read while spinning */
> do
> - {
> - if (cnt >= max_cnt)
> - {
> - LLL_MUTEX_LOCK (mutex);
> - break;
> - }
> - /* MO read while spinning */
> - do
> - {
> - atomic_spin_nop ();
> - }
> - while (atomic_load_relaxed (&mutex->__data.__lock) != 0 &&
> + {
> + atomic_spin_nop ();
> + }
> + while (atomic_load_relaxed (&mutex->__data.__lock) != 0 &&
> ++cnt < max_cnt);
> - }
> - while (LLL_MUTEX_TRYLOCK (mutex) != 0);
> + /* Try to acquire the lock if lock is available or the spin count
> + * is run out, go to sleep if fails
> + */
> + if (LLL_MUTEX_TRYLOCK (mutex) != 0)
> + LLL_MUTEX_LOCK (mutex);
>
> mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
> }
>
Please fix the format issue here.