Re: [PATCH v2 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
- From: Florian Weimer <fweimer@redhat.com>
- To: kemi <kemi.wang@intel.com>, Adhemerval Zanella <adhemerval.zanella@linaro.org>, Glibc alpha <libc-alpha@sourceware.org>
- Cc: Dave Hansen <dave.hansen@linux.intel.com>, Tim Chen <tim.c.chen@intel.com>, Andi Kleen <andi.kleen@intel.com>, Ying Huang <ying.huang@intel.com>, Aaron Lu <aaron.lu@intel.com>, Lu Aubrey <aubrey.li@intel.com>
- Date: Tue, 8 May 2018 17:44:05 +0200
- Subject: Re: [PATCH v2 1/3] Tunables: Add tunables of spin count for pthread adaptive spin mutex
- References: <1524624988-29141-1-git-send-email-kemi.wang@intel.com> <0c66f19d-c0e8-accd-85dd-7e55dd6da1af@redhat.com> <55c818fb-1b7e-47d0-0287-2ea33ce69fd5@intel.com>
On 05/02/2018 01:06 PM, kemi wrote:
> Hi, Florian
> Thanks for taking the time to review.
> On 2018-05-02 16:04, Florian Weimer wrote:
>> On 04/25/2018 04:56 AM, Kemi Wang wrote:
>>> +  mutex {
>>> +    spin_count {
>>> +      type: INT_32
>>> +      minval: 0
>>> +      maxval: 30000
>>> +      default: 1000
>>> +    }
>> How did you come up with the default and maximum values? Larger maximum values might be useful for testing boundary conditions.
> For the maximum value of spin count:
> Please note that mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8, and the variable *cnt* can reach
> the value of the spin count tunable when spinning times out. In that case, mutex->__data.__spins is increased and can get close to *cnt*
> (i.e. close to the value of the spin count). Keeping the spin count below SHRT_MAX avoids overflowing the
> mutex->__data.__spins variable, which may have type short.
Could you add this as a comment, please?
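Something along these lines could work as a starting point for that comment (just a sketch; exact wording and placement are up to you, and it assumes __spins keeps its short type):

  /* After a spin timeout the spin count is adapted via
       mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
     where cnt can reach the spin_count tunable, so __spins can get
     close to the tunable's value.  __spins may be a short, so the
     tunable's maximum must stay below SHRT_MAX (32767); 30000 leaves
     some headroom.  */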
> For the default value of spin count:
> I referred to the previous number of 100 trylock attempts in the loop. Since this mode is changed to read-only while spinning,
> I suppose the value can be larger because of the lower overhead and latency of a read compared with cmpxchg.
Ahh, makes sense. Perhaps put this information into the commit message.
> Perhaps we should choose the default value of the spin count differently according to the architecture.
Sure, or if there is just a single good choice for the tunable, just use
that and remove the tunable again. I guess one aspect here is to
experiment with different values and see if there's a clear winner.
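(For such experiments the tunable can presumably be overridden at run time without rebuilding, e.g. something like

  GLIBC_TUNABLES=glibc.mutex.spin_count=5000 ./benchmark

assuming the entry ends up under the top-level glibc namespace; the exact name depends on how the dl-tunables.list entry is nested.)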
>>> +# define TUNABLE_CALLBACK_FNDECL(__name, __type) \
>>> +static inline void \
>>> +__always_inline \
>>> +do_set_mutex_ ## __name (__type value) \
>>> +{ \
>>> +  __mutex_aconf.__name = value; \
>>> +} \
>>> +void \
>>> +TUNABLE_CALLBACK (set_mutex_ ## __name) (tunable_val_t *valp) \
>>> +{ \
>>> +  __type value = (__type) (valp)->numval; \
>>> +  do_set_mutex_ ## __name (value); \
>>> +}
>>> +
>>> +TUNABLE_CALLBACK_FNDECL (spin_count, int32_t);
>> I'm not sure if the macro is helpful in this context.
> It is a matter of taste.
> But perhaps we will have other mutex tunables in the future.
We can still macroize the code at that point. But no strong preference
here.
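For reference, the unmacroized version for just the spin_count tunable would look roughly like this (simply the macro expanded by hand, untested):

static inline void
__always_inline
do_set_mutex_spin_count (int32_t value)
{
  __mutex_aconf.spin_count = value;
}

void
TUNABLE_CALLBACK (set_mutex_spin_count) (tunable_val_t *valp)
{
  int32_t value = (int32_t) valp->numval;
  do_set_mutex_spin_count (value);
}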
>>> +void (*const __pthread_mutex_tunables_init_array []) (int, char **, char **)
>>> +  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
>>> +{
>>> +  &mutex_tunables_init
>>> +};
>> Can't you perform the initialization as part of overall pthread initialization? This would avoid the extra relocation.
> Thanks for your suggestion. I am not sure how to do it yet and will take a look at it.
The code would go into nptl/nptl-init.c. It's just an idea, but I think
it should be possible to make it work.
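Roughly, as a sketch (the names here are made up and would need adjusting): drop the init_array entry, give the init function a proper prototype, and call it from the existing initialization path, e.g.

/* Hypothetical header for the new code, say pthread_mutex_conf.h:  */
extern void __pthread_tunables_init (void) attribute_hidden;

/* nptl/nptl-init.c, somewhere in __pthread_initialize_minimal_internal:  */
  __pthread_tunables_init ();

That way no extra relocation or init_array slot is needed.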
Thanks,
Florian