[PATCH][BZ #16145] Reduce lock contention in __tz_convert()

This patch is an "easy win" partial fix for BZ #16145, which notes
the heavy contention on tzset_lock when multiple threads are converting
times with localtime_r().
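
For reference, the kind of workload that shows this contention can be sketched as below. This is only an illustration: the function name run_contention_workload, the struct, the starting timestamp, and the thread and iteration counts are my own assumptions, not the harness used for the numbers in this mail.

```c
/* Sketch of a contended workload: several threads call localtime_r()
   in a tight loop.  All names and constants are illustrative
   assumptions, not part of the patch.  */
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

struct worker_arg
{
  long iterations;   /* conversions this thread should attempt */
  long completed;    /* conversions that returned non-NULL */
};

static void *
worker (void *p)
{
  struct worker_arg *arg = p;
  struct tm tm;
  time_t t = 1000000000;   /* arbitrary fixed starting timestamp */

  for (long i = 0; i < arg->iterations; i++)
    {
      /* In glibc each call goes through __tz_convert() and so takes
         tzset_lock.  */
      if (localtime_r (&t, &tm) != NULL)
        arg->completed++;
      t += 3600;   /* vary the input by one hour per call */
    }
  return NULL;
}

/* Run NTHREADS workers and return the total number of successful
   conversions, or -1 on allocation or thread-creation failure.  */
long
run_contention_workload (int nthreads, long iterations)
{
  pthread_t *threads = calloc (nthreads, sizeof *threads);
  struct worker_arg *args = calloc (nthreads, sizeof *args);
  long total = 0;
  int started = 0;
  int failed = (threads == NULL || args == NULL);

  for (int i = 0; !failed && i < nthreads; i++)
    {
      args[i].iterations = iterations;
      if (pthread_create (&threads[i], NULL, worker, &args[i]) != 0)
        failed = 1;
      else
        started++;
    }

  /* Join only the threads that were actually started.  */
  for (int i = 0; i < started; i++)
    {
      pthread_join (threads[i], NULL);
      total += args[i].completed;
    }

  free (threads);
  free (args);
  return failed ? -1 : total;
}
```

Timing a call such as run_contention_workload (8, 1000000) with clock_gettime (CLOCK_MONOTONIC, ...) before and after, with and without the patch applied, should give a rough throughput comparison.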

In __tz_convert(), the lock does not need to be held once
__tzfile_compute() / __tz_compute() have returned, so we can move the
unlock up to just after those calls.  There is still significant work
to be done after that point in __offtime(), which now runs outside the
critical section.  In my testing with 8 cores banging on localtime_r(),
throughput improved by ~20%.
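
The general shape of the change can be sketched as follows. This is not the glibc source: zone_lock, zone_offset, struct simple_time, split_seconds and convert_time are hypothetical stand-ins for tzset_lock, the time zone state, struct tm, __offtime() and __tz_convert(). The sketch assumes that everything used after the unlock is a private copy of the shared state.

```c
/* Sketch of narrowing a critical section.  NOT the glibc code; all
   names here are hypothetical stand-ins.  */
#include <pthread.h>

static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;
static long zone_offset = 3600;   /* shared state guarded by zone_lock */

struct simple_time
{
  long days;            /* whole days since the epoch, may be negative */
  int hour, min, sec;
};

/* Plays the role of the work done after the unlock.  It touches only
   its arguments, so it does not need the lock.  */
static void
split_seconds (long t, long offset, struct simple_time *out)
{
  long local = t + offset;
  long days = local / 86400;
  long rem = local % 86400;

  if (rem < 0)
    {
      /* C division truncates toward zero; normalize the remainder.  */
      rem += 86400;
      days--;
    }
  out->days = days;
  out->hour = rem / 3600;
  out->min = rem % 3600 / 60;
  out->sec = rem % 60;
}

void
convert_time (long t, struct simple_time *out)
{
  long offset;

  pthread_mutex_lock (&zone_lock);
  offset = zone_offset;              /* copy shared state into a local */
  pthread_mutex_unlock (&zone_lock); /* unlock before the heavy work */

  split_seconds (t, offset, out);    /* runs outside the lock */
}
```

The point of the pattern is that only the read of shared state is serialized, so threads no longer queue behind each other for the arithmetic that follows.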

	[BZ #16145] (partial fix)
	* time/tzset.c (__tz_convert): Unlock tzset_lock earlier
	to reduce lock contention.

diff --git a/time/tzset.c b/time/tzset.c
index 8bc7a2e..82324ca 100644
--- a/time/tzset.c
+++ b/time/tzset.c
@@ -644,6 +644,8 @@ __tz_convert (const time_t *timer, int use_localtime, struct tm *tp)
       leap_extra_secs = 0;

+  __libc_lock_unlock (tzset_lock);
   if (tp)
       if (! use_localtime)
@@ -659,8 +661,6 @@ __tz_convert (const time_t *timer, int use_localtime, struct tm *tp)
        tp = NULL;

-  __libc_lock_unlock (tzset_lock);
   return tp;

