This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH v2] Add x86 32 bit vDSO time function support
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: Nathan Lynch <Nathan_Lynch at codesourcery dot com>
- Cc: "GNU C. Library" <libc-alpha at sourceware dot org>
- Date: Mon, 03 Nov 2014 19:42:31 -0200
- Subject: Re: [PATCH v2] Add x86 32 bit vDSO time function support
- References: <5436D48C dot 2090509 at linux dot vnet dot ibm dot com> <5436F50C dot 9070002 at codesourcery dot com> <5457E5D5 dot 6000103 at linux dot vnet dot ibm dot com>
On 03-11-2014 18:30, Adhemerval Zanella wrote:
> On 09-10-2014 17:50, Nathan Lynch wrote:
>> On 10/09/2014 01:31 PM, Adhemerval Zanella wrote:
>>> +static long int
>>> +clock_gettime_syscall (clockid_t id, struct timespec *tp)
>>> +{
>>> + INTERNAL_SYSCALL_DECL (err);
>>> + return INTERNAL_SYSCALL (clock_gettime, err, 2, id, tp);
>>> +}
>>> +
>>> +static inline void
>>> +__vdso_platform_setup (void)
>>> +{
>>> + PREPARE_VERSION (linux26, "LINUX_2.6", 61765110);
>> Perhaps:
>>
>> PREPARE_VERSION_KNOWN (linux26, LINUX_2_6);
>>
>> (here and several other places)
> Thanks, I have fixed it in all those places now.
>
>>> +#ifdef SHARED
>>> +# define SYSCALL_GETTIME(id, tp) \
>>> + ({ long int (*f) (clockid_t, struct timespec *) = __vdso_clock_gettime; \
>>> + long int v_ret; \
>>> + PTR_DEMANGLE (f); \
>>> + v_ret = (*f) (id, tp); \
>>> + if (INTERNAL_SYSCALL_ERROR_P (v_ret, )) { \
>>> + __set_errno (INTERNAL_SYSCALL_ERRNO (v_ret, )); \
>>> + v_ret = -1; \
>>> + } \
>>> + v_ret; })
>> Does introducing the dispatch through function pointer here cause a
>> measurable performance regression on i386 kernels which lack the VDSO?
>> If so, is that a concern?
>>
>> When I've tried this approach on ARM, it appears to do so (around 5%
>> slowdown).
> Using a simple benchmark (attached), the difference in such scenarios does not
> seem as drastic as on ARM:
>
> kernel: Linux birita 3.13.0-39
> CPU: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
>
> EGLIBC 2.19-0ubuntu6.3: 1415.12 cycles
> GLIBC 2.20 master: 1421.66 cycles
>
>>
>>> +# define INTERNAL_GETTIME(id, tp) \
>>> + ({ long int (*f) (clockid_t, struct timespec *) = __vdso_clock_gettime; \
>>> + PTR_DEMANGLE (f); \
>>> + (*f) (id, tp); })
>>> +#endif
>> I'm probably missing something, but I am failing to see the need for an
>> INTERNAL_GETTIME definition in
>> sysdeps/unix/sysv/linux/x86/clock_gettime.c. I know this patch is
>> merely moving existing code, but sysdeps/unix/sysv/linux/clock_gettime.c
>> does not use INTERNAL_GETTIME, and neither does
>> sysdeps/unix/clock_gettime.c.
>>
>> INTERNAL_GETTIME is needed for timespec_get, but I am not seeing the
>> need to duplicate it for clock_gettime.
> i386 does not define HAVE_CLOCK_GETTIME_VSYSCALL and thus:
>
> sysdeps/unix/sysv/linux/clock_gettime.c:
>
> # define INTERNAL_VSYSCALL INTERNAL_SYSCALL
>
> and then, if INTERNAL_GETTIME is not defined, it will be defined as:
>
> #ifndef INTERNAL_GETTIME
> # define INTERNAL_GETTIME(id, tp) \
>   INTERNAL_VSYSCALL (clock_gettime, err, 2, id, tp)
> #endif
>
> And without it being set properly, PTR_DEMANGLE is not called either.
>
> With the PREPARE_VERSION_KNOWN fixes, is it OK to commit?
>
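For readers following the dispatch question above, here is a minimal standalone sketch of the pattern under discussion: call through a function pointer, then translate a kernel-style negative result into errno. It is not the glibc code. The names gettime_fn, fallback_gettime, gettime_ptr and dispatch_gettime are hypothetical, pointer mangling (PTR_DEMANGLE) is left out, and the pointer always targets a plain syscall fallback, which roughly corresponds to a kernel without the vDSO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names throughout; this is a sketch, not glibc code.  */
typedef long int (*gettime_fn) (clockid_t, struct timespec *);

/* Fallback path: enter the kernel directly and return 0 or a negative
   errno value, imitating the raw kernel return convention.  */
static long int
fallback_gettime (clockid_t id, struct timespec *tp)
{
  long int r = syscall (SYS_clock_gettime, id, tp);
  return r == -1 ? -errno : r;
}

/* Assumption: a real implementation would point this at the vDSO symbol
   during startup when the kernel exports one.  Here it always points at
   the fallback.  */
static gettime_fn gettime_ptr = fallback_gettime;

/* Dispatch through the pointer, then convert the kernel-style result
   into the usual -1/errno convention.  */
static int
dispatch_gettime (clockid_t id, struct timespec *tp)
{
  assert (tp != NULL);
  long int v_ret = gettime_ptr (id, tp);
  if (v_ret < 0)
    {
      errno = (int) -v_ret;
      return -1;
    }
  return 0;
}
```

The extra cost being asked about is the indirect call in dispatch_gettime; whether it is measurable depends on the machine and kernel.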
Sending the missing simple benchmark.
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
#include <stdint.h>

#ifdef __x86_64__
# define HP_TIMING_NOW(Var) \
  ({ unsigned int _hi, _lo; \
     asm volatile ("rdtsc" : "=a" (_lo), "=d" (_hi)); \
     (Var) = ((unsigned long long int) _hi << 32) | _lo; })
#else
# define HP_TIMING_NOW(Var) __asm__ __volatile__ ("rdtsc" : "=A" (Var))
#endif

/* Compute the difference between START and END, storing into DIFF.  */
#define HP_TIMING_DIFF(Diff, Start, End) ((Diff) = (End) - (Start))

/* Accumulate ADD into SUM.  No attempt is made to be thread-safe.  */
#define HP_TIMING_ACCUM_NT(Sum, Diff) ((Sum) += (Diff))

/* We use 64bit values for the times.  */
typedef unsigned long long int hp_timing_t;

#define NITER 10000000UL

int main ()
{
  const clockid_t id = CLOCK_REALTIME;
  uint64_t i;
  hp_timing_t start, end, diff;
  int ret = 0;

  HP_TIMING_NOW (start);
  for (i=0; i<NITER; ++i)
    {
      struct timespec ts;
      ret += clock_gettime (id, &ts);
    }
  HP_TIMING_NOW (end);

  HP_TIMING_DIFF (diff, start, end);
  double callcyc = (double)diff / (double)NITER;
  printf ("%2.2lf cycles\n", callcyc);

  return ret;
}
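As a possible companion to the benchmark above, the following sketch times the libc clock_gettime wrapper and a direct syscall in the same binary, so the two paths can be compared on one system. It is only an illustration: the helpers now_ns, bench_libc and bench_syscall are hypothetical names, and it reports nanoseconds measured with CLOCK_MONOTONIC rather than rdtsc cycles, so its numbers are not comparable with the cycle counts quoted earlier.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: current CLOCK_MONOTONIC time in nanoseconds.  */
static uint64_t
now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

/* Average nanoseconds per call through the libc wrapper, which may or
   may not use the vDSO depending on the libc and kernel.  */
static double
bench_libc (unsigned long niter)
{
  struct timespec ts;
  assert (niter > 0);
  uint64_t start = now_ns ();
  for (unsigned long i = 0; i < niter; ++i)
    clock_gettime (CLOCK_REALTIME, &ts);
  return (double) (now_ns () - start) / (double) niter;
}

/* Average nanoseconds per call when always entering the kernel.  */
static double
bench_syscall (unsigned long niter)
{
  struct timespec ts;
  assert (niter > 0);
  uint64_t start = now_ns ();
  for (unsigned long i = 0; i < niter; ++i)
    syscall (SYS_clock_gettime, CLOCK_REALTIME, &ts);
  return (double) (now_ns () - start) / (double) niter;
}
```

Calling both functions from main with the same iteration count and printing the two averages side by side gives a rough per-call comparison; repeated runs are advisable since the results are noisy.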