This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/21305] clock_gettime(CLOCK_MONOTONIC_RAW) can and should use rdtsc instructions instead of entering kernel through VDSO


https://sourceware.org/bugzilla/show_bug.cgi?id=21305

--- Comment #2 from Jason Vas Dias <jason.vas.dias at gmail dot com> ---
I am using the unmodified Linux 4.10.0 (latest stable git tag).


In arch/x86/entry/vdso/vclock_gettime.c, 
clock_gettime(CLOCK_MONOTONIC_RAW,&ts) 
is handled by :

notrace static int __always_inline do_monotonic(struct timespec *ts)
{
        unsigned long seq;
        u64 ns;
        int mode;

        do {
                seq = gtod_read_begin(gtod);
                mode = gtod->vclock_mode;
                ts->tv_sec = gtod->monotonic_time_sec;
                ns = gtod->monotonic_time_snsec;
                ns += vgetsns(&mode);
                ns >>= gtod->shift;
        } while (unlikely(gtod_read_retry(gtod, seq)));

        ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
        ts->tv_nsec = ns;

        return mode;
}


The problem is that it is doing locking here, in gtod_read_begin().
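
For readers unfamiliar with the pattern: the gtod_read_begin() / gtod_read_retry() pair is the reader side of a sequence counter. The reader waits while an update is in progress and retries if one happened during the read. Below is a minimal user-space sketch of that idea in C11. The names demo_gtod, demo_write and demo_read are hypothetical, it assumes a single writer, and it uses sequentially consistent atomics for simplicity; it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct vsyscall_gtod_data. */
struct demo_gtod {
    atomic_uint seq;            /* odd while a writer is mid-update */
    _Atomic uint32_t mult;
    _Atomic uint32_t shift;
};

/* Writer side: make seq odd, update the data, make seq even again. */
static void demo_write(struct demo_gtod *g, uint32_t mult, uint32_t shift)
{
    /* Single writer assumed: seq must be even when we start. */
    assert((atomic_load(&g->seq) & 1u) == 0);
    atomic_fetch_add(&g->seq, 1);
    atomic_store(&g->mult, mult);
    atomic_store(&g->shift, shift);
    atomic_fetch_add(&g->seq, 1);
}

/* Reader side: returns a mult/shift pair taken from the same update. */
static void demo_read(struct demo_gtod *g, uint32_t *mult, uint32_t *shift)
{
    unsigned begin, end;

    do {
        /* Spin while a writer is active (seq is odd). */
        while ((begin = atomic_load(&g->seq)) & 1u)
            ;
        *mult = atomic_load(&g->mult);
        *shift = atomic_load(&g->shift);
        end = atomic_load(&g->seq);
        /* If seq moved, an update overlapped the read: try again. */
    } while (end != begin);
}
```

The cost on the read side is the two loads of seq plus, rarely, a retry; the reader never writes to shared memory.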


To clarify:

$ objdump -t $BLD/linux-4.10/arch/x86/entry/vdso/vdso64.so.dbg | grep vvar
ffffffffffffe000 l       *ABS*  0000000000000000              vvar_page
ffffffffffffe000 l       .hash  0000000000000000              vvar_start
ffffffffffffe080 l       *ABS*  0000000000000000              vvar_vsyscall_gtod_data

So you can get the base address of the VDSO & VVAR page of any process:
$ egrep '\[(vdso|vvar)\]' /proc/$$/maps
7fff5d122000-7fff5d124000 r--p 00000000 00:00 0                          [vvar]
7fff5d124000-7fff5d126000 r-xp 00000000 00:00 0                          [vdso]

So in this case, vvar_vsyscall_gtod_data is at:

  (0xffffffffffffe080 - 0xffffffffffffe000) + 0x7fff5d122000
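
The same arithmetic as a small C helper, in case it is useful for scripting the lookup. The function name vvar_runtime_addr is made up for illustration; the inputs are the link-time addresses from objdump and the [vvar] base from /proc/<pid>/maps shown above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Runtime address of a vvar symbol:
 *   mapping base + (link-time symbol address - link-time vvar_start)
 */
static uint64_t vvar_runtime_addr(uint64_t map_base,
                                  uint64_t link_vvar_start,
                                  uint64_t link_sym)
{
    /* The symbol is expected to lie at or after the start of the page. */
    assert(link_sym >= link_vvar_start);
    return map_base + (link_sym - link_vvar_start);
}
```

With the values above (base 0x7fff5d122000, vvar_start 0xffffffffffffe000, symbol 0xffffffffffffe080) this gives 0x7fff5d122080.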

You can cast this to a pointer to 'struct vsyscall_gtod_data' in GDB
and print it (*p); it shows a valid struct, with vclock_mode set to 1
(VCLOCK_MODE_TSC), and with shift and mult member values that are very
close to those calculated in the function above (derived from the
function in clocksource.c), but which ARE dynamically adjusted, even on
machines with constant_tsc and nonstop_tsc enabled.

The precise TSC frequency may be unavailable on machines with cpuid level
< 0x15, but it is always calibrated by Linux during startup and stored in
'tsc_khz':

$ ksym tsc_khz 4
ffffffff82c04774 tsc_khz 002c25f3

(ksym is a little script I wrote that uses /proc/kallsyms, /proc/kcore
 and objdump to look up the value of a symbol in the running kernel; the
 2nd argument is the size of the symbol. So tsc_khz is
 0x2c25f3 = 2893299, close to the rated frequency of the CPU, 2.9 GHz.
 This is the initial value used by the ia64_tsc_calc_mult_shift() function.)
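
To make the relation between tsc_khz and a mult/shift pair concrete, here is a rough sketch of the rounding division such a calculation boils down to: mult is chosen so that (ticks * mult) >> shift is approximately ticks * 1000000 / tsc_khz nanoseconds. The helper name khz_to_mult is hypothetical, and the shift of 24 in the usage note is only an example value; the kernel chooses its own shift and, as noted above, adjusts mult at run time.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/*
 * For a clock running at 'khz' ticks per millisecond, return the
 * multiplier such that (ticks * mult) >> shift is close to nanoseconds.
 * The division rounds to nearest.
 */
static uint32_t khz_to_mult(uint32_t khz, uint32_t shift)
{
    uint64_t tmp;

    assert(khz != 0 && shift < 32);
    tmp = (NSEC_PER_MSEC << shift) + khz / 2;   /* + khz/2 rounds */
    tmp /= khz;
    assert(tmp <= UINT32_MAX);                  /* must fit in 32 bits */
    return (uint32_t)tmp;
}
```

For example, khz_to_mult(2893299, 24) yields a multiplier of roughly 5.8 million, i.e. about 0.3456 ns per TSC tick once shifted right by 24.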

I think GLIBC should access the (vvar_vsyscall_gtod_data)->shift and ->mult
values to calculate clock_gettime(CLOCK_MONOTONIC_RAW, &tsp) values in
user-space in the same way the kernel does:
    struct vsyscall_gtod_data *gtod = linux_vdso_vvar_vsyscall_gtod_lookup();
    U64_t tsc_ticks = x86_64_rdtscp();
    /* NB: the 64-bit product can overflow for large tick counts */
    U64_t ns = (tsc_ticks * gtod->mult) >> gtod->shift;
    tsp->tv_sec  = ns / 1000000000UL;   /* whole seconds */
    tsp->tv_nsec = ns % 1000000000UL;   /* remaining nanoseconds */
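
A self-contained sketch of that conversion step follows, under a few assumptions: the function name cycles_to_timespec is made up, the tick count and the mult/shift pair are passed in by the caller rather than read from the TSC and the vvar page, and the 128-bit intermediate relies on the unsigned __int128 extension of GCC and Clang. It only shows the arithmetic; it does not reproduce the kernel's time base.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a tick count to a timespec using a mult/shift pair. */
static struct timespec cycles_to_timespec(uint64_t cycles,
                                          uint32_t mult, uint32_t shift)
{
    struct timespec ts;
    unsigned __int128 scaled;
    uint64_t ns;

    assert(shift < 64);
    /* cycles * mult may not fit in 64 bits, so widen before multiplying. */
    scaled = (unsigned __int128)cycles * mult;
    ns = (uint64_t)(scaled >> shift);

    ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);   /* whole seconds */
    ts.tv_nsec = (long)(ns % NSEC_PER_SEC);    /* remaining nanoseconds */
    return ts;
}
```

One hour of ticks at 2893299 kHz multiplied by a mult near 5.8 million already exceeds 2^64, which is why the product is widened here.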


RE: 
> I don't see why your proposed performance enhancement couldn't be applied to 
> the vDSO itself.  

It is unlikely that the kernel developers would agree to abandon locking
around the vsyscall_gtod_data here.
Strictly, they are correct, in that it is possible that a process could
be updating gtod->mult and gtod->shift as another process reads them
without locking. But since updates to these integers are atomic,
unlocked readers won't read invalid (torn) values for either of them, and
it is incredibly unlikely that both gtod->mult AND gtod->shift
would be updated in such a way that an unlocked reader obtains
mismatched values for them.

I think taking the minuscule risk that once in a blue moon a reader
might get mismatched shift & mult values is far, far outweighed by
the vast benefits users would gain by being able to measure times
less than 600ns or so.
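
For anyone who wants to check the granularity on their own machine, here is a minimal sketch that reports the smallest non-zero step seen between back-to-back clock_gettime() calls. The helper names timespec_diff_ns and min_clock_step_ns are made up for this example, and the number it reports depends on the kernel, the clock id and the hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* b - a in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
           + (b->tv_nsec - a->tv_nsec);
}

/*
 * Smallest non-zero difference between two consecutive reads of 'clk'.
 * Returns -1 on error or if no non-zero step was observed.
 */
static int64_t min_clock_step_ns(clockid_t clk, int iterations)
{
    int64_t best = -1;

    assert(iterations > 0);
    for (int i = 0; i < iterations; i++) {
        struct timespec t0, t1;
        int64_t d;

        if (clock_gettime(clk, &t0) != 0 || clock_gettime(clk, &t1) != 0)
            return -1;
        d = timespec_diff_ns(&t0, &t1);
        if (d > 0 && (best < 0 || d < best))
            best = d;
    }
    return best;
}
```

Calling min_clock_step_ns() once with CLOCK_MONOTONIC and once with CLOCK_MONOTONIC_RAW (a Linux-specific clock id) shows the difference between the two paths being discussed.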


> I don't think it is a good idea to put detailed knowledge about system clocks
> into glibc.

Ideologically, perhaps, but we are not in a perfect world.
Users of a modern POSIX operating system on a 2.9-3.9 GHz machine
should expect to be able to measure the time with a granularity
of less than 500ns.

