Calculate system time using rdtsc

守給你的承諾、 posted on 2019-11-28 10:31:14

Don't use the RDTSC machine instruction directly yourself: your OS scheduler could reschedule other threads or processes at arbitrary moments, or slow down the clock. Use a function provided by your library or OS instead.

My main objective is to avoid the need to perform a system call every time I want to know the system time

On Linux, read time(7), then use clock_gettime(2), which is really quick (it does not involve any slow system call) thanks to vdso(7).

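On a C++11-compliant implementation, simply use the standard <chrono> header. Standard C has clock(3), but note that it measures processor time rather than wall-clock time (POSIX fixes CLOCKS_PER_SEC at 1,000,000, hence the microsecond units). On Linux both are built on good enough time measurement functions (the <chrono> clocks typically end up in clock_gettime, and therefore in the vdso).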

Last time I measured clock_gettime it often took less than 4 nanoseconds per call.
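
For illustration, here is a minimal sketch of how one could read the clock through clock_gettime and estimate the per-call cost on one's own machine. The helper names monotonic_ns and avg_call_cost_ns are made up for this example, and the choice of CLOCK_MONOTONIC is an assumption (use CLOCK_REALTIME if you need calendar time). The number you get depends on the hardware and kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds, or 0 on failure.
   (Hypothetical helper, not part of any library.) */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Average cost of one clock_gettime call, in nanoseconds,
   measured over `calls` back-to-back calls. */
static double avg_call_cost_ns(unsigned calls)
{
    if (calls == 0)
        return 0.0;
    uint64_t start = monotonic_ns();
    for (unsigned i = 0; i < calls; i++)
        (void)monotonic_ns();
    uint64_t end = monotonic_ns();
    return (double)(end - start) / calls;
}
```

Calling avg_call_cost_ns with a few million iterations and printing the result gives a rough per-call figure. It includes the loop overhead, so treat it as an upper bound.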

Margaret Bloom

The idea is not unsound but it is not suited for user-mode applications, for which, as @Basile suggested, there are better alternatives.

Intel itself suggests using the TSC as a wall clock:

The invariant TSC will run at a constant rate in all ACPI P-, C-, and T-states.
This is the architectural behaviour moving forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or access to a platform resource.

However, care must be taken.

The TSC is not always invariant

In older processors the TSC is incremented on every internal clock cycle, so it was not a wall clock.
Quoting Intel:

For Pentium M processors (family [06H], models [09H, 0DH]); for Pentium 4 processors, Intel Xeon processors (family [0FH], models [00H, 01H, or 02H]); and for P6 family processors: the time-stamp counter increments with every internal processor clock cycle.

The internal processor clock cycle is determined by the current core-clock to bus-clock ratio. Intel® SpeedStep® technology transitions may also impact the processor clock.

If you only have a variant TSC, the measurements are unreliable for tracking time. There is hope for the invariant TSC, though.
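
As a rough sketch of how one might check for an invariant TSC before relying on it: the Intel manual documents the feature flag as CPUID.80000007H:EDX[8]. The snippet below assumes GCC or Clang (it uses their <cpuid.h> header) and the function name has_invariant_tsc is made up for this example. On non-x86 targets it simply reports "unknown".

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed available */
#endif

/* Returns 1 if CPUID reports an invariant TSC, 0 if it does not,
   and -1 if the answer cannot be determined on this target. */
static int has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return -1;
    return (int)((edx >> 8) & 1u);   /* CPUID.80000007H:EDX[8] */
#else
    return -1;
#endif
}
```

On Linux the same flag shows up as constant_tsc/nonstop_tsc in /proc/cpuinfo, which is usually the easier thing to look at.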

The TSC is not incremented at the frequency advertised in the brand string

Still quoting Intel:

the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the processor base frequency.
On certain processors, the TSC frequency may not be the same as the frequency in the brand string.

You can't simply take the frequency written on the box of the processor.
See below.

rdtsc is not serialising

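You need to serialise it from above and below: earlier instructions must have completed before the counter is read, and later instructions must not start before the read is done.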
See this.
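
One common way to do this, sketched here under assumptions, is to put an LFENCE on each side of RDTSC, which is what the Intel manual suggests for ordering RDTSC against surrounding instructions. The sketch assumes an x86-64 target and GCC or Clang (for <x86intrin.h>). The function name rdtsc_fenced is made up, and the behaviour of LFENCE on non-Intel processors may differ. On other targets the function just returns 0.

```c
#include <stdint.h>
#if defined(__x86_64__)
#include <x86intrin.h>   /* __rdtsc and _mm_lfence on GCC/Clang */
#endif

/* Reads the TSC with a fence before and after the read.
   Returns 0 on targets other than x86-64 (placeholder only). */
static uint64_t rdtsc_fenced(void)
{
#if defined(__x86_64__)
    _mm_lfence();              /* wait for earlier instructions to finish */
    uint64_t t = __rdtsc();
    _mm_lfence();              /* keep later instructions from starting early */
    return t;
#else
    return 0;
#endif
}
```

Taking two such readings around the code of interest and subtracting gives a tick count. Turning ticks into time still needs the TSC frequency, and nothing here protects against the thread being descheduled in between.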

The TSC is based on the ART (Always Running Timer) when invariant

The correct formula is

TSC_Value = (ART_Value * CPUID.15H:EBX[31:0] )/ CPUID.15H:EAX[31:0] + K

See section 17.15.4 of the Intel manual, volume 3.

Of course, you have to solve for ART_Value since you start from a TSC_Value. You can ignore K, as you are interested in deltas only. From the ART_Value delta you can get the elapsed time once you know the frequency of the ART. This is given as k * B, where k is a constant in the MSR MSR_PLATFORM_INFO and B is 100 MHz or 133+1/3 MHz depending on the processor.
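
To make the arithmetic concrete, here is a small sketch of the two steps above: inverting the formula for a delta (K cancels out) and turning ART ticks into nanoseconds. All function names are made up for this example, the inputs are assumed to have been read from CPUID.15H and from whatever source gives you the ART frequency, and the intermediate math is split into quotient and remainder to avoid overflowing 64 bits for realistic values.

```c
#include <stdint.h>

/* floor(a * b / c), computed as (a / c) * b + (a % c) * b / c
   so that the product does not overflow for realistic inputs.
   Returns 0 when c is 0. */
static uint64_t mul_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
    if (c == 0)
        return 0;
    return (a / c) * b + (a % c) * b / c;
}

/* ART ticks for a TSC delta: ART = TSC * EAX / EBX.
   eax_den is CPUID.15H:EAX, ebx_num is CPUID.15H:EBX. */
static uint64_t tsc_delta_to_art(uint64_t tsc_delta,
                                 uint32_t eax_den, uint32_t ebx_num)
{
    return mul_div_u64(tsc_delta, eax_den, ebx_num);
}

/* Nanoseconds for an ART delta, given the ART frequency in Hz. */
static uint64_t art_delta_to_ns(uint64_t art_delta, uint64_t art_hz)
{
    return mul_div_u64(art_delta, 1000000000u, art_hz);
}
```

With purely illustrative values EAX = 2, EBX = 250 and a 24 MHz ART, a TSC delta of 3,000,000,000 ticks maps to 24,000,000 ART ticks, that is, one second.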

As @BeeOnRope pointed out, from Skylake the ART crystal frequency is no longer the bus frequency.
The actual values, maintained by Intel, can be found in the turbostat.c file.

switch(model) 
{
case INTEL_FAM6_SKYLAKE_MOBILE: /* SKL */
case INTEL_FAM6_SKYLAKE_DESKTOP:    /* SKL */
case INTEL_FAM6_KABYLAKE_MOBILE:    /* KBL */
case INTEL_FAM6_KABYLAKE_DESKTOP:   /* KBL */
    crystal_hz = 24000000;  /* 24.0 MHz */
    break;
case INTEL_FAM6_SKYLAKE_X:  /* SKX */
case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
    crystal_hz = 25000000;  /* 25.0 MHz */
    break;
case INTEL_FAM6_ATOM_GOLDMONT:  /* BXT */
    crystal_hz = 19200000;  /* 19.2 MHz */
    break;
default:
    crystal_hz = 0; 
}
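
For what it's worth, a sketch of how one might obtain the inputs yourself: CPUID leaf 15H returns the ratio in EAX/EBX and, when the processor enumerates it, the crystal frequency in Hz in ECX. A table like the one above is the fallback for when ECX reads 0. This assumes GCC or Clang (<cpuid.h>), and the struct and function names are made up for this example.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed available */
#endif

struct tsc_art_info {
    uint32_t denominator;  /* CPUID.15H:EAX */
    uint32_t numerator;    /* CPUID.15H:EBX */
    uint32_t crystal_hz;   /* CPUID.15H:ECX, 0 if not enumerated */
};

/* Fills *out from CPUID leaf 15H. Returns 1 on success,
   0 if the leaf is unavailable or the target is not x86. */
static int read_tsc_art_info(struct tsc_art_info *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(0x15, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    out->denominator = eax;
    out->numerator = ebx;
    out->crystal_hz = ecx;
    return 1;
#else
    (void)out;
    return 0;
#endif
}

/* TSC frequency in Hz = crystal * EBX / EAX. Uses fallback_crystal_hz
   (for example a value from a per-model table) when ECX was 0.
   Returns 0 if any required input is missing. */
static uint64_t tsc_hz_from_info(const struct tsc_art_info *info,
                                 uint32_t fallback_crystal_hz)
{
    uint32_t crystal = info->crystal_hz ? info->crystal_hz
                                        : fallback_crystal_hz;
    if (info->denominator == 0 || info->numerator == 0 || crystal == 0)
        return 0;
    return (uint64_t)crystal * info->numerator / info->denominator;
}
```

If tsc_hz_from_info returns 0 even with a fallback, the ratio itself is not enumerated and you would have to calibrate the TSC against another clock instead.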

The TSC is not incremented when the processor enters a deep sleep state

This should not be a problem on single-socket machines, but the Linux kernel has some comments about the TSC being reset even in non-deep sleep states.

The context switches will poison the measurements

There is nothing you can do about it.
This actually prevents you from time-keeping with the TSC.
