Calculate system time using rdtsc

栀梦 2020-12-03 09:08

Suppose all the cores in my CPU have the same frequency. Technically, I can synchronize system time and time stamp counter pairs for each core every millisecond or so. Then based

2 answers
  • 2020-12-03 09:45

    The idea is not unsound but it is not suited for user-mode applications, for which, as @Basile suggested, there are better alternatives.

    Intel itself suggests using the TSC as a wall clock:

    The invariant TSC will run at a constant rate in all ACPI P-, C-, and T-states.
    This is the architectural behaviour moving forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or access to a platform resource.

    However, care must be taken.

    The TSC is not always invariant

    In older processors the TSC is incremented on every internal clock cycle, so it is not a wall clock.
    Quoting Intel:

    For Pentium M processors (family [06H], models [09H, 0DH]); for Pentium 4 processors, Intel Xeon processors (family [0FH], models [00H, 01H, or 02H]); and for P6 family processors: the time-stamp counter increments with every internal processor clock cycle.

    The internal processor clock cycle is determined by the current core-clock to bus-clock ratio. Intel® SpeedStep® technology transitions may also impact the processor clock.

    If you only have a variant TSC, the measurements are unreliable for tracking time. There is hope for the invariant TSC, though.
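
    Whether a given CPU has an invariant TSC can be checked at run time. The following is a minimal sketch, assuming GCC or Clang on x86 (for the `<cpuid.h>` helper) and relying on the Invariant TSC flag that Intel documents at CPUID.80000007H:EDX[8]. The function name `has_invariant_tsc` is made up for illustration and is not from the answer.

```c
#include <stdbool.h>
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction (assumed toolchain) */

/* Hypothetical helper: true if CPUID.80000007H:EDX[8] (Invariant TSC) is set. */
static bool has_invariant_tsc(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return false;

    return ((edx >> 8) & 1u) != 0;
}
```

    If this returns false, the counter may follow the core clock and should not be treated as a time source.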

    The TSC is not incremented at the frequency advertised in the brand string

    Still quoting Intel:

    the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the processor base frequency.
    On certain processors, the TSC frequency may not be the same as the frequency in the brand string.

    You can't simply take the frequency written on the box of the processor.
    See below.

    rdtsc is not serialising

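    You need to serialise it from above and below, so that earlier instructions finish before the counter is read and later ones do not start until it has been read.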
    See this.
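
    One common way to do this is to fence the read with LFENCE on both sides. This is only a sketch, assuming an x86-64 target and GCC or Clang (for `<x86intrin.h>`); the name `rdtsc_fenced` is hypothetical, and the linked material may recommend a different sequence (for example one based on RDTSCP).

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc and _mm_lfence on GCC/Clang (assumed toolchain) */

/* Hypothetical helper: read the TSC with an LFENCE before and after. */
static inline uint64_t rdtsc_fenced(void)
{
    _mm_lfence();              /* let earlier instructions complete first */
    uint64_t t = __rdtsc();    /* read the time-stamp counter */
    _mm_lfence();              /* keep later instructions from starting before the read */
    return t;
}
```

    The fences add a few cycles to every read, which is the price of getting a timestamp that belongs to a well-defined point in the instruction stream.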

    The TSC is based on the ART (Always Running Timer) when invariant

    The correct formula is

    TSC_Value = (ART_Value * CPUID.15H:EBX[31:0] )/ CPUID.15H:EAX[31:0] + K
    

    See section 17.15.4 of the Intel manual, volume 3.

    Of course, you have to solve for ART_Value since you start from a TSC_Value. You can ignore K as you are interested in deltas only. From the ART_Value delta you can get the time elapsed once you know the frequency of the ART. This is given as k * B, where k is a constant in the MSR MSR_PLATFORM_INFO and B is 100 MHz or 133+1/3 MHz depending on the processor.
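
    The arithmetic can be sketched as follows. The helper `tsc_delta_to_ns` and its parameter names are hypothetical; the caller is assumed to have already read CPUID.15H:EAX and CPUID.15H:EBX and to know the ART frequency in Hz. Floating point is used to avoid 64-bit overflow, so the result is an approximation for very large deltas.

```c
#include <stdint.h>

/* Hypothetical helper: convert a TSC delta to nanoseconds.
 * cpuid15_eax: CPUID.15H:EAX (denominator of the TSC/ART ratio)
 * cpuid15_ebx: CPUID.15H:EBX (numerator of the TSC/ART ratio)
 * art_hz:      ART frequency in Hz, supplied by the caller
 * Returns 0 if any of the inputs is not usable. */
static uint64_t tsc_delta_to_ns(uint64_t tsc_delta,
                                uint32_t cpuid15_eax,
                                uint32_t cpuid15_ebx,
                                uint64_t art_hz)
{
    if (cpuid15_eax == 0 || cpuid15_ebx == 0 || art_hz == 0)
        return 0;

    /* Solve TSC = ART * EBX / EAX + K for a delta (K cancels out). */
    long double art_delta = (long double)tsc_delta * cpuid15_eax / cpuid15_ebx;

    /* ART ticks divided by ART frequency gives seconds; scale to ns. */
    return (uint64_t)(art_delta * 1e9L / (long double)art_hz);
}
```

    For instance, with an assumed ratio of 250/2 and a 24 MHz ART, a delta of 3,000,000,000 TSC ticks corresponds to 24,000,000 ART ticks, i.e. one second.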

    As @BeeOnRope pointed out, starting with Skylake the ART crystal frequency is no longer the bus frequency.
    The actual values, maintained by Intel, can be found in the turbostat.c file.

    switch(model) 
    {
    case INTEL_FAM6_SKYLAKE_MOBILE: /* SKL */
    case INTEL_FAM6_SKYLAKE_DESKTOP:    /* SKL */
    case INTEL_FAM6_KABYLAKE_MOBILE:    /* KBL */
    case INTEL_FAM6_KABYLAKE_DESKTOP:   /* KBL */
        crystal_hz = 24000000;  /* 24.0 MHz */
        break;
    case INTEL_FAM6_SKYLAKE_X:  /* SKX */
    case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
        crystal_hz = 25000000;  /* 25.0 MHz */
        break;
    case INTEL_FAM6_ATOM_GOLDMONT:  /* BXT */
        crystal_hz = 19200000;  /* 19.2 MHz */
        break;
    default:
        crystal_hz = 0; 
    }
    

    The TSC is not incremented when the processor enters a deep sleep

    This should not be a problem on single-socket machines, but the Linux kernel has some comments about the TSC being reset even in non-deep sleep states.

    The context switches will poison the measurements

    There is nothing you can do about it.
    This actually prevents you from time-keeping with the TSC.

  • 2020-12-03 09:58

    Don't do that, i.e. don't use the RDTSC machine instruction directly yourself, because your OS scheduler could reschedule other threads or processes at arbitrary moments, or slow down the clock. Use a function provided by your library or OS.

    My main objective is to avoid the need to perform a system call every time I want to know the system time

    On Linux, read time(7) then use clock_gettime(2) which is really quick (and does not involve any slow system call) thanks to vdso(7).

    On a C++11 compliant implementation, simply use the standard <chrono> header. Standard C has clock(3) (giving microsecond precision, though note that it reports CPU time used by the process rather than wall-clock time). On Linux, both rely on good enough time measurement functions (so, indirectly, on the vDSO).

    Last time I measured clock_gettime it often took less than 4 nanoseconds per call.
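
    As an illustration, a minimal sketch of reading the system time through clock_gettime(2) might look like this. The wrapper name `now_ns` is made up for the example, and CLOCK_REALTIME is assumed to be the clock of interest since the question asks for system time.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict -std modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical wrapper: CLOCK_REALTIME as nanoseconds since the Epoch,
 * or 0 if clock_gettime fails. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
        return 0;

    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

    For measuring intervals rather than reading the calendar time, CLOCK_MONOTONIC is usually the safer choice because it is not adjusted when the system time is set.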
