Question
Update: fixed the delta calculation in the code; the issue remains.
Folks, could you please explain why I'm getting very strange results from time to time using the following code:
#include <unistd.h>
#include <sys/time.h>
#include <stdio.h>

int main()
{
    struct timeval start, end;
    long mtime1, mtime2, diff;

    while(1)
    {
        gettimeofday(&start, NULL);
        usleep(2000);
        gettimeofday(&end, NULL);

        mtime1 = (start.tv_sec * 1000 + start.tv_usec/1000.0);
        mtime2 = (end.tv_sec * 1000 + end.tv_usec/1000.0);

        diff = mtime2 - mtime1;
        if(diff > 10)
            printf("WTF: %ld\n", diff);
    }
    return 0;
}
(You can compile and run it with: gcc test.c -o out -lrt && ./out)
What I'm experiencing are sporadic large values of the diff variable, almost every second or even more often, e.g.:
$ gcc test.c -o out -lrt && ./out
WTF: 14
WTF: 11
WTF: 11
WTF: 11
WTF: 14
WTF: 13
WTF: 13
WTF: 11
WTF: 16
How can this be possible? Is the OS to blame? Does it do too much context switching? But my box is idle (load average: 0.02, 0.02, 0.3).
Here is my Linux kernel version:
$ uname -a
Linux kurluka 2.6.31-21-generic #59-Ubuntu SMP Wed Mar 24 07:28:56 UTC 2010 i686 GNU/Linux
Answer 1:
The sleep functions only guarantee that you sleep AT LEAST a certain amount of time. Since Linux is not a real-time OS, you CANNOT be sure that it will sleep ONLY the amount of time you asked for, so you cannot count on that value. As you have observed, the actual sleep time can end up being quite large.
The Linux scheduler cannot guarantee that; with a real-time OS you can get it.
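If what you actually need is periodic work that stays on schedule despite the occasional oversleep, a common workaround (my sketch, not part of this answer) is to sleep until absolute deadlines with clock_nanosleep, so a late wakeup is not carried over into the next period. Compile with -lrt as in the question:

#include <time.h>

int main(void)
{
    struct timespec deadline;

    clock_gettime(CLOCK_MONOTONIC, &deadline);
    for (;;) {
        /* advance the deadline by 2 ms, normalizing tv_nsec */
        deadline.tv_nsec += 2000000;
        if (deadline.tv_nsec >= 1000000000) {
            deadline.tv_nsec -= 1000000000;
            deadline.tv_sec  += 1;
        }
        /* an overshoot here is absorbed: the next deadline is computed
           from the schedule, not from the actual wakeup time */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
        /* ... periodic work goes here ... */
    }
    return 0;
}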
Your formula is wrong in a way, but I don't think that can be the reason for such long sleep times. I checked the two formulae with this snippet, and I get the same result:
#include <unistd.h>
#include <sys/time.h>
#include <time.h>
#include <stdio.h>

int main()
{
    struct timeval start, end;
    long mtime, mtime2, start_time, end_time, seconds, useconds;

    while(1)
    {
        gettimeofday(&start, NULL);
        usleep(2000);
        gettimeofday(&end, NULL);

        /* formula 1: subtract the fields first, then convert to ms */
        seconds  = end.tv_sec  - start.tv_sec;
        useconds = end.tv_usec - start.tv_usec;
        mtime = ((seconds) * 1000 + useconds/1000.0) + 0.5;

        /* formula 2: convert each timestamp to ms, then subtract */
        start_time = ((start.tv_sec) * 1000 + start.tv_usec/1000.0) + 0.5;
        end_time   = ((end.tv_sec)   * 1000 + end.tv_usec/1000.0)   + 0.5;
        mtime2 = end_time - start_time;

        if(mtime > 10 || mtime2 > 10)
        {
            printf("WTF: %ld\n", mtime);
            printf("WTF2: %ld\n", mtime2);
        }
    }
    return 0;
}
The results :
$ gcc test.c -o out -lrt && ./out
WTF: 11
WTF2: 12
WTF: 21
WTF2: 21
I thought it was wrong because the useconds part wraps around and could produce a large negative difference, but that would not lead to such large times, since you are using signed long integers ...
my2cents
Edit: from man nanosleep:
The current implementation of nanosleep() is based on the normal kernel timer mechanism, which has a resolution of 1/HZ s (see time(7)). Therefore, nanosleep() pauses always for at least the specified time, however it can take up to 10 ms longer than specified until the process becomes runnable again. For the same reason, the value returned in case of a delivered signal in *rem is usually rounded to the next larger multiple of 1/HZ s.
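As a side note (my addition, not from the original answer): you can ask the kernel what resolution it actually reports for a clock with clock_getres; on kernels with high-resolution timers this typically prints 1 ns rather than 1/HZ. Compile with -lrt as in the question:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec res;

    /* resolution of the clock backing the timers; compare with 1/HZ */
    if (clock_getres(CLOCK_REALTIME, &res) == 0)
        printf("CLOCK_REALTIME resolution: %ld s %09ld ns\n",
               (long)res.tv_sec, res.tv_nsec);
    return 0;
}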
Answer 2:
My guess is that it's related to the OS. Try running the process at real-time priority (see the chrt program) and see if that helps.
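For reference, a process can also request a real-time scheduling class for itself; a minimal sketch of my own (the priority value 50 is an arbitrary mid-range choice, and this needs root or CAP_SYS_NICE):

#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* 50 is an arbitrary mid-range SCHED_FIFO priority */
    struct sched_param sp = { .sched_priority = 50 };

    /* pid 0 means "the calling process"; needs root / CAP_SYS_NICE */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return 1;
    }
    /* ... the timing loop would run here under real-time scheduling ... */
    return 0;
}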
On another note, you're calculating mtime incorrectly. Here's a routine that I use; although it's for struct timespec rather than struct timeval (nanoseconds instead of microseconds), the principle should be clear:
#include <time.h>

struct timespec diff(struct timespec start, struct timespec end)
{
    struct timespec temp;

    if ((end.tv_nsec - start.tv_nsec) < 0) {
        /* borrow one second from the seconds field */
        temp.tv_sec  = end.tv_sec - start.tv_sec - 1;
        temp.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec;
    } else {
        temp.tv_sec  = end.tv_sec - start.tv_sec;
        temp.tv_nsec = end.tv_nsec - start.tv_nsec;
    }
    return temp;
}
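For completeness, a usage sketch of my own (assuming the diff() routine above is in scope; compile with -lrt as in the question):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

struct timespec diff(struct timespec start, struct timespec end);

int main(void)
{
    struct timespec start, end, d;

    /* CLOCK_MONOTONIC is immune to wall-clock adjustments */
    clock_gettime(CLOCK_MONOTONIC, &start);
    usleep(2000);
    clock_gettime(CLOCK_MONOTONIC, &end);

    d = diff(start, end);
    printf("slept %ld.%09ld s\n", (long)d.tv_sec, d.tv_nsec);
    return 0;
}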
Answer 3:
Found it, see the man page http://linux.die.net/man/3/usleep:
"The sleep may be lengthened slightly by any system activity or by the time spent processing the call or by the granularity of system timers."
That granularity is 10 ms AFAIK, so the usleep can expire long before the process gets rescheduled.
It's also consistent with the values you get, which are on the order of magnitude of a "normal" time slice.
Answer 4:
I've already done measurements like this, and my conclusion is exactly the same: it happens equally on Windows and Linux.
A program that builds a histogram of the measured deltas in buckets of 10^-n seconds gives the following results:
0.1 0
0.01 0
0.001 0
0.0001 2
1e-05 24
1e-06 69726023
Total: 69726049
Duration: 6.47403 seconds.
Average: 0.0928495 microseconds.
But note that this is on a brand-new system. I recall running this a year ago on a 2004-era system and getting a couple of hits per second in the 0.01 range (above 10 ms).
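A sketch of such a histogram program (my reconstruction under assumptions, not the original author's code; the sample count is hypothetical): it timestamps back-to-back gettimeofday calls in a tight loop and buckets each delta by its power of ten.

#include <stdio.h>
#include <sys/time.h>

#define SAMPLES 10000000UL   /* hypothetical sample count */

int main(void)
{
    /* buckets[n] counts deltas d with 10^-(n+1) <= d < 10^-n seconds;
       the last bucket collects everything below 1e-5 s */
    unsigned long buckets[6] = { 0 };
    struct timeval prev, cur;
    unsigned long i;
    double total = 0.0, label = 0.1;
    int n;

    gettimeofday(&prev, NULL);
    for (i = 0; i < SAMPLES; i++) {
        double dt, limit = 0.1;

        gettimeofday(&cur, NULL);
        dt = (cur.tv_sec - prev.tv_sec) + (cur.tv_usec - prev.tv_usec) / 1e6;
        prev = cur;
        total += dt;

        for (n = 0; n < 5 && dt < limit; n++)   /* find the decade bucket */
            limit /= 10.0;
        buckets[n]++;
    }

    for (n = 0; n < 6; n++, label /= 10.0)
        printf("%g %lu\n", label, buckets[n]);
    printf("Total: %lu\n", SAMPLES);
    printf("Duration: %g seconds.\n", total);
    printf("Average: %g microseconds.\n", total / SAMPLES * 1e6);
    return 0;
}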
Answer 5:
Your formula is wrong. You have to convert both times to the same scale. In your example, milliseconds:
double mtime1 = (start.tv_sec * 1000 + start.tv_usec/1000.0);
double mtime2 = (end.tv_sec * 1000 + end.tv_usec/1000.0);
double diff = mtime2 - mtime1;

if(diff > 10)
    printf("WTF: %f\n", diff);   /* %ld would be undefined for a double */
You have to subtract the converted values.
Example: t1 = 1.999999 s, t2 = 2.000001 s, so an interval of 2 µs.
Subtracting the raw fields first gives:
seconds: 2 - 1 == 1
useconds: 1 - 999999 == -999998
so the subtract-first formula yields (1 * 1000 + (-999998) / 1000.0) + 0.5 == 0.502, which is obviously false.
My method gives:
mtime1 = (1 * 1000 + 999999 / 1000.0) = 1999.999
mtime2 = (2 * 1000 + 1 / 1000.0) = 2000.001
2000.001 - 1999.999 = 0.002 ms
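A quick check of both computations in code (my illustration, using the example timestamps above):

#include <stdio.h>

int main(void)
{
    /* t1 = 1 s 999999 µs, t2 = 2 s 1 µs: the true interval is 2 µs */
    long s1 = 1, us1 = 999999;
    long s2 = 2, us2 = 1;

    /* subtract the raw fields first, then convert and round */
    double wrong = ((s2 - s1) * 1000 + (us2 - us1) / 1000.0) + 0.5;

    /* convert each timestamp to ms first, then subtract */
    double m1 = s1 * 1000 + us1 / 1000.0;
    double m2 = s2 * 1000 + us2 / 1000.0;

    printf("raw-fields-first: %.3f ms\n", wrong);    /* prints 0.502 */
    printf("convert-first:    %.3f ms\n", m2 - m1);  /* prints 0.002 */
    return 0;
}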
Source: https://stackoverflow.com/questions/2882097/strange-results-while-measuring-delta-time-on-linux