How to find the processor queue length in Linux

执笔经年 2021-02-06 04:56

Trying to determine the Processor Queue Length (the number of processes that are ready to run but currently aren't) on a Linux machine. There is a WMI call in Windows for this metr

4 answers
  •  暖寄归人
    2021-02-06 05:35

    The metrics you seek exist in /proc/schedstat.

    The format of this file is described in sched-stats.txt in the kernel source. Specifically, the cpu lines are what you want:

    CPU statistics
    --------------
    cpu 1 2 3 4 5 6 7 8 9
    
    
    First field is a sched_yield() statistic:
         1) # of times sched_yield() was called
    
    
    Next three are schedule() statistics:
         2) This field is a legacy array expiration count field used in the O(1)
        scheduler. We kept it for ABI compatibility, but it is always set to zero.
         3) # of times schedule() was called
         4) # of times schedule() left the processor idle
    
    
    Next two are try_to_wake_up() statistics:
         5) # of times try_to_wake_up() was called
         6) # of times try_to_wake_up() was called to wake up the local cpu
    
    
    Next three are statistics describing scheduling latency:
         7) sum of all time spent running by tasks on this processor (in jiffies)
         8) sum of all time spent waiting to run by tasks on this processor (in
            jiffies)
         9) # of timeslices run on this cpu
    

    Field 8 is the one of interest. To find the run queue length, you would:

    1. Observe field 8 for each CPU and record the value.
    2. Wait for some interval.
    3. Observe field 8 for each CPU again, and calculate how much the value has increased.
    4. Divide that difference by the length of the interval waited, expressed in the same units (the documentation says field 8 is in jiffies, but it has actually been in nanoseconds since the addition of CFS). By Little's Law, the result is the mean length of the scheduler run queue over the interval.
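
    The four steps above can be sketched in Python. This is a minimal sketch, not a packaged tool: the function names (parse_wait_ns, mean_queue_length, sample) are made up for illustration, and it assumes a kernel built with scheduler statistics enabled and a schedstat layout in which field 8 of each cpu line is the accumulated wait time in nanoseconds, as described above. Other schedstat versions may order the fields differently.

```python
import time

SCHEDSTAT_PATH = "/proc/schedstat"


def parse_wait_ns(text):
    """Map each CPU name to field 8 (cumulative wait time) of its cpu line."""
    waits = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-CPU lines look like "cpu0 f1 f2 ... f9"; the version,
        # timestamp and domain lines do not start with "cpu".
        if len(parts) > 8 and parts[0].startswith("cpu"):
            waits[parts[0]] = int(parts[8])  # parts[0] is the name, so field 8 is parts[8]
    return waits


def mean_queue_length(before, after, interval_ns):
    """Little's Law: wait time accumulated per unit of elapsed time
    equals the mean number of tasks waiting during that time."""
    return {
        cpu: (after[cpu] - before[cpu]) / interval_ns
        for cpu in after
        if cpu in before
    }


def sample(interval_s=1.0, path=SCHEDSTAT_PATH):
    """Steps 1-4: read field 8, wait, read again, divide by the elapsed time."""
    t0 = time.monotonic_ns()
    with open(path) as f:
        before = parse_wait_ns(f.read())
    time.sleep(interval_s)
    t1 = time.monotonic_ns()
    with open(path) as f:
        after = parse_wait_ns(f.read())
    return mean_queue_length(before, after, t1 - t0)
```

    As a worked example of the arithmetic: if field 8 for cpu0 grows by 500,000,000 ns over a 1,000,000,000 ns interval, the mean run queue length on cpu0 over that interval is 0.5. Calling sample(5.0) would return one such value per CPU.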

    Unfortunately, I'm not aware of any utility that automates this process and is commonly installed, or even packaged, in a Linux distribution. I've not used it, but the kernel documentation suggests http://eaglet.rain.com/rick/linux/schedstat/v12/latency.c. That domain no longer resolves, but the file is available on the Wayback Machine.


    Why not sar or vmstat?

    These tools report the number of currently runnable processes. If this number is greater than the number of CPUs, some of them must be waiting. However, processes can still be waiting even when the number of runnable processes is less than the number of CPUs, for a variety of reasons:

    • A process may be pinned to a particular CPU.
    • The scheduler may decide to schedule a process on a particular CPU to make better use of the cache, or for NUMA optimization reasons.
    • The scheduler may intentionally idle a CPU to allow more time to a competing, higher priority process on another CPU that shares the same execution core (a hyperthreading optimization).
    • Hardware interrupts may be processable only on particular CPUs for a variety of hardware and software reasons.

    Moreover, the number of runnable processes is sampled only at an instant in time. In many cases this number fluctuates rapidly, and the contention may occur between samples.

    These things mean the number of runnable processes minus the number of CPUs is not a reliable indicator of CPU contention.
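
    For contrast, here is a minimal sketch of the unreliable estimate described above. It assumes the instantaneous runnable count is taken from the procs_running line of /proc/stat; the helper names are hypothetical, and the exact source and adjustments used by sar or vmstat may differ.

```python
def parse_procs_running(stat_text):
    """Return the instantaneous runnable-task count from /proc/stat contents."""
    for line in stat_text.splitlines():
        if line.startswith("procs_running "):
            return int(line.split()[1])
    raise ValueError("procs_running line not found")


def naive_contention(stat_text, n_cpus):
    """Runnable tasks minus CPUs, floored at zero.

    This is the estimate the text warns against: it is a single
    snapshot, and it reports zero whenever runnable <= CPUs, even
    if pinned tasks are queued behind each other on one CPU.
    """
    return max(0, parse_procs_running(stat_text) - n_cpus)
```

    With three runnable tasks on a four-CPU machine this returns 0, even if all three are pinned to the same CPU and two of them are waiting.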
