I\'m trying to understand what happens during the load_balance
function.
I\'m checking version 3.14 but I also took a look at version 4.3 since I was told that
Load balancing is invoked if SMP is configured.Linux uses Completely Fair Scheduler(CFS) to schedule each task so that each one obtain “fair” share of the processor time.. CFS uses concept of red-black tree.
Scheduler records the current time when a task enters into the run queue. While the process waits for processor time, it's “wait” value gets incremented by an amount derived from the total number of tasks currently in the run queue and the process priority. When processor runs this task ,this "wait" value gets decremented. If this value drops below certain value then scheduler will preempt the task and other task come closer to processor to execute. CFS always try to keep situation ideal by keeping "wait" value zero.
There are two functions in linux load_balance
and select_task_rq_fair()
which perform the task of load balance.
In simple word CFS load balance mechanism offload the busy CPU to less busy or ideal one.
task_h_load
is used to calculate weight of task.
What I don't understand is what/how the load is calculated in task_h_load. how it is calculated in the task_h_load function?
This weight factor depends on nice value of the process.
weighting factor = weight of process with nice value 0 / weight of current task;
where ‘weight’ is approximately equivalent to 1024 * (1.25)^(-nice)
For example : weight is 820 for nice value 1 weight is 1277 for nice value -1
task_h_load
For more load-balance methods and basics refer kernel comment
task_h_load
use update_cfs_rq_h_load
to calculate hierarchical load of run queue for fair time scheduling and which use cfs_rq_load_avg return average load of runqueue load_avg.
Virtual run time is the weighted time a task has run on the CPU. CFS always try to keep this Red-black tree in balance.
<--smaller value -----------vruntime value--------larger value-->
Each runnable task is placed in self balancing Red Black tree based on vruntime
. If task is ready to run (means task is not waiting for any resource), It is placed to tree. If task is waiting for some resource (i.e waiting for I/O )then it is removed. Tasks those have less processing time (means smaller vruntime
) are towards left side of tree and tasks having more processing time are right side of tree.
Left node has the smallest key value (For CFS ,it is task with higher priority). Self balancing red black tree need O(lgN) operation to navigate towards left node. Scheduler caches this value in rb_leftmost
. By retrieving only this value ,scheduler determine which task to run next
This load balance is used for non real task only for real time task push-pull operations are used which is developed by Steven Rostedt and Gregory Haskins
One more thing about CFS is also helpful in Fair Group Scheduling .Consider below figure
move_task is simply try to move up imbalance weighted load (means after calculating load factor as shown above ) from busies to this_rq
. It try to equal the load balance of both run queue so that both can spare "fair" processor time.
detach_task
detach the task for the migration specified in env
pointer of linux kernel.
detach_one_task
tries to dequeue exactly one task from env->src_rq.
detach_tasks() tries to detach up to imbalance weighted load from
busiest_rq
. And it returns number of detached tasks or else zero if fail to detach.
To attach this detached task to new rq
(run queue) , attach_task, attach_one_task,attach_tasks
is used according to situation.
New warning check lockdep_assert_held() is introduced in detach_tasks
which was not present in move_tasks
On multi-processor it is not easy to move task so easily so cfs do domain specific load-balancing as below:
For all this to understand I would like you to go through following reference
Per-Entity-Load-Tracking metric for load balancing
weight factor
Open-suse
Robert-love Chapter 3
CPU-scheduling
I specially read all these documentation to answer your question I hope you don't mind to comment if something is missing.