An adaptive load-balancer for task-scheduling in FastFlow

Authors Md Moniruzzaman
Kamran Idrees
Michael Roßbory
José Gracia
Editors
Title An adaptive load-balancer for task-scheduling in FastFlow
Book title Proceedings of the 5th International Conference on Advanced Communications and Computation (INFOCOMP 2015)
Type in conference proceedings
Publisher IARIA
ISBN 978-1-61208-416-9
Month June
Year 2015
Pages 6-12
SCCH ID# 1570
Abstract

Balancing the computational load of multiple concurrent tasks on heterogeneous architectures is one of the critical requirements for efficient usage of such systems. Load-imbalance is inherently present if the computational load is distributed non-uniformly across tasks or if the execution time for the same kind of task varies from one class of processing element to another. Load-imbalance may, however, also arise from causes that are beyond the control of the user, such as operating system jitter, over-subscription of the available workers, and interference and resource contention by concurrent tasks. Writing a balanced parallel application requires careful analysis of the problem and a good understanding of the hardware architectures of the computing nodes. FastFlow is a C++ library that offers high-level parallel pattern abstractions on the user side and lowers those onto efficiently implemented, architecture-specific skeletons. The default FastFlow scheduler, however, assigns tasks to workers in a round-robin fashion and is thus not well suited to handle load-imbalance. In this paper, we present an adaptive load-balancing task scheduler for FastFlow, derive a model for the expected relative performance of our adaptive scheduler over the default round-robin scheduler, and evaluate the quality of the implementation with low-level benchmarks as well as two specific application benchmarks. We find that the adaptive load-balancer does not introduce additional overhead if load-imbalances are not present, and that our scheme is particularly efficient in mitigating the effect of thread over-subscription. Finally, we show that the proposed scheduler can lead to substantial performance gains for real industrial applications.
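To illustrate why round-robin assignment handles load-imbalance poorly, the following is a minimal sketch that compares the completion time (makespan) of a static round-robin assignment with a simple on-demand policy that hands each task to the worker that becomes idle first. It is not the scheduler described in the paper and does not use the FastFlow API; the function names and the per-worker `slowdown` factor (a crude stand-in for effects such as over-subscription) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model: task_cost[i] is the nominal run time of task i, and
// slowdown[w] scales that time on worker w (1.0 = nominal speed).

// Makespan when task i is statically assigned to worker i % n (round-robin).
double round_robin_makespan(const std::vector<double>& task_cost,
                            const std::vector<double>& slowdown) {
    if (slowdown.empty()) return 0.0;  // no workers: nothing can run
    std::vector<double> busy_until(slowdown.size(), 0.0);
    for (std::size_t i = 0; i < task_cost.size(); ++i) {
        const std::size_t w = i % slowdown.size();
        busy_until[w] += task_cost[i] * slowdown[w];
    }
    return *std::max_element(busy_until.begin(), busy_until.end());
}

// Makespan when each task is given to the worker that becomes idle first
// (ties go to the lowest-numbered worker).
double on_demand_makespan(const std::vector<double>& task_cost,
                          const std::vector<double>& slowdown) {
    if (slowdown.empty()) return 0.0;  // no workers: nothing can run
    std::vector<double> busy_until(slowdown.size(), 0.0);
    for (const double cost : task_cost) {
        const auto idle = std::min_element(busy_until.begin(), busy_until.end());
        const std::size_t w = static_cast<std::size_t>(idle - busy_until.begin());
        *idle += cost * slowdown[w];
    }
    return *std::max_element(busy_until.begin(), busy_until.end());
}
```

In this toy model, eight unit-cost tasks on two workers, one of which runs three times slower, finish after 12 time units under round-robin but after 6 under the on-demand policy. With two equally fast workers both policies finish after 4 time units, which mirrors the abstract's observation that balancing need not cost anything when no imbalance is present.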