TY - GEN
T1 - Efficient Jobs Dispatching in Emerging Clouds
AU - Bitton, Shimon
AU - Emek, Yuval
AU - Kutten, Shay
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/10/8
Y1 - 2018/10/8
N2 - This study was carried out in the context of developing technologies for a cloud that uses an optical network for internal communication. The problem addressed in this paper is that of dispatching jobs (units of work) to be performed by machines in the cloud. Sending (or migrating) a job to a machine involves establishing a lightpath (à la circuit switching); this incurs a significant setup cost, but while the lightpath exists, its capacity is very high. Hence, moving one job is about as expensive as moving a set of jobs over the same lightpath. Our goal is to develop online network dispatching algorithms for work-conserving job scheduling. That is, consider a set of jobs whose execution on some set S_M of machines is the dispatcher's responsibility. Any machine in the network may join or (when not executing a job) leave S_M according to decisions made outside the scope of this paper. Whenever a machine joins the set, or is in the set and has just finished executing a job, it issues a request for a new job to perform, and the dispatcher must send this machine a job that has not yet been executed (if one exists). Every machine can perform any of the jobs, and each job is performed on a single machine. The main algorithmic challenge in this context boils down to the following questions: How many jobs should we send to a requesting machine (or to some intermediate storage, to be distributed from there)? From which machine's storage should these jobs be taken? The algorithms developed here are shown to be efficient in reducing the cost of establishing lightpaths. As opposed to related algorithms for delivering consumable resources (in other contexts), we prove that our online algorithms are fully competitive.
We present randomized online algorithms for two different settings. In the first, it is assumed that each message requires establishing a lightpath and thus incurs a setup cost. In the second, we distinguish between messages that carry job sets and small control messages sent by the algorithm, where the latter are assumed to be sent over a designated (non-optical) control plane at negligible cost. Our algorithms are quite simple, though the analysis turns out to be rather involved. They are designed (and rigorously analyzed) for a general architecture, but would be especially efficient in fat-tree architectures, the common choice in many data centers.
UR - http://www.scopus.com/inward/record.url?scp=85056156335&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM.2018.8485864
DO - 10.1109/INFOCOM.2018.8485864
M3 - Conference contribution
AN - SCOPUS:85056156335
T3 - Proceedings - IEEE INFOCOM
SP - 2033
EP - 2041
BT - INFOCOM 2018 - IEEE Conference on Computer Communications
T2 - 2018 IEEE Conference on Computer Communications, INFOCOM 2018
Y2 - 15 April 2018 through 19 April 2018
ER -