LAS: Logical-Block Affinity Scheduling in Big Data Analytics Systems

Liang Bao, Chase Q. Wu, Haiyang Qi, Weizhao Chen, Xin Zhang, Weina Han, Wei Wei, En Tail, Hao Wang, Jiahao Zhai, Xiang Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Parallel computing combined with distributed data storage and management has been widely adopted by most big data analytics systems. Scheduling computing tasks to improve data locality is crucial to the performance of such systems. While existing schedulers target near-data scheduling on top of physical data blocks, these systems face a new scheduling problem where computing tasks process table-based datasets directly and access large physical blocks indirectly through their indices stored in associated small logical blocks. This new problem invalidates the basic assumption made by many existing algorithms on near-data scheduling. In this paper, we propose a Logical-block Affinity Scheduling (LAS) algorithm to coordinate the near-data scheduling of computing tasks and the placement of logical blocks for a desired balance between data-locality and load-balancing to maximize system throughput. The proposed algorithm is implemented and evaluated using a well-known big data benchmark and a practical production system deployed in public clouds. Extensive experimental results illustrate the performance superiority of LAS over three existing scheduling algorithms.

Original language: English (US)
Title of host publication: INFOCOM 2018 - IEEE Conference on Computer Communications
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 522-530
Number of pages: 9
ISBN (Electronic): 9781538641286
State: Published - Oct 8 2018
Event: 2018 IEEE Conference on Computer Communications, INFOCOM 2018 - Honolulu, United States
Duration: Apr 15 2018 to Apr 19 2018

Publication series

Name: Proceedings - IEEE INFOCOM
Volume: 2018-April
ISSN (Print): 0743-166X

Other

Other: 2018 IEEE Conference on Computer Communications, INFOCOM 2018
Country/Territory: United States
City: Honolulu
Period: 4/15/18 to 4/19/18

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Electrical and Electronic Engineering
