×

An optimal preemptive algorithm for online MapReduce scheduling on two parallel machines. (English) Zbl 1395.90147

Summary: In this paper, we study an online scheduling on two parallel machines in MapReduce-like system where each job contains two kinds of tasks: map tasks and reduce tasks. A job’s reduce tasks can only be processed after all its map tasks are finished. We assume that the map tasks are fractional and the reduce tasks are preemptive. Our objective is to minimize makespan. We show that the lower bound for this MapReduce scheduling problem is \(\sqrt{2}\). We then present an online algorithm with competitive ratio of \(\sqrt{2}\) and thus it is optimal.

MSC:

90B35 Deterministic scheduling theory in operations research

Software:

MapReduce
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Bechini, A, Marcelloni, F and Segatori, A. A MapReduce solution for associative classification of big data. Information Sciences, 2015, 332 :33-55.
[2] Chang, H, Kodialam, M, Kompella, RR, Lakshman, TV, Lee, M and Mukherjee, S (2015). Scheduling in mapreduce-like systems for fast completion time. Proceedings - IEEE INFOCOM, 2(3), 3074-3082.
[3] Chen, B, van Vliet, A and Woeginger, G (1995). An optimal agorithm for preemptive on-line scheduling. Operations Research Letters, 18, 127-131. · Zbl 0855.90069
[4] Chen, C, Xu, Y, Zhu, Y and Sun, C (2017). Online MapReduce scheduling problem of minimizing the makespan. Journal of Combinatorial Optimization, 33, 590-608. · Zbl 1362.90184
[5] Dean, J and Ghemawat, S (2004). MapReduce: Simplified data processing on large clusters. Proceedings of Operating Systems Design and Implementation \((O S D I), 51(1), 107-113\).
[6] Faigle, U, Kern, W and Turan, G (1989). On the performance of online algorithms for partition problems. Acta Cybernetica, 9, 107-119. · Zbl 0689.68051
[7] Isard, M, Prabhakaran, V, Currey, J, Wieder, U, Talwar, K and Goldberg, A (2009). Quincy: Fair scheduling for distributed computing clusters. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles, ACM, pp. 261-276.
[8] Kolb, L, Thor, A and Rahm, E (2012). Load balancing for mapreduce based entity resolution. In IEEE 28th International Conference on Data Engineering \((I C D E)\), pp. 618-629.
[9] Luo, T, Zhu, Y, Wu, W, Xu, Y and Du, D (2017). Online makespan minimization in MapReduce-like systems with complex reduce tasks. Optimization Letters, 11, 271-277. · Zbl 1369.90081
[10] Moseley, B, Dasgupta, A, Kumar, R and Sarlós, T (2011). On scheduling in map-reduce and flowshops. Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures, ACM, pp. 289-298.
[11] Zaharia, M, Borthakur, D, Sen Sarma, J, Elmeleegy, K, Shenker, S and Stoica, I (2010). Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling. Proceedings of the 5th European Conference on Computer Systems, ACM, pp. 265-278.
[12] Zheng, Y, Shroff, NB and Sinha, P (2013). A new analytical technique for designing provably efficient MapReduce schedulers. 2013 Proceedings IEEE, INFOCOM, pp. 1600-1608.
[13] Zhu, Y, Jiang, Y, Wu, W, Ding, L, Teredesai, A, Li, D and Lee, W (2014). Minimizing makespan and total completion time in mapreduce-like systems. In IEEE, INFOCOM \(\)’ 14, pp. 2166-2174.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.