×

Asymptotic optimality of power-of-\(d\) load balancing in large-scale systems. (English) Zbl 1455.60124

Summary: We consider a system of \(N\) identical server pools and a single dispatcher in which tasks with unit-exponential service requirements arrive at rate \(\lambda(N)\). In order to optimize the experienced performance, the dispatcher aims to evenly distribute the tasks across the various server pools. Specifically, when a task arrives, the dispatcher assigns it to the server pool with the minimum number of tasks among \(d(N)\) randomly selected server pools. We construct a stochastic coupling to bound the difference in the system occupancy processes between the join-the-shortest-queue (JSQ) policy and a scheme with an arbitrary value of \(d(N)\). We use the coupling to derive the fluid limit in case \(d(N) \to \infty\) and \(\lambda(N)/N \to \lambda\) as \(N \to \infty\) along with the associated fixed point. The fluid limit turns out to be insensitive to the exact growth rate of \(d(N)\) and coincides with that for the JSQ policy. We further establish that the diffusion limit corresponds to that for the JSQ policy as well, as long as \(d(N) / \sqrt{N} \log(N) \to \infty\), and characterize the common limiting diffusion process. These results indicate that the JSQ optimality can be preserved at the fluid and diffusion levels while reducing the overhead by nearly a factor \(O(N)\) and \(O( \sqrt{N} / \log(N))\), respectively.

MSC:

60K25 Queueing theory (aspects of probability theory)
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
90B22 Queues and service in operations research
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] [1] Atar R, Biswas A, Kaspi H (2018) Law of large numbers for the many-server earliest-deadline-first queue. Stochastic Processes Their Appl. 128(7):2270-2296.Google Scholar · Zbl 1388.60155
[2] [2] Atar R, Budhiraja A, Ramanan K (2008) Deterministic and stochastic differential inclusions with multiple surfaces of discontinuity. Probab. Theory Related Fields 142(1):249-283.Google Scholar · Zbl 1156.60037
[3] [3] Atar R, Biswas A, Kaspi H, Ramanan K (2018) A Skorokhod map on measure-valued paths with applications to priority queues. Ann. Appl. Probab. 28(1):418-481.Google Scholar · Zbl 1391.60220
[4] [4] Borovkov AA (1976) Stochastic Processes in Queueing Theory (Springer, New York).Google Scholar
[5] [5] Costantini C (1992) The Skorohod oblique reflection problem in domains with corners and application to stochastic differential equations. Probab. Theory Related Fields 91(1):43-70.Google Scholar · Zbl 0744.60065
[6] [6] Ephremides A, Varaiya P, Walrand J (1980) A simple dynamic routing problem. IEEE Trans. Automatic Control 25(4):690-693.Crossref, Google Scholar · Zbl 0451.90060
[7] [7] Eschenfeldt P, Gamarnik D (2018) Join the shortest queue with many servers: The heavy traffic-asymptotics. Math. Oper. Res. 43(3):867-886.Link, Google Scholar · Zbl 1433.60087
[8] [8] Ethier SN, Kurtz TG (2009) Markov Processes: Characterization and Convergence (John Wiley & Sons, Hoboken, NJ).Google Scholar
[9] [9] Feller W (1968) An Introduction to Probability Theory and Its Applications, vol. 1, 3rd ed., Wiley Series in Probability and Mathematical Statistics (John Wiley & Sons, New York).Google Scholar
[10] [10] Halfin S, Whitt W (1981) Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29(3):567-588.Google Scholar · Zbl 0455.60079
[11] [11] Hunt P, Kurtz T (1994) Large loss networks. Stochastic Processes Their Appl. 53(2):363-378.Google Scholar · Zbl 0810.60087
[12] [12] Jagerman D (1974) Some properties of the Erlang loss function. Bell System Tech. J. 53(3):525-551.Google Scholar · Zbl 0276.60092
[13] [13] Johri PK (1989) Optimality of the shortest line discipline with state-dependent service rates. Eur. J. Oper. Res. 41(2):157-161.Google Scholar · Zbl 0675.60084
[14] [14] Kruk Ł, Lehoczky J, Ramanan K, Shreve S (2008) Double Skorokhod map and reneging real-time queues. Ethier SN, Feng J, Stockbridge RH, eds. Markov Processes and Related Topics: A Festschrift for Thomas G. Kurtz, Collections, vol. 4 (IMS, Beachwood, OH), 169-193.Crossref, Google Scholar · Zbl 1170.60314
[15] [15] Kruk Ł, Lehoczky J, Ramanan K, Shreve S (2011) Heavy traffic analysis for EDF queues with reneging. Ann. Appl. Probab. 21(2):484-545.Google Scholar · Zbl 1220.60053
[16] [16] Lions PL, Sznitman AS (1984) Stochastic differential equations with reflecting boundary conditions. Comm. Pure Appl. Math. 37(4):511-537.Google Scholar · Zbl 0598.60060
[17] [17] Liptser R, Shiryaev A (1989) Theory of Martingales, Mathematics and Its Applications, vol. 49 (Kluwer, Dordrecht, Netherlands).Crossref, Google Scholar
[18] [18] Menich R (1987) Optimality of shortest queue routing for dependent service stations. Levine WS, ed. 26th IEEE Conf. Decision Control (IEEE, Los Angeles), 1069-1072.Google Scholar
[19] [19] Menich R, Serfozo RF (1991) Optimality of routing and servicing in dependent parallel processing systems. Queueing Systems 9(4):403-418.Google Scholar · Zbl 0748.90077
[20] [20] Mitzenmacher M (2001) The power of two choices in randomized load balancing. IEEE Trans. Parallel Distribution Systems 12(10):1094-1104.Google Scholar
[21] [21] Mukherjee D, Borst SC, Van Leeuwaarden JSH, Whiting PA (2016) Universality of load balancing schemes on the diffusion scale. J. Appl. Probab. 53(4), 1111-1124.Google Scholar · Zbl 1356.60149
[22] [22] Mukherjee D, Borst SC, van Leeuwaarden JSH, Whiting PA (2018) Universality of power-of-d load balancing in many-server systems. Stochastic Systems 8(4):265-292.Google Scholar · Zbl 1446.60073
[23] [23] Mukhopadhyay A, Mazumdar RR, Guillemin F (2015) The power of randomized routing in heterogeneous loss systems. Meo M, Rosenberg C, Wittevrongel S, eds. Proc. 2015 27th Internat. Teletraffic Congress (IEEE, Piscataway, NJ), 125-133.Google Scholar
[24] [24] Mukhopadhyay A, Karthik A, Mazumdar RR, Guillemin F (2015) Mean field and propagation of chaos in multi-class heterogeneous loss models. Performance Evaluation 91:117-131.Google Scholar
[25] [25] Pang G, Talreja R, Whitt W (2007) Martingale proofs of many-server heavy-traffic limits for Markovian queues. Probab. Surveys 4:193-267.Google Scholar · Zbl 1189.60067
[26] [26] Patel P, Bansal D, Yuan L, Murthy A, Greenberg A, Maltz DA, Kern R, et al.. (2013) Ananta: Cloud scale load balancing. ACM SIGCOMM Comput. Comm. Rev. 43(4):207-218.Google Scholar
[27] [27] Ramanan K (2006) Reflected diffusions defined via the extended Skorokhod map. Electronic J. Probab. 11:934-992.Google Scholar · Zbl 1111.60043
[28] [28] Robert P (2003) Stochastic Networks and Queues (Springer, Berlin Heidelberg).Google Scholar
[29] [29] Sparaggis PD, Towsley D, Cassandras CG (1993) Extremal properties of the shortest/longest non-full queue policies in finite-capacity systems with state-dependent service rates. J. Appl. Probab. 30(1):223-236.Google Scholar · Zbl 0768.60090
[30] [30] Sparaggis PD, Towsley D, Cassandras CG (1994) Sample path criteria for weak majorization. Adv. Appl. Probab. 26(1):155-171.Google Scholar · Zbl 0792.60102
[31] [31] Stolyar AL (2015) Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems 80(4):341-361.Google Scholar · Zbl 1317.90073
[32] [32] Stolyar AL (2017) Pull-based load distribution among heterogeneous parallel servers: The case of multiple routers. Queueing Systems 85(1):31-65.Google Scholar · Zbl 1366.90026
[33] [33] Stolyar AL, Ramanan K (2001) Largest weighted delay first scheduling: Large deviations and optimality. Ann. Appl. Probab. 11(1):1-48.Google Scholar · Zbl 1024.60012
[34] [34] Towsley D (1995) Application of majorization to control problems in queueing systems. Chrétienne P, Coffman EG, Lenstra JK, Liu Z, eds. Scheduling Theory and Its Application (John Wiley & Sons, Chichester, UK), 295-311.Google Scholar
[35] [35] Towsley D, Sparaggis P, Cassandras C (1992) Optimal routing and buffer allocation for a class of finite capacity queueing systems. IEEE Trans. Automatic Control 37(9):1446-1451.Google Scholar · Zbl 0757.60099
[36] [36] Turner SR (1998) The effect of increasing routing choice on resource pooling. Probab. Engrg. Inform. Sci. 12(1):109-124.Google Scholar · Zbl 0971.90017
[37] [37] Vvedenskaya ND, Dobrushin RL, Karpelevich FI (1996) Queueing system with selection of the shortest of two queues: An asymptotic approach. Problemy Peredachi Informatsii 32(1):20-34.Google Scholar
[38] [38] Weber RR (1978) On the optimal assignment of customers to parallel servers. J. Appl. Probab. 15(2):406-413.Google Scholar · Zbl 0378.60095
[39] [39] Whitt W (1984) Heavy-traffic approximations for service systems with blocking. AT&T Bell Laboratories Tech. J. 63(5):689-708.Google Scholar · Zbl 0591.90034
[40] [40] Whitt W (2002) Stochastic-Process Limits, Springer Series in Operations Research and Financial Engineering (Springer-Verlag, New York).Crossref, Google Scholar · Zbl 0993.60001
[41] [41] Winston W (1977) Optimality of the shortest line discipline. J. Appl. Probab. 14(1):181-189.Google Scholar · Zbl 0357.60023
[42] [42] Xie Q,
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.