Reasoning task dependencies for robust service selection in data intensive workflows. (English) Zbl 1337.68042

Summary: Selecting appropriate services for task execution in workflows should not only consider budget and deadline constraints, but also maximize the probability that the workflow will succeed and minimize the potential loss in case of exceptions. This requirement is more critical for data-intensive applications in grids or clouds, since any failure is costly. Therefore, we design a fine-grained risk evaluation model customized for workflows to precisely compute the cost of failure for selected services. In comparison with the current coarse-grained model, ours takes task dependencies into account and assigns a higher impact factor to tasks at the end. Thereafter, we design the utility function with the model and apply a genetic algorithm to find the optimized service allocations, thereby maximizing the robustness of the workflow while minimizing the possible risk of failure. Experiments and analysis show that applying the customized risk evaluation model to service selection can generally improve the success probability of a workflow while reducing its exposure to risk.

MSC:

68M14 Distributed systems
68M15 Reliability, testing and fault tolerance of networks and computer systems

Software:

PSPLIB; JGAP

References:

[1] Cardoso J, Sheth A, Miller J, Arnold J, Kochut K (2004) Quality of service for workflows and web service processes. Web Semant Sci Serv Agents World Wide Web 1(3):281-308 · doi:10.1016/j.websem.2004.03.001
[2] Deelman E, Gannon D, Shields M, Taylor I (2009) Workflows and e-science: an overview of workflow system features and capabilities. Futur Gener Comput Syst 25(5):528-540 · doi:10.1016/j.future.2008.06.012
[3] Hoffa C, Mehta G, Freeman T, Deelman E, Keahey K, Berriman B, Good J (2008) On the use of cloud computing for scientific workflows. In: Proceedings of the 2008 fourth IEEE international conference on eScience. IEEE computer society, Washington, DC, USA, pp 640-645
[4] Kllapi H, Sitaridi E, Tsangaris MM, Ioannidis Y (2011) Schedule optimization for data processing flows on the cloud. In: Proceedings of the 2011 ACM international conference on management of data. ACM, New York, pp 289-300
[5] Kokash N, D’Andrea V (2007) Evaluating quality of web services: a risk-driven approach. In: Abramowicz W (ed) Business information systems. Lecture Notes in Computer Science, vol 4439. Springer, Berlin, pp 180-194
[6] Kolisch R, Sprecher A, Drexl A (2005) PSPLIB—project scheduling problem library V2.1. http://129.187.106.231/psplib/. Accessed 28 Mar 2013 · Zbl 0947.90587
[7] Lin C, Lu S (2011) Scheduling scientific workflows elastically for cloud computing. In: Proceedings of 2011 IEEE international conference on cloud computing, pp 746-747
[8] Ma H, Schewe KD, Thalheim B, Wang Q (2009) A theory of data-intensive software services. Serv Orient Comput Appl 3(4):263-283 · doi:10.1007/s11761-009-0051-x
[9] Meffert K, Rotstan N, Knowles C, Sangiorgi UB (2012) JGAP—Java genetic algorithms and genetic programming package V3.6. http://jgap.sourceforge.net/. Accessed 28 Mar 2013
[10] Olston C, Chiou G, Chitnis L, Liu F, Han Y, Larsson M, Neumann A, Rao VB, Sankarasubramanian V, Seth S, Tian C, ZiCornell T, Wang X (2011) Nova: continuous pig/hadoop workflows. In: Proceedings of the 2011 ACM international conference on management of data. ACM, New York, pp 1081-1090
[11] Pettifer S, Ison J, Kalas M, Thorne D, McDermott P, Jonassen I, Liaquat A, Fernandez JM, Rodriguez JM, Partners I, Pisano DG, Blanchet C, Uludag M, Rice P, Bartaseviciute E, Rapacki K, Hekkelman M, Sand O, Stockinger H, Clegg AB, Bongcam-Rudloff E, Salzemann J, Breton V, Attwood TK, Cameron G, Vriend G (2010) The embrace web service collection. Nucleic Acids Res 38:683-688 · doi:10.1093/nar/gkq297
[12] Qi L, Lin W, Dou W, Jiang J, Chen J (2011) A QoS-aware exception handling method in scientific workflow execution. Concurr Comput Pract Exp 23(16):1951-1968 · doi:10.1002/cpe.1737
[13] Rahman M, Ranjan R, Buyya R (2010) Reputation-based dependable scheduling of workflow applications in peer-to-peer grids. Comput Netw 54:3341-3359 · doi:10.1016/j.comnet.2010.05.016
[14] Skene J, Raimondi F, Emmerich W (2010) Service-level agreements for electronic services. IEEE Trans Softw Eng 36(2):288-304 · doi:10.1109/TSE.2009.55
[15] Vanhatalo J, Völzer H, Leymann F, Moser S (2008) Automatic workflow graph refactoring and completion. In: Proceedings of the 6th international conference on service-oriented computing. Springer, Berlin, pp 100-115 · Zbl 1268.68070
[16] Wang M, Ramamohanarao K, Chen J (2009) Trust-based robust scheduling and runtime adaptation of scientific workflow. Concurr Comput Pract Exp 21(16):1982-1998 · doi:10.1002/cpe.1456
[17] Wang X, Yeo CS, Buyya R, Su J (2011) Optimizing the makespan and reliability for workflow applications with reputation and a look-ahead genetic algorithm. Futur Gener Comput Syst 27(8):1124-1134 · doi:10.1016/j.future.2011.03.008
[18] Weißbach M, Zimmermann W (2010) Termination analysis of business process workflows. In: Proceedings of the 5th international workshop on enhanced web service technologies. ACM, New York, pp 18-25
[19] Yeo CS, Buyya R (2007) Integrated risk analysis for a commercial computing service. In: IEEE international parallel and distributed processing symposium, pp 1-10
[20] Zhang X, Liu C, Nepal S, Chen J (2013a) An efficient quasi-identifier index based approach for privacy preservation over incremental data sets on cloud. J Comput Syst Sci 79(5):542-555 · Zbl 1268.68070 · doi:10.1016/j.jcss.2012.11.008
[21] Zhang X, Liu C, Nepal S, Pandey S, Chen J (2013b) A privacy leakage upper-bound constraint based approach for cost-effective privacy preserving of intermediate datasets in cloud. IEEE Trans Parallel Distrib Syst 24(6):1192-1202 · doi:10.1109/TPDS.2012.238
[22] Zhang X, Yang LT, Liu C, Chen J (2013c) A scalable two-phase top-down specialization approach for data anonymization using mapreduce on cloud. IEEE Trans Parallel Distrib Syst 99 (PrePrints)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.