Stochastic allocation and scheduling for conditional task graphs in multi-processor systems-on-chip.

*(English)*Zbl 1232.68017Summary: Embedded systems designers are turning to multicore architectures to satisfy the ever-growing computational needs of applications within a reasonable power envelope. One of the most daunting challenges for multiprocessor system-on-chip (MPSoC) platforms is the development of tools for efficient mapping multi-task applications onto hardware platforms. Software mapping can be formulated as an optimal allocation and scheduling problem, where the application is modeled as a task graph, the target hardware is modeled as a set of heterogeneous resources, and the objective function represents a design goal \(\alpha \) (e.g. minimum execution time, minimum usage of communication resources, etc.). Conditional task graphs, where inter-task edges represent data as well as control dependencies, are a well-known computational model to describe complex real-life applications where alternative execution paths, guarded by conditionals, can be specified. Each condition has a probability associated with each possible outcome.

Mapping conditional task graphs is significantly more challenging than mapping pure data-flow graphs (where edges only represent data dependencies). Approaches based on general-purpose complete solvers (e.g. integer linear programming solvers) are limited both by computational blowup and by the fact that the objective is a stochastic functional. The main contribution of our work is an efficient and complete approach to allocation and scheduling of conditional task graphs, based on (i) an exact analytic formulation of the stochastic objective function exploiting task graph analysis and (ii) an extension of the timetable constraint for conditional activities. Moreover, our solver is integrated in a complete application development environment which produces executable code for target multicore platforms. This integrated framework allows us to validate modeling assumptions and to assess constraint satisfaction and objective function optimization. Extensive validation results demonstrate not only that our approach can handle non-trivial instances efficiently, but also that our models are accurate and lead to optimal and highly predictable execution.

Mapping conditional task graphs is significantly more challenging than mapping pure data-flow graphs (where edges only represent data dependencies). Approaches based on general-purpose complete solvers (e.g. integer linear programming solvers) are limited both by computational blowup and by the fact that the objective is a stochastic functional. The main contribution of our work is an efficient and complete approach to allocation and scheduling of conditional task graphs, based on (i) an exact analytic formulation of the stochastic objective function exploiting task graph analysis and (ii) an extension of the timetable constraint for conditional activities. Moreover, our solver is integrated in a complete application development environment which produces executable code for target multicore platforms. This integrated framework allows us to validate modeling assumptions and to assess constraint satisfaction and objective function optimization. Extensive validation results demonstrate not only that our approach can handle non-trivial instances efficiently, but also that our models are accurate and lead to optimal and highly predictable execution.

##### MSC:

68M07 | Mathematical problems of computer architecture |

68M20 | Performance evaluation, queueing, and scheduling in the context of computer systems |

05C90 | Applications of graph theory |

##### Keywords:

mapping; scheduling; allocation; MPSoC; ILP; constraint programming; system design; conditional task graph##### Software:

JaCoP
PDF
BibTeX
XML
Cite

\textit{M. Lombardi} et al., J. Sched. 13, No. 4, 315--345 (2010; Zbl 1232.68017)

Full Text:
DOI

##### References:

[1] | AMD (Advanced Micro Devices). http://multicore.amd.com/us-en/AMD-Multi-Core.aspx . |

[2] | ARM Semiconductor, ARM11 MPCore Multiprocessor. Available at http://arm.convergencepromotions.com/catalog/753.htm . |

[3] | Axelsson, J. (1997). Architecture synthesis and partitioning of real-time systems: A comparison of three heuristic search strategies. In Proceedings of the international conference hardware/software codesign and system synthesis, CODES’97 (p. 161). |

[4] | Baptiste, P., Le Pape, C., & Nuijten, W. (2003). Constraint-based scheduling. Dordrecht: Kluwer Academic. · Zbl 1094.90002 |

[5] | Bender, A. (1996). MILP based task mapping for heterogeneous multiprocessor systems. In Proceedings of the European design and automation conference, EURO-DAC’96–EURO-VHDL’96 (p. 197). |

[6] | Benders, J. (1962). Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4, 238–252. · Zbl 0109.38302 · doi:10.1007/BF01386316 |

[7] | Benini, L., Bertozzi, D., Guerri, A., & Milano, M. (2005). Allocation and scheduling for MPSoCs via decomposition and no-good generation. In Proceedings of the international conference on principles and practice of constraint programming, CP2005 (pp. 107–121). · Zbl 1153.68448 |

[8] | Benini, L., Bertozzi, D., Guerri, A., & Milano, M. (2006). Allocation, scheduling and voltage scaling on energy aware MPSoCs. In Proceedings of international conference on integration of AI and OR techniques in constraint programming for combinatorial optimization problems, CPAIOR 2006 (pp. 44–58). · Zbl 1177.68023 |

[9] | Benini, L., Lombardi, M., Milano, M., & Ruggiero, M. (2008a). A constraint programming approach for allocation and scheduling on the cell broadband engine. In Proceedings of the international conference on principles and practice of constraint programming, CP 2008 (pp. 21–35). |

[10] | Benini, L., Lombardi, M., Mantovani, M., Milano, M., & Ruggiero, M. (2008b). Multi-stage benders decomposition for optimizing multicore architectures. In Proceedings of international conference on integration of AI and OR techniques in constraint programming for combinatorial optimization problems, CPAIOR 2008 (pp. 36–50). · Zbl 1142.68506 |

[11] | Borkar, S. (1999). Design challenges of technology scaling. IEEE Micro, 19(4), 23–29. · Zbl 05098433 · doi:10.1109/40.782564 |

[12] | Borkar, S. (2007). Thousand core chips: a technology perspective. In DAC’07: Proceedings of the 44th annual conference on design automation (pp. 746–749). |

[13] | Brodersen, R. W., Horowitz, M. A., Markovic, D., Nikolic, B., & Stojanovic, V. (2002). Methods for true power minimization. In Proceedings of the 2002 IEEE/ACM international conference on computer-aided design, ICCAD’02 (pp. 35–42). |

[14] | Brunnbauer, W., Wild, T., Foag, J., & Pazos, N. (2003). A constructive algorithm with look-ahead for mapping and scheduling of task graphs with conditional edges. In Proceedings of the euromicro conference on digital system design, DSD 2003 (pp. 98–103). |

[15] | Chatha, K. S., & Vemuri, R. (2002). Hardware-software partitioning and pipelined scheduling of transformative applications. IEEE Transactions on Very Large Scale Integrated Systems, 10(3), 193–208. · Zbl 05460407 · doi:10.1109/TVLSI.2002.1043323 |

[16] | CISCO Systems, http://www.cisco.com/en/US/products/ps5763/ . |

[17] | Cradle Technologies, The multi-core DSP advantage for multimedia. Available at http://www.cradle.com/ . |

[18] | Dolif, E., Lombardi, M., Ruggiero, M., Milano, M., & Benini, L. (2007). Communication-aware stochastic allocation and scheduling framework for conditional task graphs in multi-processor systems-on-chip. In Proceedings of the international conference on embedded software, EMSOFT’07 (p. 56). |

[19] | Eles, P., Kuchcinski, K., Peng, Z., Doboli, A., & Pop, P. (1998). Scheduling of conditional process graphs for the synthesis of embedded systems. In Proceedings of the conference on design, automation and test in Europe, DATE’98 (pp. 15–29). |

[20] | Eremin, A., & Wallace, M. (2001). Hybrid benders decomposition algorithms in constraint logic programming. In Proceedings of the international conference on principles and practice of constraint programming, CP 2001 (pp. 1–15). · Zbl 1067.68630 |

[21] | Faraboschi, P., Fisher, J., & Young, C. (2001). Instruction scheduling for instruction level parallel processors. Proceedings of the IEEE, 89(11), 1638–1659. · doi:10.1109/5.964443 |

[22] | Francesco, P., Antonio, P., & Marchal, P. (2005). Flexible hardware/software support for message passing on a distributed shared memory architecture. In Proceedings of the conference on Design, automation and test in Europe, DATE’05 (pp. 736–741). |

[23] | Gomes, C., Selman, B., McAloon, K., & Tretkoff, C. (1998). Randomization in backtrack search: Exploiting heavy-tailed profiles for solving hard scheduling problems. In Proceedings of the international conference on AI planning and scheduling, AIPS 98 (pp. 208–213). |

[24] | Goodacre, J., & Sloss, A. N. (2005). Parallelism and the ARM instruction set architecture. Journal Computer, 38(7), 42–50. · Zbl 05089483 · doi:10.1109/MC.2005.239 |

[25] | Graham, J. R. (2007). Integrating parallel programming techniques into traditional computer science curricula. ACM SIGCSE Bullettin, 39(4), 75–78. · doi:10.1145/1345375.1345419 |

[26] | Hooker, J. N. (2005a). A hybrid method for planning and scheduling. Constraints, 10(4), 385–401. · Zbl 1122.90054 · doi:10.1007/s10601-005-2812-2 |

[27] | Hooker, J. N. (2005b). Planning and scheduling to minimize tardiness. In Proceedings of the international conference on principles and practice of constraint programming, CP2005 (pp. 314–327). · Zbl 1153.90423 |

[28] | Hooker, J. N., & Ottosson, G. (2003). Logic-based Benders decomposition. Mathematical Programming, 96(1), 33–60. · Zbl 1023.90082 |

[29] | Horowitz, M. (2007). Scaling, power and the future of CMOS. In Proceedings of the 20th international conference on VLSI design, VLSID’07 (p. 7). |

[30] | Horowitz, M. et al. (2001). The future of wires. Proceedings of the IEEE, 89, 490–504. · doi:10.1109/5.920580 |

[31] | Intel Corporation (2002). Intel IXP2800 network processor product brief. Available at http://download.intel.com/design/network/ProdBrf/27905403.pdf . |

[32] | Jain, V., & Grossmann, I. E. (2001). Algorithms for hybrid milp/cp models for a class of optimization problems. INFORMS Journal on Computing, 13(4), 258–276. · Zbl 1238.90106 · doi:10.1287/ijoc.13.4.258.9733 |

[33] | Kodase, S., Wang, S., Gu, Z., & Shin, K. G. (2003). Improving scalability of task allocation and scheduling in large distributed real-time systems using shared buffers. In Proceedings of the IEEE real-time and embedded technology and applications symposium, RTAS 03 (p. 181). |

[34] | Kuchcinski, K. (1997). Embedded system synthesis by timing constraint solving. In Proceedings of IEEE ISSS’97 (pp. 50–57). |

[35] | Kuchcinski, K. (2003). Constraints-driven scheduling and resource assignment. ACM Transactions on Design Automation of Electronic Systems, 8(3), 355–383. · Zbl 05456790 · doi:10.1145/785411.785416 |

[36] | Kuchcinski, K., & Wolinski, C. (2003). Global approach to assignment and scheduling of complex behaviors based on HCDG and constraint programming. Journal of Systems Architecture, 49(12–15), 489–503. · Zbl 05432040 · doi:10.1016/S1383-7621(03)00075-4 |

[37] | Laborie, P. (2003). Algorithms for propagating resource constraints in AI planning and scheduling: Existing approaches and new results. Artificial Intelligence, 143(2), 151–188. · Zbl 1079.68622 · doi:10.1016/S0004-3702(02)00362-4 |

[38] | Laborie, P. (2005). Complete MCS-based search: application to resource constrained project scheduling. Proceedings of the International Joint Conferences on Artificial Intelligence, IJCAI 2005, 19, 181. |

[39] | Laporte, G., & Louveaux, F. (1993). The integer l-shaped method for stochastic integer programs with complete recourse. Operations Research Letters, 13, 133–142. · Zbl 0793.90043 · doi:10.1016/0167-6377(93)90002-X |

[40] | Le Pape, C., Vergamini, D., & Gosselin, V. (1994). Time-versus-capacity compromises in project scheduling. In Proceedings of the thirteenth workshop of the UK planning special interest group (p. 19). |

[41] | Lombardi, M., & Milano, M. (2006). Stochastic allocation and scheduling for conditional task graphs in MPSoCs. In Proceedings of the international conference on principles and practice of constraint programming, CP2006 (pp. 299–313). |

[42] | Martin, G. (2006). Overview of the MPSoC design challenge. In Proceedings of the 43rd annual conference on design automation, DAC’06 (pp. 274–279). |

[43] | Medardoni, S., Ruggiero, M., Bertozzi, D., Benini, L., Strano, G., & Pistritto, C. (2007). Interactive presentation: capturing the interaction of the communication, memory and I/O subsystems in memory-centric industrial MPSoC platforms. In Proceedings of the conference on design, automation and test in Europe, DATE’07 (p. 665). |

[44] | Mudge, T. (2001). Power: A first-class architectural design constraint. Computer, 34(4), 52–58. · Zbl 05088200 · doi:10.1109/2.917539 |

[45] | NEC, http://www.nec.co.jp/techrep/en/journal/g06/n03/060311.html . · Zbl 0266.20038 |

[46] | Palazzari, P., Baldini, L., & Coli, M. (2004). Synthesis of pipelined systems for the contemporaneous execution of periodic and aperiodic tasks with hard real-time constraints. In Proceedings of the IEEE international parallel & distributed processing symposium. |

[47] | Pham, D. et al. (2005). The design and implementation of a first-generation CELL processor. In Proceedings of the international solid state circuits conference, ISSCC’05 (pp. 45–49). |

[48] | Prakash, S., & Parker, A. C. (1992). Sos: synthesis of application-specific heterogeneous multiprocessor systems. Journal of Parallel and Distributed Computing, 16(4), 338–351. · Zbl 0786.68009 · doi:10.1016/0743-7315(92)90017-H |

[49] | Ruggiero, M., Guerri, A., Bertozzi, D., Poletti, F., & Milano, M. (2006). Communication-aware allocation and scheduling framework for stream-oriented multi-processor systems-on-chip. In Proceedings of the conference on design, automation and test in Europe, DATE’06 (p. 8). |

[50] | Shin, D., & Kim, J. (2003). Power-aware scheduling of conditional task graphs in real-time multiprocessor systems. In Proceedings of the international symposium on low power electronics and design, ISLPED 2003 (pp. 408–413). |

[51] | ST Microelectronics, http://www.st.com/stonline/products/families/mobile/processors/processorsprod.htm . |

[52] | Szymanek, R., & Kuchcinski, K. (2001). A constructive algorithm for memory-aware task assignment and scheduling. In Proceedings of the international conference on hardware-software codesign and system synthesis, CODES’01 (p. 152). |

[53] | Thorsteinsson, E. S. (2001). Branch-and-check: A hybrid framework integrating mixed integer programming and constraint logic programming. In Proceedings of the international conference on principles and practice of constraint programming, CP’01 (pp. 16–30). · Zbl 1067.68677 |

[54] | Wu, D., Al-Hashimi, B., & Eles, P. (2003). Scheduling and mapping of conditional task graph for the synthesis of low power embedded systems. Computers and Digital Techniques, IEEE Proceedings, 150(5), 262–273. · doi:10.1049/ip-cdt:20030837 |

[55] | Xie, Y., & Wolf, W. (2001). Allocation and scheduling of conditional task graph in hardware/software co-synthesis. In Proceedings of the conference on design, automation and test in Europe, DATE 2001 (pp. 620–625). |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.