Abstract
New embedded signal-processing architectures are emerging that are composed of loosely coupled heterogeneous components such as CPUs or DSPs, specialized IP cores, reconfigurable units, and memories. We believe that these architectures should be programmed using the process network model of computation. To ease the mapping of applications, we are developing the Compaan compiler, which automatically derives a process network (PN) description from an application written in Matlab or C. In this paper, we investigate a particular problem in nested-loop programs: classifying the interprocess communication in the PN representation of such a program. The global memory arrays present in the code have to be replaced by a distributed communication structure that carries data between the network processes. We show that four types of communication exist, each with different requirements when realized in hardware or software. We first present two compile-time tests, based on integer linear programming (ILP), that decide the type of communication. In the second part of the paper, we present alternative classification techniques that have polynomial complexity. In some cases, however, those techniques do not give a definitive answer and the ILP tests have to be applied. All presented tests are combined in a hybrid classification scheme that correctly classifies the interprocess communication. In only 5% of the cases do we have to rely on integer linear programming; in the remaining 95%, the alternative techniques presented in this paper classify the case correctly. The hybrid classification scheme has become an important part of our Compaan compiler.
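To make the classification problem concrete, the following is a minimal sketch and not the paper's method: it replaces the compile-time ILP and polynomial tests with brute-force enumeration over a small, fixed iteration domain. The abstract does not name the four communication types; the sketch assumes the naming common in the related Compaan literature (in-order or out-of-order, each with or without multiplicity). It also assumes that the producer writes one token per iteration in lexicographic order, and that "in-order" means the consumer's read sequence never steps back in that order. The function `classify` and its parameters are hypothetical names introduced for illustration.

```python
from itertools import product


def classify(consumer_points, producer_of):
    """Classify one producer/consumer pair by exhaustive enumeration.

    consumer_points: consumer iterations, listed in execution order.
    producer_of: hypothetical helper mapping a consumer iteration to the
                 producer iteration (a tuple) whose token it reads.
    """
    # Producer iterations in the order the consumer reads their tokens.
    reads = [producer_of(p) for p in consumer_points]

    # Multiplicity: at least one token is read more than once.
    multiplicity = len(set(reads)) < len(reads)

    # In-order (assumed definition): the read sequence never steps back
    # in the producer's lexicographic execution order.
    in_order = all(a <= b for a, b in zip(reads, reads[1:]))

    order = "in-order" if in_order else "out-of-order"
    mult = "with" if multiplicity else "without"
    return f"{order}, {mult} multiplicity"


N = 3
line = [(j,) for j in range(1, N + 1)]            # for j = 1..N
grid = list(product(range(1, N + 1), repeat=2))   # for i = 1..N, for j = 1..N

print(classify(line, lambda p: p))                # consumer j reads a[j]
print(classify(grid, lambda p: (p[0],)))          # consumer (i, j) reads a[i]
print(classify(grid, lambda p: (p[1], p[0])))     # consumer (i, j) reads b[j][i]
print(classify(grid, lambda p: (p[1],)))          # consumer (i, j) reads a[j]
```

The four calls produce one example of each type. Enumeration only works for small, fully known loop bounds; the tests described in the abstract decide the same question at compile time for parametric loop nests.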
Index Terms
Classifying interprocess communication in process network representation of nested-loop programs