Abstract
The extension of a given instruction-set with specialized instructions has become a common technique for speeding up the execution of applications. Computationally intensive portions of an application are identified and partitioned into segments of code that execute in software and segments that execute in hardware, which can speed up execution considerably. Each segment implemented in hardware can then be seen as a specialized application-specific instruction extending the given instruction-set. Although the literature proposes a number of methodologies for customizing an instruction-set, the problem itself has been described only through sporadic comparisons limited to isolated problems. This survey presents a unique, detailed description of the problem and an exhaustive overview of the research of the past years in instruction-set extension. It thoroughly analyzes the issues involved in customizing an instruction-set by means of a set of specialized application-specific instructions. The investigation covers both instruction generation and instruction selection, and different kinds of customization are analyzed in great detail.
- Aho, A. V., Ganapathi, M., and Tjiang, S. W. K. 1989. Code generation using tree matching and dynamic programming. ACM Trans. Programm. Lang. Syst. 11, 4, 491--516. Google Scholar
Digital Library
- Aletà, A., Codina, J. M., González, A., and Kaeli, D. 2004. Removing communications in clustered microarchitectures through instruction replication. ACM Trans. Archit. Code Optimiz. 1, 2, 127--151. Google Scholar
Digital Library
- Alippi, C., Fornaciari, W., Pozzi, L., and Sami, M. 1999. A dag-based design approach for reconfigurable vliw processors. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’99). 778--779. Google Scholar
Digital Library
- Alippi, C., Fornaciari, W., Pozzi, L., and Sami, M. 2001. Determining the optimum extended instruction-set architecture for application specific reconfigurable vliw cpus. In Proceedings of the 12th International Workshop on Rapid System Prototyping (RSP’01). 50--56. Google Scholar
Digital Library
- Alomary, A., Nakata, T., Honma, Y., Imai, M., and Hikichi, N. 1993. An asip instruction set optimization algorithm with functional module sharing constraint. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’93). 526--532. Google Scholar
Digital Library
- Alomary, A. Y. 1996. A hardware/software codesign partitioner for asip design. In Proceedings of the 3rd IEEE International Conference on Electronics, Circuits, and Systems (ICECS’96). 251--254.Google Scholar
Cross Ref
- Arató, P., Juhász, S., Ádám Mann, Z., Orbán, A., and Papp, D. 2003. Hardware-Software partitioning in embedded system design. In Proceedings of the IEEE International Symposium on Intelligent Signal Processing (WISP’03). 197--202.Google Scholar
- Arnold, M. 2001. Instruction set extension for embedded processors. Ph.D. thesis, University of Delft, The Netherlands.Google Scholar
- Arnold, M. and Corporaal, H. 1999. Automatic detection of recurring operation patterns. In Proceedings of the 7th International Workshop on Hardware/Software Codesign (CODES’99). 22--26. Google Scholar
Digital Library
- Arnold, M. and Corporaal, H. 2001. Designing domain-specific processors. In Proceedings of the 9th International Symposium on Hardware/Software Codesign (CODES’01). 61--66. Google Scholar
Digital Library
- Atasu, K. 2007. Hardware/software partitioning for custom instruction processors. Ph.D. thesis, Boğaziçi University, Turkey. December.Google Scholar
- Atasu, K., Pozzi, L., and Ienne, P. 2003a. Automatic application-specific instruction-set extensions under microarchitectural constraints. In Proceedings of the 40th Conference on Design Automation (DAC’03). 256--261. Google Scholar
Digital Library
- Atasu, K., Pozzi, L., and Ienne, P. 2003b. Automatic application-specific instruction-set extensions under microarchitectural constraints. Int. J. Parall. Programm. 31, 6, Special issue: Workshop on application specific processors (WASP), 411--428. Google Scholar
Digital Library
- Atasu, K., Dündar, G., and Özturan, C. 2005. An integer linear programming approach for identifying instruction-set extensions. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’05). 172--177. Google Scholar
Digital Library
- Atasu, K., Dimond, R. G., Mencer, O., Luk, W., Özturan, C., and Dündar, G. 2007. Optimizing instruction-set extensible processors under data bandwidth constraints. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’07). 588--593. Google Scholar
Digital Library
- Atasu, K., Mencer, O., Luk, W., Özturan, C., and Dündar, G. 2008. Fast custom instruction identification by convex subgraph enumeration. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP’08). 1--6. Google Scholar
Digital Library
- Athanas, P. M. and Silverman, H. F. 1993. Processor reconfiguration through instruction-set metamorphosis. Comput. 26, 3, 11--18. Google Scholar
Digital Library
- Baleani, M., Gennari, F., Jiang, Y., Patel, Y., Brayton, R. K., and Sangiovanni-Vincentelli, A. 2002. Hw/sw partitioning and code generation of embedded control applications on a reconfigurable architecture platform. In Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES’02). 151--156. Google Scholar
Digital Library
- Barat, F. and Lauwereins, R. 2000. Reconfigurable instruction set processors: A survey. In Proceedings of the 11th IEEE International Workshop on Rapid System Prototyping (RSP’00). IEEE Computer Society, 168. Google Scholar
Digital Library
- Barat, F., Lauwereins, R., and Deconinck, G. 2002. Reconfigurable instruction set processors from a hardware/software perspective. IEEE Trans. Softw. Engin. 28, 9, 847--862. Google Scholar
Digital Library
- Bình, N. N., Imai, M., and Hikichi, N. 1995. A hardware/software partitioning algorithm for pipelined instruction set processor. In Proceedings of the Conference on European Design Automation (EURO-DAC’95/EURO-VHDL’95). 176--181. Google Scholar
Digital Library
- Bình, N. N., Imai, M., and Shiomi, A. 1996a. A new hw/sw partitioning algorithm for synthesizing the highest performance pipelined asips with multiple identical fus. In Proceedings of the Conference on European Design Automation (EURO-DAC’96/EURO-VHDL’96). 126--131. Google Scholar
Digital Library
- Bình, N. N., Imai, M., Shiomi, A., and Hikichi, N. 1996b. A hardware/software partitioning algorithm for designing pipelined asips with least gate counts. In Proceedings of the 33rd Annual Conference on Design Automation (DAC’96). 527--532. Google Scholar
Digital Library
- Biswas, P. and Dutt, N. 2003a. Greedy and heuristic-based algorithms for synthesis of complex instructions in heterogeneous-connectivity-based DSPs. Tech. rep. 03-16, UCI-ISR.Google Scholar
- Biswas, P. and Dutt, N. 2003b. Reducing code size for heterogeneous-connectivity-based vliw dsps through syntheis of instruction set extensions. In Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’03). 104--112. Google Scholar
Digital Library
- Biswas, P. and Dutt, N. D. 2005. Code size reduction in heterogeneous-connectivity-based dsps using instruction set extensions. IEEE Trans. Comput. 54, 10, 1216--1226. Google Scholar
Digital Library
- Biswas, P., Banerjee, S., Dutt, N., Pozzi, L., and Ienne, P. 2004a. Fast automated generation of high-quality instruction set extensions for processor customization. In Proceedings of the 3rd Workshop on Application Specific Processors (WASP’04).Google Scholar
- Biswas, P., Choudhary, V., Atasu, K., Pozzi, L., Ienne, P., and Dutt, N. 2004b. Introduction of local memory elements in instruction set extensions. In Proceedings of the 41st Annual Conference on Design Automation (DAC’04). 729--734. Google Scholar
Digital Library
- Biswas, P., Banerjee, S., Dutt, N., Pozzi, L., and Ienne, P. 2005. Isegen: Generation of high-quality instruction set extensions by iterative improvement. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’05). 1246--1251. Google Scholar
Digital Library
- Biswas, P., Dutt, N., Ienne, P., and Pozzi, L. 2006. Automatic identification of application-specific functional units with architecturally visible storage. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’06). European Design and Automation Association, 212--217. Google Scholar
Digital Library
- Bobda, C. 2007. Introduction to Reconfigurable Computing. Springer. Google Scholar
Digital Library
- Bonzini, P. and Pozzi, L. 2007a. Polynomial-Time subgraph enumeration for automated instruction set extension. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’07). 1331--1336. Google Scholar
Digital Library
- Bonzini, P. and Pozzi, L. 2007b. A retargetable framework for automated discovery of custom instructions. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP07).Google Scholar
- Borin, E., Klein, F., Moreano, N., Azevedo, R., and Araujo, G. 2004. Fast instruction set customization. In 2nd Workshop on Embedded Systems for Real-Time Multimedia (ESTImedia’04). 53--58.Google Scholar
- Brayton, R. K. and Somenzi, F. 1989. Boolean relations and the incomplete specification of logic networks. In Proceedings of the 1992 IEEE/ACM International Conference on Computer-Aided Design (ICCAD’89). 316--319.Google Scholar
- Brisk, P., Kaplan, A., Kastner, R., and Sarrafzadeh, M. 2002. Instruction generation and regularity extraction for reconfigurable processors. In Proceedings of the 2002 International Conference on Compilers, Architecture, and Sfor Embedded Systems (CASES’02). 262--269. Google Scholar
Digital Library
- Brisk, P., Kaplan, A., and Sarrafzadeh, M. 2004. Area-Efficient instruction set synthesis for reconfigurable system-on-chip designs. In Proceedings of the 41st annual conference on Design automation (DAC’04). 395--400. Google Scholar
Digital Library
- Buell, D., Kleinfelder, W., and Arnold, J. 1996. Splash 2: FPGAs in a Custom Computing Machine.Google Scholar
- Chen, L. 1996. Graph isomorphism and identification matrices: Parallel algorithms. IEEE Trans. Parall. Distrib. Syst. 7, 3, 308--319. Google Scholar
Digital Library
- Cheung, N., Henkel, J., and Parameswaran, S. 2003a. Rapid configuration and instruction selection for an asip: A case study. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’03). Google Scholar
Digital Library
- Cheung, N., Parameswaran, S., and Henkel, J. 2003b. Inside: Instruction selection/identification and design exploration for extensible processors. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’03). Google Scholar
Digital Library
- Cheung, N., Parameswaran, S., and Henkel, J. 2005. Battery-Aware instruction generation for embedded processors. In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC’05). 553--556. Google Scholar
Digital Library
- Choi, H., Hwang, S. H., Kyung, C.-M., and Park, I.-C. 1998. Synthesis of application specific instructions for embedded dsp software. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’98). 665--671. Google Scholar
Digital Library
- Choi, H., Kim, J.-S., Yoon, C.-W., Park, I.-C., Hwang, S. H., and Kyung, C.-M. 1999. Synthesis of application specific instructions for embedded dsp software. IEEE Trans. Comput. 48, 6, 603--614. Google Scholar
Digital Library
- Clark, N. 2007. Customizing the computation capabilities of microprocessors. Ph.D. thesis, University of Michigan, Ann Arbor. Google Scholar
Digital Library
- Clark, N. T. and Zhong, H. 2005. Automated custom instruction generation for domain-specific processor acceleration. IEEE Trans. Comput. 54, 10, 1258--1270. Google Scholar
Digital Library
- Clark, N., Tang, W., and Mahlke, S. 2002. Automatically generating custom instruction set extensions. In Proceedings of 1st Workshop on Application Specific Processors (WASP). 94--101.Google Scholar
- Clark, N., Zhong, H., and Mahlke, S. 2003. Processor acceleration through automated instruction set customization. In Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture (MICRO’36). Google Scholar
Digital Library
- Clark, N., Kudlur, M., Park, H., Mahlke, S., and Flautner, K. 2004. Application-specific processing on a general-purpose core via transparent instruction set customization. In Proceedings of the 37th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’37). 30--40. Google Scholar
Digital Library
- Clark, N., Blome, J., Chu, M., Mahlke, S., Biles, S., and Flautner, K. 2005. An architecture framework for transparent instruction set customization in embedded processors. SIGARCH Comput. Archit. News 33, 2, 272--283. Google Scholar
Digital Library
- Clark, N., Hormati, A., Mahlke, S., and Yehia, S. 2006. Scalable subgraph mapping for acyclic computation accelerators. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’06). 147--157. Google Scholar
Digital Library
- Compton, K. and Hauck, S. 2002. Reconfigurable computing: A survey of systems and software. ACM Comput. Surv. 34, 2, 171--210. Google Scholar
Digital Library
- Cong, J., Fan, Y., Han, G., and Zhang, Z. 2004. Application-specific instruction generation for configurable processor architectures. In Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays (FPGA’04). 183--189. Google Scholar
Digital Library
- Coudert, O. 1996. On solving covering problems. In Proceedings of the 33rd Annual Conference on Design Automation (DAC’96). 197--202. Google Scholar
Digital Library
- Coudert, O. and Madre, J. C. 1995. New ideas for solving covering problems. In Proceedings of the 32nd ACM/IEEE Conference on Design Automation (DAC’95). 641--646. Google Scholar
Digital Library
- De Micheli, G. and Gupta, R. K. 1997. Hardware/software co-design. Proc. IEEE 85, 3, 349--365.Google Scholar
Cross Ref
- Ebeling, C., Cronquist, D., and Franklin, P. 1996. Rapid - reconfigurable pipelined datapath. In Proceedings of the 6th International Workshop on Field-Programmable Logic, Smart Applications, New Paradigms and Compilers (FPL’96). Springer, 126--135. Google Scholar
Digital Library
- Faraboschi, P., Brown, G., Fisher, J. A., Desoli, G., and Homewood, F. 2000. Lx: a technology platform for customizable vliw embedded processing. ACM SIGARCH Comput. Archit. News 28, 2, 203--213. Google Scholar
Digital Library
- Fornaciari, W., Pozzi, L., and Sami, M. 1999. Processori riconfigurabili: unalternativa flessibile per i sistemi dedicati. Alta Frequenza - Rivista di Elettronica, 22--28.Google Scholar
- Fortin, S. 1996. The graph isomorphism problem. Tech. rep. TR 96-20, Department of Computing Science, University of Alberta, Canada.Google Scholar
- Galuzzi, C., Bertels, K., and Vassiliadis, S. 2007a. A linear complexity algorithm for the automatic generation of convex multiple input multiple output instructions. In Proceedings of the 3rd International Workshop Reconfigurable Computing: Architectures, Tools and Applications (ARC’07), P. C. Diniz, E. Marques, K. Bertels, M. M. Fernandes, and J. M. P. Cardoso Eds., Lecture Notes in Computer Science, vol. 4419. Springer, 130--141. Google Scholar
Digital Library
- Galuzzi, C., Bertels, K., and Vassiliadis, S. 2007b. A linear complexity algorithm for the generation of multiple input single output instructions of variable size. In Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 7th International Workshop (SAMOS’07), S. Vassiliadis, M. Berekovic, and T. D. Hämäläinen, Eds. Lecture Notes in Computer Science, vol. 4599. Springer, 283--293. Google Scholar
Digital Library
- Galuzzi, C., Moscu Panainte, E., Yankova, Y., Bertels, K., and Vassiliadis, S. 2006. Automatic selection of application-specific instruction-set extensions. In Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’06). 160--165. Google Scholar
Digital Library
- Geurts, W. 1995. Synthesis of accelerator data paths for high-throughput signal processing applications. Ph.D. thesis, Katholieke Universiteit Leuven.Google Scholar
- Geurts, W. 1997. Accelerator Data-Path Synthesis for High-Throughput Signal Processing Applications. Kluwer Academic Publishers, Norwell, MA. Google Scholar
Digital Library
- Gokhale, M., Holmes, W., Kopser, A., Lucas, S., Minnich, R., Sweely, D., and Lopresti, D. 1991. Building and using a highly parallel programmable logic array. Comput. 24, 1, 81--89. Google Scholar
Digital Library
- Goldstein, S. C., Schmit, H., Moe, M., Budiu, M., Cadambi, S., Taylor, R. R., and Laufer, R. 1999. Piperench: A co-processor for streaming multimedia acceleration. SIGARCH Comput. Archit. News 27, 2, 28--39. Google Scholar
Digital Library
- Grasselli, A. and Luccio, F. 1965. A method for minimizing the number of internal states in incompletely specified sequential networks. IEEE Trans. Electron. Comp. EC-14, 350--359.Google Scholar
Cross Ref
- Guo, Y. 2006. Mapping applications to a coarse-grained reconfigurable architecture. Ph.D. thesis, University of Twente, The Netherlands.Google Scholar
- Guo, Y., Smit, G. J., Broersma, H., and Heysters, P. M. 2003. A graph covering algorithm for a coarse grain reconfigurable system. In Proceedings of the ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems (LCTES’03). 199--208. Google Scholar
Digital Library
- Gutin, G., Johnstone, A., Reddington, J., Scott, E., Soleimanfallah, A., and Yeo, A. 2007. An algorithm for finding connected convex subgraphs of an acyclic digraph. In Proceedings of the ACiD 2007.Google Scholar
- Hartenstein, R. 2001a. Coarse grain reconfigurable architecture (embedded tutorial). In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC’01). 564--570. Google Scholar
Digital Library
- Hartenstein, R. 2001b. A decade of reconfigurable computing: a visionary retrospective. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’01). 642--649. Google Scholar
Digital Library
- Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 1997. The chimaera reconfigurable functional unit. In Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM’97). Google Scholar
Digital Library
- Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 2004. The chimaera reconfigurable functional unit. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 12, 2, 206--217. Google Scholar
Digital Library
- Hauser, J. R. and Wawrzynek, J. 1997. Garp: a mips processor with a reconfigurable coprocessor. In Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM’97). Google Scholar
Digital Library
- Haynes, S. D., Cheung, P. Y. K., Luk, W., and Stone, J. 1999. Sonic - A plug-in architecture for video processing. In Proceedings of the 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’99). Google Scholar
Digital Library
- Haynes, S. D., Stone, J., Cheung, P. Y. K., and Luk, W. 2000. Video image processing with the sonic architecture. Comput. 33, 4, 50--57. Google Scholar
Digital Library
- Holmer, B. 1993. Automatic design of computer instruction sets. Ph.D. thesis. Google Scholar
Digital Library
- Huang, I.-J. and Despain, A. M. 1994a. Generating instruction sets and microarchitectures from applications. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’94). 391--396. Google Scholar
Digital Library
- Huang, I.-J. and Despain, A. M. 1994b. Synthesis of instruction sets for pipelined microprocessors. In Proceedings of the 31st Annual Conference on Design Automation (DAC’94). 5--11. Google Scholar
Digital Library
- Huang, Z. and Malik, S. 2001. Managing dynamic reconfiguration overhead in system-on-a-chip design using reconfigurable datapaths and optimized interconnection networks. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’01). 735--740. Google Scholar
Digital Library
- Huang, Z., Malik, S., Moreano, N., and Araujo, G. 2004. The design of dynamically reconfigurable datapath coprocessors. Trans. Embed. Comput. Syst. 3, 2, 361--384. Google Scholar
Digital Library
- Huynh, H. P., Sim, J. E., and Mitra, T. 2007. An efficient framework for dynamic reconfiguration of instruction-set customization. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’07). 135--144. Google Scholar
Digital Library
- Ienne, P. and Leupers, R. 2006. Customizable Embedded Processors: Design Technologies and Applications (Systems on Silicon). Morgan Kaufmann Publishers, San Francisco, CA. Google Scholar
Digital Library
- Imai, M., Sato, J., Alomary, A., and Hikichi, N. 1992. An integer programming approach to instruction implementation method selection problem. In Proceedings of the Conference on European Design Automation (EURO-DAC’92). 106--111. Google Scholar
Digital Library
- Iseli, C. 1996. Spyder: A reconfigurable processor development system. Ph.D. thesis, Ecole Polytechnique Federale de Lausanne.Google Scholar
- Iseli, C. and Sanchez, E. 1995. Spyder: A sure (superscalar and reconfigurable) processor. J. Supercomput. 9, 3, 231--252. Google Scholar
Digital Library
- Janssen, M., Catthoor, F., and de Man, H. 1996. A specification invariant technique for regularity improvement between flow-graph clusters. In Proceedings of the European Conference on Design and Test (EDTC’96). Google Scholar
Digital Library
- Jayaseelan, R., Liu, H., and Mitra, T. 2006. Exploiting forwarding to improve data bandwidth of instruction-set extensions. In Proceedings of the 43rd Annual Conference on Design Automation (DAC’06). 43--48. Google Scholar
Digital Library
- Kastner, R., Ogrenci-Memik, S., Bozorgzadeh, E., and Sarrafzadeh, M. 2001. Instruction generation for hybrid reconfigurable systems. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’01). 127--130. Google Scholar
Digital Library
- Kastner, R., Kaplan, A., Memik, S. O., and Bozorgzadeh, E. 2002. Instruction generation for hybrid reconfigurable systems. ACM Trans. Des. Automa. Electron. Syst. (TODAES) 7, 4, 605--627. Google Scholar
Digital Library
- Kavvadias, N. and Nikolaidis, S. 2005. Automated instruction-set extension of embedded processors with application to mpeg-4 video encoding. In Proceedings of the IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP’05). 140--145. Google Scholar
Digital Library
- Kavvadias, N. and Nikolaidis, S. 2006. A flexible instruction generation framework for extending embedded processors. In Proceedings of the 13th IEEE Mediterranean Electrotechnical Conference (MELECON’06). 125--128.Google Scholar
- Keutzer, K., Malik, S., and Newton, A. R. 2002. From asic to asip: The next design discontinuity. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD’02). 84--90. Google Scholar
Digital Library
- Lam, S.-K. and Srikanthan, T. 2009. Rapid design of area-efficient custom instructions for reconfigurable embedded processing. J. Syst. Archit. 55, 1, 1--14. Google Scholar
Digital Library
- Lam, S. K., Srikantham, T., and Clarke, C. T. 2006. Rapid generation of custom instructions using predefined dataflow structures. Microprocess. Microsyst. 30, 6, (Special Issue on FPGA’s), 355--366.Google Scholar
- Lee, C., Potkonjak, M., and Mangione-Smith, W. H. 1997. Mediabench: A tool for evaluating and synthesizing multimedia and communicatons systems. In Proceedings of the 30th Annual ACM/IEEE International Symposium on Microarchitecture (MICRO’30). 330--335. Google Scholar
Digital Library
- Lee, J.-E., Choi, K., and Dutt, N. 2002. Efficient instruction encoding for automatic instruction set design of configurable asips. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’02). 649--654. Google Scholar
Digital Library
- Lee, J.-E., Choi, K., and Dutt, N. D. 2003a. Energy-efficient instruction set synthesis for application-specific processors. In Proceedings of the 2003 International Symposium on Low Power Electronics and Design (ISLPED’03). 330--333. Google Scholar
Digital Library
- Lee, J.-E., Choi, K., and Dutt, N. D. 2003b. An algorithm for mapping loops onto coarse-grained reconfigurable architectures. In Proceedings of the ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems (LCTES’03). 183--188. Google Scholar
Digital Library
- Lee, J.-E., Choi, K., and Dutt, N. D. 2007. Instruction set synthesis with efficient instruction encoding for configurable processors. ACM Trans. Des. Autom. Electron. Syst. 12, 1, 8. Google Scholar
Digital Library
- Leupers, R., Karuri, K., Kraemer, S., and Pandey, M. 2006. A design flow for configurable embedded processors based on optimized instruction set extension synthesis. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’06). European Design and Automation Association, 581--586. Google Scholar
Digital Library
- Li, X. Y., Stallmann, M. F., and Brglez, F. 2005. Effective bounding techniques for solving unate and binate covering problems. In Proceedings of the 42nd Annual Conference on Design Automation (DAC’05). 385--390. Google Scholar
Digital Library
- Liao, S. and Devadas, S. 1997. Solving covering problems using lpr-based lower bounds. In Proceedings of the 34th Annual Conference on Design Automation (DAC’97). 117--120. Google Scholar
Digital Library
- Liao, S., Devadas, S., Keutzer, K., and Tjiang, S. 1995. Instruction selection using binate covering for code size optimization. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’95). 393--399. Google Scholar
Digital Library
- Liao, S., Keutzer, K., Tjiang, S., and Devadas, S. 1998. A new viewpoint on code generation for directed acyclic graphs. ACM Trans. Design Automat. Electron. Syst. (TODAES) 3, 1, 51--75. Google Scholar
Digital Library
- Liem, C., May, T., and Paulin, P. 1994. Instruction-set matching and selection for DSP and ASIP code generation. In Proceedings of the European Design and Test Conference (ED&TC). 31--37.Google Scholar
- Lin, S. and Kernighan, B. 1973. An effective heuristic algorithm for the traveling-salesman problem. Oper. Res. 21, 2, 498--516.Google Scholar
Digital Library
- Lu, G., Singh, H., Lee, M.-H., Bagherzadeh, N., Kurdahi, F. J., and Filho, E. M. C. 1999. The morphosys parallel reconfigurable system. In Proceedings of the 5th International Euro-Par Conference on Parallel Processing (Euro-Par’99). Springer, 727--734. Google Scholar
Digital Library
- Mei, B., Vernalde1, S., Verkest, D., Man, H. D., and Lauwereins, R. 2003. Adres: An architecture with tightly coupled vliw processor and coarse-grained reconfigurable matrix. In Proceedings of the International Conference on Field-Programmable Logic and Applications (FPL’03). Springer, 61--70.Google Scholar
- Messmer, B. T. and Bunke, H. 1995. Subgraph isomorphism in polynomial time. Tech. rep. IAM 95-003, University of Bern, Switzerland.Google Scholar
- Miyamori, T. and Olukotun, K. 1998. Remarc (abstract): Reconfigurable multimedia array coprocessor. In Proceedings of the ACM/SIGDA 6th International Symposium on Field Programmable Gate Arrays (FPGA’98). Google Scholar
Digital Library
- Moreano, N., Araujo, G., Huang, Z., and Malik, S. 2002. Datapath merging and interconnection sharing for reconfigurable architectures. In Proceedings of the 15th International Symposium on System Synthesis (ISSS’02). 38--43. Google Scholar
Digital Library
- Niemann, R. and Marwedel, P. 1996. Hardware/software partitioning using integer programming. In Proceedings of the European Conference on Design and Test (EDTC9’6). Google Scholar
Digital Library
- Niemann, R. and Marwedel, P. 1997. An algorithm for hardware/software partitioning using mixed integer linear programming. Des. Automat. Embedd. Syst. 2, 2, Special Issue: Partitioning Methods for Embedded Systems, 165--193.Google Scholar
- Peymandoust, A., Pozzil, L., Ienne, P., and Micheli, G. D. 2003. Automatic instruction set extension and utilization for embedded processors. In Proceedings of the 14th International Conference on Application-Specific Systems, Architectures and Processors (ASAP’03). 108--118.Google Scholar
- Pothineni, N., Kumar, A., and Paul, K. 2007. Application specific datapath extension with distributed i/o functional units. In Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference (VLSID’07). 551--558. Google Scholar
Digital Library
- Pozzi, L. 2000. Methodologies for the design of application-specific reconfigurable vliw processors. Ph.D. thesis, Politecnico di Milano, Milano, Italy.Google Scholar
- Pozzi, L. and Ienne, P. 2005. Exploiting pipelining to relax register-file port constraints of instruction-set extensions. In Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES’05). 2--10. Google Scholar
Digital Library
- Pozzi, L., Vuletić, M., and Ienne, P. 2001. Automatic topology-based identification of instruction-set extensions for embedded processors. Tech. rep. CS 01/377, EPFL, DI-LAP, Lausanne.Google Scholar
- Pozzi, L., Vuletić, M., and Ienne, P. 2002. Automatic topology-based identification of instruction-set extensions for embedded processors. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’02). Google Scholar
Digital Library
- Pozzi, L., Atasu, K., and Ienne, P. 2006a. Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans. Comput.-Aid. Desi. Integra. Circ. Syst. 25, 7, 1209--1229. Google Scholar
Digital Library
- Pozzi, L., Atasu, K., and Ienne, P. 2006b. Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans. Comput.-Aid. Des. Integra. Circ. Syst. 25, 7, 1209--1229. Google Scholar
Digital Library
- Rabaey, J. 1997. Reconfigurable processing: The solution to low-power programmable dsp. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’97). vol. 1. Google Scholar
Digital Library
- Radunovic, B. and Milutinovic, V. M. 1998. A survey of reconfigurable computing architectures. In Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm (FPL’98). Springer, 376--385. Google Scholar
Digital Library
- Razdan, R., Brace, K. S., and Smith, M. D. 1994. PRISC software acceleration techniques. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computer & Processors (ICCS’’94). 145--149. Google Scholar
Digital Library
- Razdan, R. and Smith, M. D. 1994. A high-performance microarchitecture with hardware-programmable functional units. In Proceedings of the 27th Annual International Symposium on Microarchitecture (MICRO’27). 172--180. Google Scholar
Digital Library
- Rupp, C. R., Landguth, M., Garverick, T., Gomersall, E., Holt, H., Arnold, J. M., and Gokhale, M. 1998. The napa adaptive processing architecture. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM’98).
- Sang, S., Li, X., and Ye, Y. 2005. Automatic instruction generation for application specific co-processor. In 6th International Conference On ASIC (ASICON’05). 934--938.
- Scharwaechter, H., Youn, J. M., Leupers, R., Paek, Y., Ascheid, G., and Meyr, H. 2007. A code-generator generator for multi-output instructions. In Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’07). 131--136.
- Seto, K. and Fujita, M. 2008. Custom instruction generation with high-level synthesis. In Proceedings of the 2008 Symposium on Application Specific Processors (SASP). Anaheim, California, 14--19.
- Sreenivasa Rao, D. and Kurdahi, F. J. 1992. Partitioning by regularity extraction. In Proceedings of the 29th ACM/IEEE Conference on Design Automation (DAC’92). 235--238.
- Sreenivasa Rao, D. and Kurdahi, F. J. 1993a. Hierarchical design space exploration for a class of digital systems. IEEE Trans. Very Large Scale Integra. (VLSI) Syst. 1, 3, 282--295.
- Sreenivasa Rao, D. and Kurdahi, F. J. 1993b. On clustering for maximal regularity extraction. IEEE Trans. Comput.-Aid. Des. 12, 8, 1198--1208.
- Strozek, L. and Brooks, D. 2006. Efficient architectures through application clustering and architectural heterogeneity. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’06). 190--200.
- Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2002. Synthesis of custom processors based on extensible platforms. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’02). 641--648.
- Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2003. A scalable application-specific processor synthesis methodology. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’03).
- Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2004. Custom-instruction synthesis for extensible processor platform. IEEE Trans. Comput.-Aid. Des. Integra. Circ. Syst. 23, 2, 216--228.
- Todman, T., Constantinides, G., Wilton, S., Mencer, O., Luk, W., and Cheung, P. 2005. Reconfigurable computing: Architectures and design methods. IEE Proc. - Comput. Digital Tech. 152, 2, 193--207.
- Van Praet, J., Goossens, G., Lanneer, D., and De Man, H. 1994. Instruction set definition and instruction selection for asips. In Proceedings of the 7th International Symposium on High-level Synthesis (ISSS’94). 11--16.
- Vassiliadis, S. and Soudris, D., Eds. 2007. Fine- and Coarse-Grain Reconfigurable Computing. Springer.
- Vassiliadis, S., Wong, S., and Cotofana, S. 2001. The molen ϱμ-coded processor. In Proceedings of the 11th International Conference on Field-Programmable Logic and Applications (FPL’01). Springer-Verlag, London, UK, 275--285.
- Vassiliadis, S., Wong, S., Gaydadjiev, G., Bertels, K., Kuzmanov, G., and Moscu Panainte, E. 2004. The molen polymorphic processor. IEEE Trans. Comput. 53, 11, 1363--1375.
- Vassiliadis, N., Kavvadias, N., Theodoridis, G., and Nikolaidis, S. 2006. A risc architecture extended by an efficient tightly coupled reconfigurable unit. Int. J. Electron. 93, 6, 421--438.
- Vassiliadis, N., Theodoridis, G., and Nikolaidis, S. 2007. Enhancing a reconfigurable instruction set processor with partial predication and virtual opcode support. In Proceedings of the 2nd International Workshop on Applied Reconfigurable Computing (ARC’06). Lecture Notes in Computer Science, vol. 3985. Springer, 217--229.
- Verma, A. K., Atasu, K., Vuletić, M., Pozzi, L., and Ienne, P. 2002. Automatic application-specific instruction-set extensions under microarchitectural constraints. In Proceedings of the 1st Workshop on Application Specific Processors (WASP-1).
- Verma, A. K., Brisk, P., and Ienne, P. 2007. Rethinking custom ise identification: A new processor-agnostic method. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’07). 125--134.
- Wang, A., Killian, E., Maydan, D., and Rowen, C. 2001. Hardware/software instruction set configurability for system-on-chip processors. In Proceedings of the 38th Conference on Design Automation (DAC’01). 184--188.
- Wazlowski, M., Agarwal, L., Lee, T., Smith, A., Lam, E., Athanas, P., Silverman, H., and Ghosh, S. 1993. Prism-ii compiler and architecture. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines. 9--16.
- Wirthlin, M. J. and Hutchings, B. L. 1995. Disc: The dynamic instruction set computer. In Proceedings of the International Society of Optical Engineering SPIE. Field Programmable Gate Arrays (FPGAs) for Fast Board Development and Reconfigurable Computing. vol. 2607. 92--103.
- Wittig, R. and Chow, P. 1996. OneChip: An FPGA processor with reconfigurable logic. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines. 126--135.
- Wittig, R. D. 1995. Onechip: An fpga processor with reconfigurable logic. M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto.
- Wolinski, C. and Kuchcinski, K. 2007. Identification of application specific instructions based on sub-graph isomorphism constraints. In Proceedings of the IEEE International Application-specific Systems, Architectures and Processors. 328--333.
- Wolinski, C. and Kuchcinski, K. 2008. Automatic selection of application-specific reconfigurable processor extensions. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’08). 1214--1219.
- Wong, S., Vassiliadis, S., and Cotofana, S. 2007. Instruction set extension generation with considering physical constraints. In Proceedings of the International Conference on High Performance Embedded Architectures and Compilers. 291--305.
- Ye, Z. A., Moshovos, A., Hauck, S., and Banerjee, P. 2000. CHIMAERA: A high-performance architecture with a tightly-coupled reconfigurable functional unit. In ACM SIGARCH Comput. Archit. News (Special Issue: Proceedings of the 27th annual international symposium on Computer architecture ISCA), 225--235.
- Yu, P. and Mitra, T. 2004. Scalable custom instructions identification for instruction-set extensible processors. In Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’04). 69--78.
- Yu, P. and Mitra, T. 2005. Satisfying real-time constraints with custom instructions. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’05). 166--171.
- Yu, P. and Mitra, T. 2007. Disjoint pattern enumeration for custom instructions identification. In Proceedings of the 17th IEEE International Conference on Field Programmable Logic and Applications (FPL’07). Amsterdam, The Netherlands, --.
- Zhao, K., Bian, J., Dong, S., Song, Y., and Goto, S. 2008. Fast custom instruction identification algorithm based on basic convex pattern model for supporting asip automated design. IEICE Trans. Fundam. Electron. Comm. Comput. Sci. E91-A, 6, 1478--1487.