
The Instruction-Set Extension Problem: A Survey

Published: 01 May 2011

Abstract

Extending a given instruction-set with specialized instructions has become a common technique for speeding up the execution of applications. By identifying the computationally intensive portions of an application and partitioning them into segments of code that execute in software and segments that execute in hardware, the execution of the application can be sped up considerably. Each segment of code implemented in hardware can then be seen as a specialized, application-specific instruction that extends the given instruction-set. Although a number of approaches in the literature propose different methodologies for customizing an instruction-set, existing descriptions of the problem consist only of sporadic comparisons limited to isolated problems. This survey presents a unique, detailed description of the problem and provides an exhaustive overview of the research on instruction-set extension in past years. It thoroughly analyzes the issues involved in customizing an instruction-set by means of a set of specialized application-specific instructions. The investigation covers both instruction generation and instruction selection, and different kinds of customization are analyzed in great detail.

References

  1. Aho, A. V., Ganapathi, M., and Tjiang, S. W. K. 1989. Code generation using tree matching and dynamic programming. ACM Trans. Programm. Lang. Syst. 11, 4, 491--516. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Aletà, A., Codina, J. M., González, A., and Kaeli, D. 2004. Removing communications in clustered microarchitectures through instruction replication. ACM Trans. Archit. Code Optimiz. 1, 2, 127--151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Alippi, C., Fornaciari, W., Pozzi, L., and Sami, M. 1999. A dag-based design approach for reconfigurable vliw processors. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’99). 778--779. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Alippi, C., Fornaciari, W., Pozzi, L., and Sami, M. 2001. Determining the optimum extended instruction-set architecture for application specific reconfigurable vliw cpus. In Proceedings of the 12th International Workshop on Rapid System Prototyping (RSP’01). 50--56. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Alomary, A., Nakata, T., Honma, Y., Imai, M., and Hikichi, N. 1993. An asip instruction set optimization algorithm with functional module sharing constraint. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’93). 526--532. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Alomary, A. Y. 1996. A hardware/software codesign partitioner for asip design. In Proceedings of the 3rd IEEE International Conference on Electronics, Circuits, and Systems (ICECS’96). 251--254.Google ScholarGoogle ScholarCross RefCross Ref
  7. Arató, P., Juhász, S., Ádám Mann, Z., Orbán, A., and Papp, D. 2003. Hardware-Software partitioning in embedded system design. In Proceedings of the IEEE International Symposium on Intelligent Signal Processing (WISP’03). 197--202.Google ScholarGoogle Scholar
  8. Arnold, M. 2001. Instruction set extension for embedded processors. Ph.D. thesis, University of Delft, The Netherlands.Google ScholarGoogle Scholar
  9. Arnold, M. and Corporaal, H. 1999. Automatic detection of recurring operation patterns. In Proceedings of the 7th International Workshop on Hardware/Software Codesign (CODES’99). 22--26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Arnold, M. and Corporaal, H. 2001. Designing domain-specific processors. In Proceedings of the 9th International Symposium on Hardware/Software Codesign (CODES’01). 61--66. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Atasu, K. 2007. Hardware/software partitioning for custom instruction processors. Ph.D. thesis, Boğaziçi University, Turkey. December.Google ScholarGoogle Scholar
  12. Atasu, K., Pozzi, L., and Ienne, P. 2003a. Automatic application-specific instruction-set extensions under microarchitectural constraints. In Proceedings of the 40th Conference on Design Automation (DAC’03). 256--261. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Atasu, K., Pozzi, L., and Ienne, P. 2003b. Automatic application-specific instruction-set extensions under microarchitectural constraints. Int. J. Parall. Programm. 31, 6, Special issue: Workshop on application specific processors (WASP), 411--428. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Atasu, K., Dündar, G., and Özturan, C. 2005. An integer linear programming approach for identifying instruction-set extensions. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’05). 172--177. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Atasu, K., Dimond, R. G., Mencer, O., Luk, W., Özturan, C., and Dündar, G. 2007. Optimizing instruction-set extensible processors under data bandwidth constraints. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’07). 588--593. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Atasu, K., Mencer, O., Luk, W., Özturan, C., and Dündar, G. 2008. Fast custom instruction identification by convex subgraph enumeration. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP’08). 1--6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Athanas, P. M. and Silverman, H. F. 1993. Processor reconfiguration through instruction-set metamorphosis. Comput. 26, 3, 11--18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Baleani, M., Gennari, F., Jiang, Y., Patel, Y., Brayton, R. K., and Sangiovanni-Vincentelli, A. 2002. Hw/sw partitioning and code generation of embedded control applications on a reconfigurable architecture platform. In Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES’02). 151--156. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Barat, F. and Lauwereins, R. 2000. Reconfigurable instruction set processors: A survey. In Proceedings of the 11th IEEE International Workshop on Rapid System Prototyping (RSP’00). IEEE Computer Society, 168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Barat, F., Lauwereins, R., and Deconinck, G. 2002. Reconfigurable instruction set processors from a hardware/software perspective. IEEE Trans. Softw. Engin. 28, 9, 847--862. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Bình, N. N., Imai, M., and Hikichi, N. 1995. A hardware/software partitioning algorithm for pipelined instruction set processor. In Proceedings of the Conference on European Design Automation (EURO-DAC’95/EURO-VHDL’95). 176--181. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Bình, N. N., Imai, M., and Shiomi, A. 1996a. A new hw/sw partitioning algorithm for synthesizing the highest performance pipelined asips with multiple identical fus. In Proceedings of the Conference on European Design Automation (EURO-DAC’96/EURO-VHDL’96). 126--131. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Bình, N. N., Imai, M., Shiomi, A., and Hikichi, N. 1996b. A hardware/software partitioning algorithm for designing pipelined asips with least gate counts. In Proceedings of the 33rd Annual Conference on Design Automation (DAC’96). 527--532. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Biswas, P. and Dutt, N. 2003a. Greedy and heuristic-based algorithms for synthesis of complex instructions in heterogeneous-connectivity-based DSPs. Tech. rep. 03-16, UCI-ISR.Google ScholarGoogle Scholar
  25. Biswas, P. and Dutt, N. 2003b. Reducing code size for heterogeneous-connectivity-based vliw dsps through syntheis of instruction set extensions. In Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’03). 104--112. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Biswas, P. and Dutt, N. D. 2005. Code size reduction in heterogeneous-connectivity-based dsps using instruction set extensions. IEEE Trans. Comput. 54, 10, 1216--1226. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Biswas, P., Banerjee, S., Dutt, N., Pozzi, L., and Ienne, P. 2004a. Fast automated generation of high-quality instruction set extensions for processor customization. In Proceedings of the 3rd Workshop on Application Specific Processors (WASP’04).Google ScholarGoogle Scholar
  28. Biswas, P., Choudhary, V., Atasu, K., Pozzi, L., Ienne, P., and Dutt, N. 2004b. Introduction of local memory elements in instruction set extensions. In Proceedings of the 41st Annual Conference on Design Automation (DAC’04). 729--734. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Biswas, P., Banerjee, S., Dutt, N., Pozzi, L., and Ienne, P. 2005. Isegen: Generation of high-quality instruction set extensions by iterative improvement. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’05). 1246--1251. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Biswas, P., Dutt, N., Ienne, P., and Pozzi, L. 2006. Automatic identification of application-specific functional units with architecturally visible storage. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’06). European Design and Automation Association, 212--217. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Bobda, C. 2007. Introduction to Reconfigurable Computing. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Bonzini, P. and Pozzi, L. 2007a. Polynomial-Time subgraph enumeration for automated instruction set extension. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’07). 1331--1336. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Bonzini, P. and Pozzi, L. 2007b. A retargetable framework for automated discovery of custom instructions. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP07).Google ScholarGoogle Scholar
  34. Borin, E., Klein, F., Moreano, N., Azevedo, R., and Araujo, G. 2004. Fast instruction set customization. In 2nd Workshop on Embedded Systems for Real-Time Multimedia (ESTImedia’04). 53--58.Google ScholarGoogle Scholar
  35. Brayton, R. K. and Somenzi, F. 1989. Boolean relations and the incomplete specification of logic networks. In Proceedings of the 1992 IEEE/ACM International Conference on Computer-Aided Design (ICCAD’89). 316--319.Google ScholarGoogle Scholar
  36. Brisk, P., Kaplan, A., Kastner, R., and Sarrafzadeh, M. 2002. Instruction generation and regularity extraction for reconfigurable processors. In Proceedings of the 2002 International Conference on Compilers, Architecture, and Sfor Embedded Systems (CASES’02). 262--269. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Brisk, P., Kaplan, A., and Sarrafzadeh, M. 2004. Area-Efficient instruction set synthesis for reconfigurable system-on-chip designs. In Proceedings of the 41st annual conference on Design automation (DAC’04). 395--400. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Buell, D., Kleinfelder, W., and Arnold, J. 1996. Splash 2: FPGAs in a Custom Computing Machine.Google ScholarGoogle Scholar
  39. Chen, L. 1996. Graph isomorphism and identification matrices: Parallel algorithms. IEEE Trans. Parall. Distrib. Syst. 7, 3, 308--319. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Cheung, N., Henkel, J., and Parameswaran, S. 2003a. Rapid configuration and instruction selection for an asip: A case study. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’03). Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Cheung, N., Parameswaran, S., and Henkel, J. 2003b. Inside: Instruction selection/identification and design exploration for extensible processors. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’03). Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Cheung, N., Parameswaran, S., and Henkel, J. 2005. Battery-Aware instruction generation for embedded processors. In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC’05). 553--556. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Choi, H., Hwang, S. H., Kyung, C.-M., and Park, I.-C. 1998. Synthesis of application specific instructions for embedded dsp software. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’98). 665--671. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Choi, H., Kim, J.-S., Yoon, C.-W., Park, I.-C., Hwang, S. H., and Kyung, C.-M. 1999. Synthesis of application specific instructions for embedded dsp software. IEEE Trans. Comput. 48, 6, 603--614. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Clark, N. 2007. Customizing the computation capabilities of microprocessors. Ph.D. thesis, University of Michigan, Ann Arbor. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Clark, N. T. and Zhong, H. 2005. Automated custom instruction generation for domain-specific processor acceleration. IEEE Trans. Comput. 54, 10, 1258--1270. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Clark, N., Tang, W., and Mahlke, S. 2002. Automatically generating custom instruction set extensions. In Proceedings of 1st Workshop on Application Specific Processors (WASP). 94--101.Google ScholarGoogle Scholar
  48. Clark, N., Zhong, H., and Mahlke, S. 2003. Processor acceleration through automated instruction set customization. In Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture (MICRO’36). Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Clark, N., Kudlur, M., Park, H., Mahlke, S., and Flautner, K. 2004. Application-specific processing on a general-purpose core via transparent instruction set customization. In Proceedings of the 37th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’37). 30--40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Clark, N., Blome, J., Chu, M., Mahlke, S., Biles, S., and Flautner, K. 2005. An architecture framework for transparent instruction set customization in embedded processors. SIGARCH Comput. Archit. News 33, 2, 272--283. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Clark, N., Hormati, A., Mahlke, S., and Yehia, S. 2006. Scalable subgraph mapping for acyclic computation accelerators. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’06). 147--157. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Compton, K. and Hauck, S. 2002. Reconfigurable computing: A survey of systems and software. ACM Comput. Surv. 34, 2, 171--210. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Cong, J., Fan, Y., Han, G., and Zhang, Z. 2004. Application-specific instruction generation for configurable processor architectures. In Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays (FPGA’04). 183--189. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Coudert, O. 1996. On solving covering problems. In Proceedings of the 33rd Annual Conference on Design Automation (DAC’96). 197--202. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Coudert, O. and Madre, J. C. 1995. New ideas for solving covering problems. In Proceedings of the 32nd ACM/IEEE Conference on Design Automation (DAC’95). 641--646. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. De Micheli, G. and Gupta, R. K. 1997. Hardware/software co-design. Proc. IEEE 85, 3, 349--365.Google ScholarGoogle ScholarCross RefCross Ref
  57. Ebeling, C., Cronquist, D., and Franklin, P. 1996. Rapid - reconfigurable pipelined datapath. In Proceedings of the 6th International Workshop on Field-Programmable Logic, Smart Applications, New Paradigms and Compilers (FPL’96). Springer, 126--135. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Faraboschi, P., Brown, G., Fisher, J. A., Desoli, G., and Homewood, F. 2000. Lx: a technology platform for customizable vliw embedded processing. ACM SIGARCH Comput. Archit. News 28, 2, 203--213. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. Fornaciari, W., Pozzi, L., and Sami, M. 1999. Processori riconfigurabili: unalternativa flessibile per i sistemi dedicati. Alta Frequenza - Rivista di Elettronica, 22--28.Google ScholarGoogle Scholar
  60. Fortin, S. 1996. The graph isomorphism problem. Tech. rep. TR 96-20, Department of Computing Science, University of Alberta, Canada.Google ScholarGoogle Scholar
  61. Galuzzi, C., Bertels, K., and Vassiliadis, S. 2007a. A linear complexity algorithm for the automatic generation of convex multiple input multiple output instructions. In Proceedings of the 3rd International Workshop Reconfigurable Computing: Architectures, Tools and Applications (ARC’07), P. C. Diniz, E. Marques, K. Bertels, M. M. Fernandes, and J. M. P. Cardoso Eds., Lecture Notes in Computer Science, vol. 4419. Springer, 130--141. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. Galuzzi, C., Bertels, K., and Vassiliadis, S. 2007b. A linear complexity algorithm for the generation of multiple input single output instructions of variable size. In Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 7th International Workshop (SAMOS’07), S. Vassiliadis, M. Berekovic, and T. D. Hämäläinen, Eds. Lecture Notes in Computer Science, vol. 4599. Springer, 283--293. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. Galuzzi, C., Moscu Panainte, E., Yankova, Y., Bertels, K., and Vassiliadis, S. 2006. Automatic selection of application-specific instruction-set extensions. In Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’06). 160--165. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Geurts, W. 1995. Synthesis of accelerator data paths for high-throughput signal processing applications. Ph.D. thesis, Katholieke Universiteit Leuven.Google ScholarGoogle Scholar
  65. Geurts, W. 1997. Accelerator Data-Path Synthesis for High-Throughput Signal Processing Applications. Kluwer Academic Publishers, Norwell, MA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Gokhale, M., Holmes, W., Kopser, A., Lucas, S., Minnich, R., Sweely, D., and Lopresti, D. 1991. Building and using a highly parallel programmable logic array. Comput. 24, 1, 81--89. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. Goldstein, S. C., Schmit, H., Moe, M., Budiu, M., Cadambi, S., Taylor, R. R., and Laufer, R. 1999. Piperench: A co-processor for streaming multimedia acceleration. SIGARCH Comput. Archit. News 27, 2, 28--39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Grasselli, A. and Luccio, F. 1965. A method for minimizing the number of internal states in incompletely specified sequential networks. IEEE Trans. Electron. Comp. EC-14, 350--359.Google ScholarGoogle ScholarCross RefCross Ref
  69. Guo, Y. 2006. Mapping applications to a coarse-grained reconfigurable architecture. Ph.D. thesis, University of Twente, The Netherlands.Google ScholarGoogle Scholar
  70. Guo, Y., Smit, G. J., Broersma, H., and Heysters, P. M. 2003. A graph covering algorithm for a coarse grain reconfigurable system. In Proceedings of the ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems (LCTES’03). 199--208. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. Gutin, G., Johnstone, A., Reddington, J., Scott, E., Soleimanfallah, A., and Yeo, A. 2007. An algorithm for finding connected convex subgraphs of an acyclic digraph. In Proceedings of the ACiD 2007.Google ScholarGoogle Scholar
  72. Hartenstein, R. 2001a. Coarse grain reconfigurable architecture (embedded tutorial). In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC’01). 564--570. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Hartenstein, R. 2001b. A decade of reconfigurable computing: a visionary retrospective. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’01). 642--649. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 1997. The chimaera reconfigurable functional unit. In Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM’97). Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 2004. The chimaera reconfigurable functional unit. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 12, 2, 206--217. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Hauser, J. R. and Wawrzynek, J. 1997. Garp: a mips processor with a reconfigurable coprocessor. In Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM’97). Google ScholarGoogle ScholarDigital LibraryDigital Library
  77. Haynes, S. D., Cheung, P. Y. K., Luk, W., and Stone, J. 1999. Sonic - A plug-in architecture for video processing. In Proceedings of the 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’99). Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. Haynes, S. D., Stone, J., Cheung, P. Y. K., and Luk, W. 2000. Video image processing with the sonic architecture. Comput. 33, 4, 50--57. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. Holmer, B. 1993. Automatic design of computer instruction sets. Ph.D. thesis. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. Huang, I.-J. and Despain, A. M. 1994a. Generating instruction sets and microarchitectures from applications. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’94). 391--396. Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. Huang, I.-J. and Despain, A. M. 1994b. Synthesis of instruction sets for pipelined microprocessors. In Proceedings of the 31st Annual Conference on Design Automation (DAC’94). 5--11. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. Huang, Z. and Malik, S. 2001. Managing dynamic reconfiguration overhead in system-on-a-chip design using reconfigurable datapaths and optimized interconnection networks. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’01). 735--740. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. Huang, Z., Malik, S., Moreano, N., and Araujo, G. 2004. The design of dynamically reconfigurable datapath coprocessors. Trans. Embed. Comput. Syst. 3, 2, 361--384. Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. Huynh, H. P., Sim, J. E., and Mitra, T. 2007. An efficient framework for dynamic reconfiguration of instruction-set customization. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’07). 135--144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. Ienne, P. and Leupers, R. 2006. Customizable Embedded Processors: Design Technologies and Applications (Systems on Silicon). Morgan Kaufmann Publishers, San Francisco, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. Imai, M., Sato, J., Alomary, A., and Hikichi, N. 1992. An integer programming approach to instruction implementation method selection problem. In Proceedings of the Conference on European Design Automation (EURO-DAC’92). 106--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Iseli, C. 1996. Spyder: A reconfigurable processor development system. Ph.D. thesis, Ecole Polytechnique Federale de Lausanne.Google ScholarGoogle Scholar
  88. Iseli, C. and Sanchez, E. 1995. Spyder: A sure (superscalar and reconfigurable) processor. J. Supercomput. 9, 3, 231--252. Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. Janssen, M., Catthoor, F., and de Man, H. 1996. A specification invariant technique for regularity improvement between flow-graph clusters. In Proceedings of the European Conference on Design and Test (EDTC’96). Google ScholarGoogle ScholarDigital LibraryDigital Library
  90. Jayaseelan, R., Liu, H., and Mitra, T. 2006. Exploiting forwarding to improve data bandwidth of instruction-set extensions. In Proceedings of the 43rd Annual Conference on Design Automation (DAC’06). 43--48. Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Kastner, R., Ogrenci-Memik, S., Bozorgzadeh, E., and Sarrafzadeh, M. 2001. Instruction generation for hybrid reconfigurable systems. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’01). 127--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. Kastner, R., Kaplan, A., Memik, S. O., and Bozorgzadeh, E. 2002. Instruction generation for hybrid reconfigurable systems. ACM Trans. Des. Automa. Electron. Syst. (TODAES) 7, 4, 605--627. Google ScholarGoogle ScholarDigital LibraryDigital Library
  93. Kavvadias, N. and Nikolaidis, S. 2005. Automated instruction-set extension of embedded processors with application to mpeg-4 video encoding. In Proceedings of the IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP’05). 140--145. Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. Kavvadias, N. and Nikolaidis, S. 2006. A flexible instruction generation framework for extending embedded processors. In Proceedings of the 13th IEEE Mediterranean Electrotechnical Conference (MELECON’06). 125--128.Google ScholarGoogle Scholar
  95. Keutzer, K., Malik, S., and Newton, A. R. 2002. From asic to asip: The next design discontinuity. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD’02). 84--90. Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. Lam, S.-K. and Srikanthan, T. 2009. Rapid design of area-efficient custom instructions for reconfigurable embedded processing. J. Syst. Archit. 55, 1, 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. Lam, S. K., Srikantham, T., and Clarke, C. T. 2006. Rapid generation of custom instructions using predefined dataflow structures. Microprocess. Microsyst. 30, 6, (Special Issue on FPGA’s), 355--366.Google ScholarGoogle Scholar
  98. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. 1997. Mediabench: A tool for evaluating and synthesizing multimedia and communicatons systems. In Proceedings of the 30th Annual ACM/IEEE International Symposium on Microarchitecture (MICRO’30). 330--335. Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. Lee, J.-E., Choi, K., and Dutt, N. 2002. Efficient instruction encoding for automatic instruction set design of configurable asips. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’02). 649--654. Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. Lee, J.-E., Choi, K., and Dutt, N. D. 2003a. Energy-efficient instruction set synthesis for application-specific processors. In Proceedings of the 2003 International Symposium on Low Power Electronics and Design (ISLPED’03). 330--333. Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. Lee, J.-E., Choi, K., and Dutt, N. D. 2003b. An algorithm for mapping loops onto coarse-grained reconfigurable architectures. In Proceedings of the ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems (LCTES’03). 183--188. Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. Lee, J.-E., Choi, K., and Dutt, N. D. 2007. Instruction set synthesis with efficient instruction encoding for configurable processors. ACM Trans. Des. Autom. Electron. Syst. 12, 1, 8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  103. Leupers, R., Karuri, K., Kraemer, S., and Pandey, M. 2006. A design flow for configurable embedded processors based on optimized instruction set extension synthesis. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’06). European Design and Automation Association, 581--586. Google ScholarGoogle ScholarDigital LibraryDigital Library
  104. Li, X. Y., Stallmann, M. F., and Brglez, F. 2005. Effective bounding techniques for solving unate and binate covering problems. In Proceedings of the 42nd Annual Conference on Design Automation (DAC’05). 385--390. Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. Liao, S. and Devadas, S. 1997. Solving covering problems using lpr-based lower bounds. In Proceedings of the 34th Annual Conference on Design Automation (DAC’97). 117--120. Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. Liao, S., Devadas, S., Keutzer, K., and Tjiang, S. 1995. Instruction selection using binate covering for code size optimization. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’95). 393--399. Google ScholarGoogle ScholarDigital LibraryDigital Library
  107. Liao, S., Keutzer, K., Tjiang, S., and Devadas, S. 1998. A new viewpoint on code generation for directed acyclic graphs. ACM Trans. Design Automat. Electron. Syst. (TODAES) 3, 1, 51--75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  108. Liem, C., May, T., and Paulin, P. 1994. Instruction-set matching and selection for DSP and ASIP code generation. In Proceedings of the European Design and Test Conference (ED&TC). 31--37.Google ScholarGoogle Scholar
  109. Lin, S. and Kernighan, B. 1973. An effective heuristic algorithm for the traveling-salesman problem. Oper. Res. 21, 2, 498--516.Google ScholarGoogle ScholarDigital LibraryDigital Library
  110. Lu, G., Singh, H., Lee, M.-H., Bagherzadeh, N., Kurdahi, F. J., and Filho, E. M. C. 1999. The morphosys parallel reconfigurable system. In Proceedings of the 5th International Euro-Par Conference on Parallel Processing (Euro-Par’99). Springer, 727--734. Google ScholarGoogle ScholarDigital LibraryDigital Library
  111. Mei, B., Vernalde1, S., Verkest, D., Man, H. D., and Lauwereins, R. 2003. Adres: An architecture with tightly coupled vliw processor and coarse-grained reconfigurable matrix. In Proceedings of the International Conference on Field-Programmable Logic and Applications (FPL’03). Springer, 61--70.Google ScholarGoogle Scholar
  112. Messmer, B. T. and Bunke, H. 1995. Subgraph isomorphism in polynomial time. Tech. rep. IAM 95-003, University of Bern, Switzerland.Google ScholarGoogle Scholar
  113. Miyamori, T. and Olukotun, K. 1998. Remarc (abstract): Reconfigurable multimedia array coprocessor. In Proceedings of the ACM/SIGDA 6th International Symposium on Field Programmable Gate Arrays (FPGA’98). Google ScholarGoogle ScholarDigital LibraryDigital Library
  114. Moreano, N., Araujo, G., Huang, Z., and Malik, S. 2002. Datapath merging and interconnection sharing for reconfigurable architectures. In Proceedings of the 15th International Symposium on System Synthesis (ISSS’02). 38--43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. Niemann, R. and Marwedel, P. 1996. Hardware/software partitioning using integer programming. In Proceedings of the European Conference on Design and Test (EDTC9’6). Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. Niemann, R. and Marwedel, P. 1997. An algorithm for hardware/software partitioning using mixed integer linear programming. Des. Automat. Embedd. Syst. 2, 2, Special Issue: Partitioning Methods for Embedded Systems, 165--193.Google ScholarGoogle Scholar
  117. Peymandoust, A., Pozzi, L., Ienne, P., and Micheli, G. D. 2003. Automatic instruction set extension and utilization for embedded processors. In Proceedings of the 14th International Conference on Application-Specific Systems, Architectures and Processors (ASAP’03). 108--118.
  118. Pothineni, N., Kumar, A., and Paul, K. 2007. Application specific datapath extension with distributed i/o functional units. In Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference (VLSID’07). 551--558.
  119. Pozzi, L. 2000. Methodologies for the design of application-specific reconfigurable vliw processors. Ph.D. thesis, Politecnico di Milano, Milano, Italy.
  120. Pozzi, L. and Ienne, P. 2005. Exploiting pipelining to relax register-file port constraints of instruction-set extensions. In Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES’05). 2--10.
  121. Pozzi, L., Vuletić, M., and Ienne, P. 2001. Automatic topology-based identification of instruction-set extensions for embedded processors. Tech. rep. CS 01/377, EPFL, DI-LAP, Lausanne.
  122. Pozzi, L., Vuletić, M., and Ienne, P. 2002. Automatic topology-based identification of instruction-set extensions for embedded processors. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’02).
  123. Pozzi, L., Atasu, K., and Ienne, P. 2006a. Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans. Comput.-Aid. Des. Integra. Circ. Syst. 25, 7, 1209--1229.
  124. Pozzi, L., Atasu, K., and Ienne, P. 2006b. Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans. Comput.-Aid. Des. Integra. Circ. Syst. 25, 7, 1209--1229.
  125. Rabaey, J. 1997. Reconfigurable processing: The solution to low-power programmable dsp. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’97). vol. 1.
  126. Radunovic, B. and Milutinovic, V. M. 1998. A survey of reconfigurable computing architectures. In Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm (FPL’98). Springer, 376--385.
  127. Razdan, R., Brace, K. S., and Smith, M. D. 1994. PRISC software acceleration techniques. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computer & Processors (ICCD’94). 145--149.
  128. Razdan, R. and Smith, M. D. 1994. A high-performance microarchitecture with hardware-programmable functional units. In Proceedings of the 27th Annual International Symposium on Microarchitecture (MICRO-27). 172--180.
  129. Rupp, C. R., Landguth, M., Garverick, T., Gomersall, E., Holt, H., Arnold, J. M., and Gokhale, M. 1998. The napa adaptive processing architecture. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM’98).
  130. Sang, S., Li, X., and Ye, Y. 2005. Automatic instruction generation for application specific co-processor. In 6th International Conference On ASIC (ASICON’05). 934--938.
  131. Scharwaechter, H., Youn, J. M., Leupers, R., Paek, Y., Ascheid, G., and Meyr, H. 2007. A code-generator generator for multi-output instructions. In Proceedings of the 5th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’07). 131--136.
  132. Seto, K. and Fujita, M. 2008. Custom instruction generation with high-level synthesis. In Proceedings of the 2008 Symposium on Application Specific Processors (SASP). Anaheim, California, 14--19.
  133. Sreenivasa Rao, D. and Kurdahi, F. J. 1992. Partitioning by regularity extraction. In Proceedings of the 29th ACM/IEEE Conference on Design Automation (DAC’92). 235--238.
  134. Sreenivasa Rao, D. and Kurdahi, F. J. 1993a. Hierarchical design space exploration for a class of digital systems. IEEE Trans. Very Large Scale Integra. (VLSI) Syst. 1, 3, 282--295.
  135. Sreenivasa Rao, D. and Kurdahi, F. J. 1993b. On clustering for maximal regularity extraction. IEEE Trans. Comput.-Aid. Des. 12, 8, 1198--1208.
  136. Strozek, L. and Brooks, D. 2006. Efficient architectures through application clustering and architectural heterogeneity. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’06). 190--200.
  137. Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2002. Synthesis of custom processors based on extensible platforms. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’02). 641--648.
  138. Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2003. A scalable application-specific processor synthesis methodology. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’03).
  139. Sun, F., Ravi, S., Raghunathan, A., and Jha, N. K. 2004. Custom-instruction synthesis for extensible processor platform. IEEE Trans. Comput.-Aid. Des. Integra. Circ. Syst. 23, 2, 216--228.
  140. Todman, T., Constantinides, G., Wilton, S., Mencer, O., Luk, W., and Cheung, P. 2005. Reconfigurable computing: Architectures and design methods. IEE Proc. - Comput. Digital Tech. 152, 2, 193--207.
  141. Van Praet, J., Goossens, G., Lanneer, D., and De Man, H. 1994. Instruction set definition and instruction selection for asips. In Proceedings of the 7th International Symposium on High-level Synthesis (ISSS’94). 11--16.
  142. Vassiliadis, S. and Soudris, D., Eds. 2007. Fine- and Coarse-Grain Reconfigurable Computing. Springer.
  143. Vassiliadis, S., Wong, S., and Cotofana, S. 2001. The molen ϱμ-coded processor. In Proceedings of the 11th International Conference on Field-Programmable Logic and Applications (FPL’01). Springer-Verlag, London, UK, 275--285.
  144. Vassiliadis, S., Wong, S., Gaydadjiev, G., Bertels, K., Kuzmanov, G., and Moscu Panainte, E. 2004. The molen polymorphic processor. IEEE Trans. Comput. 53, 11, 1363--1375.
  145. Vassiliadis, N., Kavvadias, N., Theodoridis, G., and Nikolaidis, S. 2006. A risc architecture extended by an efficient tightly coupled reconfigurable unit. Int. J. Electron. 93, 6, 421--438.
  146. Vassiliadis, N., Theodoridis, G., and Nikolaidis, S. 2007. Enhancing a reconfigurable instruction set processor with partial predication and virtual opcode support. In Proceedings of the 2nd International Workshop on Applied Reconfigurable Computing (ARC’06). Lecture Notes in Computer Science, vol. 3985. Springer, 217--229.
  147. Verma, A. K., Atasu, K., Vuletić, M., Pozzi, L., and Ienne, P. 2002. Automatic application-specific instruction-set extensions under microarchitectural constraints. In Proceedings of the 1st Workshop on Application Specific Processors (WASP-1).
  148. Verma, A. K., Brisk, P., and Ienne, P. 2007. Rethinking custom ise identification: A new processor-agnostic method. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’07). 125--134.
  149. Wang, A., Killian, E., Maydan, D., and Rowen, C. 2001. Hardware/software instruction set configurability for system-on-chip processors. In Proceedings of the 38th Conference on Design Automation (DAC’01). 184--188.
  150. Wazlowski, M., Agarwal, L., Lee, T., Smith, A., Lam, E., Athanas, P., Silverman, H., and Ghosh, S. 1993. Prism-ii compiler and architecture. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines. 9--16.
  151. Wirthlin, M. J. and Hutchings, B. L. 1995. Disc: The dynamic instruction set computer. In Proceedings of the International Society of Optical Engineering SPIE. Field Programmable Gate Arrays (FPGAs) for Fast Board Development and Reconfigurable Computing. vol. 2607. 92--103.
  152. Wittig, R. and Chow, P. 1996. OneChip: An FPGA processor with reconfigurable logic. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines. 126--135.
  153. Wittig, R. D. 1995. Onechip: An fpga processor with reconfigurable logic. M.S. thesis, Department of Electrical and Computer Engineering, University of Toronto.
  154. Wolinski, C. and Kuchcinski, K. 2007. Identification of application specific instructions based on sub-graph isomorphism constraints. In Proceedings of the IEEE International Application-specific Systems, Architectures and Processors. 328--333.
  155. Wolinski, C. and Kuchcinski, K. 2008. Automatic selection of application-specific reconfigurable processor extensions. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’08). 1214--1219.
  156. Wong, S., Vassiliadis, S., and Cotofana, S. 2007. Instruction set extension generation with considering physical constraints. In Proceedings of the International Conference on High Performance Embedded Architectures and Compilers. 291--305.
  157. Ye, Z. A., Moshovos, A., Hauck, S., and Banerjee, P. 2000. CHIMAERA: A high-performance architecture with a tightly-coupled reconfigurable functional unit. In ACM SIGARCH Comput. Archit. News (Special Issue: Proceedings of the 27th annual international symposium on Computer architecture ISCA), 225--235.
  158. Yu, P. and Mitra, T. 2004. Scalable custom instructions identification for instruction-set extensible processors. In Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’04). 69--78.
  159. Yu, P. and Mitra, T. 2005. Satisfying real-time constraints with custom instructions. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS’05). 166--171.
  160. Yu, P. and Mitra, T. 2007. Disjoint pattern enumeration for custom instructions identification. In Proceedings of the 17th IEEE International Conference on Field Programmable Logic and Applications (FPL’07). Amsterdam, The Netherlands, --.
  161. Zhao, K., Bian, J., Dong, S., Song, Y., and Goto, S. 2008. Fast custom instruction identification algorithm based on basic convex pattern model for supporting asip automated design. IEICE Trans. Fundam. Electron. Comm. Comput. Sci. E91-A, 6, 1478--1487.
