Abstract
The large number of embedded soft core processors available today makes it tedious and time-consuming to select the best processor for a given application. The task is even more challenging because a single soft core processor offers numerous configuration options, which must be chosen while optimizing for conflicting design requirements such as performance and area. In this article, we propose a generic framework for rapid performance estimation of applications on soft core processors. The proposed technique scales to the large number of configuration options available in modern soft core processors by relying on rapid and accurate estimation models instead of time-consuming FPGA synthesis and execution-based techniques. Experimental results on two leading commercial soft core processors executing applications from the widely used CHStone benchmark suite show an average error of less than 6%, with estimation running on the order of minutes compared with the hours taken by synthesis-based techniques.
Framework for Rapid Performance Estimation of Embedded Soft Core Processors