Abstract
Switching active memory banks at runtime allows a processor with a narrow address bus to access more memory than the bus can address directly. Switching code memory banks is regaining interest in microcontrollers for the Internet of Things (IoT), which have to run continuously growing software while consuming very little energy. To make use of bank switching, such software must be partitioned among the available banks and augmented with bank-switching instructions. The augmentation is done automatically by a compiler, whereas the partitioning is today normally done manually by programmers. However, since IoT software is cross-compiled on machines much more powerful than its target microcontrollers, it becomes possible to partition it automatically during compilation.
In this article, we thus study the problem of partitioning program code among banks such that the resulting runtime performance of the program is maximized. We prove that the problem is NP-hard and propose a heuristic algorithm with low complexity, which enables fast compilation and hence interactive software development. The algorithm decomposes the problem into three subproblems and introduces a heuristic for each of them: (1) which pieces of code to partition, (2) which of them to assign to permanently mapped banks, and (3) how to divide the remaining ones among switchable banks. We integrate the algorithm, together with earlier ones, in an open-source compiler and test the resulting solution on synthetic as well as actual commercial IoT software bases, thereby demonstrating its advantages and drawbacks. In particular, the results show that the performance of partitions produced by our algorithm comes close to that of partitions created manually by programmers with expert knowledge of the partitioned code.
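The three-way decomposition can be illustrated with a small sketch. It is not the article's algorithm: the abstract does not describe the actual heuristics, so the greedy rules, the cost model, and every name below (`partition`, `switch_cost`, `common_cap`, `bank_cap`) are assumptions made for illustration. The sketch treats every function as one unit to partition (subproblem 1), fills a single permanently mapped bank with the functions that carry the most call traffic per byte (subproblem 2), and places each remaining function in the switchable bank it exchanges the most calls with (subproblem 3). It further assumes that only calls between two different switchable banks require a bank switch.

```python
from collections import defaultdict


def partition(sizes, calls, common_cap, bank_cap, n_banks):
    """Illustrative greedy partitioning, not the article's heuristics.

    sizes: {function: size in bytes}, all sizes > 0
    calls: {(caller, callee): call count}
    Returns (common, banks): the set of functions in the permanently
    mapped bank and a list of sets, one per switchable bank.
    """
    # Subproblem 1 (assumed): every function is one unit to place.
    # Total call traffic touching each function.
    traffic = defaultdict(int)
    for (caller, callee), count in calls.items():
        traffic[caller] += count
        traffic[callee] += count

    # Subproblem 2 (assumed rule): fill the permanently mapped bank with
    # the functions that have the most call traffic per byte.
    common, free = set(), common_cap
    for f in sorted(sizes, key=lambda f: (-traffic[f] / sizes[f], f)):
        if sizes[f] <= free:
            common.add(f)
            free -= sizes[f]

    # Subproblem 3 (assumed rule): place the rest, largest first, in the
    # bank with which the function exchanges the most calls.
    banks = [set() for _ in range(n_banks)]
    room = [bank_cap] * n_banks
    rest = sorted((f for f in sizes if f not in common),
                  key=lambda f: (-sizes[f], f))
    for f in rest:
        def affinity(i):
            return sum(count for (a, b), count in calls.items()
                       if (a == f and b in banks[i])
                       or (b == f and a in banks[i]))

        fits = [i for i in range(n_banks) if sizes[f] <= room[i]]
        if not fits:
            raise ValueError(f"{f} does not fit in any switchable bank")
        # Ties are broken by free space, then by the lower bank index.
        best = max(fits, key=lambda i: (affinity(i), room[i], -i))
        banks[best].add(f)
        room[best] -= sizes[f]
    return common, banks


def switch_cost(calls, common, banks):
    """Count calls that cross between two different switchable banks."""
    where = {f: i for i, bank in enumerate(banks) for f in bank}
    return sum(count for (a, b), count in calls.items()
               if a in where and b in where and where[a] != where[b])
```

With five 10-byte functions, a 10-byte permanently mapped bank, and two 20-byte switchable banks, the sketch keeps the heavily connected pair `c` and `d` together and leaves only the five `main` to `c` calls crossing banks:

```python
sizes = {"main": 10, "a": 10, "b": 10, "c": 10, "d": 10}
calls = {("main", "a"): 5, ("main", "c"): 5, ("a", "b"): 100, ("c", "d"): 100}
common, banks = partition(sizes, calls, common_cap=10, bank_cap=20, n_banks=2)
print(common, banks, switch_cost(calls, common, banks))
```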
Index Terms
Efficient Automated Code Partitioning for Microcontrollers with Switchable Memory Banks