Architecture-Aware Real-Time Compression of Execution Traces

Published: 09 September 2015

Abstract

In recent years, on-chip trace generation has been recognized as a solution to the debugging of increasingly complex software. An execution trace can be seen as the most fundamentally useful type of trace, allowing the execution path of software to be determined post hoc. However, the bandwidth required to output such a trace can be excessive. Our architecture-aware trace compression (AATC) scheme adds an on-chip branch predictor and branch target buffer to reduce the volume of execution trace data in real time through on-chip compression. Novel redundancy reduction strategies are employed, most notably in exploiting the widespread use of linked branches and the compiler-driven movement of return addresses between link register, stack, and program counter. These strategies reduce the volume of branch target addresses by 52%, and other algorithmic improvements decrease trace volume further. An analysis of spatial and temporal redundancy in the trace stream allows encoding strategies to be compared, so that compression performance can be increased systematically. A combination of differential, Fibonacci, VarLen, and Move-to-Front encodings is chosen to produce two compressor variants: a performance-focused xAATC that encodes 56.5 instructions/bit using 24,133 gates and an area-efficient fAATC that encodes 48.1 instructions/bit using only 9,854 gates.
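
To make the encodings named in the abstract more concrete, the following is a minimal sketch of textbook differential, Fibonacci, and Move-to-Front coding applied to a short list of branch target addresses. It is not the AATC implementation: the abstract does not say how the encodings are combined, so the function names, the zigzag mapping of signed differences to positive integers, and the example addresses are all assumptions made for illustration. VarLen encoding is omitted because the abstract does not define it.

```python
def delta_encode(addresses):
    """Differential encoding: keep the first value, then successive differences."""
    if not addresses:
        return []
    deltas = [addresses[0]]
    for prev, cur in zip(addresses, addresses[1:]):
        deltas.append(cur - prev)
    return deltas


def zigzag(d):
    """Map a signed integer to a positive one (assumed mapping, not from the paper).

    0 -> 1, -1 -> 2, 1 -> 3, -2 -> 4, ... so the result is valid Fibonacci input.
    """
    return 2 * d + 1 if d >= 0 else -2 * d


def fibonacci_encode(n):
    """Fibonacci code of a positive integer, returned as a bit string.

    The Zeckendorf representation is written smallest term first and is
    terminated by an extra '1', so every codeword ends in '11'.
    """
    if n < 1:
        raise ValueError("Fibonacci coding is defined for integers >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number larger than n
    bits = []
    remainder = n
    for f in reversed(fibs):  # greedy choice, largest term first
        if f <= remainder:
            bits.append("1")
            remainder -= f
        else:
            bits.append("0")
    return "".join(reversed(bits)) + "1"


def move_to_front_encode(symbols, table):
    """Move-to-Front: emit each symbol's index, then move it to the table front."""
    table = list(table)  # work on a copy so the caller's table is unchanged
    indices = []
    for s in symbols:
        idx = table.index(s)
        indices.append(idx)
        table.insert(0, table.pop(idx))
    return indices


if __name__ == "__main__":
    targets = [0x8000, 0x8004, 0x8000, 0x8010]  # hypothetical branch targets
    deltas = delta_encode(targets)
    codes = [fibonacci_encode(zigzag(d)) for d in deltas[1:]]
    print(deltas, codes)
    print(move_to_front_encode([0x8000, 0x8000, 0x9000, 0x8000],
                               [0x9000, 0x8000, 0xA000]))
```

Small differences and recently seen targets map to short codewords or small indices, which is the general reason such encodings suit trace streams with spatial and temporal redundancy.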

