Abstract
Microprocessors designed using HW/SW codesign principles, such as Transmeta™ Efficeon™ and the soon-to-ship NVIDIA 64-bit Tegra® K1, use dynamic binary optimization to extract instruction-level parallelism. Many code optimizations are made significantly more effective through the use of alias speculation. The state-of-the-art alias speculation system, SMARQ, provides a 40% speedup on average over a system with no alias speculation. This performance, however, comes at the cost of introducing new alias registers and increased power consumption due to new checks for validating speculation. Consequently, improving the efficiency of alias speculation by reducing alias register requirements and rationalizing speculation validation checks is critical for the viability of SMARQ. This paper presents alias coalescing, a novel technique to significantly improve the efficiency of SMARQ through a synergistic combination of compiler and microarchitectural techniques. By using a more compact encoding of the memory access ranges of memory instructions, alias coalescing simultaneously reduces the alias register pressure in SMARQ by a geomean of 26.09% and 39.96%, and the dynamic alias checks by 20.73% and 33.87%, across the entire SPEC CINT2006 and SPEC CFP2006 suites respectively.
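To make the idea of a more compact range encoding concrete, the following is a minimal illustrative sketch, not the mechanism used in the paper. It assumes that each speculatively reordered memory instruction records a half-open byte range in one alias register, and that a later store is checked against every live register. The names `AccessRange`, `coalesce`, `count_checks`, and the `max_span` limit are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRange:
    """Half-open byte range [start, end) touched by one memory instruction."""
    start: int
    end: int

    def overlaps(self, other: "AccessRange") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.start < other.end and other.start < self.end


def coalesce(ranges, max_span):
    """Greedily merge ranges into covering ranges at most max_span bytes wide.

    Assumption for this sketch: one covering range occupies one alias register.
    """
    merged = []
    for r in sorted(ranges, key=lambda r: r.start):
        if merged and max(r.end, merged[-1].end) - merged[-1].start <= max_span:
            last = merged[-1]
            merged[-1] = AccessRange(last.start, max(last.end, r.end))
        else:
            merged.append(r)
    return merged


def count_checks(store, registers):
    """Return (checks performed, whether any register may alias the store).

    Every live register is counted as one check, as if compared in parallel.
    """
    conflict = any(store.overlaps(reg) for reg in registers)
    return len(registers), conflict


# Four hoisted loads: three neighbouring 8-byte accesses and one distant access.
loads = [
    AccessRange(0x1000, 0x1008),
    AccessRange(0x1008, 0x1010),
    AccessRange(0x1010, 0x1018),
    AccessRange(0x2000, 0x2008),
]
registers = coalesce(loads, max_span=32)
store = AccessRange(0x3000, 0x3008)
print(len(loads), count_checks(store, loads))
print(len(registers), count_checks(store, registers))
```

In this sketch a covering range is a superset of the ranges it replaces, so a true alias is never missed. The cost is possible false conflicts when the merged ranges leave gaps, which is why the example bounds the width of a covering range with `max_span`.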
Enabling Efficient Alias Speculation
LCTES'15: Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2015 CD-ROM