Post-pass binary adaptation for software-based speculative precomputation
AUTHORS
Steve S.W. Liao
Perry H. Wang
Hong Wang
Gerolf Hoflehner
Daniel Lavery
John P. Shen
CITED BY: 44 citations
PUBLICATION

Proceeding
Title: PLDI '02 Proceedings of the ACM SIGPLAN 2002 conference on Programming language design and implementation
General Chairs: Jens Knoop, Universität Dortmund, Germany
Program Chairs: Laurie J. Hendren, McGill University, Canada
Pages: 117-128
Publication Date: 2002-06-17
Sponsor: SIGPLAN, ACM Special Interest Group on Programming Languages
Publisher: ACM, New York, NY, USA ©2002
ISBN: 1-58113-463-0 Order Number: 548020 doi>10.1145/512529.512544
Conference: PLDI, Programming Language Design and Implementation
Paper Acceptance Rate: 28 of 169 submissions, 17%
Overall Acceptance Rate: 711 of 3,503 submissions, 20%

Newsletter
Title: ACM SIGPLAN Notices
Volume 37 Issue 5, May 2002
Pages: 117-128
Publication Date: 2002-05-17
Sponsor: SIGPLAN, ACM Special Interest Group on Programming Languages
Publisher: ACM, New York, NY, USA
ISSN: 0362-1340 EISSN: 1558-1160 doi>10.1145/543552.512544
Table of Contents

SESSION: Type Systems
Oege de Moor

Flow-sensitive type qualifiers
Jeffrey S. Foster, Tachio Terauchi, Alex Aiken
Pages: 1-12
doi>10.1145/512529.512531
We present a system for extending standard type systems with flow-sensitive type qualifiers. Users annotate their programs with type qualifiers, and inference checks that the annotations are correct. In our system only the type qualifiers are modeled ...

Adoption and focus: practical linear types for imperative programming
Manuel Fahndrich, Robert DeLine
Pages: 13-24
doi>10.1145/512529.512532
A type system with linearity is useful for checking software protocols and resource management at compile time. Linearity provides powerful reasoning about state changes, but at the price of restrictions on aliasing. The hard division between linear and ...
SESSION: Register Allocation and Value Numbering
Rajiv Gupta

Fast copy coalescing and live-range identification
Zoran Budimlic, Keith D. Cooper, Timothy J. Harvey, Ken Kennedy, Timothy S. Oberg, Steven W. Reeves
Pages: 25-32
doi>10.1145/512529.512534
This paper presents a fast new algorithm for modeling and reasoning about interferences for variables in a program without constructing an interference graph. It then describes how to use this information to minimize copy insertion for φ-node instantiation ...

Preference-directed graph coloring
Akira Koseki, Hideaki Komatsu, Toshio Nakatani
Pages: 33-44
doi>10.1145/512529.512535
This paper describes a new framework of register allocation based on Chaitin-style coloring. Our focus is on maximizing the chances for live ranges to be allocated to the most preferred registers while not destroying the colorability obtained by graph ...

A sparse algorithm for predicated global value numbering
Karthik Gargi
Pages: 45-56
doi>10.1145/512529.512536
This paper presents a new algorithm for performing global value numbering on a routine in static single assignment form. Our algorithm has all the strengths of the most powerful existing practical methods of global value numbering; it unifies optimistic ...
SESSION: Program Correctness
Rita Loogen

ESP: path-sensitive program verification in polynomial time
Manuvir Das, Sorin Lerner, Mark Seigle
Pages: 57-68
doi>10.1145/512529.512538
In this paper, we present a new algorithm for partial program verification that runs in polynomial time and space. We are interested in checking that a program satisfies a given temporal safety property. Our insight is that by accurately modeling only ...

A system and language for building system-specific, static analyses
Seth Hallem, Benjamin Chelf, Yichen Xie, Dawson Engler
Pages: 69-82
doi>10.1145/512529.512539
This paper presents a novel approach to bug-finding analysis and an implementation of that approach. Our goal is to find as many serious bugs as possible. To do so, we designed a flexible, easy-to-use extension language for specifying analyses and an ...

Deriving specialized program analyses for certifying component-client conformance
G. Ramalingam, Alex Warshavsky, John Field, Deepak Goyal, Mooly Sagiv
Pages: 83-94
doi>10.1145/512529.512540
We are concerned with the problem of statically certifying (verifying) whether the client of a software component conforms to the component's constraints for correct usage. We show how conformance certification can be efficiently carried out in ...
SESSION: Profiling and Speculation
Barbara Ryder

Profile-guided code compression
Saumya Debray, William Evans
Pages: 95-105
doi>10.1145/512529.512542
As computers are increasingly used in contexts where the amount of available memory is limited, it becomes important to devise techniques that reduce the memory footprint of application programs while leaving them in an executable form. This paper describes ...

Profile-directed optimization of event-based programs
Mohan Rajagopalan, Saumya K. Debray, Matti A. Hiltunen, Richard D. Schlichting
Pages: 106-116
doi>10.1145/512529.512543
Events are used as a fundamental abstraction in programs ranging from graphical user interfaces (GUIs) to systems for building customized network protocols. While providing a flexible structuring and execution paradigm, events have the potentially serious ...

Post-pass binary adaptation for software-based speculative precomputation
Steve S.W. Liao, Perry H. Wang, Hong Wang, Gerolf Hoflehner, Daniel Lavery, John P. Shen
Pages: 117-128
doi>10.1145/512529.512544
Recently, a number of thread-based prefetching techniques have been proposed. These techniques aim at improving the latency of single-threaded applications by leveraging multithreading resources to perform memory prefetching via speculative prefetch ...
SESSION: Garbage Collection
David Detlefs

A parallel, incremental and concurrent GC for servers
Yoav Ossia, Ori Ben-Yitzhak, Irit Goft, Elliot K. Kolodner, Victor Leikehman, Avi Owshanko
Pages: 129-140
doi>10.1145/512529.512546
Multithreaded applications with multi-gigabyte heaps running on modern servers provide new challenges for garbage collection (GC). The challenges for "server-oriented" GC include: ensuring short pause times on a multi-gigabyte heap, while minimizing ...

Combining region inference and garbage collection
Niels Hallenberg, Martin Elsman, Mads Tofte
Pages: 141-152
doi>10.1145/512529.512547
This paper describes a memory discipline that combines region-based memory management and copying garbage collection by extending Cheney's copying garbage collection algorithm to work with regions. The paper presents empirical evidence that region inference ...

Beltway: getting around garbage collection gridlock
Stephen M Blackburn, Richard Jones, Kathryn S. McKinley, J Eliot B Moss
Pages: 153-164
doi>10.1145/512529.512548
We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or ...
SESSION: Hardware-Conscious Optimizations
Yanhong Annie Liu

A compiler approach to fast hardware design space exploration in FPGA-based systems
Byoungro So, Mary W. Hall, Pedro C. Diniz
Pages: 165-176
doi>10.1145/512529.512550
The current practice of mapping computations to custom hardware implementations requires programmers to assume the role of hardware designers. In tuning the performance of their hardware implementation, designers manually apply loop transformations such ...

Space-time trade-off optimization for a class of electronic structure calculations
Daniel Cociorva, Gerald Baumgartner, Chi-Chung Lam, P. Sadayappan, J. Ramanujam, Marcel Nooijen, David E. Bernholdt, Robert Harrison
Pages: 177-186
doi>10.1145/512529.512551
The accurate modeling of the electronic structure of atoms and molecules is very computationally intensive. Many models of electronic structure, such as the Coupled Cluster approach, involve collections of tensor contractions. There are usually a large ...

Effective sign extension elimination
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatani
Pages: 187-198
doi>10.1145/512529.512552
Computer designs are shifting from 32-bit architectures to 64-bit architectures, while most of the programs available today are still designed for 32-bit architectures. Java™, for example, specifies the frequently used "int" as a 32-bit data type. ...
SESSION: Dynamic Prefetching & Cache Optimizations
Michal Cierniak

Dynamic hot data stream prefetching for general-purpose programs
Trishul M. Chilimbi, Martin Hirzel
Pages: 199-209
doi>10.1145/512529.512554
Prefetching data ahead of use has the potential to tolerate the growing processor-memory performance gap by overlapping long latency memory accesses with useful computation. While sophisticated prefetching techniques have been automated for limited ...

Efficient discovery of regular stride patterns in irregular programs and its use in compiler prefetching
Youfeng Wu
Pages: 210-221
doi>10.1145/512529.512555
Irregular data references are difficult to prefetch, as the future memory address of a load instruction is hard to anticipate by a compiler. However, recent studies as well as our experience indicate that some important load instructions in irregular ...

Static load classification for improving the value predictability of data-cache misses
Martin Burtscher, Amer Diwan, Matthias Hauswirth
Pages: 222-233
doi>10.1145/512529.512556
While caches are effective at avoiding most main-memory accesses, the few remaining memory references are still expensive. Even one cache miss per one hundred accesses can double a program's execution time. To better tolerate the data-cache miss latency, ...
SESSION: Analysis of Object-Oriented Programs
Jan Vitek

Extended static checking for Java
Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, Raymie Stata
Pages: 234-245
doi>10.1145/512529.512558
Software development and maintenance are costly endeavors. The cost can be reduced if more software defects are detected earlier in the development cycle. This paper introduces the Extended Static Checker for Java (ESC/Java), an experimental compile-time ...

Using data groups to specify and check side effects
K. Rustan M. Leino, Arnd Poetzsch-Heffter, Yunhong Zhou
Pages: 246-257
doi>10.1145/512529.512559
Reasoning precisely about the side effects of procedure calls is important to many program analyses. This paper introduces a technique for specifying and statically checking the side effects of methods in an object-oriented language. The technique uses ...

Efficient and precise datarace detection for multithreaded object-oriented programs
Jong-Deok Choi, Keunwoo Lee, Alexey Loginov, Robert O'Callahan, Vivek Sarkar, Manu Sridharan
Pages: 258-269
doi>10.1145/512529.512560
We present a novel approach to dynamic datarace detection for multithreaded object-oriented programs. Past techniques for on-the-fly datarace detection either sacrificed precision for performance, leading to many false positive datarace reports, or maintained ...
SESSION: Language Design & Implementation Issues
Andrew C. Myers

Maya: multiple-dispatch syntax extension in Java
Jason Baker, Wilson C. Hsieh
Pages: 270-281
doi>10.1145/512529.512562
We have designed and implemented Maya, a version of Java that allows programmers to extend and reinterpret its syntax. Maya generalizes macro systems by treating grammar productions as generic functions, and semantic actions on productions as multimethods ...

Region-based memory management in cyclone
Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, James Cheney
Pages: 282-293
doi>10.1145/512529.512563
Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory ...

MaJIC: compiling MATLAB for speed and responsiveness
George Almási, David Padua
Pages: 294-303
doi>10.1145/512529.512564
This paper presents and evaluates techniques to improve the execution performance of MATLAB. Previous efforts concentrated on source to source translation and batch compilation; MaJIC provides an interactive frontend that looks like MATLAB and ...
SESSION: High Performance & Real-Time Issues
Charles Consel

Denali: a goal-directed superoptimizer
Rajeev Joshi, Greg Nelson, Keith Randall
Pages: 304-314
doi>10.1145/512529.512566
This paper provides a preliminary report on a new research project that aims to construct a code generator that uses an automatic theorem prover to produce very high-quality (in fact, nearly mathematically optimal) machine code for modern architectures. ...

The embedded machine: predictable, portable real-time code
Thomas A. Henzinger, Christoph M. Kirsch
Pages: 315-326
doi>10.1145/512529.512567
The Embedded Machine is a virtual machine that mediates in real time the interaction between software processes and physical processes. It separates the compilation of embedded programs into two phases. The first, platform-independent compiler phase ...