
- Aust92a T. M. Austin and G. S. Sohi, "Dynamic Dependency Analysis of Ordinary Programs," in The 19th Annual International Symposium on Computer Architecture, Gold Coast, Australia, May 1992.
- Blan92a G. Blanck and S. Kreuger, "The SuperSPARC Microprocessor," in Compcon92, Los Alamitos, CA, 1992.
- Butl91a M. Butler, T.-Y. Yeh, Y. Patt, M. Alsup, H. Scales, and M. Shebanow, "Single Instruction Stream Parallelism Is Greater than Two," in The 18th Annual International Symposium on Computer Architecture, Toronto, Canada, May 1991.
- CRI88a CRI, "The CRAY Y-MP Series of Computer Systems," Cray Research Inc., Publication No. CCMP-0301, February 1988.
- Cybe90a G. Cybenko, L. Kipp, L. Pointer, and D. Kuck, "Supercomputer Performance Evaluation and the Perfect Benchmarks," CSRD Report No. 965, University of Illinois, March 1990.
- Iino92a H. Iino, H. Takahashi, T. Sukemura, M. Kimura, K. Fujita, and S. Mori, "A 289MFLOPS Single-Chip Supercomputer," in Proceedings of the International Solid-State Circuits Conference, February 1992.
- Joup89a N. P. Jouppi and D. W. Wall, "Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines," in ASPLOS-III, Boston, MA, April 1989.
- Joup89b N. P. Jouppi, "The Nonuniform Distribution of Instruction-Level and Machine Parallelism and its Effect on Performance," IEEE Transactions on Computers, vol. 38, December 1989.
- Manua Unix Programmer's Manual, "pixie command."
- McMa86a F. H. McMahon, The Livermore FORTRAN Kernels: A Computer Test of the Numerical Performance Range. Research Report: Lawrence Livermore Laboratories, December 1986.
- Nico84a A. Nicolau and J. A. Fisher, "Measuring the Parallelism Available for Very Long Instruction Word Architectures," IEEE Transactions on Computers, vol. C-33, November 1984.
- Oehl90a R. R. Oehler and R. D. Groves, "IBM RISC System/6000 processor architecture," IBM Journal of Research and Development, vol. 34, January 1990.
- Okam91a F. Okamoto, Y. Hagihara, C. Ohkubo, N. Nishi, H. Yamada, and T. Enomoto, "A 200-MFLOPS 100-MHz 64-b BiCMOS Vector-Pipelined Processor (VPP) ULSI," IEEE Journal of Solid-State Circuits, December 1991.
- Smit89a M. D. Smith, M. Johnson, and M. A. Horowitz, "Limits on Multiple Instruction Issue," in Proc. ASPLOS-III, Boston, MA, April 1989.
- Sohi89a G. S. Sohi and S. Vajapeyam, "Tradeoffs in Instruction Format Design For Horizontal Architectures," in ASPLOS-III, Boston, MA, April 1989.
- Vaja91a S. Vajapeyam, "Instruction-Level Characterization of the CRAY Y-MP Processor," Ph.D. Thesis (available as tr1086 via anonymous ftp from ftp.cs.wisc.edu), University of Wisconsin, Madison, WI, 1991.
- Vaja91b S. Vajapeyam, G. S. Sohi, and W.-C. Hsu, "An Empirical Study of the CRAY Y-MP Processor using the PERFECT Club Benchmarks," in The 18th Annual International Symposium on Computer Architecture, Toronto, Canada, May 1991.
- Wall91a D. W. Wall, "Limits of Instruction-Level Parallelism," in ASPLOS-IV, Santa Clara, CA, April 1991.
On the instruction-level characteristics of scalar code in highly-vectorized scientific applications