|
|
The Omega test: a fast and practical integer programming algorithm for dependence analysis |
| |
William Pugh
|
|
Pages: 4-13 |
|
doi>10.1145/125826.125848 |
|
|
|
|
|
|
Pointer target tracking—an empirical study |
| |
Jon Loeliger,
Robert Metzger,
Mark Seligman,
Sean Stroud
|
|
Pages: 14-23 |
|
doi>10.1145/125826.125856 |
|
|
|
|
|
|
On-the-fly detection of data races for programs with nested fork-join parallelism |
| |
John Mellor-Crummey
|
|
Pages: 24-33 |
|
doi>10.1145/125826.125861 |
|
|
|
|
|
|
Programming costs of explicit memory localization on a large scale shared memory multiprocessor |
| |
Silvio Picano,
Eugene D. Brooks, III,
Joseph E. Hoag
|
|
Pages: 36-45 |
|
doi>10.1145/125826.125864 |
|
|
|
|
|
|
A conflict-free memory design for multiprocessors |
| |
Honda Shing,
Lionel M. Ni
|
|
Pages: 46-55 |
|
doi>10.1145/125826.125868 |
|
|
|
|
|
|
A ultra fast Euclidean division algorithm for prime memory systems |
| |
Benoît Dupont de Dinechin
|
|
Pages: 56-65 |
|
doi>10.1145/125826.125874 |
|
|
|
|
|
|
A high school supercomputing challenge |
| |
Marion Cohen,
Marilyn Foster,
David Kratzer,
Patricia Malone,
Ann Solem
|
|
Pages: 68-75 |
|
doi>10.1145/125826.125878 |
|
|
|
|
|
|
Chaotic cardiac arrhythmias |
| |
Ramana Sadananda
|
|
Pages: 76-84 |
|
doi>10.1145/125826.125881 |
|
|
|
|
|
|
Compiler optimizations for Fortran D on MIMD distributed-memory machines |
| |
Seema Hiranandani,
Ken Kennedy,
Chau-Wen Tseng
|
|
Pages: 86-100 |
|
doi>10.1145/125826.125886 |
|
|
|
|
|
|
Compile-time generation of regular communications patterns |
| |
Charles Koelbel
|
|
Pages: 101-110 |
|
doi>10.1145/125826.125890 |
|
|
|
|
|
|
Tiling multidimensional iteration spaces for nonshared memory machines |
| |
J. Ramanujam,
P. Sadayappan
|
|
Pages: 111-120 |
|
doi>10.1145/125826.125893 |
|
|
|
|
|
|
A new approach for automatic parallelization of blocked linear algebra computations |
| |
H. T. Kung,
Jaspal Subhlok
|
|
Pages: 122-129 |
|
doi>10.1145/125826.125898 |
|
|
|
|
|
|
Wide format floating-point math libraries |
| |
Victoria Markstein,
Peter Markstein,
Tung Nguyen,
Steve Poole
|
|
Pages: 130-138 |
|
doi>10.1145/125826.125903 |
|
|
|
|
|
|
Distributing the comparison of DNA and protein sequences across heterogeneous supercomputers |
| |
Hugh Nicholas,
Grace Giras,
Vasiliki Hartonas-Garmhausen,
Michael Kopko,
Christopher Maher,
Alexander Ropelewski
|
|
Pages: 139-146 |
|
doi>10.1145/125826.125911 |
|
|
|
|
|
|
Panel: parallel computing in the undergraduate computer science curriculum |
| |
Nan C. Schaller
|
|
Page: 148 |
|
doi>10.1145/125826.125919 |
|
|
|
|
|
|
A performance comparison of three supercomputers: Fujitsu VP-2600, NEC SX-3, and CRAY Y-MP |
| |
Margaret L. Simmons,
Harvey J. Wasserman,
Olaf M. Lubeck,
Christopher Eoyang,
Raul Mendez,
Hiroo Harada,
Misako Ishiguro
|
|
Pages: 150-157 |
|
doi>10.1145/125826.125924 |
|
|
|
|
|
|
The NAS parallel benchmarks—summary and preliminary results |
| |
D. H. Bailey,
E. Barszcz,
J. T. Barton,
D. S. Browning,
R. L. Carter,
L. Dagum,
R. A. Fatoohi,
P. O. Frederickson,
T. A. Lasinski,
R. S. Schreiber,
H. D. Simon,
V. Venkatakrishnan,
S. K. Weeratunga
|
|
Pages: 158-165 |
|
doi>10.1145/125826.125925 |
|
|
|
|
|
|
Performance results for two of the NAS parallel benchmarks |
| |
David H. Bailey,
Paul O. Frederickson
|
|
Pages: 166-173 |
|
doi>10.1145/125826.125930 |
|
|
|
|
|
|
An effective on-chip preloading scheme to reduce data access penalty |
| |
Jean-Loup Baer,
Tien-Fu Chen
|
|
Pages: 176-186 |
|
doi>10.1145/125826.125932 |
|
|
|
|
|
|
Using lookahead to reduce memory bank contention for decoupled operand references |
| |
Peter L. Bird,
Richard A. Uhlig
|
|
Pages: 187-196 |
|
doi>10.1145/125826.125938 |
|
|
|
|
|
|
Delayed consistency and its effects on the miss rate of parallel programs |
| |
Michel Dubois,
Jin Chin Wang,
Luiz A. Barroso,
Kangwoo Lee,
Yung-Syau Chen
|
|
Pages: 197-206 |
|
doi>10.1145/125826.125941 |
|
|
|
|
|
|
Architecture-independent scientific programming in data parallel C: three case studies |
| |
Philip J. Hatcher,
Michael J. Quinn,
Ray J. Anderson,
Anthony J. Lapadula,
Bradley K. Seevers,
Andrew F. Bennett
|
|
Pages: 208-217 |
|
doi>10.1145/125826.125945 |
|
|
|
|
|
|
Solution functions of PDEQSOL (Partial differential EQuation SOlver language) for fluid problems |
| |
Hiroyuki Hirayama,
Miiko Ikeda,
Nobutoshi Sagawa
|
|
Pages: 218-227 |
|
doi>10.1145/125826.125950 |
|
|
|
|
|
|
Computing turbulent flow in complex geometries on a massively parallel processor |
| |
James A. Sethian,
Jean-Philippe Brunet,
Adam Greenberg,
Jill P. Mesirov
|
|
Pages: 230-241 |
|
doi>10.1145/125826.125954 |
|
|
|
|
|
|
A lattice Boltzmann method for a two-dimensional viscous Burgers equation: computational results |
| |
Bracy H. Elton
|
|
Pages: 242-252 |
|
doi>10.1145/125826.125961 |
|
|
|
|
|
|
Distribution of a climate model across high-speed networks |
| |
Carlos R. Mechoso,
Chung-Chun Ma,
John D. Farrara,
Joseph A. Spahr
|
|
Pages: 253-260 |
|
doi>10.1145/125826.125965 |
|
|
|
|
|
|
Retire Fortran? A debate rekindled |
| |
David Cann
|
|
Pages: 264-272 |
|
doi>10.1145/125826.125976 |
|
|
|
|
|
|
Object oriented parallel programming: experiments and results |
| |
J. K. Lee,
D. Gannon
|
|
Pages: 273-282 |
|
doi>10.1145/125826.105186 |
|
|
|
|
|
|
High level support for divide-and-conquer parallelism |
| |
Attila Gürsoy,
L. V. Kalé
|
|
Pages: 283-292 |
|
doi>10.1145/125826.125985 |
|
|
|
|
|
|
Vector/parallel implementation of a porous media flow code |
| |
R. Ewing,
P. O'Leary,
J. Sochacki
|
|
Pages: 294-303 |
|
doi>10.1145/125826.125997 |
|
|
|
|
|
|
High performance vector processing in reservoir simulation |
| |
L. C. Young,
S. E. Zarantonello
|
|
Pages: 304-315 |
|
doi>10.1145/125826.126000 |
|
|
|
|
|
|
Seismic modeling at 14 gigaflops on the Connection Machine |
| |
Jacek Myczkowski,
Guy Steele
|
|
Pages: 316-326 |
|
doi>10.1145/125826.126004 |
|
|
|
|
|
|
Gordon Bell prize lectures |
| |
J. J. Dongarra,
A. Karp,
K. Miura,
H. D. Simon
|
|
Pages: 328-337 |
|
doi>10.1145/125826.126011 |
|
|
|
|
|
|
Compiler parallelization of an elliptic grid generator for 1990 Gordon Bell prize |
| |
Gary Sabot,
Lisa Tennies,
Alex Vasilevsky,
Richard Shapiro
|
|
Pages: 338-346 |
|
doi>10.1145/125826.126020 |
|
|
|
|
|
|
GPFP: an array processing element for the next generation of massively parallel supercomputer architectures |
| |
Don Beal,
Costas Lambrinoudakis
|
|
Pages: 348-357 |
|
doi>10.1145/125826.126024 |
|
|
|
|
|
|
A new parallel architecture for sparse matrix computation based on finite projective geometries |
| |
Narendra Karmarkar
|
|
Pages: 358-369 |
|
doi>10.1145/125826.126029 |
|
|
|
|
|
|
Time multiplexed optical computers |
| |
Harry F. Jordan,
Vincent P. Heuring
|
|
Pages: 370-378 |
|
doi>10.1145/125826.126033 |
|
|
|
|
|
|
Universal multistage networks via linear permutations |
| |
Charles M. Fiduccia,
Elaine M. Jacobson
|
|
Pages: 380-389 |
|
doi>10.1145/125826.126038 |
|
|
|
|
|
|
Design and analysis of efficient hierarchical interconnection networks |
| |
Sizheng Wei,
Saul Levy
|
|
Pages: 390-399 |
|
doi>10.1145/125826.126043 |
|
|
|
|
|
|
Alleviation of tree saturation in multistage interconnection networks |
| |
Matthew Farrens,
Brad Wetmore,
Allison Woodruff
|
|
Pages: 400-409 |
|
doi>10.1145/125826.126047 |
|
|
|
|
|
|
An evaluation of automatic and interactive parallel programming tools |
| |
Doreen Y. Cheng,
Douglas M. Pase
|
|
Pages: 412-423 |
|
doi>10.1145/125826.126052 |
|
|
|
|
|
|
Interprocedural transformations for parallel code generation |
| |
Mary W. Hall,
Ken Kennedy,
Kathryn S. McKinley
|
|
Pages: 424-434 |
|
doi>10.1145/125826.126055 |
|
|
|
|
|
|
Graphical development tools for network-based concurrent supercomputing |
| |
Adam Beguelin,
Jack J. Dongarra
|
|
Pages: 435-444 |
|
doi>10.1145/125826.126059 |
|
|
|
|
|
|
Synthetic aperture radar image processing on parallel supercomputers |
| |
Steve Plimpton,
Gary Mastin,
Dennis Ghiglia
|
|
Pages: 446-452 |
|
doi>10.1145/125826.126062 |
|
|
|
|
|
|
The auditorialization of scientific information |
| |
Robert S. Hotchkiss,
Cheryl L. Wampler
|
|
Pages: 453-461 |
|
doi>10.1145/125826.126064 |
|
|
|
|
|
|
Parallel approaches to short range molecular dynamics simulations |
| |
Pablo Tamayo,
Jill P. Mesirov,
Bruce M. Boghosian
|
|
Pages: 462-470 |
|
doi>10.1145/125826.126067 |
|
|
|
|
|
|
Visualizing the behavior of massively parallel programs |
| |
Mark Friedell,
Mark LaPolla,
Sandeep Kochhar,
Steve Sistare,
Janusz Juda
|
|
Pages: 472-480 |
|
doi>10.1145/125826.126069 |
|
|
|
|
|
|
Performance debugging shared memory multiprocessor programs with MTOOL |
| |
Aaron J. Goldberg,
John L. Hennessy
|
|
Pages: 481-490 |
|
doi>10.1145/125826.126075 |
|
|
|
|
|
|
Graphical animation of parallel Fortran programs |
| |
Sue Utter-Honig,
Cherri M. Pancake
|
|
Pages: 491-499 |
|
doi>10.1145/125826.126079 |
|
|
|
|
|
|
Scheduling parallel programs with non-uniform parallelism profiles |
| |
Yao-Jen Chang,
Jean-Lien C. Wu,
Jingshown Wu
|
|
Pages: 502-511 |
|
doi>10.1145/125826.126083 |
|
|
|
|
|
|
Intelligent mapping of communicating processes in distributed computing systems |
| |
Arthur Ieumwananonthachai,
Akiko N. Aizawa,
Steven R. Schwartz,
Benjamin W. Wah,
Jerry C. Yan
|
|
Pages: 512-521 |
|
doi>10.1145/125826.126087 |
|
|
|
|
|
|
Load balancing by function distribution on the EM-4 prototype |
| |
Y. Kodama,
S. Sakai,
Y. Yamaguchi
|
|
Pages: 522-531 |
|
doi>10.1145/125826.126091 |
|
|
|
|
|
|
Gigascale integration (GSI) technology |
| |
James D. Meindl
|
|
Pages: 534-538 |
|
doi>10.1145/125826.126094 |
|
|
|
|
|
|
Exploration geophysics, parallel computing and reality |
| |
Dennis E. Willen
|
|
Page: 540 |
|
doi>10.1145/125826.126098 |
|
|
|
|
|
|
Application issues for large scale reservoir simulation on massively parallel computers |
| |
Jeffrey M. Rutledge,
David R. Jones,
Wen H. Chen,
Ernest Y. Chung
|
|
Page: 541 |
|
doi>10.1145/125826.126101 |
|
|
|
|
|
|
Large scale reservoir simulation in the concurrent processing milieu |
| |
R. P. Kendall,
J. R. Wallis,
J. A. Foster,
J. S. Nolen
|
|
Page: 542 |
|
doi>10.1145/125826.126103 |
|
|
|
|
|
|
Vectorizing C compilers: how good are they? |
| |
Lauren L. Smith
|
|
Pages: 544-553 |
|
doi>10.1145/125826.126105 |
|
|
|
|
|
|
Characterizing memory hot spots in a shared memory MIMD machine |
| |
Raymond R. Glenn,
Daniel V. Pryor,
John M. Conroy,
Theodore Johnson
|
|
Pages: 554-566 |
|
doi>10.1145/125826.126132 |
|
|
|
|
|
|
Input/output behavior of supercomputing applications |
| |
Ethan L. Miller,
Randy H. Katz
|
|
Pages: 567-576 |
|
doi>10.1145/125826.126133 |
|
|
|
|
|
|
Towards efficient parallel implementation of the CG method applied to a class of block tridiagonal linear systems |
| |
A. T. Chronopoulos
|
|
Pages: 578-587 |
|
doi>10.1145/125826.126134 |
|
|
|
|
|
|
PILS: an iterative linear solver package for ill-conditioned systems |
| |
C. Pommerell,
W. Fichtner
|
|
Pages: 588-599 |
|
doi>10.1145/125826.126135 |
|
|
|
|
|
|
Threshold pivoting for dense LU factorization on distributed memory multiprocessors |
| |
Joel Malard
|
|
Pages: 600-607 |
|
doi>10.1145/125826.126136 |
|
|
|
|
|
|
Factoring: a practical and robust method for scheduling parallel loops |
| |
Susan Flynn Hummel,
Edith Schonberg,
Lawrence E. Flynn
|
|
Pages: 610-632 |
|
doi>10.1145/125826.126137 |
|
|
|
|
|
|
A fast static scheduling algorithm for DAGs on an unbounded number of processors |
| |
Tao Yang,
Apostolos Gerasoulis
|
|
Pages: 633-642 |
|
doi>10.1145/125826.126138 |
|
|
|
|
|
|
Time-division optical communications in multiprocessor arrays |
| |
Chunming Qiao,
Rami G. Melhem
|
|
Pages: 644-653 |
|
doi>10.1145/125826.126139 |
|
|
|
|
|
|
Fully-adaptive routing: packet switching performance and wormhole algorithms |
| |
S. A. Felperin,
L. Gravano,
G. D. Pifarré,
J. L. C. Sanz
|
|
Pages: 654-663 |
|
doi>10.1145/125826.126141 |
|
|
|
|
|
|
Network-based multicomputers: an emerging parallel architecture |
| |
H. T. Kung,
Robert Sansom,
Steven Schlick,
Peter Steenkiste,
Matthieu Arnould,
Francois J. Bitz,
Fred Christianson,
Eric C. Cooper,
Onat Menzilcioglu,
Denise Ombres,
Brian Zill
|
|
Pages: 664-673 |
|
doi>10.1145/125826.126144 |
|
|
|
|
|
|
Computing climate change: can we beat nature? |
| |
Robert C. Malone,
Robert Chervin,
Richard Smith,
William P. Dannevik,
John Drake
|
|
Page: 676 |
|
doi>10.1145/125826.126148 |
|
|
|
|
|
|
Climate modeling with parallel vector supercomputers |
| |
Robert Chervin
|
|
Page: 677 |
|
doi>10.1145/125826.126149 |
|
|
|
|
|
|
Computing modeling in a MIMD environment |
| |
William Dannevik
|
|
Page: 678 |
|
doi>10.1145/125826.126151 |
|
|
|
|
|
|
Ocean modeling on the Connection Machine |
| |
R. D. Smith,
J. K. Dukowicz,
R. C. Malone
|
|
Page: 679 |
|
doi>10.1145/125826.126153 |
|
|
|
|
|
|
An integrated memory management scheme for dynamic alias resolution |
| |
Tzi-cker Chiueh
|
|
Pages: 682-691 |
|
doi>10.1145/125826.126157 |
|
|
|
|
|
|
MOVE: a framework for high-performance processor design |
| |
Henk Corporaal,
Hans (J.M.) Mulder
|
|
Pages: 692-701 |
|
doi>10.1145/125826.126159 |
|
|
|
|
|
|
A semantics-directed partitioning of a processor architecture |
| |
Peter L. Bird,
Uwe F. Pleban
|
|
Pages: 702-709 |
|
doi>10.1145/125826.126162 |
|
|
|
|
|
|
Radix sort for vector multiprocessors |
| |
Marco Zagha,
Guy E. Blelloch
|
|
Pages: 712-721 |
|
doi>10.1145/125826.126164 |
|
|
|
|
|
|
A method of vector processing for shared symbolic data |
| |
Yasusi Kanada
|
|
Pages: 722-731 |
|
doi>10.1145/125826.126167 |
|
|
|
|
|
|
Optimal bounded-degree VLSI networks for sorting in a constant number of rounds |
| |
Hussein M. Alnuweiri
|
|
Pages: 732-739 |
|
doi>10.1145/125826.126169 |
|
|
|
|
|
|
“Whither massive parallelism?” |
| |
Howard Jay Siegel
|
|
Page: 740 |
|
doi>10.1145/125826.126173 |
|
|
|
|
|
|
An efficient parallel algorithm for all pairs examination |
| |
Kevin B. Theobald,
Guang R. Gao
|
|
Pages: 742-753 |
|
doi>10.1145/125826.126175 |
|
|
|
|
|
|
Parallel power-of-two FFTs on hypercubes |
| |
Mahn-ling Woo,
R. A. Renaut
|
|
Pages: 754-763 |
|
doi>10.1145/125826.126178 |
|
|
|
|
|
|
Analysis of replicated data algorithms on processor array architectures |
| |
P. J. Narayanan
|
|
Pages: 764-773 |
|
doi>10.1145/125826.126182 |
|
|
|
|
|
|
Design of a highly reliable cube-connected cycles architecture |
| |
Nian-Feng Tzeng
|
|
Pages: 776-785 |
|
doi>10.1145/125826.126184 |
|
|
|
|
|
|
Three-dimensional finite-element analyses: implications for computer architectures |
| |
Valerie E. Taylor,
Abhiram Ranade,
David G. Messerschmitt
|
|
Pages: 786-795 |
|
doi>10.1145/125826.126188 |
|
|
|
|
|
|
Massively parallel computing and the mid-course tracking problem |
| |
J. L. Tomkins,
J. P. VanDyke
|
|
Pages: 796-804 |
|
doi>10.1145/125826.126190 |
|
|
|
|
|
|
Measurement of memory access contentions in multiple vector processor systems |
| |
Ingrid Y. Bucher,
Margaret L. Simmons
|
|
Pages: 806-817 |
|
doi>10.1145/125826.126197 |
|
|
|
|
|
|
Comparison and analysis of software and directory coherence schemes |
| |
Yung-Chin Chen,
Alexander V. Veidenbaum
|
|
Pages: 818-829 |
|
doi>10.1145/125826.126200 |
|
|
|
|
|
|
Performance prediction of distributed load balancing on multicomputer systems |
| |
Ishfaq Ahmad,
Arif Ghafoor,
Kishan Mehrotra
|
|
Pages: 830-839 |
|
doi>10.1145/125826.126208 |
|
|
|
|
|
|
Efficient Doacross execution on distributed shared-memory multiprocessors |
| |
Hong-Men Su,
Pen-Chung Yew
|
|
Pages: 842-853 |
|
doi>10.1145/125826.105185 |
|
|
|
|
|
|
Detecting redundant accesses to array data |
| |
Elana D. Granston,
Alexander V. Veidenbaum
|
|
Pages: 854-865 |
|
doi>10.1145/125826.126714 |
|
|
|
|
|
|
Effects of partitioning and scheduling sparse matrix factorization on communication and load balance |
| |
Sesh Venugopal,
Vijay K. Naik
|
|
Pages: 866-875 |
|
doi>10.1145/125826.126716 |
|
|
|
|
|
|
Mass storage requirements in the intelligence community |
| |
Tom Myers,
Elizabeth Williams
|
|
Pages: 878-889 |
|
doi>10.1145/125826.126717 |
|
|
|
|
|
|
A virtual memory translation mechanism to support checkpoint and rollback recovery |
| |
Nicholas S. Bowen,
Dhiraj K. Pradhan
|
|
Pages: 890-899 |
|
doi>10.1145/125826.126719 |
|
|
|
|
|
|
The K2 distributed memory parallel processor: architecture, compiler, and operating system |
| |
M. Annaratone,
M. Fillo,
M. Halbherr,
R. Rühl,
P. Steiner,
M. Viredaz
|
|
Pages: 900-909 |
|
doi>10.1145/125826.126721 |
|
|
|
|