Proceeding Downloads
FAUST: an environment for programming parallel scientific applications
The subject of integrated programming environments for scientific computing has become very popular over the last few years. Environments such as Rn [3] are being constructed to help coordinate the disjoint activities of editing, debugging, and ...
Parallel algorithm development workbench
This paper proposes an approach, and a resulting tool, for developing algorithms for parallel processors. The problems in developing efficient algorithms for parallel systems are discussed. The objectives of this approach are to support the evaluation ...
Growing discord: programming philosophy and hardware design
Generally, vector compiler technology has been successful in achieving reasonable peak efficiency on “good” code. Moreover, the community's ability to generate “good” vector code has improved dramatically. As we move into the era of parallelism, ...
The horizon supercomputing system: architecture and software
Horizon is the name currently being used to refer to a shared-memory Multiple Instruction stream - Multiple Data stream (MIMD) computer architecture under study by independent groups at the Supercomputing Research Center and at Tera Computer Company. ...
A processor architecture for horizon
Horizon is a scalable shared-memory Multiple Instruction stream - Multiple Data stream (MIMD) computer architecture independently under study at the Supercomputing Research Center (SRC) and Tera Computer Company. It is composed of a few hundred ...
Analysis of a 3D toroidal network for a shared memory architecture
This paper describes a synchronized network model suitable for the Horizon architecture. The model is defined in terms of a topology and routing policy. A three dimensional toroidal topology is investigated for its multiple redundant paths, memory ...
Performance prediction for the horizon super computer
The performance of one Horizon processing element can be quantified by user operations per instruction, the instructions per tick, and the basic clock rate. Assuming there is sufficient parallelism within a problem, the performance of one PE can be ...
Compiling on horizon
Compiler research to test several hardware features of the Horizon supercomputer design has yielded some preliminary results based solely on code generation and local optimization. Correctly packing operations into the moderately wide instruction word ...
HORSE: a simulation of the horizon supercomputer
HORSE is a program that combines the simulation of the HORIZON multiprocessor architecture with an interactive debugging environment. The models for processing elements, interconnection network, and memory modules include enough detail to allow ...
The fast Fourier transform and sparse matrix computations: a study of two applications on the HORIZON supercomputer
As part of the HORIZON project currently underway at the Supercomputing Research Center, a set of application programs are being written and their performance is being evaluated. This paper discusses two of these applications: the fast Fourier transform ...
Assessing the benefits of fine-grain parallelism in dataflow programs
A method for assessing the benefits of fine-grain parallelism in “real” programs is presented. The method is based on parallelism profiles and speedup curves derived by executing dataflow graphs on an interpreter under progressively more realistic ...
Tokenless static data flow using associative templates
The static data flow model of computation promises high performance from fine grained parallelism, but conventional token-driven static data flow architectures are inefficient in terms of memory bandwidth and microcycles required per operation. The ...
Elimination of bottlenecks in dynamic dataflow processors
A key component of a dynamic dataflow processor, the matching unit, has been identified as a major bottleneck. An alternate implementation for the matching unit is presented. This implementation increases the operating bandwidth of the unit by allowing ...
I-NET mechanism for issuing multiple instructions
Conventional instruction issuing methods use a hardware control mechanism to issue instructions in multiple-functional-unit systems. They reach physical limitations due to the complexity of issuing logic when they intend to issue multiple instructions per ...
Vectorizing compilers: a test suite and results
This report describes a collection of 100 Fortran loops used to test the effectiveness of an automatic vectorizing compiler. We present the results of compiling these loops using commercially available, vectorizing Fortran compilers on a variety of ...
An evaluation of vector Fortran 200 generated by Cyber 205 and ETA-10 pre-compilation tools
Vectorizing pre-compilers such as KAP/205 and VAST-2 complement the efficient use of FORTRAN on the CDC Cyber 205. With the advent of the ETA-10 and its EOS/VSOS environment, the performance of these FORTRAN 200 pre-processors has come under closer ...
Cedar Fortran and other Vector and parallel Fortran dialects
The introduction of vector processors and multiprocessors punctuates the most dramatic changes in Fortran and its dialects. The emerging generation of supercomputers utilizes both vector processing and multiprocessing simultaneously. The challenge is to ...
Polycyclic Vector scheduling vs. Chaining on 1-Port Vector supercomputers
This paper studies the impact of chaining and several instruction scheduling schemes on one-memory-port vector supercomputers, illustrated by the Cray-1 and Cray-2. The lack of instruction chaining in the Cray-2 vector processor requires a different ...
Interactive scientific visualization and parallel display techniques
In this paper, we describe a new graphics environment for essentially real-time interactive visualization of computational fluid mechanics. Within this environment, the researcher may interactively examine fluid data on a framebuffer with animated flow ...
A scientific visualization workbench
A system for visualization of data from supercomputer simulations has been developed for use by scientists and engineers at Los Alamos National Laboratory. The scientific visualization workbench, as the system is called, is based on an industry standard ...
Distributed scientific video movie making
We describe a versatile, low cost, video movie making system for generating and displaying scientific graphics from remote supercomputers. The system makes video movies by single frame animation from the output of time dependent, numerical simulations ...
Compiling issues for supercomputers
Accurate and fast methods for computing data dependencies are vital to the efficiency of vectorizing and parallelizing compilers. Program transformations employed by these compilers are effective only if dependencies are computed as accurately as ...
Compiling techniques for first-order linear recurrences on a Vector computer
Linear recurrences are the most important class of non-vectorizable problems in typical scientific/engineering calculations. This work discusses high performance methods for solving first-order linear recurrences on a vector computer, investigates ...
V-Pascal: an automatic vectorizing compiler for Pascal with no language extensions
Detailed anatomy of automatic vectorizing compiler V-Pascal (Version 1, now operational) is given. With no language extensions, V-Pascal efficiently vectorizes the whole of arbitrarily given multiply nested for loops using the mechanism of vector ...
Using Linda for supercomputing on a local area network
A distributed parallel processing system based on the LINDA programming constructs has been implemented on a local area network of computers. This system allows a single application program to utilize many machines on the network simultaneously. Several ...
Development of job-job step scheduler for NAL numerical simulator
In this paper we present the concepts and functions of the job-job step scheduler which is the kernel of the software packages developed for the management of the NAL Numerical Simulator (NS). This scheduler is partially responsible for the high ...
The symbolic hyperplane transformation for recursively defined arrays
This paper describes a restructuring transformation which can be used to parallelize recurrence relations. The transformation is based on the hyperplane (or wavefront) method, but extends the applicability of the method to irregularly-structured ...
Mass storage support for supercomputing
Mass Storage support for supercomputing at Boeing Computer Services is satisfied by a locally developed storage system known as FMS, the File Management System. The product of a joint development project between CDC and Boeing, it runs on a dedicated ...
Profiles in mass storage: a tale of two systems
The Los Alamos Common File System (CFS) and the NCAR Mass Storage System (MSS) are file storage and file management systems that serve heterogeneous computing networks of supercomputers, general purpose computers, scientific workstations and personal ...
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| SC '17 | 327 | 61 | 19% |
| SC '16 | 442 | 81 | 18% |
| SC '15 | 358 | 79 | 22% |
| SC '14 | 394 | 83 | 21% |
| SC '13 | 449 | 91 | 20% |
| SC '12 | 461 | 100 | 22% |
| SC '11 | 352 | 74 | 21% |
| SC '10 | 253 | 51 | 20% |
| SC '09 | 261 | 59 | 23% |
| SC '08 | 277 | 59 | 21% |
| SC '07 | 268 | 54 | 20% |
| SC '06 | 239 | 54 | 23% |
| SC '05 | 260 | 62 | 24% |
| SC '04 | 200 | 60 | 30% |
| SC '03 | 207 | 60 | 29% |
| SC '02 | 230 | 67 | 29% |
| SC '01 | 240 | 60 | 25% |
| SC '00 | 179 | 62 | 35% |
| Supercomputing '95 | 241 | 69 | 29% |
| Supercomputing '93 | 300 | 72 | 24% |
| Supercomputing '92 | 220 | 75 | 34% |
| Supercomputing '91 | 215 | 83 | 39% |
| Overall | 6,373 | 1,516 | 24% |
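
The Rate column appears to be the accepted count divided by the submitted count, rounded to a whole percent. A minimal sketch of that arithmetic follows; it assumes this rounding rule, and the helper name `acceptance_rate` is hypothetical. The figures are copied from three rows of the table and from its Overall row.

```python
def acceptance_rate(submitted: int, accepted: int) -> int:
    """Return accepted/submitted as a whole-number percentage (assumed rounding)."""
    return round(100 * accepted / submitted)


# (submitted, accepted) pairs copied from the table above.
rows = {
    "SC '17": (327, 61),
    "SC '04": (200, 60),
    "Supercomputing '91": (215, 83),
}

# Per-conference rates, keyed by the table's row label.
rates = {name: acceptance_rate(s, a) for name, (s, a) in rows.items()}

# The Overall row uses the summed totals, not an average of the yearly rates.
overall = acceptance_rate(6373, 1516)

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"Overall: {overall}%")
```

Computing the overall figure from the summed totals weights each year by its submission count, which is why it can differ from a plain average of the yearly rates.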


