It is our great pleasure to welcome you to the 18th ACM International Symposium on High Performance Distributed Computing. This year's symposium continues its tradition of being the premier forum for presenting research results and experience reports on the design and use of parallel and distributed systems for high-end computing, collaboration, data analysis, and other innovative applications. This installment takes place in Garching near Munich, Germany, June 11-13, 2009.
Topics of interest include HPDC architectures, high-end communications, data management and transport, software environments, operating system technologies, grid middleware, applications and algorithms, and fault tolerance.
Co-located with HPDC 09 are six workshops. We welcome the Workshop on Challenges for Large Applications in Distributed Environments (CLADE 09), the Second International Workshop on Data-Aware Distributed Computing (DADC 09), the Workshop on Large-Scale System and Application Performance (LSAP 09), the Workshop on Monitoring, Logging and Accounting in Production Grids (MLA 09), the Workshop on Resiliency in High-Performance Computing (Resilience 09), and the 4th UPGRADE-CN Workshop on Content Management and Delivery in Large-Scale Networks (UPGRADE-CN 09).
The call for papers attracted 68 submissions from Asia, Canada, Europe, Africa, and the United States. The program committee accepted 20 papers covering a variety of topics, including Grid middleware and distributed algorithms, resource management and scheduling, data management, parallel algorithms and applications, workflow and dataflow applications, I/O, and parallel computing. We hope that these proceedings will serve as a valuable reference for researchers and developers.
Proceeding Downloads
Harnessing parallelism in multicore clusters with the all-pairs and wavefront abstractions
Both distributed systems and multicore computers are difficult programming environments. Although the expert programmer may be able to tune distributed and multicore computers to achieve high performance, the non-expert may struggle to achieve a program ...
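The all-pairs abstraction named in this title computes a function over every pairing of elements drawn from two sets. The paper's toolkit interface is not reproduced here; the following Python sketch is only a hypothetical illustration of the pattern, with all names assumed:

```python
from itertools import product

def all_pairs(set_a, set_b, compare):
    """Apply compare(a, b) to every pair in set_a x set_b.

    A production implementation would partition the pairs across
    cluster nodes and cores; this sequential sketch shows only the
    shape of the abstraction.
    """
    return {(a, b): compare(a, b) for a, b in product(set_a, set_b)}

# Hypothetical usage: score character-level agreement of short strings.
scores = all_pairs(["kitten", "sitting"], ["mitten", "fitting"],
                   lambda a, b: sum(x == y for x, y in zip(a, b)))
```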
Pluggable parallelisation
This paper presents the concept of pluggable parallelisation, which allows scientists to develop sequential-like codes that can take advantage of multi-core, cluster, and grid systems. In this approach, parallel applications are developed by plugging ...
Performance enhancement with speculative execution based parallelism for processing large-scale XML-based application data
We present the design and implementation of a toolkit for processing large-scale XML datasets that utilizes the capabilities for parallelism that are available in the emerging multi-core architectures. Multi-core processors are expected to be widely ...
Y-lib: a user level library to increase the performance of MPI-IO in a Lustre file system environment
It is widely known that MPI-IO performs poorly in a Lustre file system environment, although the reasons for such performance are currently not well understood. The research presented in this paper strongly supports our hypothesis that MPI-IO performs ...
DataStager: scalable data staging services for petascale applications
Known challenges for petascale machines are that (1) the costs of I/O for high performance applications can be substantial, especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of ...
Interconnect agnostic checkpoint/restart in open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passing Interface (MPI) level transparent checkpoint/restart fault tolerance is ...
Maintaining reference graphs of globally accessible objects in fully decentralized distributed systems
Since the advent of electronic computing, processors' clock speeds have risen tremendously. Now that energy efficiency requirements have stopped that trend, the number of processing cores per machine has started to rise. In the near future, these cores will ...
Adaptive run-time prediction in heterogeneous environments
In this article we describe an approach for predicting the run-time of jobs in heterogeneous environments, applying a meta-prediction algorithm that works in multiple phases. For efficient utilization of hardware resources, it is necessary to ...
Performance prediction based on hierarchy parallel features captured in multi-processing system
As the computing ability of high-performance computers is improved by increasing the number of computing elements, how to utilize the available computing resources becomes an important issue. Different strategies to solve a problem based on a multi-...
CLOUDLET: towards MapReduce implementation on virtual machines
The existing MapReduce framework in virtualized environments suffers from poor performance, due to the heavy overhead of I/O virtualization, and from the difficulty of managing storage and computation. To address these problems, we propose Cloudlet, a novel ...
Investigating transactional memory performance on ccNUMA machines
Most Software Transactional Memory (STM) research has focused on multi-core processors and small SMP machines; limited research has been aimed at clusters, leaving the area of big SMP machines unexplored. Big SMP machines usually use Non-Uniform ...
An adaptive online system for efficient processing of hierarchical data
Concept hierarchies greatly help in the organization and reuse of information and are widely used in a variety of information systems applications. In this paper, we describe a method for efficiently storing and querying data organized into concept ...
High performance wide-area overlay using deadlock-free routing
Overlay networks as the communication medium in parallel and distributed applications have gained prominence, especially in Grid environments. However, providing both throughput performance and reliable communication on overlays has been given little ...
TakTuk, adaptive deployment of remote executions
This article deals with TakTuk, a middleware that efficiently deploys parallel remote executions on large-scale grids (thousands of nodes). This tool is mostly intended for interactive use: distributed machine administration and parallel applications ...
Live migration of virtual machine based on full system trace and replay
Live migration of virtual machines (VM) across distinct physical hosts provides a significant new benefit for administrators of data centers and clusters. Previous migration schemes focused on transferring the runtime memory state of the VM. Those ...
Trace-based evaluation of job runtime and queue wait time predictions in grids
Large-scale distributed computing systems such as grids are serving a growing number of scientists. These environments bring about not only the advantages of an economy of scale, but also the challenges of resource and workload heterogeneity. A ...
Modeling user submission strategies on production grids
Production-grid users experience many system faults as well as high and variable latencies due to the scale, complexity, and sharing of such infrastructures. To improve performance, they adopt different submission strategies that are potentially ...
Resource co-allocation for large-scale distributed environments
Advances in the development of large scale distributed computing systems such as Grids and Computing Clouds have intensified the need for developing scheduling algorithms capable of allocating multiple resources simultaneously. In principle, the ...
Evaluating the cost-benefit of using cloud computing to extend the capacity of clusters
In this paper, we investigate the benefits that organisations can reap by using "Cloud Computing" providers to augment the computing capacity of their local infrastructure. We evaluate the cost of six scheduling strategies used by an organisation that ...
Model-guided autotuning of high-productivity languages for petascale computing
This work addresses the enormous complexity of mapping applications to current and future highly parallel platforms, including scalable architectures consisting of tens of thousands of nodes, many-core devices with tens to hundreds of cores, and hierarchical ...
A novel graph based approach for automatic composition of high quality grid workflows
The workflow paradigm is one of the most important programming models for the Grid. The composition of Grid workflows has been widely studied in the Grid community. However, there is still a lack of a general and efficient approach for automatic ...
An integrated framework for performance-based optimization of scientific workflows
Vijay S. Kumar, P. Sadayappan, Gaurang Mehta, Karan Vahi, Ewa Deelman, Varun Ratnakar, Jihie Kim, Yolanda Gil, Mary Hall, Tahsin Kurc, and Joel Saltz
Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set ...
Maestro: a self-organizing peer-to-peer dataflow framework using reinforcement learning
In this paper we describe Maestro, a dataflow computation framework for Ibis, our Java-based grid middleware. The novelty of Maestro is that it is a self-organizing peer-to-peer system, meaning that it distributes the tasks in a flow over the available ...
Collaborative query coordination in community-driven data grids
E-science communities face huge data management challenges due to large existing data sets and expected data rates from forthcoming projects. Community-driven data grids provide a scalable, high-throughput oriented data management solution for ...
The quest for scalable support of data-intensive workloads in distributed systems
Ioan Raicu, Ian T. Foster, Yong Zhao, Philip Little, Christopher M. Moretti, Amitabh Chaudhary, and Douglas Thain
Data-intensive applications involving the analysis of large datasets often require large amounts of compute and storage resources, for which data locality can be crucial to high throughput and performance. We propose a "data diffusion" approach that ...
Exploring data reliability tradeoffs in replicated storage systems
This paper explores the feasibility of a cost-efficient storage architecture that offers the reliability and access performance characteristics of a high-end system. This architecture exploits two opportunities: First, scavenging idle storage from LAN-...