ABSTRACT

We introduce the coupled placement problem for modern data centers spanning placement of application computation and data among available server and storage resources. While the two have traditionally been addressed independently in data centers, two modern trends make it beneficial to consider them together in a coupled manner: (a) rise in virtualization technologies, which enable applications packaged as VMs to be run on any server in the data center with spare compute resources, and (b) rise in multi-purpose hardware devices in the data center which provide compute resources of varying capabilities at different proximities from the storage nodes.



AUTHORS



Madhukar Korupolu

Bibliometrics: publication history
Publication years: 1997-2016
Publication count: 25
Citation Count: 355
Available for download: 8
Downloads (6 Weeks): 198
Downloads (12 Months): 2,177
Downloads (cumulative): 8,895
Average downloads per article: 1,111.88
Average citations per article: 14.20


Aameek Singh

Bibliometrics: publication history
Publication years: 2002-2012
Publication count: 28
Citation Count: 272
Available for download: 7
Downloads (6 Weeks): 24
Downloads (12 Months): 331
Downloads (cumulative): 7,441
Average downloads per article: 1,063.00
Average citations per article: 9.71


Bhuvan Bamba

Bibliometrics: publication history
Publication years: 2004-2014
Publication count: 21
Citation Count: 169
Available for download: 10
Downloads (6 Weeks): 16
Downloads (12 Months): 204
Downloads (cumulative): 2,988
Average downloads per article: 298.80
Average citations per article: 8.05

CITED BY

17 Citations

PUBLICATION

Title: IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing
Pages: 1-12
Publication Date: 2009-05-23
Publisher: IEEE Computer Society, Washington, DC, USA ©2009
ISBN: 978-1-4244-3751-1 doi>10.1109/IPDPS.2009.5161067

Table of Contents

Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing
Cover
Page: c1
doi>10.1109/IPDPS.2009.5160847
Copyright
Page: 1
doi>10.1109/IPDPS.2009.5160848
Title page
Page: 1
doi>10.1109/IPDPS.2009.5160849
Message from general chair
Pages: 1-2
doi>10.1109/IPDPS.2009.5160852
Message from the program chair
Pages: 1-2
doi>10.1109/IPDPS.2009.5160853
Message from the workshops chairs
Page: 1
doi>10.1109/IPDPS.2009.5160854
Message from steering co-chairs
Page: 1
doi>10.1109/IPDPS.2009.5160855
IPDPS 2009 organization
Pages: 1-4
doi>10.1109/IPDPS.2009.5160856
IPDPS 2009 reviewers
Page: 1
doi>10.1109/IPDPS.2009.5160857
IPDPS 2009 technical program
Pages: 1-4
doi>10.1109/IPDPS.2009.5160858
Many-core parallel computing - Can compilers and tools do the heavy lifting?
Wen-mei W. Hwu
Page: 1
doi>10.1109/IPDPS.2009.5160859

Modern GPUs such as the NVIDIA GeForce GTX280, ATI Radeon 4860, and the upcoming Intel Larrabee are massively parallel, many-core processors. Today, application developers for these many-core chips are reporting 10X–100X speedup over sequential ...
Software transactional memory: Where do we come from? What are we? Where are we going?
Nir Shavit
Page: 1
doi>10.1109/IPDPS.2009.5160860

The transactional memory programming paradigm is gaining momentum as the approach of choice for replacing locks in concurrent programming. Combining sequences of concurrent operations into atomic transactions seems to promise a great reduction in the ...
Green flash: Designing an energy efficient climate supercomputer
Leonid Oliker
Page: 1
doi>10.1109/IPDPS.2009.5160861

It is clear from both the cooling demands and the electricity costs, that the growth in scientific computing capabilities of the last few decades is not sustainable unless fundamentally new ideas are brought to bear. In this talk we propose a novel approach ...
How to build a useful thousand-core manycore system?
Josep Torrellas
Page: 1
doi>10.1109/IPDPS.2009.5160862

Current hardware roadmaps call for doubling the number of on-chip cores approximately every two years. If this trend materializes, in at most a decade and a half, we will reach one thousand cores. This scenario has mind-boggling consequences for the ...
TCPP Ph.D. Forum
Pages: 1-29
doi>10.1109/IPDPS.2009.5160863
SiCortex high-productivity, low-power computers
John Goodhue
Page: 1
doi>10.1109/IPDPS.2009.5160864

In order to work efficiently, clusters for high performance computing require a balance between the compute, memory, inter-node communication, and I/O. Fast communications among one thousand multicore nodes requires short wire paths and power-efficient ...
Tools for scalable performance analysis on Petascale systems
I-Hsin Chung, S. R. Seelam, B. Mohr, J. Labarta
Pages: 1-3
doi>10.1109/IPDPS.2009.5160865

Tools are becoming increasingly important to efficiently utilize the computing power available in contemporary large scale systems. The drastic increase in the size and the complexity of systems requires tools to be scalable while producing meaningful ...
HCW 2009 keynote talk: GPU computing: Heterogeneous computing for future systems
John Owens
Page: 1
doi>10.1109/IPDPS.2009.5160866

Over the last decade, commodity graphics processors (GPUs) have evolved from fixed-function graphics units into powerful, programmable data-parallel processors. Today's GPU is capable of sustaining computation rates substantially greater than today's ...
Resilient computing: An engineering discipline
Luca Simoncini
Page: 1
doi>10.1109/IPDPS.2009.5160867

The term resiliency has been used in many fields like child psychology, ecology, business, and several others, with the common meaning of expressing the ability to successfully accommodate unforeseen environmental perturbations or disturbances. The adjective ...
De novo modeling of GPCR class A structures
Charles L. Brooks
Page: 1
doi>10.1109/IPDPS.2009.5160868

In this talk I will describe recent work to develop novel methods to model G protein-coupled receptor (GPCR) structures from their sequence information and statistically significant side chain contacts within a “template” structure. Our approach ...
Crosstalk-free mapping of two-dimensional weak tori on optical slab waveguides
Hatem M. El-Boghdadi
Pages: 1-8
doi>10.1109/IPDPS.2009.5160869

While optical slab waveguides can deliver a huge bandwidth for the communication need through offering a huge number of communication channels, they require a large number of high speed lasers and photodetectors. This makes a limited use of the offered ...
Table-based method for reconfigurable function evaluation
Maria Teresa Signes Pont, Higinio Mora Mora, Juan Manuel Garcia Chamizo, Gregorio de Miguel Casado
Pages: 1-9
doi>10.1109/IPDPS.2009.5160870

This paper presents a new approach to function evaluation using tables. The proposal argues for the use of a more complete primitive, namely a weighted sum, which converts the calculation of the function values into a recursive operation defined by a ...
Uniform scattering of autonomous mobile robots in a grid
Lali Barriere, Paola Flocchini, Eduardo Mesa-Barrameda, Nicola Santoro
Pages: 1-8
doi>10.1109/IPDPS.2009.5160871

We consider the uniform scattering problem for a set of autonomous mobile robots deployed in a grid network: starting from an arbitrary placement in the grid, using purely localized computations, the robots must move so to reach in finite time a state ...
Graph orientation to maximize the minimum weighted outdegree
Yuichi Asahiro, Jesper Jansson, Eiji Miyano, Hirotaka Ono
Pages: 1-8
doi>10.1109/IPDPS.2009.5160872

We study a new variant of the graph orientation problem called MAXMINO where the input is an undirected, edge-weighted graph and the objective is to assign a direction to each edge so that the minimum weighted outdegree (taken over all vertices in ...
Performance study of interference on GPU and CPU resources with multiple applications
Shinichi Yamagiwa, Koichi Wada
Pages: 1-8
doi>10.1109/IPDPS.2009.5160873

In recent years, the performance and capabilities of Graphics Processing Units (GPUs) have improved drastically, mostly due to the demands of the entertainment market, with consumers and companies alike pushing for improvements in the level of visual fidelity, ...
Resource allocation strategies for constructive in-network stream processing
Anne Benoit, Henri Casanova, Veronika Rehn-Sonigo, Yves Robert
Pages: 1-8
doi>10.1109/IPDPS.2009.5160874

We consider the operator mapping problem for in-network stream processing, i.e., the application of a tree of operators in steady-state to multiple data objects that are continuously updated at various locations in a network. Examples of in-network stream ...
Self-Stabilizing k-out-of-l exclusion on tree networks
Ajoy K. Datta, Stephane Devismes, Florian Horn, Lawrence L. Larmore
Pages: 1-8
doi>10.1109/IPDPS.2009.5160875

In this paper, we address the problem of k-out-of-ℓ exclusion, a generalization of the mutual exclusion problem, in which there are ℓ units of a shared resource, and any process can request up to k units (1 ≤ k ≤ ℓ). We ...
New implementation of a BSP composition primitive with application to the implementation of algorithmic skeletons
Frederic Gava, Ilias Garnier
Pages: 1-8
doi>10.1109/IPDPS.2009.5160876

BSML is an ML-based language designed to code Bulk Synchronous Parallel (BSP) algorithms. It allows an estimation of execution time, avoids deadlocks and non-determinism. BSML proposes an extension of ML programming with a small set of primitives. One ...
Deciding model of Population Size in time-constrained task scheduling
Wei Sun
Pages: 1-8
doi>10.1109/IPDPS.2009.5160877

Genetic algorithms (GAs) have been well applied in solving scheduling problems and their performance advantages have also been recognized. However, practitioners are often troubled by parameters setting when they are tuning GAs. Population Size (PS) ...
Improving accuracy of host load predictions on computational grids by artificial neural networks
Truong Vinh Truong Duy, Yukinori Sato, Yasushi Inoguchi
Pages: 1-8
doi>10.1109/IPDPS.2009.5160878

The capability to predict the host load of a system is significant for computational grids to make efficient use of shared resources. This paper attempts to improve the accuracy of host load predictions by applying a neural network predictor to reach ...
Combining multiple heuristics on discrete resources
Marin Bougeret, Pierre-Francois Dutot, Alfredo Goldman, Yanik Ngoko, Denis Trystram
Pages: 1-8
doi>10.1109/IPDPS.2009.5160879

In this work we study the portfolio problem, which is to find a good combination of multiple heuristics to solve given instances on parallel resources in minimum time. The resources are assumed to be discrete; it is not possible to allocate a resource ...
Filter placement on a pipelined architecture
Anne Benoit, Fanny Dufosse, Yves Robert
Pages: 1-8
doi>10.1109/IPDPS.2009.5160880

In this paper, we explore the problem of mapping filtering query services on chains of heterogeneous processors. Two important optimization criteria should be considered in such a framework. The period, which is the inverse of the throughput, measures ...
Distributed selfish bin packing
Flavio K. Miyazawa, Andre L. Vignatti
Pages: 1-8
doi>10.1109/IPDPS.2009.5160881

We consider a game-theoretic bin packing problem with identical items, and we study the convergence time to a Nash equilibrium. In the model proposed, users choose their strategy simultaneously. We deal with two bins and multiple bins cases. We consider ...
Predictive analysis and optimisation of pipelined wavefront computations
G. R. Mudalige, S. D. Hammond, J. A. Smith, S. A. Jarvis
Pages: 1-8
doi>10.1109/IPDPS.2009.5160882

Pipelined wavefront computations are a ubiquitous class of parallel algorithm used for the solution of a number of scientific and engineering applications. This paper investigates three optimisations to the generic pipelined wavefront algorithm, which ...
RSA encryption and decryption using the redundant number system on the FPGA
Koji Nakano, Kensuke Kawakami, Koji Shigemoto
Pages: 1-8
doi>10.1109/IPDPS.2009.5160883

The main contribution of this paper is to present efficient hardware algorithms for the modulo exponentiation P^E mod M used in RSA encryption and decryption, and implement them on the FPGA. The key ideas to accelerate the modulo exponentiation ...
Computation with a constant number of steps in membrane computing
Akihiro Fujiwara, Takeshi Tateishi
Pages: 1-8
doi>10.1109/IPDPS.2009.5160884

In the present paper, we propose P systems that work in a constant number of steps. We first propose two P systems for computing multiple input logic functions. An input of the logic functions is a set of n binary numbers of m bits, and an output is ...
Analytical model of inter-node communication under multi-versioned coherence mechanisms
Shigero Sasaki, Atsuhiro Tanaka
Pages: 1-8
doi>10.1109/IPDPS.2009.5160885

Our goal is to predict the performance of multi-node systems consisting of identical processing nodes based on single node profiles. The performance of multi-node systems significantly depends on the amount of inter-node communication. Therefore, we ...
A distributed approach for the problem of routing and wavelength assignment in WDM networks
Simone Cintra Chagas, Eber Huanca Cayo, Koji Nakano, Jacir Luiz Bordim
Pages: 1-7
doi>10.1109/IPDPS.2009.5160886

The main contribution of this work is to propose a distributed on-demand routing and wavelength assignment algorithm for WDM networks. The proposed algorithm, termed WDM-DSR, is capable of selecting routes and establishing light-paths via message exchanges ...
Advances in parallel and distributed computing models - APDCM
Pages: 1-2
doi>10.1109/IPDPS.2009.5160887

The past twenty years have seen a flurry of activity in the arena of parallel and distributed computing. In recent years, novel parallel and distributed computational models have been proposed in the literature, reflecting advances in new computational ...
Decoupling memory pinning from the application with overlapped on-demand pinning and MMU notifiers
Brice Goglin
Pages: 1-8
doi>10.1109/IPDPS.2009.5160888

High-performance cluster networks achieve very high throughput thanks to zero-copy techniques that require pinning of application buffers in physical memory. The Open-MX stack implements message passing over generic Ethernet hardware with similar needs.
Efficient and deadlock-free reconfiguration for source routed networks
Ashild Gronstad Solheim, Olav Lysne, Aurelio Bermudez, Rafael Casado, Thomas Sodring, Tor Skeie, Antonio Robles-Gomez
Pages: 1-8
doi>10.1109/IPDPS.2009.5160889

Overlapping Reconfiguration is currently the most efficient method to reconfigure an interconnection network, but is only valid for systems that apply distributed routing. This paper proposes a solution which enables utilization of Overlapping Reconfiguration ...
Using application communication characteristics to drive dynamic MPI reconfiguration
Manjunath Gorentla Venkata, Patrick G. Bridges, Patrick M. Widener
Pages: 1-6
doi>10.1109/IPDPS.2009.5160890

Modern HPC applications, for example adaptive mesh refinement and multi-physics codes, have dynamic communication characteristics which result in poor performance on current MPI implementations. Current MPI implementations do not change transport protocols ...
A power-aware, application-based performance study of modern commodity cluster interconnection networks
Torsten Hoefler, Timo Schneider, Andrew Lumsdaine
Pages: 1-7
doi>10.1109/IPDPS.2009.5160891

Microbenchmarks have long been used to assess the performance characteristics of high-performance networks. It is generally assumed that microbenchmark results indicate the parallel performance of real applications. This paper reports the results of ...
Implementation and analysis of nonblocking collective operations on SCI networks
Christian Kaiser, Torsten Hoefler, Boris Bierbaum, Thomas Bemmerl
Pages: 1-7
doi>10.1109/IPDPS.2009.5160892

Nonblocking collective communication operations are currently being considered for inclusion into the MPI standard and are an area of active research. The benefits of such operations are documented by several recent publications, but so far, research ...
An analysis of the impact of multi-threading on communication performance
Francois Trahay, Elisabeth Brunet, Alexandre Denis
Pages: 1-7
doi>10.1109/IPDPS.2009.5160893

Although processors become massively multicore and therefore new programming models mix message passing and multi-threading, the effects of threads on communication libraries remain neglected. Designing an efficient modern communication library requires ...
RI2N/DRV: Multi-link ethernet for high-bandwidth and fault-tolerant network on PC clusters
Shin'ichi Miura, Toshihiro Hanawa, Taiga Yonemoto, Taisuke Boku, Mitsuhisa Sato
Pages: 1-7
doi>10.1109/IPDPS.2009.5160894

Although recent high-end interconnection network devices and switches provide a high performance to cost ratio, most of the small to medium sized PC clusters are still built on the commodity network, Ethernet. To enhance performance on commonly used ...
Improving RDMA-based MPI eager protocol for frequently-used buffers
Mohammad J. Rashti, Ahmad Afsahi
Pages: 1-8
doi>10.1109/IPDPS.2009.5160895

MPI is the main standard for communication in high-performance clusters. MPI implementations use the Eager protocol to transfer small messages. To avoid the cost of memory registration and pre-negotiation, the Eager protocol involves a data copy to intermediate ...
Designing multi-leader-based Allgather algorithms for multi-core clusters
Krishna Kandalla, Hari Subramoni, Gopal Santhanaraman, Matthew Koop, Dhabaleswar K. Panda
Pages: 1-8
doi>10.1109/IPDPS.2009.5160896

The increasing demand for computational cycles is being met by the use of multi-core processors. Having a large number of cores per node necessitates multi-core aware designs to extract the best performance. The Message Passing Interface (MPI) is the dominant ...
Communication architecture for clusters - CAC
Page: 1
doi>10.1109/IPDPS.2009.5160897

Many of the world's largest and fastest computer systems are workstation or server clusters. Numerous research groups in academia, industry, and government are currently engaged in cluster research, seeking new ways to advance the state of the art of ...
Deadlock prevention by turn prohibition in interconnection networks
Lev Levitin, Mark Karpovsky, Mehmet Mustafa
Pages: 1-7
doi>10.1109/IPDPS.2009.5160898

In this paper we consider the problem of constructing minimal cycle-breaking sets of turns for graphs that model communication networks, as a method to prevent deadlocks in the networks. We present a new cycle-breaking algorithm called Simple Cycle-Breaking ...
Message-efficient omission-tolerant consensus with limited synchrony
C. Delporte-Gallet, H. Fauconnier, A. Tielmann, F. C. Freiling, M. Kilic
Pages: 1-8
doi>10.1109/IPDPS.2009.5160899

We study the problem of consensus in the general omission failure model, i.e., in systems where processes can crash and omit messages while sending or receiving. This failure model is motivated from a smart card-based security framework in which certain ...
A flexible and robust lookup algorithm for P2P systems
Mauro Andreolini, Riccardo Lancellotti
Pages: 1-8
doi>10.1109/IPDPS.2009.5160900

One of the most critical operations performed in a P2P system is the lookup of a resource. The main issues to be addressed by lookup algorithms are: (1) support for flexible search criteria (e.g., wildcard or multi-keyword searches), (2) effectiveness ...
Storage architecture with integrity, redundancy and encryption
Henning Klein, Jorg Keller
Pages: 1-6
doi>10.1109/IPDPS.2009.5160901

We propose a storage system that treats confidentiality, integrity and availability of data in a unified manner. Extending RAID6, it allows for failures of multiple disks, encrypts data on disk, and stores checksums to detect faulty data without disks ...
Extending SRT for parallel applications in tiled-CMP architectures
Daniel Sanchez, Juan L. Aragon, Jose M. Garcia
Pages: 1-8
doi>10.1109/IPDPS.2009.5160902

Reliability has become a first-class consideration issue for architects along with performance and energy-efficiency. The increasing scaling technology and subsequent supply voltage reductions are increasing the susceptibility of architectures to soft ...
Byzantine fault-tolerant implementation of a multi-writer regular register
Khushboo Kanjani, Hyunyoung Lee, Jennifer L. Welch
Pages: 1-8
doi>10.1109/IPDPS.2009.5160903

Distributed storage systems have become popular for handling the enormous amounts of data in network-centric systems. A distributed storage system provides client processes with the abstraction of a shared variable that satisfies some consistency and ...
Pre-calculated equation-based decoding in failure-tolerant distributed storage
Peter Sobe
Pages: 1-8
doi>10.1109/IPDPS.2009.5160904

Data distribution together with erasure-tolerant codes allows data to be stored reliably, even with failed or temporarily disconnected storage resources. The encoding algorithm, i.e., the calculation of the codewords, is expressed by XOR equations. Even decoding ...
Dependable QoS support in Mesh Networks
M. Fazio, M. Paone, D. Bruneo, A. Puliafito
Pages: 1-7
doi>10.1109/IPDPS.2009.5160905

Wireless networks are a very challenging communication technology because of their ability to be set up anywhere and at any time. Among the several types of wireless systems, a new class of networks is gradually emerging: Wireless Mesh Networks (WMNs). A WMN ...
APART+: Boosting APART performance via optimistic pipelining of output events
Paolo Romano, Francesco Quaglia, Bruno Ciciani
Pages: 1-8
doi>10.1109/IPDPS.2009.5160906

APART (A Posteriori Active ReplicaTion) is a recently proposed active replication protocol specifically tailored for multi-tier data acquisition systems. It ensures consistency of middle-tier sink replicas by means of an a-posteriori synchronization ...
AVR-INJECT: A tool for injecting faults in Wireless Sensor Nodes
Marcello Cinque, Domenico Cotroneo, Catello Di Martino, Stefano Russo, Alessandro Testa
Pages: 1-8
doi>10.1109/IPDPS.2009.5160907

As the incidence of faults in real Wireless Sensor Networks (WSNs) increases, fault injection is starting to be adopted to verify and validate their design choices. Following this recent trend, this paper presents a tool, named AVR-INJECT, designed to ...
Robust CDN replica placement techniques
Samee Ullah Khan, Anthony A. Maciejewski, Howard Jay Siegel
Pages: 1-8
doi>10.1109/IPDPS.2009.5160908

Creating replicas of frequently accessed data objects across a read-intensive Content Delivery Network (CDN) can result in reduced user response time. Because CDNs often operate under volatile conditions, it is of the utmost importance to study replica ...
Message from the program committee
Pages: 1-2
doi>10.1109/IPDPS.2009.5160909
Offer-based scheduling of deadline-constrained Bag-of-Tasks applications for utility computing systems
Marco A. S. Netto, Rajkumar Buyya
Pages: 1-11
doi>10.1109/IPDPS.2009.5160910

Metaschedulers can distribute parts of a Bag-of-Tasks (BoT) application among various resource providers in order to speed up its execution. When providers cannot disclose private information such as their load and computing power, which are usually ...
Cost-benefit analysis of Cloud Computing versus desktop grids
Derrick Kondo, Bahman Javadi, Paul Malecot, Franck Cappello, David P. Anderson
Pages: 1-12
doi>10.1109/IPDPS.2009.5160911

Cloud Computing has taken commercial computing by storm. However, adoption of cloud computing platforms and services by the scientific community is in its infancy as the performance and monetary cost-benefits for scientific applications are not perfectly ...
Resource-aware allocation strategies for divisible loads on large-scale systems
Anne Benoit, Loris Marchal, Jean-Francois Pineau, Yves Robert, Frederic Vivien
Pages: 1-4
doi>10.1109/IPDPS.2009.5160912

In this paper, we deal with the large-scale divisible load problem studied in [12]. We show how to reduce this problem to a classical preemptive scheduling problem on a single machine, thereby establishing new complexity results, and providing new approximation ...
Validating Wrekavoc: A tool for heterogeneity emulation
Olivier Dubuisson, Jens Gustedt, Emmanuel Jeannot
Pages: 1-12
doi>10.1109/IPDPS.2009.5160913

Experimental validation and testing of solutions designed for heterogeneous environments is a challenging issue. Wrekavoc is a tool for performing such validation. It runs unmodified applications on emulated multisite heterogeneous platforms. Therefore ...
Robust data placement in urgent computing environments
Jason M. Cope, Nick Trebon, Henry M. Tufo, Pete Beckman
Pages: 1-13
doi>10.1109/IPDPS.2009.5160914

Distributed urgent computing workflows often require data to be staged between multiple computational resources. Since these workflows execute in shared computing environments where users compete for resource usage, it is necessary to allocate resources ...
Portable builds of HPC applications on diverse target platforms
Magdalena Slawinska, Jaroslaw Slawinski, Vaidy Sunderam
Pages: 1-8
doi>10.1109/IPDPS.2009.5160915

High-end machines at modern HPC centers are constantly undergoing hardware and system software upgrades - necessitating frequent rebuilds of application codes. The number of possible combinations of compilers, libraries, application build configurations, ...
Robust sequential resource allocation in heterogeneous distributed systems with random compute node failures
Vladimir Shestak, Edwin K. P. Chong, Anthony A. Maciejewski, Howard Jay Siegel
Pages: 1-12
doi>10.1109/IPDPS.2009.5160916

The problem of finding efficient workload distribution techniques is becoming increasingly important today for heterogeneous distributed systems where the availability of compute nodes may change spontaneously over time. Therefore, the resource-allocation ...
A robust dynamic optimization for MPI Alltoall operation
Hyacinthe Nzigou Mamadou, Takeshi Nanri, Kazuaki Murakami
Pages: 1-15
doi>10.1109/IPDPS.2009.5160917

The performance of Message Passing Interface collective communications is a widely discussed critical issue in high performance computing. In this paper we propose a mechanism that dynamically selects the most efficient MPI Alltoall algorithm for ...
Revisiting communication performance models for computational clusters
Alexey Lastovetsky, Vladimir Rychkov, Maureen O'Flynn
Pages: 1-11
doi>10.1109/IPDPS.2009.5160918

In this paper, we analyze restrictions of traditional models affecting the accuracy of analytical prediction of the execution time of collective communication operations. In particular, we show that the constant and variable contributions of processors ...
A component-based framework for the Cell Broadband Engine
Timothy D. R. Hartley, Umit V. Catalyurek
Pages: 1-14
doi>10.1109/IPDPS.2009.5160919

With the increasing trend of microprocessor manufacturers to rely on parallelism to increase their products' performance, there is an associated increasing need for simple techniques to leverage this hardware parallelism for good application performance. ...
The 18th Heterogeneity in Computing Workshop (HCW 2009)
Pages: 1-4
doi>10.1109/IPDPS.2009.5160920
Full text available: Publisher Site
High-throughput protein structure determination using grid computing
Jason W. Schmidberger, Blair Bethwaite, Colin Enticott, Mark A. Bate, Steve G. Androulakis, Noel Faux, Cyril F. Reboul, Jennifer M. N. Phan, James C. Whisstock, Wojtek J. Goscinski, Slavisa Garic, David Abramson, Ashley M. Buckle
Pages: 1-8
doi>10.1109/IPDPS.2009.5160921
Full text available: Publisher Site

Determining the X-ray crystallographic structures of proteins using the technique of molecular replacement (MR) can be a time and labor-intensive trial-and-error process, involving evaluating tens to hundreds of possible solutions to this complex 3D ...
Folding@home: Lessons from eight years of volunteer distributed computing
Adam L. Beberg, Daniel L. Ensign, Guha Jayachandran, Siraj Khaliq, Vijay S. Pande
Pages: 1-8
doi>10.1109/IPDPS.2009.5160922
Full text available: Publisher Site

Accurate simulation of biophysical processes requires vast computing resources. Folding@home is a distributed computing system first released in 2000 to provide such resources needed to simulate protein folding and other biomolecular phenomena. Now operating ...
Parallel reconstruction of neighbor-joining trees for large multiple sequence alignments using CUDA
Yongchao Liu, Bertil Schmidt, Douglas L. Maskell
Pages: 1-8
doi>10.1109/IPDPS.2009.5160923
Full text available: Publisher Site

Computing large multiple protein sequence alignments using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. ClustalW uses a three-stage processing pipeline: (i) pairwise distance computation; (ii) ...
Accelerating error correction in high-throughput short-read DNA sequencing data with CUDA
Haixiang Shi, Bertil Schmidt, Weiguo Liu, Wolfgang Muller-Wittig
Pages: 1-8
doi>10.1109/IPDPS.2009.5160924
Full text available: Publisher Site

Emerging DNA sequencing technologies open up exciting new opportunities for genome sequencing by generating read data with a massive throughput. However, produced reads are significantly shorter and more error-prone compared to the traditional Sanger ...
Parallel Monte Carlo study on caffeine-DNA interaction in aqueous solution
M. D. Kalugin, A. V. Teplukhin
Pages: 1-8
doi>10.1109/IPDPS.2009.5160925
Full text available: Publisher Site

Monte Carlo simulation of the caffeine-DNA interaction in aqueous solution at room temperature was carried out using parallel calculations on a supercomputer. Very large simulation boxes were used containing a superhelical B-DNA fragment surrounded by caffeine ...
Dynamic parallelization for RNA structure comparison
Eric Snow, Eric Aubanel, Patricia Evans
Pages: 1-8
doi>10.1109/IPDPS.2009.5160926
Full text available: Publisher Site

In this paper we describe the parallelization of a dynamic programming algorithm used to find common RNA secondary structures including pseudoknots and similar structures. The sequential algorithm is recursive and uses memoization and data-driven selective ...
Accelerating HMMer on FPGAs using systolic array based architecture
Yanteng Sun, Peng Li, Guochang Gu, Yuan Wen, Yuan Liu, Dong Liu
Pages: 1-8
doi>10.1109/IPDPS.2009.5160927
Full text available: Publisher Site

HMMer is a widely-used bioinformatics software package that uses profile HMMs (Hidden Markov Models) to model the primary structure consensus of a family of protein or nucleic acid sequences. However, with the rapid growth of both sequence and model ...
Resource-efficient computing paradigm for computational protein modeling applications
Yaohang Li, Douglas Wardell, Vincent Freeh
Pages: 1-8
doi>10.1109/IPDPS.2009.5160928
Full text available: Publisher Site

Many computational protein modeling applications using numerical methods such as Molecular Dynamics (MD), Monte Carlo (MC), or Genetic Algorithms (GA) require a large number of energy estimations of the protein molecular system. A typical energy function ...
Exploring FPGAs for accelerating the phylogenetic likelihood function
N. Alachiotis, E. Sotiriades, A. Dollas, A. Stamatakis
Pages: 1-8
doi>10.1109/IPDPS.2009.5160929
Full text available: Publisher Site

Driven by novel biological wet lab techniques such as pyrosequencing there has been an unprecedented molecular data explosion over the last 2–3 years. The growth of biological sequence data has significantly out-paced Moore's law. This development ...
Long time-scale simulations of in vivo diffusion using GPU hardware
Elijah Roberts, John E. Stone, Leonardo Sepulveda, Wen-Mei W. Hwu, Zaida Luthey-Schulten
Pages: 1-8
doi>10.1109/IPDPS.2009.5160930
Full text available: Publisher Site

To address the problem of performing long time simulations of biochemical pathways under in vivo cellular conditions, we have developed a lattice-based, reaction-diffusion model that uses the graphics processing unit (GPU) as a computational co-processor. ...
An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases
Lukasz Ligowski, Witold Rudnicki
Pages: 1-8
doi>10.1109/IPDPS.2009.5160931
Full text available: Publisher Site

The Smith Waterman algorithm for sequence alignment is one of the main tools of bioinformatics. It is used for sequence similarity searches and alignment of similar sequences. The high end Graphical Processing Unit (GPU), used for processing graphics ...
Stochastic multi-particle Brownian Dynamics simulation of biological ion channels: A Finite Element approach
May Siksik, Vikram Krishnamurthy
Pages: 1-6
doi>10.1109/IPDPS.2009.5160932
Full text available: Publisher Site

Biological ion channels are protein tubes that span the cell membrane. They provide a conduction pathway and regulate the flow of ions through the low dielectric membrane. Modeling the dynamics of these channels is crucial in understanding their functionality. ...
Message from the workshop chairs
Pages: 1-3
doi>10.1109/IPDPS.2009.5160933
Full text available: Publisher Site
Smart read/write for MPI-IO
Saba Sehrish, Jun Wang
Pages: 1-8
doi>10.1109/IPDPS.2009.5160934
Full text available: Publisher Site

We present a case for automating the selection of MPI-IO performance optimizations, with the ultimate goal of relieving the application programmer of these details, thereby improving their productivity. Programmer productivity has always been overlooked ...
Sparse collective operations for MPI
Torsten Hoefler, Jesper Larsson Traff
Pages: 1-8
doi>10.1109/IPDPS.2009.5160935
Full text available: Publisher Site

We discuss issues in designing sparse (nearest neighbor) collective operations for communication and reduction operations in small neighborhoods for the Message Passing Interface (MPI). We propose three such operations, namely a sparse gather operation, ...
GPAW optimized for Blue Gene/P using hybrid programming
Mads Ruben Burgdorff Kristensen, Hans Henrik Happe, Brian Vinter
Pages: 1-6
doi>10.1109/IPDPS.2009.5160936
Full text available: Publisher Site

In this work we present optimizations of a Grid-based projector-augmented wave method software, GPAW [1] for the Blue Gene/P architecture. The improvements are achieved by exploring the advantage of shared and distributed memory programming also known ...
CuPP - A framework for easy CUDA integration
Jens Breitbart
Pages: 1-8
doi>10.1109/IPDPS.2009.5160937
Full text available: Publisher Site

This paper reports on CuPP, our newly developed C++ framework designed to ease integration of NVIDIA's GPGPU system CUDA into existing C++ applications. CuPP provides interfaces to reoccurring tasks that are easier to use than the standard CUDA interfaces. ...
A generalized, distributed analysis system for optimization of Parallel Applications
Hung-Hsun Su, Max Billingsley, Alan D. George
Pages: 1-8
doi>10.1109/IPDPS.2009.5160938
Full text available: Publisher Site

Developing a high performance parallel application is difficult. An application must often be analyzed and optimized by the programmer before reaching an acceptable level of performance. Performance tools that collect and visualize performance data can ...
CellFS: Taking the "DMA" out of Cell programming
Latchesar Ionkov, Aki Nyrhinen, Andrey Mirtchovski
Pages: 1-8
doi>10.1109/IPDPS.2009.5160939
Full text available: Publisher Site

In this paper we present a new programming model for the Cell BE architecture called CellFS. CellFS aims to simplify the task of managing I/O between the local store of the synergistic processing units and main memory of the Cell. The CellFS support ...
Fast development of dense linear algebra codes on graphics processors
M. Jesus Zafont, Alberto Martin, Francisco Igual, Enrique S. Quintana-Orti
Pages: 1-8
doi>10.1109/IPDPS.2009.5160940
Full text available: Publisher Site

We present an application programming interface (API) for the C programming language that facilitates the development of dense linear algebra algorithms on graphics processors applying the FLAME methodology. The interface, built on top of the NVIDIA ...
An integrated approach to improving the parallel application development process
Gregory R. Watson, Craig E. Rasmussen, Beth R. Tibbitts
Pages: 1-8
doi>10.1109/IPDPS.2009.5160941
Full text available: Publisher Site

The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical ...
Triple-C: Resource-usage prediction for semi-automatic parallelization of groups of dynamic image-processing tasks
Rob Albers, Eric Suijs, Peter H. N. de With
Pages: 1-8
doi>10.1109/IPDPS.2009.5160942
Full text available: Publisher Site

With the emergence of dynamic video processing, such as in image analysis, runtime estimation of resource usage would be highly attractive for automatic parallelization and QoS control with shared resources. A possible solution is to characterize the ...
MPIXternal: A library for a portable adjustment of parallel MPI applications to heterogeneous environments
Carsten Clauss, Stefan Lankes, Thomas Bemmerl
Pages: 1-8
doi>10.1109/IPDPS.2009.5160943
Full text available: Publisher Site

Nowadays, common systems in the area of high performance computing exhibit highly hierarchical architectures. As a result, achieving satisfactory application performance demands an adaptation of the respective parallel algorithm to such systems. This, ...
A lightweight stream-processing library using MPI
Alan Wagner, Camilo Rostoker
Pages: 1-8
doi>10.1109/IPDPS.2009.5160944
Full text available: Publisher Site

We describe the design of a lightweight library using MPI to support stream-processing on acyclic process structures. The design can be used to connect together arbitrary modules where each module can be its own parallel MPI program. We make extensive ...
Preface
Pages: 1-3
doi>10.1109/IPDPS.2009.5160945
Full text available: Publisher Site
Robust vote sampling in a P2P media distribution system
Rameez Rahman, David Hales, Michel Meulpolder, Vincent Heinink, Johan Pouwelse, Henk Sips
Pages: 1-8
doi>10.1109/IPDPS.2009.5160946
Full text available: Publisher Site

The explosion of freely available media content through BitTorrent file sharing networks over the Internet means that users need guides or recommendations to find the right, high quality, content. Current systems rely on centralized servers to aggregate, ...
Reliable P2P networks: TrebleCast and TrebleCast
Ivan Hernandez-Serrano, Shadanan Sharma, Alberto Leon-Garcia
Pages: 1-8
doi>10.1109/IPDPS.2009.5160947
Full text available: Publisher Site

Node churn can have a severe impact on the performance of P2P applications. In this paper, we consider the design of reliable P2P networks that can provide predictable performance. We exploit the experimental finding that the age of a node can be a reliable ...
Ten weeks in the life of an eDonkey server
Frederic Aidouni, Matthieu Latapy, Clemence Magnien
Pages: 1-5
doi>10.1109/IPDPS.2009.5160948
Full text available: Publisher Site

This paper presents a capture of the queries managed by an eDonkey server during almost 10 weeks, leading to the observation of almost 9 billion messages involving almost 90 million users and more than 275 million distinct files. Acquisition and management ...
Study on maintenance operations in a chord-based Peer-to-Peer session initiation protocol overlay network
Jouni Maenpaa, Gonzalo Camarillo
Pages: 1-9
doi>10.1109/IPDPS.2009.5160949
Full text available: Publisher Site

Peer-to-Peer Session Initiation Protocol (P2PSIP) is a new technology being standardized in the Internet Engineering Task Force. A P2PSIP network consists of a collection of nodes organized in a peer-to-peer fashion for the purpose of enabling real-time ...
Resource advertising in PROSA P2P network
Vincenza Carchiolo, Antonio Lima, Giuseppe Mangioni
Pages: 1-7
doi>10.1109/IPDPS.2009.5160950
Full text available: Publisher Site

The P2P communication paradigm is a successful solution to the problem of resource sharing, as shown by the numerous real overlay networks present on the Internet. One of the issues of P2P networks is how a resource shared by a peer can be made known to the other ...
Relaxed-2-Chord: Efficiency, flexibility and provable stretch
Gennaro Cordasco, Francesca Della Corte, Alberto Negro, Alessandra Sala, Vittorio Scarano
Pages: 1-8
doi>10.1109/IPDPS.2009.5160951
Full text available: Publisher Site

Several proposals have been presented to supplement the traditional measure of routing efficiency in P2P networks, i.e. the (average) number of hops for lookup operations, with measures of the latency incurred in the underlying network. So far, no solution ...
Measurement of eDonkey activity with distributed honeypots
Oussama Allali, Matthieu Latapy, Clemence Magnien
Pages: 1-8
doi>10.1109/IPDPS.2009.5160952
Full text available: Publisher Site

Collecting information about user activity in peer-to-peer systems is a key but challenging task. We describe here a distributed platform for doing so on the eDonkey network, relying on a group of honeypot peers which claim to have certain files and ...
Network awareness of P2P live streaming applications
Delia Ciullo, Maria Antonieta Garcia, Akos Horvath, Emilio Leonardi, Marco Mellia, Dario Rossi, Miklos Telek, Paolo Veglia
Pages: 1-7
doi>10.1109/IPDPS.2009.5160953
Full text available: Publisher Site

Early P2P-TV systems have already attracted millions of users, and many new commercial solutions are entering this market. Little information is however available about how these systems work. In this paper we present large scale sets of experiments ...
BarterCast: A practical approach to prevent lazy freeriding in P2P networks
M. Meulpolder, J. A. Pouwelse, D. H. J. Epema, H. J. Sips
Pages: 1-8
doi>10.1109/IPDPS.2009.5160954
Full text available: Publisher Site

A well-known problem in P2P systems is freeriding, where users do not share content if there is no incentive to do so. In this paper, we distinguish lazy freeriders that are merely reluctant to share but follow the protocol, versus die-hard freeriders ...
Underlay awareness in P2P systems: Techniques and challenges
Osama Abboud, Aleksandra Kovacevic, Kalman Graffi, Konstantin Pussep, Ralf Steinmetz
Pages: 1-8
doi>10.1109/IPDPS.2009.5160955
Full text available: Publisher Site

Peer-to-peer (P2P) applications have recently attracted a large number of Internet users. Traditional P2P systems however, suffer from inefficiency due to lack of information from the underlay, i.e. the physical network. Although there is a plethora ...
Analysis of PPLive through active and passive measurements
Salvatore Spoto, Rossano Gaeta, Marco Grangetto, Matteo Sereno
Pages: 1-7
doi>10.1109/IPDPS.2009.5160956
Full text available: Publisher Site

The P2P-IPTV is an emerging class of Internet applications that is becoming very popular. The growing popularity of these rather bandwidth demanding multimedia streaming applications has the potential to flood the Internet with a huge amount of traffic.
A DDS-compliant P2P infrastructure for reliable and QoS-enabled data dissemination
Antonio Corradi, Luca Foschini
Pages: 1-8
doi>10.1109/IPDPS.2009.5160957
Full text available: Publisher Site

Recent trends in data-centric systems have motivated significant standardization efforts, such as the Data Distribution Service (DDS) to support data dissemination with guaranteed Quality of Service (QoS) in heterogeneous Internet environments. Notwithstanding ...
Peer-to-Peer beyond file sharing: Where are P2P systems going?
Renato Lo Cigno, Tommaso Pecorella, Matteo Sereno, Luca Veltri
Pages: 1-8
doi>10.1109/IPDPS.2009.5160958
Full text available: Publisher Site

Are P2P systems and applications here to stay? Or are they a bright meteor whose destiny is to disappear soon? In this paper we try to give a positive answer to the first question, highlighting reasons why the P2P paradigm should become an integral part ...
International workshop on hot topics in Peer-to-Peer systems - HOTP2P
Page: 1
doi>10.1109/IPDPS.2009.5160959
Full text available: Publisher Site
Ibis: Real-world problem solving using real-world grids
H. E. Bal, N. Drost, R. Kemp, J. Maassen, R. V. van Nieuwpoort, C. van Reeuwijk, F. J. Seinstra
Pages: 1-8
doi>10.1109/IPDPS.2009.5160960
Full text available: Publisher Site

Ibis is an open source software framework that drastically simplifies the process of programming and deploying large-scale parallel and distributed grid applications. Ibis supports a range of programming models that yield efficient implementations, even ...
Grid-enabled hydropad: A scientific application for benchmarking GridRPC-based programming systems
Michele Guidolin, Alexey Lastovetsky
Pages: 1-8
doi>10.1109/IPDPS.2009.5160961
Full text available: Publisher Site

GridRPC is a standard API that allows an application to easily interface with a Grid environment. It implements a remote procedure call with a single task map and client-server communication model. In addition to non-performance-related benefits, scientific ...
Modelling memory requirements for grid applications
Tanvire Elahi, Cameron Kiddle, Rob Simmonds
Pages: 1-8
doi>10.1109/IPDPS.2009.5160962
Full text available: Publisher Site

Automating the execution of applications in grid computing environments is a complicated task due to the heterogeneity of computing resources, resource usage policies, and application requirements. Applications differ in memory usage, performance, scalability ...
Managing the construction and use of Functional Performance Models in a Grid environment
Robert Higgins, Alexey Lastovetsky
Pages: 1-8
doi>10.1109/IPDPS.2009.5160963
Full text available: Publisher Site

This paper presents a tool, the Performance Model Manager, which addresses the complexity of the construction and management of a set of Functional Performance Models on a computing server in a Grid environment. The operation of the tool and the features ...
Assessing the impact of future reconfigurable optical networks on application performance
Jason Maassen, Kees Verstoep, Henri E. Bal, Paola Grosso, Cees de Laat
Pages: 1-8
doi>10.1109/IPDPS.2009.5160964
Full text available: Publisher Site

The introduction of optical private networks (lightpaths) has significantly improved the capacity of long distance network links, making it feasible to run large parallel applications in a distributed fashion on multiple sites of a computational grid. ...
A semantic-aware information system for multi-domain applications over service grids
Carmela Comito, Carlo Mastroianni, Domenico Talia
Pages: 1-8
doi>10.1109/IPDPS.2009.5160965
Full text available: Publisher Site

Service-oriented Grid frameworks offer resources and facilities to support the design and execution of distributed applications in different domains, ranging from scientific applications and public computing projects to commercial and industrial applications. ...
Using a market economy to provision compute resources across planet-wide clusters
Murray Stokely, Jim Winget, Ed Keyes, Carrie Grimes, Benjamin Yolken
Pages: 1-8
doi>10.1109/IPDPS.2009.5160966
Full text available: Publisher Site

We present a practical, market-based solution to the resource provisioning problem in a set of heterogeneous resource clusters. We focus on provisioning rather than immediate scheduling decisions to allow users to change long-term job specifications ...
Improving GridWay with network information: Tuning the monitoring tool
Luis Tomas, Agustin Caminero, Blanca Caminero, Carmen Carrion
Pages: 1-8
doi>10.1109/IPDPS.2009.5160967
Full text available: Publisher Site

The aggregation of heterogeneous and geographically distributed resources for new science and engineering applications has been made possible thanks to the deployment of Grid technologies. These systems have communication network requirements which should ...
INFN-CNAF activity in the TIER-1 and GRID for LHC experiments
M. Bencivenni, M. Canaparo, F. Capannini, L. Carota, M. Carpene, A. Cavalli, A. Ceccanti, M. Cecchi, D. Cesini, A. Chierici, V. Ciaschini, A. Cristofori, S. Dal Pra, L. dell'Agnello, D. De Girolamo, M. Donatelli, D. N. Dongiovanni, E. Fattibene, T. Ferrari, A. Ferraro, A. Forti, A. Ghiselli, D. Gregori, G. Guizzunti, A. Italiano, L. Magnoni, B. Martelli, M. Mazzucato, G. Misurelli, M. Onofri, A. Paolini, A. Prosperini, P. P. Ricci, E. Ronchieri, F. Rosso, D. Salomoni, V. Sapunenko, V. Venturi, R. Veraldi, P. Veronesi, C. Vistoli, D. Vitlacil, S. Zani, R. Zappi
Pages: 1-9
doi>10.1109/IPDPS.2009.5160968
Full text available: Publisher Site

The four High Energy Physics (HEP) detectors at the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) are among the most important experiments where the National Institute of Nuclear Physics (INFN) is being actively ...
Evaluation of replication and fault detection in P2P-MPI
Stephane Genaud, Choopan Rattanapoka
Pages: 1-8
doi>10.1109/IPDPS.2009.5160969
Full text available: Publisher Site

We present in this paper an evaluation of fault management in the grid middleware P2P-MPI. One of P2P-MPI's objectives is to support environments using commodity hardware. Hence, running programs is failure prone and particular attention must be paid ...
High performance grid computing - HPGC
Page: 1
doi>10.1109/IPDPS.2009.5160970
Full text available: Publisher Site

This workshop gives a forum to researchers and engineers to present their results in grid and distributed computing. Special areas of interest are grid middleware, grid applications, grid benchmarking, data distribution and replication on the grid, fault ...
Clock gate on abort: Towards energy-efficient hardware Transactional Memory
Sutirtha Sanyal, Sourav Roy, Adrian Cristal, Osman S. Unsal, Mateo Valero
Pages: 1-8
doi>10.1109/IPDPS.2009.5160971
Full text available: Publisher Site

Transactional Memory (TM) is an emerging technology which promises to make parallel programming easier compared to earlier lock based approaches. However, as with any form of speculation, Transactional Memory too wastes a considerable amount of energy ...
Time-efficient power-aware scheduling for periodic real-time tasks
Da-Ren Chen, Chiun-Chieh Hsu, Ming-Fong Lai
Pages: 1-8
doi>10.1109/IPDPS.2009.5160972
Full text available: Publisher Site

In this paper, we pay attention to the inter-task dynamic voltage scaling (DVS) algorithms for periodic real-time task systems. We propose a fast dynamic reclaiming scheme for power-aware hard real-time systems and discuss their performances and time ...
Power-aware load balancing of large scale MPI applications
Maja Etinski, Julita Corbalan, Jesus Labarta, Mateo Valero, Alex Veidenbaum
Pages: 1-8
doi>10.1109/IPDPS.2009.5160973
Full text available: Publisher Site

Power consumption is a very important issue for the HPC community, both at the level of a single application and at the level of the whole workload. Load imbalance of an MPI application can be exploited to save CPU energy without penalizing the execution time. An application ...
Analysis of trade-off between power saving and response time in disk storage systems
E. Otoo, D. Rotem, S. C. Tsao
Pages: 1-8
doi>10.1109/IPDPS.2009.5160974
Full text available: Publisher Site

It is anticipated that in the near future disk storage systems will surpass application servers and will become the primary consumer of power in the data centers. Shutting down of inactive disks is one of the more widespread solutions to save power consumption ...
The GREEN-NET framework: Energy efficiency in large scale distributed systems
Georges Da Costa, Jean-Patrick Gelas, Yiannis Georgiou, Laurent Lefevre, Anne-Cecile Orgerie, Jean-Marc Pierson, Olivier Richard, Kamal Sharma
Pages: 1-8
doi>10.1109/IPDPS.2009.5160975
Full text available: Publisher Site

The question of energy savings has long been a matter of concern in mobile distributed systems and battery-constrained systems. However, for large-scale non-mobile distributed systems, which nowadays reach impressive sizes, the energy ...
Enabling autonomic power-aware management of instrumented data centers
Nanyan Jiang, Manish Parashar
Pages: 1-8
doi>10.1109/IPDPS.2009.5160976
Full text available: Publisher Site

Sensor networks support flexible, non-intrusive and fine-grained data collection and processing and can enable online monitoring of data center operating conditions as well as autonomic data center management. This paper describes the architecture and ...
Power-aware dynamic task scheduling for heterogeneous accelerated clusters
Tomoaki Hamano, Toshio Endo, Satoshi Matsuoka
Pages: 1-8
doi>10.1109/IPDPS.2009.5160977
Full text available: Publisher Site

Recent accelerators such as GPUs achieve better cost-performance and watt-performance ratio, while the range of their application is more limited than general CPUs. Thus heterogeneous clusters and supercomputers equipped both with accelerators and general ...
The Green500 List: Year one
W. Feng, T. Scogland
Pages: 1-7
doi>10.1109/IPDPS.2009.5160978
Full text available: Publisher Site

The latest release of the Green500 List in November 2008 marked its one-year anniversary. As such, this paper aims to provide an analysis and retrospective examination of the Green500 List in order to understand how the list has evolved and what trends ...
Modeling and evaluating energy-performance efficiency of parallel processing on multicore based power aware systems
Rong Ge, Xizhou Feng, Kirk W. Cameron
Pages: 1-8
doi>10.1109/IPDPS.2009.5160979
Full text available: Publisher Site

In energy efficient high end computing, a typical problem is to find an energy-performance efficient resource allocation for computing a given workload. An analytical solution to this problem includes two steps: first estimating the performances and ...
On the energy efficiency of graphics processing units for scientific computing
S. Huang, S. Xiao, W. Feng
Pages: 1-8
doi>10.1109/IPDPS.2009.5160980
Full text available: Publisher Site

The graphics processing unit (GPU) has emerged as a computational accelerator that dramatically reduces the time to discovery in high-end computing (HEC). However, while today's state-of-the-art GPU can easily reduce the execution time of a parallel ...
High-performance, power-aware computing - HPPAC
Page: 1
doi>10.1109/IPDPS.2009.5160981
Full text available: Publisher Site

High-performance computing is and has always been performance-oriented. However, a consequence of the push towards maximum performance is increased energy consumption, especially in datacenters and supercomputing centers. Moreover, as peak performance ...
Sensor network connectivity with multiple directional antennae of a given angular sum
Binay Bhattacharya, Yuzhuang Hu, Qiaosheng Shi, Evangelos Kranakis, Danny Krizanc
Pages: 1-11
doi>10.1109/IPDPS.2009.5160982
Full text available: Publisher Site

We investigate the problem of converting sets of sensors into strongly connected networks of sensors using multiple directional antennae. Consider a set S of n points in the plane modeling sensors of an ad hoc network. Each sensor uses a fixed number, ...
On scheduling dags to maximize area
Gennaro Cordasco, Arnold L. Rosenberg
Pages: 1-12
doi>10.1109/IPDPS.2009.5160983
Full text available: Publisher Site

A new quality metric, called area, is introduced for schedules that execute dags, i.e., computations having intertask dependencies. Motivated by the temporal unpredictability encountered when computing over the Internet, the goal under the new metric ...
Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors
Michael Boyer, David Tarjan, Scott T. Acton, Kevin Skadron
Pages: 1-12
doi>10.1109/IPDPS.2009.5160984
Full text available: Publisher Site

The availability of easily programmable manycore CPUs and GPUs has motivated investigations into how to best exploit their tremendous computational power for scientific computing. Here we demonstrate how a systems biology application—detection ...
Path-robust multi-channel wireless networks
Arnold L. Rosenberg
Pages: 1-10
doi>10.1109/IPDPS.2009.5160985
Full text available: Publisher Site

A mathematical-plus-conceptual framework is presented for studying problems such as the following. One wants to deploy an n-node multi-channel wireless network N in an environment that is inaccessible for repair and/or that contains malicious adversaries. ...
Information spreading in stationary Markovian evolving graphs
Andrea E. F. Clementi, Francesco Pasquale, Angelo Monti, Riccardo Silvestri
Pages: 1-12
doi>10.1109/IPDPS.2009.5160986
Full text available: Publisher Site

Markovian evolving graphs [2] are dynamic-graph models where the links among a fixed set of nodes change during time according to an arbitrary Markovian rule. They are extremely general and they can well describe important dynamic-network scenarios.
Multiple priority customer service guarantees in cluster computing
Kaiqi Xiong
Pages: 1-12
doi>10.1109/IPDPS.2009.5160987
Full text available: Publisher Site

Cluster computing is an efficient computing paradigm for solving large-scale computational problems. Resource management is an essential part in such a computing system. A service provider uses computational resources to process a customer's service ...
A cross-input adaptive framework for GPU program optimizations
Yixun Liu, Eddy Z. Zhang, Xipeng Shen
Pages: 1-10
doi>10.1109/IPDPS.2009.5160988
Full text available: Publisher Site

Recent years have seen a trend in using graphic processing units (GPU) as accelerators for general-purpose computing. The inexpensive, single-chip, massively parallel architecture of GPU has evidentially brought factors of speedup to many numerical applications. ...
Packer: An innovative space-time-efficient parallel garbage collection algorithm based on virtual spaces
Shaoshan Liu, Ligang Wang, Xiao-Feng Li, Jean-Luc Gaudiot
Pages: 1-11
doi>10.1109/IPDPS.2009.5160989
Full text available: Publisher Site

The fundamental challenge of garbage collector (GC) design is to maximize the recycled space with minimal time overhead. For efficient memory management, in many GC designs the heap is divided into large object space (LOS) and non-large object space ...
On reducing misspeculations in a pipelined scheduler
R. Gran, E. Morancho, A. Olive, J. M. Llaberia
Pages: 1-12
doi>10.1109/IPDPS.2009.5160990
Full text available: Publisher Site

Pipelining the scheduling logic, which exposes and exploits the instruction level parallelism, degrades processor performance. In a 4-issue processor, our evaluations show that pipelining the scheduling logic over two cycles degrades performance by 10% ...
Best-effort parallel execution framework for Recognition and mining applications
Jiayuan Meng, Srimat Chakradhar, Anand Raghunathan
Pages: 1-12
doi>10.1109/IPDPS.2009.5160991
Full text available: Publisher Site

Recognition and mining (RM) applications are an emerging class of computing workloads that will be commonly executed on future multi-core and many-core computing platforms. The explosive growth of input data and the use of more sophisticated algorithms ...
A metascalable computing framework for large spatiotemporal-scale atomistic simulations
Ken-ichi Nomura, Richard Seymour, Weiqiang Wang, Hikmet Dursun, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta, Fuyuki Shimojo, Lin H. Yang
Pages: 1-10
doi>10.1109/IPDPS.2009.5160992
Full text available: Publisher Site

A metascalable (or “design once, scale on new architectures”) parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected ...
Design, implementation, and evaluation of transparent pNFS on Lustre
Weikuan Yu, Oleg Drokin, Jeffrey S. Vetter
Pages: 1-9
doi>10.1109/IPDPS.2009.5160993
Full text available: Publisher Site

Parallel NFS (pNFS) is an emergent open standard for parallelizing data transfer over a variety of I/O protocols. Prototypes of pNFS are actively being developed by industry and academia to examine its viability and possible enhancements. In this paper, ...
Online time constrained scheduling with penalties
Nicolas Thibault, Christian Laforest
Pages: 1-8
doi>10.1109/IPDPS.2009.5160994
Full text available: Publisher Site

In this paper we prove the (constant) competitiveness of an online algorithm for scheduling jobs on multiple machines, supporting a mechanism of penalties for the scheduler/operator. Our context (online, multiple machines, supporting parameterizable ...
A performance model for Fast Fourier Transform
Yan Li, Li Zhao, Haibo Lin, Alex Chunghen Chow, Jeffrey R. Diamond
Pages: 1-11
doi>10.1109/IPDPS.2009.5160995
Full text available: Publisher Site

The Fast Fourier Transform (FFT) has been considered one of the most important computing algorithms for decades. Its vast application domain makes it an important performance benchmark for new computer architectures. The most common Cooley-Tukey FFT ...
A general approach to toroidal mesh decontamination with local immunity
Fabrizio Luccio, Linda Pagli
Pages: 1-8
doi>10.1109/IPDPS.2009.5160996
Full text available: Publisher Site

Network decontamination is studied on a k-dimensional torus (n1 × … × nk), with k ≥ 1 and 2 ≤ n1 ≤ … ≤ nk. The decontamination is done by a set of agents moving on the net according ...
A snap-stabilizing point-to-point communication protocol in message-switched networks
Alain Cournier, Swan Dubois, Vincent Villain
Pages: 1-11
doi>10.1109/IPDPS.2009.5160997
Full text available: Publisher Site

A snap-stabilizing protocol, starting from any configuration, always behaves according to its specification. In this paper, we present a snap-stabilizing protocol to solve the message forwarding problem in a message-switched network. In this problem, ...
Helgrind+: An efficient dynamic race detector
Ali Jannesari, Kaibin Bao, Victor Pankratius, Walter F. Tichy
Pages: 1-13
doi>10.1109/IPDPS.2009.5160998
Full text available: Publisher Site

Finding synchronization defects is difficult due to non-deterministic orderings of parallel threads. Current tools for detecting synchronization defects tend to miss many data races or produce an overwhelming number of false alarms. In this paper, we ...
Compiler-enhanced incremental checkpointing for OpenMP applications
Greg Bronevetsky, Daniel Marques, Keshav Pingali, Sally McKee, Radu Rugina
Pages: 1-12
doi>10.1109/IPDPS.2009.5160999
Full text available: Publisher Site

As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures, ...
Efficient large-scale model checking
Kees Verstoep, Henri E. Bal, Jiri Barnat, Lubos Brim
Pages: 1-12
doi>10.1109/IPDPS.2009.5161000
Full text available: Publisher Site

Model checking is a popular technique to systematically and automatically verify system properties. Unfortunately, the well-known state explosion problem often limits the extent to which it can be applied to realistic specifications, due to the huge ...
Building a parallel pipelined external memory algorithm library
Andreas Beckmann, Roman Dementiev, Johannes Singler
Pages: 1-10
doi>10.1109/IPDPS.2009.5161001
Full text available: Publisher Site

Large and fast hard disks for little money have enabled the processing of huge amounts of data on a single machine. For this purpose, the well-established STXXL library provides a framework for external memory algorithms with an easy-to-use interface. ...
Combinatorial properties for efficient communication in distributed networks with local interactions
S. Nikoletseas, C. Raptopoulos, P. G. Spirakis
Pages: 1-11
doi>10.1109/IPDPS.2009.5161002
Full text available: Publisher Site

We investigate random intersection graphs, a combinatorial model that quite accurately abstracts distributed networks with local interactions between nodes blindly sharing critical resources from a limited globally available domain. We study important ...
NewMadeleine: An efficient support for high-performance networks in MPICH2
Guillaume Mercier, Francois Trahay, Elisabeth Brunet, Darius Buntinas
Pages: 1-12
doi>10.1109/IPDPS.2009.5161003
Full text available: Publisher Site

This paper describes how the NewMadeleine communication library has been integrated within the MPICH2 MPI implementation and the benefits brought. NewMadeleine is integrated as a Nemesis network module but the upper layers and in particular the CH3 layer ...
Annotation-based empirical performance tuning using Orio
Albert Hartono, Boyana Norris, P. Sadayappan
Pages: 1-11
doi>10.1109/IPDPS.2009.5161004
Full text available: Publisher Site

For many scientific applications, significant time is spent in tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (e.g., by using compiler options) to extensive code modifications that ...
Designing efficient sorting algorithms for manycore GPUs
Nadathur Satish, Mark Harris, Michael Garland
Pages: 1-10
doi>10.1109/IPDPS.2009.5161005
Full text available: Publisher Site

We describe the design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by CUDA. Our radix sort is the fastest GPU sort and our merge sort is the fastest comparison-based ...
Using hardware transactional memory for data race detection
Shantanu Gupta, Florin Sultan, Srihari Cadambi, Franjo Ivancic, Martin Rotteler
Pages: 1-11
doi>10.1109/IPDPS.2009.5161006
Full text available: Publisher Site

Widespread emergence of multicore processors will spur development of parallel applications, exposing programmers to degrees of hardware concurrency hitherto unavailable. Dependable multithreaded software will have to rely on the ability to dynamically ...
Treat-before-trick: Free-riding prevention for BitTorrent-like peer-to-peer networks
Kyuyong Shin, Douglas S. Reeves, Injong Rhee
Pages: 1-12
doi>10.1109/IPDPS.2009.5161007
Full text available: Publisher Site

In P2P file sharing systems, free-riders who use others' resources without sharing their own cause system-wide performance degradation. Existing techniques to counter free-riders are either complex (and thus not widely deployed), or easy to bypass (and ...
Performance analysis of Optical Packet Switches enhanced with electronic buffering
Zhenghao Zhang, Yuanyuan Yang
Pages: 1-9
doi>10.1109/IPDPS.2009.5161008
Full text available: Publisher Site

Optical networks with Wavelength Division Multiplexing (WDM), especially Optical Packet Switching (OPS) networks, have attracted much attention in recent years. However, OPS is still not yet ready for deployment, which is mainly because of its high packet ...
Unit disk graph and physical interference model: Putting pieces together
Emmanuelle Lebhar, Zvi Lotker
Pages: 1-8
doi>10.1109/IPDPS.2009.5161009
Full text available: Publisher Site

Modeling communications in wireless networks is a challenging task, since it requires a simple mathematical object on which efficient algorithms can be designed but which must also reflect the complex physical constraints inherent in wireless networks, ...
Minimizing startup costs for performance-critical threading
Anthony M. Castaldo, R. Clint Whaley
Pages: 1-8
doi>10.1109/IPDPS.2009.5161010
Full text available: Publisher Site

Using the well-known ATLAS and LAPACK dense linear algebra libraries, we demonstrate that the parallel management overhead (PMO) can grow with problem size on even statically scheduled parallel programs with minimal task interaction. Therefore, the widely ...
High-order stencil computations on multicore clusters
Liu Peng, Richard Seymour, Ken-ichi Nomura, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta, Alexander Loddoch, Michael Netzband, William R. Volz, Chap C. Wong
Pages: 1-11
doi>10.1109/IPDPS.2009.5161011
Full text available: Publisher Site

Stencil computation (SC) is of critical importance for broad scientific and engineering applications. However, it is a challenge to optimize complex, high-order SC on emerging clusters of multicore processors. We have developed a hierarchical SC parallelization ...
On the tradeoff between playback delay and buffer space in streaming
Alix L. H. Chow, Leana Golubchik, Samir Khuller, Yuan Yao
Pages: 1-12
doi>10.1109/IPDPS.2009.5161012
Full text available: Publisher Site

We consider the following basic question: a source node wishes to stream an ordered sequence of packets to a collection of receivers, which are distributed among a number of clusters. A node may send a packet to another node in its own cluster in one ...
Core-aware memory access scheduling schemes
Zhibin Fang, Xian-He Sun, Yong Chen, Surendra Byna
Pages: 1-12
doi>10.1109/IPDPS.2009.5161013
Full text available: Publisher Site

Multi-core processors have changed the conventional hardware structure and require a rethinking of system scheduling and resource management to utilize them efficiently. However, current multi-core systems are still using conventional single-core memory ...
Scalability challenges for massively parallel AMR applications
Brian Van Straalen, John Shalf, Terry Ligocki, Noel Keen, Woo-Sun Yang
Pages: 1-12
doi>10.1109/IPDPS.2009.5161014
Full text available: Publisher Site

PDE solvers using Adaptive Mesh Refinement on block structured grids are some of the most challenging applications to adapt to massively parallel computing environments. We describe optimizations to the Chombo AMR framework that enable it to scale efficiently ...
Accommodating bursts in distributed stream processing systems
Yannis Drougas, Vana Kalogeraki
Pages: 1-11
doi>10.1109/IPDPS.2009.5161015
Full text available: Publisher Site

Stream processing systems have become important, as applications like media broadcasting, sensor network monitoring and on-line data analysis increasingly rely on real-time stream processing. Such systems are often challenged by the bursty nature of ...
Efficient shared cache management through sharing-aware replacement and streaming-aware insertion policy
Yu Chen, Wenlong Li, Changkyu Kim, Zhizhong Tang
Pages: 1-11
doi>10.1109/IPDPS.2009.5161016
Full text available: Publisher Site

Multi-core processors with shared caches are now commonplace. However, prior works on shared cache management primarily focused on multi-programmed workloads. These schemes consider how to partition the cache space given that simultaneously-running applications ...
Minimizing total busy time in parallel scheduling with application to optical networks
Michele Flammini, Gianpiero Monaco, Luca Moscardelli, Hadas Shachnai, Mordechai Shalom, Tami Tamir, Shmuel Zaks
Pages: 1-12
doi>10.1109/IPDPS.2009.5161017
Full text available: Publisher Site

We consider a scheduling problem in which a bounded number of jobs can be processed simultaneously by a single machine. The input is a set of n jobs J = {J1, … , Jn}. Each job, Jj, is associated with an interval ...
A fusion-based approach for tolerating faults in finite state machines
Vinit Ogale, Bharath Balasubramanian, Vijay K. Garg
Pages: 1-11
doi>10.1109/IPDPS.2009.5161018
Full text available: Publisher Site

Given a set of n different deterministic finite state machines (DFSMs) modeling a distributed system, we examine the problem of tolerating f crash or Byzantine faults in such a system. The traditional approach to this problem involves replication and ...
HPCC RandomAccess benchmark for next generation supercomputers
Vikas Aggarwal, Yogish Sabharwal, Rahul Garg, Philip Heidelberger
Pages: 1-11
doi>10.1109/IPDPS.2009.5161019
Full text available: Publisher Site

In this paper we examine the key elements determining the performance of the HPC Challenge RandomAccess benchmark on next generation supercomputers. We find that the performance of this benchmark is closely related to the bisection bandwidth of the underlying ...
vCUDA: GPU accelerated high performance computing in virtual machines
Lin Shi, Hao Chen, Jianhua Sun
Pages: 1-11
doi>10.1109/IPDPS.2009.5161020
Full text available: Publisher Site

This paper describes vCUDA, a GPGPU (General Purpose Graphics Processing Unit) computing solution for virtual machines. vCUDA allows applications executing within virtual machines (VMs) to leverage hardware acceleration, which can be beneficial to the ...
Speculation-based conflict resolution in hardware transactional memory
Ruben Titos, Manuel E. Acacio, Jose M. Garcia
Pages: 1-12
doi>10.1109/IPDPS.2009.5161021
Full text available: Publisher Site

Conflict management is a key design dimension of hardware transactional memory (HTM) systems, and the implementation of efficient mechanisms for detection and resolution becomes critical when conflicts are not a rare event. Current designs address this ...
Efficient microarchitecture policies for accurately adapting to power constraints
Juan M. Cebrian, Juan L. Aragon, Jose M. Garcia, Pavlos Petoumenos, Stefanos Kaxiras
Pages: 1-12
doi>10.1109/IPDPS.2009.5161022
Full text available: Publisher Site

In the past years Dynamic Voltage and Frequency Scaling (DVFS) has been an effective technique that allowed microprocessors to match a predefined power budget. However, as process technology shrinks, DVFS becomes less effective (because of the increasing ...
A resource allocation approach for supporting time-critical applications in grid environments
Qian Zhu, Gagan Agrawal
Pages: 1-12
doi>10.1109/IPDPS.2009.5161023
Full text available: Publisher Site

There are many grid-based applications where a timely response to an important event is needed. Often such response can require a significant computation and possibly communication, and it can be very challenging to complete it within the time-frame ...
Energy minimization for periodic real-time tasks on heterogeneous processing units
Jian-Jia Chen, Andreas Schranzhofer, Lothar Thiele
Pages: 1-12
doi>10.1109/IPDPS.2009.5161024
Full text available: Publisher Site

Adopting multiple processing units to enhance the computing capability or reduce the power consumption has been widely accepted for designing modern computing systems. Such configurations impose challenges on energy efficiency in hardware and software ...
Scalable RDMA performance in PGAS languages
Montse Farreras, George Almasi, Calin Cascaval, Toni Cortes
Pages: 1-12
doi>10.1109/IPDPS.2009.5161025
Full text available: Publisher Site

Partitioned Global Address Space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed memory machines, or cluster of SMPs. Users can program large scale machines with easy-to-use, ...
On the complexity of mapping pipelined filtering services on heterogeneous platforms
Anne Benoit, Fanny Dufosse, Yves Robert
Pages: 1-12
doi>10.1109/IPDPS.2009.5161026
Full text available: Publisher Site

In this paper, we explore the problem of mapping filtering services on large-scale heterogeneous platforms. Two important optimization criteria should be considered in such a framework. The period, which is the inverse of the throughput, measures the ...
Automatic detection of parallel applications computation phases
Juan Gonzalez, Judit Gimenez, Jesus Labarta
Pages: 1-11
doi>10.1109/IPDPS.2009.5161027
Full text available: Publisher Site

Analyzing parallel programs has become increasingly difficult due to the immense amount of information collected on large systems. The use of clustering techniques has been proposed to analyze applications. However, while the objective of previous works ...
An asynchronous leader election algorithm for dynamic networks
Rebecca Ingram, Patrick Shields, Jennifer E. Walter, Jennifer L. Welch
Pages: 1-12
doi>10.1109/IPDPS.2009.5161028
Full text available: Publisher Site

An algorithm for electing a leader in an asynchronous network with dynamically changing communication topology is presented. The algorithm ensures that, no matter what pattern of topology changes occur, if topology changes cease, then eventually every ...
Small-file access in parallel file systems
Philip Carns, Sam Lang, Robert Ross, Murali Vilayannur, Julian Kunkel, Thomas Ludwig
Pages: 1-11
doi>10.1109/IPDPS.2009.5161029
Full text available: Publisher Site

Today's computational science demands have resulted in ever larger parallel computers, and storage systems have grown to match these demands. Parallel file systems used in this environment are increasingly specialized to extract the highest possible ...
Concurrent SSA for general barrier-synchronized parallel programs
Harshit Shah, R. K. Shyamasundar, Pradeep Varma
Pages: 1-12
doi>10.1109/IPDPS.2009.5161030
Full text available: Publisher Site

Static single assignment (SSA) form has been widely studied and used for sequential programs. This form enables many compiler optimizations to be done efficiently. Work on concurrent static single assignment form (CSSA) for concurrent programs is focused ...
Parallel data-locality aware stencil computations on modern micro-architectures
Matthias Christen, Olaf Schenk, Esra Neufeld, Peter Messmer, Helmar Burkhart
Pages: 1-10
doi>10.1109/IPDPS.2009.5161031
Full text available: Publisher Site

Novel micro-architectures including the Cell Broadband Engine Architecture and graphics processing units are attractive platforms for compute-intensive simulations. This paper focuses on stencil computations arising in the context of a biomedical simulation ...
Taking the heat off transactions: Dynamic selection of pessimistic concurrency control
Nehir Sonmez, Tim Harris, Adrian Cristal, Osman S. Unsal, Mateo Valero
Pages: 1-10
doi>10.1109/IPDPS.2009.5161032
Full text available: Publisher Site

In this paper we investigate feedback-directed dynamic selection between different implementations of atomic blocks. We initially execute atomic blocks using STM with optimistic concurrency control. At runtime, we identify “hot” variables ...
Competitive buffer management with packet dependencies
Alex Kesselman, Boaz Patt-Shamir, Gabriel Scalosub
Pages: 1-12
doi>10.1109/IPDPS.2009.5161033
Full text available: Publisher Site

We introduce the problem of managing a FIFO buffer of bounded space, where arriving packets have dependencies among them. Our model is motivated by the scenario where large data frames must be split into multiple packets, because maximum packet size ...
Autonomic management of non-functional concerns in distributed & parallel application programming
Marco Aldinucci, Marco Danelutto, Peter Kilpatrick
Pages: 1-12
doi>10.1109/IPDPS.2009.5161034
Full text available: Publisher Site

An approach to the management of non-functional concerns in massively parallel and/or distributed architectures that marries parallel programming patterns with autonomic computing is presented. The necessity and suitability of the adoption of autonomic ...
An approach for matching communication patterns in parallel applications
Chao Ma, Yong Meng Teo, Verdi March, Naixue Xiong, Ioana Romelia Pop, Yan Xiang He, Simon See
Pages: 1-12
doi>10.1109/IPDPS.2009.5161035
Full text available: Publisher Site

Interprocessor communication is an important factor in determining the performance scalability of parallel systems. The communication requirements of a parallel application can be quantified to understand its communication pattern and communication pattern ...
Elastic scaling of data parallel operators in stream processing
Scott Schneider, Henrique Andrade, Bugra Gedik, Alain Biem, Kun-Lung Wu
Pages: 1-12
doi>10.1109/IPDPS.2009.5161036
Full text available: Publisher Site

We describe an approach to elastically scale the performance of a data analytics operator that is part of a streaming application. Our techniques focus on dynamically adjusting the amount of computation an operator can carry out in response to changes ...
Multi-users scheduling in parallel systems
Erik Saule, Denis Trystram
Pages: 1-9
doi>10.1109/IPDPS.2009.5161037
Full text available: Publisher Site

We are interested in this paper to study scheduling problems in systems where many users compete to perform their respective jobs on shared parallel resources. Each user has specific needs or wishes for computing his/her jobs expressed as a function ...
Parallel accelerated cartesian expansions for particle dynamics simulations
M. Vikram, A. Baczewski, B. Shanker, S. Aluru
Pages: 1-11
doi>10.1109/IPDPS.2009.5161038
Full text available: Publisher Site

Rapid evaluation of potentials in large physical systems plays a crucial role in several fields and has been an intensely studied topic on parallel computers. Computational methods and associated parallel algorithms tend to vary depending on the potential ...
A framework for efficient and scalable execution of domain-specific templates on GPUs
Narayanan Sundaram, Anand Raghunathan, Srimat T. Chakradhar
Pages: 1-12
doi>10.1109/IPDPS.2009.5161039
Full text available: Publisher Site

Graphics Processing Units (GPUs) have emerged as important players in the transition of the computing industry from sequential to multi- and many-core computing. We propose a software framework for execution of domain-specific parallel templates on GPUs, ...
Dynamic high-level scripting in parallel applications
Filippo Gioachin, Laxmikant V. Kale
Pages: 1-11
doi>10.1109/IPDPS.2009.5161040
Full text available: Publisher Site

Parallel applications typically run in batch mode, sometimes after long waits in a scheduler queue. In some situations, it would be desirable to interactively add new functionality to the running application, without having to recompile and rerun it. ...
Remote-spanners: What to know beyond neighbors
Philippe Jacquet, Laurent Viennot
Pages: 1-10
doi>10.1109/IPDPS.2009.5161041
Full text available: Publisher Site

Motivated by the fact that neighbors are generally known in practical routing algorithms, we introduce the notion of remote-spanner. Given an unweighted graph G, a sub-graph H with vertex set V(H) = V(G) is an (α, β)-remote-spanner if for ...
Self-stabilizing minimum-degree spanning tree within one from the optimal degree
Lelia Blin, Maria Gradinariu Potop-Butucaru, Stephane Rovedakis
Pages: 1-11
doi>10.1109/IPDPS.2009.5161042
Full text available: Publisher Site

We propose a self-stabilizing algorithm for constructing a Minimum-Degree Spanning Tree (MDST) in undirected networks. Starting from an arbitrary state, our algorithm is guaranteed to converge to a legitimate state describing a spanning tree whose maximum ...
Input-independent, scalable and fast string matching on the Cray XMT
Oreste Villa, Daniel Chavarria-Miranda, Kristyn Maschhoff
Pages: 1-12
doi>10.1109/IPDPS.2009.5161043
Full text available: Publisher Site

String searching is at the core of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast, and ...
Static strategies for worksharing with unrecoverable interruptions
A. Benoit, Y. Robert, A. L. Rosenberg, F. Vivien
Pages: 1-12
doi>10.1109/IPDPS.2009.5161044
Full text available: Publisher Site

One has a large workload that is “divisible”—its constituent work's granularity can be adjusted arbitrarily;—and one has access to p remote computers that can assist in computing the workload. The problem is that the remote computers ...
Efficient scheduling of task graph collections on heterogeneous resources
Matthieu Gallet, Loris Marchal, Frederic Vivien
Pages: 1-11
doi>10.1109/IPDPS.2009.5161045
Full text available: Publisher Site

In this paper, we focus on scheduling jobs on computing Grids. In our model, a Grid job is made of a large collection of input data sets, which must all be processed by the same task graph or workflow, thus resulting in a collection of task graphs problem. ...
Handling OS jitter on multicore multithreaded systems
Pradipta De, Vijay Mann, Umang Mittal
Pages: 1-12
doi>10.1109/IPDPS.2009.5161046
Full text available: Publisher Site

Various studies have shown that OS jitter can degrade parallel program performance considerably at large processor counts. Most sources of system jitter fall broadly into 5 categories - user space processes, kernel threads, interrupts, SMT interference ...
An upload bandwidth threshold for peer-to-peer Video-on-Demand scalability
Yacine Boufkhad, Fabien Mathieu, Fabien de Montgolfier, Diego Perino, Laurent Viennot
Pages: 1-10
doi>10.1109/IPDPS.2009.5161047
Full text available: Publisher Site

We consider the fully distributed Video-on-Demand problem, where n nodes called boxes store a large set of videos and collaborate to serve simultaneously n videos or less between them. It is said to be scalable when Ω(n) videos can be distributively ...
A new mechanism to deal with process variability in NoC links
Carles Hernandez, Federico Silla, Vicente Santonja, Jose Duato
Pages: 1-11
doi>10.1109/IPDPS.2009.5161048
Full text available: Publisher Site

Associated with the ever growing integration scale of VLSI technologies is the increase in process variability, which makes silicon devices less predictable. In the context of network-on-chip (NoC), this variability affects the maximum frequency ...
Multi-dimensional characterization of temporal data mining on graphics processors
Jeremy Archuleta, Yong Cao, Tom Scogland, Wu-chun Feng
Pages: 1-12
doi>10.1109/IPDPS.2009.5161049
Full text available: Publisher Site

Through the algorithmic design patterns of data parallelism and task parallelism, the graphics processing unit (GPU) offers the potential to vastly accelerate discovery and innovation across a multitude of disciplines. For example, the exponential growth ...
Crash fault detection in celerating environments
Srikanth Sastry, Scott M. Pike, Jennifer L. Welch
Pages: 1-12
doi>10.1109/IPDPS.2009.5161050
Full text available: Publisher Site

Failure detectors are a service that provides (approximate) information about process crashes in a distributed system. The well-known “eventually perfect” failure detector, ◊P, has been implemented in partially synchronous systems with ...
Parallel implementation of Irregular Terrain Model on IBM Cell Broadband Engine
Yang Song, Jeffrey A. Rudin, Ali Akoglu
Pages: 1-7
doi>10.1109/IPDPS.2009.5161051
Full text available: Publisher Site

Prediction of radio coverage, also known as radio “hear-ability” requires the prediction of radio propagation loss. The Irregular Terrain Model (ITM) predicts the median attenuation of a radio signal as a function of distance and the variability ...
Adaptable, metadata rich IO methods for portable high performance IO
Jay Lofstead, Fang Zheng, Scott Klasky, Karsten Schwan
Pages: 1-10
doi>10.1109/IPDPS.2009.5161052
Full text available: Publisher Site

Since IO performance on HPC machines strongly depends on machine characteristics and configuration, it is important to carefully tune IO libraries and make good use of appropriate library APIs. For instance, on current petascale machines, independent ...
Optimal deterministic self-stabilizing vertex coloring in unidirectional anonymous networks
Samuel Bernard, Stephane Devismes, Maria Gradinariu Potop-Butucaru, Sebastien Tixeuil
Pages: 1-8
doi>10.1109/IPDPS.2009.5161053
Full text available: Publisher Site

A distributed algorithm is self-stabilizing if after faults and attacks hit the system and place it in some arbitrary global state, the system recovers from this catastrophic situation without external intervention in finite time. Uni-directional networks ...
A scalable auto-tuning framework for compiler optimization
Ananta Tiwari, Chun Chen, Jacqueline Chame, Mary Hall, Jeffrey K. Hollingsworth
Pages: 1-12
doi>10.1109/IPDPS.2009.5161054
Full text available: Publisher Site

We describe a scalable and general-purpose framework for auto-tuning compiler-generated code. We combine Active Harmony's parallel search backend with the CHiLL compiler transformation framework to generate in parallel a set of alternative implementations ...
Understanding the design trade-offs among current multicore systems for numerical computations
Seunghwa Kang, David A. Bader, Richard Vuduc
Pages: 1-12
doi>10.1109/IPDPS.2009.5161055
Full text available: Publisher Site

In this paper, we empirically evaluate fundamental design trade-offs among the most recent multicore processors and accelerator technologies. Our primary aim is to aid application designers in better mapping their software to the most suitable architecture, ...
TupleQ: Fully-asynchronous and zero-copy MPI over InfiniBand
Matthew J. Koop, Jaidev K. Sridhar, Dhabaleswar K. Panda
Pages: 1-8
doi>10.1109/IPDPS.2009.5161056
Full text available: Publisher Site

The Message Passing Interface (MPI) is the de facto standard for parallel programming. As system scales increase, application writers often try to increase the overlap of communication and computation. Unfortunately, even on offloaded hardware such as ...
Performance projection of HPC applications using SPEC CFP2006 benchmarks
Sameh Sharkawi, Don DeSota, Raj Panda, Rajeev Indukuru, Stephen Stevens, Valerie Taylor, Xingfu Wu
Pages: 1-12
doi>10.1109/IPDPS.2009.5161057
Full text available: Publisher Site

Performance projections of High Performance Computing (HPC) applications onto various hardware platforms are important for hardware vendors and HPC users. The projections aid hardware vendors in the design of future systems, enable them to compare the ...
Singular value decomposition on GPU using CUDA
Sheetal Lahabar, P. J. Narayanan
Pages: 1-10
doi>10.1109/IPDPS.2009.5161058
Full text available: Publisher Site

Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high performance co-processors due to their tremendous computing power. In this ...
Dynamic iterations for the solution of ordinary differential equations on multicore processors
Yanan Yu, Ashok Srinivasan
Pages: 1-10
doi>10.1109/IPDPS.2009.5161059
Full text available: Publisher Site

In the past few years, there has been a trend of providing increased computing power through greater number of cores on a chip, rather than through higher clock speeds. In order to exploit the available computing power, applications need to be parallelized ...
Compact graph representations and parallel connectivity algorithms for massive dynamic network analysis
Kamesh Madduri, David A. Bader
Pages: 1-11
doi>10.1109/IPDPS.2009.5161060
Full text available: Publisher Site

Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socio-economic interactions, social networking web sites, communication traffic, and scientific computing can be intuitively modeled as graphs. ...
The Weak Mutual Exclusion problem
Paolo Romano, Luis Rodrigues, Nuno Carvalho
Pages: 1-12
doi>10.1109/IPDPS.2009.5161061
Full text available: Publisher Site

In this paper we define the Weak Mutual Exclusion (WME) problem. Analogously to classical Distributed Mutual Exclusion (DME), WME serializes the accesses to a shared resource. Differently from DME, however, the WME abstraction regulates the access to ...
CellMR: A framework for supporting mapreduce on asymmetric cell-based clusters
M. Mustafa Rafique, Benjamin Rose, Ali R. Butt, Dimitrios S. Nikolopoulos
Pages: 1-12
doi>10.1109/IPDPS.2009.5161062
Full text available: Publisher Site

The use of asymmetric multi-core processors with on-chip computational accelerators is becoming common in a variety of environments ranging from scientific computing to enterprise applications. The focus of current research has been on making efficient ...
DMTCP: Transparent checkpointing for cluster computations and the desktop
Jason Ansel, Kapil Arya, Gene Cooperman
Pages: 1-12
doi>10.1109/IPDPS.2009.5161063
Full text available: Publisher Site

DMTCP (Distributed MultiThreaded CheckPointing) is a transparent user-level checkpointing package for distributed applications. Checkpointing and restart is demonstrated for a wide range of over 20 well known applications, including MATLAB, Python, TightVNC, ...
A partition-based approach to support streaming updates over persistent data in an active datawarehouse
Abhirup Chakraborty, Ajit Singh
Pages: 1-11
doi>10.1109/IPDPS.2009.5161064
Full text available: Publisher Site

Active warehousing has emerged in order to meet the high user demands for fresh and up-to-date information. Online refreshment of the source updates introduces processing and disk overheads in the implementation of the warehouse transformations. This ...
Message passing on data-parallel architectures
Jeff A. Stuart, John D. Owens
Pages: 1-12
doi>10.1109/IPDPS.2009.5161065
Full text available: Publisher Site

This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors. As a case study, we design and implement the “DCGN” API on NVIDIA GPUs that is similar to MPI and allows full ...
Sequence alignment with GPU: Performance and design challenges
Gregory M. Striemer, Ali Akoglu
Pages: 1-10
doi>10.1109/IPDPS.2009.5161066
Full text available: Publisher Site

In bioinformatics, alignments are commonly performed in genome and protein sequence analysis for gene identification and evolutionary similarities. There are several approaches for such analysis, each varying in accuracy and computational complexity. ...
Coupled placement in modern data centers
Madhukar Korupolu, Aameek Singh, Bhuvan Bamba
Pages: 1-12
doi>10.1109/IPDPS.2009.5161067
Full text available: Publisher Site

We introduce the coupled placement problem for modern data centers spanning placement of application computation and data among available server and storage resources. While the two have traditionally been addressed independently in data centers, two ...
Exploring the multiple-GPU design space
Dana Schaa, David Kaeli
Pages: 1-12
doi>10.1109/IPDPS.2009.5161068
Full text available: Publisher Site

Graphics Processing Units (GPUs) have been growing in popularity due to their impressive processing capabilities, and with general purpose programming languages such as NVIDIA's CUDA interface, are becoming the platform of choice in the scientific computing ...
An on/off link activation method for low-power ethernet in PC clusters
Michihiro Koibuchi, Tomohiro Otsuka, Hiroki Matsutani, Hideharu Amano
Pages: 1-11
doi>10.1109/IPDPS.2009.5161069
Full text available: Publisher Site

The power consumption of interconnects is increased as the link bandwidth is improved in PC clusters. In this paper, we propose an on/off link activation method that uses the static analysis of the traffic in order to reduce the power consumption of ...
Making resonance a common case: A high-performance implementation of collective I/O on parallel file systems
Xuechen Zhang, Song Jiang, Kei Davis
Pages: 1-12
doi>10.1109/IPDPS.2009.5161070
Full text available: Publisher Site

Collective I/O is a widely used technique to improve I/O performance in parallel computing. It can be implemented as a client-based or as a server-based scheme. The client-based implementation is more widely adopted in the MPIIO software such as ROMIO ...
Phaser accumulators: A new reduction construct for dynamic parallelism
J. Shirako, D. M. Peixotto, V. Sarkar, W. N. Scherer
Pages: 1-12
doi>10.1109/IPDPS.2009.5161071
Full text available: Publisher Site

A reduction is a computation in which a common operation, such as a sum, is to be performed across multiple pieces of data, each supplied by a separate task. We introduce phaser accumulators, a new reduction construct that meshes seamlessly with phasers ...
Transitive closure on the cell broadband engine: A study on self-scheduling in a multicore processor
Sudhir Vinjamuri, Viktor K. Prasanna
Pages: 1-11
doi>10.1109/IPDPS.2009.5161072
Full text available: Publisher Site

In this paper, we present a mapping methodology and optimizations for solving transitive closure on the Cell multicore processor. Using our approach, it is possible to achieve near peak performance for transitive closure on the Cell processor. We first ...
Evaluating the use of GPUs in liver image segmentation and HMMER database searches
John Paul Walters, Vidyananth Balu, Suryaprakash Kompalli, Vipin Chaudhary
Pages: 1-12
doi>10.1109/IPDPS.2009.5161073
Full text available: Publisher Site

In this paper we present the results of parallelizing two life sciences applications, Markov random fields-based (MRF) liver segmentation and HMMER's Viterbi algorithm, using GPUs. We relate our experiences in porting both applications to the GPU as ...
Improving MPI-HMMER's scalability with parallel I/O
John Paul Walters, Rohan Darole, Vipin Chaudhary
Pages: 1-11
doi>10.1109/IPDPS.2009.5161074
Full text available: Publisher Site

We present PIO-HMMER, an enhanced version of MPI-HMMER. PIO-HMMER improves on MPI-HMMER's scalability through the use of parallel I/O and a parallel file system. In addition, we describe several enhancements, including a new load balancing scheme, enhanced ...
Parallel short sequence mapping for high throughput genome sequencing
Doruk Bozdag, Catalin C. Barbacioru, Umit V. Catalyurek
Pages: 1-10
doi>10.1109/IPDPS.2009.5161075
Full text available: Publisher Site

With the advent of next-generation high throughput sequencing instruments, large volumes of short sequence data are generated at an unprecedented rate. Processing and analyzing these massive data requires overcoming several challenges including mapping ...
Scaling communication-intensive applications on BlueGene/P using one-sided communication and overlap
Rajesh Nishtala, Paul H. Hargrove, Dan O. Bonachea, Katherine A. Yelick
Pages: 1-12
doi>10.1109/IPDPS.2009.5161076
Full text available: Publisher Site

In earlier work, we showed that the one-sided communication model found in PGAS languages (such as UPC) offers significant advantages in communication efficiency by decoupling data transfer from processor synchronization. We explore the use of the PGAS ...
Scheduling resizable parallel applications
Rajesh Sudarsan, Calvin J. Ribbens
Pages: 1-10
doi>10.1109/IPDPS.2009.5161077
Full text available: Publisher Site

Most conventional parallel job schedulers only support static scheduling thereby restricting schedulers from being able to modify the number of processors allocated to parallel applications at runtime. The drawbacks of static scheduling can be overcome ...
Architectural implications for spatial object association algorithms
Vijay S. Kumar, Tahsin Kurc, Joel Saltz, Ghaleb Abdulla, Scott R. Kohn, Celeste Matarazzo
Pages: 1-12
doi>10.1109/IPDPS.2009.5161078
Full text available: Publisher Site

Spatial object association, also referred to as crossmatch of spatial datasets, is the problem of identifying and comparing objects in two or more datasets based on their positions in a common spatial coordinate system. In this work, we evaluate two ...
Work-first and help-first scheduling policies for async-finish task parallelism
Yi Guo, Rajkishore Barik, Raghavan Raman, Vivek Sarkar
Pages: 1-12
doi>10.1109/IPDPS.2009.5161079
Full text available: Publisher Site

Multiple programming models are emerging to address an increased need for dynamic task parallelism in applications for multicore processors and shared-address-space parallel computing. Examples include OpenMP 3.0, Java Concurrency Utilities, Microsoft ...
Map construction and exploration by mobile agents scattered in a dangerous network
Paola Flocchini, Matthew Kellett, Peter Mason, Nicola Santoro
Pages: 1-10
doi>10.1109/IPDPS.2009.5161080
Full text available: Publisher Site

We consider the map construction problem in a simple, connected graph by a set of mobile computation entities or agents that start from scattered locations throughout the graph. The problem is further complicated by dangerous elements, nodes and links, ...
Disjoint-path routing: Efficient communication for streaming applications
DaeHo Seo, Mithuna Thottethodi
Pages: 1-12
doi>10.1109/IPDPS.2009.5161081
Full text available: Publisher Site

Streaming is emerging as an important programming model for multicores. Streaming provides an elegant way to express task decomposition and inter-task communication, while hiding laborious orchestration details such as load balancing, assignment (of ...
Providing security for MOCCA component environment
Michal Dyrda, Maciej Malawski, Marian Bubak, Syed Naqvi
Pages: 1-7
doi>10.1109/IPDPS.2009.5161082
Full text available: Publisher Site

The subject of this paper is a detailed analysis and development of security in MOCCA, a CCA-compliant Grid component framework built over H2O, a Java-based distributed computing platform. The approach is to extend H2O with an authentication mechanism ...
Towards efficient shared memory communications in MPJ express
Aamir Shafi, Jawad Manzoor
Pages: 1-7
doi>10.1109/IPDPS.2009.5161083
Full text available: Publisher Site

The need to increase performance while conserving energy led to the emergence of multi-core processors. These processors provide a feasible option to improve performance of software applications by increasing the number of cores, instead of relying ...
TM-Stream: An STM framework for distributed event stream processing
Heiko Sturzrehm, Pascal Felber, Christof Fetzer
Pages: 1-8
doi>10.1109/IPDPS.2009.5161084
Full text available: Publisher Site

We extend DSTM2 with a combination of two techniques: First, we applied speculative dependencies between transactions, as first introduced in [1]. Specifically, transactions may read data of earlier transactions that have completed their execution, but ...
Is shared memory programming attainable on clusters of embedded processors?
Konstantinos I. Karantasis, Eleftherios D. Polychronopoulos
Pages: 1-7
doi>10.1109/IPDPS.2009.5161085
Full text available: Publisher Site

The wide increase of total processing cores in commodity processors tends to lighten the need for computer performance by the classical scientific problems as well as by the modern multimedia and every day embedded applications. Nevertheless, the introduction ...
High performance computing using ProActive environment and the asynchronous iteration model
Raphael Couturier, David Laiymani, Sebastien Miquee
Pages: 1-7
doi>10.1109/IPDPS.2009.5161086
Full text available: Publisher Site

This paper presents a new library for the ProActive environment, called AIL-PA (Asynchronous Iterative Library for ProActive). This new library allows programs for solving large-scale problems to be executed on various architectures. Two models of algorithm ...
Workshop on Java and components for parallelism, distribution and concurrency - JAVAPDC
Page: 1
doi>10.1109/IPDPS.2009.5161087
Full text available: Publisher Site
Workshop on job scheduling strategies for parallel processing - JSSPP
Page: 1
doi>10.1109/IPDPS.2009.5161088
Full text available: Publisher Site
The world's fastest CPU and SMP node: Some performance results from the NEC SX-9
Thomas Zeiser, Georg Hager, Gerhard Wellein
Pages: 1-8
doi>10.1109/IPDPS.2009.5161089
Full text available: Publisher Site

Classic vector systems have all but vanished from recent TOP500 lists. Looking at the newly introduced NEC SX-9 series, we benchmark its memory subsystem using the low level vector triad and employ an advanced lattice Boltzmann flow solver kernel to ...
GPU acceleration of Zernike moments for large-scale images
Manuel Ujaldon
Pages: 1-8
doi>10.1109/IPDPS.2009.5161090
Full text available: Publisher Site

Zernike moments are transcendental digital image descriptors used in many application areas like biomedical image processing and computer vision due to their good properties of orthogonality and rotation invariance. However, their computation is too expensive ...
Harnessing the power of idle GPUs for acceleration of biological sequence alignment
Fumihiko Ino, Yuki Kotani, Kenichi Hagihara
Pages: 1-8
doi>10.1109/IPDPS.2009.5161091
Full text available: Publisher Site

This paper presents a parallel system capable of accelerating biological sequence alignment on the graphics processing unit (GPU) grid. The GPU grid in this paper is a desktop grid system that utilizes idle GPUs and CPUs in the office and home. Our parallel ...
Application profiling on Cell-based clusters
Hikmet Dursun, Kevin J. Barker, Darren J. Kerbyson, Scott Pakin
Pages: 1-8
doi>10.1109/IPDPS.2009.5161092
Full text available: Publisher Site

In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we examine Cell-centric MPI programs on hybrid clusters containing ...
Non-uniform fat-meshes for chip multiprocessors
Yu Zhang, Alex K. Jones
Pages: 1-8
doi>10.1109/IPDPS.2009.5161093
Full text available: Publisher Site

This paper studies the traffic hot spots of mesh networks in the context of chip multiprocessors. To mitigate these effects, this paper describes a non-uniform fat-mesh extension to mesh networks, which are popular for chip multiprocessors. The fat-mesh ...
An evaluative study on the effect of contention on message latencies in large supercomputers
Abhinav Bhatele, Laxmikant V. Kale
Pages: 1-8
doi>10.1109/IPDPS.2009.5161094
Full text available: Publisher Site

Significant theoretical research was done on interconnect topologies and topology aware mapping for parallel computers in the 80s. With the deployment of virtual cut-through, wormhole routing and faster interconnects, message latencies reduced and research ...
The impact of network noise at large-scale communication performance
Torsten Hoefler, Timo Schneider, Andrew Lumsdaine
Pages: 1-8
doi>10.1109/IPDPS.2009.5161095
Full text available: Publisher Site

The impact of operating system noise on the performance of large-scale applications is a growing concern and ameliorating the effects of OS noise is a subject of active research. A related problem is that of network noise, which arises from shared use ...
Large scale experiment and optimization of a distributed stochastic control algorithm. Application to energy management problems
Pascal Vezolle, Stephane Vialle, Xavier Warin
Pages: 1-8
doi>10.1109/IPDPS.2009.5161096
Full text available: Publisher Site

Asset management for the electricity industry leads to very large stochastic optimization problems. We explain in this article how to efficiently distribute the Bellman algorithm used, re-distributing data and computations at each time step, and we examine ...
Performance analysis and projections for Petascale applications on Cray XT series systems
Sadaf R. Alam, Richard F. Barrett, Jeffery A. Kuehn, Steve W. Poole
Pages: 1-8
doi>10.1109/IPDPS.2009.5161097
Full text available: Publisher Site

The Petascale Cray XT5 system at the Oak Ridge National Laboratory (ORNL) Leadership Computing Facility (LCF) shares a number of system and software features with its predecessor, the Cray XT4 system including the quad-core AMD processor and a multi-core ...
Performance modeling in action: Performance prediction of a Cray XT4 system during upgrade
Kevin J. Barker, Kei Davis, Darren J. Kerbyson
Pages: 1-8
doi>10.1109/IPDPS.2009.5161098
Full text available: Publisher Site

We present predictive performance models of two of the petascale applications, S3D and GTC, from the DOE Office of Science workload. We outline the development of these models and demonstrate their validation on an Opteron/Infiniband cluster and the ...
Workshop on Large-Scale Parallel Processing - LSPP
Page: 1
doi>10.1109/IPDPS.2009.5161099
Full text available: Publisher Site
A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets
Kamesh Madduri, David Ediger, Karl Jiang, David A. Bader, Daniel Chavarria-Miranda
Pages: 1-8
doi>10.1109/IPDPS.2009.5161100
Full text available: Publisher Site

We present a new lock-free parallel algorithm for computing betweenness centrality of massive complex networks that achieves better spatial locality compared with previous approaches. Betweenness centrality is a key kernel in analyzing the importance ...
Enabling high-performance memory migration for multithreaded applications on LINUX
Brice Goglin, Nathalie Furmento
Pages: 1-9
doi>10.1109/IPDPS.2009.5161101
Full text available: Publisher Site

As the number of cores per machine increases, memory architectures are being redesigned to avoid bus contention and sustain higher throughput needs. The emergence of Non-Uniform Memory Access (NUMA) constraints has caused affinities between threads and ...
Implementing a portable Multi-threaded Graph Library: The MTGL on Qthreads
Brian W. Barrett, Jonathan W. Berry, Richard C. Murphy, Kyle B. Wheeler
Pages: 1-8
doi>10.1109/IPDPS.2009.5161102
Full text available: Publisher Site

Developing multi-threaded graph algorithms, even when using the MTGL infrastructure, provides a number of challenges, including discovering appropriate levels of parallelism, preventing memory hot spotting, and eliminating accidental synchronization. ...
Early experiences on accelerating Dijkstra's algorithm using transactional memory
Nikos Anastopoulos, Konstantinos Nikas, Georgios Goumas, Nectarios Koziris
Pages: 1-8
doi>10.1109/IPDPS.2009.5161103
Full text available: Publisher Site

In this paper we use Dijkstra's algorithm as a challenging, hard to parallelize paradigm to test the efficacy of several parallelization techniques in a multicore architecture. We consider the application of Transactional Memory (TM) as a means of concurrent ...
Multi-threaded library for many-core systems
Allan Porterfield, Nassib Nassar, Rob Fowler
Pages: 1-8
doi>10.1109/IPDPS.2009.5161104
Full text available: Publisher Site

MAESTRO is a prototype runtime designed to provide simple, very light threads and synchronization between those threads on modern commodity (x86) hardware. The MAESTRO threading library is designed to be a target for a high-level language compiler or ...
A super-efficient adaptable bit-reversal algorithm for multithreaded architectures
Anne C. Elster, Jan C. Meyer
Pages: 1-8
doi>10.1109/IPDPS.2009.5161105
Full text available: Publisher Site

Fast bit-reversal algorithms have been of strong interest for many decades, especially after Cooley and Tukey introduced their FFT implementation in 1965. Many recent algorithms, including FFTW, try to avoid the bit-reversal altogether by doing in-place ...
Linear optimization on modern GPUs
Daniele G. Spampinato, Anne C. Elster
Pages: 1-8
doi>10.1109/IPDPS.2009.5161106
Full text available: Publisher Site

Optimization algorithms are becoming increasingly more important in many areas, such as finance and engineering. Typically, real problems involve several hundreds of variables, and are subject to as many constraints. Several methods have been developed ...
Implementing OpenMP on a high performance embedded multicore MPSoC
Barbara Chapman, Lei Huang, Eric Biscondi, Eric Stotzer, Ashish Shrivastava, Alan Gatherer
Pages: 1-8
doi>10.1109/IPDPS.2009.5161107
Full text available: Publisher Site

In this paper we discuss our initial experiences adapting OpenMP to enable it to serve as a programming model for high performance embedded systems. A high-level programming model such as OpenMP has the potential to increase programmer productivity, ...
Early experiences with large-scale Cray XMT systems
David Mizell, Kristyn Maschhoff
Pages: 1-9
doi>10.1109/IPDPS.2009.5161108
Full text available: Publisher Site

Several 64-processor XMT systems have now been shipped to customers and there have been 128-processor, 256-processor and 512-processor systems tested in Cray's development lab. We describe some techniques we have used for tuning performance in hopes ...
Implementing and evaluating multithreaded triad census algorithms on the Cray XMT
George Chin, Andres Marquez, Sutanay Choudhury, Kristyn Maschhoff
Pages: 1-9
doi>10.1109/IPDPS.2009.5161109
Full text available: Publisher Site

Commonly represented as directed graphs, social networks depict relationships and behaviors among social entities such as people, groups, and organizations. Social network analysis denotes a class of mathematical and statistical methods designed to study ...
Accelerating numerical calculation on the Cray XMT
Chad Scherrer, Tim Shippert, Andres Marquez
Pages: 1-7
doi>10.1109/IPDPS.2009.5161110
Full text available: Publisher Site

The Cray XMT provides hardware support for parallel algorithms that would be communication- or memory-bound on other machines. Unfortunately, even if an algorithm meets these criteria, performance suffers if the algorithm is too numerically intensive. ...
Exploiting DMA to enable non-blocking execution in Decoupled Threaded Architecture
Roberto Giorgi, Zdravko Popovic, Nikola Puzovic
Pages: 1-8
doi>10.1109/IPDPS.2009.5161111
Full text available: Publisher Site

DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on existing simple cores (in-order pipelines, no branch predictors, no ROBs).
Workshop on Multi-Threaded Architectures and Applications - MTAAP
Pages: 1-2
doi>10.1109/IPDPS.2009.5161112
Full text available: Publisher Site
Exact pairwise alignment of megabase genome biological sequences using a novel z-align parallel strategy
Azzedine Boukerche, Rodolfo Bezerra Batista, Alba Cristina Magalhaes Alves de Melo
Pages: 1-8
doi>10.1109/IPDPS.2009.5161113
Full text available: Publisher Site

Pairwise Sequence Alignment is a basic operation in Bioinformatics that is performed thousands of times on a daily basis. The exact methods proposed in the literature have quadratic time complexity. For this reason, heuristic methods such as BLAST are ...
Solving multiprocessor scheduling problem with GEO metaheuristic
Piotr Switalski, Franciszek Seredynski
Pages: 1-8
doi>10.1109/IPDPS.2009.5161114
Full text available: Publisher Site

We propose a solution of the multiprocessor scheduling problem based on applying a relatively new metaheuristic called Generalized Extremal Optimization (GEO). GEO is inspired by a simple coevolutionary model known as Bak-Sneppen model. The model assumes ...
Using XMPP for ad-hoc grid computing - an application example using parallel ant colony optimisation
Gerhard Weis, Andrew Lewis
Pages: 1-4
doi>10.1109/IPDPS.2009.5161115
Full text available: Publisher Site

XMPP (Extensible Messaging and Presence Protocol), also known as Jabber, is a popular instant messaging protocol that uses XML streams for communication. Due to its high extensibility, XMPP is very easy to adapt to other uses than instant messaging. Furthermore, ...
Hybridization of Genetic and Quantum Algorithm for gene selection and classification of Microarray data
Allani Abderrahim, El-Ghazali Talbi, Mellouli Khaled
Pages: 1-8
doi>10.1109/IPDPS.2009.5161116
Full text available: Publisher Site

In this work, we hybridize the Genetic Quantum Algorithm with the Support Vector Machines classifier for gene selection and classification of high dimensional Microarray Data. We named our algorithm GQASVM. Its purpose is to identify a small subset of ...
Fine grained population diversity analysis for parallel genetic programming
Stephan M. Winkler, Michael Affenzeller, Stefan Wagner
Pages: 1-8
doi>10.1109/IPDPS.2009.5161117
Full text available: Publisher Site

In this paper we describe a formalism for estimating the structural similarity of formulas that are evolved by parallel genetic programming (GP) based identification processes. This similarity measurement can be used for measuring the genetic diversity ...
New sequential and parallel algorithm for Dynamic Resource Constrained Project Scheduling Problem
Andre Renato Villela da Silva, Luiz Satoru Ochi
Pages: 1-7
doi>10.1109/IPDPS.2009.5161118
Full text available: Publisher Site

This paper proposes a new Evolutionary Algorithm for the Dynamic Resource Constrained Project Scheduling Problem. This algorithm has new features that get around some problems like premature convergence and other ones. The indirect representation approach ...
Interweaving heterogeneous metaheuristics using harmony search
Young Choon Lee, Albert Y. Zomaya
Pages: 1-8
doi>10.1109/IPDPS.2009.5161119
Full text available: Publisher Site

In this paper, we present a novel parallel-metaheuristic framework, which enables a set of heterogeneous metaheuristics to be effectively interwoven and coordinated. The key player of this framework is a harmony-search-based coordinator devised using ...
Adaptative clustering Particle Swarm Optimization
Salomao S. Madeiro, Carmelo J. A. Bastos-Filho, Fernando B. Lima Neto, Elliackin M. N. Figueiredo
Pages: 1-8
doi>10.1109/IPDPS.2009.5161120
Full text available: Publisher Site

The performance of Particle Swarm Optimization (PSO) algorithms depends strongly upon the interaction among the particles. The existing communication topologies for PSO (e.g. star, ring, wheel, pyramid, von Neumann, clan, four clusters) can be viewed ...
Metaheuristic traceability attack against SLMAP, an RFID lightweight authentication protocol
Julio C. Hernandez-Castro, Juan E. Tapiador, Pedro Peris-Lopez, John A. Clark, El-Ghazali Talbi
Pages: 1-5
doi>10.1109/IPDPS.2009.5161121
Full text available: Publisher Site

We present a metaheuristic-based attack against the traceability of an ultra-lightweight authentication protocol for RFID environments called SLMAP, and analyse its implications. The main interest of our approach is that it is a complete black-box technique ...
Parallel Nested Monte-Carlo search
Tristan Cazenave, Nicolas Jouandeau
Pages: 1-6
doi>10.1109/IPDPS.2009.5161122
Full text available: Publisher Site

We address the parallelization of a Monte-Carlo search algorithm. On a cluster of 64 cores we obtain a speedup of 56 for the parallelization of Morpion Solitaire. An algorithm that behaves better than a naive one on heterogeneous clusters is also detailed.
Combining genetic algorithm with time-shuffling in order to evolve agent systems more efficiently
Patrick Ediger, Rolf Hoffmann
Pages: 1-8
doi>10.1109/IPDPS.2009.5161123
Full text available: Publisher Site

We have optimized a multi-agent system for all-to-all communication modeled in cellular automata. The agents' task is to solve the problem by communicating their initially mutually exclusive distributed information to all the other agents. We used a ...
Multi-thread integrative cooperative optimization for rich combinatorial problems
Teodor Gabriel Crainic, Gloria Cerasela Crisan, Michel Gendreau, Nadia Lahrichi, Walter Rei
Pages: 1-8
doi>10.1109/IPDPS.2009.5161124
Full text available: Publisher Site

Addressing multi-attribute, “rich” combinatorial optimization problems in a comprehensive manner presents significant methodological and computational challenges. In this paper, we present an integrative multi-thread cooperative optimization ...
The effect of population density on the performance of a spatial social network algorithm for multi-objective optimisation
Andrew Lewis
Pages: 1-6
doi>10.1109/IPDPS.2009.5161125
Full text available: Publisher Site

Particle Swarm Optimisation (PSO) is increasingly being applied to optimisation of multi-objective problems in engineering design and scientific investigation. This paper investigates the behaviour of a novel algorithm based on an extension of the concepts ...
A parallel hybrid genetic algorithm-simulated annealing for solving Q3AP on computational grid
Lakhdar Loukil, Malika Mehdi, Nouredine Melab, El-Ghazali Talbi, Pascal Bouvry
Pages: 1-8
doi>10.1109/IPDPS.2009.5161126
Full text available: Publisher Site

In this paper we propose a parallel hybrid genetic method for solving Quadratic 3-dimensional Assignment Problem (Q3AP). This problem is proved to be computationally NP-hard. The parallelism in our algorithm is of two hierarchical levels. The first level ...
Solving the industrial car sequencing problem in a Pareto sense
Arnaud Zinflou, Caroline Gagne, Marc Gravel
Pages: 1-8
doi>10.1109/IPDPS.2009.5161127
Full text available: Publisher Site

Until now, the industrial car sequencing problem, as defined during the ROADEF 2005 Challenge, has been tackled by organizing objectives in a hierarchy. In this paper, we suggest tackling this problem in a Pareto sense for the first time. We thus suggest ...
A multi-objective strategy for concurrent mapping and routing in networks on chip
Rafael Tornero, Valentino Sterrantino, Maurizio Palesi, Juan M. Orduna
Pages: 1-8
doi>10.1109/IPDPS.2009.5161128
Full text available: Publisher Site

The design flow of networks-on-chip (NoCs) includes several key issues. Among other parameters, the decision of where cores have to be topologically mapped and also the routing algorithm represent two highly correlated design problems that must be carefully ...
Evolutionary game theoretical analysis of reputation-based packet forwarding in civilian mobile Ad Hoc networks
Marcin Seredynski, Pascal Bouvry
Pages: 1-8
doi>10.1109/IPDPS.2009.5161129
Full text available: Publisher Site

A mobile wireless ad hoc network (MANET) consists of a number of devices that form a temporary network operating without support of a fixed infrastructure. The correct operation of such a network requires its users to cooperate on the level of packet ...
Workshop on nature inspired distributed computing - NIDISC
Page: 1
doi>10.1109/IPDPS.2009.5161130
Full text available: Publisher Site
An analysis of resource costs in a public computing grid
John A. Chandy
Pages: 1-8
doi>10.1109/IPDPS.2009.5161131
Full text available: Publisher Site

Public resource computing depends on the availability of computing resources that have been contributed by individuals. The amount of resources can be increased by incentivizing resource providers through payment for resources. However, there are costs ...
PyMW - A Python module for desktop grid and volunteer computing
Eric M. Heien, Yusuke Takata, Kenichi Hagihara, Adam Kornafeld
Pages: 1-7
doi>10.1109/IPDPS.2009.5161132
Full text available: Publisher Site

We describe a general purpose master-worker parallel computation Python module called PyMW. PyMW is intended to support rapid development, testing and deployment of large scale master-worker style computations on a desktop grid or volunteer computing ...
MGST: A framework for performance evaluation of Desktop Grids
Majd Kokaly, Issam Al-Azzoni, Douglas G. Down
Pages: 1-8
doi>10.1109/IPDPS.2009.5161133
Full text available: Publisher Site

Desktop Grids are rapidly gaining popularity as a cost-effective computing platform for the execution of applications with extensive computing needs. As opposed to grids and clusters, these systems are characterized by having a non-dedicated infrastructure. ...
Evaluating the performance and intrusiveness of virtual machines for desktop grid computing
Patricio Domingues, Filipe Araujo, Luis Silva
Pages: 1-8
doi>10.1109/IPDPS.2009.5161134
Full text available: Publisher Site

We experimentally evaluate the performance overhead of the virtual environments VMware Player, QEMU, VirtualPC and VirtualBox on a dual-core machine. Firstly, we assess the performance of a Linux guest OS running on a virtual machine by separately benchmarking ...
EmBOINC: An emulator for performance analysis of BOINC projects
Trilce Estrada, Michela Taufer, Kevin Reed, David P. Anderson
Pages: 1-8
doi>10.1109/IPDPS.2009.5161135
Full text available: Publisher Site

BOINC is a platform for volunteer computing. The server component of BOINC embodies a number of scheduling policies and parameters that have a large impact on the projects' throughput and other performance metrics. We have developed a system, EmBOINC, ...
GenWrapper: A generic wrapper for running legacy applications on desktop grids
Attila Csaba Marosi, Zoltan Balaton, Peter Kacsuk
Pages: 1-6
doi>10.1109/IPDPS.2009.5161136
Full text available: Publisher Site

Desktop Grids represent an alternative trend in Grid computing using the same software infrastructure as Volunteer Computing projects, such as BOINC. Applications to be deployed on a BOINC infrastructure need special preparations. However, there are ...
Towards a formal model of volunteer computing systems
Yu Wang, Haiwu He, Zhijian Wang
Pages: 1-5
doi>10.1109/IPDPS.2009.5161137
Full text available: Publisher Site

Volunteer Computing is a form of distributed computing in which the general public offers processing power and storage to scientific research projects. A large variety of Volunteer Computing Systems (VCS) have been proposed in the literature which use ...
Monitoring the EDGeS project infrastructure
Filipe Araujo, David Santiago, Diogo Ferreira, Jorge Farinha, Patricio Domingues, Luis Moura Silva, Etienne Urbah, Oleg Lodygensky, Haiwu He, Attila Csaba Marosi, Gabor Gombas, Zoltan Balaton, Zoltan Farkas, Peter Kacsuk
Pages: 1-8
doi>10.1109/IPDPS.2009.5161138
Full text available: Publisher Site

EDGeS is a European-funded Framework Program 7 project that aims to connect desktop and service grids together. While in a desktop grid, personal computers pull jobs when they are idle, in service grids there is a scheduler that pushes jobs to available ...
Thalweg: A framework for programming 1,000 machines with 1,000 cores
Adam L. Beberg, Vijay S. Pande
Pages: 1-7
doi>10.1109/IPDPS.2009.5161139
Full text available: Publisher Site

While modern large-scale computing tasks have grown to span many machines, each with many cores, traditional programming models have not kept up with these advancements, resulting in difficulty exploiting these computing resources with only modest programmer ...
BonjourGrid: Orchestration of multi-instances of grid middlewares on institutional Desktop Grids
Heithem Abbes, Christophe Cerin, Mohamed Jemni
Pages: 1-8
doi>10.1109/IPDPS.2009.5161140
Full text available: Publisher Site

While the rapidly increasing number of users and applications running on Desktop Grid (DG) systems does demonstrate its inherent potential, current DG implementations follow the traditional master-worker paradigm and DG middlewares do not cooperate. To ...
Workshop on Large-Scale, Volatile Desktop Grids - PCGRID
Pages: 1-2
doi>10.1109/IPDPS.2009.5161141
Full text available: Publisher Site
Pricing American options with the SABR model
Michel Vellekoop, Geeske Vlaming
Pages: 1-6
doi>10.1109/IPDPS.2009.5161142
Full text available: Publisher Site

We introduce a simple and flexible method to price derivative securities on assets with volatilities which are stochastic. As a special case we treat the SABR model in more detail. Our approach is based on the construction of recombining trees using ...
High dimensional pricing of exotic European contracts on a GPU Cluster, and comparison to a CPU cluster
Lokman A. Abbas-Turki, Stephane Vialle, Bernard Lapeyre, Patrick Mercier
Pages: 1-8
doi>10.1109/IPDPS.2009.5161143
Full text available: Publisher Site

The aim of this paper is the efficient use of CPU and GPU clusters for a general path-dependent exotic European pricing, and their comparison in terms of speed and energy consumption. To reach our goal, we propose a parallel random number generator which ...
Using Premia and Nsp for constructing a risk management benchmark for testing parallel architecture
Jean-Philippe Chancelier, Bernard Lapeyre, Jerome Lelong
Pages: 1-6
doi>10.1109/IPDPS.2009.5161144
Full text available: Publisher Site

Financial institutions have massive computations to carry out overnight which are very demanding in terms of the consumed CPU. The challenge is to price many different products on a cluster-like architecture. We have used the Premia software to valuate ...
Towards the balancing real-time computational model: Example of pricing and risk management of exotic derivatives
Grzegorz Gawron
Pages: 1-6
doi>10.1109/IPDPS.2009.5161145
Full text available: Publisher Site

Instant pricing and risk calculation of exotic financial derivative instruments is essential in the process of risk management and trading performed by financial institutions. Due to the lack of analytical solutions for pricing of such instruments, systems ...
Advanced risk analytics on the cell broadband engine
Ciprian Docan, Manish Parashar, Christopher Marty
Pages: 1-8
doi>10.1109/IPDPS.2009.5161146
Full text available: Publisher Site

This paper explores the effectiveness of using the CBE platform for Value-at-Risk (VaR) calculations. Specifically, it focuses on the design, optimization and evaluation of pricing European and American stock options across Monte-Carlo VaR scenarios. ...
A high performance pair trading application
Jieren Wang, Camilo Rostoker, Alan Wagner
Pages: 1-8
doi>10.1109/IPDPS.2009.5161147
Full text available: Publisher Site

This paper describes a high-frequency pair trading strategy that exploits the power of MarketMiner, a high-performance analytics platform that enables a real-time, market-wide search for short-term correlation breakdowns across multiple markets and asset ...
Option pricing with COS method on graphics processing units
Bowen Zhang, Cornelis W. Oosterlee
Pages: 1-8
doi>10.1109/IPDPS.2009.5161148
Full text available: Publisher Site

In this paper, acceleration on the GPU for option pricing by the COS method is demonstrated. In particular, both European and Bermudan options will be discussed in detail. For Bermudan options, we consider both the Black-Scholes model and Lévy processes ...
Calculation of default probability (PD) solving Merton Model PDEs on sparse grids
Philipp Schroeder, Gabriel Wittum
Pages: 1-6
doi>10.1109/IPDPS.2009.5161149
Full text available: Publisher Site

Recent developments of the sub-prime crisis of 2008 have put a strong focus on the importance of credit default models. The Merton Model is one of these models, using partial differential equations to calculate the probability of default (PD) for a ...
An Aggregated Ant Colony Optimization approach for pricing options
Yeshwanth Udayshankar, Sameer Kumar, Girish K. Jha, Ruppa K. Thulasiram, Parimala Thulasiraman
Pages: 1-7
doi>10.1109/IPDPS.2009.5161150
Full text available: Publisher Site

Estimating the current cost of an option by predicting the underlying asset prices is the most common methodology for pricing options. Pricing options has been a challenging problem for a long time due to unpredictability in market which gives rise to ...
A novel application of option pricing to distributed resources management
David Allenotor, Ruppa Thulasiram, Parimala Thulasiraman
Pages: 1-8
doi>10.1109/IPDPS.2009.5161151
Full text available: Publisher Site

In this paper, we address a novel application of financial option pricing theory to the management of distributed computing resources. To achieve the set objective, first, we highlight the importance of finance models for the given problem and explain ...
Message from PDCoF-09 Workshop Chairs
Pages: 1-2
doi>10.1109/IPDPS.2009.5161152
Full text available: Publisher Site
Optimization techniques for concurrent STM-based implementations: A concurrent binary heap as a case study
Kristijan Dragicevic, Daniel Bauer
Pages: 1-8
doi>10.1109/IPDPS.2009.5161153
Full text available: Publisher Site

Much research has been done in the area of software transactional memory (STM) as a new programming paradigm to help ease the implementation of parallel applications. While most research has been invested for answering the question of how STM should ...
Optimizing the execution of a parallel meteorology simulation code
Sonia Jerez, Juan-Pedro Montavez, Domingo Gimenez
Pages: 1-6
doi>10.1109/IPDPS.2009.5161154
Full text available: Publisher Site

Climate simulations are computationally very time-consuming tasks, which are usually run on parallel systems. However, to reduce the time needed for the simulations, a set of parameters must be optimally selected. This paper presents a methodology to ...
NUMA-ICTM: A parallel version of ICTM exploiting memory placement strategies for NUMA machines
Marcio Castro, Luiz Gustavo Fernandes, Christiane Pousa, Jean-Francois Mehaut, Marilton Sanchotene de Aguiar
Pages: 1-8
doi>10.1109/IPDPS.2009.5161155
Full text available: Publisher Site

In geophysics, the appropriate subdivision of a region into segments is extremely important. ICTM (Interval Categorizer Tesselation Model) is an application that categorizes geographic regions using information extracted from satellite images. The categorization ...
Distributed randomized algorithms for low-support data mining
Alfredo Ferro, Rosalba Giugno, Misael Mongiovi, Alfredo Pulvirenti
Pages: 1-7
doi>10.1109/IPDPS.2009.5161156
Full text available: Publisher Site

Data mining in distributed systems has been facilitated by using high-support association rules. Less attention has been paid to distributed low-support/high-correlation data mining. This has proved useful in several fields such as computational biology, ...
Towards a framework for automated performance tuning
G. Cong, S. Seelam, I. Chung, H. Wen, D. Klepacki
Pages: 1-8
doi>10.1109/IPDPS.2009.5161157
Full text available: Publisher Site

As part of the DARPA sponsored High Productivity Computing Systems (HPCS) program, IBM is building petaflop supercomputers that will be fast, power-efficient, and easy to program. In addition to high performance, high productivity to the end user is ...
Parallel numerical asynchronous iterative algorithms: Large scale experimentations
Jean-Claude Charr, Raphael Couturier, David Laiymani
Pages: 1-8
doi>10.1109/IPDPS.2009.5161158
Full text available: Publisher Site

This paper presents many typical problems that are encountered when executing large scale scientific applications over distributed architectures. The causes and effects of these problems are explained and a solution for some classes of scientific applications ...
Exploring the effect of block shapes on the performance of sparse kernels
Vasileios Karakasis, Georgios Goumas, Nectarios Koziris
Pages: 1-8
doi>10.1109/IPDPS.2009.5161159
Full text available: Publisher Site

In this paper we explore the impact of the block shape on blocked and vectorized versions of the Sparse Matrix-Vector Multiplication (SpMV) kernel and build upon previous work by performing an extensive experimental evaluation of the most widespread ...
Coupled thermo-hydro-mechanical modelling: A new parallel approach
P. J. Vardon, I. Banicescu, P. J. Cleall, H. R. Thomas, R. N. Philp
Pages: 1-9
doi>10.1109/IPDPS.2009.5161160
Full text available: Publisher Site

A hybrid MPI/OpenMP method of parallelising a bi-conjugate gradient iterative solver for coupled thermo-hydro-mechanical finite-element simulations in unsaturated soil is implemented and found to be efficient on modern parallel computers. In particular, ...
Concurrent scheduling of parallel task graphs on multi-clusters using constrained resource allocations
Tchimou N'Takpe, Frederic Suter
Pages: 1-8
doi>10.1109/IPDPS.2009.5161161
Full text available: Publisher Site

Scheduling multiple applications on heterogeneous multi-clusters is challenging as the different applications have to compete for resources. A scheduler thus has to ensure a fair distribution of resources among the applications and prevent harmful selfish ...
Solving “large” dense matrix problems on multi-core processors
Mercedes Marques, Gregorio Quintana-Orti, Enrique S. Quintana-Orti, Robert A. van de Geijn
Pages: 1-8
doi>10.1109/IPDPS.2009.5161162
Full text available: Publisher Site

Few realize that for large matrices dense matrix computations achieve nearly the same performance when the matrices are stored on disk as when they are stored in a very large main memory. Similarly, few realize that, given the right programming abstractions, ...
Parallel solvers for dense linear systems for heterogeneous computational clusters
Ravi Reddy, Alexey Lastovetsky, Pedro Alonso
Pages: 1-8
doi>10.1109/IPDPS.2009.5161163
Full text available: Publisher Site

This paper describes the design and the implementation of parallel routines in the Heterogeneous ScaLAPACK library that solve a dense system of linear equations. This library is written on top of HeteroMPI and ScaLAPACK whose building blocks, the de ...
Concurrent adaptive computing in heterogeneous environments (CACHE)
John U. Duselis, Isaac D. Scherson
Pages: 1-8
doi>10.1109/IPDPS.2009.5161164
Full text available: Publisher Site

We introduce a computational framework for concurrent adaptive computing in heterogeneous environments for computationally intensive applications. This framework considers the presence of inter-connected computational resources which are discoverable ...
Toward adjoinable MPI
Jean Utke, Laurent Hascoet, Patrick Heimbach, Chris Hill, Paul Hovland, Uwe Naumann
Pages: 1-8
doi>10.1109/IPDPS.2009.5161165
Full text available: Publisher Site

Automatic differentiation is the primary means of obtaining analytic derivatives from a numerical model given as a computer program. Therefore, it is an essential productivity tool in numerous computational science and engineering domains. Computing ...
Parallelization and optimization of a CBVIR system on multi-core architectures
Qiankun Miao, Yurong Chen, Jianguo Li, Qi Zhang, Yimin Zhang, Guoliang Chen
Pages: 1-8
doi>10.1109/IPDPS.2009.5161166
Full text available: Publisher Site

Technical advances have made image capture and storage very convenient, which has resulted in an explosion in the amount of visual information. It becomes difficult to find useful information in this tremendous volume of data. Content-based Visual Information Retrieval ...
EHGRID: An emulator of heterogeneous computational grids
Basile Clout, Eric Aubanel
Pages: 1-8
doi>10.1109/IPDPS.2009.5161167
Full text available: Publisher Site

Heterogeneous distributed computing is found in a variety of fields including scientific computing, Internet and mobile devices. Computational grids focusing primarily on computationally-intensive operations have emerged as a new infrastructure for high ...
Optimizing assignment of threads to SPEs on the cell BE processor
C. D. Sudheer, T. Nagaraju, P. K. Baruah, Ashok Srinivasan
Pages: 1-8
doi>10.1109/IPDPS.2009.5161168
Full text available: Publisher Site

The Cell is a heterogeneous multicore processor that has attracted much attention in the HPC community. The bulk of the computational workload on the Cell processor is carried by eight co-processors called SPEs. The SPEs are connected to each other and ...
Guiding performance tuning for grid schedules
Jorg Keller, Wolfram Schiffmann
Pages: 1-6
doi>10.1109/IPDPS.2009.5161169
Full text available: Publisher Site

Grid jobs often consist of a large number of tasks. If the performance of a statically scheduled grid job is unsatisfactory, one must decide which code of which task should be improved. We propose a novel method to guide grid users as to which tasks ...
Design and analysis of an active predictive algorithm in wireless multicast networks
Naixue Xiong, Laurence T. Yang, Yi Pan, Athanasios V. Vasilakos, Jing He
Pages: 1-8
doi>10.1109/IPDPS.2009.5161170
Full text available: Publisher Site

With the recent growth in wireless multicast data applications, considerable efforts have focused on large-scale heterogeneous wireless multicast, especially those with large propagation delays, which means the feedback arriving at the source ...
Message from the PDSEC-09 workshop chairs
Beniamino Di Martino, Christoph W. Kessler, Yi Pan, Thomas Rauber, Gudula Runger, Laurence T. Yang
Pages: 1-2
doi>10.1109/IPDPS.2009.5161171
Full text available: Publisher Site

Welcome to the 10th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-09), held on 29 May 2009 in Rome, Italy, in conjunction with the 23rd IEEE Int. Parallel and Distributed Processing Symposium (IPDPS ...
Performance evaluation of gang scheduling in a two-cluster system with migrations
Zafeirios C. Papazachos, Helen D. Karatza
Pages: 1-8
doi>10.1109/IPDPS.2009.5161172
Full text available: Publisher Site

Gang scheduling is considered to be a highly effective task scheduling policy for distributed systems. In this paper we present a migration scheme which reduces the fragmentation in the schedule caused by gang scheduled jobs which cannot start. Furthermore, ...
Performance evaluation of a resource discovery scheme in a Grid environment prone to resource failures
Konstantinos I. Karaoglanoglou, Helen D. Karatza
Pages: 1-8
doi>10.1109/IPDPS.2009.5161173
Full text available: Publisher Site

This paper studies the problem of discovering the most suitable resource for a specific request in a Grid system. A Grid can be seen as an environment comprising routers and resources, where each router is in charge of its local resources. In our previous ...
A novel information model for efficient routing protocols in delay tolerant networks
Xiao Chen, Jian Shen, Jie Wu
Pages: 1-8
doi>10.1109/IPDPS.2009.5161174
Full text available: Publisher Site

Delay tolerant networks (DTNs) are wireless mobile networks that do not guarantee the existence of a path between a source and a destination at any time. When two nodes move within each other's transmission range during a period of time, they can contact ...
Accurate analytical performance model of communications in MPI applications
D. R. Martinez, J. C. Cabaleiro, T. F. Pena, F. F. Rivera, V. Blanco
Pages: 1-8
doi>10.1109/IPDPS.2009.5161175
Full text available: Publisher Site

This paper presents a new LogP-based model, called LoOgGP, which allows an accurate characterization of MPI applications based on microbenchmark measurements. This new model is an extension of LogP for long messages in which both overhead and gap parameters ...
Prolonging lifetime via mobility and load-balanced routing in Wireless Sensor Networks
Zuzhi Fan
Pages: 1-6
doi>10.1109/IPDPS.2009.5161176
Full text available: Publisher Site

One of the main challenges for a sensor network is conserving the available energy at each sensor node and thereby prolonging the network lifetime. Many energy efficient/conserving routing protocols have been proposed to address the issue; however, the “funnelling ...
A performance model of multicast communication in wormhole-routed networks on-chip
Mahmoud Moadeli, Wim Vanderbauwhede
Pages: 1-8
doi>10.1109/IPDPS.2009.5161177
Full text available: Publisher Site

Collective communication operations form a part of overall traffic in most applications running on platforms employing direct interconnection networks. This paper presents a novel analytical model to compute communication latency of multicast as a widely ...
Reduction of Quality (RoQ) attacks on structured peer-to-peer networks
Yanxiang He, Qiang Cao, Yi Han, Libing Wu, Tao Liu
Pages: 1-9
doi>10.1109/IPDPS.2009.5161178
Full text available: Publisher Site

In contrast to traditional brute-force attacks, RoQ (Reduction of Quality) attacks are periodic, stealthy, yet potent, which exploit the vulnerability of adaptation mechanisms to undermine certain services. As the application-level peer-to-peer (p2p) ...
New adaptive counter based broadcast using neighborhood information in MANETS
M. Bani Yassein, A. Al-Dubai, M. Ould Khaoua, Omar M. Al-jarrah
Pages: 1-7
doi>10.1109/IPDPS.2009.5161179
Full text available: Publisher Site

Broadcasting in MANETs is a fundamental data dissemination mechanism, with important applications, e.g., route query process in many routing protocols, address resolution and diffusing information to the whole network. Broadcasting in MANETs has traditionally ...
A Distributed filesystem framework for transparent accessing heterogeneous storage services
Yutong Lu, Huajian Mao, Jie Shen
Pages: 1-8
doi>10.1109/IPDPS.2009.5161180
Full text available: Publisher Site

This paper introduces an extensible distributed file system framework, YaFS, using heterogeneous online storage services as its back-ends. It provides a configurable solution for simplifying the usage of multiple storage resources and accessing data ...
Dynamic adaptive redundancy for quality-of-service control in wireless sensor networks
Ing-Ray Chen, Anh Phan Speer, Mohamed Eltoweissy
Pages: 1-8
doi>10.1109/IPDPS.2009.5161181
Full text available: Publisher Site

In this paper, we develop and evaluate a new concept of adaptive optimal redundancy to efficiently provide wireless sensor network (WSN) users with QoS-aware information services. Our approach to satisfying application QoS requirements while maximizing ...
The effect of heavy-tailed distribution on the performance of non-contiguous allocation strategies in 2D mesh connected multicomputers
Saad Bani Mohammad
Pages: 1-8
doi>10.1109/IPDPS.2009.5161182
Full text available: Publisher Site

The performance of non-contiguous allocation strategies has been evaluated under the assumption that the number of messages sent by jobs, which is one of the factors that the job execution times depend on, follows an exponential distribution. However, ...
Energy efficient and seamless data collection with mobile sinks in massive sensor networks
Taisoo Park, Daeyoung Kim, Seonghun Jang, Seong-eun Yoo, Yohhan Lee
Pages: 1-8
doi>10.1109/IPDPS.2009.5161183
Full text available: Publisher Site

Wireless Sensor Networks (WSNs) enable the surveillance and reconnaissance of a particular area with low cost and less manpower. However, the biggest obstacle to the commercialization of WSNs is the limited lifetime of the battery-operated sensor ...
Priority-based QoS MAC protocol for wireless sensor networks
Hoon Kim, Sung-Gi Min
Pages: 1-8
doi>10.1109/IPDPS.2009.5161184
Full text available: Publisher Site

The media access control (MAC) protocol in wireless sensor networks provides a periodic listen/sleep state for protection from overhearing and idle listening. However, many scenarios and applications exist in which sensor nodes must send data quickly ...
Experimental evaluation of a WSN platform power consumption
Ch. Antonopoulos, A. Prayati, T. Stoyanova, C. Koulamas, G. Papadopoulos
Pages: 1-8
doi>10.1109/IPDPS.2009.5161185
Full text available: Publisher Site

Critical characteristics of wireless sensor networks, such as being autonomous and comprising small or miniature devices, are achieved at the expense of very strict limitations on available energy. Therefore, it is apparent that optimal resource management ...
Throughput-fairness tradeoff in Best Effort flow control for on-chip architectures
Fahimeh Jafari, Mohammad S. Talebi, Mohammad H. Yaghmaee, Ahmad Khonsari, Mohamed Ould-Khaoua
Pages: 1-8
doi>10.1109/IPDPS.2009.5161186
Full text available: Publisher Site

We consider two flow control schemes for Best Effort traffic in on-chip architectures, which can be deemed as the solutions to the boundary extremes of a class of utility maximization problem. At one extreme, we consider the so-called Rate-Sum flow control ...
Analysis of data scheduling algorithms in supporting real-time multi-item requests in on-demand broadcast environments
Jun Chen, Kai Liu, Victor C. S. Lee
Pages: 1-8
doi>10.1109/IPDPS.2009.5161187
Full text available: Publisher Site

On-demand broadcast is an effective wireless data dissemination technique to enhance system scalability and capability to handle dynamic data access patterns. Previous studies on time-critical on-demand data broadcast were under the assumption that each ...
Network processing performability evaluation on heterogeneous reliability multicore processors using SRN model
Peter D. Ungsunan, Chuang Lin, Yang Wang, Yi Gai
Pages: 1-6
doi>10.1109/IPDPS.2009.5161188
Full text available: Publisher Site

Future network systems and embedded infrastructure devices in ubiquitous environments will need to consume low power and process large amounts of network packet traffic. In order to meet necessary high processing efficiency requirements, future processors ...
A statistical study on the impact of wireless signals' behavior on location estimation accuracy in 802.11 fingerprinting systems
Reza Farivar, David Wiczer, Alejandro Gutierrez, Roy H. Campbell
Pages: 1-8
doi>10.1109/IPDPS.2009.5161189
Full text available: Publisher Site

Much of the recent interest in location estimation systems has focused on 802.11 fingerprinting. Unlike GPS systems, 802.11 based systems can accurately estimate a user's location inside buildings. Moreover, users don't need any special equipment to ...
Performance prediction for running workflows under role-based authorization mechanisms
Ligang He, Mark Calleja, Mark Hayes, Stephen A. Jarvis
Pages: 1-8
doi>10.1109/IPDPS.2009.5161190
Full text available: Publisher Site

When investigating the performance of running scientific/commercial workflows in parallel and distributed systems, we often take into account only the resources allocated to the tasks constituting the workflow, assuming that computational resources will ...
Routing, data gathering, and neighbor discovery in delay-tolerant wireless sensor networks
Abbas Nayebi, Hamid Sarbazi-Azad, Gunnar Karlsson
Pages: 1-6
doi>10.1109/IPDPS.2009.5161191
Full text available: Publisher Site

This paper investigates a class of mobile wireless sensor networks that are not connected most of the time. The characteristics of these networks are inherited from both delay-tolerant networks (DTNs) and wireless sensor networks. First, delay-tolerant ...
A Service Discovery protocol for vehicular ad hoc networks: A proof of correctness
Azzedine Boukerche, Kaouther Abrougui
Pages: 1-8
doi>10.1109/IPDPS.2009.5161192
Full text available: Publisher Site

Recently, vehicle networks have been gaining a great deal of attention from the research community. In order to provide efficient and pervasive road communication, Next Generation Vehicular Networks (NVN) are considered a promising solution. NVNs have unique ...
A QoS aware multicast algorithm for wireless mesh networks
Liang Zhao, Ahmed Yassin Al-Dubai, Geyong Min
Pages: 1-8
doi>10.1109/IPDPS.2009.5161193
Full text available: Publisher Site

Wireless mesh networks have been attracting significant attention as a promising technology. They are becoming a major avenue for the fourth generation of wireless mobility. Communication in large-scale wireless networks can create bottlenecks for ...
Design and implementation of a novel MAC layer handoff protocol for IEEE 802.11 wireless networks
Zhenxia Zhang, Azzedine Boukerche
Pages: 1-5
doi>10.1109/IPDPS.2009.5161194
Full text available: Publisher Site

In recent years, IEEE 802.11 wireless networks have become one of the most important components in wireless networks, since compared with other wireless technologies, IEEE 802.11 devices are inexpensive and easier to configure. To provide seamless roaming ...
The 8th International Workshop on Performance Modeling, Evaluation, and Optimization of Ubiquitous Computing and Networked Systems (PMEO-UCNS'09)
Pages: 1-4
doi>10.1109/IPDPS.2009.5161195
Full text available: Publisher Site
Energy benefits of reconfigurable hardware for use in underwater sensor nets
Bridget Benson, Ali Irturk, Junguk Cho, Ryan Kastner
Pages: 1-7
doi>10.1109/IPDPS.2009.5161196
Full text available: Publisher Site

Small, dense underwater sensor networks have the potential to greatly improve undersea environmental and structural monitoring. However, few sensor nets exist because commercially available underwater acoustic modems are too costly and energy inefficient ...
The Radio Virtual Machine: A solution for SDR portability and platform reconfigurability
Riadh Ben Abdallah, Tanguy Risset, Antoine Fraboulet, Yves Durand
Pages: 1-4
doi>10.1109/IPDPS.2009.5161197
Full text available: Publisher Site

Instead of a single circuit dedicated to a particular physical (PHY) layer standard, a Software Defined Radio (SDR) platform embeds several hardware accelerators which enable it to support different modulation schemes. In this study we propose an architecture ...
A multiprocessor self-reconfigurable JPEG2000 encoder
Antonino Tumeo, Simone Borgio, Davide Bosisio, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Pages: 1-8
doi>10.1109/IPDPS.2009.5161198
Full text available: Publisher Site

This paper presents a multiprocessor architecture prototype on a Field Programmable Gate Arrays (FPGA) with support for hardware and software multithreading. Thanks to partial dynamic reconfiguration, this system can, at run time, spawn both software ...
System-level runtime mapping exploration of reconfigurable architectures
Kamana Sigdel, Mark Thompson, Andy D. Pimentel, Carlo Galuzzi, Koen Bertels
Pages: 1-8
doi>10.1109/IPDPS.2009.5161199
Full text available: Publisher Site

Dynamic reconfigurable systems can evolve under various conditions due to changes imposed either by the architecture, or by the applications, or by the environment. In such systems, the design process becomes more sophisticated as all the design decisions ...
Efficient implementation of QRD-RLS algorithm using hardware-software co-design
Nupur Lodha, Nivesh Rai, Aarthy Krishnamurthy, Hrishikesh Venkataraman
Pages: 1-4
doi>10.1109/IPDPS.2009.5161200
Full text available: Publisher Site

This paper presents the implementation of QR Decomposition based Recursive Least Square (QRD-RLS) algorithm on Field Programmable Gate Arrays (FPGA) using hardware-software co-design. The system has been implemented on Xilinx Spartan 3E FPGA with Microblaze ...
3D FPGA resource management and fragmentation metric for hardware multitasking
J. A. Valero, J. Septien, D. Mozos, H. Mecha
Pages: 1-7
doi>10.1109/IPDPS.2009.5161201
Full text available: Publisher Site

This research work presents a novel proposal to get hardware multitasking in 3D FPGAs. Such architectures are still academic, but recent advances in 3D IC technologies allow foreseeing true 3D FPGAs in the near future. Starting from models for the 3D ...
Achieving network on chip fault tolerance by adaptive remapping
Cristinel Ababei, Rajendra Katti
Pages: 1-4
doi>10.1109/IPDPS.2009.5161202
Full text available: Publisher Site

This paper investigates achieving fault tolerance by adaptive remapping in the context of Networks on Chip. The problem of dynamic application remapping is formulated and an efficient algorithm is proposed to address single and multiple PE failures. ...
Generation of Synthetic Floating-Point benchmark circuits
Thomas C. P. Chau, Sam M. H. Ho, Philip H. W. Leong, Peter Zipf, Manfred Glesner
Pages: 1-9
doi>10.1109/IPDPS.2009.5161203
Full text available: Publisher Site

Synthetic Floating-Point (SFP), a synthetic benchmark generator program for floating-point circuits is presented. SFP consists of two independent modules for characterisation and generation. The characterisation module extracts key dataflow statistics ...
A MicroBlaze specific co-processor for real-time hyperelliptic curve cryptography on Xilinx FPGAs
Alexander Klimm, Oliver Sander, Jurgen Becker
Pages: 1-8
doi>10.1109/IPDPS.2009.5161204
Full text available: Publisher Site

A Hardware/Software Codesign approach based on a MicroBlaze softcore processor and a GF(2^n)-coprocessor module to form a minimal hardware architecture for HECC on low-cost Xilinx FPGAs is described in this paper. Exploiting the features of the ...
On the acceptance tests of aperiodic real-time tasks for FPGAs
Ahmed A. El Farag, Hatem M. El-Boghdadi, Samir I. Shaheen
Pages: 1-4
doi>10.1109/IPDPS.2009.5161205
Full text available: Publisher Site

Partially Runtime-Reconfigurable devices allow tasks to be placed and removed dynamically at runtime. For real-time systems, tasks have to complete their work and also to meet their deadlines. It is important to decide at arrival time whether the real-time ...
Implementing protein seed-based comparison algorithm on the SGI RASC-100 platform
Van-Hoa Nguyen, Alexandre Cornu, Dominique Lavenier
Pages: 1-7
doi>10.1109/IPDPS.2009.5161206
Full text available: Publisher Site

This paper describes a parallel FPGA implementation of a genomic sequence comparison algorithm for finding similarities between a large set of protein sequences and full genomes. Results comparable to the tblastn program from the BLAST family are provided ...
High performance true random number generator based on FPGA block RAMs
Tamas Gyorfi, Octavian Cret, Alin Suciu
Pages: 1-8
doi>10.1109/IPDPS.2009.5161207
Full text available: Publisher Site

This paper presents a new method for creating TRNGs in Xilinx FPGAs. Due to its simplicity and ease of implementation, the design constitutes a valuable alternative to existing methods for creating single-chip TRNGs. Its main advantages are the high ...
High-level estimation and trade-off analysis for adaptive real-time systems
Ingo Sander, Jun Zhu, Axel Jantsch, Andreas Herrholz, Philipp A. Hartmann, Wolfgang Nebel
Pages: 1-4
doi>10.1109/IPDPS.2009.5161208
Full text available: Publisher Site

We propose a novel design estimation method for adaptive streaming applications to be implemented on a partially reconfigurable FPGA. Based on experimental results we enable accurate design cost estimates at an early design stage. Given the size and ...
Hardware accelerated montecarlo financial simulation over low cost FPGA cluster
J. Castillo, Jose L. Bosque, E. Castillo, P. Huerta, J. I. Martinez
Pages: 1-8
doi>10.1109/IPDPS.2009.5161209
Full text available: Publisher Site

The use of computational systems to help make the right investment decisions in financial markets is an open research field where multiple efforts have been carried out during the last few years. The ability of improving the assessment process and ...
Design and implementation of the Quarc Network on-Chip
M. Moadeli, P. P. Maji, W. Vanderbauwhede
Pages: 1-9
doi>10.1109/IPDPS.2009.5161210
Full text available: Publisher Site

Networks-on-Chip (NoC) have emerged as an alternative to buses to provide a packet-switched communication medium for modular development of large Systems-on-Chip. However, to successfully replace its predecessor, the NoC has to be able to efficiently exchange ...
On-line task management for a reconfigurable cryptographic architecture
Ivan Beretta, Vincenzo Rana, Marco D. Santambrogio, Donatella Sciuto
Pages: 1-4
doi>10.1109/IPDPS.2009.5161211
Full text available: Publisher Site

The increasing amount of programmable logic provided by modern FPGAs makes it possible to execute multiple hardware applications on the same device. This approach is reinforced by dynamic reconfiguration, which allows a single part of the device to be ...
Double Throughput Multiply-Accumulate unit for FlexCore processor enhancements
Tung Thanh Hoang, Magnus Sjalander, Per Larsson-Edefors
Pages: 1-7
doi>10.1109/IPDPS.2009.5161212
Full text available: Publisher Site

As a simple five-stage General-Purpose Processor (GPP), the baseline FlexCore processor has a limited set of datapath units. By utilizing a flexible datapath interconnect and a wide control word, a FlexCore processor is explicitly designed to support ...
Modeling reconfiguration in a FPGA with a hardwired network on chip
Muhammad Aqeel Wahlah, Kees Goossens
Pages: 1-8
doi>10.1109/IPDPS.2009.5161213
Full text available: Publisher Site

We propose that FPGAs use a hardwired network on chip (HWNOC) as a unified interconnect for functional communications (data and control) as well as configuration (bitstreams for soft IP). In this paper we model such a platform. Using the HWNOC applications ...
Smith-Waterman implementation on a FSB-FPGA module using the Intel Accelerator Abstraction Layer
Jeff Allred, Jack Coyne, William Lynch, Vincent Natoli, Joseph Grecco, Joel Morrissette
Pages: 1-4
doi>10.1109/IPDPS.2009.5161214
Full text available: Publisher Site

The Smith-Waterman algorithm is employed in the field of Bioinformatics to find optimal local alignments of two DNA or protein sequences. It is a classic example of a dynamic programming algorithm. Because it is highly parallel both spatially and temporally ...
Flexible pipelining design for recursive variable expansion
Zubair Nawaz, Thomas Marconi, Koen Bertels, Todor Stefanov
Pages: 1-8
doi>10.1109/IPDPS.2009.5161215
Full text available: Publisher Site

Many image and signal processing kernels can be optimized for performance consuming a reasonable area by doing loops parallelization with extensive use of pipelining. This paper presents an automated flexible pipeline design algorithm for our unique ...
High-level synthesis with coarse grain reconfigurable components
George Economakos, Sotiris Xydis
Pages: 1-4
doi>10.1109/IPDPS.2009.5161216
Full text available: Publisher Site

High-level synthesis is the process of balancing the distribution of RTL components throughout the execution of applications. However, a lot of balancing and optimization opportunities exist below RTL. In this paper, a coarse grain reconfigurable RTL ...
A low cost and adaptable routing network for reconfigurable systems
Ricardo Ferreira, Marcone Laure, Antonio C. Beck, Thiago Lo, Mateus Rutzig, Luigi Carro
Pages: 1-8
doi>10.1109/IPDPS.2009.5161217
Full text available: Publisher Site

Nowadays, scalability, parallelism and fault-tolerance are key features to take advantage of the latest silicon technology advances, and that is why reconfigurable architectures are in the spotlight. However, one of the major problems in designing reconfigurable ...
Reconfigurable accelerator for WFS-based 3D-audio
Dimitris Theodoropoulos, Georgi Kuzmanov, Georgi Gayd
Pages: 1-8
doi>10.1109/IPDPS.2009.5161218
Full text available: Publisher Site

In this paper, we propose a reconfigurable and scalable hardware accelerator for 3D-audio systems based on the Wave Field Synthesis technology. Previous related work reveals that WFS sound systems are based on using standard PCs. However, two major obstacles ...
ARMLang: A language and compiler for programming reconfigurable mesh many-cores
Heiner Giefers, Marco Platzner
Pages: 1-8
doi>10.1109/IPDPS.2009.5161219
Full text available: Publisher Site

The reconfigurable mesh serves as a theoretical model for massively parallel computing, but has recently been investigated as a practical architecture for many-cores with light-weight, circuit-switched interconnects. There is a lack of programming environments, ...
Runtime decision of hardware or software execution on a heterogeneous reconfigurable platform
Vlad-Mihai Sima, Koen Bertels
Pages: 1-6
doi>10.1109/IPDPS.2009.5161220
Full text available: Publisher Site

In this paper, we present a runtime optimization targeting the speedup of applications running on a reconfigurable platform supporting the MOLEN programming paradigm. More specifically, for functions that have an execution time dependent on parameters, ...
Impact of run-time reconfiguration on design and speed - A case study based on a grid of run-time reconfigurable modules inside a FPGA
Jochen Strunk, Toni Volkmer, Klaus Stephan, Wolfgang Rehm, Heiko Schick
Pages: 1-8
doi>10.1109/IPDPS.2009.5161221
Full text available: Publisher Site

This paper examines the feasibility of utilizing a grid of run-time reconfigurable (RTR) modules on a dynamically and partially reconfigurable (DPR) FPGA. The aim is to create a homogeneous array of RTR regions on a FPGA, which can be reconfigured on ...
Scheduling tasks on reconfigurable hardware with a list scheduler
Justin Teller, Fusun Ozguner
Pages: 1-4
doi>10.1109/IPDPS.2009.5161222
Full text available: Publisher Site

In this paper, we propose a static (compile-time) scheduling extension that considers reconfiguration and task execution together when scheduling tasks on reconfigurable hardware, designated as Mutually Exclusive Groups (-MEG), that can be used to extend ...
RDMS: A hardware task scheduling algorithm for Reconfigurable Computing
Miaoqing Huang, Harald Simmler, Olivier Serres, Tarek El-Ghazawi
Pages: 1-8
doi>10.1109/IPDPS.2009.5161223
Full text available: Publisher Site

Reconfigurable Computers (RC) can provide significant performance improvement for domain applications. However, wide acceptance of today's RCs among domain scientists is hindered by the complexity of design tools and the required hardware design experience. ...
Software-like debugging methodology for reconfigurable platforms
Loic Lagadec, Damien Picard
Pages: 1-4
doi>10.1109/IPDPS.2009.5161224
Full text available: Publisher Site

This paper presents a new debugging methodology for applications targeting reconfigurable platforms. The key idea is that bringing the advantages of software engineering techniques to hardware design would reduce design cycles and hence time-to-market. ...
Evaluation of a multicore reconfigurable architecture with variable core sizes
Vu Manh Tuan, Naohiro Katsura, Hiroki Matsutani, Hideharu Amano
Pages: 1-8
doi>10.1109/IPDPS.2009.5161225
Full text available: Publisher Site

A multicore architecture for processors has emerged as a dominant trend in the chip making industry. As reconfigurable devices gradually prove their capability in improving computation power while preserving flexibility, we are examining a multicore ...
Reconfigurable architectures workshop - RAW
Pages: 1-3
doi>10.1109/IPDPS.2009.5161226
Full text available: Publisher Site
Performability evaluation of EFT systems for SLA assurance
Erica Sousa, Paulo Maciel, Carlos Araujo, Fabio Chicout
Pages: 1-8
doi>10.1109/IPDPS.2009.5161227
Full text available: Publisher Site

The performance evaluation of Electronic Funds Transfer (EFT) Systems has an enormous importance for Electronic Transactions providers, since the computing resources must be efficiently used in order to attain requirements defined in Service Level Agreements ...
A global scheduling framework for virtualization environments
Yoav Etsion, Tal Ben-Nun, Dror G. Feitelson
Pages: 1-8
doi>10.1109/IPDPS.2009.5161228
Full text available: Publisher Site

A premier goal of resource allocators in virtualization environments is to control the relative resource consumption of the different virtual machines, and moreover, to be able to change the relative allocations at will. However, it is not clear what ...
Symmetric Mapping: An architectural pattern for resource supply in grids and clouds
Xavier Grehant, Isabelle Demeure
Pages: 1-8
doi>10.1109/IPDPS.2009.5161229
Full text available: Publisher Site

This paper presents the Symmetric Mapping pattern, an architectural pattern for the design of resource supply systems. The focus of Symmetric Mapping is on separation of concerns for cost-effective resource allocation. It divides resource supply in three ...
Application level I/O caching on Blue Gene/P systems
Seetharami Seelam, I-Hsin Chung, John Bauer, Hao Yu, Hui-Fang Wen
Pages: 1-8
doi>10.1109/IPDPS.2009.5161230
Full text available: Publisher Site

In this paper, we present an application level aggressive I/O caching and prefetching system to hide I/O access latency experienced by out-of-core applications. Without the application level prefetching and caching capability, users of I/O intensive ...
Low power mode in cloud storage systems
Danny Harnik, Dalit Naor, Itai Segall
Pages: 1-8
doi>10.1109/IPDPS.2009.5161231
Full text available: Publisher Site

We consider large scale, distributed storage systems with a redundancy mechanism; cloud storage being a prime example. We investigate how such systems can reduce their power consumption during low-utilization time intervals by operating in a low-power ...
Blue Eyes: Scalable and reliable system management for cloud computing
Sukhyun Song, Kyung Dong Ryu, Dilma Da Silva
Pages: 1-8
doi>10.1109/IPDPS.2009.5161232
Full text available: Publisher Site

With the advent of cloud computing, massive and automated system management has become more important for successful and economical operation of computing resources. However, traditional monolithic system management solutions are designed to scale to ...
Predicting cache needs and cache sensitivity for applications in cloud computing on CMP servers with configurable caches
Jacob Machina, Angela Sodan
Pages: 1-8
doi>10.1109/IPDPS.2009.5161233
Full text available: Publisher Site

QoS criteria in cloud computing require guarantees about application runtimes, even if CMP servers are shared among multiple parallel or serial applications. Performance of computation-intensive application depends significantly on memory performance ...
Resource monitoring and management with OVIS to enable HPC in cloud computing environments
Jim Brandt, Ann Gentile, Jackson Mayo, Philippe Pebay, Diana Roe, David Thompson, Matthew Wong
Pages: 1-8
doi>10.1109/IPDPS.2009.5161234
Full text available: Publisher Site

Using the cloud computing paradigm, a host of companies promise to make huge compute resources available to users on a pay-as-you-go basis. These resources can be configured on the fly to provide the hardware and operating system of choice to the customer ...
Distributed management of virtual cluster infrastructures
Michael A. Murphy, Michael Fenn, Linton Abraham, Joshua A. Canter, Benjamin T. Sterrett, Sebastien Goasguen
Pages: 1-8
doi>10.1109/IPDPS.2009.5161235
Full text available: Publisher Site

Cloud services that provide virtualized computational clusters present a dichotomy of systems management challenges, as the virtual clusters may be owned and administered by one entity, while the underlying physical fabric may belong to a different entity. ...
Desktop to cloud transformation planning
Kirk Beaty, Andrzej Kochut, Hidayatullah Shaikh
Pages: 1-8
doi>10.1109/IPDPS.2009.5161236
Full text available: Publisher Site

Traditional desktop delivery model is based on a large number of distributed PCs executing operating system and desktop applications. Managing traditional desktop environments is incredibly challenging and costly. Tasks like installations, configuration ...
Fifth International Workshop on System Management Techniques, Processes, and Services (SMTPS)
Jose E. Moreira
Page: 1
doi>10.1109/IPDPS.2009.5161237
Full text available: Publisher Site

It is our pleasure to welcome all speakers, authors and participants to this fifth edition of the International Workshop on System Management Techniques, Processes, and Services (SMTPS), organized as a 2009 IPDPS (IEEE International Parallel & Distributed ...
Security analysis of Micali's fair contract signing protocol by using Coloured Petri Nets: Multi-session case
Panupong Sornkhom, Yongyuth Permpoontanalarp
Pages: 1-8
doi>10.1109/IPDPS.2009.5161238
Full text available: Publisher Site

Micali proposed a simple and practical optimistic fair exchange protocol, called ECS1, for contract signing. Bao et al. found some message replay attacks in both the original ECS1 and a modified ECS1 where the latter aims to solve an ambiguity in the ...
Modeling and analysis of self-stopping BTWorms using dynamic hit list in P2P networks
Jiaqing Luo, Bin Xiao, Guobin Liu, Qingjun Xiao, Shijie Zhou
Pages: 1-8
doi>10.1109/IPDPS.2009.5161239
Full text available: Publisher Site

Worm propagation analysis, including exploring mechanisms of worm propagation and formulating effects of network/worm parameters, has great importance for worm containment and host protection in P2P networks. Previous work only focuses on topological ...
SFTrust: A double trust metric based trust model in unstructured P2P system
Yunchang Zhang, Shanshan Chen, Geng Yang
Pages: 1-7
doi>10.1109/IPDPS.2009.5161240
Full text available: Publisher Site

The P2P system is an anonymous and dynamic system, which offers enormous opportunities, and also presents potential threats and risks. In order to restrain malicious behaviors in P2P system, previous studies try to establish efficient trust models on ...
TLS client handshake with a payment card
David J. Boyd
Pages: 1-8
doi>10.1109/IPDPS.2009.5161241
Full text available: Publisher Site

Transport Layer Security (TLS) is the de facto standard for preventing eavesdropping, tampering or message forgery of higher-risk Internet communications, for example when making a payment. At heart TLS is a stateful cryptographic protocol built around ...
Design of a parallel AES for graphics hardware using the CUDA framework
Andrea Di Biagio, Alessandro Barenghi, Giovanni Agosta, Gerardo Pelosi
Pages: 1-8
doi>10.1109/IPDPS.2009.5161242
Full text available: Publisher Site

Web servers often need to manage encrypted transfers of data. The encryption activity is computationally intensive, and exposes a significant degree of parallelism. At the same time, cheap multicore processors are readily available on graphics hardware, ...
A new RFID authentication protocol with resistance to server impersonation
Mete Akgun, M. Ufuk Caglayan, Emin Anarim
Pages: 1-8
doi>10.1109/IPDPS.2009.5161243
Full text available: Publisher Site

Security is one of the main obstacles to adopting RFID technology in daily use. Due to the resource constraints of RFID systems, it is difficult to design a private authentication protocol based on existing cryptographic functions. In this paper, we propose ...
Intrusion detection and tolerance for transaction based applications in wireless environments
Yacine Djemaiel, Noureddine Boudriga
Pages: 1-8
doi>10.1109/IPDPS.2009.5161244
Full text available: Publisher Site

Nowadays, many intrusion detection and tolerance systems have been proposed in order to detect attacks in both wired and wireless networks. Even if these solutions have shown some efficiency by detecting a set of complex attacks in wireless environments, ...
A topological approach to detect conflicts in firewall policies
Subana Thanasegaran, Yi Yin, Yuichiro Tateiwa, Yoshiaki Katayama, Naohisa Takahashi
Pages: 1-7
doi>10.1109/IPDPS.2009.5161245
Full text available: Publisher Site

Packet filtering provides an initial layer of security based upon a set of ordered filters called firewall policies. It examines the network packets and decides whether to accept or deny them. But when a packet matches two or more filters, conflicts arise. ...
Automated detection of confidentiality goals
Anders Moen Hagalisletto
Pages: 1-8
doi>10.1109/IPDPS.2009.5161246
Full text available: Publisher Site

The security goals of an authentication protocol specify the high level properties of a protocol. Despite the importance of goals, these are rarely specified explicitly. Yet, a qualified analysis of a security protocol requires that the goals are stated ...
Performance analysis of distributed intrusion detection protocols for mobile group communication systems
Jin-Hee Cho, Ing-Ray Chen
Pages: 1-8
doi>10.1109/IPDPS.2009.5161247
Full text available: Publisher Site

Under highly security vulnerable, resource restricted, and dynamically changing mobile ad hoc environments, it is critical to be able to maximize the system lifetime while bounding the communication response time for mission-oriented mobile groups. In ...
Combating side-channel attacks using key management
Donggang Liu, Qi Dong
Pages: 1-8
doi>10.1109/IPDPS.2009.5161248
Full text available: Publisher Site

Embedded devices are widely used in military and civilian operations. They are often unattended, publicly accessible, and thus vulnerable to physical capture. Tamper-resistant modules are popular for protecting sensitive data such as cryptographic keys ...
The 5th International Workshop on Security in Systems and Networks
Bin Xiao
Pages: 1-2
doi>10.1109/IPDPS.2009.5161249
Full text available: Publisher Site

The International Workshop on Security in Systems and Networks is a forum for the presentation and discussion of approaches, research findings, and experiences in the area of privacy, integrity, and availability of resources in distributed systems. This ...
Authors
Pages: 1-50
doi>10.1109/IPDPS.2009.5161250
Full text available: Publisher Site

The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2016 ACM, Inc.