Abstract
Hierarchical storage architectures are required to meet both the capacity and the bandwidth requirements of future high-end storage systems. In this paper we present the results of an evaluation of an emerging technology, DataDirect Networks' (DDN) Infinite Memory Engine (IME). IME makes it possible to realize a fast buffer in front of a large-capacity storage system. We collected benchmarking data with IOR and with the HPC application NEST. The IOR bandwidth results show how well the network bandwidth towards such a fast buffer can be exploited compared to the external storage system. The NEST benchmarks clearly demonstrate that IME can reduce I/O-induced load imbalance between MPI ranks to a minimum while speeding up I/O as a whole by a considerable factor. In addition to these direct measurements, we develop a performance model for NEST. In combination with a generic and abstract burst buffer architecture, this model predicts the burst buffer and I/O parameters needed to achieve specific performance goals for NEST on HPC clusters of varying size. Specifically, we investigate the parameter range in which burst buffers are able to counteract the widening performance gap between compute and I/O.
Evaluation and Performance Modeling of a Burst Buffer Solution