Abstract
Data visualization is a thriving field of computer science, with widespread impact on diverse scientific disciplines, from medicine and meteorology to visual data mining. Advances in large-scale storage systems, as well as low-level storage technology, have played a significant role in accelerating the applicability and adoption of modern visualization techniques. Ironically, “the cobbler’s children have no shoes”: researchers who wish to analyze storage systems and devices are usually limited to a variety of static histograms and basic displays.
The dynamic nature of data movement on flash has motivated the introduction of SSDPlayer, a graphical tool for visualizing the various processes that cause data movement on solid-state drives (SSDs). In 2015, we used the initial version of SSDPlayer to demonstrate how visualization can assist researchers and developers in their understanding of modern, complex flash-based systems. While we continued to use SSDPlayer for analysis purposes, we found it extremely useful for education and presentation purposes as well. In this article, we describe our experience from two years of using, sharing, and extending SSDPlayer and how similar techniques can further advance storage systems research and education.
Experience from Two Years of Visualizing Flash with SSDPlayer