Abstract
The explosive growth of digital data, combined with the demand to preserve much of it over the long term, has made it imperative to find a storage approach that is more cost-effective than HDD arrays and more easily accessible than tape libraries for massive amounts of data. While modern optical discs can guarantee more than 50 years of data preservation without media replacement, the limited performance and capacity of individual optical discs relative to HDDs or tapes have significantly restricted their use in datacenters. This article presents a Rack-scale Optical disc library System, or ROS for short, which provides PB-level total capacity and inline accessibility on thousands of optical discs built within a 42U rack. A rotatable roller and a robotic arm that separates and fetches discs are designed to improve disc placement density and simplify the mechanical structure. A hierarchical storage system based on SSDs, hard disks, and optical discs is proposed to effectively hide the delay of mechanical operations. In addition, an optical library file system (OLFS) based on FUSE is proposed to schedule mechanical operations and organize data on the tiered storage behind a POSIX user interface, providing an illusion of inline data accessibility. We further optimize OLFS by reducing unnecessary user/kernel context switches inherited from the legacy FUSE framework. We evaluate ROS on a few key performance metrics, including operation delays of the mechanical structure and software overhead, in a prototype PB-level ROS system. The results show that ROS, stacked on Samba and FUSE in network-attached storage (NAS) mode, almost saturates the throughput provided by the underlying Samba over a 10GbE network for external users, and in this scenario provides about 53ms file write latency and 15ms read latency, exhibiting its inline accessibility. Moreover, ROS effectively hides and virtualizes its complex internal operational behaviors and is easily deployable in datacenters.
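To illustrate how a tiered hierarchy can hide mechanical delay, the following is a minimal sketch, not the OLFS implementation. It assumes a toy model in which each tier is an in-memory dictionary; the class `TieredStore`, the `burn` step, and the `disc_loads` counter are hypothetical names introduced only for this example. Writes are acknowledged at the fast tier, and a read that falls through to the optical tier is cached on the HDD tier so that later reads avoid a disc fetch.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy three-tier store: SSD -> HDD -> optical disc (all in-memory dicts)."""
    ssd: dict = field(default_factory=dict)
    hdd: dict = field(default_factory=dict)
    optical: dict = field(default_factory=dict)
    disc_loads: int = 0  # counts simulated mechanical disc fetches

    def write(self, path: str, data: bytes) -> None:
        # Writes are acknowledged once they reach the fast tier.
        self.ssd[path] = data

    def burn(self) -> None:
        # Hypothetical background step: migrate buffered files to optical discs.
        for path, data in list(self.ssd.items()):
            self.optical[path] = data
            del self.ssd[path]

    def read(self, path: str) -> bytes:
        # Serve from the fastest tier that holds the file.
        if path in self.ssd:
            return self.ssd[path]
        if path in self.hdd:
            return self.hdd[path]
        # Miss in both caches: a (simulated) mechanical disc load is needed.
        data = self.optical[path]
        self.disc_loads += 1
        self.hdd[path] = data  # cache on HDD so later reads skip the mechanics
        return data


store = TieredStore()
store.write("/archive/a.bin", b"payload")
store.burn()
first = store.read("/archive/a.bin")   # falls through to the optical tier
second = store.read("/archive/a.bin")  # served from the HDD cache
print(first == second, store.disc_loads)
```

In this sketch only the first read after burning pays for a disc load; a real system would also need eviction, scheduling of the robotic arm, and failure handling, none of which are modeled here.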
ROS: A Rack-based Optical Storage System with Inline Accessibility for Long-Term Data Preservation