Abstract
Cloud providers routinely schedule multiple applications per physical host to increase efficiency. The resulting interference on shared resources often leads to performance degradation and, more importantly, security vulnerabilities. Interference can leak important information ranging from a service's placement to confidential data, like private keys. We present Bolt, a practical system that accurately detects the type and characteristics of applications sharing a cloud platform based on the interference an adversary sees on shared resources. Bolt leverages online data mining techniques that only require 2-5 seconds for detection. In a multi-user study on EC2, Bolt correctly identifies the characteristics of 385 out of 436 diverse workloads. Extracting this information enables a wide spectrum of previously impractical cloud attacks, including denial-of-service (DoS) attacks that increase tail latency by 140x, as well as resource-freeing attacks (RFA) and co-residency attacks. Finally, we show that while advanced isolation mechanisms, such as cache partitioning, lower detection accuracy, they are insufficient to eliminate these vulnerabilities altogether. To do so, one must either disallow core sharing or allow it only between threads of the same application, leading to significant inefficiencies and performance penalties.
Bolt: I Know What You Did Last Summer... In The Cloud
ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems