Abstract
Consistently low response time is essential for e-commerce because of intense competitive pressure. However, practitioners of web applications often encounter the long-tail response time problem in cloud data centers once system utilization reaches moderate levels (e.g., 50%). Our fine-grained measurements of an open source n-tier benchmark application (RUBBoS) show that such long response times are often caused by Cross-tier Queue Overflow (CTQO). Our experiments reveal that CTQO is primarily created by the synchronous nature of RPC-style call/response inter-tier communication: because classic n-tier applications are composed of synchronous RPC/thread-based servers, the request processing chain creates strong inter-tier dependencies. We gradually remove these dependencies by replacing the classic synchronous servers (e.g., Apache, Tomcat, and MySQL) with their corresponding event-driven asynchronous versions (e.g., Nginx, XTomcat, and XMySQL) one by one. Our measurements with two application scenarios (virtual machine co-location and background monitoring interference) show that replacing only a subset of the servers with asynchronous versions merely shifts the CTQO, without significantly improving long-tail response time. Only when all the servers become asynchronous is the CTQO resolved. In synchronous n-tier applications, long-tail response times resulting from CTQO arise at utilization as low as 43%. In contrast, the completely asynchronous n-tier system can disrupt CTQO and remove the long-tail latency at utilization as high as 83%.
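The queue overflow mechanism described above can be illustrated with a toy two-tier model. This is a minimal sketch and not the paper's experimental setup: the function `simulate`, its parameters, and all numeric values are hypothetical, and the asynchronous case assumes that an event-driven tier can hand requests downstream without holding a thread and that the downstream tier can buffer them without a fixed limit.

```python
from collections import deque


def simulate(synchronous, ticks=40, arrivals=4, threads=8, backlog=8,
             downstream_rate=5, stall=range(10, 15)):
    """Toy two-tier model; returns the number of requests dropped upstream.

    All parameter values are arbitrary illustrations (assumptions),
    not figures taken from the measurements.
    """
    waiting = deque()   # upstream accept queue, bounded by `backlog`
    in_flight = 0       # requests currently handed to the downstream tier
    dropped = 0
    for t in range(ticks):
        # The downstream tier completes work unless it is transiently stalled.
        rate = 0 if t in stall else downstream_rate
        in_flight -= min(in_flight, rate)

        # New arrivals enter the upstream queue, or are dropped if it is full.
        for _ in range(arrivals):
            if len(waiting) < backlog:
                waiting.append(t)
            else:
                dropped += 1

        # The upstream tier forwards queued requests downstream.
        # Synchronous: every in-flight request pins one upstream thread.
        # Asynchronous (assumed): no thread is held while waiting.
        while waiting and (not synchronous or in_flight < threads):
            waiting.popleft()
            in_flight += 1
    return dropped


if __name__ == "__main__":
    print("synchronous drops: ", simulate(synchronous=True))
    print("asynchronous drops:", simulate(synchronous=False))
```

In this model a short downstream stall exhausts the upstream thread pool in the synchronous case, so the upstream queue fills and arrivals are dropped, while the asynchronous case absorbs the same stall. Real servers have further limits (memory, connection counts) that the sketch ignores.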
Index Terms
Mitigating Tail Response Time of n-Tier Applications: The Impact of Asynchronous Invocations