Abstract
As cloud and utility computing spreads, computer architects must ensure continued capability growth for the data centers that make up the cloud. Given megawatt-scale power budgets, increasing data center capability requires increasing the energy efficiency of computing hardware. To increase the data center's capability for work, the work done per Joule must increase. We pursue this efficiency even as the nature of data center applications evolves. Unlike traditional enterprise workloads, which are typically memory or I/O bound, big data computation and analytics exhibit greater compute intensity. This article examines the efficiency of mobile processors as a means of increasing data center capability. In particular, we compare and contrast the performance and efficiency of the Microsoft Bing search engine executing on the mobile-class Atom processor and the server-class Xeon processor. Bing implements statistical machine learning to dynamically rank pages, producing sophisticated search results but also increasing computational intensity. While mobile processors are energy-efficient, they exact a price for that efficiency. The Atom is 5× more energy-efficient than the Xeon when comparing queries per Joule. However, search queries on Atom encounter higher latencies, different page results, and diminished robustness for complex queries. Despite these challenges, quality of service is maintained for most common queries. Moreover, as different computational phases of the search engine encounter different bottlenecks, we describe implications for future architectural enhancements, application tuning, and system architectures. After optimizing the Atom server platform, a large share of power and cost goes toward processor capability. With optimized Atoms, more servers can fit in a given data center power budget. For a data center with a 15 MW critical load, Atom-based servers increase capability by 3.2× for Bing.
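The relationship between queries per Joule and data center capability under a fixed power budget can be illustrated with a short calculation. The sketch below is not the article's model: the function names are hypothetical, and the per-server throughput and power figures are placeholders chosen for round arithmetic, not measurements from the study. Only the 15 MW budget comes from the abstract.

```python
# Illustrative sketch only: server figures below are placeholders,
# not measurements reported in the article.

def queries_per_joule(qps: float, watts: float) -> float:
    """Throughput (queries/s) divided by power (Joules/s) gives queries per Joule."""
    return qps / watts


def servers_in_budget(budget_watts: int, server_watts: int) -> int:
    """Number of whole servers that fit within the critical-load power budget."""
    return budget_watts // server_watts


def datacenter_capability(budget_watts: int, server_watts: int, server_qps: float) -> float:
    """Aggregate queries/s when the budget is filled with identical servers."""
    return servers_in_budget(budget_watts, server_watts) * server_qps


BUDGET_W = 15_000_000  # 15 MW critical load, as in the abstract

# Hypothetical server-class and mobile-class platforms (whole-server power).
server_class = {"qps": 100.0, "watts": 300}
mobile_class = {"qps": 25.0, "watts": 25}

efficiency_ratio = (
    queries_per_joule(mobile_class["qps"], mobile_class["watts"])
    / queries_per_joule(server_class["qps"], server_class["watts"])
)
capability_ratio = (
    datacenter_capability(BUDGET_W, mobile_class["watts"], mobile_class["qps"])
    / datacenter_capability(BUDGET_W, server_class["watts"], server_class["qps"])
)

print(f"efficiency ratio: {efficiency_ratio:.2f}")
print(f"capability ratio: {capability_ratio:.2f}")
```

In this simplified model the capability gain tracks the whole-server efficiency ratio, so any platform power that is not spent on the processor lowers the gain relative to a processor-only comparison.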