research-article

Mobile processors for energy-efficient web search

Published: 30 August 2011

Abstract

As cloud and utility computing spreads, computer architects must ensure continued capability growth for the data centers that make up the cloud. Given megawatt-scale power budgets, increasing data center capability requires increasing the energy efficiency of computing hardware: to do more work within a fixed power budget, the work done per Joule must increase. We pursue this efficiency even as the nature of data center applications evolves. Unlike traditional enterprise workloads, which are typically memory or I/O bound, big data computation and analytics exhibit greater compute intensity. This article examines the efficiency of mobile processors as a means of increasing data center capability. In particular, we compare the performance and efficiency of the Microsoft Bing search engine executing on the mobile-class Atom processor and the server-class Xeon processor. Bing uses statistical machine learning to rank pages dynamically, producing sophisticated search results but also increasing computational intensity. While mobile processors are energy-efficient, they exact a price for that efficiency. The Atom is 5× more energy-efficient than the Xeon when comparing queries per Joule. However, search queries on Atom encounter higher latencies, different page results, and diminished robustness for complex queries. Despite these challenges, quality of service is maintained for most queries, which are the common ones. Moreover, as different computational phases of the search engine encounter different bottlenecks, we describe implications for future architectural enhancements, application tuning, and system architectures. Once the Atom server platform is optimized, a large share of its power and cost goes toward processor capability. With optimized Atoms, more servers can fit in a given data center power budget. For a data center with a 15 MW critical load, Atom-based servers increase capability by 3.2× for Bing.
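The abstract's argument rests on two pieces of arithmetic: queries per Joule as the efficiency metric, and the number of servers that fit under a fixed power budget. The following Python sketch makes that arithmetic concrete. Only the 15 MW budget comes from the abstract; the function names and the per-server throughput and power figures are hypothetical, chosen purely for illustration, and the sketch does not attempt to reproduce the article's 5× or 3.2× results.

```python
def queries_per_joule(queries_per_second: float, power_watts: float) -> float:
    """Efficiency metric: (queries/s) / (Joules/s) = queries per Joule."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return queries_per_second / power_watts


def servers_in_budget(budget_watts: float, server_watts: float) -> int:
    """Number of whole servers that fit under a fixed power budget."""
    if server_watts <= 0:
        raise ValueError("server_watts must be positive")
    return int(budget_watts // server_watts)


def datacenter_capability(budget_watts: float, server_qps: float,
                          server_watts: float) -> float:
    """Aggregate queries/s when the budget is filled with identical servers."""
    return servers_in_budget(budget_watts, server_watts) * server_qps


if __name__ == "__main__":
    BUDGET_W = 15e6  # 15 MW critical load, as stated in the abstract

    # Hypothetical per-server figures (NOT measurements from the article).
    servers = {
        "server-class": {"qps": 100.0, "watts": 300.0},
        "mobile-class": {"qps": 25.0, "watts": 30.0},
    }

    for name, s in servers.items():
        qpj = queries_per_joule(s["qps"], s["watts"])
        count = servers_in_budget(BUDGET_W, s["watts"])
        total = datacenter_capability(BUDGET_W, s["qps"], s["watts"])
        print(f"{name}: {qpj:.3f} queries/J, {count} servers, {total:.0f} queries/s")
```

Under a fixed power budget, aggregate throughput in this simplified model is the budget multiplied by queries per Joule (up to rounding to whole servers), which is why efficiency per Joule translates directly into data center capability. The model counts only per-server power and ignores latency and result quality, both of which the abstract identifies as costs of the mobile-class design.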



    • Published in

      ACM Transactions on Computer Systems  Volume 29, Issue 3
      August 2011
      104 pages
      ISSN: 0734-2071
      EISSN: 1557-7333
      DOI: 10.1145/2003690

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 30 August 2011
      • Accepted: 1 April 2011
      • Received: 1 March 2011

      Qualifiers

      • research-article
      • Research
      • Refereed
