research-article

Prediction and Predictability for Search Query Acceleration

Published: 16 August 2016

Abstract

A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency, or high-percentile response time, of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency.
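The gap between the aggregate and per-server percentiles can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that server latencies are independent and that a query fans out to 100 servers; neither assumption comes from the article, and the function names are hypothetical.

```python
def aggregate_quantile(per_server_quantile: float, num_servers: int) -> float:
    """Fraction of queries for which ALL servers respond within their target,
    assuming independent server latencies (an illustrative assumption)."""
    return per_server_quantile ** num_servers


def required_per_server_quantile(target_aggregate: float, num_servers: int) -> float:
    """Per-server quantile needed so that the aggregated response meets
    target_aggregate under the same independence assumption."""
    return target_aggregate ** (1.0 / num_servers)


if __name__ == "__main__":
    servers = 100  # hypothetical fan-out, not a figure from the article
    # If each server only meets its target 99% of the time, most aggregated
    # responses include at least one slow server.
    print(aggregate_quantile(0.99, servers))
    # To meet the target for 99% of aggregated responses, each server must
    # meet it for roughly 99.99% of its queries.
    print(required_per_server_quantile(0.99, servers))
```

Under these assumptions, a 99th-percentile goal for the aggregated response translates into roughly a 99.99th-percentile goal for each server, which is the kind of extreme tail latency the abstract refers to.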

To address the tighter requirements of extreme tail latency, we propose a new design space for the problem that subsumes existing work and opens up a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify “when to predict?” as another key optimization question. This opens up a new solution: delaying the prediction by a short duration, which allows many short-running queries to complete without parallelization and, at the same time, allows the predictor to collect a set of dynamic features using runtime information. This new question expands the solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging “dynamic” features collected at runtime, which estimate query execution time with higher accuracy. Second, we can ask whether to override a prediction when its “predictability” is low. We show that considering predictability improves query acceleration by achieving a higher recall.
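The decision flow described in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Prediction` fields, the threshold values, and the choice to accelerate whenever predictability is low are all assumptions made for the example (the abstract only states that overriding low-predictability predictions yields higher recall).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    estimated_ms: float       # predicted total execution time
    expected_error_ms: float  # predictor's estimate of its own error;
                              # a large value means low predictability


def choose_mode(
    true_time_ms: float,
    predict: Callable[[], Prediction],
    delay_ms: float = 10.0,           # hypothetical prediction delay
    long_threshold_ms: float = 80.0,  # hypothetical "long-running" cutoff
    max_error_ms: float = 20.0,       # hypothetical predictability cutoff
) -> str:
    """Return 'sequential' or 'accelerate' for a single query."""
    # Delayed: short queries finish before any prediction is made.
    if true_time_ms <= delay_ms:
        return "sequential"
    # Dynamic: the predictor is only invoked after the delay, so it can use
    # runtime features gathered while the query was running.
    prediction = predict()
    # Selective: when predictability is low, override the prediction and
    # accelerate anyway (assumed direction of the override).
    if prediction.expected_error_ms > max_error_ms:
        return "accelerate"
    if prediction.estimated_ms >= long_threshold_ms:
        return "accelerate"
    return "sequential"
```

In this sketch the predictor is passed in as a callable so that it is never invoked for queries that complete within the delay, which mirrors the idea that short-running queries finish without parallelization or prediction cost.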

With this prediction, we propose to accelerate the queries that are predicted to be long-running. In our preliminary work, we focused on parallelization as an acceleration scenario. Here, we extend the approach to heterogeneous multicore hardware. This hardware combines processor cores with different microarchitectures, such as energy-efficient little cores and high-performance big cores, and accelerating web search on it has remained an open problem.
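One way to picture prediction-driven scheduling on such hardware is a simple core-selection rule. The sketch below is hypothetical: the function name, the fallback behavior when the preferred core type is busy, and the "wait" outcome are assumptions for illustration and are not taken from the article.

```python
def pick_core(predicted_long: bool, free_big: int, free_little: int) -> str:
    """Choose a core type for a query: 'big', 'little', or 'wait'."""
    if predicted_long:
        # Queries predicted to be long-running prefer high-performance cores.
        if free_big > 0:
            return "big"
        return "little" if free_little > 0 else "wait"
    # Queries predicted to be short prefer energy-efficient cores, leaving
    # big cores available for the tail.
    if free_little > 0:
        return "little"
    return "big" if free_big > 0 else "wait"
```

For example, `pick_core(True, free_big=1, free_little=4)` returns `"big"`, while `pick_core(False, free_big=1, free_little=4)` returns `"little"`.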

We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, in both scenarios, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
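To make the roles of recall and precision concrete: recall measures how many truly long-running queries the predictor selects for acceleration (missed ones stay in the tail), and precision measures how many accelerated queries were actually long-running (false positives spend resources on short queries, which costs throughput). The sketch below computes both for a long-query classifier; the function name, the threshold, and the sample data in the usage note are illustrative assumptions, not results from the article.

```python
from typing import Sequence, Tuple


def precision_recall(
    true_ms: Sequence[float],
    predicted_long: Sequence[bool],
    long_threshold_ms: float,
) -> Tuple[float, float]:
    """Precision and recall of 'long-running' predictions.

    A query counts as truly long if its measured time is at least
    long_threshold_ms (a hypothetical cutoff chosen by the caller).
    """
    tp = fp = fn = 0
    for measured, predicted in zip(true_ms, predicted_long):
        actually_long = measured >= long_threshold_ms
        if predicted and actually_long:
            tp += 1  # long query correctly accelerated
        elif predicted and not actually_long:
            fp += 1  # short query accelerated unnecessarily
        elif actually_long:
            fn += 1  # long query missed; it remains in the tail
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

With made-up measurements `[5, 120, 90, 10, 200]`, predictions `[False, True, False, True, True]`, and a cutoff of 80 ms, two of three accelerated queries are truly long and two of three long queries are caught, so both precision and recall are 2/3.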

    • Published in

      ACM Transactions on the Web, Volume 10, Issue 3
      August 2016
      201 pages
      ISSN: 1559-1131
      EISSN: 1559-114X
      DOI: 10.1145/2988335

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 16 August 2016
      • Revised: 1 May 2016
      • Accepted: 1 May 2016
      • Received: 1 September 2015

      Qualifiers

      • research-article
      • Research
      • Refereed
