Focusing on the Long-term: It's Good for Users and Business

ABSTRACT
Over the past 10+ years, online companies large and small have widely adopted A/B testing as a robust, data-driven method for evaluating potential product improvements. In online experimentation, it is straightforward to measure the short-term effect, i.e., the impact observed during the experiment. However, the short-term effect is not always predictive of the long-term effect, i.e., the final impact once the product has fully launched and users have changed their behavior in response. The challenge, then, is to determine the long-term user impact while still being able to make decisions in a timely manner.
We tackle this challenge by first developing an experiment methodology for quantifying long-term user learning. We then apply this methodology to ads shown on Google search, specifically to determine and quantify the drivers of ads blindness and sightedness: the phenomenon of users changing their inherent propensity to click on or interact with ads.
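To make the measurement idea concrete, the sketch below shows one way such a user-learning analysis could be set up: a post-period comparison in which, after the experiment ends, both arms receive the identical experience, so any residual behavioral gap reflects learning rather than the treatment itself. This is a minimal sketch under assumed conditions; the design, inputs, and function here are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch of a post-period analysis: once the treatment is turned off for
# everyone, a persistent difference in per-user click-through rate (CTR)
# between the former treatment and control arms suggests users' propensity
# to click has shifted (ads blindness/sightedness).
import numpy as np
from scipy import stats

def learned_effect(control_ctr: np.ndarray, treatment_ctr: np.ndarray):
    """Estimate the user-learning effect from post-period per-user CTRs.

    Both arrays hold per-user CTRs measured *after* the experiment ended,
    when both arms saw the same experience. Returns the relative CTR
    difference and a Welch t-test p-value for that difference.
    """
    relative_diff = (treatment_ctr.mean() - control_ctr.mean()) / control_ctr.mean()
    _, p_value = stats.ttest_ind(treatment_ctr, control_ctr, equal_var=False)
    return relative_diff, p_value
```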
We use these results to create a model that uses metrics measurable in the short-term to predict the long-term. We learn that user satisfaction is paramount: ads blindness and sightedness are driven by the quality of previously viewed or clicked ads, as measured by both ad relevance and landing page quality. Focusing on user satisfaction not only ensures happier users but also makes business sense, as our results illustrate. We describe two major applications of our findings: a conceptual change to our search ads auction that further increased the importance of ads quality, and a 50% reduction of the ad load on Google's mobile search interface.
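The prediction idea can be illustrated with a simple supervised model trained on past experiments whose long-term effects were eventually measured directly. The feature names, placeholder numbers, and use of scikit-learn below are invented for illustration and are not the paper's actual model; the point is only that quality signals, not just the immediate click delta, can carry predictive weight for the learned effect.

```python
# Illustrative sketch: predict the long-term metric change of a new experiment
# from its short-term signals, fit on past experiments whose long-term effect
# was later measured. All numbers are invented placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

# Rows: one past experiment each.
# Columns: [short-term CTR delta, ad relevance delta, landing-page quality delta]
X_train = np.array([
    [ 0.020,  0.010,  0.005],
    [-0.010, -0.020, -0.010],
    [ 0.005,  0.015,  0.020],
    [ 0.030, -0.005, -0.015],
])
# Long-term CTR deltas measured for those same experiments.
y_train = np.array([0.012, -0.025, 0.018, 0.010])

model = LinearRegression().fit(X_train, y_train)

# Predict the long-term effect of a new experiment from its short-term metrics.
new_experiment = np.array([[0.015, 0.008, 0.004]])
print(model.predict(new_experiment))
```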
The results presented in this paper are generalizable in two major ways. First, the methodology may be used to quantify user learning effects and to evaluate online experiments in contexts other than ads. Second, the ads blindness/sightedness results indicate that a focus on user satisfaction could help to reduce the ad load on the internet at large with long-term neutral, or even positive, business impact.




