research-article

Search and Breast Cancer: On Episodic Shifts of Attention over Life Histories of an Illness

Published: 29 April 2016

Abstract

We seek to understand the evolving needs of people who are faced with a life-changing medical diagnosis, based on analyses of queries extracted from an anonymized search query log. Focusing on breast cancer, we manually tag a set of Web searchers as showing patterns of search behavior consistent with someone grappling with the screening, diagnosis, and treatment of breast cancer. We build and apply probabilistic classifiers to detect these searchers from multiple sessions and to identify the timing of diagnosis using temporal and statistical features. We explore the changes in information seeking over time before and after an inferred diagnosis of breast cancer by aligning multiple searchers by the estimated time of diagnosis. We employ the classifier to automatically identify 1,700 candidate searchers with an estimated 90% precision, and we predict the day of diagnosis within 15 days with 88% accuracy. We show that the geographic and demographic attributes of searchers identified with high probability are strongly correlated with the ground truth of reported incidence rates. We then analyze the content of queries over time for inferred cancer patients, using a detailed ontology of cancer-related search terms. The analysis reveals the rich temporal structure of the evolving queries of people likely diagnosed with breast cancer. Finally, we focus on subtypes of illness based on inferred stages of cancer and show clinically relevant dynamics of information seeking based on the dominant stage expressed by searchers.
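
The alignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(searcher, day, topic)` record layout, the weekly binning, and the toy data are all assumptions made for illustration. It re-expresses each query as a day offset from the searcher's inferred diagnosis day, counts topics per week relative to diagnosis, and scores a predicted diagnosis day as correct when it falls within a tolerance (15 days, as in the abstract) of a reference day.

```python
from collections import Counter, defaultdict
from datetime import date


def align_by_diagnosis(queries, diagnosis_day):
    """Re-express each query as (day offset from inferred diagnosis, topic).

    `queries` holds (searcher, day, topic) tuples; `diagnosis_day` maps a
    searcher to an inferred diagnosis date. Both layouts are hypothetical.
    """
    aligned = []
    for searcher, day, topic in queries:
        if searcher not in diagnosis_day:
            continue  # searcher has no inferred diagnosis, so cannot be aligned
        aligned.append(((day - diagnosis_day[searcher]).days, topic))
    return aligned


def topic_counts_by_week(aligned):
    """Count topics per week relative to diagnosis (week 0 starts on day 0)."""
    counts = defaultdict(Counter)
    for offset, topic in aligned:
        counts[offset // 7][topic] += 1  # floor division keeps negative weeks distinct
    return counts


def accuracy_within(predicted, actual, tolerance_days=15):
    """Fraction of searchers whose predicted day is within the tolerance."""
    hits = sum(
        1
        for searcher, true_day in actual.items()
        if searcher in predicted
        and abs((predicted[searcher] - true_day).days) <= tolerance_days
    )
    return hits / len(actual)


# Toy data, invented purely for illustration.
queries = [
    ("a", date(2014, 3, 1), "screening"),
    ("a", date(2014, 3, 12), "treatment"),
    ("b", date(2014, 6, 5), "screening"),
    ("b", date(2014, 6, 20), "treatment"),
    ("c", date(2014, 1, 1), "other"),
]
inferred = {"a": date(2014, 3, 10), "b": date(2014, 6, 10)}
reference = {"a": date(2014, 3, 20), "b": date(2014, 7, 10)}

aligned = align_by_diagnosis(queries, inferred)
weekly = topic_counts_by_week(aligned)
score = accuracy_within(inferred, reference)
```

Once every searcher is on a common relative timeline, topic counts from different people can be pooled per week, which is what makes before-and-after comparisons across searchers possible.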

Published in

ACM Transactions on the Web, Volume 10, Issue 2
May 2016
214 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/2932204

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 29 April 2016
• Accepted: 1 February 2016
• Revised: 1 January 2016
• Received: 1 September 2015

Qualifiers

• research-article
• Research
• Refereed
