
A Study of Web Print: What People Print in the Digital Era

Published: 25 July 2017

Abstract

This article analyzes a proprietary log of printed web pages and aims to answer questions about the content people print (what), the reasons they print (why), and the attributes of their print profiles (who). We present a classification of printed pages based on print intent and describe our methodology for processing the print dataset used in this study. We study the web sites, topics, and print intent of the printed pages along the following aspects: popularity, trends, activity, user diversity, and consistency. We report several findings about printing behavior, analyze them, and discuss their impact and directions for future work.

Published in

ACM Transactions on the Web, Volume 11, Issue 4
November 2017
257 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/3127338

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 25 July 2017
• Revised: 1 March 2017
• Accepted: 1 March 2017
• Received: 1 May 2016

Qualifiers

• research-article
• Research
• Refereed