Abstract
With the explosive growth of web content and usage, users increasingly rely on web search engines to carry out complex search tasks. Conventional search engines, however, treat each query as corresponding to a single, simple search task. To accomplish a complex search task, which consists of multiple subtask search goals, users usually have to issue a series of queries. For example, the complex search task “travel to Dubai” may involve several subtask search goals, including reserving a hotel room, surveying Dubai landmarks, and booking flights. A user could therefore accomplish a complex search task more efficiently if the search engine could predict the task together with its subtask search goals. In this work, we propose a complex search task model (CSTM) to address this problem. The CSTM first groups queries into complex search task clusters and then generates subtask search goals from each cluster. To improve the performance of the CSTM, we exploit four web resources: community question answering, query logs, search engine result pages, and clicked pages. Experimental results show that our CSTM is effective in identifying the comprehensive subtask search goals of a complex search task.
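The two-stage idea described above (group queries into task clusters, then derive subtask goals from each cluster) can be illustrated with a toy sketch. This is not the CSTM itself: the token-overlap grouping, the threshold value, and the function names `group_queries` and `subtask_goals` are assumptions made only for illustration, and none of the four web resources are used.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_queries(queries, threshold=0.2):
    """Stage 1 (toy stand-in): greedily group queries that share tokens.

    A query joins the first cluster containing a member whose token
    overlap reaches the (assumed) threshold; otherwise it starts a
    new cluster.
    """
    clusters = []
    for query in queries:
        tokens = set(query.lower().split())
        for cluster in clusters:
            if any(
                jaccard(tokens, set(member.lower().split())) >= threshold
                for member in cluster
            ):
                cluster.append(query)
                break
        else:
            clusters.append([query])
    return clusters


def subtask_goals(cluster):
    """Stage 2 (toy stand-in): terms that are not shared by every query.

    Tokens common to all queries are treated as the task itself
    (e.g. "dubai"); the remaining terms are candidate subtask goals.
    """
    token_sets = [set(query.lower().split()) for query in cluster]
    shared = set.intersection(*token_sets)
    counts = Counter(t for tokens in token_sets for t in tokens - shared)
    return [term for term, _ in counts.most_common()]


queries = ["dubai hotel", "dubai flights", "dubai landmarks", "python tutorial"]
clusters = group_queries(queries)
goals = subtask_goals(clusters[0])
```

With the sample queries, the three Dubai queries fall into one cluster and the remaining terms (hotel, flights, landmarks) stand in for subtask goals. A real system would replace both stages with the clustering and goal-generation methods developed in the paper.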