ABSTRACT
In general, search engines fail to understand a user's temporal intent when it is expressed as an implicit temporal query. This leads to the retrieval of less relevant information and prevents users from becoming aware of the possible temporal dimension of the query results. In this paper, we aim to develop a language-independent model that tackles the temporal dimensions of a query and identifies its most relevant time periods. For this purpose, we propose a temporal similarity measure capable of associating one or more relevant dates with a given query and filtering out irrelevant ones. Our approach exploits temporal information from web content, specifically the set of top-k web snippets returned in response to a query. We focus on extracting years, a kind of temporal information that often appears in this type of collection. We evaluate our methodology on a set of real-world temporal text queries that are clear concepts (i.e., queries that are unambiguous in concept and temporal in purpose). Experiments show that, compared to baseline methods, the new temporal similarity measure improves the identification of the most relevant dates for a given implicit temporal query.
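To make the idea of extracting candidate years from the top-k snippets concrete, the sketch below shows a simple frequency baseline. It is not the temporal similarity measure proposed in the paper, which the abstract does not specify. The function names, the year pattern (1800 to 2099), and the `min_share` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: candidate dates are four-digit years between 1800 and 2099.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def extract_years(snippet):
    """Return the set of distinct candidate years mentioned in one snippet."""
    return {int(year) for year in YEAR_PATTERN.findall(snippet)}


def rank_candidate_years(snippets, min_share=0.2):
    """Rank years by the share of snippets that mention them.

    Hypothetical baseline: a year is kept only if it appears in at least
    `min_share` of the snippets. This is a plain frequency filter, not the
    paper's similarity measure.
    """
    total = len(snippets)
    if total == 0:
        return []
    counts = Counter()
    for snippet in snippets:
        # Each snippet contributes at most once per year.
        counts.update(extract_years(snippet))
    scored = [
        (year, count / total)
        for year, count in counts.items()
        if count / total >= min_share
    ]
    # Highest share first; ties broken by ascending year.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Invented snippets for an implicit temporal query such as "haiti earthquake".
    example_snippets = [
        "The 2010 Haiti earthquake struck in January 2010.",
        "Haiti earthquake of 2010: aid effort",
        "Founded in 1804, Haiti is a Caribbean country.",
        "Haiti relief news",
        "Earthquake in Haiti, 2010",
    ]
    print(rank_candidate_years(example_snippets, min_share=0.3))
```

A frequency filter like this cannot tell a year that is merely frequent from one that is actually related to the query, which is the kind of gap a dedicated similarity measure is meant to close.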
Enriching temporal query understanding through date identification


Ricardo Campos
Alípio Mário Jorge

