The ACM CIKM 2009 Workshop on Web Information and Data Management (WIDM 2009) is the eleventh in a series of workshops on Web Information and Data Management held in conjunction with the International Conference on Information and Knowledge Management (CIKM). The objective of the workshop is to bring together researchers, industrial practitioners and developers to study how Web information can be extracted, stored, analyzed, and processed to provide useful information to the end users for various advanced database applications. We hope that these proceedings will serve as a valuable reference for all experts in the field.
In response to the call for papers, we received 41 papers from 18 countries: Australia, Brazil, Canada, China, the Czech Republic, Finland, Greece, India, Indonesia, Iran, Italy, Japan, Korea, Malaysia, the Netherlands, Norway, the United States, and Vietnam.
Since 2005, the workshop has followed a one-day schedule. This year, we adopted a double-blind review process, and all papers were reviewed thoroughly by the program committee and external reviewers. The program committee accepted seven full papers and nine short papers, resulting in a competitive 39% acceptance rate (16 of 41 submissions). The 16 accepted papers have been divided into four sessions: "Querying, Question Answering and Searching", "Web Algorithms", "Web Information Mining and Extraction Techniques", and "Searching, Matching and Browsing". In addition, Professor Dik Lun Lee from the Hong Kong University of Science and Technology will present a keynote talk this year.
Proceeding Downloads
User profiling and personalized information delivery on the static and mobile web
Imagine a system that can push highly selective information right to our hands when and only when we need it. This requires a mind-reading machine, but unfortunately we don't have one yet. User profiling attempts to estimate what is most important ...
Retrieving good, better, and best answers to questions in advertisements
Question-Answering (QA) service is a growing area of research study, and commercial QA systems have recently been developed. We are motivated to provide complementary QA service that answers questions in advertisements (ads). These days with almost all ...
Impact of search results on user queries
In this paper, we experimentally study how web searchers select the keywords to describe their information needs and specifically we investigate whether query keyword selections are influenced by the results the users reviewed for a previous search. For ...
Distinct nearest neighbors queries for similarity search in very large multimedia databases
As the volume of multimedia data available on the Internet is increasing tremendously, content-based similarity search is becoming a popular approach to multimedia retrieval. The most popular retrieval concept is the k nearest neighbor (kNN) search. For a ...
Satisfiability of simple XPath fragments in the presence of DTDs
For an XPath expression q and a DTD D, q is satisfiable under D if there exists an XML document t such that t is valid against D and that the answer of q on t is nonempty. Evaluating an unsatisfiable XPath expression is meaningless, since such an ...
A session generalization technique for improved web usage mining
Generalization of web sessions is an effective approach used to overcome two major challenges in web usage mining, namely quality and scalability. Given a concept hierarchy, such as a website, generalization replaces actual page-clicks with their ...
Automatic seed set expansion for trust propagation based anti-spamming algorithms
Seed sets are of significant importance for trust propagation based anti-spamming algorithms, e.g., TrustRank. Conventional approaches require manual evaluation to construct a seed set, which restricts the seed set to be small in size, since it would ...
Novel web page classification techniques in contextual advertising
Contextual advertising seeks to place relevant ads to generic web pages based on their contents. Recently, it has been observed that classifying web pages into a well-organized taxonomy of topics is promising for matching topically relevant ads to web ...
Efficient approach for incremental Vietnamese document clustering
In this paper, we present how to use a graph model for clustering Vietnamese documents incrementally. A graph-based model allows us to model completely the structure of not only each document but also the whole collection of documents. The graph structure is ...
Bursty topics extraction for web forums
Many bursty topics that are difficult to summarize and search exist in web forums. Most existing topic detection and tracking (TDT) methods deal with news stories, but the language used in web forums is much more casual, oral and informal compared with ...
Extracting position relations from the web
In this paper, we present a new algorithm to extract people's position in a corporation from the Web. People's position in a corporation, which the term position relation refers to, is a kind of significant competitive intelligence for enterprises. Our ...
Post processing wrapper generated tables for labeling anonymous datasets
A large number of wrappers generate tables without column names for human consumption because the meaning of the columns is apparent from the context and easy for humans to understand, but in emerging applications, labels are needed for autonomous ...
I seek you: searching and matching individuals in social networks
The first task any individual faces after joining an online social network (OSN) is locating friends that are present on that particular site. Most OSNs offer some variation of a tool that imports email contact lists to facilitate the task of finding ...
Web personal name disambiguation based on reference entity tables mined from the web
Ambiguous personal names are common on the Web, which poses a challenge for many different tasks. Traditional disambiguation employs clustering methods. However, without reference entity tables, the clustering method can only identify whether two ...
Finding intermediate entity between two examples on the web
We propose a method for finding an intermediate entity between two examples on the Web. For example, a user wants to find events that occurred between the Battle of Red Cliffs and the death of Cao Cao. In this situation, the user wants to find something ...
Semantic relatedness hits bibliographic data
In this paper we introduce a novel approach for the thematic organization of bibliographic records that builds upon a semantic relatedness measure we have implemented for this task. In particular, we introduce the Omiotis measure, which captures the ...
Investigation of children's characteristics for web browsing
Owing to the proliferation of the Internet, young children have started accessing the Web. However, most existing pages are oriented toward grown-ups. In particular, children are not good at browsing Web pages full of characters, such as news ...
Cited By
- Abdelhaq H, Sengstock C and Gertz M (2013). EvenTweet, Proceedings of the VLDB Endowment, 10.14778/2536274.2536307, 6:12, (1326-1329), Online publication date: 28-Aug-2013.
- Wang B (2011). Audience Intelligence in Online Advertising, Online Multimedia Advertising, 10.4018/978-1-60960-189-8.ch014, (262-277)