DOI: 10.1145/1651587
WIDM '09: Proceedings of the eleventh international workshop on Web information and data management
ACM 2009 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
CIKM '09: Conference on Information and Knowledge Management Hong Kong China 2 November 2009
ISBN:
978-1-60558-808-7
Published:
02 November 2009
Abstract

The ACM CIKM 2009 Workshop on Web Information and Data Management (WIDM 2009) is the eleventh in a series of workshops on Web Information and Data Management held in conjunction with the International Conference on Information and Knowledge Management (CIKM). The objective of the workshop is to bring together researchers, industrial practitioners and developers to study how Web information can be extracted, stored, analyzed, and processed to provide useful information to the end users for various advanced database applications. We hope that these proceedings will serve as a valuable reference for all experts in the field.

In response to the call for papers, we received 41 papers from 18 countries: Australia, Brazil, Canada, China, the Czech Republic, Finland, Greece, India, Indonesia, Iran, Italy, Japan, Korea, Malaysia, the Netherlands, Norway, the United States, and Vietnam.

Since 2005, the workshop has followed a one-day schedule. This year, we adopted a double-blind review process, and all papers were reviewed thoroughly by the program committee and external reviewers. The program committee accepted seven full papers and nine short papers, a competitive 39% acceptance rate (16 of 41 submissions). The 16 accepted papers have been divided into four sessions: "Querying, Question Answering and Searching", "Web Algorithms", "Web Information Mining and Extraction Techniques", and "Searching, Matching and Browsing". In addition, Professor Dik Lun Lee from the Hong Kong University of Science and Technology will present a keynote talk this year.

SESSION: Querying, question answering, & searching
keynote
User profiling and personalized information delivery on the static and mobile web

Imagine a system that can push highly selective information right to our hands when and only when we need it. This requires a mind-reading machine, but unfortunately we don't have one yet. User profiling attempts to estimate what is most important ...

short-paper
Retrieving good, better, and best answers to questions in advertisements

Question-Answering (QA) service is a growing area of research study, and commercial QA systems have recently been developed. We are motivated to provide complementary QA service that answers questions in advertisements (ads). These days with almost all ...

short-paper
Impact of search results on user queries

In this paper, we experimentally study how web searchers select the keywords to describe their information needs and specifically we investigate whether query keyword selections are influenced by the results the users reviewed for a previous search. For ...

short-paper
Distinct nearest neighbors queries for similarity search in very large multimedia databases

As the volume of multimedia data available on the Internet increases tremendously, content-based similarity search has become a popular approach to multimedia retrieval. The most popular retrieval concept is the k nearest neighbor (kNN) search. For a ...

SESSION: Web algorithms
research-article
Satisfiability of simple XPath fragments in the presence of DTDs

For an XPath expression q and a DTD D, q is satisfiable under D if there exists an XML document t such that t is valid against D and that the answer of q on t is nonempty. Evaluating an unsatisfiable XPath expression is meaningless, since such an ...

research-article
A session generalization technique for improved web usage mining

Generalization of web sessions is an effective approach used to overcome two major challenges in web usage mining, namely quality and scalability. Given a concept hierarchy, such as a website, generalization replaces actual page-clicks with their ...

research-article
Automatic seed set expansion for trust propagation based anti-spamming algorithms

Seed sets are of significant importance for trust propagation based anti-spamming algorithms, e.g., TrustRank. Conventional approaches require manual evaluation to construct a seed set, which restricts the seed set to be small in size, since it would ...

SESSION: Web Information mining & extraction techniques
research-article
Novel web page classification techniques in contextual advertising

Contextual advertising seeks to place relevant ads on generic web pages based on their contents. Recently, it has been observed that classifying web pages into a well-organized taxonomy of topics is promising for matching topically relevant ads to web ...

research-article
Efficient approach for incremental Vietnamese document clustering

In this paper, we present how to use a graph model to cluster Vietnamese documents incrementally. A graph-based model allows us to model completely the structure of not only each document but also the whole collection of documents. The graph structure is ...

short-paper
Bursty topics extraction for web forums

Many bursty topics that are difficult to summarize and search exist in web forums. Most existing topic detection and tracking (TDT) methods deal with news stories, but the language used in web forums is much more casual, oral, and informal compared with ...

short-paper
Extracting position relations from the web

In this paper, we present a new algorithm to extract people's position in a corporation from the Web. People's position in a corporation, which the term position relation refers to, is a kind of significant competitive intelligence for enterprises. Our ...

short-paper
Post processing wrapper generated tables for labeling anonymous datasets

A large number of wrappers generate tables without column names for human consumption because the meaning of the columns is apparent from the context and easy for humans to understand, but in emerging applications, labels are needed for autonomous ...

SESSION: Searching, matching, & browsing
research-article
I seek you: searching and matching individuals in social networks

The first task any individual faces after joining an online social network (OSN) is locating friends that are present on that particular site. Most OSNs offer some variation of a tool that imports email contact lists to facilitate the task of finding ...

research-article
Web personal name disambiguation based on reference entity tables mined from the web

Ambiguous personal names are common on the Web and pose a challenge for many different tasks. Traditional disambiguation employs clustering methods. However, without reference entity tables, the clustering method can only identify whether two ...

short-paper
Finding intermediate entity between two examples on the web

We propose a method for finding an intermediate entity between two examples on the Web. For example, a user wants to find events that occurred between the Battle of Red Cliffs and the death of Cao Cao. In this situation, the user wants to find something ...

short-paper
Semantic relatedness hits bibliographic data

In this paper we introduce a novel approach for the thematic organization of bibliographic records that builds upon a semantic relatedness measure we have implemented for this task. In particular, we introduce the Omiotis measure, which captures the ...

short-paper
Investigation of children's characteristics for web browsing

Owing to the spread of the Internet, young children have started accessing the Web. However, most existing pages are oriented toward grown-ups. In particular, children are not good at browsing Web pages that are full of characters, such as news ...

Contributors
  • National University of Singapore
  • Pennsylvania State University
