The papers in this volume were presented at the 18th International Workshop on the Web and Databases (WebDB 2015), held in Melbourne, Australia, on May 31, 2015. The workshop was co-located with ACM SIGMOD and attracted 31 submissions, of which 9 were selected for oral presentation and for publication in this volume. This corresponds to an acceptance rate of 29%, reaffirming WebDB's standing as a premier data and knowledge management workshop.
Addressing Instance Ambiguity in Web Harvesting
Web Harvesting enables the enrichment of incomplete data sets by retrieving required information from the Web. However, the ambiguity of instances may greatly decrease the quality of the harvested data, given that any instance in the local data set may ...
IBEX: Harvesting Entities from the Web Using Unique Identifiers
In this paper we study the prevalence of unique entity identifiers on the Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs (for documents), email addresses, and others. We show how these identifiers can be harvested ...
Person-Name Parsing for Linking User Web Profiles
Person-name parsing involves identifying the constituent parts of a person's name. Due to multiple writing styles ("John Smith" versus "Smith, John"), extra information ("John Smith, PhD", "Rev. John Smith"), and country-specific last-name ...
Truth Finding with Attribute Partitioning
Truth finding is the problem of determining which of the statements made by contradictory sources is correct, in the absence of prior information on the trustworthiness of the sources. A number of approaches to truth finding have been proposed, from ...
Long-term Optimization of Update Frequencies for Decaying Information
Many kinds of information, such as addresses, crawls of webpages, or academic affiliations, are prone to becoming outdated over time. Therefore, in some applications, updates are performed periodically in order to keep the correctness and usefulness of ...
Analyzing Crowd Rankings
Ranked data is ubiquitous in real-world applications, arising naturally when users express preferences about products and services, when voters cast ballots in elections, and when funding proposals are evaluated based on their merits or university ...
TriAL-QL: Distributed Processing of Navigational Queries
Navigational queries are among the most natural query patterns for RDF data, yet most existing RDF query languages, including SPARQL 1.1 and its derivatives, fail to cover all the varieties inherent to its triple-based model. As a consequence, the ...
FOREST: Focused Object Retrieval by Exploiting Significant Tag Paths
Content-intensive websites, e.g., of blogs or news, present pages that contain Web articles automatically generated by content management systems. Identification and extraction of their main content is critical in many applications, such as indexing or ...
Discovering Subsumption Relationships for Web-Based Ontologies
As search engines are becoming smarter at interpreting user queries and providing meaningful responses, they rely on ontologies to understand the meaning of entities. Creating ontologies manually is a laborious process, and resulting ontologies may not ...