DOI: 10.5555/2018936
CoNLL '11: Proceedings of the Fifteenth Conference on Computational Natural Language Learning
2011 Proceeding
Publisher:
  • Association for Computational Linguistics
  • N. Eighth Street, Stroudsburg, PA 18360
  • United States
Conference:
Portland, Oregon, USA, June 23-24, 2011
ISBN:
978-1-932432-92-3
Published:
23 June 2011

Abstract

The 2011 Conference on Computational Natural Language Learning is the fifteenth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL-2011 will be held in Portland, Oregon, USA, June 23-24, 2011, in conjunction with ACL-HLT.

research-article
Free
Modeling syntactic context improves morphological segmentation
pp 1–9

The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this ...

research-article
Free
The effect of automatic tokenization, vocalization, stemming, and POS tagging on Arabic dependency parsing
pp 10–18

We use an automatic pipeline of word tokenization, stemming, POS tagging, and vocalization to perform real-world Arabic dependency parsing. In spite of the high accuracy of the modules, the very few errors in tokenization, which reaches an accuracy of ...

research-article
Free
Punctuation: making a point in unsupervised dependency parsing
pp 19–28

We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include ...

research-article
Free
Modeling infant word segmentation
pp 29–38

While many computational models have been created to explore how children might learn to segment words, the focus has largely been on achieving higher levels of performance and exploring cues suggested by artificial learning experiments. We propose a ...

research-article
Free
Word segmentation as general chunking
pp 39–47

During language acquisition, children learn to segment speech into phonemes, syllables, morphemes, and words. We examine word segmentation specifically, and explore the possibility that children might have general-purpose chunking mechanisms to perform ...

research-article
Computational linguistics for studying language in people: principles, applications and research problems (invited talk)
p. 48

One of the goals of computational linguistics is to create automated systems that can learn, generate, and understand language at all levels of structure (semantics, syntax, morphology, phonology, phonetics). This is a very demanding task whose complete ...

research-article
Free
Search-based structured prediction applied to biomedical event extraction
pp 49–57

We develop an approach to biomedical event extraction using a search-based structured prediction framework, SEARN, which converts the task into cost-sensitive classification tasks whose models are learned jointly. We show that SEARN improves on a simple ...

research-article
Free
Using sequence kernels to identify opinion entities in Urdu
pp 58–67

Automatic extraction of opinion holders and targets (together referred to as opinion entities) is an important subtask of sentiment analysis. In this work, we attempt to accurately extract opinion entities from Urdu newswire. Due to the lack of ...

research-article
Free
Subword and spatiotemporal models for identifying actionable information in Haitian Kreyol
pp 68–77

Crisis-affected populations are often able to maintain digital communications, but in a sudden-onset crisis aid organizations will have the least free resources to process such communications. Information that aid agencies can actually act on, ...

research-article
Free
Gender attribution: tracing stylometric evidence beyond topic and genre
pp 78–86

Sociolinguistic theories (e.g., Lakoff (1973)) postulate that women's language styles differ from those of men. In this paper, we explore statistical techniques that can learn to identify the gender of authors in modern English text, such as web blogs ...

research-article
Free
Improving the impact of subjectivity word sense disambiguation on contextual opinion analysis
pp 87–96

Subjectivity word sense disambiguation (SWSD) is automatically determining which word instances in a corpus are being used with subjective senses, and which are being used with objective senses. SWSD has been shown to improve the performance of ...

research-article
Free
Effects of meaning-preserving corrections on language learning
pp 97–105

We present a computational model of language learning via a sequence of interactions between a teacher and a learner. Experiments learning limited sublanguages of 10 natural languages show that the learner achieves a high level of performance after a ...

research-article
Free
Assessing benefit from feature feedback in active learning for text classification
pp 106–114

Feature feedback is an alternative to instance labeling when seeking supervision from human experts. Combination of instance and feature feedback has been shown to reduce the total annotation cost for supervised learning. However, learning problems may ...

research-article
Free
ULISSE: an unsupervised algorithm for detecting reliable dependency parses
pp 115–124

In this paper we present ULISSE, an unsupervised linguistically-driven algorithm to select reliable parses from the output of a dependency parser. Different experiments were devised to show that the algorithm is robust enough to deal with the output of ...

research-article
Free
Language models as representations for weakly-supervised NLP tasks
pp 125–134

Finding the right representation for words is critical for building accurate NLP systems when domain-specific labeled data for the task is scarce. This paper investigates language model representations, in which language models trained on unlabeled ...

research-article
Free
Automatic keyphrase extraction by bridging vocabulary gap
pp 135–144

Keyphrase extraction aims to select a set of terms from a document as a short summary of the document. Most methods extract keyphrases according to their statistical properties in the given document. Appropriate keyphrases, however, are not always ...

research-article
Free
Using second-order vectors in a knowledge-based method for acronym disambiguation
pp 145–153

In this paper, we introduce a knowledge-based method to disambiguate biomedical acronyms using second-order co-occurrence vectors. We create these vectors using information about a long-form obtained from the Unified Medical Language System and Medline. ...

research-article
Free
Using the mutual k-nearest neighbor graphs for semi-supervised classification of natural language data
pp 154–162

The first step in graph-based semi-supervised classification is to construct a graph from input data. While the k-nearest neighbor graphs have been the de facto standard method of graph construction, this paper advocates using the less well-known mutual ...

research-article
Free
Automatically building training examples for entity extraction
pp 163–171

In this paper we present methods for automatically acquiring training examples for the task of entity extraction. Experimental evidence shows that: (1) our methods compete with a current heavily supervised state-of-the-art system, within 0.04 absolute ...

research-article
Free
Probabilistic word alignment under the L0-norm
pp 172–180

This paper makes two contributions to the area of single-word based word alignment for bilingual sentence pairs. Firstly, it integrates the -- seemingly rather different -- works of (Bodrumlu et al., 2009) and the standard probabilistic ones into a ...

research-article
Free
Authorship attribution with latent Dirichlet allocation
pp 181–189

The problem of authorship attribution -- attributing texts to their original authors -- has been an active research area since the end of the 19th century, attracting increased interest in the last decade. Most of the work on authorship attribution ...

research-article
Free
Evaluating a semantic network automatically constructed from lexical co-occurrence on a word sense disambiguation task
pp 190–199

We describe the extension and objective evaluation of a network of semantically related noun senses (or concepts) that has been automatically acquired by analyzing lexical co-occurrence in Wikipedia. The acquisition process makes no use of the metadata ...

research-article
Free
Filling the gap: semi-supervised learning for opinion detection across domains
pp 200–209

We investigate the use of Semi-Supervised Learning (SSL) in opinion detection both in sparse data situations and for domain adaptation. We show that co-training reaches the best results in an in-domain setting with small labeled data sets, with a ...

research-article
Free
A normalized-cut alignment model for mapping hierarchical semantic structures onto spoken documents
pp 210–218

We propose a normalized-cut model for the problem of aligning a known hierarchical browsing structure, e.g., electronic slides of lecture recordings, with the sequential transcripts of the corresponding spoken documents, with the aim to help index and ...

research-article
Bayesian tools for natural language learning
p. 219

In recent years Bayesian techniques have made good inroads in computational linguistics, due to their protection against overfitting and expressiveness of the Bayesian modeling language. However most Bayesian models proposed so far have used pretty ...

research-article
Free
Composing simple image descriptions using web-scale n-grams
pp 220–228

Studying natural language, and especially how people describe the world around them, can help us better understand the visual world. In turn, it can also help us in the quest to generate natural language that describes this world in a human manner. We ...

research-article
Free
Adapting text instead of the model: an open domain approach
pp 229–237

Natural language systems trained on labeled data from one domain do not perform well on other domains. Most adaptation algorithms proposed in the literature train a new model for the new domain using unlabeled data. However, it is time consuming to ...

research-article
Free
Learning with lookahead: can history-based models rival globally optimized models?
pp 238–246

This paper shows that the performance of history-based models can be significantly improved by performing lookahead in the state space when making each classification decision. Instead of simply using the best action output by the classifier, we ...

research-article
Free
Learning discriminative projections for text similarity measures
pp 247–256

Traditional text similarity measures consider each term similar only to itself and do not model semantic relatedness of terms. We propose a novel discriminative training method that projects the raw term vectors into a common, low-dimensional vector ...

