DOI: 10.5555/990820
COLING '00: Proceedings of the 18th conference on Computational linguistics - Volume 1
2000 Proceeding
Publisher:
  • Association for Computational Linguistics
  • N. Eighth Street, Stroudsburg, PA 18360
  • United States
Conference:
Saarbrücken, Germany, 31 July 2000 - 4 August 2000
ISBN:
978-1-55860-717-0
Published:
31 July 2000
Sponsors:
DFKI, Ministère de la Recherche Français, Deutsche Forschungsgemeinschaft, Loria, Centre Universitaire de Luxembourg, Universität des Saarlandes, Université Nancy 2, Ministerium für Bildung, Kultur und Wissenschaft des Saarlandes

Abstract

This is the 18th International Conference on Computational Linguistics and, for myself and a few others, the 18th occasion on which I experience a growing sense of excitement at the prospect of meeting old friends, making new ones, and learning about the new discoveries and inventions that my colleagues have made in the last two years.

The late Hans Karlgren invented the name "Coling" as an obvious contraction of "computational linguistics", and also in memory of the vagrant hero of a well-known Swedish comic strip who went by that name. The term "computational linguistics" itself was coined only a few years earlier by the late David Hays to refer to a field of endeavor whose creation had been recommended by the Automatic Language Processing Advisory Board (ALPAC) to provide a more solid theoretical foundation for work on machine translation. During the five or so years that intervened, the International Committee on Computational Linguistics stumbled into existence and organised its first conference in New York in 1965.

During the 35 years since that first meeting, our field has become broader and deeper, and our students and colleagues can now command good salaries throughout much of the world. For evidence of the vigor of the field, you have only to look around you. The day before the opening of this meeting, a ceremony in Saarbrücken will mark the termination of the 8-year, 89-million-dollar Verbmobil project on speech-to-speech translation that involved, at one time or another, some 900 people. It will therefore be no surprise that Germany is one of the three countries to whom we owe such a debt of gratitude for the organisation of the meeting.

The programme committee decided to make some modest innovations of its own this year, determining for the first time to conduct its business entirely through the Internet. Electronic submission of papers was strongly encouraged, against the advice of several experienced and cautious people who predicted chaos for the conference and a mental breakdown for me. There were, of course, problems. But things went more smoothly than anyone expected and we will know how to do things better next time. No submission was sent for review as hard copy, the handful that arrived on paper being scanned to make the digital version that reviewers then accessed through the world-wide web.

323 regular papers and about 100 project notes and demonstration proposals were received from 22 countries. Each was read by three members of one of the 11 panels of reviewers, totalling 222 people. My gratitude to those dedicated and able people knows no bounds. 110 regular papers, 24 project notes, and 10 demonstration proposals were accepted, or about a third of the submissions. As one who has read many of the papers, I can assure you that the quality of the presentations is likely to be even higher than this good ratio suggests. So I hope you share some of my excitement at what will be either the last Coling of the old millennium or the first of the new one, but surely one of a series that has only just begun.

Article
Free
A word-grammar based morphological analyzer for agglutinative languages

Agglutinative languages present rich morphology and for some applications they need deep analysis at word level. The work presented here proposes a model for designing a full morphological analyzer. The model integrates the two-level formalism and a ...

Article
Free
Learning word clusters from data types

The paper illustrates a linguistic knowledge acquisition model making use of data types, infinite memory, and an inferential mechanism for inducing new information from known data. The model is compared with standard stochastic methods applied to data ...

Article
Free
Selectional restrictions in HPSG

Selectional restrictions are semantic sortal constraints imposed on the participants of linguistic constructions to capture contextually-dependent constraints on interpretation. Despite their limitations, selectional restrictions have proven very useful ...

Article
Free
Extended models and tools for high-performance part-of-speech tagger

Statistical part-of-speech (POS) taggers achieve high accuracy and robustness when based on large scale manually tagged corpora. However, enhancements of the learning models are necessary to achieve better performance. We are developing a learning tool ...

Article
Free
An ontology of systematic relations for a shared grammar of Slavic

Sharing portions of grammars across languages greatly reduces the costs of multilingual grammar engineering. Related languages share a much wider range of linguistic information than typically assumed in standard multilingual grammar architectures. ...

Article
Free
The effects of word order and segmentation on translation retrieval performance

This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and word order-sensitive similarity metrics, ...

Article
Free
Exploiting a probabilistic hierarchical model for generation

Previous stochastic approaches to generation do not include a tree-based representation of syntax. While this may be adequate or even advantageous for some applications, other applications profit from using as much syntactic knowledge as is available, ...

Article
Free
Incremental identification of inflectional types

We present an approach to the incremental accrual of lexical information for unknown words that is constraint-based and compatible with standard unification-based grammars. Although the techniques are language-independent and can be applied to all kinds ...

Article
Free
Combination of n-grams and Stochastic Context-Free Grammars for language modeling

This paper describes a hybrid proposal to combine n-grams and Stochastic Context-Free Grammars (SCFGs) for language modeling. A classical n-gram model is used to capture the local relations between words, while a stochastic grammatical model is ...

Article
Free
An empirical evaluation of LFG-DOP

This paper presents an empirical assessment of the LFG-DOP model introduced by Bod & Kaplan (1998). The parser we describe uses fragments from LFG-annotated sentences to parse new sentences and Monte Carlo techniques to compute the most probable parse. ...

Article
Free
Parsing with the shortest derivation

Common wisdom has it that the bias of stochastic grammars in favor of shorter derivations of a sentence is harmful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead of context-free ...

Article
Free
The effects of analysing cohesion on document summarisation

We argue that in general, the analysis of lexical cohesion factors in a document can drive a summarizer, as well as enable other content characterization tasks. More narrowly, this paper focuses on how one particular cohesion factor--simple lexical ...

Article
Free
Creating a Universal Networking Language module within an advanced NLP system

A multifunctional NLP environment, ETAP-3, is presented. The environment has several NLP applications, including a machine translation system, a natural language interface to SQL type databases, synonymous paraphrasing of sentences, syntactic error ...

Article
Free
Reusing an ontology to generate numeral classifiers

In this paper, we present a solution to the problem of generating Japanese numeral classifiers using semantic classes from an ontology. Most nouns must take a numeral classifier when they are quantified in languages such as Chinese, Japanese, Korean, ...

Article
Free
You'll take the high road and I'll take the low road: using a third language to improve bilingual word alignment

While language-independent sentence alignment programs typically achieve a recall in the 90 percent range, the same cannot be said about word alignment systems, where normal recall figures tend to fall somewhere between 20 and 40 percent, in the ...

Article
Free
Binding constraints as instructions of binding machines

Binding constraints have resisted full integration into the course of grammatical processing despite their practical relevance and cross-linguistic generality. The ultimate root of this is to be found in the exponential "overgenerate & filter" ...

Article
Free
Probabilistic parsing and psychological plausibility

Given the recent evidence for probabilistic mechanisms in models of human ambiguity resolution, this paper investigates the plausibility of exploiting current wide-coverage, probabilistic parsing techniques to model human linguistic performance. In ...

Article
Free
The use of instrumentation in grammar engineering

This paper explores the usefulness of a technique from software engineering, code instrumentation, for the development of large-scale natural language grammars. Information about the usage of grammar rules in test and corpus sentences is used to improve ...

Article
Free
Automated generalization of translation examples

Previous work has shown that adding generalization of the examples in the corpus of an example-based machine translation (EBMT) system can reduce the required amount of pretranslated example text by as much as an order of magnitude for Spanish-English ...

Article
Free
A client/server architecture for word sense disambiguation

This paper presents a robust client/server implementation of a word sense disambiguator for English. This system associates a word with its meaning in a given context using dictionaries as tagged corpora in order to extract semantic disambiguation ...

Article
Free
Tagging of very large corpora: topic-focus articulation

After a brief characterization of the theory of the topic-focus articulation of the sentence (TFA), rules are formulated that determine the assignment of appropriate values of the TFA attribute in the process of syntactico-semantic tagging of a very ...

Article
Free
Exogeneous and endogeneous approaches to semantic categorization of unknown technical terms

Acquiring and updating terminological resources are difficult and tedious tasks, especially when semantic information should be provided. This paper deals with Term Semantic Categorization. The goal of this process is to assign semantic categories to ...

Article
Free
Word sense disambiguation of adjectives using probabilistic networks

In this paper, word sense disambiguation (WSD) accuracy achievable by a probabilistic classifier, using very minimal training sets, is investigated. We made the assumption that there are no tagged corpora available and identified what information, ...

Article
Free
A multilingual news summarizer

A huge number of multilingual news articles are reported and disseminated on the Internet. How to extract the key information and save reading time is a crucial issue. This paper proposes an architecture for a multilingual news summarizer, including monolingual and ...

Article
Free
Mining tables from large scale HTML texts

Table is a very common presentation scheme, but few papers touch on table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts. Table filtering, recognition, interpretation, and presentation are discussed. ...

Article
Free
Automatic semantic classification for Chinese unknown compound nouns

The paper describes a similarity-based model to present the morphological rules for Chinese compound nouns. This representation model serves 1) as the morphological rules of the compounds, 2) as a means to evaluate the properness of a ...

Article
Free
Empirical estimates of adaptation: the chance of two Noriegas is closer to p/2 than p²

Repetition is very common. Adaptive language models, which allow probabilities to change or adapt after seeing just a few words of a text, were introduced in speech recognition to account for text cohesion. Suppose a document mentions Noriega once. What ...

Article
Free
Explaining away ambiguity: learning verb selectional preference with Bayesian networks

This paper presents a Bayesian model for unsupervised learning of verb selectional preferences. For each verb the model creates a Bayesian network whose architecture is determined by the lexical hierarchy of WordNet and whose parameters are estimated ...

Article
Free
A class-based probabilistic approach to structural disambiguation

Knowledge of which words are able to fill particular argument slots of a predicate can be used for structural disambiguation. This paper describes a proposal for acquiring such knowledge, and in line with much of the recent work in this area, a ...

Article
Free
Extracting the names of genes and gene products with a hidden Markov model

We report the results of a study into the use of a linear interpolating hidden Markov model (HMM) for the task of extracting technical terminology from MEDLINE abstracts and texts in the molecular-biology domain. This is the first stage in a system that ...

Contributors
  • Palo Alto Research Center Incorporated

Acceptance Rates

Overall Acceptance Rate: 1,537 of 1,537 submissions, 100%

Year            Submitted  Accepted  Rate
COLING-ACL '06  126        126       100%
COLING '04      1,411      1,411     100%
Overall         1,537      1,537     100%