DOI: 10.3115/991250.991322

Document classification by machine: theory and practice

Published: 05 August 1994

ABSTRACT

In this note, we present results concerning the theory and practice of determining for a given document which of several categories it best fits. We describe a mathematical model of classification schemes and the one scheme which can be proved optimal among all those based on word frequencies. Finally, we report the results of an experiment which illustrates the efficacy of this classification method.
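As a rough illustration of what classification "based on word frequencies" can look like, the sketch below assigns a document to the category whose word distribution gives it the highest likelihood. The abstract does not specify the scheme it proves optimal, so this multinomial likelihood model, the add-one smoothing, and the names `train`, `classify`, and `alpha` are all assumptions made for the example, not the authors' method.

```python
import math
from collections import Counter


def train(labeled_docs, alpha=1.0):
    """Estimate smoothed per-category word probabilities.

    labeled_docs: iterable of (category, text) pairs (hypothetical input format).
    alpha: additive smoothing constant (an assumption, not from the paper).
    """
    counts = {}
    vocab = set()
    for category, text in labeled_docs:
        words = text.lower().split()
        counts.setdefault(category, Counter()).update(words)
        vocab.update(words)

    model = {}
    for category, counter in counts.items():
        # Denominator includes the smoothing mass for every vocabulary word.
        total = sum(counter.values()) + alpha * len(vocab)
        model[category] = {w: (counter[w] + alpha) / total for w in vocab}
    return model


def classify(model, text):
    """Return the category that maximizes the log-likelihood of the text."""
    words = text.lower().split()

    def log_likelihood(category):
        probs = model[category]
        # Words never seen in training are skipped for every category alike.
        return sum(math.log(probs[w]) for w in words if w in probs)

    return max(model, key=log_likelihood)


if __name__ == "__main__":
    training = [
        ("sports", "the team won the game"),
        ("finance", "the bank raised the rate"),
    ]
    model = train(training)
    print(classify(model, "the game"))
    print(classify(model, "bank rate"))
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long documents; the choice of category is the same either way.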


Published in: COLING '94: Proceedings of the 15th conference on Computational linguistics - Volume 2, August 1994, 661 pages

Publisher: Association for Computational Linguistics, United States


Overall acceptance rate: 1,537 of 1,537 submissions (100%)
