 Klaus U Schulz

Bibliometrics: publication history
Average citations per article: 6.18
Citation Count: 377
Publication count: 61
Publication years: 1991-2017
Available for download: 18
Average downloads per article: 382.06
Downloads (cumulative): 6,877
Downloads (12 Months): 215
Downloads (6 Weeks): 27

Result 1 – 20 of 61

1 published by ACM
June 2017 DATeCH2017: Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 2,   Downloads (12 Months): 11,   Downloads (Overall): 11

Full text available: PDF
In the absence of ground truth it is not possible to automatically determine the exact spectrum and occurrences of OCR errors in an OCR'ed text. Yet, for interactive postcorrection of OCR'ed historical printings it is extremely useful to have a statistical profile available that provides an estimate of error classes ...
Keywords: Postcorrection, German language, historical OCR

2 published by ACM
May 2014 DATeCH '14: Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 3,   Downloads (12 Months): 37,   Downloads (Overall): 97

Full text available: PDF
When applied to historical texts, OCR engines often produce a non-negligible number of OCR errors. For research in the Humanities, text mining and retrieval, it is important to be able to improve the quality of OCRed historical texts through interactive postcorrection. We describe a system for interactive postcorrection of OCRed historical documents ...
Keywords: decision support, user interfaces, error correction

3 published by ACM
May 2014 DATeCH '14: Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 0,   Downloads (12 Months): 5,   Downloads (Overall): 48

Full text available: PDF
During the last decade, a huge number of OCRed historical texts have been made available on the Internet. For most of these documents, metadata that assign topic categories from library classification systems to texts are missing. Data of this form would offer much better access to these collections. ...
Keywords: topic and subject classification, automated topic detection

4
April 2014
Bibliometrics:
Citation Count: 0

- Donation refusal is high in all regions of Argentina.
- The deficient operative structure is a negative factor that allows inadequate donor maintenance and organ procurement.
- In more developed regions, a high number of organs are not utilized. This is true for heart, liver ...

5 published by ACM
December 2013 IIWAS '13: Proceedings of International Conference on Information Integration and Web-based Applications & Services
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 3,   Downloads (12 Months): 33,   Downloads (Overall): 106

Full text available: PDF
In this paper, we give an overview of our ongoing research method for plagiarism detection in Indonesian texts. This method should address the problems of handling the equivalence class of Indonesian tokens, selecting the targeted source documents, and minimizing the gap of similarity measurement between ...
Keywords: external plagiarism detection, shingle, suspicious document
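
The "shingle" keyword refers to comparing documents through overlapping word windows. As a rough illustration only, the sketch below builds word shingles and compares two texts by Jaccard resemblance. The function names, the window size `w=3`, and the choice of Jaccard are assumptions made for this example; the abstract does not say which measure or tokenization the authors use.

```python
def shingles(text, w=3):
    """Return the set of w-word shingles (contiguous word windows) of a text."""
    tokens = text.lower().split()
    if len(tokens) < w:
        # Texts shorter than one window yield a single short shingle (or none).
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def jaccard(a, b):
    """Jaccard resemblance of two shingle sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical usage: score a suspicious passage against a candidate source.
suspicious = "the quick brown fox jumps"
source = "the quick brown fox sleeps"
score = jaccard(shingles(suspicious), shingles(source))
```

A higher score means more shared word windows. A real system for Indonesian text would presumably normalize tokens before shingling, which this sketch does not attempt.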

6 published by ACM
March 2013 EDBT '13: Proceedings of the Joint EDBT/ICDT 2013 Workshops
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 0,   Downloads (12 Months): 6,   Downloads (Overall): 61

Full text available: PDF
In this paper we present the WallBreaker system for similarity search as used in the String Similarity Search/Join Competition, 2013, organized by the Humboldt University of Berlin [1]. We consider the problem of how to efficiently find for a given string P (pattern) all words W in a lexicon such ...

7 published by ACM
September 2011 MOCR_AND '11: Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 5,   Downloads (12 Months): 18,   Downloads (Overall): 216

Full text available: PDF
Erroneous tokens in the output of an OCR engine can be roughly divided into two categories. For less serious OCR errors, human readers (and in many cases also text correction systems) are typically able to reconstruct the correct original word, or to suggest a small set of plausible corrections. ...

8
June 2011 CiE'11: Proceedings of the 7th conference on Models of computation in context: computability in Europe
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 0

We present a number of applications in Natural Language Processing where the main computation consists of a similarity search for an input pattern in a large database. Afterwards we describe some efficient methods and algorithms for solving this computational challenge. We discuss the view of the similarity search as a ...

9
June 2011 International Journal on Document Analysis and Recognition - Special issue on noisy text analytics: Volume 14 Issue 2, June 2011
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 8

Due to the large number of spelling variants found in historical texts, standard methods of Information Retrieval (IR) fail to produce satisfactory results on historical document collections. In order to improve recall for search engines, modern words used in queries have to be associated with corresponding historical variants found in ...
Keywords: Information retrieval, Electronic lexica
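
One way to picture the association between a modern query word and historical variants is rule-based expansion. The sketch below is an assumption-laden illustration, not the method of the paper: the rewrite rules, the `max_edits` limit, and the function names are all hypothetical. It generates candidate spellings from a modern word and keeps only those attested in a given token set.

```python
from itertools import combinations

# Hypothetical rewrite rules (modern substring, historical substring).
# They are illustrative only and are not taken from the paper.
RULES = [("t", "th"), ("ei", "ey")]


def historical_variants(word, rules=RULES, max_edits=2):
    """Generate candidate spellings by applying up to max_edits rules to word."""
    # Collect every place in the modern word where some rule could fire.
    sites = []
    for modern, historical in rules:
        start = word.find(modern)
        while start != -1:
            sites.append((start, start + len(modern), historical))
            start = word.find(modern, start + 1)
    variants = {word}
    for k in range(1, max_edits + 1):
        for combo in combinations(sorted(sites), k):
            # Skip combinations whose matched spans overlap.
            if any(a[1] > b[0] for a, b in zip(combo, combo[1:])):
                continue
            candidate = word
            # Apply right to left so earlier offsets stay valid.
            for start, end, historical in reversed(combo):
                candidate = candidate[:start] + historical + candidate[end:]
            variants.add(candidate)
    return variants


def expand_query(word, attested, rules=RULES):
    """Keep only the generated variants that occur in the attested token set."""
    return historical_variants(word, rules) & attested
```

With these example rules, `expand_query("teil", {"theil", "haus"})` returns `{"theil"}`. Filtering against attested tokens keeps the expansion from flooding a query with spellings that never occur in the collection.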

10
June 2011 International Journal on Document Analysis and Recognition - Special issue on noisy text analytics: Volume 14 Issue 2, June 2011
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 0


11
May 2011 Theoretical Computer Science: Volume 412 Issue 22, May, 2011
Publisher: Elsevier Science Publishers Ltd.
Bibliometrics:
Citation Count: 3

Given some form of distance between words, a fundamental operation is to decide whether the distance between two given words w and v is within a given bound. In earlier work, we introduced the concept of a universal Levenshtein automaton for a given distance bound n. This deterministic automaton takes ...
Keywords: Levenshtein distance, Finite state automata, Dynamic programming
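
The decision problem stated above, whether the distance between two words w and v is within a bound n, can be illustrated with plain dynamic programming. The sketch below is a textbook row-by-row Levenshtein computation with early termination. It is not the universal Levenshtein automaton described in the abstract, and the function name is an assumption made for this example.

```python
def within_distance(w, v, n):
    """Return True iff the Levenshtein distance between w and v is at most n."""
    # The length difference is a lower bound on the distance.
    if abs(len(w) - len(v)) > n:
        return False
    # prev is the previous DP row: prev[j] is the distance between the
    # prefix of w processed so far and v[:j]. Row 0 compares "" with v[:j].
    prev = list(range(len(v) + 1))
    for i, cw in enumerate(w, start=1):
        curr = [i]
        for j, cv in enumerate(v, start=1):
            cost = 0 if cw == cv else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        # Row minima never decrease, so the bound can no longer be met.
        if min(curr) > n:
            return False
        prev = curr
    return prev[-1] <= n
```

For example, `within_distance("kitten", "sitting", 3)` is true, while the same pair with bound 2 is false. This check costs time proportional to the product of the word lengths per pair, which is the kind of per-pair work an automaton-based approach is meant to avoid.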

12
February 2011 Journal of Biomedical Informatics: Volume 44 Issue 1, February, 2011
Publisher: Elsevier Science
Bibliometrics:
Citation Count: 5

Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the ...
Keywords: Medical knowledge engineering, Ontology modularization

13 published by ACM
October 2010 CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 5,   Downloads (Overall): 112

Full text available: PDF
Keywords: noisy text

14
October 2009 International Journal on Document Analysis and Recognition - Special Issue NOISY: Volume 12 Issue 3, October 2009
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 0

The detection and correction of false friends—also called real-word errors—is a notoriously difficult problem. On realistic data, the break-even point for automatic correction so far could not be reached: the number of additional infelicitous corrections outnumbered the useful corrections. We present a new approach where we first compute a profile ...
Keywords: Error correction, Error dictionaries, False friends

15
October 2009 International Journal on Document Analysis and Recognition - Special Issue NOISY: Volume 12 Issue 3, October 2009
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 1


16 published by ACM
September 2009 DocEng '09: Proceedings of the 9th ACM symposium on Document engineering
Publisher: ACM
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 2,   Downloads (12 Months): 10,   Downloads (Overall): 173

Full text available: PDF
Many European libraries are currently engaged in mass digitization projects that aim to make historical documents and corpora available online. In this context, appropriate lexical resources play a double role. They are needed to improve OCR recognition of historical documents, which currently does not lead to satisfactory ...
Keywords: electronic lexica, historical spelling variants, information retrieval

17 published by ACM
July 2009 AND '09: Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data
Publisher: ACM
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 2,   Downloads (12 Months): 9,   Downloads (Overall): 246

Full text available: PDF
Due to the large number of spelling variants found in historical texts, standard methods of Information Retrieval (IR) fail to produce satisfactory results on historical document collections. In order to improve recall for search engines, modern words used in queries have to be associated with corresponding historical variants found in ...
Keywords: electronic lexica, historical spelling variants, information retrieval

18 published by ACM
July 2008 AND '08: Proceedings of the second workshop on Analytics for noisy unstructured text data
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 0,   Downloads (12 Months): 9,   Downloads (Overall): 148

Full text available: PDF
The detection and correction of false friends - also called real-word errors - is a notoriously difficult problem. On realistic data, the break-even point for automatic correction so far could not be reached: the number of additional infelicitous corrections outnumbered the useful corrections. We present a new approach where we ...
Keywords: error correction, error dictionaries, false friends

19
December 2007 International Journal on Document Analysis and Recognition: Volume 10 Issue 3, December 2007
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 2

Given a specific information need, documents of the wrong genre can be considered as noise. From this perspective, genre classification helps to separate relevant documents from noise. Orthographic errors represent a second, finer notion of noise. Since specific genres often include documents with many errors, an interesting question is whether ...
Keywords: Error dictionaries, Genre hierarchies, Features, Genre classification, Noisy corpora

20
December 2007 AI'07: Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 1

Lexical text correction systems are typically based on a central step: when finding a malformed token in the input text, a set of correction candidates for the token is retrieved from the given background dictionary. In previous work we introduced a method for the selection of correction candidates which is ...
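
The central step described above, retrieving correction candidates for a malformed token from a background dictionary, can be sketched with Python's standard library. This is only a stand-in: it uses `difflib.get_close_matches`, which ranks by a sequence-matching ratio, and not the selection method the authors introduced. The wrapper name, the cutoff of 0.8, and the toy dictionary are assumptions for illustration.

```python
import difflib


def correction_candidates(token, dictionary, max_candidates=3, cutoff=0.8):
    """Return up to max_candidates dictionary words similar to token.

    Similarity is difflib's ratio (0.0 to 1.0); words scoring below
    cutoff are dropped. Best matches come first.
    """
    return difflib.get_close_matches(
        token, dictionary, n=max_candidates, cutoff=cutoff
    )


# Hypothetical background dictionary and malformed token.
background = ["receive", "recipe", "deceive", "relieve"]
candidates = correction_candidates("recieve", background)
```

Scanning the whole dictionary for every token is fine for a toy list but scales poorly, which is one reason specialized candidate-selection methods are studied.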



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.