Journal on Computing and Cultural Heritage (JOCCH): Volume 11 Issue 1, December 2017
Optical character recognition (OCR) engines work poorly on texts published with premodern printing technologies. Engaging the key technological contributors from the IMPACT project, an earlier project attempting to solve the OCR problem for early modern and modern texts, the Early Modern OCR Project (eMOP) of Texas A&M received funding from ...
Machine learning, digital humanities
AAAI'15: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Mass digitization of historical documents is a challenging problem for optical character recognition (OCR) tools. Issues include noisy backgrounds and faded text due to aging, border/marginal noise, bleed-through, skewing, and warping, as well as irregular fonts and page layouts. As a result, OCR tools often produce a large number of spurious ...