Loose, Falling Characters and Sentences: The Persistence of the OCR Problem in Digital Repository E-Books.

  • Additional information
    • Abstract:
      The electronic conversion of scanned image files to readable text using optical character recognition (OCR) software and the subsequent migration of raw OCR text to e-book text file formats are key remediation or media conversion technologies used in digital repository e-book production. Despite real progress, the OCR problem of reliability and accuracy in OCR-derived e-book text and metadata persists. This paper examines a selection of digitized e-books in several prominent digital repositories and discusses the impact of OCR technology on e-book text file formats, metadata, and the online reading experience. [ABSTRACT FROM AUTHOR]
    • Abstract:
      Copyright of Portal: Libraries & the Academy is the property of Johns Hopkins University Press and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)