::Optical character recognition

Optical character recognition or optical character reader (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper records, such as passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any other suitable documentation. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Early versions needed to be trained with images of each character and worked on one font at a time. Advanced systems that achieve a high degree of recognition accuracy for most fonts are now common. Some systems can reproduce formatted output that closely approximates the original page, including images, columns, and other non-textual components.
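
The idea of training on an image of each character in a single font can be illustrated with a toy template matcher. This is only a minimal sketch under stated assumptions, not how any particular OCR system works: the 3x5 bitmaps, the three-letter "font", and the names `TEMPLATES`, `pixel_distance` and `recognize` are all hypothetical and chosen for illustration.

```python
# Hypothetical 3x5 glyph templates for one made-up font ('#' = ink, '.' = blank).
# A real system would store one such template per character per supported font.
TEMPLATES = {
    "I": ["###",
          ".#.",
          ".#.",
          ".#.",
          "###"],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
}


def pixel_distance(a, b):
    """Count the pixels at which two equally sized glyph bitmaps differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize(glyph, templates=TEMPLATES):
    """Return the label of the stored template that differs least from the glyph."""
    return min(templates, key=lambda label: pixel_distance(glyph, templates[label]))


# A scanned "T" with one missing pixel is still closest to the "T" template.
noisy_t = ["###",
           ".#.",
           ".#.",
           "...",
           ".#."]
print(recognize(noisy_t))
```

Because the matcher only compares pixels against stored bitmaps, a glyph from a different font would need its own set of templates, which is the one-font-at-a-time limitation described above.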

