Patent · US Active

Text extraction using optical character recognition

US11961316B2 · kind B2 · utility

Cited by: 0
References: 8
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 10, 2022
Grant date: Apr 16, 2024
Priority date
Expiry date: Jul 3, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/41
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Provided herein are systems and methods for extracting text from a document. Different optical character recognition (OCR) tools are used to extract different versions of the text in the document. Metrics evaluating the quality of the extracted text are compared to identify and select higher quality extracted text. A selected portion of text is compared to a threshold to ensure minimal quality. The selected portion of text is then saved. Error correction can be applied to the selected portion of text based on errors specific to the OCR tools or the document contents.
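
The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed method: the OCR tools are stand-in callables, the quality metric (share of tokens found in a vocabulary), the threshold value, the substitution-table form of error correction, and the names `quality_score` and `select_text` are all assumptions made for the example. The abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

# Hypothetical stand-in: an "OCR tool" is any callable that maps raw
# document bytes to extracted text. No real OCR library is used here.
OcrTool = Callable[[bytes], str]


@dataclass
class Candidate:
    tool: str     # name of the tool that produced the text
    text: str     # extracted (and possibly corrected) text
    score: float  # quality metric of the text as extracted


def quality_score(text: str, vocabulary: Set[str]) -> float:
    """Assumed metric: fraction of tokens that appear in a known vocabulary."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in vocabulary for t in tokens) / len(tokens)


def select_text(
    region: bytes,
    tools: Dict[str, OcrTool],
    vocabulary: Set[str],
    threshold: float = 0.8,
    corrections: Optional[Dict[str, Dict[str, str]]] = None,
) -> Optional[Candidate]:
    """Run every tool, keep the best-scoring text, and enforce a quality floor.

    Returns None when even the best candidate falls below the threshold.
    """
    if not tools:
        return None

    # 1. Extract one version of the text per OCR tool and score each version.
    candidates = []
    for name, tool in tools.items():
        text = tool(region)
        candidates.append(Candidate(name, text, quality_score(text, vocabulary)))

    # 2. Compare the metrics and select the higher-quality version.
    best = max(candidates, key=lambda c: c.score)

    # 3. Compare the selected text to a threshold.
    if best.score < threshold:
        return None

    # 4. Optionally apply corrections for errors specific to the chosen tool
    #    (assumed here to be simple character substitutions).
    for wrong, right in (corrections or {}).get(best.tool, {}).items():
        best.text = best.text.replace(wrong, right)
    return best
```

For example, given two stand-in tools where one returns "The qu1ck brown f0x" and the other "The quick brown fox", and a vocabulary containing those four words, `select_text` would pick the second tool's output because all of its tokens are recognized. The caller would then save the returned text, or handle `None` when no version meets the threshold.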

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.