Patent · US Active

Text extraction using optical character recognition

US12243337B2 · kind B2 · utility

Cited by: 0
References: 10
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 8, 2024
Grant date: Mar 4, 2025
Priority date
Expiry date: Mar 8, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/41
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Provided herein are systems and methods for extracting text from a document. Different optical character recognition (OCR) tools are used to extract different versions of the text in the document. Metrics evaluating the quality of the extracted text are compared to identify and select higher quality extracted text. A selected portion of text is compared to a threshold to ensure minimal quality. The selected portion of text is then saved. Error correction can be applied to the selected portion of text based on errors specific to the OCR tools or the document contents.
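
The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch and not the claimed method: the OCR tools are stand-in callables, the quality metric (`word_ratio`, the share of tokens found in a known vocabulary), the threshold value, and the per-tool correction table are all assumptions introduced here, since the abstract does not name specific tools or metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class Candidate:
    tool: str     # name of the OCR tool that produced the text
    text: str     # extracted text
    score: float  # quality metric for the extracted text


def word_ratio(text: str, vocabulary: Set[str]) -> float:
    """Hypothetical quality metric: fraction of tokens found in a vocabulary."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in vocabulary for t in tokens) / len(tokens)


def extract_best(
    document: object,
    tools: Dict[str, Callable[[object], str]],
    vocabulary: Set[str],
    threshold: float,
    corrections: Optional[Dict[str, Dict[str, str]]] = None,
) -> Optional[Candidate]:
    """Run every tool, keep the best-scoring text, gate it on a threshold,
    then apply substitutions specific to the winning tool."""
    candidates = []
    for name, run in tools.items():
        text = run(document)
        candidates.append(Candidate(name, text, word_ratio(text, vocabulary)))
    if not candidates:
        return None

    best = max(candidates, key=lambda c: c.score)
    if best.score < threshold:
        return None  # no version meets the minimum quality

    # Assumed form of tool-specific error correction: plain substring fixes.
    for wrong, right in (corrections or {}).get(best.tool, {}).items():
        best.text = best.text.replace(wrong, right)
    return best


# Stand-in "OCR tools" returning fixed strings, for demonstration only.
demo_tools = {
    "tool_a": lambda _doc: "Tbe qu1ck brown fox",
    "tool_b": lambda _doc: "The quick brovvn fox",
}
demo_vocab = {"the", "quick", "brown", "fox"}
demo_fixes = {"tool_b": {"vv": "w"}}

result = extract_best(None, demo_tools, demo_vocab, 0.6, demo_fixes)
```

In this sketch the score is computed before correction and is not recomputed afterwards; a real system might rescore the corrected text or save it to storage at this point.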

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.