Patent · US Active

System for information extraction from form-like documents

US12354396B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 19, 2023
Grant date: Jul 8, 2025
Priority date
Expiry date: Oct 19, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2207/30176
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present disclosure is directed to extracting text from form-like documents. In particular, a computing system can obtain an image of a document that contains a plurality of portions of text. The computing system can extract one or more candidate text portions for each field type included in a target schema. The computing system can generate a respective input feature vector for each candidate for the field type. The computing system can generate a respective candidate embedding for the candidate text portion. The computing system can determine a respective score for each candidate text portion for the field type based at least in part on the respective candidate embedding for the candidate text portion. The computing system can assign one or more of the candidate text portions to the field type based on the respective scores.
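The steps named in the abstract (candidate extraction per field type, input feature vector, candidate embedding, score, assignment) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the field names `invoice_date` and `total_amount`, the regex candidate generators, the three position and length features, the fixed `WEIGHTS` matrix standing in for a learned embedding model, and the hand-picked `FIELD_EMBEDDINGS`. It also assumes the image has already been converted to text portions with normalized page positions.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical candidate generators: one regex per field type in the target schema.
CANDIDATE_PATTERNS = {
    "invoice_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "total_amount": re.compile(r"\$\d+(?:\.\d{2})?"),
}

# Toy stand-ins for learned parameters (assumptions, not values from the patent).
WEIGHTS = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]  # projects 3 features to 2 dimensions
FIELD_EMBEDDINGS = {"invoice_date": [-1.0, 0.0], "total_amount": [1.0, 0.0]}


@dataclass(frozen=True)
class Candidate:
    text: str
    x: float  # normalized horizontal position on the page, 0..1
    y: float  # normalized vertical position on the page, 0..1


def extract_candidates(portions, field_type):
    """Return the text portions that could plausibly fill the given field type."""
    pattern = CANDIDATE_PATTERNS[field_type]
    candidates = []
    for text, x, y in portions:
        match = pattern.search(text)
        if match:
            candidates.append(Candidate(match.group(0), x, y))
    return candidates


def feature_vector(candidate):
    """Build the input features: page position plus a capped, scaled text length."""
    return [candidate.x, candidate.y, min(len(candidate.text), 20) / 20]


def embed(features, weights):
    """Map features to a candidate embedding with a linear layer and tanh."""
    return [math.tanh(sum(w * f for w, f in zip(row, features))) for row in weights]


def score_candidate(candidate_embedding, field_embedding):
    """Score a candidate for a field as the sigmoid of a dot product."""
    dot = sum(c * f for c, f in zip(candidate_embedding, field_embedding))
    return 1.0 / (1.0 + math.exp(-dot))


def assign_fields(portions, schema):
    """Assign the highest-scoring candidate (text, score) to each field type."""
    assignments = {}
    for field_type in schema:
        scored = [
            (c.text, score_candidate(embed(feature_vector(c), WEIGHTS),
                                     FIELD_EMBEDDINGS[field_type]))
            for c in extract_candidates(portions, field_type)
        ]
        assignments[field_type] = max(scored, key=lambda p: p[1]) if scored else None
    return assignments


# Text portions as (text, x, y) tuples, as an OCR step might produce them.
PORTIONS = [
    ("Invoice date: 2023-10-19", 0.1, 0.1),
    ("Due: 2023-11-18", 0.1, 0.2),
    ("Subtotal $90.00", 0.7, 0.8),
    ("Total $99.00", 0.7, 0.9),
]

if __name__ == "__main__":
    print(assign_fields(PORTIONS, ["invoice_date", "total_amount"]))
```

With these toy parameters the date nearest the top of the page and the amount nearest the bottom receive the highest scores. The sketch keeps only the single top-scoring candidate per field, whereas the abstract allows assigning one or more candidates; a score threshold would be one way to cover that case.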

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.