Patent · US Active

Document structure identification using post-processing error correction

US11321559B2 · kind B2 · utility

Cited by: 0
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 17, 2019
Grant date: May 3, 2022
Priority date
Expiry date: Sep 30, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/414
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques are disclosed for identifying document structural elements and correcting errors in the classification and/or location of the identified structural elements. An example method includes determining location and classification for a structural element on a page of the document using a machine learning (ML) model; determining one or more errors in the location and/or classification for the structural element; and correcting each instance of the one or more errors using other content in the document (e.g., content spatially adjacent to the corresponding structural element on the page of the document). The method may further include storing the document and the location and classification (as corrected), and/or generating a structural map of the page of the document based on the location and classification (as corrected). The use of the document content to correct errors greatly enhances the agreement between the identified structural elements and the original document.
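
The flow described in the abstract (ML prediction, error detection, correction from nearby page content) can be illustrated with a minimal sketch. The abstract does not specify the correction rules, so everything below is an assumption made for illustration: the `Box`, `TextLine`, and `Element` types, the `min_cover` threshold, the rule that snaps a predicted box to the text lines it mostly covers, and the rule that relabels a paragraph as a caption when its first line starts with "Figure" or "Table". The ML model itself is not shown; its output is represented by a hand-written `predicted` element.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(self.x1 - self.x0, 0.0) * max(self.y1 - self.y0, 0.0)

    def overlap_area(self, other: "Box") -> float:
        w = min(self.x1, other.x1) - max(self.x0, other.x0)
        h = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(w, 0.0) * max(h, 0.0)


@dataclass(frozen=True)
class TextLine:
    """A line of text already present in the document (e.g. from the PDF layer)."""
    box: Box
    text: str


@dataclass(frozen=True)
class Element:
    """A structural element as predicted by the (not shown) ML model."""
    box: Box
    label: str


def correct_location(elem: Element, lines: List[TextLine],
                     min_cover: float = 0.5) -> Element:
    """Assumed rule: snap the box to the union of text lines it mostly covers."""
    hits = [
        ln for ln in lines
        if ln.box.area() > 0
        and elem.box.overlap_area(ln.box) / ln.box.area() >= min_cover
    ]
    if not hits:
        return elem  # nothing to anchor on, keep the model's prediction
    snapped = Box(
        min(ln.box.x0 for ln in hits),
        min(ln.box.y0 for ln in hits),
        max(ln.box.x1 for ln in hits),
        max(ln.box.y1 for ln in hits),
    )
    return replace(elem, box=snapped)


def correct_label(elem: Element, lines: List[TextLine]) -> Element:
    """Assumed rule: a 'paragraph' whose first line starts with
    'Figure' or 'Table' is relabeled as a 'caption'."""
    inside = sorted(
        (ln for ln in lines if elem.box.overlap_area(ln.box) > 0),
        key=lambda ln: (ln.box.y0, ln.box.x0),
    )
    if (elem.label == "paragraph" and inside
            and inside[0].text.lstrip().startswith(("Figure", "Table"))):
        return replace(elem, label="caption")
    return elem


def postprocess(elements: List[Element], lines: List[TextLine]) -> List[Element]:
    """Correct location first, then classification, for every element."""
    return [correct_label(correct_location(e, lines), lines) for e in elements]


# Hand-written example data standing in for real page content and model output.
lines = [
    TextLine(Box(10, 100, 200, 112), "Figure 1: Pipeline overview."),
    TextLine(Box(10, 114, 180, 126), "Boxes are snapped to text."),
    TextLine(Box(10, 140, 200, 152), "Unrelated body text."),
]
predicted = Element(Box(12, 98, 150, 123), "paragraph")  # slightly off, mislabeled
corrected = postprocess([predicted], lines)[0]
print(corrected)
```

Location is corrected before classification in this sketch so that the label rule sees the snapped box; a real system could order or combine the steps differently.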

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.