Determination of intermediate representations of discovered document structures
US11880435B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 4, 2021 |
| Grant date | Jan 23, 2024 |
| Priority date | — |
| Expiry date | Aug 5, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/027
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A document is received. The document is analyzed to discover text and structures of content included in the document. A result of the analysis is used to determine intermediate text representations of segments of the content included in the document, wherein at least one of the intermediate text representations includes an added text encoding the discovered structure of the corresponding content segment within a structural layout of the document. The intermediate text representations are used as an input to a machine learning model to extract information of interest in the document. One or more structured records of the extracted information of interest are created.
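The flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Segment` class, the bracketed tag format produced by `to_intermediate`, and the `extract` function. In particular, `extract` is a simple rule-based stand-in for the machine learning model, used only so the example runs on its own.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    """Hypothetical content segment discovered by document analysis."""
    text: str
    kind: str  # e.g. "heading", "paragraph", "table_cell"
    attrs: Dict[str, object] = field(default_factory=dict)  # layout details


def to_intermediate(seg: Segment) -> str:
    """Prefix the segment text with added text that encodes its structure.

    The bracketed tag format is an assumption made for illustration.
    """
    details = " ".join(f"{k}={v}" for k, v in sorted(seg.attrs.items()))
    tag = f"[{seg.kind.upper()} {details}]" if details else f"[{seg.kind.upper()}]"
    return f"{tag} {seg.text}"


# Attributes are emitted in sorted key order: col, header, row.
_CELL = re.compile(r"^\[TABLE_CELL col=(\d+) header=(\w+) row=(\d+)\] (.+)$")


def extract(intermediate: List[str]) -> List[Dict[str, str]]:
    """Stand-in for the model: group table cells into one record per row.

    A real system would pass the intermediate strings to a trained model.
    """
    rows: Dict[int, Dict[str, str]] = {}
    for line in intermediate:
        match = _CELL.match(line)
        if match:
            _col, header, row, value = match.groups()
            rows.setdefault(int(row), {})[header] = value
    return [rows[r] for r in sorted(rows)]


# Example input: a heading followed by one table row of two cells.
segments = [
    Segment("Invoice", "heading", {"level": 1}),
    Segment("Widget", "table_cell", {"row": 1, "col": 1, "header": "Item"}),
    Segment("9.50", "table_cell", {"row": 1, "col": 2, "header": "Amount"}),
]
intermediate = [to_intermediate(s) for s in segments]
records = extract(intermediate)
```

With this input, `intermediate` holds strings such as `[TABLE_CELL col=1 header=Item row=1] Widget`, and `records` holds a single structured record pairing each column header with its cell value. The point of the sketch is that the row, column, and header information survives in plain text, so a text-only model can still use the layout.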
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.