Modeling and extracting elements in semi-structured documents
US10614125B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2018 |
| Grant date | Apr 7, 2020 |
| Priority date | — |
| Expiry date | Sep 28, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06Q40/123
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The disclosed embodiments provide a system that describes a semi-structured document for the purpose of acquiring a set of data elements from the semi-structured document. During operation, the system obtains a physics model of a semi-structured document, wherein the physics model includes a set of relationships represented by physical objects that describe relative positions of a set of data elements in the semi-structured document. Next, the system applies the physics model to a representation of the semi-structured document to automatically extract a set of data from the representation. The system then provides the extracted set of data for use with one or more applications without requiring manual input of the data into the one or more applications.
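The abstract describes relationships that encode the relative positions of data elements and are applied to a representation of the document. One way to picture this is the minimal Python sketch below. It is illustrative only and is not the patented method: it assumes the "physical object" is a spring whose rest position sits at a fixed offset from a label token, and that the document representation is a flat list of positioned text tokens. The names `Token`, `Spring`, `extract`, and all coordinates and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A piece of text with its position on the page (assumed representation)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Spring:
    """Hypothetical relationship: the value for `field` is expected at
    offset (dx, dy) from the token whose text equals `label`."""
    field: str
    label: str
    dx: float
    dy: float
    max_stretch: float  # candidates farther than this from the rest position are rejected


def extract(tokens, springs):
    """Return {field: text} by picking, for each spring, the token that
    stretches the spring least (smallest squared distance to the rest position)."""
    result = {}
    for spring in springs:
        anchor = next((t for t in tokens if t.text == spring.label), None)
        if anchor is None:
            continue  # label not found, so this field is left out
        rest_x, rest_y = anchor.x + spring.dx, anchor.y + spring.dy
        best, best_energy = None, spring.max_stretch ** 2
        for token in tokens:
            if token is anchor:
                continue
            energy = (token.x - rest_x) ** 2 + (token.y - rest_y) ** 2
            if energy <= best_energy:
                best, best_energy = token, energy
        if best is not None:
            result[spring.field] = best.text
    return result


# Made-up example: two labels, each with a value roughly 100 units to its right.
TOKENS = [
    Token("Wages", 10, 10),
    Token("52000.00", 112, 11),
    Token("Federal tax", 10, 30),
    Token("6100.00", 109, 30),
]
SPRINGS = [
    Spring("wages", "Wages", dx=100, dy=0, max_stretch=20),
    Spring("federal_tax", "Federal tax", dx=100, dy=0, max_stretch=20),
]

print(extract(TOKENS, SPRINGS))
```

Because each value is located relative to its label rather than at an absolute page coordinate, the same model tolerates small shifts in layout. The result is a plain dictionary that can be handed to a downstream application without manual entry.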
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.