Automated information extraction from electronic documents using machine learning
US12210824B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 29, 2022 |
| Grant date | Jan 28, 2025 |
| Priority date | — |
| Expiry date | Jul 22, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/022
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method of automatically extracting information from electronic documents is discussed. The method includes a computer system receiving a plurality of electronic documents of a particular type that include information arranged in a plurality of different formats. The method further includes, for each of a set of the electronic documents, the computer system analyzing the electronic document to identify tokens within it, identifying a plurality of points-of-interest within it, and matching points-of-interest based on the distance between the points-of-interest and on a determination by a natural language processing model that the points-of-interest correspond. The method further includes generating revised versions of the electronic documents in which the matched points-of-interest are arranged in a universal format, and storing the revised versions of the electronic documents.
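The matching step in the abstract (pairing points-of-interest by distance and by a model's judgment that they correspond) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `PointOfInterest` fields, the label/value split, the greedy nearest-first strategy, the `max_distance` cutoff, and `toy_corresponds`, which is a rule-based placeholder standing in for the natural language processing model. The "universal format" is modeled simply as a dictionary keyed by a normalized label.

```python
import re
from dataclasses import dataclass
from math import hypot
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PointOfInterest:
    """Hypothetical point-of-interest: a token span plus its page position."""
    text: str
    x: float
    y: float
    kind: str  # "label" or "value" (an assumed split, for illustration)


def toy_corresponds(label: str, value: str) -> bool:
    """Placeholder for the NLP model's 'these correspond' decision."""
    if "date" in label.lower():
        return bool(re.fullmatch(r"\d{2}/\d{2}/\d{4}", value))
    if "total" in label.lower():
        return value.startswith("$")
    return False


def match_points(
    points: List[PointOfInterest],
    corresponds: Callable[[str, str], bool],
    max_distance: float = 200.0,
) -> List[Tuple[PointOfInterest, PointOfInterest]]:
    """Greedily pair each label with the nearest unused value the model accepts."""
    labels = [p for p in points if p.kind == "label"]
    values = [p for p in points if p.kind == "value"]
    used = set()
    matches = []
    for label in labels:
        # Candidate values ordered by Euclidean distance from the label.
        candidates = sorted(
            (hypot(label.x - v.x, label.y - v.y), i)
            for i, v in enumerate(values)
            if i not in used
        )
        for dist, i in candidates:
            if dist > max_distance:
                break  # remaining candidates are even farther away
            if corresponds(label.text, values[i].text):
                matches.append((label, values[i]))
                used.add(i)
                break
    return matches


def to_universal(
    matches: List[Tuple[PointOfInterest, PointOfInterest]]
) -> Dict[str, str]:
    """Arrange matched pairs in one assumed 'universal' layout: a flat dict."""
    return {
        label.text.strip().lower().rstrip(":"): value.text
        for label, value in matches
    }


points = [
    PointOfInterest("Invoice Date:", 10, 10, "label"),
    PointOfInterest("04/29/2022", 120, 10, "value"),
    PointOfInterest("Total:", 10, 50, "label"),
    PointOfInterest("$1,250.00", 120, 50, "value"),
]
record = to_universal(match_points(points, toy_corresponds))
print(record)  # {'invoice date': '04/29/2022', 'total': '$1,250.00'}
```

Combining the two signals matters because distance alone would pair a label with whatever happens to sit nearest on the page, while the correspondence check alone could pair a label with a plausible value from an unrelated region. A real system would presumably replace `toy_corresponds` with a trained model and might use a global assignment rather than this greedy pass.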
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.