Patent · US Active

Automated information extraction from electronic documents using machine learning

US12210824B1 · kind B1 · utility

Cited by: 0
References: 30
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 29, 2022
Grant date: Jan 28, 2025
Priority date
Expiry date: Jul 22, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N5/022
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method of automatically extracting information from electronic documents is discussed. The method includes a computer system receiving a plurality of electronic documents of a particular type that includes information arranged in a plurality of different formats. For each of a set of the electronic documents, the computer system analyzes the electronic document to identify tokens within it, identifies a plurality of points-of-interest within it, and matches points-of-interest based on the distance between them and on a determination by a natural language processing model that the points-of-interest correspond. The method further includes generating revised versions of the electronic documents, in which the matched points-of-interest are arranged in a universal format, and storing the revised versions.
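
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `PointOfInterest` type, the "label"/"value" split, the `max_distance` threshold, and the rule-based `toy_corresponds` function (a stand-in for the natural language processing model) are all assumptions introduced here for illustration only.

```python
import re
from dataclasses import dataclass
from math import hypot
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PointOfInterest:
    """Hypothetical point-of-interest: a piece of text at a page position."""
    text: str
    x: float
    y: float
    kind: str  # "label" or "value" (assumed categories, not from the patent)


def toy_corresponds(label: PointOfInterest, value: PointOfInterest) -> bool:
    """Rule-based stand-in for the NLP model's 'these correspond' decision."""
    looks_like_date = bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", value.text))
    looks_like_amount = bool(re.fullmatch(r"\$?\d+(\.\d{2})?", value.text))
    name = label.text.lower()
    if "date" in name:
        return looks_like_date
    if "total" in name:
        return looks_like_amount
    return False


def match_points(
    points: List[PointOfInterest],
    corresponds: Callable[[PointOfInterest, PointOfInterest], bool],
    max_distance: float = 100.0,  # assumed threshold, in page units
) -> Dict[str, str]:
    """Pair each label with the nearest value that the model accepts."""
    labels = [p for p in points if p.kind == "label"]
    values = [p for p in points if p.kind == "value"]
    matched: Dict[str, str] = {}
    for label in labels:
        # Keep only values the model accepts and that lie close enough.
        candidates = [
            (hypot(label.x - v.x, label.y - v.y), v)
            for v in values
            if corresponds(label, v)
        ]
        candidates = [(d, v) for d, v in candidates if d <= max_distance]
        if candidates:
            _, best = min(candidates, key=lambda c: c[0])
            matched[label.text] = best.text
    return matched


# The same fields could sit at different positions in differently
# formatted documents; the returned dict plays the role of a
# "universal format" in this sketch.
points = [
    PointOfInterest("Invoice date", 10, 10, "label"),
    PointOfInterest("2022-04-29", 80, 10, "value"),
    PointOfInterest("Total", 10, 40, "label"),
    PointOfInterest("$125.00", 60, 40, "value"),
    PointOfInterest("2021-01-01", 400, 400, "value"),  # too far away to match
]
result = match_points(points, toy_corresponds)
```

Combining the two signals matters in this sketch: distance alone would pair a label with whatever happens to sit nearest, while the correspondence check alone could pair it with a plausible value from an unrelated part of the page.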

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.