Patent · US Active

Extracting information from unstructured documents using natural language processing and conversion of unstructured documents into structured documents

US11423042B2 · kind B2 · utility

Cited by: 3
References: 7
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 7, 2020
Grant date: Aug 23, 2022
Priority date
Expiry date: Jun 25, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/345
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving a training data set including a plurality of documents having related textual strings. A relevancy model is generated from the training data set. The relevancy model is generally configured to generate relevance scores for a plurality of words extracted from the plurality of documents. A knowledge graph model illustrating relationships between the plurality of words extracted from the plurality of documents is generated from the training data set. The relevancy model and the knowledge graph model are aggregated into a complementary model including a plurality of nodes from the knowledge graph model and weights associated with edges between connected nodes, wherein the weights comprise relevance scores generated from the relevancy model, and the complementary model is deployed for use in analyzing documents.
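
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a smoothed inverse document frequency as a stand-in relevancy model, same-document co-occurrence as a stand-in knowledge graph, and the mean of two word scores as the edge weight. All function names (`tokenize`, `relevance_scores`, `knowledge_graph`, `aggregate`) and the sample documents are hypothetical.

```python
import math
import re
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance_scores(docs):
    """Assumed relevancy model: smoothed inverse document frequency per word."""
    n_docs = len(docs)
    doc_freq = defaultdict(int)
    for doc in docs:
        for word in set(tokenize(doc)):
            doc_freq[word] += 1
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0
            for w, df in doc_freq.items()}


def knowledge_graph(docs):
    """Assumed knowledge graph: undirected edges between words that share a document."""
    edges = set()
    for doc in docs:
        words = sorted(set(tokenize(doc)))
        edges.update(combinations(words, 2))
    return edges


def aggregate(scores, edges):
    """Combine both models: nodes come from the graph, edge weights from the scores."""
    graph = defaultdict(dict)
    for a, b in edges:
        weight = (scores[a] + scores[b]) / 2  # assumed weighting rule
        graph[a][b] = weight
        graph[b][a] = weight
    return dict(graph)


# Hypothetical training documents with related textual strings.
docs = ["invoice total due", "invoice number", "total amount due"]
scores = relevance_scores(docs)
model = aggregate(scores, knowledge_graph(docs))
print(sorted(model["invoice"].items()))
```

Here the aggregated model is a plain adjacency dictionary, so looking up the neighbors of a word and ranking them by weight is a single sort. A production system would likely use a learned relevancy model and typed relationships in place of these stand-ins.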

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.