Patent · US Active

System and method for entity extraction from semi-structured text documents

US10489439B2 · kind B2 · utility

Cited by: 16
References: 9
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 14, 2016
Grant date: Nov 26, 2019
Priority date
Expiry date: Jun 4, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for extracting entities from a text document includes, for at least a section of the text document, providing a first set of entities extracted from the section and clustering at least a subset of the entities in the first set into clusters based on the locations of the entities in the document. Complete clusters of entities are identified. Patterns for extracting new entities are learned from the complete clusters. New entities are then extracted from incomplete clusters using the learned patterns.
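
The four steps in the abstract (cluster by location, identify complete clusters, learn patterns, extract from incomplete clusters) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how location is measured, what makes a cluster complete, or what form the learned patterns take, so everything here is an assumption made for illustration: location is a line index, clusters are runs of entities at most `max_gap` lines apart, a cluster is complete when it contains every label in a `required` set, and a pattern is the literal text that precedes an entity on its line. All names (`Entity`, `cluster_by_location`, `learn_patterns`, `extract_new`) are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str  # entity type, e.g. "name" or "phone" (assumed labels)
    line: int   # zero-based line index, used here as the entity's location
    text: str   # surface string as it appears in the document


def cluster_by_location(entities, max_gap=1):
    """Group entities whose line indices are at most max_gap apart."""
    clusters = []
    for ent in sorted(entities, key=lambda e: e.line):
        if clusters and ent.line - clusters[-1][-1].line <= max_gap:
            clusters[-1].append(ent)
        else:
            clusters.append([ent])
    return clusters


def is_complete(cluster, required):
    """Assumed completeness test: every required label is present."""
    return required <= {e.label for e in cluster}


def learn_patterns(lines, complete_clusters):
    """Per label, collect the literal text preceding each entity on its line."""
    patterns = {}
    for cluster in complete_clusters:
        for ent in cluster:
            prefix, found, _ = lines[ent.line].partition(ent.text)
            if found and prefix.strip():
                patterns.setdefault(ent.label, set()).add(prefix.strip())
    return patterns


def find_with_prefixes(lines, candidates, prefixes):
    """Return (line index, text) for the first line matching a learned prefix."""
    for i in candidates:
        for prefix in sorted(prefixes):
            m = re.match(re.escape(prefix) + r"\s*(\S.*)", lines[i].strip())
            if m:
                return i, m.group(1).strip()
    return None


def extract_new(lines, cluster, required, patterns, max_gap=1):
    """Look for missing labels on lines in or next to an incomplete cluster."""
    present = {e.label for e in cluster}
    lo = max(0, cluster[0].line - max_gap)
    hi = min(len(lines) - 1, cluster[-1].line + max_gap)
    new = []
    for label in sorted(required - present):
        hit = find_with_prefixes(lines, range(lo, hi + 1), patterns.get(label, ()))
        if hit:
            new.append(Entity(label, hit[0], hit[1]))
    return new


# Toy document: the initial extractor missed the second phone number.
lines = ["Name: Alice Martin", "Phone: 555-0100", "", "Name: Bob Chen", "Phone: 555-0199"]
first_set = [
    Entity("name", 0, "Alice Martin"),
    Entity("phone", 1, "555-0100"),
    Entity("name", 3, "Bob Chen"),
]
required = {"name", "phone"}

clusters = cluster_by_location(first_set)
complete = [c for c in clusters if is_complete(c, required)]
incomplete = [c for c in clusters if not is_complete(c, required)]
patterns = learn_patterns(lines, complete)
found = [e for c in incomplete for e in extract_new(lines, c, required, patterns)]
print(found)
```

In this toy input the first two entities form a complete cluster, from which the prefixes "Name:" and "Phone:" are learned. The cluster that contains only "Bob Chen" is incomplete, and the learned "Phone:" prefix recovers the missing number from the adjacent line. A real system would likely use richer location features and more general patterns than literal prefixes.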

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.