Patent · US · Expired

System and method for dynamically evaluating latent concepts in unstructured documents

US6978274B1 · kind B1 · utility

Cited by: 133
References: 18
Claims: 44
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 31, 2001
Grant date: Dec 20, 2005
Priority date
Expiry date: Jun 17, 2023

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99945
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for dynamically evaluating latent concepts in unstructured documents is disclosed. A multiplicity of concepts are extracted from a set of unstructured documents into a lexicon. The lexicon uniquely identifies each concept and a frequency of occurrence. A frequency of occurrence representation is created for the documents set. The frequency representation provides an ordered corpus of the frequencies of occurrence of each concept. A subset of concepts is selected from the frequency of occurrence representation filtered against a pre-defined threshold. A group of weighted clusters of concepts selected from the concepts subset is generated. A matrix of best fit approximations is determined for each document weighted against each group of weighted clusters of concepts.
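The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every concrete choice is an assumption made for illustration: concepts are treated as plain lowercase word tokens, the threshold is a single minimum count (`min_count`), clusters are formed by a greedy cosine-similarity grouping of per-document occurrence profiles (`sim`), cluster weights are normalized frequencies, and the "best fit" score is cosine similarity. The function names `extract_concepts`, `cosine`, and `evaluate` are hypothetical.

```python
import math
import re
from collections import Counter


def extract_concepts(text):
    # Assumption: a "concept" is simply a lowercase word token.
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    # Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def evaluate(docs, min_count=2, sim=0.8):
    # Lexicon: each concept mapped to its frequency of occurrence.
    per_doc = [Counter(extract_concepts(d)) for d in docs]
    lexicon = sum(per_doc, Counter())

    # Frequency representation: concepts ordered by descending frequency.
    ordered = sorted(lexicon.items(), key=lambda kv: (-kv[1], kv[0]))

    # Subset of concepts filtered against a pre-defined threshold.
    selected = [c for c, n in ordered if n >= min_count]

    # Assumed clustering: greedily attach each concept to the first cluster
    # whose seed concept has a similar occurrence profile across documents.
    profile = {c: [pd[c] for pd in per_doc] for c in selected}
    clusters = []
    for c in selected:
        for members in clusters:
            if cosine(profile[c], profile[members[0]]) >= sim:
                members.append(c)
                break
        else:
            clusters.append([c])

    # Assumed weighting: each concept's share of its cluster's total frequency.
    weighted = []
    for members in clusters:
        total = sum(lexicon[c] for c in members)
        weighted.append({c: lexicon[c] / total for c in members})

    # Matrix of fit scores: one row per document, one column per cluster.
    matrix = [
        [cosine([pd[c] for c in w], list(w.values())) for w in weighted]
        for pd in per_doc
    ]
    return ordered, selected, weighted, matrix


if __name__ == "__main__":
    docs = ["cat dog cat", "dog cat", "fish tank fish", "tank fish"]
    ordered, selected, weighted, matrix = evaluate(docs)
    print(selected)
    for row in matrix:
        print([round(score, 3) for score in row])
```

With the four toy documents above, the sketch groups "cat" with "dog" and "fish" with "tank". Each document scores high against the cluster whose concepts it contains and zero against the other.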

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.