System and method for dynamically evaluating latent concepts in unstructured documents
US6978274B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 31, 2001 |
| Grant date | Dec 20, 2005 |
| Priority date | — |
| Expiry date | Jun 17, 2023 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99945
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for dynamically evaluating latent concepts in unstructured documents is disclosed. A multiplicity of concepts is extracted from a set of unstructured documents into a lexicon. The lexicon uniquely identifies each concept and its frequency of occurrence. A frequency of occurrence representation is created for the document set. The frequency representation provides an ordered corpus of the frequencies of occurrence of each concept. A subset of concepts is selected from the frequency of occurrence representation, filtered against a pre-defined threshold. A group of weighted clusters of concepts selected from the concept subset is generated. A matrix of best fit approximations is determined for each document, weighted against each group of weighted clusters of concepts.
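The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method, only one plausible reading under stated assumptions: a "concept" is approximated by a lowercase word token, the threshold is a simple minimum frequency (`min_freq`), clusters are formed greedily by comparing how concepts are distributed across documents, and "best fit" is taken to be cosine similarity. The function names, parameters, and default values are hypothetical.

```python
from collections import Counter
import math
import re


def extract_concepts(text):
    # Assumption: a "concept" is approximated by a lowercase word token.
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def evaluate(documents, min_freq=2, join_at=0.8):
    # 1. Lexicon: each unique concept with its frequency of occurrence.
    per_doc = [Counter(extract_concepts(d)) for d in documents]
    lexicon = Counter()
    for counts in per_doc:
        lexicon.update(counts)

    # 2. Ordered frequency representation, filtered against a threshold
    #    (assumed here to be a minimum frequency).
    ordered = sorted(lexicon.items(), key=lambda kv: (-kv[1], kv[0]))
    selected = [c for c, f in ordered if f >= min_freq]

    # 3. Greedy clustering (an assumption): a concept joins the first
    #    cluster whose seed concept is similarly spread across documents.
    def profile(concept):
        return {i: c[concept] for i, c in enumerate(per_doc) if c[concept]}

    clusters = []  # each cluster is a list of concepts; element 0 is the seed
    for concept in selected:
        for members in clusters:
            if cosine(profile(concept), profile(members[0])) >= join_at:
                members.append(concept)
                break
        else:
            clusters.append([concept])

    # Weight each concept by its share of the cluster's total frequency.
    weighted = []
    for members in clusters:
        total = sum(lexicon[c] for c in members)
        weighted.append({c: lexicon[c] / total for c in members})

    # 4. Best-fit matrix: one row per document, one column per cluster.
    matrix = [[cosine(dict(counts), w) for w in weighted] for counts in per_doc]
    return lexicon, selected, weighted, matrix
```

Calling `evaluate` on a list of document strings returns the lexicon, the thresholded concept subset, the weighted clusters, and a document-by-cluster score matrix in which a higher value indicates a closer fit between that document and that cluster.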
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.