System and method for thematically grouping documents into clusters
US8015188B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 4, 2010 |
| Grant date | Sep 6, 2011 |
| Priority date | — |
| Expiry date | Oct 4, 2030 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99945
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for thematically grouping documents into clusters is provided. Concepts are extracted from a plurality of documents. The concepts include nouns or noun phrases. A number of occurrences for each concept are determined within each document. A bounded range is applied to the concepts and a subset of the concepts is selected by removing the concepts that fall outside the bounded range. The bounded range includes upper edge conditions and lower edge conditions. Themes are generated from the subset of concepts by identifying two or more concepts with common semantic meaning. Clusters of the documents are generated based on the themes.
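The steps named in the abstract (counting concepts per document, filtering by a bounded range, forming themes, and clustering) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes concepts have already been extracted as noun phrases, interprets the bounded range as a simple lower and upper limit on total occurrence counts, and replaces semantic analysis with a hypothetical hand-written `meaning_of` lookup. All function names, thresholds, and sample data are illustrative assumptions.

```python
from collections import Counter, defaultdict


def count_concepts(docs):
    """Count occurrences of each concept within each document."""
    return {doc_id: Counter(concepts) for doc_id, concepts in docs.items()}


def select_concepts(counts, lower, upper):
    """Keep concepts whose total count lies inside [lower, upper].

    Assumption: the bounded range is applied to corpus-wide totals.
    """
    totals = Counter()
    for per_doc in counts.values():
        totals.update(per_doc)
    return {concept for concept, n in totals.items() if lower <= n <= upper}


def build_themes(concepts, meaning_of):
    """Group two or more concepts that share a meaning label into a theme.

    `meaning_of` is a hypothetical stand-in for real semantic analysis.
    """
    groups = defaultdict(set)
    for concept in concepts:
        if concept in meaning_of:
            groups[meaning_of[concept]].add(concept)
    return {label: members for label, members in groups.items() if len(members) >= 2}


def cluster_documents(counts, themes):
    """Assign each document to the theme whose concepts it mentions most."""
    clusters = defaultdict(list)
    for doc_id, per_doc in counts.items():
        scores = {
            label: sum(per_doc[m] for m in members)
            for label, members in themes.items()
        }
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] > 0:
            clusters[best].append(doc_id)
    return dict(clusters)


# Illustrative data: concepts are assumed to be pre-extracted noun phrases.
docs = {
    "d1": ["car", "automobile", "engine", "car", "document"],
    "d2": ["automobile", "engine", "document"],
    "d3": ["loan", "mortgage", "loan", "document", "interest rate"],
    "d4": ["mortgage", "document", "document"],
}
meaning_of = {
    "car": "vehicle",
    "automobile": "vehicle",
    "loan": "lending",
    "mortgage": "lending",
}

counts = count_concepts(docs)
subset = select_concepts(counts, lower=2, upper=4)
themes = build_themes(subset, meaning_of)
clusters = cluster_documents(counts, themes)
print(clusters)
```

In this toy data, "document" occurs too often and "interest rate" too rarely to fall inside the range, so both are dropped before themes are formed. A real system would need an actual concept extractor and a semantic similarity measure in place of the lookup table.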
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.