System and method for interactive classification and analysis of data
US6424971B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 29, 1999 |
| Grant date | Jul 23, 2002 |
| Priority date | — |
| Expiry date | Oct 29, 2019 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99942
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A system, method, and computer program product for interactively classifying and analyzing data is particularly applicable to classification and analysis of textual data. It is particularly useful in identification of helpdesk inquiry and problem categories amenable to automated fulfillment or solution. A dictionary is generated based on a frequency of occurrence of words in a document set. A count of occurrences of each word in the dictionary within each document in the document set is generated. The set of documents is partitioned into a plurality of clusters. A name, a centroid, a cohesion score, and a distinctness score are generated for each cluster and displayed in a table. The documents contained in the clusters sorted based on their similarity to other documents in the cluster. The similarity may be determined by calculating the distance of the document to the centroid of the cluster and the documents may be sorted in order of ascending or descending distance of the document to the centroid of the cluster. Editing input may be received from a user and the displayed table modified based on the received editing input—clusters may be split or deleted. The helpdesk applic…
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.