Concept decomposition using clustering
US6560597B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 21, 2000 |
| Grant date | May 6, 2003 |
| Priority date | — |
| Expiry date | Mar 21, 2020 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99943
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method operate with a document collection in which documents are represented as normalized document vectors. The document vector space is partitioned into a set of disjoint clusters and a concept vector is computed for each partition, the concept vector comprising the mean vector of all the documents in that partition. Documents are then reassigned to the cluster having their closest concept vector, and a new set of concept vectors for the new partitioning is computed. From an initial partitioning, the concept vectors are iteratively recalculated until a stopping threshold value is reached, leaving a concept vector subspace of the document vectors. The documents can then be projected onto the concept vector subspace to be represented as a linear combination of the concept vectors, thereby reducing the dimensionality of the document space. A search query can be received for the content of text documents and a search can then be performed on the projected document vectors to identify text documents that correspond to the search query.
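The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made for illustration: the mean vector is rescaled to unit length so that a dot product acts as cosine similarity, the stopping threshold `tol` is applied to the gain in total similarity, projection is done by least squares via the normal equations (which requires linearly independent concept vectors), and the query is scored against each projected document. All function names and the toy data are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def concept_vectors(docs, labels, k):
    """Mean direction of each partition (unit-length rescaling is an assumption)."""
    sums = [[0.0] * len(docs[0]) for _ in range(k)]
    for doc, label in zip(docs, labels):
        for j, x in enumerate(doc):
            sums[label][j] += x
    return [normalize(s) for s in sums]


def quality(docs, labels, concepts):
    """Total similarity between each document and its cluster's concept vector."""
    return sum(dot(d, concepts[c]) for d, c in zip(docs, labels))


def concept_decomposition(docs, labels, k, tol=1e-6, max_iter=100):
    """Alternate reassignment and concept updates until the gain is at most tol."""
    concepts = concept_vectors(docs, labels, k)
    prev = quality(docs, labels, concepts)
    for _ in range(max_iter):
        # Reassign every document to the cluster with the closest concept vector.
        labels = [max(range(k), key=lambda c: dot(d, concepts[c])) for d in docs]
        concepts = concept_vectors(docs, labels, k)
        cur = quality(docs, labels, concepts)
        if cur - prev <= tol:
            break
        prev = cur
    return labels, concepts


def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("concept vectors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def project(doc, concepts):
    """Least-squares coefficients of doc as a combination of the concept vectors."""
    gram = [[dot(ci, cj) for cj in concepts] for ci in concepts]
    rhs = [dot(ci, doc) for ci in concepts]
    return solve(gram, rhs)


def score(query, coeffs, concepts):
    """Dot product of the query with the document's low-dimensional approximation."""
    return sum(z * dot(query, c) for z, c in zip(coeffs, concepts))


# Toy collection: four documents over a four-term vocabulary (hypothetical data).
raw = [[1, 1, 0, 0], [2, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 2]]
docs = [normalize(d) for d in raw]
labels, concepts = concept_decomposition(docs, [0, 0, 0, 1], k=2)
coeffs = [project(d, concepts) for d in docs]  # each document is now 2 numbers
query = normalize([1, 0, 0, 0])
ranking = sorted(range(len(docs)),
                 key=lambda i: score(query, coeffs[i], concepts), reverse=True)
```

In this sketch each four-dimensional document ends up represented by two coefficients, one per concept vector, which is the dimensionality reduction the abstract describes. The result depends on the initial partitioning, since the iteration can settle on a locally optimal clustering.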
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.