Clustering hypertext with applications to web searching
US6684205B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 18, 2000 |
| Grant date | Jan 27, 2004 |
| Priority date | — |
| Expiry date | Jul 13, 2021 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99933
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and structure of searching a database containing hypertext documents comprising searching the database using a query to produce a set of hypertext documents; and geometrically clustering the set of hypertext documents into various clusters using a toric k-means similarity measure such that documents within each cluster are similar to each other, wherein the clustering has a linear-time complexity in producing the set of hypertext documents, wherein the similarity measure comprises a weighted sum of maximized individual components of the set of hypertext documents, and wherein the clustering is based upon words contained in each hypertext document, out-links from each hypertext document, and in-links to each hypertext document.
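The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each document is represented as a triple of vectors (word counts, out-link counts, in-link counts), that each component is normalized to unit length, and that similarity is a weighted sum of the three per-component inner products. The function names, the equal default weights, the seeding with the first k documents, and the toy data are all hypothetical choices made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity(doc, centroid, weights):
    """Weighted sum of per-component inner products (words, out-links, in-links)."""
    return sum(w * dot(d, c) for w, d, c in zip(weights, doc, centroid))


def toric_kmeans(docs, k, weights=(1 / 3, 1 / 3, 1 / 3), max_iter=20):
    """Cluster documents given as (words, out_links, in_links) vector triples.

    Assumption: centroids are seeded with the first k documents so the
    sketch is deterministic; a real system would likely seed differently.
    """
    docs = [tuple(normalize(comp) for comp in doc) for doc in docs]
    centroids = [docs[j] for j in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        # One pass costs O(n * k * dimensions), i.e. linear in the number of documents.
        new_labels = [
            max(range(k), key=lambda j: similarity(doc, centroids[j], weights))
            for doc in docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids would not change
        labels = new_labels
        # Update step: sum each component over the cluster, then renormalize it.
        for j in range(k):
            members = [doc for doc, lab in zip(docs, labels) if lab == j]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            centroids[j] = tuple(
                normalize([sum(col) for col in zip(*(m[c] for m in members))])
                for c in range(3)
            )
    return labels, centroids


# Toy data (hypothetical): four documents over 4 words, 2 out-link targets, 2 in-link sources.
example_docs = [
    ([1, 0, 0, 0], [1, 0], [1, 0]),
    ([0, 0, 1, 0], [0, 1], [0, 1]),
    ([1, 1, 0, 0], [1, 0], [1, 0]),
    ([0, 0, 1, 1], [0, 1], [0, 1]),
]
labels, centroids = toric_kmeans(example_docs, k=2)
print(labels)
```

Normalizing each of the three components separately, rather than the concatenated vector, keeps a document with many links from drowning out its word content; the weights then control how much each component contributes to the similarity.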
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.