Method and system for trawling the World-wide Web to identify implicitly-defined communities of web pages
US6886129B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 24, 1999 |
| Grant date | Apr 26, 2005 |
| Priority date | — |
| Expiry date | Nov 24, 2019 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99933
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system for identifying groups of pages of common interest from a collection of hyper-linked pages are disclosed. A plurality of community cores are identified from the collection, where each core includes first and second sets of pages, and each page in the first set points to every page in the second set. Each identified core is expanded into a full community, which is a subset of the pages regarding a particular topic. The identification of community cores is based on analysis of the Web graph, in which the communities correspond to instances of Web subgraphs. Extraneous pages are then pruned to improve the quality of the resulting communities.
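The core structure described in the abstract (every page in a first set points to every page in a second set) can be illustrated with a small sketch. This is a brute-force enumeration under stated assumptions, not the procedure claimed in the patent: the function name `find_cores`, the parameters `i` and `j` (the sizes of the two sets), and the sample `links` graph are all hypothetical and chosen only for illustration. It is practical only for very small graphs.

```python
from itertools import combinations


def find_cores(links, i, j):
    """Enumerate (first_set, second_set) pairs in which every page of the
    first set links to every page of the second set.

    links: dict mapping a page to the set of pages it points to (assumed input shape).
    i: size of the first set; j: size of the second set (illustrative parameters).
    """
    cores = []
    # A page needs at least j out-links to belong to a first set.
    candidates = sorted(p for p, out in links.items() if len(out) >= j)
    for first in combinations(candidates, i):
        # Pages that every member of this candidate first set points to.
        shared = set.intersection(*(set(links[p]) for p in first))
        for second in combinations(sorted(shared), j):
            cores.append((first, second))
    return cores


# Hypothetical toy graph: a, b, and c all point to both x and y.
links = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"x", "y", "q"},
    "d": {"q"},
}

cores_3_2 = find_cores(links, 3, 2)
print(cores_3_2)  # [(('a', 'b', 'c'), ('x', 'y'))]
```

Page `d` is dropped before enumeration because it has fewer than `j` out-links, which is a simple degree-based filter added here for illustration. The expansion of a core into a full community and the pruning of extraneous pages are not sketched, since the abstract does not specify how they are performed.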
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.