Web graph compression through scalable pattern mining
US7818303B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 29, 2008 |
| Grant date | Oct 19, 2010 |
| Priority date | — |
| Expiry date | Apr 16, 2029 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/95
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and a processing device are provided for compressing a web graph including multiple nodes and links between the multiple nodes. Nodes of the web graph may be clustered into groups including no more than a predetermined number of nodes. A list of links of the clustered nodes may be created and sorted based on a frequency of occurrence of each of the links. A prefix tree may be created based on the sorted list of links. The prefix tree may be walked to find candidate virtual nodes. The candidate virtual nodes may be analyzed according to selection criteria, and a virtual node may be selected. The prefix tree may be adjusted to account for the selection of the virtual node, and the virtual node may be added to the web graph.
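The steps in the abstract can be illustrated with a minimal Python sketch for a single cluster. It is not the patented implementation and rests on several assumptions: clustering has already happened, the selection criterion is "largest number of links saved" (the abstract does not define the criteria), only one virtual node is selected, and the prefix-tree adjustment step is omitted. All function names and the virtual node name `V1` are hypothetical.

```python
from collections import Counter


def build_prefix_tree(adjacency):
    """Build a trie over each node's outlinks, ordered by descending link frequency."""
    # Frequency of each link target across the whole cluster.
    freq = Counter(t for targets in adjacency.values() for t in targets)
    root = {"children": {}, "sources": [], "path": ()}
    for src in sorted(adjacency):
        # Most frequent targets first, so shared links become shared prefixes.
        ordered = sorted(set(adjacency[src]), key=lambda t: (-freq[t], t))
        node = root
        for target in ordered:
            child = node["children"].get(target)
            if child is None:
                child = {"children": {}, "sources": [],
                         "path": node["path"] + (target,)}
                node["children"][target] = child
            child["sources"].append(src)  # src's sorted list passes through here
            node = child
    return root


def best_candidate(root):
    """Walk the trie; return (savings, path, sources) of the best candidate."""
    best = (0, (), [])
    stack = list(root["children"].values())
    while stack:
        node = stack.pop()
        n_src, n_tgt = len(node["sources"]), len(node["path"])
        # Assumed criterion: n_src * n_tgt direct links are replaced by
        # n_src links into the virtual node plus n_tgt links out of it.
        savings = n_src * n_tgt - (n_src + n_tgt)
        if savings > best[0]:
            best = (savings, node["path"], list(node["sources"]))
        stack.extend(node["children"].values())
    return best


def add_virtual_node(adjacency, name="V1"):
    """Select one virtual node and return (new adjacency, links saved)."""
    savings, path, sources = best_candidate(build_prefix_tree(adjacency))
    if savings <= 0:
        return adjacency, 0  # no candidate reduces the link count
    shared = set(path)
    out = {src: list(targets) for src, targets in adjacency.items()}
    for src in sources:
        # Replace the shared targets with a single link to the virtual node.
        out[src] = [t for t in out[src] if t not in shared] + [name]
    out[name] = list(path)
    return out, savings


graph = {
    "a": ["x", "y", "z"],
    "b": ["x", "y", "z"],
    "c": ["x", "y", "z", "w"],
    "d": ["w"],
}
compressed, saved = add_virtual_node(graph)
```

In this toy cluster, nodes `a`, `b`, and `c` share the targets `x`, `y`, and `z`, so the sketch routes them through one virtual node and the link count drops from 11 to 8. A fuller version would repeat the selection, adjusting the tree after each pick as the abstract describes.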
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.