Path-based ranking of unvisited web pages
US7424484B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 5, 2003 |
| Grant date | Sep 9, 2008 |
| Priority date | — |
| Expiry date | Mar 16, 2025 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99943
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Path-based ranking of unvisited web pages for WWW crawling is provided by: identifying all paths that begin at a “seed” URL and lead to visited relevant web pages as the “good-path set” and, for each unvisited web page, identifying the paths from the “seed” URL that lead to it as its “partial-path set”; classifying all visited web pages and labeling each page with the class or classes to which it belongs; training a statistical model to generalize the patterns common to the paths in the “good-path set”; and evaluating each “partial-path set” with the statistical model and ranking the unvisited web pages by the evaluation results.
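The steps in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical model is used, so the sketch assumes a smoothed bigram model over page-class labels as a stand-in. The function names, the class labels ("hub", "topic", "index", "ads"), and the example URLs are all hypothetical, and each path is assumed to be already reduced to the sequence of class labels of the visited pages along it.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical marker for the seed end of a path


def train_bigram_model(good_paths):
    """Count label-to-label transitions over the good-path set."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for path in good_paths:
        prev = START
        for label in path:
            counts[prev][label] += 1
            vocab.add(label)
            prev = label
    return counts, vocab


def score_path(path, model):
    """Average add-one-smoothed log-probability of a label sequence."""
    counts, vocab = model
    v = len(vocab) + 1  # one extra slot reserves mass for unseen labels
    total = 0.0
    prev = START
    for label in path:
        row = counts.get(prev, {})
        n = sum(row.values())
        total += math.log((row.get(label, 0) + 1) / (n + v))
        prev = label
    return total / max(len(path), 1)


def rank_unvisited(partial_paths, model):
    """Rank unvisited URLs by the best score among their partial paths."""
    scores = {
        url: max(score_path(p, model) for p in paths)
        for url, paths in partial_paths.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Assumed toy data: label sequences of visited pages on each path.
good_paths = [
    ["hub", "topic", "topic"],
    ["hub", "topic"],
    ["hub", "index", "topic"],
]
partial_paths = {
    "http://example.com/a": [["hub", "topic"]],
    "http://example.com/b": [["ads", "ads"]],
}

model = train_bigram_model(good_paths)
ranking = rank_unvisited(partial_paths, model)
print(ranking)
```

In this toy data the page reached through a "hub" then "topic" path scores higher than the page reached through "ads" pages, because the former matches transitions seen in the good-path set. Taking the maximum over a page's partial paths is one design choice among several; averaging the path scores would be an equally plausible reading.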
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.