Patent · US · Expired

Path-based ranking of unvisited web pages

US7424484B2 · kind B2 · utility

Cited by: 11
References: 3
Claims: 8
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 5, 2003
Grant date: Sep 9, 2008
Priority date
Expiry date: Mar 16, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Path-based ranking of unvisited web pages for World Wide Web crawling is provided by: identifying all paths that begin at a “seed” URL and lead to visited relevant web pages as the “good-path set” and, for each unvisited web page, identifying the paths that begin at the “seed” URL and lead to it as its “partial-path set”; classifying all visited web pages and labeling each web page with the label of the class or classes to which it belongs; training a statistical model to generalize the patterns common to the paths in the “good-path set”; and evaluating each “partial-path set” with the statistical model and ranking the unvisited web pages by the evaluation results.
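
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what form the statistical model takes, so the first-order Markov model over class labels (with add-one smoothing) used here is an assumption chosen only for illustration, not the patented method. The function names, the example URLs, and the class labels are all hypothetical, and each page is given a single label for simplicity even though the abstract allows several.

```python
import math
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers for a path

def train_transition_model(good_paths, labels):
    """Count label-to-label transitions along every path in the good-path set."""
    counts = defaultdict(lambda: defaultdict(int))
    for path in good_paths:
        seq = [START] + [labels[url] for url in path] + [END]
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    # Vocabulary of every label seen, used for add-one smoothing.
    vocab = set(counts) | {lab for row in counts.values() for lab in row}
    return counts, vocab

def path_log_score(path_labels, counts, vocab):
    """Length-normalised log-probability of a label sequence under the model."""
    seq = [START] + list(path_labels)
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        row = counts.get(prev, {})
        total = sum(row.values())
        score += math.log((row.get(cur, 0) + 1) / (total + len(vocab)))
    return score / max(len(path_labels), 1)

def rank_unvisited(partial_paths, labels, counts, vocab):
    """Score each unvisited page by its best partial path; highest score first."""
    scored = []
    for url, paths in partial_paths.items():
        best = max(path_log_score([labels[u] for u in p], counts, vocab)
                   for p in paths)
        scored.append((url, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical crawl state: class labels of visited pages.
labels = {"seed": "hub", "a": "topic_index", "b": "article",
          "c": "ads", "d": "article"}
# Paths from the seed to visited pages judged relevant (good-path set).
good_paths = [["seed", "a", "b"], ["seed", "a", "d"]]
# For each unvisited page, the visited pages on the paths leading to it.
partial_paths = {"u1": [["seed", "a"]], "u2": [["seed", "c"]]}

counts, vocab = train_transition_model(good_paths, labels)
ranking = rank_unvisited(partial_paths, labels, counts, vocab)
for url, score in ranking:
    print(url, round(score, 3))
```

In this toy data the page u1 is reached through a hub and a topic index, the same label pattern as the good paths, so it ranks above u2, which is reached through a page labeled as ads. A crawler built along these lines would fetch the top-ranked unvisited pages first.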

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.