Patent · US Active

System and method for prioritizing websites during a webcrawling process

US7966337B2 · kind B2 · utility

Cited by: 25
References: 4
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 23, 2008
Grant date: Jun 21, 2011
Priority date
Expiry date: Jul 21, 2029

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99936
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for prioritizing a fetch order of web pages. The method comprises extracting, by a web crawler, a set of candidate web pages to be crawled. Each web page in the set of candidate web pages is associated with a website in a computer network. A determination is made as to whether a first website score for the website is in a website score database. The first website score is associated with web pages in the set of candidate web pages if the first website score exists in the website score database. The set of candidate web pages is prioritized with respect to an associated website score for each web page in the set of candidate web pages. Content is retrieved from the set of candidate web pages. Hyperlinks are extracted from the content. The hyperlinks are stored in a memory unit.
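
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`prioritize`, `crawl_once`, `LinkExtractor`, `DEFAULT_SCORE`) is hypothetical, and several choices are assumptions that the abstract does not specify: the website score database is modeled as a plain dictionary keyed by host name, pages whose website has no stored score receive a default of 0.0, and higher scores are fetched first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: the abstract does not say how unscored websites are handled.
DEFAULT_SCORE = 0.0


def prioritize(candidates, score_db, default=DEFAULT_SCORE):
    """Order candidate URLs by the score of the website hosting each page."""
    scored = []
    for url in candidates:
        site = urlparse(url).netloc.lower()    # website the page belongs to
        score = score_db.get(site, default)    # look up the website score
        scored.append((score, url))
    # Stable sort: pages with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl_once(candidates, score_db, fetch):
    """Fetch pages in priority order and return the hyperlinks found.

    `fetch` is a caller-supplied function mapping a URL to HTML text,
    so the sketch stays independent of any particular HTTP client.
    """
    stored_links = []  # stands in for the "memory unit" in the abstract
    for url in prioritize(candidates, score_db):
        parser = LinkExtractor(url)
        parser.feed(fetch(url))
        stored_links.extend(parser.links)
    return stored_links
```

As a usage example, calling `prioritize` with a score database of `{"high.example": 0.9, "low.example": 0.2}` places pages on `high.example` before pages on `low.example`, with pages from unscored websites last. Passing `fetch` in as a parameter is a design choice of this sketch, so that the ordering logic can be exercised without network access.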

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.