Patent · US Active

Web crawler scheduler that utilizes sitemaps from websites

US8037054B2 · kind B2 · utility

Cited by: 8
References: 23
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 25, 2010
Grant date: Oct 11, 2011
Priority date
Expiry date: Jun 25, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems for a web crawler scheduler that utilizes sitemaps from websites are described. A web crawler scheduling system receives a notification from a website or web server. In response to the notification, the system accesses one or more sitemap(s) for documents associated with the website or web server. The system schedules crawls of the documents based on information identified from the sitemaps. The system crawls at least a subset of the documents scheduled for crawling.
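
The flow in the abstract (notification, sitemap access, scheduling, crawling a subset) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say which sitemap information drives scheduling, so the sketch assumes the `lastmod` and `priority` fields of the public Sitemaps protocol. The function names, the `last_crawled` record, and the `budget` parameter are hypothetical, and the sitemap is passed in as text rather than fetched over the network.

```python
import heapq
import xml.etree.ElementTree as ET

# Namespace defined by the public Sitemaps protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return (url, lastmod, priority) tuples for each <url> entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for url_el in root.findall("sm:url", SITEMAP_NS):
        loc = url_el.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        if not loc:
            continue  # an entry without a location cannot be crawled
        lastmod = url_el.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        # The Sitemaps protocol gives 0.5 as the default priority.
        priority = float(
            url_el.findtext("sm:priority", default="0.5", namespaces=SITEMAP_NS)
        )
        entries.append((loc, lastmod, priority))
    return entries


def schedule_crawls(entries, last_crawled):
    """Order URLs for crawling, highest priority first (assumed policy).

    A URL is skipped when its lastmod is not newer than the date it was
    last crawled. Dates are assumed to be YYYY-MM-DD strings, so plain
    string comparison orders them correctly.
    """
    heap = []
    for order, (url, lastmod, priority) in enumerate(entries):
        seen = last_crawled.get(url)
        if seen is not None and lastmod is not None and lastmod <= seen:
            continue  # unchanged since the last crawl
        # Negate priority for a max-heap; 'order' breaks ties stably.
        heapq.heappush(heap, (-priority, order, url))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def handle_notification(sitemap_xml, last_crawled, budget):
    """On a (simulated) notification, schedule documents and return the
    subset that fits the hypothetical crawl budget."""
    schedule = schedule_crawls(parse_sitemap(sitemap_xml), last_crawled)
    return schedule[:budget]
```

The priority queue is one simple way to turn sitemap hints into a crawl order; a real scheduler would likely also weigh server load, politeness delays, and change-frequency history.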

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.