Patent · US Active

Systems and methods of web crawling

US9576052B2 · kind B2 · utility

Cited by: 1
References: 2
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 16, 2013
Grant date: Feb 21, 2017
Priority date
Expiry date: Apr 16, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and systems for dynamically training a web crawler are disclosed. The web crawler maintains one or more categories, each comprising a set of words. The method includes selecting at least one hyperlink in response to a query received from a user. The method further includes determining a hyperlink score for the at least one hyperlink based on a category score associated with each of the one or more categories. The category score associated with each of the one or more categories is updated based at least in part on the hyperlink score. The updated category score is compared with the hyperlink score to select a category from the one or more categories. The set of words associated with the selected category is updated based on the content of a web page pointed to by the at least one hyperlink.
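
The training loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give any formulas, so everything concrete here is an assumption made for illustration: the `Category` class, the word-overlap weighting, the learning `rate`, the rule that a category qualifies when its updated score is at least the hyperlink score, and the word-length filter are all hypothetical and are not taken from the patent claims.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Category:
    """A crawler category: a name, a set of words, and a running score."""
    name: str
    words: set[str]
    score: float = 1.0


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(tokens: set[str], category: Category) -> float:
    """Fraction of the category's words that appear in the tokens (assumed weighting)."""
    if not category.words:
        return 0.0
    return len(tokens & category.words) / len(category.words)


def train_on_selection(anchor_text: str, page_text: str,
                       categories: list[Category], rate: float = 0.5):
    """One training step for a hyperlink selected in response to a user query.

    Returns (hyperlink_score, chosen_category_or_None).
    """
    tokens = tokenize(anchor_text)

    # Hyperlink score based on the category score of each category
    # (assumed form: overlap-weighted sum of category scores).
    link_score = sum(c.score * overlap(tokens, c) for c in categories)

    # Update each category score based in part on the hyperlink score
    # (assumed form: additive update scaled by a hypothetical rate).
    for c in categories:
        c.score += rate * link_score * overlap(tokens, c)

    # Compare updated category scores with the hyperlink score to select a
    # category (assumed rule: highest-scoring category at or above link_score).
    candidates = [c for c in categories if c.score >= link_score]
    if not candidates:
        return link_score, None
    chosen = max(candidates, key=lambda c: c.score)

    # Update the chosen category's word set from the linked page's content
    # (assumed filter: keep only words longer than three letters).
    chosen.words |= {w for w in tokenize(page_text) if len(w) > 3}
    return link_score, chosen
```

Under these assumptions, a selected link whose anchor text matches the words of a "sports" category raises that category's score, and words from the linked page are then added to the "sports" word set, while non-matching categories keep their scores and words.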

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.