Systems and methods of web crawling
US9576052B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 16, 2013 |
| Grant date | Feb 21, 2017 |
| Priority date | — |
| Expiry date | Apr 16, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/951
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods and systems for dynamically training a web crawler. The web crawler maintains one or more categories, each comprising a set of words. The method includes selecting at least one hyperlink in response to a query received from a user. The method further includes determining a hyperlink score for the at least one hyperlink based on a category score associated with each of the one or more categories. The category score associated with each of the one or more categories is updated based at least in part on the hyperlink score. The updated category score is compared with the hyperlink score to select a category from the one or more categories. The set of words associated with the selected category is updated based on the content of a web page pointed to by the at least one hyperlink.
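The training loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not disclose any formulas, so everything concrete here is an assumption made for illustration: the `Category` class, the word-overlap weighting in `hyperlink_score`, the moving-average update in `update_category_scores` (including the `rate` parameter), and the closest-score rule in `select_category` are all hypothetical and should not be read as the patented method.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Category:
    """A crawler category: a name, a set of words, and a score (hypothetical layout)."""
    name: str
    words: Set[str]
    score: float = 1.0


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def hyperlink_score(anchor_text: str, categories: List[Category]) -> float:
    """Assumed formula: sum of each category score weighted by the
    fraction of anchor-text words found in that category's word set."""
    tokens = tokenize(anchor_text)
    if not tokens:
        return 0.0
    return sum(c.score * len(tokens & c.words) / len(tokens) for c in categories)


def update_category_scores(categories: List[Category], link_score: float,
                           rate: float = 0.5) -> None:
    """Assumed update rule: move every category score toward the
    hyperlink score with an exponential moving average."""
    for c in categories:
        c.score = (1.0 - rate) * c.score + rate * link_score


def select_category(categories: List[Category], link_score: float) -> Category:
    """Assumed comparison: pick the category whose updated score is
    closest to the hyperlink score."""
    return min(categories, key=lambda c: abs(c.score - link_score))


def train_on_link(anchor_text: str, page_text: str,
                  categories: List[Category]) -> Category:
    """Run one training step: score the link, update category scores,
    select a category, and extend its word set from the page content."""
    link_score = hyperlink_score(anchor_text, categories)
    update_category_scores(categories, link_score)
    chosen = select_category(categories, link_score)
    chosen.words |= tokenize(page_text)
    return chosen


if __name__ == "__main__":
    cats = [
        Category("sports", {"football", "goal"}, score=1.0),
        Category("finance", {"stock", "market"}, score=0.2),
    ]
    picked = train_on_link("football goal highlights",
                           "The striker scored a late goal", cats)
    print(picked.name, sorted(picked.words))
```

In this sketch the page text stands in for the content of the web page that the selected hyperlink points to; a real crawler would fetch and parse that page before updating the word set.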
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.