Distributed crawling of hyperlinked documents
US7305610B1 · kind B1 · utility
Cited by: 19
References: 10
Claims: 25
Family size: 0
Assignee
Inventors
Key dates
| Filing date | Aug 14, 2000 |
| Grant date | Dec 4, 2007 |
| Priority date | — |
| Expiry date | Dec 22, 2022 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/951
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for crawling hyperlinked documents are provided. Hyperlinked documents to be crawled are grouped by host, and the host to be crawled next is selected according to a stall time of the host. The stall time can indicate the earliest time at which the host should be crawled. Stall times can be a predetermined amount of time, can vary by host, and can be adjusted according to actual retrieval times from the host.
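The scheduling idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `HostScheduler`, the parameters `default_delay` and `load_factor`, and the rule "stall for the larger of a fixed delay and a multiple of the last retrieval time" are all assumptions made for illustration; the abstract states only that stall times can be predetermined, can vary by host, and can be adjusted according to actual retrieval times.

```python
import heapq
from collections import deque
from urllib.parse import urlsplit


class HostScheduler:
    """Hypothetical sketch: group URLs by host, pick the host whose stall time is earliest."""

    def __init__(self, default_delay=1.0, load_factor=10.0):
        self.default_delay = default_delay  # assumed fixed minimum delay, in seconds
        self.load_factor = load_factor      # assumed multiplier applied to retrieval time
        self.queues = {}                    # host -> deque of pending URLs
        self.stall = {}                     # host -> earliest time it may be crawled again
        self.heap = []                      # (stall_time, host), ordered by stall time

    def add_url(self, url, now=0.0):
        """Queue a URL under its host; a newly seen host becomes eligible immediately."""
        host = urlsplit(url).hostname
        if host not in self.queues:
            self.queues[host] = deque()
            eligible = max(now, self.stall.get(host, now))
            heapq.heappush(self.heap, (eligible, host))
        self.queues[host].append(url)

    def next_url(self, now):
        """Return (host, url) for the earliest-stalled host, or None if no host is due yet."""
        if not self.heap or self.heap[0][0] > now:
            return None
        _, host = heapq.heappop(self.heap)
        return host, self.queues[host].popleft()

    def report(self, host, now, retrieval_time):
        """After a fetch completes, set the host's new stall time from the observed retrieval time."""
        delay = max(self.default_delay, self.load_factor * retrieval_time)
        self.stall[host] = now + delay
        if self.queues[host]:
            heapq.heappush(self.heap, (self.stall[host], host))
        else:
            del self.queues[host]  # nothing left to crawl; stall time is still remembered


# Usage with an explicit clock (times in seconds) so the behaviour is deterministic.
sched = HostScheduler(default_delay=1.0, load_factor=10.0)
sched.add_url("http://a.example/1")
sched.add_url("http://a.example/2")
sched.add_url("http://b.example/1")

first = sched.next_url(now=0.0)                         # a.example is served first
sched.report("a.example", now=0.5, retrieval_time=0.25) # a.example stalls until 3.0
second = sched.next_url(now=0.5)                        # b.example is still due
sched.report("b.example", now=0.75, retrieval_time=0.0)
too_early = sched.next_url(now=1.0)                     # a.example is not due yet
third = sched.next_url(now=3.0)                         # a.example is due again
```

A min-heap keyed on stall time keeps "which host next" at logarithmic cost per operation. In this sketch, a slow response lengthens the wait before the same host is contacted again, which is one plausible reading of adjusting stall times according to actual retrieval times.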
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.