Collaborative team crawling: Large scale information gathering over the internet
US6182085A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | May 28, 1998 |
| Grant date | Jan 30, 2001 |
| Priority date | — |
| Expiry date | May 28, 2018 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99948
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A distributed collection of web crawlers gathers information over a large portion of cyberspace. The crawlers divide the overall crawling work through a cyberspace partition scheme, and they collaborate through load balancing to make maximal use of each crawler's computing resources. The invention takes advantage of the hierarchical nature of the cyberspace namespace and uses the syntactic components of the URL structure as the main vehicle for dividing and assigning crawling workload to individual crawlers. The partition scheme is completely distributed: each crawler makes its partitioning decisions based on its own crawling status and a globally replicated partition tree data structure.
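The abstract does not spell out how the partition tree is laid out, so the following is only a minimal sketch of one way a URL-keyed partition tree could work, not the patented method. It assumes that a URL is split into reversed host labels followed by path segments, and that a URL belongs to the crawler assigned at the deepest matching node. The names `url_components`, `PartitionTree`, `assign`, `owner_of`, and the crawler identifiers are all hypothetical.

```python
from urllib.parse import urlsplit


def url_components(url):
    """Split a URL into hierarchical components, most general first.

    Assumed layout: reversed host labels, then path segments. Path
    segments keep a leading "/" so they never collide with host labels.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    labels = [label for label in reversed(host.split(".")) if label]
    segments = ["/" + seg for seg in parts.path.split("/") if seg]
    return labels + segments


class PartitionTree:
    """Hypothetical partition tree: each node is one URL component,
    and a node may name the crawler that owns its subtree."""

    def __init__(self, owner=None):
        self.owner = owner
        self.children = {}

    def assign(self, prefix, owner):
        """Give `owner` the part of the namespace under `prefix`."""
        node = self
        for comp in url_components(prefix):
            node = node.children.setdefault(comp, PartitionTree())
        node.owner = owner

    def owner_of(self, url):
        """Return the owner at the deepest assigned node on the URL's path."""
        node, owner = self, self.owner
        for comp in url_components(url):
            node = node.children.get(comp)
            if node is None:
                break
            if node.owner is not None:
                owner = node.owner
        return owner


# The root owner acts as the default for any unassigned part of the namespace.
tree = PartitionTree("crawler-0")
tree.assign("http://example.com", "crawler-1")
# An overloaded crawler might hand off a busy subtree to a peer (assumption).
tree.assign("http://example.com/docs", "crawler-2")
```

With this layout, `tree.owner_of("http://example.com/docs/a.html")` resolves to the crawler assigned to the `/docs` subtree, while other pages on the same host stay with the crawler assigned to the host. In a replicated setting each crawler would hold its own copy of the tree and apply the same `assign` updates; how those updates are propagated is not described in the abstract and is left out here.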
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.