Network repository service directory for efficient web crawling
US6418452B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Nov 3, 1999 |
| Grant date | Jul 9, 2002 |
| Priority date | — |
| Expiry date | Nov 3, 2019 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99953
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A master repository service maintains a directory of web servers and the most recent times that their web contents were modified, and provides this information to web crawlers to increase their efficiency. The master repository service receives web content update reports from a plurality of web servers, updates the directory to keep it current, and provides crawlers with web site modification information. The web site modification information preferably comprises identifiers for new web sites, “dead” web sites, and modified web sites. Each crawler is preferably provided only with web site modification information received since it last received information from the master repository service. The information allows web crawlers to know immediately about new web sites, and allows them to spend time visiting only those web sites that are new or that have changed their content.
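The flow described in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class and method names (`MasterRepository`, `report_update`, `report_dead`, `changes_for`), the use of plain numeric timestamps, and the rule that a site reappearing after being marked dead counts as new are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SiteRecord:
    """Directory entry for one web site (hypothetical layout)."""
    first_seen: float      # time the site was first reported
    last_modified: float   # time of the most recent report about the site
    dead: bool = False     # True once the site has been reported as gone


class MasterRepository:
    """Toy directory of web sites and their most recent modification times."""

    def __init__(self) -> None:
        self.directory: Dict[str, SiteRecord] = {}
        # Crawler id -> time of that crawler's previous query.
        self.last_query: Dict[str, float] = {}

    def report_update(self, site: str, now: float) -> None:
        """Called on behalf of a web server when its content changes."""
        rec = self.directory.get(site)
        if rec is None or rec.dead:
            # Assumption: an unknown or previously dead site is treated as new.
            self.directory[site] = SiteRecord(first_seen=now, last_modified=now)
        else:
            rec.last_modified = now

    def report_dead(self, site: str, now: float) -> None:
        """Mark a known site as no longer available."""
        rec = self.directory.get(site)
        if rec is not None:
            rec.dead = True
            rec.last_modified = now

    def changes_for(self, crawler: str, now: float) -> Dict[str, List[str]]:
        """Return only the changes recorded since this crawler last asked."""
        since = self.last_query.get(crawler, float("-inf"))
        changes: Dict[str, List[str]] = {"new": [], "dead": [], "modified": []}
        for site in sorted(self.directory):
            rec = self.directory[site]
            if rec.last_modified <= since:
                continue  # nothing happened since the previous query
            if rec.dead:
                changes["dead"].append(site)
            elif rec.first_seen > since:
                changes["new"].append(site)
            else:
                changes["modified"].append(site)
        self.last_query[crawler] = now
        return changes


repo = MasterRepository()
repo.report_update("a.example", 1.0)
repo.report_update("b.example", 2.0)
first = repo.changes_for("crawler-1", 3.0)

repo.report_update("a.example", 4.0)
repo.report_dead("b.example", 5.0)
repo.report_update("c.example", 6.0)
second = repo.changes_for("crawler-1", 7.0)
```

In this sketch the first query reports both sites as new, while the second reports only what changed after time 3.0: one new site, one dead site, and one modified site. Passing timestamps explicitly keeps the example deterministic; a real service would presumably take them from a clock or from the servers' reports.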
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.