Patent · US · Expired

Network repository service directory for efficient web crawling

US6418452B1 · kind B1 · utility

Cited by: 60
References: 11
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 3, 1999
Grant date: Jul 9, 2002
Priority date
Expiry date: Nov 3, 2019

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99953
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A master repository service maintains a directory of web servers and the most recent times that their web contents were modified, and provides this information to web crawlers to increase their efficiency. The master repository service receives web content update reports from a plurality of web servers, updates the directory to keep it current, and provides crawlers with web site modification information. The web site modification information preferably comprises identifiers for new web sites, “dead” web sites, and modified web sites. Each crawler is preferably provided only with web site modification information received since it last received information from the master repository service. The information allows web crawlers to know immediately about new web sites, and allows them to spend time visiting only those web sites that are new or that have changed their content.
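
The flow the abstract describes (servers report updates, the service keeps a directory current, and each crawler receives only what has arrived since its last request) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `MasterRepository`, `report_update`, `report_dead`, and `changes_for` are hypothetical, as is the use of an append-only log with a per-crawler cursor to track "since last received". The abstract does not say how a dead site is detected, so `report_dead` is simply assumed to be called by some outside mechanism.

```python
from dataclasses import dataclass, field
from enum import Enum


class Change(Enum):
    NEW = "new"
    MODIFIED = "modified"
    DEAD = "dead"


@dataclass
class MasterRepository:
    """Hypothetical in-memory model of the directory in the abstract."""

    # site identifier -> most recent reported modification time
    directory: dict = field(default_factory=dict)
    # append-only log of (site, change kind), in order of receipt
    log: list = field(default_factory=list)
    # crawler id -> number of log entries already delivered to it
    cursors: dict = field(default_factory=dict)

    def report_update(self, site, modified_at):
        """Record a content update report from a web server."""
        kind = Change.MODIFIED if site in self.directory else Change.NEW
        self.directory[site] = modified_at
        self.log.append((site, kind))

    def report_dead(self, site):
        """Drop a site from the directory (detection method is assumed)."""
        if site in self.directory:
            del self.directory[site]
            self.log.append((site, Change.DEAD))

    def changes_for(self, crawler_id):
        """Return only the entries received since this crawler last asked."""
        seen = self.cursors.get(crawler_id, 0)
        self.cursors[crawler_id] = len(self.log)
        return self.log[seen:]


# Usage: two servers report, then a crawler polls twice.
repo = MasterRepository()
repo.report_update("a.example", 100)
repo.report_update("b.example", 200)
first_poll = repo.changes_for("crawler-1")   # both sites, reported as new
second_poll = repo.changes_for("crawler-1")  # nothing has changed since
```

A crawler that polls for the first time receives the whole log, while a returning crawler receives only the tail after its cursor. That is one simple way to realize the "only information received since it last received information" behavior. A real service would also need persistence and log compaction, which this sketch omits.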

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.