Patent · US Expired

Method of web crawling utilizing address mapping

US6145003A · kind A · utility

Cited by: 148
References: 14
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 17, 1997
Grant date: Nov 7, 2000
Priority date
Expiry date: Dec 17, 2017

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99933
  • WIPO field: Digital communication
  • WIPO sector: Electrical engineering

Abstract

A computer-based system and method of retrieving information pertaining to Web documents on a computer network is disclosed. The method includes maintaining an address map that associates primary addresses with secondary addresses. A primary address includes a network retrieval protocol and a network address. The secondary address may include a different retrieval protocol or a different network address from the primary document address. A Web crawler retrieves a Web document using the primary document address, and determines whether the address map contains a secondary document address prefix corresponding to the primary document address prefix. If a secondary document address prefix exists, the Web crawler creates a secondary address, retrieves additional information pertaining to the Web document, and combines the additional information with the data retrieved from the Web document. The combined data may be stored in an index, and subsequently used to perform a document search.
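
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `map_address`, `crawl_one`, and `IndexEntry`, the longest-prefix matching rule, the prefix-substitution way of building the secondary address, and the example addresses are all assumptions made for illustration. The `fetch` callable stands in for whatever retrieval mechanism a real crawler would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class IndexEntry:
    """Combined record for one document (hypothetical structure)."""
    primary_address: str
    content: str
    extra: Optional[str] = None  # additional information, if a mapping exists


def map_address(primary: str, address_map: Dict[str, str]) -> Optional[str]:
    """Build a secondary address by swapping a matching primary prefix.

    Assumption: when several prefixes match, the longest one wins.
    Returns None when no primary prefix in the map matches.
    """
    matches = [prefix for prefix in address_map if primary.startswith(prefix)]
    if not matches:
        return None
    prefix = max(matches, key=len)
    return address_map[prefix] + primary[len(prefix):]


def crawl_one(
    primary: str,
    address_map: Dict[str, str],
    fetch: Callable[[str], str],
    index: Dict[str, IndexEntry],
) -> IndexEntry:
    """Retrieve a document, add secondary-address data if mapped, and index it."""
    content = fetch(primary)
    secondary = map_address(primary, address_map)
    extra = fetch(secondary) if secondary is not None else None
    entry = IndexEntry(primary, content, extra)
    index[primary] = entry
    return entry


# Usage with a stubbed fetch; the addresses below are made up.
address_map = {"http://www.example.com/docs/": "file://server/share/docs/"}
store = {
    "http://www.example.com/docs/a.html": "<html>doc A</html>",
    "file://server/share/docs/a.html": "owner=alice",
    "http://other.example.org/b.html": "<html>doc B</html>",
}
index: Dict[str, IndexEntry] = {}
entry_a = crawl_one("http://www.example.com/docs/a.html", address_map, store.__getitem__, index)
entry_b = crawl_one("http://other.example.org/b.html", address_map, store.__getitem__, index)
```

In this sketch the secondary address differs from the primary one in both retrieval protocol and network address, which the abstract permits. A document whose address matches no prefix in the map is indexed from its primary content alone.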

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.