Patent · US Active

Systems and methods for client-based web crawling

US7685296B2 · kind B2 · utility

Cited by: 33
References: 5
Claims: 104
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 25, 2003
Grant date: Mar 23, 2010
Priority date
Expiry date: Jun 29, 2028

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/9538
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present invention provides systems and methods for obtaining information from a networked system utilizing a distributed web crawler. The distributed nature of clients of a server is leveraged to provide fast and accurate web crawling data. Information gathered by a server's web crawler is compared to data retrieved by clients of the server to update the crawler's data. In one instance of the present invention, data comparison is achieved by utilizing information disseminated via a search engine results page. In another instance of the present invention, data validation is accomplished by client dictionaries, emanating from a server, that summarize web crawler data. The present invention also facilitates data analysis by providing a means to resist spoofing of a web crawler to increase data accuracy.
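The dictionary-based validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names (digest, build_client_dictionary, find_stale_urls), the use of a truncated SHA-256 fingerprint, and the in-memory URL-to-content maps are all assumptions made for illustration. The idea shown is only that the server summarizes its crawler data into a compact dictionary, and a client compares pages it has actually retrieved against that summary to flag URLs the crawler should revisit.

```python
from __future__ import annotations

import hashlib


def digest(content: str) -> str:
    """Return a short content fingerprint (assumed: truncated SHA-256)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]


def build_client_dictionary(crawl_index: dict[str, str]) -> dict[str, str]:
    """Server side (hypothetical): summarize crawler data as URL -> fingerprint."""
    return {url: digest(body) for url, body in crawl_index.items()}


def find_stale_urls(dictionary: dict[str, str],
                    client_pages: dict[str, str]) -> list[str]:
    """Client side (hypothetical): list URLs whose content the client saw
    differs from, or is missing in, the server's summary."""
    stale = []
    for url, body in client_pages.items():
        known = dictionary.get(url)
        if known is None or known != digest(body):
            stale.append(url)
    return sorted(stale)


# Example data: what the crawler stored versus what a client retrieved.
crawl_index = {
    "http://a.example/": "old content",
    "http://b.example/": "unchanged content",
}
client_pages = {
    "http://a.example/": "new content",        # changed since the crawl
    "http://b.example/": "unchanged content",  # matches the crawl
    "http://c.example/": "never crawled",      # unknown to the crawler
}

dictionary = build_client_dictionary(crawl_index)
stale_urls = find_stale_urls(dictionary, client_pages)
```

In this sketch a mismatch only tells the server that a URL deserves a fresh look. Under the same assumptions, a page that looks different to ordinary clients than it did to the crawler would also show up as a mismatch, which is one way the client comparison could help expose crawler spoofing.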

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.