Patent · US Expired

Method for learning character patterns to interactively control the scope of a web crawler

US6411952B1 · kind B1 · utility

Cited by: 112
References: 11
Claims: 66
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 24, 1998
Grant date: Jun 25, 2002
Priority date
Expiry date: Jun 24, 2018

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99935
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method controls a Web search for server computer resources by an end-user Web crawler. Each resource, such as a Web page, is located by a resource address specified as a character string. The end-user defines a scope for an initial Web search by settings. The settings are used to search the Web for resources limited by the scope. The set of resources located during the search is rendered on an output device, and positive and negative examples are selected from the set of resources to infer a rule. The rule is displayed, as well as the subset of resources that match the rule. The selecting, inferring, and rendering steps are repeated while searching until a final rule is obtained. The final rule matches resources that the crawler should process and does not match resources that it should avoid.
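
The select, infer, and render loop described in the abstract can be illustrated with a small sketch. The abstract does not say how the rule is inferred from the examples, so the code below substitutes a deliberately simple stand-in: it takes the longest common prefix of the positive example addresses and rejects that prefix if it also matches a negative example. The function names (`infer_prefix_rule`, `matches`) and the example.com addresses are hypothetical and are not taken from the patent.

```python
import os


def infer_prefix_rule(positives, negatives):
    """Infer a prefix rule from example resource addresses.

    Assumption for illustration only: the rule is the longest common
    character prefix of the positive examples. Returns None when there
    are no positives or when the prefix also matches a negative example.
    """
    if not positives:
        return None
    # os.path.commonprefix compares strings character by character.
    prefix = os.path.commonprefix(positives)
    if any(url.startswith(prefix) for url in negatives):
        return None
    return prefix


def matches(rule, url):
    """Return True if the address falls inside the scope given by the rule."""
    return rule is not None and url.startswith(rule)


# Resources located during an initial search (hypothetical addresses).
found = [
    "http://example.com/docs/intro.html",
    "http://example.com/docs/api.html",
    "http://example.com/cgi-bin/search?q=a",
    "http://example.com/docs/faq.html",
]

# The end-user marks some resources as positive and negative examples.
positives = [found[0], found[1]]
negatives = [found[2]]

# Infer the rule, then render it with the subset of resources it matches.
rule = infer_prefix_rule(positives, negatives)
subset = [url for url in found if matches(rule, url)]
print("rule:", rule)
for url in subset:
    print("  match:", url)
```

In an interactive setting the user would inspect the displayed rule and matching subset, adjust the positive and negative examples, and repeat until the rule covers what the crawler should process and excludes what it should avoid. A prefix is only one possible pattern language; the patent's claims may cover richer character patterns.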

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.