Method for learning character patterns to interactively control the scope of a web crawler
US6411952B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 24, 1998 |
| Grant date | Jun 25, 2002 |
| Priority date | — |
| Expiry date | Jun 24, 2018 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99935
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method controls a Web search for server computer resources by an end-user Web crawler. Each resource, such as a Web page, is located by a resource address specified as a character string. The end-user defines a scope for an initial Web search by settings. The settings are used to search the Web for resources limited by the scope. The set of resources located during the search is rendered on an output device, and positive and negative examples are selected from the set of resources to infer a rule. The rule is displayed, as well as the subset of resources that match the rule. The selecting, inferring, and rendering steps are repeated while searching until a final rule is obtained. The rule matches resources that the crawler should process and does not match resources that it should avoid.
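The select, infer, and render loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the rule is inferred, so this sketch swaps in a deliberately simple stand-in: the longest common prefix of the positive example addresses, rejected if it also matches a negative example. The function names (`infer_prefix_rule`, `matching_subset`) and the `example.com` addresses are hypothetical and do not come from the patent.

```python
from os.path import commonprefix


def infer_prefix_rule(positives, negatives):
    """Return the longest common prefix of the positive addresses,
    or None if no usable prefix exists or it also matches a negative.

    Assumption: a plain string prefix stands in for the patent's
    character-pattern rule, which the abstract does not specify.
    """
    if not positives:
        return None
    prefix = commonprefix(list(positives))  # character-wise common prefix
    if not prefix:
        return None
    if any(url.startswith(prefix) for url in negatives):
        return None  # rule too broad: the user must refine the examples
    return prefix


def matching_subset(rule, resources):
    """Return the resources whose address matches the rule."""
    return [url for url in resources if url.startswith(rule)]


# Hypothetical set of resources located by an initial scoped search.
found = [
    "http://example.com/docs/intro.html",
    "http://example.com/docs/api/index.html",
    "http://example.com/cgi-bin/search?q=a",
    "http://example.com/images/logo.gif",
]

# The end-user marks examples to process (positive) and avoid (negative).
positives = [found[0], found[1]]
negatives = [found[2]]

rule = infer_prefix_rule(positives, negatives)
subset = matching_subset(rule, found)
```

In an interactive setting, the rule and its matching subset would be shown to the user, who would then adjust the examples and repeat the step until the rule covers what the crawler should process and nothing it should avoid. A `None` result here signals that the current examples cannot be separated by a single prefix.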
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.