Techniques for crawling dynamic web content
US8024384B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 13, 2009 |
| Grant date | Sep 20, 2011 |
| Priority date | — |
| Expiry date | Mar 12, 2030 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99953
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An automated form filler and script executor is integrated with a web browser engine, which is communicatively coupled to a web crawler, thereby enabling the crawler to identify dynamic web content based on submission of forms completed by the form filler. The crawler is capable of identifying web pages containing forms that require submission, and JavaScript code that requires execution, respectively, for requesting dynamic web content from a server. The form filler systematically completes the form based on various combinations of search parameter values provided by the web page for requesting dynamic content. Duplicate forms are detected, so that the crawler does not unnecessarily re-process forms that are similar to forms that have already been processed. The crawler may also determine which JavaScript links in a page are relevant for execution, so as to avoid unnecessary execution of irrelevant JavaScript links.
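As a rough illustration of two ideas in the abstract (completing a form with every combination of the values a page offers, and skipping forms that resemble ones already processed), the following Python sketch may help. It is not the patented method: the form representation, the function names `form_signature`, `enumerate_submissions`, and `plan_submissions`, and the similarity rule (same action URL and same set of field names) are all assumptions made for this example.

```python
from itertools import product
from typing import Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical form model: an action URL plus, for each field name,
# the list of values the page offers (e.g. <select> options).
Fields = Dict[str, List[str]]
Signature = Tuple[str, Tuple[str, ...]]


def form_signature(action: str, fields: Fields) -> Signature:
    """Build a comparison key for a form.

    Assumption: two forms count as "similar" when they share an action
    URL and the same set of field names. The abstract does not say how
    similarity is actually decided.
    """
    return (action, tuple(sorted(fields)))


def enumerate_submissions(fields: Fields) -> Iterator[Dict[str, str]]:
    """Yield one parameter mapping per combination of offered values."""
    names = sorted(fields)
    for values in product(*(fields[name] for name in names)):
        yield dict(zip(names, values))


def plan_submissions(
    forms: List[Tuple[str, Fields]],
    seen: Optional[Set[Signature]] = None,
) -> List[Tuple[str, Dict[str, str]]]:
    """Return (action, params) pairs to submit, skipping duplicate forms."""
    seen = set() if seen is None else seen
    planned: List[Tuple[str, Dict[str, str]]] = []
    for action, fields in forms:
        signature = form_signature(action, fields)
        if signature in seen:
            continue  # a similar form was already processed
        seen.add(signature)
        for params in enumerate_submissions(fields):
            planned.append((action, params))
    return planned
```

For a form with two fields offering two values each, `plan_submissions` would plan four submissions; a second form with the same action URL and field names would be skipped. The number of combinations grows multiplicatively with the number of fields, so a real crawler would likely need to cap or sample them.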
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.