Patent · US Active

Techniques for crawling dynamic web content

US8024384B2 · kind B2 · utility

Cited by: 185
References: 6
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 13, 2009
Grant date: Sep 20, 2011
Priority date
Expiry date: Mar 12, 2030

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99953
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An automated form filler and script executor is integrated with a web browser engine, which is communicatively coupled to a web crawler, thereby enabling the crawler to identify dynamic web content based on submission of forms completed by the form filler. The crawler is capable of identifying web pages containing forms that require submission, and JavaScript code that requires execution, respectively, for requesting dynamic web content from a server. The form filler systematically completes the form based on various combinations of search parameter values provided by the web page for requesting dynamic content. Duplicate forms are detected, so that the crawler does not unnecessarily re-process forms that are similar to forms that have already been processed. The crawler may also determine which JavaScript links in a page are relevant for execution, so as to avoid unnecessary execution of irrelevant JavaScript links.
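Two of the ideas in the abstract, systematically filling a form from the value combinations the page itself offers and skipping forms that have already been processed, can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `form_signature`, `enumerate_submissions`, and `FormCrawler` are hypothetical, a form is assumed to be reducible to an action URL, a method, and a mapping from field name to the option values listed on the page, and "similar" is approximated here by an exact match on that structure regardless of field or option order.

```python
import hashlib
from itertools import product


def form_signature(action, method, fields):
    """Return a stable fingerprint for a form, ignoring field and option order.

    Assumption: two forms count as duplicates when they share the same action,
    method, field names, and option values. The abstract only says "similar".
    """
    parts = [action, method.upper()]
    for name in sorted(fields):
        parts.append(name + "=" + ",".join(sorted(fields[name])))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def enumerate_submissions(fields):
    """Yield one dict per combination of the option values the page offers."""
    names = sorted(fields)
    for values in product(*(fields[n] for n in names)):
        yield dict(zip(names, values))


class FormCrawler:
    """Hypothetical helper that tracks which forms were already processed."""

    def __init__(self):
        self.seen = set()

    def process(self, action, method, fields):
        """Return all submissions for a new form, or [] for a duplicate."""
        sig = form_signature(action, method, fields)
        if sig in self.seen:
            return []  # already processed, do not re-submit
        self.seen.add(sig)
        return list(enumerate_submissions(fields))


if __name__ == "__main__":
    crawler = FormCrawler()
    # Hypothetical search form with two drop-down fields.
    fields = {"make": ["audi", "bmw"], "year": ["2008", "2009"]}
    for submission in crawler.process("/search", "get", fields):
        print(submission)
```

The number of submissions grows as the product of the option counts per field, so a real crawler would likely cap or sample the combinations. That limit is left out of the sketch.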

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.