Patent · US Expired

Systems and methods for identifying and extracting data from HTML pages

US6920609B1 · kind B1 · utility

Cited by: 85
References: 9
Claims: 27
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 24, 2000
Grant date: Jul 19, 2005
Priority date
Expiry date: Mar 29, 2022

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99936
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods for analyzing HTML formatted web pages to automatically identify and extract desired information. A computer algorithm identifies and extracts different pieces of information from different web pages automatically after minimal manual setup. The algorithm automatically analyzes pages with different content if they have the same, or similar, formats.
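
The general idea in the abstract (a small manual setup on one page, then automatic extraction from other pages that share its format) can be illustrated with a minimal sketch. This is not the patented method, whose claims are not reproduced here; it is one simple technique, structural path matching, chosen for illustration. The names PathIndexer, index_page, learn_path, and extract are hypothetical, and the sketch assumes well-formed HTML in which same-format pages keep the desired value at the same position in the tag tree.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they are skipped when building paths.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PathIndexer(HTMLParser):
    """Maps each text node to a structural path such as 'html[0]/body[0]/h1[0]'."""

    def __init__(self):
        super().__init__()
        self.stack = []            # path segments of the currently open elements
        self.child_counts = [{}]   # per-depth counters of child tags seen so far
        self.texts = {}            # path -> first non-empty text found there

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        counts = self.child_counts[-1]
        index = counts.get(tag, 0)          # position among same-named siblings
        counts[tag] = index + 1
        self.stack.append(f"{tag}[{index}]")
        self.child_counts.append({})

    def handle_endtag(self, tag):
        # Assumption: input is well-formed, so each end tag closes the top element.
        if tag in VOID_TAGS or not self.stack:
            return
        self.stack.pop()
        self.child_counts.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.setdefault("/".join(self.stack), text)


def index_page(html):
    """Return a dict of structural path -> text for one page."""
    parser = PathIndexer()
    parser.feed(html)
    parser.close()
    return parser.texts


def learn_path(sample_html, sample_value):
    """Manual setup step: locate a known value on a sample page."""
    for path, text in index_page(sample_html).items():
        if text == sample_value:
            return path
    raise ValueError(f"{sample_value!r} not found in sample page")


def extract(html, path):
    """Automatic step: read whatever text sits at the learned path."""
    return index_page(html).get(path)


sample = ("<html><body><h1>Widget</h1><table><tr>"
          "<td>Price</td><td>$10</td></tr></table></body></html>")
other = ("<html><body><h1>Gadget</h1><table><tr>"
         "<td>Price</td><td>$25</td></tr></table></body></html>")

price_path = learn_path(sample, "$10")
print(price_path)
print(extract(other, price_path))
```

Here the setup consists of pointing at one known value ("$10") on a sample page; the learned path is then reused on a second page with different content but the same layout. A page whose layout differs even slightly would return None under this exact-match scheme, so handling "similar" formats would need a more tolerant comparison than this sketch provides.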

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.