Patent · US Active

Selective content extraction

US9032285B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 9
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 30, 2009
Grant date: May 12, 2015
Priority date
Expiry date: Jun 30, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/9577
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for extracting web content includes detecting, within a web page, a hierarchical structure that includes a plurality of nodes. Potential article nodes from the plurality of nodes are identified. The identified potential article node with a highest rank in the hierarchical structure is identified as an article node. Content is extracted from the article node.
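The four steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, purely for illustration, that a "potential article node" is an element whose direct <p> children hold at least min_chars characters of text, and that "highest rank" means the candidate closest to the root of the tree. The names Node, TreeBuilder, paragraph_chars, and extract_article are hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}
SKIPPED_TAGS = {"script", "style"}


class Node:
    """One node of the page hierarchy; text is stored as '#text' nodes."""

    def __init__(self, tag, parent=None, attrs=None, text=""):
        self.tag = tag
        self.parent = parent
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []
        self.depth = 0 if parent is None else parent.depth + 1


class TreeBuilder(HTMLParser):
    """Step 1: detect the hierarchical structure of the page."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current, attrs)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore strays.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            text_node = Node("#text", self.current, text=data.strip())
            self.current.children.append(text_node)


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def text_of(node):
    """Concatenate the text of a subtree in document order."""
    if node.tag == "#text":
        return node.text
    if node.tag in SKIPPED_TAGS:
        return ""
    parts = (text_of(child) for child in node.children)
    return " ".join(part for part in parts if part)


def paragraph_chars(node):
    """Assumed heuristic: amount of text in direct <p> children."""
    return sum(len(text_of(c)) for c in node.children if c.tag == "p")


def extract_article(html, min_chars=40):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    # Step 2: identify potential article nodes.
    candidates = [n for n in walk(builder.root)
                  if paragraph_chars(n) >= min_chars]
    if not candidates:
        return None, ""
    # Step 3: pick the candidate highest in the hierarchy (smallest depth);
    # ties go to the candidate with more paragraph text.
    article = min(candidates, key=lambda n: (n.depth, -paragraph_chars(n)))
    # Step 4: extract content from the article node.
    return article, text_of(article)
```

Calling extract_article(page_html) returns the chosen node together with its text, or (None, "") when no element passes the threshold. Both the threshold and the depth-based ranking are stand-ins; the claims of the patent, not this sketch, define how candidates are actually identified and ranked.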

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.