Selective content extraction
US9032285B2 · kind B2 · utility
Cited by: 0
References: 5
Claims: 9
Family size: 0
Assignee
Inventors
Key dates
| Filing date | Jun 30, 2009 |
| Grant date | May 12, 2015 |
| Priority date | — |
| Expiry date | Jun 30, 2029 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/9577
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for extracting web content includes detecting, within a web page, a hierarchical structure that includes a plurality of nodes. Potential article nodes from the plurality of nodes are identified. The identified potential article node with a highest rank in the hierarchical structure is identified as an article node. Content is extracted from the article node.
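The four steps in the abstract (detect the hierarchy, identify potential article nodes, pick the highest-ranked one, extract its content) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the hierarchy is taken to be the HTML element tree, a "potential article node" is taken to be a container element with at least two direct `<p>` children, and "highest rank" is read as closest to the root. The names `Node`, `TreeBuilder`, `is_candidate`, `extract_article`, and `PAGE` are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: container tags that may wrap an article (the abstract names none).
CANDIDATE_TAGS = {"article", "main", "section", "div"}
# Void elements have no closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element of the page hierarchy."""

    def __init__(self, tag, parent=None, attrs=None):
        self.tag = tag
        self.parent = parent
        self.attrs = dict(attrs or [])
        self.depth = 0 if parent is None else parent.depth + 1
        self.children = []  # mix of Node and str, in document order

    def text(self):
        """Concatenate all text beneath this node, in document order."""
        parts = [c.text() if isinstance(c, Node) else c for c in self.children]
        return " ".join(p for p in parts if p)

    def walk(self):
        """Yield this node and all descendants, depth-first."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                yield from child.walk()


class TreeBuilder(HTMLParser):
    """Step 1: detect the hierarchical structure of the page."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current, attrs)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the matching open element; stray end tags are ignored.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        data = data.strip()
        if data:
            self.current.children.append(data)


def is_candidate(node, min_paragraphs):
    """Step 2 (assumed heuristic): a container with enough direct <p> children."""
    direct_p = sum(
        1 for c in node.children if isinstance(c, Node) and c.tag == "p"
    )
    return node.tag in CANDIDATE_TAGS and direct_p >= min_paragraphs


def extract_article(html, min_paragraphs=2):
    """Return the candidate closest to the root, or None if there is none."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    candidates = [
        n for n in builder.root.walk() if is_candidate(n, min_paragraphs)
    ]
    if not candidates:
        return None
    # Step 3: "highest rank" is assumed to mean smallest depth in the tree.
    return min(candidates, key=lambda n: n.depth)


PAGE = """
<html><body>
  <div id="wrapper">
    <nav><a href="/">Home</a></nav>
    <div id="story">
      <h1>Title</h1>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
      <div id="aside"><p>Note one.</p><p>Note two.</p></div>
    </div>
  </div>
</body></html>
"""
```

Calling `extract_article(PAGE)` returns the node for `<div id="story">`, and its `text()` method performs step 4, the content extraction. Both `story` and the nested `aside` qualify as candidates under the assumed heuristic, and the shallower one is selected. A production extractor would also need to skip `<script>` and `<style>` text and to handle implicitly closed tags, which this sketch does not attempt.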
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.