Discrete wavelet transform method for document structure similarity
US9405750B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 31, 2011 |
| Grant date | Aug 2, 2016 |
| Priority date | — |
| Expiry date | Mar 7, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/986
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Examples of the present disclosure may include methods, systems, and computer-readable media with executable instructions. An example method for determining document structure similarity can include segmenting path sequences (206) of Document Object Model (DOM) trees (120, 462) from a number of web pages (202) into B components (561). Path signals (210) corresponding to the path sequences (206) are determined based on a count of the occurrences of particular paths in the Bth component (571), and unique path signals (210) are transformed into discrete wavelet signals (214) (572). The discrete wavelet signals (214) are analyzed at multiple DOM tree resolution levels (573).
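The steps named in the abstract (segment into B components, count path occurrences per component, wavelet-transform each unique path signal, compare at several resolution levels) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name a wavelet family, a segmentation rule, or a distance measure, so the Haar wavelet, the equal-width contiguous segments, the power-of-two restriction on B, the Euclidean distance, and all function names below are assumptions made for illustration.

```python
from math import sqrt


def path_signal(paths, target, num_components):
    """Count occurrences of `target` in each of `num_components` contiguous
    segments of the path sequence (equal-width segmentation is an assumption)."""
    n = len(paths)
    signal = [0.0] * num_components
    for i, path in enumerate(paths):
        if path == target:
            component = min(i * num_components // n, num_components - 1)
            signal[component] += 1.0
    return signal


def haar_dwt(signal):
    """Full orthonormal Haar decomposition (Haar is an assumed wavelet choice).
    Returns [approximation, coarsest detail, ..., finest detail].
    The signal length must be a power of two."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / sqrt(2) for a, b in pairs])
        approx = [(a + b) / sqrt(2) for a, b in pairs]
    return [approx] + details


def multilevel_distances(paths_a, paths_b, num_components=8):
    """Euclidean distance between two pages' wavelet coefficients, reported
    separately per resolution level (coarse to fine) and accumulated over
    every unique path that occurs in either page."""
    if num_components < 1 or num_components & (num_components - 1):
        raise ValueError("num_components must be a power of two")
    unique_paths = sorted(set(paths_a) | set(paths_b))
    totals = [0.0] * num_components.bit_length()  # 1 approx + log2(B) details
    for path in unique_paths:
        coeffs_a = haar_dwt(path_signal(paths_a, path, num_components))
        coeffs_b = haar_dwt(path_signal(paths_b, path, num_components))
        for level, (ca, cb) in enumerate(zip(coeffs_a, coeffs_b)):
            totals[level] += sum((x - y) ** 2 for x, y in zip(ca, cb))
    return [sqrt(t) for t in totals]


if __name__ == "__main__":
    # Hypothetical root-to-node paths listed in document order.
    page_a = ["html/body/div", "html/body/div/p", "html/body/div/p", "html/body/ul/li"]
    page_b = ["html/body/div", "html/body/div/p", "html/body/ul/li", "html/body/ul/li"]
    print(multilevel_distances(page_a, page_b, num_components=4))
```

In this sketch, index 0 of the result compares the overall path counts, and later indices compare progressively finer positional detail. Two pages can therefore match at a coarse level while differing at a fine one.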
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.