Method for the logical segmentation of contents
US8311330B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 5, 2010 |
| Grant date | Nov 13, 2012 |
| Priority date | — |
| Expiry date | May 17, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/416
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A document to be segmented is converted into a common representation format, if necessary. Parsing of the document results in a document model that is analyzed based on at least one structure-dependent function to identify segments within the document. In one embodiment, the structure-dependent function may comprise a template, or a best-fit template of a plurality of templates, used for comparison with the document model. In other embodiments, the structure-dependent function may comprise table of contents information, font properties within the document model and/or an average segment size determined according to previously identified segments in one or more additional documents that are related to the document under consideration. Semantic-content dependent functions may be applied to further refine the analysis by identifying sub-segments within the extracted segments, or by identifying segments that may be properly merged according to the similarity of their respective semantic content.
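The two-stage idea in the abstract (a structure-dependent pass followed by a semantic-content refinement) can be illustrated with a minimal sketch. This is not the patented method. It assumes a document model that is just a flat list of text blocks with a font size, uses "font larger than body text" as the structure-dependent function, and swaps in a plain bag-of-words cosine similarity for the semantic comparison, which the abstract does not specify. All names here (`Block`, `segment_by_font`, `merge_similar`, `body_size`, `threshold`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Block:
    """One parsed unit of the (assumed) document model."""
    text: str
    font_size: float


def segment_by_font(blocks, body_size):
    """Structure-dependent pass: open a new segment at each block whose
    font is larger than the body text (treated here as a heading)."""
    segments = []
    for block in blocks:
        if block.font_size > body_size or not segments:
            segments.append([block])
        else:
            segments[-1].append(block)
    return segments


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings (an assumed
    stand-in for whatever semantic measure a real system would use)."""
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[w] * wb[w] for w in wa)
    norm = sqrt(sum(v * v for v in wa.values())) * sqrt(sum(v * v for v in wb.values()))
    return dot / norm if norm else 0.0


def segment_text(segment):
    """Concatenate the text of all blocks in a segment."""
    return " ".join(block.text for block in segment)


def merge_similar(segments, threshold):
    """Semantic pass: merge a segment into its predecessor when their
    text similarity reaches the threshold."""
    merged = []
    for segment in segments:
        if merged and cosine(segment_text(merged[-1]), segment_text(segment)) >= threshold:
            merged[-1] = merged[-1] + segment
        else:
            merged.append(segment)
    return merged
```

A caller would run `segment_by_font` on the parsed blocks and pass the result to `merge_similar`. The other structure-dependent functions the abstract mentions (template matching, table of contents information, average segment size from related documents) would replace or supplement the font test in the first pass.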
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.