Document heading detection
US10885282B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 7, 2018 |
| Grant date | Jan 5, 2021 |
| Priority date | — |
| Expiry date | Feb 5, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/211
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Document heading detection includes performing a classification on each of a plurality of paragraphs of a document to identify each paragraph as either a heading or non-heading paragraph. The classification is based on one or more pre-established values corresponding to one or more pre-established formatting features that are indicative of a heading paragraph relative to currently established values for each of the one or more pre-established formatting features in each of the plurality of paragraphs. Document heading detection further includes determining a strength of each of the one or more heading paragraphs by performing a linear regression on each heading paragraph and assigning each of the one or more heading paragraphs a heading level within a hierarchy of heading levels based on the determined strength.
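The three steps in the abstract (classify paragraphs, score heading strength, assign levels) can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Paragraph` fields, the threshold constants, and the regression weights are all hypothetical, chosen only for illustration. The "linear regression" step is interpreted here as evaluating an already-fitted linear model on each heading's formatting features, which is an assumption about the abstract's wording.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    font_size: float  # current value of the "font size" formatting feature
    bold: bool        # current value of the "bold" formatting feature
    word_count: int


# Hypothetical pre-established values indicative of a heading.
BODY_FONT_SIZE = 11.0
MAX_HEADING_WORDS = 12

# Hypothetical coefficients of a linear model; in practice these would be
# fitted from labeled documents rather than hard-coded.
WEIGHT_FONT_SIZE = 1.0
WEIGHT_BOLD = 2.0
INTERCEPT = 0.0


def is_heading(p: Paragraph) -> bool:
    """Classify a paragraph by comparing its features to the pre-established values."""
    looks_emphasized = p.bold or p.font_size > BODY_FONT_SIZE
    return p.word_count <= MAX_HEADING_WORDS and looks_emphasized


def strength(p: Paragraph) -> float:
    """Score a heading with a linear model over its formatting features."""
    return INTERCEPT + WEIGHT_FONT_SIZE * p.font_size + WEIGHT_BOLD * float(p.bold)


def assign_levels(paragraphs: list[Paragraph]) -> dict[int, int]:
    """Map each heading's index to a level; the strongest headings get level 1."""
    scored = [(i, strength(p)) for i, p in enumerate(paragraphs) if is_heading(p)]
    distinct = sorted({s for _, s in scored}, reverse=True)
    level_of = {s: rank + 1 for rank, s in enumerate(distinct)}
    return {i: level_of[s] for i, s in scored}


doc = [
    Paragraph("Introduction", 18.0, True, 1),
    Paragraph("Plain body text that runs on for a while.", 11.0, False, 40),
    Paragraph("Background", 14.0, True, 1),
    Paragraph("More plain body text.", 11.0, False, 30),
    Paragraph("Methods", 18.0, True, 1),
]
levels = assign_levels(doc)
print(levels)  # {0: 1, 2: 2, 4: 1}
```

Ranking distinct strength values keeps equally formatted headings at the same level, so "Introduction" and "Methods" both land at level 1 while "Background" falls to level 2. A real system would likely use more features (alignment, spacing, numbering) and learned coefficients.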
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.