Patent · US Active

Document heading detection

US10885282B2 · kind B2 · utility

Cited by: 8
References: 6
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 7, 2018
Grant date: Jan 5, 2021
Priority date:
Expiry date: Feb 5, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/211
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Document heading detection includes performing a classification on each of a plurality of paragraphs of a document to identify each paragraph as either a heading or non-heading paragraph. The classification is based on one or more pre-established values corresponding to one or more pre-established formatting features that are indicative of a heading paragraph relative to currently established values for each of the one or more pre-established formatting features in each of the plurality of paragraphs. Document heading detection further includes determining a strength of each of the one or more heading paragraphs by performing a linear regression on each heading paragraph and assigning each of the one or more heading paragraphs a heading level within a hierarchy of heading levels based on the determined strength.
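
The three steps in the abstract (classify each paragraph, score each heading's strength, map strength to a level) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not taken from the patent: the `Paragraph` fields, the threshold constants, and the `WEIGHTS` coefficients are hypothetical, and the strength function simply evaluates an already-fitted linear model rather than reproducing whatever regression the claims describe.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    # Hypothetical formatting features; the abstract does not enumerate them.
    text: str
    font_size: float
    bold: bool
    word_count: int


# Hypothetical "pre-established values" indicative of a heading.
MIN_HEADING_FONT_SIZE = 12.0
MAX_HEADING_WORDS = 12

# Hypothetical coefficients of a linear model (assumed to be fitted elsewhere).
WEIGHTS = {"font_size": 1.0, "bold": 2.0}
INTERCEPT = 0.0


def is_heading(p: Paragraph) -> bool:
    """Classify a paragraph by comparing its features to the preset values."""
    return (
        p.bold
        and p.font_size >= MIN_HEADING_FONT_SIZE
        and p.word_count <= MAX_HEADING_WORDS
    )


def strength(p: Paragraph) -> float:
    """Score a heading as a linear combination of its formatting features."""
    return (
        INTERCEPT
        + WEIGHTS["font_size"] * p.font_size
        + WEIGHTS["bold"] * float(p.bold)
    )


def assign_levels(paragraphs: list[Paragraph]) -> list[tuple[str, int]]:
    """Return (text, level) for each heading; the strongest headings get level 1."""
    headings = [p for p in paragraphs if is_heading(p)]
    # Each distinct strength value becomes one level in the hierarchy.
    distinct = sorted({strength(p) for p in headings}, reverse=True)
    level_of = {s: i + 1 for i, s in enumerate(distinct)}
    return [(p.text, level_of[strength(p)]) for p in headings]
```

In this sketch, headings with equal strength share a level, so two 18-point bold paragraphs would both land at level 1 while a 14-point bold paragraph would land at level 2. A real system would likely need to tolerate near-equal strengths rather than require exact matches.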

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.