Patent · US Active

Using visual features to identify document sections

US10565444B2 · kind B2 · utility

Cited by: 1
References: 3
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 7, 2017
Grant date: Feb 18, 2020
Priority date
Expiry date: Apr 24, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/245
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, computer system, and a computer program product for identifying sections in a document based on a plurality of visual features are provided. The present invention may include receiving a plurality of documents. The present invention may also include extracting a plurality of content blocks. The present invention may further include determining the plurality of visual features. The present invention may then include grouping the extracted plurality of content blocks into a plurality of categories. The present invention may also include generating a plurality of closeness scores for the plurality of categories by utilizing a Visual Similarity Measure. The present invention may further include generating a plurality of Association Matrices on the plurality of categories for each of the received plurality of documents based on the Visual Similarity Measure. The present invention may further include merging the plurality of categories into a plurality of clusters.
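
The steps named in the abstract (group content blocks into categories, score closeness between categories, build an association matrix, merge categories into clusters) can be illustrated with a rough Python sketch for a single document. This is not the patented method: the abstract does not define the Visual Similarity Measure, so the `visual_similarity` formula, the `Block` fields (`font_size`, `bold`, `indent`), the 0.9 merge threshold, and the union-find merge are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    """A content block with a few hypothetical visual features."""
    text: str
    font_size: float  # assumed positive
    bold: bool
    indent: float


def group_into_categories(blocks):
    """Group blocks that share identical visual features into one category."""
    categories = {}
    for b in blocks:
        key = (b.font_size, b.bold, b.indent)
        categories.setdefault(key, []).append(b)
    return categories


def visual_similarity(a, b):
    """Assumed closeness score in [0, 1] between two feature tuples."""
    size_a, bold_a, indent_a = a
    size_b, bold_b, indent_b = b
    size_sim = min(size_a, size_b) / max(size_a, size_b)
    bold_sim = 1.0 if bold_a == bold_b else 0.0
    indent_sim = 1.0 / (1.0 + abs(indent_a - indent_b))
    return (size_sim + bold_sim + indent_sim) / 3.0


def association_matrix(keys):
    """Symmetric matrix of pairwise closeness scores between categories."""
    n = len(keys)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = visual_similarity(keys[i], keys[j])
        matrix[i][j] = matrix[j][i] = score
    return matrix


def merge_categories(keys, matrix, threshold=0.9):
    """Merge categories whose closeness meets the (assumed) threshold."""
    parent = list(range(len(keys)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(keys)), 2):
        if matrix[i][j] >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, key in enumerate(keys):
        clusters.setdefault(find(i), []).append(key)
    return list(clusters.values())


# Usage on one made-up document: two headings, two slightly different body styles.
blocks = [
    Block("1. Introduction", 18.0, True, 0.0),
    Block("Body text.", 11.0, False, 0.0),
    Block("More body text.", 11.5, False, 0.0),
    Block("2. Method", 18.0, True, 0.0),
]
categories = group_into_categories(blocks)
keys = list(categories)
matrix = association_matrix(keys)
clusters = merge_categories(keys, matrix)
```

In this sketch the two heading blocks fall into one category, and the two near-identical body styles start as separate categories that the merge step joins into a single cluster. Handling a plurality of documents, as the abstract describes, would mean building one such matrix per document.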

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.