Patent · US Active

Removal of graphics from document images using heuristic text analysis and text recovery

US9355311B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Sep 23, 2014
Grant date: May 31, 2016
Priority date
Expiry date: Sep 23, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2207/30176
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A graphic removal process for document images involves two stages: first, removal of graphics in the document image based on heuristic text analyses; and second, text recovery to recover some text that is accidentally removed during the first stage. The first stage uses a relatively aggressive strategy to ensure that all graphics components are removed, which also temporarily leads to the removal of some text; the lost text is then recovered using the text recovery technique. The heuristic text analyses utilize the geometric properties of text characters and consider the properties of text characters in relation to their neighbors. The text recovery technique starts from the text that remains after the first stage, and recovers any connected component that is at least partially located within a pre-defined neighboring area around any of the text components in the intermediate document image.
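
The recovery rule in the last sentence of the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes each connected component is represented by its bounding box, approximates "at least partially located within" by bounding-box overlap, and treats the pre-defined neighboring area as a fixed margin around each retained text component. The names `Box`, `expand`, `overlaps`, `recover_text`, `margin_x`, and `margin_y` are hypothetical, and the abstract does not say whether recovery is repeated, so the sketch makes a single pass.

```python
from typing import List, Tuple

# Hypothetical representation: (left, top, right, bottom), inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


def expand(box: Box, margin_x: int, margin_y: int) -> Box:
    """Grow a box by a margin on every side (the assumed 'neighboring area')."""
    left, top, right, bottom = box
    return (left - margin_x, top - margin_y, right + margin_x, bottom + margin_y)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def recover_text(kept: List[Box], removed: List[Box],
                 margin_x: int, margin_y: int) -> List[Box]:
    """Return the removed components that touch the neighborhood of any kept text component."""
    neighborhoods = [expand(k, margin_x, margin_y) for k in kept]
    return [r for r in removed if any(overlaps(r, n) for n in neighborhoods)]


# Illustrative data only: one surviving character, one nearby removed character,
# and one distant removed component standing in for a graphic.
kept = [(10, 10, 20, 24)]
removed = [(23, 10, 30, 24), (200, 200, 400, 400)]
recovered = recover_text(kept, removed, margin_x=5, margin_y=2)
print(recovered)
```

With these illustrative values, the nearby component falls inside the expanded neighborhood and is recovered, while the distant one stays removed. Bounding boxes keep the sketch dependency-free; a pixel-level overlap test on the actual connected components would follow the abstract's wording more closely.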

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.