Patent · US Active

Precise identification of text pixels from scanned document images

US7873215B2 · kind B2 · utility

Cited by: 12
References: 28
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 27, 2007
Grant date: Jan 18, 2011
Priority date
Expiry date: Nov 17, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/10
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system or method for identifying text in a document. A group of connected components is created. A plurality of characteristics of different types is calculated for each connected component. Statistics are computed which describe the group of characteristics. Outlier components are identified as connected components whose computed characteristics are outside a statistical range. The outlier components are removed from the group of connected components. Text pixels are identified by segmenting pixels in the group of connected components into a group of text pixels and a group of background pixels.
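
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation; every specific choice here is an assumption made for illustration: the function names are hypothetical, the image is a binary grid, components are 4-connected, the "characteristics" are area, height, and width, and the "statistical range" is the mean plus or minus k standard deviations. The final segmentation is simplified to marking pixels of the surviving components as text and everything else as background.

```python
from collections import deque
from statistics import mean, pstdev


def connected_components(image):
    """Group 4-connected foreground (truthy) pixels; return a list of pixel lists."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def characteristics(pixels):
    """Assumed characteristics: pixel count plus bounding-box height and width."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return {
        "area": len(pixels),
        "height": max(ys) - min(ys) + 1,
        "width": max(xs) - min(xs) + 1,
    }


def remove_outliers(components, k=2.0):
    """Drop components with any characteristic outside mean +/- k * std (assumed range)."""
    feats = [characteristics(p) for p in components]
    keep = [True] * len(components)
    for name in ("area", "height", "width"):
        values = [f[name] for f in feats]
        mu, sigma = mean(values), pstdev(values)
        for i, v in enumerate(values):
            if abs(v - mu) > k * sigma:
                keep[i] = False
    return [p for p, ok in zip(components, keep) if ok]


def text_mask(image, k=2.0):
    """Return a boolean grid: True for text pixels, False for background."""
    kept = remove_outliers(connected_components(image), k)
    text = {pix for comp in kept for pix in comp}
    return [[(r, c) in text for c in range(len(image[0]))]
            for r in range(len(image))]
```

Calling `text_mask(image)` on a binary grid returns a same-sized grid of booleans. With mean and standard deviation as the statistics, a single very large component also inflates the standard deviation, so a robust measure such as the median might be preferable on small component groups; that choice is likewise an assumption and not something the abstract specifies.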

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.