Patent · US Active

Precise identification of text pixels from scanned document images

US7873215B2 · kind B2 · utility

12 Cited by

28 References

18 Claims

0 Family size

Assignee

SEIKO EPSON CORPORATION · JP

Inventors

Jing Xiao · Cupertino, US
Anoop K. Bhattacharjya · Campbell, US

Key dates

Filing date	Jun 27, 2007
Grant date	Jan 18, 2011
Priority date	—
Expiry date	Nov 17, 2029

Classification

Technology area (CPC G)	Physics
CPC primary	G06V30/10
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A system or method for identifying text in a document. A group of connected components is created. A plurality of characteristics of different types is calculated for each connected component. Statistics are computed which describe the group of characteristics. Outlier components are identified as connected components whose computed characteristics are outside a statistical range. The outlier components are removed from the group of connected components. Text pixels are identified by segmenting pixels in the group of connected components into a group of text pixels and a group of background pixels.
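
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which characteristics, statistics, or segmentation rule are used, so everything specific here is an assumption made for illustration. The sketch assumes dark text on a light 8-bit grayscale page, uses area, width, and height as the component characteristics, treats "outside a statistical range" as more than k standard deviations from the mean, and splits the surviving pixels at their mean intensity. All function names and the parameters ink_threshold and k are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def connected_components(mask):
    """Group 4-connected truthy cells of a 2D mask into lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def characteristics(pixels):
    """Assumed characteristics: pixel count, bounding-box width and height."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (len(pixels), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def remove_outliers(components, k=2.0):
    """Drop components with any characteristic beyond mean +/- k * std."""
    feats = [characteristics(p) for p in components]
    # One (mean, population std) pair per characteristic, over the whole group.
    stats = [(mean(col), pstdev(col)) for col in zip(*feats)]
    return [
        p for p, f in zip(components, feats)
        if all(abs(v - m) <= k * s for v, (m, s) in zip(f, stats))
    ]


def segment(image, components):
    """Split component pixels into text (at or below mean intensity) and background."""
    coords = [yx for p in components for yx in p]
    if not coords:
        return set(), set()
    threshold = mean(image[y][x] for y, x in coords)
    text = {(y, x) for y, x in coords if image[y][x] <= threshold}
    return text, set(coords) - text


def identify_text_pixels(image, ink_threshold=200, k=2.0):
    """Run the pipeline: components -> outlier removal -> pixel segmentation."""
    mask = [[value < ink_threshold for value in row] for row in image]
    kept = remove_outliers(connected_components(mask), k)
    return segment(image, kept)
```

Calling identify_text_pixels(image) on a list of rows of intensities returns two sets of (row, col) coordinates, text and background. In this sketch a large dark region such as a photograph or rule line is dropped at the outlier step because its area and bounding box fall far from the group statistics, while faint edge pixels inside the surviving components end up in the background set.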

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.