Segmentation of text, picture and lines of a document image
US5465304A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Apr 20, 1994 |
| Grant date | Nov 7, 1995 |
| Priority date | — |
| Expiry date | Apr 20, 2014 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T9/005
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In a character recognition system, a method and apparatus for segmenting a document image into areas containing text and non-text. Document segmentation in the present invention is comprised generally of the steps of: providing a bit-mapped representation of the document image; extracting run lengths for each scanline from the bit-mapped representation of the document image; constructing rectangles from the run lengths; initially classifying each of the rectangles as either text or non-text; correcting for the skew in the rectangles; merging associated text into one or more text blocks; and logically ordering the text blocks.
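The first two processing steps named in the abstract, run-length extraction per scanline and rectangle construction, can be illustrated with a minimal Python sketch. The abstract does not give the patent's actual construction rules, so everything here is an assumption made for illustration: the function names `extract_runs` and `build_rectangles`, the 0/1 list-of-lists bitmap, and the rule that a run joins a rectangle when it overlaps it horizontally and the rectangle reached the previous scanline. Classification, skew correction, merging, and ordering are not covered.

```python
from typing import List, Tuple

Run = Tuple[int, int]              # (start_col, end_col), inclusive
Rect = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def extract_runs(scanline: List[int]) -> List[Run]:
    """Return the inclusive column spans of consecutive 1-pixels in one scanline."""
    runs: List[Run] = []
    start = None
    for col, pixel in enumerate(scanline):
        if pixel and start is None:
            start = col                      # a new run begins
        elif not pixel and start is not None:
            runs.append((start, col - 1))    # the run ended on the previous column
            start = None
    if start is not None:                    # run extends to the end of the line
        runs.append((start, len(scanline) - 1))
    return runs


def build_rectangles(bitmap: List[List[int]]) -> List[Rect]:
    """Grow bounding rectangles from runs (assumed rule, not the patent's).

    A run extends the first rectangle that reached the previous (or current)
    scanline and overlaps the run horizontally; otherwise it starts a new one.
    """
    rects: List[List[int]] = []              # each is [top, left, bottom, right]
    for row, scanline in enumerate(bitmap):
        for start, end in extract_runs(scanline):
            for rect in rects:
                _, left, bottom, right = rect
                still_open = bottom >= row - 1
                overlaps = start <= right and end >= left
                if still_open and overlaps:
                    rect[1] = min(left, start)
                    rect[2] = row
                    rect[3] = max(right, end)
                    break
            else:
                rects.append([row, start, row, end])
    return [(r[0], r[1], r[2], r[3]) for r in rects]


if __name__ == "__main__":
    page = [
        [1, 1, 0, 0, 0, 1],
        [1, 1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
    ]
    print(build_rectangles(page))
```

The blank scanline in the sample bitmap closes the two upper rectangles, so the run on the last row starts a new one. A real segmenter would need further rules, for example size limits on rectangles, before the classification step can tell text from non-text.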
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.