Converting digital images containing text to token-based files for rendering
US7460710B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 29, 2006 |
| Grant date | Dec 2, 2008 |
| Priority date | — |
| Expiry date | Apr 17, 2027 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer-implemented method is provided for converting a scanned-in electronic image into a token-based file. The method generally includes five steps. First, various tokens (i.e., graphical units) are identified in the electronic image. Second, the identified tokens having similar shapes are classified together, forming multiple token groups that each include one or more tokens of similar shape. Third, in each token group, a representative token is found, which morphologically represents the shapes of the tokens included in the group. Fourth, each representative token is converted into a vectorized token, which is a mathematical representation of the shape of the representative token. Fifth, each vectorized token is associated with the positions, in the electronic image, of the tokens it represents. Upon rendering, each vectorized token is displayed at those positions, creating a page image consisting only of clean images of vectorized tokens.
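The five steps in the abstract can be illustrated with a small sketch. This is not the patented implementation: the abstract does not specify how similarity, the representative token, or the vectorization are computed, so the choices below are assumptions made for illustration. Tokens are taken to be 8-connected blobs in a binary image, similarity is a pixel-difference threshold (`max_diff`), the representative is a pixel-wise majority vote, and "vectorization" is stood in for by a list of horizontal runs rather than true outline curves. All function names are hypothetical.

```python
from collections import deque

def find_tokens(img):
    """Step 1: treat each 8-connected blob of ink pixels (1s) as a token."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    tokens = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            top, left = min(p[0] for p in pixels), min(p[1] for p in pixels)
            th = max(p[0] for p in pixels) - top + 1
            tw = max(p[1] for p in pixels) - left + 1
            bitmap = [[0] * tw for _ in range(th)]  # crop to bounding box
            for y, x in pixels:
                bitmap[y - top][x - left] = 1
            tokens.append({"pos": (top, left), "bitmap": bitmap})
    return tokens

def differ(a, b):
    """Count differing pixels; bitmaps of different size never match."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return float("inf")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def group_tokens(tokens, max_diff=1):
    """Step 2 (assumed rule): greedy grouping against each group's first member."""
    groups = []
    for tok in tokens:
        for group in groups:
            if differ(group[0]["bitmap"], tok["bitmap"]) <= max_diff:
                group.append(tok)
                break
        else:
            groups.append([tok])
    return groups

def representative(group):
    """Step 3 (assumed rule): pixel-wise majority vote over the group."""
    first = group[0]["bitmap"]
    return [[1 if 2 * sum(t["bitmap"][y][x] for t in group) >= len(group) else 0
             for x in range(len(first[0]))] for y in range(len(first))]

def vectorize(bitmap):
    """Step 4 (stand-in): horizontal runs as (y, x_start, x_end_exclusive)."""
    runs = []
    for y, row in enumerate(bitmap):
        x = 0
        while x < len(row):
            if row[x] == 1:
                start = x
                while x < len(row) and row[x] == 1:
                    x += 1
                runs.append((y, start, x))
            else:
                x += 1
    return runs

def convert(img):
    """Step 5: pair each vectorized token with the positions it represents."""
    return [{"shape": vectorize(representative(g)),
             "positions": [t["pos"] for t in g]}
            for g in group_tokens(find_tokens(img))]

def render(entries, h, w):
    """Redraw the page by stamping each shape at every recorded position."""
    page = [[0] * w for _ in range(h)]
    for e in entries:
        for top, left in e["positions"]:
            for y, x0, x1 in e["shape"]:
                for x in range(x0, x1):
                    page[top + y][left + x] = 1
    return page

# Three "L" glyphs; the middle one carries one stray noise pixel.
page = [[1 if ch == "#" else 0 for ch in row] for row in (
    "#...##..#..",
    "#...#...#..",
    "###.###.###",
)]
entries = convert(page)
clean = render(entries, len(page), len(page[0]))
```

In this toy input the three glyphs fall into a single group, so the file stores one shape and three positions, and the re-rendered page no longer contains the stray pixel from the middle glyph. A real system would need a size-tolerant similarity measure and genuine outline fitting, neither of which this sketch attempts.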
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.