Identifying similarly formed paragraphs in scanned images
US7715635B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2006 |
| Grant date | May 11, 2010 |
| Priority date | — |
| Expiry date | Oct 21, 2028 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/414
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for identifying and/or categorizing similarly formed paragraphs in a digital image is set forth. An exemplary system includes a processor and a memory. The memory stores executable components which, when executed, direct the system to perform the following: obtain at least one page image of reflowable textual content and identify at least one paragraph of textual content. Thereafter, for each identified paragraph, a plurality of paragraph metrics regarding the identified paragraph is determined. Based on the paragraph metrics, similarly formed paragraphs are clustered.
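The final two steps of the abstract (determine metrics per paragraph, then cluster similarly formed paragraphs) can be illustrated with a minimal Python sketch. The abstract does not name the metrics or the clustering method, so everything here is an assumption for illustration: the `ParagraphMetrics` fields, the `distance` function, the `tolerance` value, and the greedy grouping strategy are hypothetical and are not taken from the patent. Paragraph detection in the page image is assumed to have already happened.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParagraphMetrics:
    # Hypothetical layout metrics, in pixels; the abstract does not list the actual ones.
    left_indent: float
    first_line_indent: float
    line_height: float
    width: float

    def as_vector(self) -> List[float]:
        return [self.left_indent, self.first_line_indent, self.line_height, self.width]


def distance(a: ParagraphMetrics, b: ParagraphMetrics) -> float:
    # Largest per-metric difference (Chebyshev distance), chosen only for simplicity.
    return max(abs(x - y) for x, y in zip(a.as_vector(), b.as_vector()))


def cluster_paragraphs(metrics: List[ParagraphMetrics], tolerance: float = 4.0) -> List[List[int]]:
    # Greedy clustering: a paragraph joins the first cluster whose first member
    # is within `tolerance` on every metric; otherwise it starts a new cluster.
    clusters: List[List[int]] = []
    for i, m in enumerate(metrics):
        for cluster in clusters:
            if distance(m, metrics[cluster[0]]) <= tolerance:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up example values: three body paragraphs and one narrower, deeper-indented block.
paragraphs = [
    ParagraphMetrics(50, 20, 14, 500),
    ParagraphMetrics(51, 21, 14, 498),
    ParagraphMetrics(90, 0, 12, 420),
    ParagraphMetrics(50, 19, 14, 501),
]
clusters = cluster_paragraphs(paragraphs)
print(clusters)  # [[0, 1, 3], [2]]
```

Each inner list holds the indices of paragraphs judged to be similarly formed. A real system would likely normalize the metrics and use a more robust clustering method; the fixed pixel tolerance is only there to keep the sketch short.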
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.