Patent · US Active

Identifying similarly formed paragraphs in scanned images

US7715635B1 · kind B1 · utility

Cited by: 10
References: 23
Claims: 52
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 28, 2006
Grant date: May 11, 2010
Priority date
Expiry date: Oct 21, 2028

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/414
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for identifying and/or categorizing similarly formed paragraphs in a digital image are set forth. An exemplary system includes a processor and a memory. The memory stores executable components which, when executed, direct the system to perform the following: obtain at least one page image of reflowable textual content and identify at least one paragraph of textual content. Thereafter, for each identified paragraph, a plurality of paragraph metrics regarding the identified paragraph is determined. Based on the paragraph metrics, similarly formed paragraphs are clustered.
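
The final two steps of the abstract (determine metrics per paragraph, then cluster similarly formed paragraphs) can be illustrated with a minimal sketch. This is not the claimed method: the abstract does not name the metrics or the clustering technique, so the three metrics below (`left_indent`, `line_height`, `first_line_indent`), the `tolerance` threshold, and the greedy threshold clustering are all assumptions chosen for illustration. Paragraph detection in the page image is assumed to have happened already.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Paragraph:
    """One identified paragraph with hypothetical layout metrics (pixels)."""
    label: str
    left_indent: float        # distance from the page's left margin
    line_height: float        # average height of a text line
    first_line_indent: float  # extra indent applied to the first line

    def metrics(self) -> Tuple[float, float, float]:
        return (self.left_indent, self.line_height, self.first_line_indent)


def distance(a: Paragraph, b: Paragraph) -> float:
    """Largest absolute difference across the metrics (Chebyshev distance)."""
    return max(abs(x - y) for x, y in zip(a.metrics(), b.metrics()))


def cluster_paragraphs(paragraphs: List[Paragraph],
                       tolerance: float = 2.0) -> List[List[Paragraph]]:
    """Greedy clustering: a paragraph joins the first cluster whose
    representative (its first member) is within `tolerance` on every metric;
    otherwise it starts a new cluster."""
    clusters: List[List[Paragraph]] = []
    for p in paragraphs:
        for cluster in clusters:
            if distance(p, cluster[0]) <= tolerance:
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


# Example input: three body paragraphs and one indented block quote.
page = [
    Paragraph("body1", 0.0, 12.0, 20.0),
    Paragraph("body2", 1.0, 12.0, 19.0),
    Paragraph("quote", 40.0, 11.0, 0.0),
    Paragraph("body3", 0.0, 12.5, 20.0),
]
clusters = cluster_paragraphs(page)
for group in clusters:
    print([p.label for p in group])
```

With this sample input the three body paragraphs fall into one cluster and the block quote into another. A greedy pass depends on input order and on the chosen tolerance; a production system would more likely normalize the metrics and use an established clustering algorithm.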

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.