Patent · US Expired

Method for identifying and using table structures

US7054871B2 · kind B2 · utility

Cited by: 19
References: 5
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 11, 2000
Grant date: May 30, 2006
Priority date:
Expiry date: Mar 8, 2023

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99934
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for recognizing a table structure from a delineated table region in an electronic document using hierarchical clustering of data strings. The cluster groupings are segregated using distances derived from positional vectors associated with words and groups of words, rather than a minimum number of blank spaces between words. Once a data tree of the hierarchical clusterings is constructed, the tree is scanned downward from the root to find appropriate column boundaries using a columnization algorithm. Successive heuristic algorithms then determine the column headers, the row headers, and the row boundaries.
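
The clustering and top-down scan described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not give the distance measure or the columnization algorithm, so the sketch assumes that each word's positional vector is its horizontal extent (x_start, x_end), that the distance between two clusters is the horizontal gap between their extents, and that a subtree counts as one column when its merge distance falls below a hypothetical min_gap threshold. The names Node, build_tree, columnize, and min_gap are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One cluster in the tree: a horizontal extent plus the words it covers."""
    start: float
    end: float
    words: List[str]
    dist: float = 0.0               # gap bridged when this cluster was formed
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def gap(a: Node, b: Node) -> float:
    """Horizontal gap between two extents (0 if they overlap); a starts first."""
    return max(0.0, b.start - a.end)


def build_tree(words: List[Tuple[str, float, float]]) -> Node:
    """Agglomerative clustering: repeatedly merge the two closest neighbours.

    Clusters are kept sorted by start position, so the globally closest
    pair is always an adjacent pair in the list.
    """
    clusters = sorted((Node(s, e, [t]) for t, s, e in words),
                      key=lambda n: n.start)
    while len(clusters) > 1:
        i = min(range(len(clusters) - 1),
                key=lambda k: gap(clusters[k], clusters[k + 1]))
        a, b = clusters[i], clusters[i + 1]
        merged = Node(a.start, max(a.end, b.end), a.words + b.words,
                      gap(a, b), a, b)
        clusters[i:i + 2] = [merged]
    return clusters[0]


def columnize(node: Node, min_gap: float) -> List[Node]:
    """Scan downward from the root; stop splitting once a merge gap is small."""
    if node.left is None or node.right is None or node.dist < min_gap:
        return [node]
    return columnize(node.left, min_gap) + columnize(node.right, min_gap)


# Words from a three-row table region, as (text, x_start, x_end).
words = [
    ("Item", 0, 4), ("Qty", 10, 13), ("Price", 20, 25),
    ("Apple", 0, 5), ("3", 12, 13), ("1.20", 21, 25),
    ("Kiwi", 0, 4), ("12", 11, 13), ("0.50", 21, 25),
]
columns = columnize(build_tree(words), min_gap=3)
for col in columns:
    print((col.start, col.end), col.words)
```

In this sketch the right-aligned quantities ("3" and "12") land in the same column as "Qty" because their extents overlap horizontally, even though the number of blank characters before them differs from row to row. Header and row-boundary detection, which the abstract assigns to later heuristic steps, are not covered here.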

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.