Method for identifying and using table structures
US7054871B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 11, 2000 |
| Grant date | May 30, 2006 |
| Priority date | — |
| Expiry date | Mar 8, 2023 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99934
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for recognizing a table structure from a delineated table region in an electronic document using hierarchical clustering of data strings. The cluster groupings are segregated effectively using distances from a positional vector associated with words and groups of words, rather than a minimum number of blank spaces between words. Once a data tree of the hierarchical clusterings is constructed, the tree is scanned downward from the root to find appropriate column boundaries using a columnization algorithm. Successive heuristic algorithms then determine the column and row headers and the row boundaries.
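The abstract's idea can be illustrated with a minimal sketch; it is not the patented algorithm, whose distance measure and columnization rules are not given in this record. The sketch assumes each word is described only by its horizontal character span, uses the gap between spans as the clustering distance, and splits the resulting tree from the root downward while the gap between a node's two children is at least a hypothetical threshold `min_gap`. The names `Node`, `span_distance`, `build_tree`, `find_columns`, and `min_gap` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A cluster of words covering the horizontal span [start, end)."""
    start: int
    end: int
    words: List[str]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def span_distance(a: Node, b: Node) -> int:
    """Horizontal gap between two spans; 0 when they overlap or touch."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def build_tree(words: List[Tuple[str, int, int]]) -> Node:
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [Node(start, end, [text]) for text, start, end in words]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = span_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        a, b = clusters[i], clusters[j]
        merged = Node(min(a.start, b.start), max(a.end, b.end),
                      a.words + b.words, a, b)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


def find_columns(node: Node, min_gap: int) -> List[Node]:
    """Scan down from the root, splitting while children are far apart."""
    if node.left is None or node.right is None:
        return [node]  # a single word cannot be split further
    if span_distance(node.left, node.right) < min_gap:
        return [node]  # children are close: treat the node as one column
    cols = find_columns(node.left, min_gap) + find_columns(node.right, min_gap)
    return sorted(cols, key=lambda n: n.start)


# Words from a small three-row table, as (text, start offset, end offset).
table_words = [
    ("Name", 0, 4), ("Qty", 10, 13), ("Price", 16, 21),
    ("Apple", 0, 5), ("3", 10, 11), ("1.20", 16, 20),
    ("Kiwi", 0, 4), ("12", 10, 12), ("0.50", 16, 20),
]
tree = build_tree(table_words)
columns = find_columns(tree, min_gap=2)
for col in columns:
    print(col.start, col.end, col.words)
```

In this sketch the threshold applies to the distance between clusters in the tree rather than to blank spaces between adjacent words, so words that overlap horizontally across rows group together before any column boundary is considered. Header and row-boundary detection, which the abstract attributes to further heuristics, is not attempted here.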
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.