Tabular data parsing in document(s)
US9898523B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Apr 22, 2013 |
| Grant date | Feb 20, 2018 |
| Priority date | — |
| Expiry date | Feb 24, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/18
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
One or more techniques and/or systems are provided for parsing tabular data of a document. That is, a document may comprise arbitrarily formatted content (e.g., an equipment inspection report generated by an engineer). Respective rows of the document may be clustered into one or more row clusters based upon row proximity and/or numeric content (e.g., rows having similar numeric content may comprise logically related information). One or more vertical clusters may be generated within respective row clusters based upon vertical overlap. In this way, row clusters and/or vertical clusters may be searched for one or more values that may be assigned to a search term. For example, a row cluster may comprise a search term “Average temp”. One or more vertical clusters within the row cluster may be searched for a word that matches a pattern criteria (e.g., a two digit number), which may be assigned to the search term.
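The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the `Word` record, the function names, the `max_gap` threshold, and the sample report are all hypothetical. It assumes that words arrive with horizontal extents and a row index, that "row proximity" means a small gap between consecutive row positions, and that "vertical overlap" means overlapping horizontal extents (column-like groups). Clustering on numeric content is omitted, and the tie-break toward the label's own row is an added assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str   # a word or short phrase extracted from the document
    x0: float   # left edge (hypothetical layout units)
    x1: float   # right edge
    row: int    # index of the row the word sits on


def cluster_rows(row_ys, max_gap):
    """Group consecutive row indices whose y positions differ by at most max_gap.

    Assumes row_ys is sorted top to bottom.
    """
    clusters = []
    for i, y in enumerate(row_ys):
        if clusters and y - row_ys[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


def cluster_vertically(words):
    """Group words whose horizontal extents overlap into column-like clusters."""
    columns = []  # each entry: [right_edge_so_far, [words]]
    for w in sorted(words, key=lambda w: w.x0):
        if columns and w.x0 <= columns[-1][0]:
            columns[-1][0] = max(columns[-1][0], w.x1)
            columns[-1][1].append(w)
        else:
            columns.append([w.x1, [w]])
    return [col for _, col in columns]


def find_value(words, row_ys, term, pattern, max_gap=15.0):
    """Return the text matching `pattern` in the row cluster that holds `term`."""
    for rows in cluster_rows(row_ys, max_gap):
        in_cluster = [w for w in words if w.row in rows]
        label = next((w for w in in_cluster if w.text == term), None)
        if label is None:
            continue
        candidates = [
            w
            for column in cluster_vertically(in_cluster)
            for w in column
            if w is not label and re.fullmatch(pattern, w.text)
        ]
        if candidates:
            # Assumed tie-break: prefer the match closest to the label's row.
            return min(candidates, key=lambda w: abs(w.row - label.row)).text
    return None


# Hypothetical inspection report: two blocks of rows separated by a large gap.
ROW_YS = [10, 22, 34, 80, 92]
WORDS = [
    Word("Inspection", 0, 60, 0), Word("Reading", 100, 150, 0),
    Word("Average temp", 0, 70, 1), Word("72", 110, 125, 1),
    Word("Max temp", 0, 50, 2), Word("95", 110, 125, 2),
    Word("Inspector", 0, 55, 3), Word("J. Doe", 100, 140, 3),
    Word("Unit", 0, 25, 4), Word("14", 110, 125, 4),
]

print(find_value(WORDS, ROW_YS, "Average temp", r"\d{2}"))  # prints 72
```

With this sample data, the first three rows fall into one row cluster and the last two into another, so a search for "Unit" only considers the second block and does not pick up the temperature readings.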
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.