Searchable table extraction
US11837004B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 24, 2023 |
| Grant date | Dec 5, 2023 |
| Priority date | — |
| Expiry date | Feb 24, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/413
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method includes generating a base model by training a pretrained model using a base training dataset including first training datapoints identifying tables in historical document images that include the tables and text, where the generated base model is configured to extract the tables as objects; and generating a table extraction model by training the base model using an enhanced training dataset including second training datapoints that are different from the first training datapoints and that identify a plurality of cells disposed in each of the tables in a row direction and a column direction. The table extraction model is trained to output the content of the tables and table information in an XML format, the table information including cell-level information of the plurality of cells that is searchable via a query configured to provide target content that corresponds to one or more cells.
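To make the "searchable cell-level information" idea concrete, the following is a minimal sketch of how XML table output might be queried for the content of a target cell. The abstract does not publish a schema, so the `table` and `cell` elements, the `row` and `col` attributes, the sample data, and the helper names `load_cells` and `lookup` are all assumptions made for illustration, not the patent's actual format or method.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML output of a table extraction model.
# Element and attribute names here are assumptions, not the patent's schema.
SAMPLE_XML = """
<table id="t1">
  <cell row="0" col="0">Item</cell>
  <cell row="0" col="1">Price</cell>
  <cell row="1" col="0">Widget</cell>
  <cell row="1" col="1">4.50</cell>
  <cell row="2" col="0">Gadget</cell>
  <cell row="2" col="1">7.25</cell>
</table>
"""


def load_cells(xml_text):
    """Parse the XML and index cell text by (row, col) position."""
    root = ET.fromstring(xml_text.strip())
    return {
        (int(cell.get("row")), int(cell.get("col"))): (cell.text or "").strip()
        for cell in root.iter("cell")
    }


def lookup(cells, row_label, column_header):
    """Return the text of the cell where a labeled row meets a headed column.

    Assumes row 0 holds column headers and column 0 holds row labels.
    Returns None when either the label or the header is not found.
    """
    col = next(
        (c for (r, c), text in cells.items() if r == 0 and text == column_header),
        None,
    )
    row = next(
        (r for (r, c), text in cells.items() if c == 0 and text == row_label),
        None,
    )
    if row is None or col is None:
        return None
    return cells.get((row, col))
```

With the sample data above, `lookup(load_cells(SAMPLE_XML), "Gadget", "Price")` returns `"7.25"`. Indexing cells by row and column position is one simple way to support such a query; a production system could equally use XPath expressions or a search index over the same XML.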
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.