Patent · US Active

Searchable table extraction

US11837004B1 · kind B1 · utility

Cited by: 4
References: 11
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 24, 2023
Grant date: Dec 5, 2023
Priority date
Expiry date: Feb 24, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/413
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Method includes generating a base model by training a pretrained model using a base training dataset including first training datapoints identifying tables in historical document images that include the tables and text, where the generated base model is configured to extract the tables as objects; and generating a table extraction model by training the base model using an enhanced training dataset including second training datapoints that are different from the first training datapoints and identify a plurality of cells disposed in each of the tables in a row direction and a column direction. The table extraction model is trained to output content of the tables and table information in an XML format, the table information including cell level information of the plurality of cells that is searchable via a query configured to provide target content that corresponds to one or more cells.
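
To illustrate what "cell level information ... searchable via a query" could look like in practice, here is a minimal Python sketch. The abstract only says the output is in an XML format; the element and attribute names (`table`, `cell`, `row`, `col`), the sample data, and the helper functions `find_cells` and `lookup` are all hypothetical and not taken from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML output for one extracted table. The schema is an
# assumption made for illustration; the patent does not specify it here.
SAMPLE_XML = """
<table id="t1">
  <cell row="0" col="0">Item</cell>
  <cell row="0" col="1">Price</cell>
  <cell row="1" col="0">Widget</cell>
  <cell row="1" col="1">4.50</cell>
  <cell row="2" col="0">Gadget</cell>
  <cell row="2" col="1">7.25</cell>
</table>
"""


def find_cells(xml_text, row=None, col=None, contains=None):
    """Return (row, col, text) tuples for cells matching every given filter."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for cell in root.iter("cell"):
        r = int(cell.get("row"))
        c = int(cell.get("col"))
        text = (cell.text or "").strip()
        if row is not None and r != row:
            continue
        if col is not None and c != col:
            continue
        if contains is not None and contains not in text:
            continue
        hits.append((r, c, text))
    return hits


def lookup(xml_text, row_label, col_header):
    """Return the cell text at the intersection of a row label and a column header.

    Assumes row 0 holds column headers and column 0 holds row labels.
    Returns None when either label is missing.
    """
    header_hits = [c for (_, c, t) in find_cells(xml_text, row=0) if t == col_header]
    label_hits = [r for (r, _, t) in find_cells(xml_text, col=0) if t == row_label]
    if not header_hits or not label_hits:
        return None
    target = find_cells(xml_text, row=label_hits[0], col=header_hits[0])
    return target[0][2] if target else None


if __name__ == "__main__":
    print(lookup(SAMPLE_XML, "Gadget", "Price"))
    print(find_cells(SAMPLE_XML, contains="Widget"))
```

Storing row and column indices on each cell lets a query address a single cell directly, rather than re-parsing the table layout from the page image.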

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.