Searchable data structure for electronic documents
US11727215B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 16, 2020 |
| Grant date | Aug 15, 2023 |
| Priority date | — |
| Expiry date | Jun 26, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/205
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method of generating a searchable representation of an electronic document includes obtaining an electronic document specifying a graphical layout of content items including text. The method also includes determining pixel data representing the graphical layout of the content items and providing input data based, at least in part, on the pixel data to a document parsing model. The document parsing model is trained to detect functional regions within the graphical layout based on the input data, assign boundaries to the functional regions based on the input data, and assign a category label to each functional region that is detected. The method also includes matching portions of the text to corresponding functional regions based on the boundaries assigned to the functional regions and locations associated with the portions of the text and storing data representing the content items, the functional regions, and the category labels in a searchable data structure.
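The last two steps of the abstract (matching text to functional regions by location, then storing the result in a searchable structure) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the document parsing model is replaced by hard-coded regions, each text portion is assumed to have a single (x, y) location, and the names `Region`, `TextSpan`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A functional region: a category label plus an axis-aligned boundary."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


@dataclass(frozen=True)
class TextSpan:
    """A portion of text with the location it occupies in the layout."""
    text: str
    x: float
    y: float


def match_text_to_regions(spans, regions):
    """Assign each span to the first region whose boundary contains its location."""
    matched = defaultdict(list)
    for span in spans:
        for i, region in enumerate(regions):
            if region.contains(span.x, span.y):
                matched[i].append(span.text)
                break  # spans outside every region are simply left unmatched
    return matched


def build_index(spans, regions):
    """Store regions, labels, and matched text as a list of searchable records."""
    matched = match_text_to_regions(spans, regions)
    return [
        {
            "label": region.label,
            "box": (region.x0, region.y0, region.x1, region.y1),
            "text": " ".join(matched.get(i, [])),
        }
        for i, region in enumerate(regions)
    ]


def search(records, term, label=None):
    """Case-insensitive substring search, optionally restricted to one category."""
    term = term.lower()
    return [
        r for r in records
        if term in r["text"].lower() and (label is None or r["label"] == label)
    ]


# Stand-in for model output: in the described method these regions would come
# from the trained document parsing model, not from hard-coded values.
regions = [
    Region("title", 0, 0, 100, 20),
    Region("paragraph", 0, 25, 100, 80),
]
spans = [
    TextSpan("Quarterly", 10, 10),
    TextSpan("Report", 40, 10),
    TextSpan("Revenue", 10, 30),
    TextSpan("grew.", 35, 30),
    TextSpan("stray", 200, 200),  # falls outside every region
]
records = build_index(spans, regions)
title_hits = search(records, "report", label="title")
```

Keeping the category label on each record is what allows a query to be scoped to a region type, for example searching only titles rather than the full text.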
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.