Transferable neural architecture for structured data extraction from web documents
US11886533B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 29, 2020 |
| Grant date | Jan 30, 2024 |
| Priority date | — |
| Expiry date | Jan 29, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/02
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods for efficiently identifying and extracting machine-actionable structured data from web documents are provided. The technology employs neural network architectures which process the raw HTML content of a set of seed websites to create transferable models regarding information of interest. These models can then be applied to the raw HTML of other websites to identify similar information of interest. Data can thus be extracted across multiple websites in a functional, structured form that allows it to be used further by a processing system.
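As a rough illustration of the idea in the abstract (learn from the raw HTML of a seed site, then apply the model to a differently structured site), the sketch below may help. It is not the patented architecture: a tiny logistic model over hand-picked, site-independent features stands in for the neural network, and the sample pages, the "price" field, and the names `NodeCollector`, `text_nodes`, `features`, `train`, and `best_node` are all hypothetical.

```python
import math
from html.parser import HTMLParser


class NodeCollector(HTMLParser):
    """Collect (tag_path, text) for every non-empty text node in raw HTML."""

    VOID = {"br", "img", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.stack, self.nodes = [], []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("/".join(self.stack), text))


def text_nodes(html):
    parser = NodeCollector()
    parser.feed(html)
    parser.close()
    return parser.nodes


def features(nodes, i):
    """Features chosen to avoid site-specific paths (an assumption of this sketch)."""
    path, text = nodes[i]
    prev = nodes[i - 1][1].lower().rstrip(":") if i > 0 else "<none>"
    return (
        "bias",
        "leaf=" + path.rsplit("/", 1)[-1],
        "prev=" + prev,
        "has_digit=" + str(any(c.isdigit() for c in text)),
        "has_currency=" + str(any(c in "$€£" for c in text)),
    )


def score(weights, feats):
    z = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Plain SGD on logistic loss; a stand-in for the neural model."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            err = label - score(weights, feats)
            for f in feats:
                weights[f] = weights.get(f, 0.0) + lr * err
    return weights


def best_node(weights, html):
    """Return the text of the node the model scores highest."""
    nodes = text_nodes(html)
    return max(range(len(nodes)), key=lambda i: score(weights, features(nodes, i)))


# Hypothetical seed site: the price node is labeled by hand.
SEED = """<html><body><h1>Acme Kettle</h1>
<div><span>Price:</span> <b>$24.99</b></div>
<div><span>Brand:</span> <b>Acme</b></div>
<p>Ships in 2 days</p></body></html>"""

# Hypothetical unseen site with a different layout (table instead of divs).
TARGET = """<html><body><table>
<tr><td>Name</td><td>Zoom Blender</td></tr>
<tr><td>Price</td><td>$59.00</td></tr>
<tr><td>Weight</td><td>3 kg</td></tr>
</table></body></html>"""

seed_nodes = text_nodes(SEED)
examples = [
    (features(seed_nodes, i), 1.0 if text == "$24.99" else 0.0)
    for i, (_, text) in enumerate(seed_nodes)
]
weights = train(examples)

target_nodes = text_nodes(TARGET)
extracted = {"price": target_nodes[best_node(weights, TARGET)][1]}
print(extracted)
```

The transfer in this toy case rests on features that do not encode the seed site's layout, such as the preceding label text and the presence of a currency symbol, so the model still finds the price when the target page uses table cells instead of `div` and `b` elements.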
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.