Autonomous open schema construction from unstructured text
US11977569B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 28, 2022 |
| Grant date | May 7, 2024 |
| Priority date | — |
| Expiry date | May 29, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/025
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Disclosed is a natural language processing pipeline that analyzes and processes a corpus of textual data to automatically create a knowledge graph containing the corpus entities, such as subjects and objects, and their relationships, such as predicates or verbs. The pipeline is configured as an end-to-end neural Open Schema Construction pipeline having a coreference resolution module, an open information extraction (OIE) module, and an entity canonicalization module. The processed textual data is input to a graph database to create the knowledge graph, which is displayable through a graphical user interface. In operation, the pipeline modules create a single term for all entity mentions in the corpus that reference the same entity through coreference resolution, extract all subject-predicate-object triplets from the coreference-resolved corpus through OIE, and then canonicalize the corpus by clustering each entity mention to a canonical form for mapping to the knowledge graph and display.
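The three stages named in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the abstract describes neural modules, while the functions below (`resolve_coreferences`, `extract_triples`, `canonicalize`, `build_graph`) are hypothetical rule-based stand-ins, and the lookup tables `coref_map` and `clusters` are hand-written assumptions standing in for model output. An in-memory dictionary takes the place of the graph database.

```python
from collections import defaultdict

# Toy stand-ins for the three pipeline modules. All names are hypothetical
# and the logic is rule-based, not neural as in the patent.

def resolve_coreferences(sentences, coref_map):
    """Replace each mention with the single term chosen for its entity."""
    resolved = []
    for sentence in sentences:
        tokens = [coref_map.get(tok, tok) for tok in sentence.split()]
        resolved.append(" ".join(tokens))
    return resolved

def extract_triples(sentences):
    """Naive OIE: read 'subject predicate object...' as one triple."""
    triples = []
    for sentence in sentences:
        parts = sentence.rstrip(".").split()
        if len(parts) >= 3:
            triples.append((parts[0], parts[1], " ".join(parts[2:])))
    return triples

def canonicalize(triples, clusters):
    """Map every entity mention to the canonical form of its cluster."""
    canonical = {m: canon for canon, mentions in clusters.items() for m in mentions}
    return [(canonical.get(s, s), p, canonical.get(o, o)) for s, p, o in triples]

def build_graph(triples):
    """Adjacency-list view of the graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)

# Assumed example data, not taken from the patent.
corpus = ["Curie discovered radium.", "She won prizes."]
coref_map = {"She": "Curie"}            # assumed coreference output
clusters = {"Marie Curie": {"Curie"}}   # assumed canonicalization clusters

resolved = resolve_coreferences(corpus, coref_map)
triples = canonicalize(extract_triples(resolved), clusters)
graph = build_graph(triples)
```

In this sketch the order of stages matters: resolving "She" to "Curie" before extraction lets both sentences contribute triples about the same entity, and canonicalization then merges them under one graph node.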
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.