Graph-based labeling rule augmentation for weakly supervised training of machine-learning-based named entity recognition
US11669740B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 25, 2021 |
| Grant date | Jun 6, 2023 |
| Priority date | — |
| Expiry date | Dec 4, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods for training a machine-learning model for named-entity recognition. A rule graph is constructed including a plurality of nodes each corresponding to a different labeling rule of a set of labeling rules (including a set of seeding rules of known labeling accuracy and a plurality of candidate rules of unknown labeling accuracy). The nodes are coupled to other nodes based on which rules exhibit the highest semantic similarity. A labeling accuracy metric is estimated for each candidate rule by propagating a labeling confidence metric through the rule graph from the seeding rules to each candidate rule. A subset of labeling rules is then identified by ranking the rules by their labeling confidence metric. The identified subset of labeling rules is applied to unlabeled data to generate a set of weakly labeled named entities, and the machine-learning model is trained based on the set of weakly labeled named entities.
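The pipeline described in the abstract (build a rule graph, propagate confidence from seeding rules, rank, apply the top rules) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the rule names, the 2-D rule embeddings, the seed accuracy values, the choice of cosine similarity with k nearest neighbors, and the propagation scheme itself (a plain iterative weighted average with clamped seeds, which stands in for whatever propagation method the claims actually specify). The final model-training step is omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_rule_graph(embeddings, k=2):
    """Link each rule to its k most similar other rules (undirected, weighted)."""
    names = list(embeddings)
    graph = {name: {} for name in names}
    for a in names:
        sims = sorted(
            ((cosine(embeddings[a], embeddings[b]), b) for b in names if b != a),
            reverse=True,
        )
        for sim, b in sims[:k]:
            if sim > 0:
                graph[a][b] = sim
                graph[b][a] = sim
    return graph

def propagate_confidence(graph, seed_scores, iterations=50):
    """Spread seed confidence to candidates by repeated weighted averaging.

    Seed nodes are clamped to their known accuracy on every iteration.
    """
    conf = {node: seed_scores.get(node, 0.0) for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in graph.items():
            if node in seed_scores:
                updated[node] = seed_scores[node]
                continue
            total = sum(nbrs.values())
            updated[node] = (
                sum(w * conf[n] for n, w in nbrs.items()) / total if total else 0.0
            )
        conf = updated
    return conf

def select_rules(conf, top_n):
    """Rank rules by confidence and keep the top_n."""
    return sorted(conf, key=conf.get, reverse=True)[:top_n]

def weak_labels(tokens, matchers, selected):
    """Mark a token as an entity if any selected rule fires on it."""
    return [any(matchers[r](t) for r in selected) for t in tokens]

# Hypothetical rules: two seeds with assumed known accuracy, two candidates.
embeddings = {
    "suffix:-itis": (1.0, 0.0),   # seed
    "suffix:-osis": (0.9, 0.1),   # candidate
    "prefix:the": (0.0, 1.0),     # seed
    "prefix:an": (0.1, 0.9),      # candidate
}
seed_scores = {"suffix:-itis": 0.9, "prefix:the": 0.2}
matchers = {
    "suffix:-itis": lambda t: t.endswith("itis"),
    "suffix:-osis": lambda t: t.endswith("osis"),
    "prefix:the": lambda t: t.startswith("the"),
    "prefix:an": lambda t: t.startswith("an"),
}

graph = build_rule_graph(embeddings, k=2)
conf = propagate_confidence(graph, seed_scores)
selected = select_rules(conf, top_n=2)
labels = weak_labels(["arthritis", "the", "fibrosis", "an"], matchers, selected)
```

In this toy setup the candidate `suffix:-osis` sits close to the high-accuracy seed and therefore ends up with a higher propagated confidence than `prefix:an`, so it is selected alongside the seed. The resulting `labels` list would then serve as weak supervision for a named-entity recognition model.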
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.