Patent · US Active

Synthetic crafting of training and test data for named entity recognition by utilizing a rule-based library

US11853699B2 · kind B2 · utility

Cited by: 0
References: 95
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 29, 2021
Grant date: Dec 26, 2023
Priority date
Expiry date: Apr 12, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system for extracting and labeling Named-Entity Recognition (NER) data in a target language for use in a multi-lingual software module has been developed. First, a textual sentence is translated to the target language using a translation module. A named entity is identified and extracted within the translated sentence. The named entity is identified by either: exact mapping; a semantically similar translated named entity that meets a predetermined minimum threshold of similarity; or utilizing a rule-based library for the target language. Once identified, the named entity is labeled with a pre-determined category and stored in a retrievable electronic database.
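The three identification strategies named in the abstract can be pictured as a fallback chain. The following is only a minimal sketch under stated assumptions, not the patented implementation: the function name `find_entity`, the rule table `RULES_DE`, the choice of German as the target language, and the 0.8 threshold are all hypothetical. In particular, character-level similarity from Python's `difflib` stands in for the "semantically similar" comparison, which the abstract does not specify.

```python
import re
from difflib import SequenceMatcher

# Hypothetical rule-based library for one target language (German):
# each category label maps to a regular expression for that entity type.
RULES_DE = {
    "DATE": re.compile(r"\b\d{1,2}\. [A-Za-zÄÖÜäöüß]+ \d{4}\b"),
}


def find_entity(translated_sentence, translated_entity, category, rules,
                min_similarity=0.8):
    """Return (entity_text, category, method), or None if nothing is found."""
    # 1. Exact mapping: the translated entity appears verbatim in the sentence.
    if translated_entity in translated_sentence:
        return translated_entity, category, "exact"

    # 2. Similarity above a minimum threshold. Assumption: character-level
    #    string similarity is used here as a stand-in for semantic similarity.
    tokens = re.findall(r"\w+", translated_sentence)
    entity_tokens = re.findall(r"\w+", translated_entity)
    target = " ".join(entity_tokens).lower()
    n = len(entity_tokens)
    best, best_score = None, 0.0
    for i in range(len(tokens) - n + 1):
        candidate = " ".join(tokens[i:i + n])
        score = SequenceMatcher(None, candidate.lower(), target).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best is not None and best_score >= min_similarity:
        return best, category, "similar"

    # 3. Rule-based library for the target language.
    pattern = rules.get(category)
    if pattern is not None:
        match = pattern.search(translated_sentence)
        if match:
            return match.group(0), category, "rule"
    return None


if __name__ == "__main__":
    print(find_entity("Anna wohnt in Berlin.", "Berlin", "LOC", RULES_DE))
    print(find_entity("Sie lebt in den Vereinigten Staaten.",
                      "Vereinigte Staaten", "LOC", RULES_DE))
    print(find_entity("Das Treffen ist am 3. März 2021.",
                      "March 3, 2021", "DATE", RULES_DE))
```

In this sketch the returned tuple carries the pre-determined category, so a caller could write it straight to a database table. The order of the checks is a design choice made here for illustration: the cheapest and most precise test runs first, and the language-specific rules act as the last resort.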

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.