Patent · US Active

Language model optimization for in-domain application

US9972311B2 · kind B2 · utility

Cited by: 4
References: 9
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 7, 2014
Grant date: May 15, 2018
Priority date
Expiry date: May 7, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/295
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are provided for optimizing language models for in-domain applications through an iterative, joint-modeling approach that expresses training material as alternative representations of higher-level tokens, such as named entities and carrier phrases. From a first language model, an in-domain training corpus may be represented as a set of alternative parses of tokens. Statistical information determined from these parsed representations may be used to produce a second (or updated) language model, which is further optimized for the domain. The second language model may be used to determine another alternative parsed representation of the corpus for a next iteration, and the statistical information determined from this representation may be used to produce a third (or further updated) language model. Through each iteration, a language model may be determined that is further optimized for the domain.
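
The iterative loop described in the abstract can be illustrated with a small sketch. This is not the patented method, only a simplified reading of it under stated assumptions: the model is a plain unigram distribution over tokens, a token is either a single word or a multi-word phrase from a supplied list, every alternative parse of a sentence is enumerated exhaustively, and the first pass weights all parses equally in place of a real first language model. All function and parameter names (segmentations, reestimate, optimize, phrases, floor) are hypothetical.

```python
from collections import defaultdict
from math import prod  # Python 3.8+


def segmentations(words, phrases, max_len=3):
    """Yield every parse of `words` into tokens (tuples of words).

    A token is a single word, or a multi-word span listed in `phrases`
    (standing in for higher-level tokens such as named entities).
    """
    if not words:
        yield []
        return
    for n in range(1, min(max_len, len(words)) + 1):
        head = tuple(words[:n])
        if n == 1 or head in phrases:
            for rest in segmentations(words[n:], phrases, max_len):
                yield [head] + rest


def parse_score(parse, model, floor=1e-9):
    """Score a parse under the current model (assumed unigram)."""
    if model is None:  # assumption: no first model, weight parses equally
        return 1.0
    return prod(model.get(tok, floor) for tok in parse)


def reestimate(corpus, phrases, model):
    """One iteration: parse the corpus, gather statistics, build a new model."""
    counts = defaultdict(float)
    for sentence in corpus:
        parses = list(segmentations(sentence.split(), phrases))
        scores = [parse_score(p, model) for p in parses]
        total = sum(scores)
        for parse, score in zip(parses, scores):
            for tok in parse:
                # fractional count: each parse contributes its relative weight
                counts[tok] += score / total
    norm = sum(counts.values())
    return {tok: c / norm for tok, c in counts.items()}


def optimize(corpus, phrases, iterations=5):
    """Repeat parse-and-reestimate, feeding each model into the next pass."""
    model = None
    for _ in range(iterations):
        model = reestimate(corpus, phrases, model)
    return model
```

Calling optimize(["call john smith", "text john smith"], {("john", "smith")}) returns a dictionary mapping token tuples to probabilities. In this toy setting the probability mass shifts toward the phrase token ("john", "smith") with each iteration, which mirrors the abstract's idea of each pass producing a model better fitted to the domain corpus.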

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.