Patent · US Active

Language model optimization for in-domain application

US9972311B2 · kind B2 · utility

Cited by	4
References	9
Claims	19
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Michael Levit · San Jose, US
Sarangarajan Parthasarathy · Mountain View, US
Andreas Stolcke · Berkeley, US

Key dates

Filing date	May 7, 2014
Grant date	May 15, 2018
Priority date	—
Expiry date	May 7, 2034

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/295
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems and methods are provided for optimizing language models for in-domain applications through an iterative, joint-modeling approach that expresses training material as alternative representations of higher-level tokens, such as named entities and carrier phrases. From a first language model, an in-domain training corpus may be represented as a set of alternative parses of tokens. Statistical information determined from these parsed representations may be used to produce a second (or updated) language model, which is further optimized for the domain. The second language model may be used to determine another alternative parsed representation of the corpus for a next iteration, and the statistical information determined from this representation may be used to produce a third (or further updated) language model. Through each iteration, a language model may be determined that is further optimized for the domain.
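
The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes a unigram language model, a tiny hypothetical entity lexicon (`ENTITIES`), equiprobable phrases within each entity class, and fractional counts weighted by each parse's probability under the current model. All function and variable names are illustrative.

```python
from collections import Counter
from math import prod

# Hypothetical entity lexicon: class token -> member phrases (as word tuples).
ENTITIES = {"CITY": {("new", "york"), ("boston",)}}

def parses(words):
    """Yield every way to rewrite `words` using plain words and class tokens."""
    if not words:
        yield ()
        return
    # Option 1: keep the first word as an ordinary word token.
    for rest in parses(words[1:]):
        yield (words[0],) + rest
    # Option 2: replace a leading entity phrase with its class token.
    for cls, phrases in ENTITIES.items():
        for phrase in phrases:
            if words[:len(phrase)] == phrase:
                for rest in parses(words[len(phrase):]):
                    yield (cls,) + rest

def token_prob(token, lm):
    """Unigram probability; a class token also pays 1/|class| for its phrase."""
    p = lm[token]
    if token in ENTITIES:
        p /= len(ENTITIES[token])  # assumption: class members are equiprobable
    return p

def initial_lm(corpus):
    """First language model: uniform over every token seen in any parse."""
    vocab = {t for s in corpus for p in parses(tuple(s.split())) for t in p}
    return {t: 1.0 / len(vocab) for t in vocab}

def reestimate(corpus, lm):
    """One iteration: parse the corpus with `lm`, then build an updated model."""
    counts = Counter()
    for sentence in corpus:
        alternatives = list(parses(tuple(sentence.split())))
        weights = [prod(token_prob(t, lm) for t in p) for p in alternatives]
        total = sum(weights)
        # Each alternative parse contributes counts in proportion to its weight.
        for parse, weight in zip(alternatives, weights):
            for token in parse:
                counts[token] += weight / total
    norm = sum(counts.values())
    return {token: c / norm for token, c in counts.items()}

def optimize(corpus, iterations=3):
    """Repeat parse-and-reestimate, returning the final language model."""
    lm = initial_lm(corpus)
    for _ in range(iterations):
        lm = reestimate(corpus, lm)
    return lm

corpus = ["fly to boston", "fly to new york", "drive to boston"]
print(optimize(corpus))
```

In this toy setup, each pass shifts probability mass toward whichever representation (literal words or the `CITY` token) explains the corpus better under the previous model. A practical system would presumably use higher-order n-grams and learned within-class probabilities rather than the uniform assumption made here.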

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.