Representing n-gram language models for compact storage and fast retrieval
US8175878B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 14, 2010 |
| Grant date | May 8, 2012 |
| Priority date | — |
| Expiry date | Dec 14, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/268
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model, which includes receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
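The general idea in the abstract (a trie keyed by words, with a probability stored where each n-gram ends, then queried to score a string) can be illustrated with a minimal Python sketch. This is not the patented encoding: the abstract does not describe the compact layout, so a plain dictionary-based trie is used instead. The names `TrieNode`, `build_trie`, `lookup`, and `string_log_prob` are hypothetical, as are two assumptions made here: each stored value is treated as the probability of the n-gram's last word given the preceding words, and unseen words fall back to a shorter context or a fixed floor value without back-off weights.

```python
import math
from typing import Dict, Iterable, Optional, Sequence, Tuple


class TrieNode:
    """One node per word; `prob` is set only where a stored n-gram ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.prob: Optional[float] = None


def build_trie(ngrams: Iterable[Tuple[Tuple[str, ...], float]]) -> TrieNode:
    """Insert every (n-gram, probability) pair, sharing common prefixes."""
    root = TrieNode()
    for words, prob in ngrams:
        node = root
        for word in words:
            node = node.children.setdefault(word, TrieNode())
        node.prob = prob
    return root


def lookup(root: TrieNode, words: Sequence[str]) -> Optional[float]:
    """Return the stored probability for an n-gram, or None if it is absent."""
    node = root
    for word in words:
        node = node.children.get(word)
        if node is None:
            return None
    return node.prob


def string_log_prob(root: TrieNode, words: Sequence[str],
                    order: int = 2, floor: float = 1e-6) -> float:
    """Score a word string by summing log-probabilities word by word.

    Assumption: for each word, use the longest stored context (up to
    `order` words in total), then shorter ones, then `floor` if nothing matches.
    """
    total = 0.0
    for i in range(len(words)):
        p = None
        for start in range(max(0, i - order + 1), i + 1):
            p = lookup(root, tuple(words[start:i + 1]))
            if p is not None:
                break
        if p is None:
            p = floor
        total += math.log(p)
    return total
```

A dictionary per node keeps the sketch short but is the opposite of compact; a storage-conscious representation would more likely pack nodes into flat arrays of integer word IDs and quantized probabilities.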
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.