Representing n-gram language models for compact storage and fast retrieval
US7877258B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 29, 2007 |
| Grant date | Jan 25, 2011 |
| Priority date | — |
| Expiry date | Nov 24, 2029 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/268
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model, which includes receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
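The two steps named in the abstract (building a trie from n-grams with probabilities, then using it to score a string of words) can be illustrated with a minimal Python sketch. This is not the patented encoding, which the abstract does not describe. The names `TrieNode`, `build_trie`, `lookup`, and `string_probability` are hypothetical, the stored values are assumed to be conditional probabilities of the last word given the preceding words, and the fallback to shorter contexts (with no back-off weights and a fixed `unseen` floor) is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class TrieNode:
    """One trie node; the path from the root spells out an n-gram."""
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    prob: Optional[float] = None  # set only when the path is a stored n-gram


def build_trie(ngrams: Dict[Tuple[str, ...], float]) -> TrieNode:
    """Insert every (n-gram, probability) pair into a word-level trie."""
    root = TrieNode()
    for words, prob in ngrams.items():
        node = root
        for w in words:
            node = node.children.setdefault(w, TrieNode())
        node.prob = prob
    return root


def lookup(root: TrieNode, words: Sequence[str]) -> Optional[float]:
    """Return the stored probability for an n-gram, or None if absent."""
    node = root
    for w in words:
        node = node.children.get(w)
        if node is None:
            return None
    return node.prob


def string_probability(root: TrieNode, words: Sequence[str],
                       order: int, unseen: float = 1e-6) -> float:
    """Score a word string as a product of per-word probabilities.

    Assumption: for each word, use the longest stored n-gram (up to
    `order` words) that ends at that word; if none is stored, use the
    `unseen` floor. Real back-off models also apply back-off weights.
    """
    total = 1.0
    for i in range(len(words)):
        p = None
        # Try the longest context first, then progressively shorter ones.
        for start in range(max(0, i - order + 1), i + 1):
            p = lookup(root, words[start:i + 1])
            if p is not None:
                break
        total *= p if p is not None else unseen
    return total


# Toy data, invented for illustration only.
toy = {("the",): 0.5, ("cat",): 0.2, ("the", "cat"): 0.4}
root = build_trie(toy)
score = string_probability(root, ["the", "cat"], order=2)
```

With the toy data, `score` is the product of the unigram entry for "the" and the bigram entry for "the cat". Shared prefixes such as "the" are stored once in the trie, which is the intuition behind using a trie for compact storage.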
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.