Apparatus and method for estimating, from sparse data, the probability that a particular one of a set of events is the next event in a string of events
US4831550A · kind A · utility
Assignee
Inventor
Key dates
| Filing date | Mar 27, 1986 |
| Grant date | May 16, 1989 |
| Priority date | — |
| Expiry date | Mar 27, 2006 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/14
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Apparatus and method for evaluating the likelihood of an event (such as a word) following a string of known events, based on event sequence counts derived from sparse sample data. Event sequences--or m-grams--include a key and a subsequent event. For each m-gram is stored a discounted probability generated by applying modified Turing's estimate, for example, to a count-based probability. For a key occurring in the sample data there is stored a normalization constant which preferably (a) adjusts the discounted probabilities for multiple counting, if any, and (b) includes a freed probability mass allocated to m-grams which do not occur in the sample data. To determine the likelihood of a selected event following a string of known events, a "backing off" scheme is employed in which successively shorter keys (of known events) followed by the selected event (representing m-grams) are searched until an m-gram is found having a discounted probability stored therefor. The normalization constants of the longer searched keys--for which the corresponding m-grams have no stored discounted probability--are combined together with the found discounted probability to produce the likelihood of the …
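The "backing off" lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `backoff_likelihood`, the two dictionaries `discounted` and `norm`, and the toy numbers are all hypothetical. The sketch also assumes that a key absent from the sample data contributes a constant of 1.0, and that an event with no stored probability even for the empty key gets likelihood 0.0; the abstract does not state either point.

```python
def backoff_likelihood(history, event, discounted, norm):
    """Estimate the likelihood of `event` following `history` by backing off.

    discounted: dict mapping an m-gram tuple (key + event) to its stored
                discounted probability (hypothetical layout).
    norm:       dict mapping a key tuple to its stored normalization constant
                (hypothetical layout).
    """
    weight = 1.0
    key = tuple(history)
    while True:
        mgram = key + (event,)
        if mgram in discounted:
            # Found a stored discounted probability: combine it with the
            # normalization constants of the longer keys searched so far.
            return weight * discounted[mgram]
        if not key:
            # Assumption: nothing stored even for the event alone.
            return 0.0
        # Assumption: a key never seen in the sample data contributes 1.0.
        weight *= norm.get(key, 1.0)
        # Drop the oldest known event to form the next shorter key.
        key = key[1:]


# Toy data, invented purely for illustration.
discounted = {
    ("the", "cat", "sat"): 0.5,
    ("cat", "ran"): 0.2,
    ("dog",): 0.05,
}
norm = {("the", "cat"): 0.4, ("cat",): 0.3}

print(backoff_likelihood(("the", "cat"), "sat", discounted, norm))
print(backoff_likelihood(("the", "cat"), "ran", discounted, norm))
print(backoff_likelihood(("the", "cat"), "dog", discounted, norm))
```

In this toy setup the first query is answered directly from the longest m-gram, the second backs off once (one normalization constant is applied), and the third backs off twice (two constants are multiplied together before the stored probability is applied).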
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.