Technique for searching out new words that should be registered in dictionary for speech processing
US8140332B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 14, 2007 |
| Grant date | Mar 20, 2012 |
| Priority date | — |
| Expiry date | Jan 15, 2031 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F40/284
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
To search out a new word that should be newly registered in a dictionary contained in a segmentation device for segmenting a text into words. This system inputs a training text into the segmentation device to cause the segmentation device to segment the training text into words, and thereby generates a plurality of segmentation candidates in association with certainty factors of the results of the segmentation, the segmentation candidates respectively containing mutually different combinations of words as results of the segmentation of the training text. Then, this system computes a likelihood that the each word is a new word by summing up some of the certainty factors that are respectively associated with some of the plurality of segmentation candidates that contain the each word. Then, from among combinations of words each contained in at least any one of the segmentation candidates, the system searches combinations of words contained in at least any one of the segmentation candidates and containing words with which the entire training text can be written, in order to find out a combination that minimizes an information entropy of words assuming that each word belonging to the co…
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.