Patent · US Active

Technique for searching out new words that should be registered in dictionary for speech processing

US8140332B2 · kind B2 · utility

Cited by: 6
References: 22
Claims: 5
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 14, 2007
Grant date: Mar 20, 2012
Priority date
Expiry date: Jan 15, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/284
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The objective is to search out new words that should be registered in the dictionary of a segmentation device, which segments text into words. The system inputs a training text into the segmentation device and causes it to segment the training text into words, thereby generating a plurality of segmentation candidates, each associated with a certainty factor for its segmentation result. The candidates contain mutually different combinations of words as results of segmenting the training text. The system then computes the likelihood that each word is a new word by summing the certainty factors associated with those segmentation candidates that contain the word. Next, from among the combinations of words that each appear in at least one segmentation candidate and with which the entire training text can be written, the system searches for the combination that minimizes the information entropy of words, assuming that each word belonging to the co…
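
The likelihood step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `new_word_likelihoods`, the toy candidates, and the certainty values are hypothetical, and it assumes that a word contributes a candidate's certainty factor once per candidate, however many times the word occurs in it. The entropy-minimizing search is left out because the abstract is truncated before that step is fully described.

```python
from collections import defaultdict


def new_word_likelihoods(candidates):
    """Sum, for each word, the certainty factors of the candidates containing it.

    candidates: list of (words, certainty) pairs, where words is one
    segmentation of the training text (hypothetical input format).
    """
    scores = defaultdict(float)
    for words, certainty in candidates:
        # Assumption: count each word once per candidate, even if it repeats.
        for word in set(words):
            scores[word] += certainty
    return dict(scores)


# Toy segmentation candidates for the text "abcd" with made-up certainty factors.
candidates = [
    (["ab", "cd"], 0.5),
    (["a", "b", "cd"], 0.25),
    (["ab", "c", "d"], 0.25),
]
likelihoods = new_word_likelihoods(candidates)
```

In this toy input, "ab" and "cd" each appear in two of the three candidates, so they accumulate more certainty than the single-character words that appear in only one.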

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.