Patent · US Active

Word breaker from cross-lingual phrase table

US9330087B2 · kind B2 · utility

Cited by: 1
References: 7
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 11, 2013
Grant date: May 3, 2016
Priority date
Expiry date: Mar 13, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/45
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Automatically creating word breakers which segment words into morphemes is described, for example, to improve information retrieval, machine translation or speech systems. In embodiments, a cross-lingual phrase table is available, comprising source language (such as Turkish) phrases and potential translations in a target language (such as English) with associated probabilities. In various examples, blocks of source language phrases which have similar target language translations are created from the phrase table. In various examples, inference using the target language translations in a block enables stem and affix combinations to be found for source language words without the need for input from human judges or prior knowledge of source language linguistic rules or a source language lexicon.
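
The idea in the abstract can be illustrated with a toy sketch. This is not the patented method: the phrase table data, the stop-word list, the crude plural stripping used to decide that two English translations are "similar", and the longest-common-prefix rule used in place of probabilistic inference are all assumptions made for illustration. The names PHRASE_TABLE, content_key, build_blocks and segment_block are hypothetical.

```python
from collections import defaultdict
from os.path import commonprefix

# Hypothetical toy phrase table: source word -> {target phrase: probability}.
PHRASE_TABLE = {
    "ev": {"house": 0.9},
    "evler": {"houses": 0.8},
    "evde": {"in the house": 0.7},
    "evlerde": {"in the houses": 0.6},
    "kitap": {"book": 0.9},
    "kitaplar": {"books": 0.8},
}

# Assumed list of target-language function words to ignore when grouping.
STOP_WORDS = {"in", "the", "of", "to", "a"}


def content_key(target_phrase):
    """Return a crude grouping key for a target phrase.

    Takes the first word that is not a stop word and strips a trailing
    plural "s". This stands in for a real similarity measure.
    """
    for word in target_phrase.split():
        if word not in STOP_WORDS:
            if word.endswith("s") and len(word) > 3:
                return word[:-1]
            return word
    return target_phrase


def build_blocks(table):
    """Group source words whose most probable translations share a key."""
    blocks = defaultdict(set)
    for source_word, translations in table.items():
        best = max(translations, key=translations.get)
        blocks[content_key(best)].add(source_word)
    return dict(blocks)


def segment_block(words):
    """Split every word in a block into (stem, affix).

    The stem is the longest common prefix of the block, a simple
    substitute for the inference step described in the abstract.
    """
    stem = commonprefix(sorted(words))
    return {word: (stem, word[len(stem):]) for word in words}


if __name__ == "__main__":
    for key, words in build_blocks(PHRASE_TABLE).items():
        for word, (stem, affix) in sorted(segment_block(words).items()):
            print(key, word, stem, affix)
```

In this sketch the four Turkish forms that translate to variants of "house" land in one block, and the shared prefix "ev" is taken as the stem, leaving "ler", "de" and "lerde" as candidate affixes. No Turkish lexicon or grammar rules are consulted; the grouping signal comes only from the target side of the table, which is the point the abstract makes.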

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.