Patent · US Active

Language segmentation of multilingual texts

US8600730B2 · kind B2 · utility

Cited by: 3
References: 15
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Feb 8, 2011
Grant date: Dec 3, 2013
Priority date
Expiry date: Feb 4, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/263
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for segmenting a multi-language text is provided. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model.
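The final step described in the abstract, choosing the highest-probability language sequence from per-sentence language probabilities and language-transition probabilities, can be illustrated with a Viterbi-style dynamic program. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `best_language_sequence`, the dictionary-based inputs, and the example probabilities are all hypothetical, and the transition probabilities are supplied as fixed values here rather than learned from the initial distribution as the abstract describes.

```python
import math


def best_language_sequence(sentence_probs, transition_probs, languages):
    """Return the most probable language label for each sentence.

    sentence_probs: one dict per sentence mapping language -> probability
        (stands in for the initial per-sentence distribution; assumed input).
    transition_probs: dict mapping (previous_language, current_language)
        -> probability (assumed fixed here, not learned).
    languages: list of candidate language codes.
    """
    if not sentence_probs:
        return []

    def log(p):
        # Work in log space to avoid underflow on long texts.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][lang]: best log-score of any labeling of sentences 0..i
    # that assigns lang to sentence i.
    best = [{lang: log(sentence_probs[0][lang]) for lang in languages}]
    back = []  # back[i][lang]: best predecessor of lang at sentence i + 1

    for probs in sentence_probs[1:]:
        scores, pointers = {}, {}
        for cur in languages:
            prev = max(
                languages,
                key=lambda p: best[-1][p] + log(transition_probs[(p, cur)]),
            )
            scores[cur] = (
                best[-1][prev]
                + log(transition_probs[(prev, cur)])
                + log(probs[cur])
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final language.
    path = [max(languages, key=lambda lang: best[-1][lang])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical example: the middle sentence looks slightly more French in
# isolation, but language switches are assumed to be unlikely.
languages = ["en", "fr"]
sentence_probs = [
    {"en": 0.9, "fr": 0.1},
    {"en": 0.45, "fr": 0.55},
    {"en": 0.8, "fr": 0.2},
]
transition_probs = {
    ("en", "en"): 0.9, ("en", "fr"): 0.1,
    ("fr", "en"): 0.1, ("fr", "fr"): 0.9,
}
result = best_language_sequence(sentence_probs, transition_probs, languages)
print(result)  # ['en', 'en', 'en']
```

In this example, labeling each sentence independently would mark the middle sentence as French; combining the per-sentence probabilities with the transition probabilities keeps it in English, which is the kind of sequence-level decision the abstract describes.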

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.