Patent · US Active

Systems and methods for translating Chinese pinyin to Chinese characters

US7478033B2 · kind B2 · utility

Cited by: 62
References: 5
Claims: 13
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 16, 2004
Grant date: Jan 13, 2009
Priority date
Expiry date: Jul 28, 2026

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/129
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods to process and translate pinyin to Chinese characters and words are disclosed. A Chinese language model is trained by extracting unknown character strings from Chinese inputs, e.g., documents and/or user inputs/queries, determining valid words from the unknown character strings, and generating a transition matrix based on the Chinese inputs for predicting a word string given the context. A method for translating a pinyin input generally includes generating a set of Chinese character strings from the pinyin input using a Chinese dictionary including words derived from the Chinese inputs and a language model trained based on the Chinese inputs, each character string having a weight indicating the likelihood that the character string corresponds to the pinyin input. An ambiguous user input may be classified as non-pinyin or pinyin by identifying an ambiguous pinyin/non-pinyin ASCII word in the user input and analyzing the context to classify the user input.
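As an illustration of the translation step described in the abstract, the following minimal Python sketch generates weighted Chinese character strings from a pinyin input. It is not the patented implementation: the dictionary entries, the probabilities, and the names DICTIONARY, START, TRANSITION, FLOOR, and candidates are all hypothetical, the input is assumed to be already split into space-separated pinyin tokens, and a brute-force enumeration scored by a bigram transition table stands in for whatever search the patent actually claims.

```python
from itertools import product

# Hypothetical toy dictionary: pinyin token -> candidate Chinese words.
DICTIONARY = {
    "zhong": ["中", "种"],
    "guo": ["国", "过"],
}

# Hypothetical probabilities of a word starting a string.
START = {"中": 0.6, "种": 0.4}

# Hypothetical transition matrix: P(next word | previous word).
TRANSITION = {
    ("中", "国"): 0.9,
    ("中", "过"): 0.1,
    ("种", "国"): 0.2,
    ("种", "过"): 0.8,
}

# Small fallback weight for word pairs missing from the toy tables.
FLOOR = 1e-6


def candidates(pinyin):
    """Return (character string, weight) pairs, highest weight first."""
    tokens = pinyin.split()
    options = [DICTIONARY.get(token, []) for token in tokens]
    # No candidates if the input is empty or any token is not in the dictionary.
    if not tokens or any(not words for words in options):
        return []

    scored = []
    for words in product(*options):
        # Weight = start probability times each word-to-word transition.
        weight = START.get(words[0], FLOOR)
        for prev, cur in zip(words, words[1:]):
            weight *= TRANSITION.get((prev, cur), FLOOR)
        scored.append(("".join(words), weight))

    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


if __name__ == "__main__":
    for text, weight in candidates("zhong guo"):
        print(text, round(weight, 4))
```

With these toy numbers, the highest-weighted string for "zhong guo" is 中国. Enumerating every combination grows exponentially with input length, so a practical system would more likely use dynamic programming (for example, a Viterbi-style search) over the same transition weights.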

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.