Multi-lingual word hyphenation using inductive machine learning on training data
US8996994B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 16, 2008 |
| Grant date | Mar 31, 2015 |
| Priority date | — |
| Expiry date | May 15, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/191
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Tools and techniques are described for providing multi-lingual word hyphenation using inductive machine learning on training data. Methods provided by these techniques may receive training data that includes hyphenated words, and may inductively generate hyphenation patterns that represent substrings of these words. The hyphenation patterns may include the substrings and hyphenation codes associated with characters occurring in the substrings. The methods may receive induction parameters applicable to generating the hyphenation patterns, and may store the hyphenation patterns into a language-specific lexicon file. These methods may also receive requests to hyphenate input words that occur in a human language, and may evaluate how to process the request based on the language. The methods may search for hyphenation patterns occurring in the input words, with the hyphenation patterns being stored in the lexicon file. Finally, the methods may respond to the request, indicating whether the hyphenation patterns occurred in the input words.
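The lookup step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how hyphenation codes are encoded, so this example assumes a Liang-style scheme (the one used by TeX): digits embedded in a pattern score the gaps between characters, the highest score for each gap wins, and an odd score permits a break. The function names, the `PATTERNS` list, and the `left_min`/`right_min` parameters are hypothetical and are not taken from the patent.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into ('hyph', [0, 0, 3, 0, 0]).

    codes[i] is the hyphenation code for the gap before letter i;
    the final entry is the gap after the last letter.
    """
    letters = []
    codes = [0]
    for ch in pattern:
        if ch.isdigit():
            codes[-1] = int(ch)   # digit scores the current gap
        else:
            letters.append(ch)
            codes.append(0)       # open a new gap after this letter
    return "".join(letters), codes


def build_lexicon(patterns):
    """Map each pattern substring to its list of gap codes (assumed layout)."""
    return dict(parse_pattern(p) for p in patterns)


def hyphenate(word, lexicon, left_min=2, right_min=2):
    """Return the pieces of `word` split at the permitted break points."""
    padded = "." + word.lower() + "."      # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)
    # Try every substring of the padded word against the lexicon.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            codes = lexicon.get(padded[start:end])
            if codes:
                for offset, code in enumerate(codes):
                    pos = start + offset
                    scores[pos] = max(scores[pos], code)
    # A break before word[k] corresponds to scores[k + 1]; odd means allowed.
    pieces, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces


# Illustrative patterns only; a real lexicon file would hold many more.
PATTERNS = ["hy3ph", "he2n", "hen5at", "1na", "n2at", "1tio", "2io"]
LEXICON = build_lexicon(PATTERNS)

print("-".join(hyphenate("hyphenation", LEXICON)))  # hy-phen-a-tion
```

In this sketch a language-specific lexicon is just a dictionary built from a pattern list, so supporting another language would amount to loading a different list. The inductive generation of patterns from hyphenated training words is not shown here.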
Source: USPTO / EPO open patent data (objective bibliographic data and citation counts).