Method and apparatus for automatic language determination of European script documents
US5377280A · kind A · utility
Assignees
Inventor
Key dates
| Filing date | Apr 19, 1993 |
| Grant date | Dec 27, 1994 |
| Priority date | — |
| Expiry date | Apr 19, 2013 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/263
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An automatic language-determining apparatus automatically determines the particular European language of the text image of a document when the gross-script-type is known to be, or is determined to be, a European script-type. A word token generating means generates word tokens from the text image. A feature determining means determines the frequency of appearance of word tokens of the text portion which correspond to predetermined word tokens. A language determining means converts the determined frequency of appearance rates to a point in a new coordinate space, then determines which predetermined region of the new coordinate space the point is closest to, to determine the language of the text portion.
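The three steps named in the abstract (token frequencies, conversion to a point in a new coordinate space, nearest predetermined region) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the token list, the projection weights, the region centres, the use of Euclidean distance, and the function names are all hypothetical, and plain text words stand in for the word tokens that the apparatus generates from the text image.

```python
import math
from collections import Counter

# Hypothetical predetermined word tokens. The patent generates its tokens
# from a text image; plain strings stand in for them in this sketch.
PREDETERMINED_TOKENS = ["the", "and", "der", "und", "le", "et"]

# Hypothetical linear map into a 2-D coordinate space (one row per axis).
PROJECTION = [
    [1.0, 1.0, -1.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, -1.0, 1.0, 1.0],
]

# Hypothetical centre of each language's predetermined region.
REGION_CENTERS = {
    "English": (0.10, 0.00),
    "German": (-0.10, -0.10),
    "French": (0.00, 0.10),
}


def token_frequencies(word_tokens):
    """Rate of appearance of each predetermined token in the text portion."""
    total = len(word_tokens)
    if total == 0:
        return [0.0] * len(PREDETERMINED_TOKENS)
    counts = Counter(word_tokens)
    return [counts[token] / total for token in PREDETERMINED_TOKENS]


def project(frequencies):
    """Convert the frequency rates to a point in the new coordinate space."""
    return tuple(
        sum(weight * freq for weight, freq in zip(row, frequencies))
        for row in PROJECTION
    )


def determine_language(word_tokens):
    """Pick the language whose region centre is closest to the point."""
    point = project(token_frequencies(word_tokens))
    return min(
        REGION_CENTERS,
        key=lambda language: math.dist(point, REGION_CENTERS[language]),
    )
```

For example, `determine_language("der hund und die katze".split())` selects `"German"` under these made-up weights. In a real system the projection and regions would presumably be derived from training documents in each language rather than written by hand.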
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.