Patent · US Expired

Method and apparatus for automatic language determination of European script documents

US5377280A · kind A · utility

Cited by: 29
References: 6
Claims: 8
Family size: 0

Assignees

Inventor

Key dates

Filing date: Apr 19, 1993
Grant date: Dec 27, 1994
Priority date
Expiry date: Apr 19, 2013

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/263
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An automatic language-determining apparatus automatically determines the particular European language of the text image of a document when the gross-script-type is known to be, or is determined to be, a European script-type. A word token generating means generates word tokens from the text image. A feature determining means determines the frequency of appearance of word tokens of the text portion which correspond to predetermined word tokens. A language determining means converts the determined frequency-of-appearance rates to a point in a new coordinate space, then determines which predetermined region of the new coordinate space the point is closest to, to determine the language of the text portion.
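
The three stages named in the abstract (token frequencies, conversion to a point in a new coordinate space, nearest predetermined region) can be illustrated with a minimal Python sketch. This is not the patented implementation: the token list, the projection matrix, the region centers, and all function names below are hypothetical values chosen for illustration, plain strings stand in for the word tokens that the apparatus would generate from a text image, and each region is reduced to a single center point compared by Euclidean distance.

```python
import math
from collections import Counter

# Hypothetical predetermined word tokens (assumption; plain words stand in
# for the tokens that would be generated from a text image).
PREDETERMINED_TOKENS = ["the", "de", "und"]

# Hypothetical linear map from the 3 frequency rates to a 2-D coordinate
# space (assumption; the abstract does not specify the conversion).
PROJECTION = [
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
]

# Hypothetical region centers in the new coordinate space (assumption).
REGION_CENTERS = {
    "English": (0.10, 0.00),
    "French": (-0.10, 0.10),
    "German": (0.00, -0.10),
}


def token_frequencies(tokens):
    """Return the appearance rate of each predetermined token."""
    total = len(tokens)
    if total == 0:
        return [0.0] * len(PREDETERMINED_TOKENS)
    counts = Counter(tokens)
    return [counts[t] / total for t in PREDETERMINED_TOKENS]


def project(freqs):
    """Convert the frequency rates to a point in the new coordinate space."""
    return tuple(sum(w * f for w, f in zip(row, freqs)) for row in PROJECTION)


def nearest_language(point):
    """Pick the language whose region center is closest to the point."""
    return min(REGION_CENTERS, key=lambda lang: math.dist(point, REGION_CENTERS[lang]))


def determine_language(tokens):
    """Run the three stages in sequence on a list of word tokens."""
    return nearest_language(project(token_frequencies(tokens)))


if __name__ == "__main__":
    print(determine_language("the cat sat on the mat".split()))
```

In this sketch, adding a language means adding a region center, and adding a feature means adding a token and a column to the projection matrix. A real system would derive the projection and the regions from training documents rather than from hand-picked numbers.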

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.