Patent · US Expired

Method and apparatus for classifying text

US5182708A · kind A · utility

Cited by: 55
References: 2
Claims: 11
Family size: 0

Assignee

Inventor

Key dates

Filing date: Sep 24, 1991
Grant date: Jan 26, 1993
Priority date
Expiry date: Sep 24, 2011

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/253
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present invention provides a method and apparatus for classifying text by using two constants determined by analyzing the text. The first constant, G, classifies text in the order of constraint. It is defined by the equation G = log(N/L) / {log(N) - 1}, where N is the number of words and L is the number of different words in the text being classified. The second constant, R, is the correlation coefficient between the word length and the logarithm-scaled rank order of word frequency. The values of the two constants can be used to determine how to classify text. In the case of English text, the text may be classified as computer language, text from a technical manual, English text written by foreigners or English text written by native English speakers.
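
The two constants described in the abstract can be sketched in Python as follows. This is an illustrative reading of the abstract only, not the patented implementation. Several details are assumptions because the abstract does not state them: the logarithm is taken as base 10, words are taken to be case-folded runs of letters and apostrophes, R is computed as a Pearson correlation over distinct words, and frequency ties are broken alphabetically. The function names `tokenize`, `constant_g`, `constant_r` and `pearson` are hypothetical. No classification thresholds are included, since the abstract gives none.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word" is a case-folded run of letters or apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def constant_g(words):
    # G = log(N/L) / {log(N) - 1}, with N = number of words and
    # L = number of different words. Base-10 logarithm is an assumption.
    n = len(words)
    if n == 0:
        raise ValueError("text contains no words")
    distinct = len(set(words))
    denominator = math.log10(n) - 1
    if denominator == 0:
        raise ValueError("G is undefined when log(N) equals 1")
    return math.log10(n / distinct) / denominator


def pearson(xs, ys):
    # Plain Pearson correlation coefficient of two equal-length sequences.
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def constant_r(words):
    # R = correlation between word length and log-scaled frequency rank.
    # Assumption: rank 1 is the most frequent distinct word, and ties are
    # broken alphabetically so the ranking is deterministic.
    counts = Counter(words)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    log_ranks = [math.log10(rank) for rank in range(1, len(ranked) + 1)]
    lengths = [len(word) for word in ranked]
    return pearson(log_ranks, lengths)
```

Calling `constant_g(tokenize(text))` and `constant_r(tokenize(text))` on a passage yields the pair of values that the abstract says can be used to classify it. Mapping that pair to a category (computer language, technical manual, and so on) would require decision boundaries that are not given in this record.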

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.