Patent · US · Expired

Method for extracting multi-word technical terms from text

US5423032A · kind A · utility

Cited by: 103
References: 8
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 3, 1992
Grant date: Jun 6, 1995
Priority date
Expiry date: Jan 3, 2012

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99935
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and apparatus for extracting multi-word technical terms from a text file in a computer system. Word strings are selected from the text that contain at least two words and at most a specified maximum number of words, that include none of a special set of selected tokens, and that include only selected characters. Word strings that occur fewer than a specified minimum number of times in the text file are deleted. The remaining strings form a set of word strings very likely to be multi-word technical terms. The quality of this set can be improved further by deleting word strings that do not satisfy certain grammatical constraints.
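
The filtering steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract gives no concrete values, so the maximum length, the minimum count, the excluded-token set, and the allowed-character pattern below are all assumptions chosen for illustration, as is the function name extract_terms. The optional grammatical-constraint step is omitted because the abstract does not say what the constraints are.

```python
import re
from collections import Counter

# All values below are illustrative assumptions; the abstract names these
# parameters but does not specify them.
STOP_TOKENS = frozenset(
    {"the", "a", "an", "of", "and", "or", "in", "on", "to", "is", "are", "for", "with"}
)
# Tokenizer: runs of letters/hyphens are one token, any other run of
# non-space characters (punctuation, digits) is a separate token.
TOKEN = re.compile(r"[a-z\-]+|[^\sa-z\-]+")
# "Selected characters": lowercase letters, with optional internal hyphens.
ALLOWED = re.compile(r"[a-z]+(?:-[a-z]+)*")


def extract_terms(text, max_words=4, min_count=2, stop_tokens=STOP_TOKENS):
    """Return {word string: count} for candidate multi-word terms."""
    tokens = TOKEN.findall(text.lower())
    counts = Counter()
    # Consider every word string of 2..max_words consecutive tokens.
    for n in range(2, max_words + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Reject strings containing an excluded token or a token with
            # disallowed characters (this also blocks spans across punctuation).
            if all(ALLOWED.fullmatch(t) and t not in stop_tokens for t in gram):
                counts[" ".join(gram)] += 1
    # Delete word strings that occur fewer than min_count times.
    return {term: c for term, c in counts.items() if c >= min_count}


sample = (
    "The hash table stores keys. A hash table is fast. "
    "The hash function maps keys to a hash table slot."
)
print(extract_terms(sample))
```

On the sample text, only "hash table" occurs often enough to survive the frequency filter. Treating punctuation as its own token is a design choice of this sketch: it keeps candidate strings from spanning sentence boundaries without a separate sentence splitter.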

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.