Patent · US · Expired

Classification of data records by comparison of records to a training database using probability weights

US5251131A · kind A · utility

Cited by: 206
References: 3
Claims: 37
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 31, 1991
Grant date: Oct 5, 1993
Priority date
Expiry date: Jul 31, 2011

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/30
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Classification of natural language data wherein the natural language data has an open-ended range of possible values or the data values do not have a relative order. A training database stores training records, wherein each training record includes predictor data fields, each predictor data field containing a feature, wherein each feature is a natural language term, and a target data field containing a target value representing a classification of the record. Features may also include conjunctions of natural language terms, and each feature may also be a member of a category subset of features. The training database stores, for each feature, a probability weight value representing the probability that a record will have the target value contained in the target data field if a feature contained in a corresponding predictor data field occurs in the record. Features are extracted from a new record, and each feature from the new record is used to query the training records to determine the probability weights from the training records having matching features. The probability weights are accumulated for each training record to determine a comparison score representing the probability that…
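
The scoring flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy records, the whitespace tokenizer in `extract_features`, the frequency-based estimate in `build_weights`, and the use of a plain sum to accumulate weights are all assumptions made for illustration. Because the abstract is truncated, the sketch stops at ranking training records by comparison score.

```python
from collections import Counter

# Hypothetical toy training records: (free-text predictor field, target value).
TRAINING = [
    ("truck driver", "transport"),
    ("bus driver", "transport"),
    ("software engineer", "computing"),
    ("civil engineer", "construction"),
]


def extract_features(text):
    """Assumed feature extractor: lowercase whitespace-separated terms."""
    return set(text.lower().split())


def build_weights(training):
    """Estimate weight[(feature, target)] as the fraction of training records
    containing the feature that also carry that target value (an assumption
    about how the probability weights might be derived)."""
    feature_count = Counter()
    pair_count = Counter()
    for text, target in training:
        for feature in extract_features(text):
            feature_count[feature] += 1
            pair_count[(feature, target)] += 1
    return {(f, t): n / feature_count[f] for (f, t), n in pair_count.items()}


def comparison_scores(new_text, training, weights):
    """For each training record, accumulate the weights of the features it
    shares with the new record. A plain sum is assumed here."""
    new_features = extract_features(new_text)
    scores = []
    for text, target in training:
        shared = new_features & extract_features(text)
        score = sum(weights[(f, target)] for f in shared)
        scores.append((score, text, target))
    # Highest comparison score first; ties keep training order.
    return sorted(scores, key=lambda s: s[0], reverse=True)


WEIGHTS = build_weights(TRAINING)
RANKED = comparison_scores("taxi driver", TRAINING, WEIGHTS)
```

In this toy data, "driver" appears only in records whose target is "transport", so its weight for that target is 1.0, while "engineer" is split evenly between two targets and gets 0.5 for each. A new record such as "taxi driver" therefore scores highest against the two "transport" records, and the unseen term "taxi" contributes nothing.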

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.