Patent · US Expired

Method and system for linearly detecting data deviations in a large database

US5813002A · kind A · utility

26Cited by
28References
15Claims
0Family size

Assignee

Inventors

Key dates

Filing dateJul 31, 1996
Grant dateSep 22, 1998
Priority date
Expiry dateJul 31, 2016

Classification

  • Technology area (CPC Y)Emerging Cross-Sectional Technologies
  • CPC primaryY10S707/99936
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method for detecting deviations in a database is disclosed, comprising the steps of: determining respective frequencies of occurrence for the attribute values of the data items, and identifying any itemset whose similarity value satisfies a predetermined criterion as a deviation, based on the frequencies of occurrence. The determination of the frequencies of occurrence includes computing an overall similarity value for the database, and for each first itemset, computing a difference between the overall similarity value and the similarity value of a second itemset. The second itemset has all the data items except those of the first itemset. Preferably, a smoothing factor is used for indicating how much dissimilarity in an itemset can be reduced by removing a subset of items from the itemset. The smoothing factor is evaluated as each item is incrementally removed from the itemset, thereby allowing a data item to be identified as a deviation when the difference if similarity value is the highest.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.