Patent · US Expired

Density-based indexing method for efficient execution of high dimensional nearest-neighbor queries on large databases

US6263334A · kind A · utility

253Cited by
7References
39Claims
0Family size

Assignee

Inventors

Key dates

Filing dateNov 11, 1998
Grant dateJul 17, 2001
Priority date
Expiry dateNov 11, 2018

Classification

  • Technology area (CPC Y)Emerging Cross-Sectional Technologies
  • CPC primaryY10S707/99943
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Method and apparatus for efficiently performing nearest neighbor queries on a database of records wherein each record has a large number of attributes by automatically extracting a multidimensional index from the data. The method is based on first obtaining a statistical model of the content of the data in the form of a probability density function. This density is then used to decide how data should be reorganized on disk for efficient nearest neighbor queries. At query time, the model decides the order in which data should be scanned. It also provides the means for evaluating the probability of correctness of the answer found so far in the partial scan of data determined by the model. In this invention a clustering process is performed on the database to produce multiple data clusters. Each cluster is characterized by a cluster model. The set of clusters represent a probability density function in the form of a mixture model. A new database of records is built having an augmented record format that contains the original record attributes and an additional record attribute containing a cluster number for each record based on the clustering step. The cluster model uses a probabilit…

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.