Patent · US Expired

Optimal dissimilarity method for choosing distinctive items of information from a large body of information

US6535819B1 · kind B1 · utility

Cited by: 7
References: 0
Claims: 17
Family size: 0

Assignee

Inventor

Key dates

Filing date: Apr 13, 1998
Grant date: Mar 18, 2003
Priority date
Expiry date: Apr 13, 2018

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F18/231
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The method of this invention identifies distinctive items of information from a larger body of information on the basis of similarities or dissimilarities among the items. By applying selection criteria to randomly chosen subsamples of all the information, it achieves a significant increase in speed as well as the ability to balance representativeness and diversity among the identified items. The method is illustrated with reference to the compound selection requirements of medicinal chemists. Compound selection methods currently available to chemists are based on maximum or minimum dissimilarity selection or on hierarchical clustering. The method of the invention is more general and incorporates maximum and minimum dissimilarity-based selection as special cases. In addition, the number of iterations required to select the items is a multiple of the group size which, at its greatest, is approximately the square root of the population size. Thus, the selection method runs much faster than the methods of the prior art. Further, by adjusting the subsample size parameter K, it is possible to control the balance between representativeness and diversity in the compounds selected. In a…
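
The subsample idea described in the abstract can be illustrated with a minimal Python sketch. This is not the claimed method: the abstract does not give the selection criteria in detail, so the sketch assumes one plausible criterion (keep the candidate whose nearest already-selected item is farthest away) and a random starting item. The names `subsample_select`, `dissimilarity`, `n_select`, and `seed` are hypothetical; only the subsample size K comes from the abstract.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def subsample_select(
    items: Sequence[T],
    dissimilarity: Callable[[T, T], float],
    n_select: int,
    k: int,
    seed: int = 0,
) -> List[T]:
    """Pick up to n_select items; each pick is the best of k random candidates.

    Assumed criterion (not taken from the patent text): within each random
    subsample, keep the candidate whose nearest selected item is farthest away.
    """
    if n_select < 1 or k < 1:
        raise ValueError("n_select and k must be >= 1")
    rng = random.Random(seed)
    pool = list(items)
    if not pool:
        return []
    rng.shuffle(pool)
    selected = [pool.pop()]  # assumption: start from a random item
    while pool and len(selected) < n_select:
        # Draw a random subsample of at most k remaining candidates (by index).
        candidates = rng.sample(range(len(pool)), min(k, len(pool)))
        # Score each candidate by its distance to the nearest selected item,
        # then keep the candidate with the largest such distance.
        best = max(
            candidates,
            key=lambda i: min(dissimilarity(pool[i], s) for s in selected),
        )
        selected.append(pool.pop(best))
    return selected


if __name__ == "__main__":
    points = list(range(101))
    picks = subsample_select(points, lambda a, b: abs(a - b), n_select=5, k=10)
    print(picks)
```

In this sketch, setting `k` to the pool size makes every step a full maximum-dissimilarity pick, while `k = 1` makes every pick a random item; intermediate values trade diversity against representativeness, which is the role the abstract assigns to K. How the patent realizes the minimum-dissimilarity special case is not stated in the abstract and is not modeled here.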

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.