Patent · US Expired

Method and system for computationally identifying clusters within a set of sequences

US6109776A · kind A · utility

Cited by: 16
References: 1
Claims: 41
Family size: 0

Assignee

Inventor

Key dates

Filing date: Apr 21, 1998
Grant date: Aug 29, 2000
Priority date
Expiry date: Apr 21, 2018

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G16B30/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system for computationally analyzing an initial set of patterns in order to identify subsets of patterns, called clusters, that contain common sub-patterns. The patterns of the initial set of patterns are represented as linear sequences of subunits, and the common sub-patterns occur as sub-sequences of subunits within the linear sequences starting at different positions within the different linear sequences. Variations in the offset and in the sequence of subunits within a common sub-pattern are considered in the analysis. In one embodiment, an initial set of oligonucleotide sequences that are produced by various biochemical techniques are computationally analyzed to identify clusters that may correspond to a number of different binding sites for DNA-binding proteins within one or more double-stranded DNA duplexes. The method places each oligonucleotide sequence within a new cluster and calculates an initial information weight matrix for that cluster. Then, other sequences from the initial set of sequences are added to the cluster and the information weight matrix of the cluster is re-computed until the information content of the information weight matrix falls below a…
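The cluster-growing step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method. It assumes equal-length, pre-aligned DNA sequences (so the offset handling mentioned in the abstract is ignored), a uniform background over A, C, G, T, no small-sample correction, and a fixed stopping threshold. The names `information_content`, `grow_cluster`, and `min_bits` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA sequences only


def information_content(seqs):
    """Sum of per-column information (bits) for equal-length, aligned sequences.

    Each column contributes log2(4) minus its Shannon entropy, i.e. the
    information relative to a uniform background (an assumption of this sketch).
    """
    n = len(seqs)
    total = 0.0
    for col in range(len(seqs[0])):
        counts = Counter(s[col] for s in seqs)
        entropy = 0.0
        for base in ALPHABET:
            p = counts[base] / n
            if p > 0:
                entropy -= p * math.log2(p)
        total += 2.0 - entropy  # log2(4) = 2 bits is the per-column maximum
    return total


def grow_cluster(seed, pool, min_bits):
    """Greedily add sequences to a cluster seeded with `seed`.

    At each step the candidate that keeps the information content highest is
    tried; growth stops once adding it would drop the content below
    `min_bits` (a hypothetical threshold parameter).
    """
    cluster = [seed]
    remaining = list(pool)
    while remaining:
        best = max(remaining, key=lambda s: information_content(cluster + [s]))
        if information_content(cluster + [best]) < min_bits:
            break
        cluster.append(best)
        remaining.remove(best)
    return cluster


if __name__ == "__main__":
    pool = ["ACGT", "ACGA", "TTTT"]
    print(grow_cluster("ACGT", pool, min_bits=7.0))
```

In this toy run the seed absorbs the identical sequence and the one-mismatch sequence, while the unrelated `TTTT` is rejected because including it would push the matrix's information content below the threshold. A fuller treatment would also search over offsets within each sequence, as the abstract indicates.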

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.