Sequence database search with sequence search trees
US6633817B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 29, 1999 |
| Grant date | Oct 14, 2003 |
| Priority date | — |
| Expiry date | Dec 29, 2019 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG16B40/30
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method and system for generating and searching a tree-structured index of window vectors that represent database sequences comprise a window vector generation module, a tree-structured index generation module, a query sequence partitioning module, and a retrieval component. The window vector generation module partitions a database sequence into a plurality of overlapping windows. Each window has a fixed length W comprising a fixed number of nucleotides, and the offset among windows is determined by a parameter &Dgr;. The window vector generation module then maps each database sequence window into a window vector. The database sequence window vector indicates the frequency of appearance of each k-tuple in the corresponding database sequence window. The tree-structured index generation module then generates a tree-structured index using the database sequence window vectors. The query sequence partitioning module partitions a query sequence into a plurality of windows and maps each query sequence window into a query sequence window vector. Each query sequence window vector is then compared against the tree-structured index to locate the database sequences that are similar to the que…
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.