K-mer database for organism identification
US11830580B2 · kind B2 · utility
Assignees
Inventors
Key dates
| Filing date | Sep 30, 2018 |
| Grant date | Nov 28, 2023 |
| Priority date | — |
| Expiry date | Oct 24, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/367
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A large collection of sample genomes containing misclassified k-mers and metadata errors from a reference taxonomy was converted to a self-consistent k-mer database comprising a self-consistent taxonomy. The self-consistent taxonomy was based on genetic distances calculated using the MinHash method or the Meier-Kolthoff method. An agglomerative clustering algorithm was used to calculate the self-consistent taxonomy. Each k-mer of the sample genomes was assigned to only one node of the self-consistent taxonomy. In another step, each node of the self-consistent taxonomy was mapped to the reference taxonomy, thereby preserving in the self-consistent taxonomy links to the reference taxonomy while correcting for the misclassification errors therein. The self-consistent k-mer database can be used to taxonomically profile sequenced nucleic acids with greater specificity compared to systems relying on the reference taxonomy.
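The pipeline in the abstract (MinHash distances, agglomerative clustering, one node per k-mer) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific here is an assumption made for illustration: the function names, the toy sequences, the bottom-s sketch with a BLAKE2b hash, the Mash-style distance formula D = -(1/k) ln(2j / (1 + j)), average linkage, and the rule that a k-mer goes to the deepest tree node covering every genome that contains it. The mapping back to the reference taxonomy is not shown.

```python
import hashlib
import math

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_sketch(kmer_set, size):
    """Bottom-s sketch: the `size` smallest 64-bit hashes of the k-mers."""
    hashes = sorted(
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmer_set
    )
    return hashes[:size]

def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    a, b = set(sketch_a), set(sketch_b)
    merged = sorted(a | b)[:size]  # smallest hashes of the union
    if not merged:
        return 0.0
    return sum(1 for h in merged if h in a and h in b) / len(merged)

def mash_distance(j, k):
    """Mash-style distance (assumed formula); 1.0 when nothing is shared."""
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)

def agglomerate(names, dist):
    """Naive average-linkage clustering; returns a tree of nested 2-tuples."""
    clusters = [(name, [name]) for name in names]  # (subtree, leaf names)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                li, lj = clusters[i][1], clusters[j][1]
                d = sum(dist[frozenset((x, y))] for x in li for y in lj)
                d /= len(li) * len(lj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters[0][0]

def leaves(node):
    """Set of genome names under a tree node."""
    return {node} if isinstance(node, str) else leaves(node[0]) | leaves(node[1])

def assign_kmers(tree, genome_kmers):
    """Assign each k-mer to exactly one node: the deepest node whose
    leaves include every genome containing that k-mer (assumed rule)."""
    owners = {}
    for name, kmer_set in genome_kmers.items():
        for km in kmer_set:
            owners.setdefault(km, set()).add(name)
    assignment = {}
    for km, genomes in owners.items():
        node = tree
        while not isinstance(node, str):
            left, right = node
            if genomes <= leaves(left):
                node = left
            elif genomes <= leaves(right):
                node = right
            else:
                break  # k-mer spans both children, so it stays at this node
        assignment[km] = node
    return assignment

# Toy genomes (hypothetical): "a" and "b" differ only in the last base.
K, SKETCH = 4, 1000
genomes = {
    "a": "ACGTACGGTCAATGCC",
    "b": "ACGTACGGTCAATGCA",
    "c": "TTTTGGGGCCCCAAAATTGG",
}
genome_kmers = {name: kmers(seq, K) for name, seq in genomes.items()}
sketches = {name: minhash_sketch(ks, SKETCH) for name, ks in genome_kmers.items()}
names = sorted(genomes)
dist = {
    frozenset((x, y)): mash_distance(jaccard_estimate(sketches[x], sketches[y], SKETCH), K)
    for n, x in enumerate(names) for y in names[n + 1:]
}
tree = agglomerate(names, dist)
assignment = assign_kmers(tree, genome_kmers)
```

In this toy run the two near-identical genomes merge first, so a k-mer shared only by them lands on their common parent while a k-mer unique to one genome stays on that leaf. A production system would replace the quadratic clustering loop with a library routine and would use much longer k-mers and far smaller sketches than the genomes themselves.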
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.