Patent · US Active

Systems and methods for de novo assembly of nucleotide sequence reads using a modified string graph

US11557374B1 · kind B1 · utility

Cited by: 0
References: 1
Claims: 6
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 13, 2018
Grant date: Jan 17, 2023
Priority date
Expiry date: Nov 18, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G16B50/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are presented for automatically assembling, de novo, a set of unordered read sequences into one or more larger nucleotide sequences. The method first creates two identical sets of the reads, divides each read in both sets into smaller, sorted mer sequences, and then compares the mers of each read in set 1 with the mers of each read in set 2 to exhaustively identify overlapping segments. The overlap information is used to construct a modified assembly string graph; traversing this graph produces a sorted string graph layout file that lists all the reads ordered left to right, each with its approximate starting offset position. The layout file is then processed by a novel multiple sequence alignment system that uses mer matches between all the overlapping reads at a given position to place matching individual bases from each read into columns, from which an overall consensus sequence is determined.
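
The pipeline described in the abstract (mer comparison, layout with starting offsets, column-wise consensus) can be illustrated with a toy Python sketch. This is not the patented method: the function names (sorted_mers, find_overlaps, layout, consensus), the parameters k and min_shared, and the example reads are all hypothetical. It also makes simplifying assumptions: reads are error-free, all on the same strand, and form a single contig. A plain offset chain stands in for the modified string graph, and a per-column majority vote stands in for the multiple sequence alignment step.

```python
from collections import Counter, defaultdict


def sorted_mers(read, k):
    """Split a read into its k-mers, returned as a sorted list of (mer, position)."""
    return sorted((read[i:i + k], i) for i in range(len(read) - k + 1))


def find_overlaps(reads, k=3, min_shared=2):
    """Exhaustively compare the mers of every read (set 1) with every other read (set 2).

    Returns {(i, j): offset}, where offset > 0 is the estimated start of
    read j relative to read i, chosen as the most frequent difference
    between the positions of shared mers. Assumption: an overlap needs at
    least min_shared mers that agree on the same offset.
    """
    mers = [sorted_mers(r, k) for r in reads]
    overlaps = {}
    for i, mers_i in enumerate(mers):
        for j, mers_j in enumerate(mers):
            if i == j:
                continue
            # Index the mers of read j by sequence for direct lookup.
            index_j = defaultdict(list)
            for mer, pos in mers_j:
                index_j[mer].append(pos)
            # Each shared mer votes for one relative offset.
            votes = Counter(
                pi - pj for mer, pi in mers_i for pj in index_j.get(mer, ())
            )
            if not votes:
                continue
            offset, count = votes.most_common(1)[0]
            # Keep only pairs where read j lies to the right of read i.
            if count >= min_shared and offset > 0:
                overlaps[(i, j)] = offset
    return overlaps


def layout(n_reads, overlaps):
    """Order reads left to right with approximate starting offsets.

    Toy stand-in for graph traversal: start from reads that nothing
    overlaps from the left and accumulate offsets along the overlaps.
    """
    targets = {j for (_, j) in overlaps}
    stack = [(i, 0) for i in range(n_reads) if i not in targets]
    starts = {}
    while stack:
        i, pos = stack.pop()
        if i in starts:
            continue
        starts[i] = pos
        for (a, b), off in overlaps.items():
            if a == i and b not in starts:
                stack.append((b, pos + off))
    return sorted(starts.items(), key=lambda item: item[1])


def consensus(reads, placed):
    """Place each base into a column by absolute position and take a majority vote."""
    columns = defaultdict(Counter)
    for i, start in placed:
        for p, base in enumerate(reads[i]):
            columns[start + p][base] += 1
    return "".join(columns[c].most_common(1)[0][0] for c in sorted(columns))


# Hypothetical reads sampled from the made-up sequence ACGTTGCAGGCTTAA.
reads = ["ACGTTGCA", "TTGCAGGC", "AGGCTTAA"]
overlaps = find_overlaps(reads)
placed = layout(len(reads), overlaps)
assembled = consensus(reads, placed)
```

Here sorted_mers mirrors the sorted mer lists mentioned in the abstract, although the sketch looks mers up in a dictionary rather than merging sorted lists. The all-against-all loop is quadratic in the number of reads, which is acceptable for illustration only.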

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.