Patent · US Active

Quality score compression apparatus and method for improving downstream accuracy

US11762813B2 · kind B2 · utility

0Cited by
0References
11Claims
0Family size

Inventors

Key dates

Filing dateFeb 5, 2019
Grant dateSep 19, 2023
Priority date
Expiry dateDec 22, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG16C99/00
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

This disclosure provides for a highly-efficient and scalable compression tool that compresses quality scores, preferably by capitalizing on sequence redundancy. In one embodiment, compression is achieved by smoothing a large fraction of quality score values based on k-mer neighborhood of their corresponding positions in read sequences. The approach exploits the intuition that any divergent base in a k-mer likely corresponds to either a single-nucleotide polymorphism (SNP) or sequencing error; thus, a preferred approach is to only preserve quality scores for probable variant locations and compress quality scores of concordant bases, preferably by resetting them to a default value. By viewing individual read datasets through the lens of k-mer frequencies in a corpus of reads, the approach herein ensures that compression “lossiness” does not affect accuracy in a deleterious way.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.