Genome data compression and transmission method for FASTQ-formatted genome data
US11775172B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | May 5, 2022 |
| Grant date | Oct 3, 2023 |
| Priority date | — |
| Expiry date | May 5, 2042 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/70
- WIPO field: Basic communication processes
- WIPO sector: Electrical engineering
Abstract
Provided is a genome data compression method for compressing FASTQ-formatted genome data using M cores (M is a natural number of 4 or greater), the method including: storing, by a first core that is one of the M cores, the fixed header data in the first line of the first piece of sequence data in a compression result storage; and allocating, by the first core, N (N is a natural number of 2 or greater) pieces of the sequence data to each of the other M-1 cores (hereinafter referred to as “the remaining cores”), performing compression by each of the remaining cores so that N*(M-1) pieces of the sequence data are compressed together in parallel, and storing a compression result in the compression result storage, wherein the compression performed by each of the remaining cores includes: primary compression in which, for each of the N pieces of the sequence data, the following stages are repeated: a stage in which the fixed header in the first line is removed; a stage in which the second line is encoded; a stage in which the identifier in the third line is stored; and a stage in which run-length encoding is performed on the fourth line; and secondary…
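The allocation step and the four primary-compression stages named in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the integer code table for bases, and the output dictionary layout are all assumptions made for illustration. The abstract does not say how the second line is encoded, and the secondary compression stage is truncated in this record, so it is not shown.

```python
from itertools import groupby

# Hypothetical symbol table; the abstract does not specify the base encoding.
BASE_CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def run_length_encode(line: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(line)]


def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def primary_compress(record: list[str], fixed_header: str) -> dict:
    """Apply the four per-record stages to one 4-line FASTQ record."""
    header, bases, identifier, quality = record
    # Stage 1: remove the fixed header prefix (stored once elsewhere).
    if header.startswith(fixed_header):
        header = header[len(fixed_header):]
    return {
        "header_rest": header,
        # Stage 2: encode the base sequence (assumed integer codes).
        "bases": [BASE_CODES[b] for b in bases],
        # Stage 3: store the identifier line as-is.
        "identifier": identifier,
        # Stage 4: run-length encode the quality line.
        "quality": run_length_encode(quality),
    }


def allocate(records: list, n: int, m: int) -> list[list]:
    """First-core step (sketch): give N records to each of the M-1 other cores."""
    return [records[i * n:(i + 1) * n] for i in range(m - 1)]
```

In this sketch, each batch returned by `allocate` would be handed to a separate worker (for example via `multiprocessing.Pool`), which calls `primary_compress` on every record in its batch. Run-length encoding suits the quality line because neighboring quality scores are often identical.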
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.