METHOD FOR COMPRESSING GENOMIC DATA (English)

The present invention relates to a method for compressing genomic data. The genomic data are stored in at least one data file containing a plurality of reads produced by a genome sequencing method, where each read includes a mapping position, a CIGAR string, and the actual sequenced nucleotide sequence as a local part of the donor genome. The method comprises the steps of: (a) unwinding the nucleotide sequence of a current read of one of said data files by using the mapping position and the CIGAR string of said current read, where said current read has at least one previous read; (b) computing a difference between the unwound nucleotide sequence of said current read and the unwound nucleotide sequence of at least one of said previous reads, where said difference comprises the differences of the mapping positions and of the nucleotide sequences; (c) passing said computed difference to an entropy coder to compress it; (d) encoding said current read by the compressed difference; and (e) repeating the foregoing steps with said current read as one of said previous reads and a following read as the new current read until no more following reads are available.
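
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Read` type, the `unwind`, `diff` and `encode_reads` helpers, and the textual `offset:length:mismatches` layout are hypothetical names and formats chosen for illustration. The sketch assumes that unwinding means expanding a read into reference coordinates (deletions become `-` placeholders, inserted and soft-clipped bases are dropped, so it is lossy as written), and it uses `zlib` (DEFLATE) only as a stand-in for the entropy coder, applied once to the whole difference stream rather than per read.

```python
import re
import zlib
from typing import NamedTuple


class Read(NamedTuple):
    pos: int    # leftmost mapping position on the reference
    cigar: str  # CIGAR string, e.g. "3M1I4M"
    seq: str    # sequenced nucleotides


CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def unwind(read: Read) -> str:
    """Expand a read into reference coordinates using its CIGAR string.

    Assumption: inserted (I) and soft-clipped (S) bases are dropped here;
    a real codec would have to store them separately to stay lossless.
    """
    out = []
    i = 0  # index into read.seq
    for length, op in CIGAR_OP.findall(read.cigar):
        n = int(length)
        if op in "M=X":    # aligned bases: copy them
            out.append(read.seq[i:i + n])
            i += n
        elif op in "DN":   # deleted/skipped reference bases: gap placeholder
            out.append("-" * n)
        elif op in "IS":   # bases absent from the reference: skip
            i += n
        # H and P consume neither read nor reference
    return "".join(out)


def diff(prev: Read, cur: Read) -> str:
    """Describe cur relative to prev: position offset plus mismatching columns."""
    offset = cur.pos - prev.pos
    prev_u, cur_u = unwind(prev), unwind(cur)
    mismatches = []
    for i, base in enumerate(cur_u):
        j = i + offset  # same reference column in the previous read
        if not (0 <= j < len(prev_u)) or prev_u[j] != base:
            mismatches.append(f"{i}{base}")
    return f"{offset}:{len(cur_u)}:{','.join(mismatches)}"


def encode_reads(reads: list[Read]) -> bytes:
    """Store the first read as-is, every later read as a diff, then compress."""
    lines = []
    prev = None
    for cur in reads:
        if prev is None:
            lines.append(f"{cur.pos}:{unwind(cur)}")
        else:
            lines.append(diff(prev, cur))
        prev = cur  # the current read becomes the previous read
    return zlib.compress("\n".join(lines).encode("ascii"))


r1 = Read(100, "8M", "ACGTACGT")
r2 = Read(102, "3M1I4M", "GTAGCGTA")
print(unwind(r2))
print(diff(r1, r2))
```

Because overlapping reads of the same region mostly agree, the per-read difference is usually short and repetitive, which is what makes the subsequent entropy coding effective.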

  • Title:
    METHOD FOR COMPRESSING GENOMIC DATA
  • Additional title:
    VERFAHREN ZUR KOMPRIMIERUNG VON GENOMDATEN
    PROCÉDÉ DE COMPRESSION DE DONNÉES GÉNOMIQUES
  • Patent number:
    EP3311318
  • Application date:
    2016-06-16
  • Publisher:
    Europäisches Patentamt
  • Year of publication:
    2015
  • Type of media:
    Patent
  • Type of material:
    Electronic Resource
  • Language:
    English
  • Classification:
    IPC: G06F ELECTRIC DIGITAL DATA PROCESSING