Data Analysis

The CNAG-CRG offers researchers state-of-the-art methods for massively parallel sequencing and data analysis for a wide range of applications. The center collaborates on sequencing projects with researchers from the public and private sectors, providing expertise in project design, sequencing, data analysis and interpretation.
The data analysis platform provides robust analysis pipelines for Mendelian disease gene discovery, complex disease gene identification, somatic variant identification, de novo genome assembly, differential gene expression, identification of novel spliced isoforms, cytosine-methylation analysis and epigenetic analysis among others.
The data analysis platform includes more than half of the CNAG-CRG staff and is supported by a computing infrastructure of 7.6 petabytes of data storage and more than 3,400 computing cores, designed and managed by the Barcelona Supercomputing Center (BSC-CNS).

Computing Resources:

- Primary run analysis and quality control
- Alignment of sequencing reads to the reference sequence using the proprietary GEM alignment pipeline
- Variant calling pipeline (SNVs and indels) including SAMtools, GATK, Pindel and SNAPE
- Proprietary variant filtering and prioritization tool
- Identification of copy number variants (CNVs) with Control-FREEC
- Proprietary genome assembly pipeline using shotgun genome sequencing and/or fosmid pool approaches
- Proprietary genome annotation pipeline for coding and non-coding RNAs
- Differential expression analysis pipeline including the proprietary GEM-split-mapper and Flux Capacitor for gene and isoform quantification
- Identification of gene fusions from transcriptome reads
- Cytosine methylation analysis pipeline including the proprietary GEM mapper and bs_call
- Determination of the higher-order chromatin folding of genomic domains and whole genomes using the proprietary TADbit software
- Storage and distribution of data in collaboration with the Barcelona Supercomputing Center (BSC)


Bioinformatics Development Group
Simon Heath, PhD

Bioinformatics Analysis Group
Sergi Beltran, PhD

Genome Biology Group
Marc A. Marti-Renom, PhD

Applied Genomics Group
Ivo G. Gut, PhD

Berta Fusté, PhD
Project Manager
+34 934037289

Access Procedures

To discuss possible collaborations, please contact the CNAG Project Manager, Berta Fusté.

Assurance of quality

- SGS Certification ISO 9001:2008
- ENAC Accreditation ISO/IEC 17025:2005

Capacity External Users