The GATK is used for identifying SNPs and indels in germline DNA and RNAseq data. Its scope is now expanding to include somatic variant calling tools, and to tackle copy number (CNV) and structural variation (SV). In addition to the variant callers themselves, the GATK also includes many utilities to perform related tasks such as processing and quality control of high-throughput sequencing data. These tools were primarily designed to process exomes and whole genomes generated with Illumina sequencing technology, but they can be adapted to handle a variety of other technologies and experimental designs. And although it was originally developed for human genetics, the GATK has since evolved to handle genome data from any organism, with any level of ploidy.

Available versions (Euler, old software stack)

Legacy versions Supported versions New versions
3.4.46, 3.5, 3.7, 3.8

Environment modules (Euler, old software stack)

Version   Module load command                               Additional modules loaded automatically
3.4.46    module load gcc/4.8.2 java/1.8.0_91 gatk/3.4.46
3.5       module load gcc/4.8.2 java/1.8.0_91 gatk/3.5
3.7       module load gcc/4.8.2 java/1.8.0_91 gatk/3.7
3.8       module load gcc/4.8.2 java/1.8.0_91 gatk/3.8

How to submit a job

You can submit a GATK job in batch mode with the following command:
bsub [LSF options] "GATK [GATK options]"
Replace [GATK options] with the GATK command line options and [LSF options] with the LSF parameters that describe the resource requirements of the job. The bsub parameters are documented on the wiki page about the batch system.
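As an illustration, the following sketch assembles one possible submission for a GATK 3.x HaplotypeCaller run. The file names (reference.fasta, sample.bam, sample.vcf) and the resource values (1 core, 4 hours, 4096 MB) are placeholders chosen for this example, not recommendations; adjust them to your data. It assumes the required modules have already been loaded with one of the module load commands listed above. If bsub is not available on the machine, the script only prints the command it would submit.

```shell
#!/bin/bash
# Hypothetical input and output files; replace with your own data.
REF="reference.fasta"
BAM="sample.bam"
VCF="sample.vcf"

# GATK 3.x options: -T selects the tool, -R the reference,
# -I the input BAM file and -o the output file.
GATK_CMD="GATK -T HaplotypeCaller -R $REF -I $BAM -o $VCF"

# Assumed resource requests: -n cores, -W run time (hh:mm),
# -R "rusage[mem=...]" memory request.
if command -v bsub >/dev/null 2>&1; then
    bsub -n 1 -W 4:00 -R "rusage[mem=4096]" "$GATK_CMD"
else
    # Dry run: show the command instead of submitting it.
    echo "bsub -n 1 -W 4:00 -R \"rusage[mem=4096]\" \"$GATK_CMD\""
fi
```

The GATK command is passed to bsub as a single quoted string, matching the general form shown above.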

License information

GATK license