GATK的FilterMutectCalls如何才能成功呢

因为有粉丝求助,他学习前面我分享的GATK的Mutect2流程都快奔溃了,总是各种报错。为了证明我教程没有错,所以我赶紧检查了代码,自己走了一遍,重新写了教程,了:最新最全的mutect2教程,提到了因为GATK的Mutect2流程更新太频繁,导致这个软件出现了一些无法解决的报错。走完了体细胞突变(somatic mutation)检测流程(Mutect2命令),这个时候拿到的文件仍然是需要过滤(走FilterMutectCalls命令)的,但是很多人就卡在了这一步。

比如我运行这个软件的FilterMutectCalls命令,测试了下面的几个情况:

reference=$HOME/biosoft/GATK/resources/bundle/hg38/Homo_sapiens_assembly38.fasta GATK=$HOME/biosoft/GATK/gatk-4.1.8.1/gatk GATK=$HOME/biosoft/GATK/gatk-4.0.3.0/gatkGATK=$HOME/biosoft/GATK/gatk-4.0.2.1/gatkls $reference $GATK ls *_mutect2.vcf |while read iddosample=$(basename "$id" _mutect2.vcf)$GATK FilterMutectCalls -R $reference --java-options -DGATK_STACKTRACE_ON_USER_EXCEPTION=true \ -V $id \ -O ${sample}_filtered.vcf done

如果是gatk-4.1.8.1,就会报错如下:

A USER ERROR has occurred: Mutect stats table _mutect2.vcf.stats not found. When Mutect2 outputs a file calls.vcf it also creates a calls.vcf.stats file. Perhaps this file was not moved along with the vcf, or perhaps it was not delocalized from a virtual machine while running in the cloud.

gatk官方论坛的意思是,在集群运行的过程中,会丢失后缀为.vcf.stats的文件,所以FilterMutectCalls 命令失败。

如果是Gatk-4.0.3.0,就会报错如下:

java.lang.IllegalStateException: Key P_CONTAM found in VariantContext field INFO at chr1:14932 but this key isn't defined in the VCFHeader. We require all VCFs to have complete VCF headers by default.

但是,我记得我以前写这个软件教程的时候,明明没有出现问题啊,所以就去检查了我的脚本,发现居然是 gatk-4.0.2.1 版本。

如果是是 gatk-4.0.2.1 版本

报错就更诡异了,运行到一半后戛然而止。仔细检查了vcf文件停止的地方,发现它对

chr2112391072.GAAAG,GA,GAAchr2131598742.CTC,CTT,CTTT

所以我干脆仅仅是保留SNP吧:

reference=$HOME/biosoft/GATK/resources/bundle/hg38/Homo_sapiens_assembly38.fasta GATK=$HOME/biosoft/GATK/gatk-4.0.2.1/gatkls $reference $GATK ls *_mutect2.vcf |while read iddosample=$(basename "$id" _mutect2.vcf)cat $id | perl -alne '{if(/^#/){print}else{ next if $F[0] =~ "_";print if (length($F[3])+length($F[4])) eq 2 } }' > ${sample}_snp.vcf $GATK FilterMutectCalls -R $reference --java-options -DGATK_STACKTRACE_ON_USER_EXCEPTION=true \ -V ${sample}_snp.vcf \ -O ${sample}_filtered.vcf cat ${sample}_filtered.vcf |perl -alne '{if(/^#/){print}else{next unless $F[6] eq "PASS";next if $F[0] =~/_/;print } }' > ${sample}_pass.vcf done

讽刺的是,居然就看到了成功的log日志

18:10:54.132 INFO ProgressMeter - Starting traversal18:10:54.132 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute18:10:54.504 INFO ProgressMeter - unmapped 0.0 792 128086.318:10:54.504 INFO ProgressMeter - Traversal complete. Processed 792 total variants in 0.0 minutes.18:10:54.657 INFO FilterMutectCalls - Shutting down engine[September 29, 2020 6:10:54 PM CST] org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls done. Elapsed time: 0.04 minutes.Runtime.totalMemory()=357564416Tool returned:SUCCESS
(0)

相关推荐