Merging multiple VCF files into one

ls *vcf > vcfs.list
java -jar GenomeAnalysisTK.jar -T CombineVariants -R $REF --variant vcfs.list -o combined.vcf -genotypeMergeOptions UNIQUIFY



https://www.biostars.org/p/49730/


vcftools

export PERL5LIB=/path/to/your/vcftools-directory/perl/ # path to the directory where you downloaded vcftools

make


How to merge VCFs that are split by chromosome

/mnt/tools/vcftools/vcftools_0.1.13/perl/vcf-concat -f vcfs.list > ESP_merged.vcf

-f takes a file that lists the VCF file names, one per line, in the format below

ESP6500SI-V2-SSA137.GRCh38-liftover.chr10.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr11.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr12.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr13.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr14.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr15.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr16.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr17.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr18.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr19.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr1.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr20.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr21.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr22.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr2.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr3.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr4.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr5.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr6.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr7.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr8.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chr9.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chrX.snps_indels.vcf
ESP6500SI-V2-SSA137.GRCh38-liftover.chrY.snps_indels.vcf
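
A list like the one above can also be generated instead of typed. The sketch below is only an illustration: `build_list` is a hypothetical helper name, and it assumes the files follow the naming pattern shown above and that chromosomes 1-22, X and Y are all present (it does not check that the files exist). It writes the names in chromosome order (chr1, chr2, ..., chrY) rather than the alphabetical order that `ls` produces.

```shell
# build_list PREFIX SUFFIX
# Hypothetical helper: prints one file name per chromosome (1-22, X, Y),
# in chromosome order, built as PREFIX + chromosome + SUFFIX.
build_list() {
    prefix=$1
    suffix=$2
    for c in $(seq 1 22) X Y; do
        printf '%s%s%s\n' "$prefix" "$c" "$suffix"
    done
}

# Assumed naming pattern, taken from the list above
build_list ESP6500SI-V2-SSA137.GRCh38-liftover.chr .snps_indels.vcf > vcfs.list
```

The resulting vcfs.list can be passed to vcf-concat -f as before; since vcf-concat appends files in the order listed, the merged output then starts out in chromosome order.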


http://vcftools.sourceforge.net/perl_module.html#vcf-concat


How to add the chr prefix when it is missing
awk '{if($0 !~ /^#/) print "chr"$0; else print $0}' no_chr.vcf > with_chr.vcf
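
One thing to watch: the one-liner above only changes data lines, so any `##contig=<ID=...>` header lines keep the old names. A minimal sketch that renames both is shown here on a tiny made-up file (the demo file names and the single record are invented for illustration, not taken from the ESP data):

```shell
# Tiny demo VCF: one contig header line and one record (made-up data)
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\t.\tA\tG\t.\t.\t.\n' > no_chr_demo.vcf

awk '
    # header: rename the contig ID so it matches the new chromosome names
    /^##contig=<ID=/ { sub(/^##contig=<ID=/, "##contig=<ID=chr") }
    # data lines (anything not starting with #): prepend chr
    !/^#/            { $0 = "chr" $0 }
    { print }
' no_chr_demo.vcf > with_chr_demo.vcf
```

Other header lines, including the #CHROM column line, pass through unchanged.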


How to remove the chr prefix

awk '{gsub(/^chr/,""); print}' your.vcf > no_chr.vcf


How to sort by chromosome order

/mnt/tools/vcftools/vcftools_0.1.13/perl/vcf-sort -c ESP_merged_chr.vcf > ESP_merg_sorted_chr.vcf
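
If vcf-sort is not at hand, roughly the same ordering can be sketched with plain shell tools. This is an approximation rather than a replacement for vcf-sort: it assumes a `sort` that supports version sort (`-V`, as in GNU coreutils), and the demo file and its records are invented for illustration.

```shell
# Tiny unsorted demo VCF (made-up records)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr10\t5\t.\tA\tG\t.\t.\t.\nchr2\t9\t.\tC\tT\t.\t.\t.\nchr1\t200\t.\tG\tA\t.\t.\t.\nchr1\t30\t.\tT\tC\t.\t.\t.\n' > unsorted_demo.vcf

{
    # keep the header lines in their original order
    grep '^#' unsorted_demo.vcf
    # sort records by chromosome (version sort: chr1, chr2, chr10), then by position
    grep -v '^#' unsorted_demo.vcf | LC_ALL=C sort -t "$(printf '\t')" -k1,1V -k2,2n
} > sorted_demo.vcf
```

The header is split off first because sorting the whole file would mix the `#` lines in with the records.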


