Friday, June 5, 2015

Code to separate vcf files into chromosomes and prepend vcf header

Some scripts need a vcf file split into one file per chromosome, with the original header kept at the top of each piece. In that case, use this bash code:

# separate vcf into chromosomes (one file per chromosome, named after column 1)
awk '!/^#/ { print > ($1) }' file.vcf

#make chromosome list
awk '!/^#/ { a[$1]++ } END { for (b in a) { print b } }' file.vcf | sort > chr_list

#prepend header from vcf to each chromosome file
for i in $(cat chr_list)
do
    echo "$i"
    # "^#" matches header lines only; a bare "#" would also match records containing "#"
    cat <(grep "^#" file.vcf) "$i" > "${i}_header"
done
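
The same result can probably be had in a single pass over the file. The sketch below is only one way to do it, not a tested replacement for the steps above: awk collects the header lines, then writes them to the top of each chromosome's output file the first time that chromosome appears. The file name toy.vcf and its records are made up for illustration, and the "_header" suffix is kept only to match the naming used above. With a very large number of contigs, some awk implementations may run into a limit on open files.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so the toy files do not clutter anything.
cd "$(mktemp -d)"

vcf=toy.vcf   # hypothetical file name; point this at a real vcf instead

# Build a tiny tab-separated vcf (made-up records, for illustration only).
printf '##fileformat=VCFv4.1\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n' >> "$vcf"
printf 'chr2\t200\t.\tC\tT\t50\tPASS\t.\n' >> "$vcf"
printf 'chr1\t300\t.\tG\tA\t50\tPASS\t.\n' >> "$vcf"

# One pass: accumulate header lines, then start each chromosome's
# output file with the header the first time that chromosome is seen.
awk '
  /^#/ { header = header $0 "\n"; next }   # collect header lines
  {
    out = $1 "_header"                     # e.g. chr1_header
    if (!(out in seen)) {                  # first record for this chromosome
      printf "%s", header > out            # write the header once
      seen[out] = 1
    }
    print > out                            # then write the record itself
  }
' "$vcf"

ls
```

Within a single awk run, ">" truncates a file only when it is first opened, so later records for the same chromosome are added after the header rather than overwriting it.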

