If you want to get vcf file with a specific set of SNPs, e.g. lets say that you have a list of SNP IDs in one file and a complete dbSNP vcf set in another file, you can use
awk scripting to quickly obtain a vcf file with your set of SNPs.
File with SNP IDs, named CAD_SNP_positions.bed
chr1 2162944 2162945 rs590367
chr1 2163568 2163569 rs263533
chr1 2164116 2164117 rs263532
chr1 2164699 2164700 rs377599
...
Vcf file with dbSNP variants, named dbSNP-common_all.vcf.gz
...
##INFO=<ID=COMMON,Number=1,Type=Integer,Description="RS is a common SNP. A common SNP is one that has at least one 1000Genomes population with a minor allele of frequency >= 1% and for which 2 or more founders contribute to that minor allele frequency.">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 10177 rs367896724 A AC . . RS=367896724;RSPOS=10177;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020005140026000200;WGT=1;VC=DIV;R5;ASP;VLD;KGPhase3;CAF=0.5747,0.4253;COMMON=1
1 10352 rs145072688 T TA . . RS=145072688;RSPOS=10353;dbSNPBuildID=134;SSR=0;SAO=0;VP=0x050000020005000002000200;WGT=1;VC=DIV;R5;ASP;CAF=0.5625,0.4375;COMMON=1
...
Do the following:
mpjanic@valkyr:~/vcf_subset_selection_script$ grep "#" <(gzip -dc dbSNP-common_all.vcf.gz) > CAD_SNP_common_all.vcf
mpjanic@valkyr:~/vcf_subset_selection_script$ awk 'NR==FNR {h[$4] = $4; next} {if(h[$3]) print$0}' CAD_SNP_positions.bed <(gzip -dc dbSNP-common_all.vcf.gz) >> CAD_SNP_common_all.vcf
Or in case you want to gzip it immediately pipe it to gzip:
mpjanic@valkyr:~/vcf_subset_selection_script$ grep "#" <(gzip -dc dbSNP-common_all.vcf.gz) | gzip > CAD_SNP_common_all.vcf
mpjanic@valkyr:~/vcf_subset_selection_script$ awk 'NR==FNR {h[$4] = $4; next} {if(h[$3]) print$0}' CAD_SNP_positions.bed <(gzip -dc dbSNP-common_all.vcf.gz) | gzip >> CAD_SNP_common_all.vcf