Topics: Life science. Biology. Biotechnology and Biomedical research. Bioinformatics and Computational Biology. Programming for Biologists. Lab protocols and methods. Paper reviews. Data science. Programming in R, C, C++, Perl, Python, Excel, basic and advanced Unix, shell scripting, awk scripting, vim editing, regular expressions. Custom script development.
If you want to get vcf file with a specific set of SNPs, e.g. lets say that you have a list of SNP IDs in one file and a complete dbSNP vcf set in another file, you can use awk scripting to quickly obtain a vcf file with your set of SNPs.
File with SNP IDs, named CAD_SNP_positions.bed
chr1 2162944 2162945 rs590367
chr1 2163568 2163569 rs263533
chr1 2164116 2164117 rs263532
chr1 2164699 2164700 rs377599
Vcf file with dbSNP variants, named dbSNP-common_all.vcf.gz
##INFO=<ID=COMMON,Number=1,Type=Integer,Description="RS is a common SNP. A common SNP is one that has at least one 1000Genomes population with a minor allele of frequency >= 1% and for which 2 or more founders contribute to that minor allele frequency.">
#CHROM POS ID REF ALT QUAL FILTER INFO
1 10177 rs367896724 A AC . . RS=367896724;RSPOS=10177;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020005140026000200;WGT=1;VC=DIV;R5;ASP;VLD;KGPhase3;CAF=0.5747,0.4253;COMMON=1
1 10352 rs145072688 T TA . . RS=145072688;RSPOS=10353;dbSNPBuildID=134;SSR=0;SAO=0;VP=0x050000020005000002000200;WGT=1;VC=DIV;R5;ASP;CAF=0.5625,0.4375;COMMON=1