Tuesday, April 2, 2019

Parsing GWAS Catalog

Download GWAS Catalog:

wget https://www.ebi.ac.uk/gwas/api/search/downloads/full


head full
DATE ADDED TO CATALOG PUBMEDID FIRST AUTHOR DATE JOURNAL LINK STUDY DISEASE/TRAIT INITIAL SAMPLE SIZE REPLICATION SAMPLE SIZE REGION CHR_ID CHR_POS REPORTED GENE(S) MAPPED_GENE UPSTREAM_GENE_ID DOWNSTREAM_GENE_ID SNP_GENE_IDS UPSTREAM_GENE_DISTANCE DOWNSTREAM_GENE_DISTANCE STRONGEST SNP-RISK ALLELE SNPS MERGED SNP_ID_CURRENT CONTEXT INTERGENIC RISK ALLELE FREQUENCY P-VALUE PVALUE_MLOG P-VALUE (TEXT) OR or BETA 95% CI (TEXT) PLATFORM [SNPS PASSING QC] CNV
2017-09-05 28749367 Gondalia R 2017-06-08 Environ Health Perspect www.ncbi.nlm.nih.gov/pubmed/28749367 Genome-wide Association Study of Susceptibility to Particulate Matter-Associated QT Prolongation. QT interval (ambient particulate matter interaction) 14,889 European ancestry individuals, 5,707 African American individuals, 1,562 Hispanic individuals NA 7p12.3 7 48771910 ABCA13, CDC14C AC091770.2 - AC004899.1 ENSG00000285536 ENSG00000225705  60328 74516 rs13309098-G rs13309098 013309098 intergenic_variant 1 0.93 2E-6 5.698970004336019  2.37 [1.39-3.35] unit increase Affymetrix, Illumina [~ 2500000] (imputed) N
2017-09-05 28749367 Gondalia R 2017-06-08 Environ Health Perspect www.ncbi.nlm.nih.gov/pubmed/28749367 Genome-wide Association Study of Susceptibility to Particulate Matter-Associated QT Prolongation. QT interval (ambient particulate matter interaction) 14,889 European ancestry individuals, 5,707 African American individuals, 1,562 Hispanic individuals NA 2q34 2 212200740 ERBB4 ERBB4   ENSG00000178568   rs6725041-T rs6725041 0 6725041 intron_variant 00.48 3E-6 5.522878745280337  1.52 [0.89-2.15] unit increase Affymetrix, Illumina [~ 2500000] (imputed) N
2017-09-05 28749367 Gondalia R 2017-06-08 Environ Health Perspect www.ncbi.nlm.nih.gov/pubmed/28749367 Genome-wide Association Study of Susceptibility to Particulate Matter-Associated QT Prolongation. QT interval (ambient particulate matter interaction) 14,889 European ancestry individuals, 5,707 African American individuals, 1,562 Hispanic individuals NA 20q12 20 40807060 MAFB AL035665.1 - RNA5SP484 ENSG00000229771 ENSG00000238908  108444 47059 rs7361259-T rs7361259 0 7361259 regulatory_region_variant 1 NR 5E-6 5.301029995663981  5.98 [3.26-8.7] unit increase Affymetrix, Illumina [~ 2500000] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 1p36.22 1 9653328 C1orf200, PIK3CD PIK3CD, PIK3CD-AS1   ENSG00000171608, ENSG00000179840   rs4240895-T rs4240895 0 4240895 non_coding_transcript_exon_variant 0 0.39 6E-13 12.221848749616356  1.14 [1.09–1.19] Illumina [at least 371504] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 1q22 1 156199819 CLCN6, SLC25A44 SLC25A44   ENSG00000160785   rs2072499-G rs20724992072499 non_coding_transcript_exon_variant 0 0.36 2E-10 9.698970004336019  1.18 [1.13–1.23] Illumina [at least 371504] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 1q24.1 1 165904155 UCK2 UCK2   ENSG00000143179   rs3790672-C rs3790672 0 3790672 non_coding_transcript_exon_variant 0 0.29 5E-11 10.301029995663981  1.2 [1.14–1.25] Illumina [at least 371504] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 2p13.2 2 71345325 ZNF638 ZNF638   ENSG00000075292   rs7581030-T rs7581030 0 7581030 intron_variant 0 0.24 2E-11 10.698970004336019  1.17 [1.12–1.23] Illumina [at least 371504] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 3p24.3 3 16583541 DAZL LINC00690 - DAZL ENSG00000233570 ENSG00000092345  42168 3251 rs10510452-A rs10510452 0 10510452 intergenic_variant 1 0.7 1E-9 9.0  1.18 [1.13–1.23] Illumina [at least 371504] (imputed) N
2017-08-31 28604728 Litchfield K 2017-06-12 Nat Genet www.ncbi.nlm.nih.gov/pubmed/28604728 Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Testicular germ cell tumor 5,518 European ancestry cases, 19,055 European ancestry controls 1,801 European ancestry cases, 4,027 European ancestry controls 3q23 3 142100008 TFDP2 TFDP2   ENSG00000114126   rs11705932-C rs11705932 0 11705932 intron_variant 0 0.8 5E-7 6.301029995663981  1.17 [1.11–1.23] Illumina [at least 371504] (imputed) N
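Grab the columns you need and make a BED file, replacing spaces with _ and keeping only alphanumeric characters, _, and tab (a literal tab inside the sed character class, typed in the terminal as CTRL-v TAB, or written as \t in GNU sed):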

awk -F"\t" '{if ($12!="") print $12"\t"$13"\t"$15"\t"$8}' full |
awk -F"\t" '{print $1"\t"$2"\t"$2+1"\t"$3"\t"$4}'| 
sed -E 's/ /_/g' | sed -E 's/[^a-zA-Z0-9_\t]//g' |
awk '{print "chr"$1"\t"$2"\t"$3"\t"$5"\t"$4"\n"}'|
tail -n+2 > full.mod
head full.mod

chr7 48771910 48771911 QT_interval_ambient_particulate_matter_interaction AC0917702__AC0048991
chr2 212200740 212200741 QT_interval_ambient_particulate_matter_interaction ERBB4
chr20 40807060 40807061 QT_interval_ambient_particulate_matter_interaction AL0356651__RNA5SP484
chr1 9653328 9653329 Testicular_germ_cell_tumor PIK3CD_PIK3CDAS1
chr1 156199819 156199820 Testicular_germ_cell_tumor SLC25A44
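
For comparison, here is a minimal Python sketch of the same column selection and clean-up. The column positions (8, 12, 13, 15) are taken from the awk command above; the function names clean and row_to_bed are made up for this illustration, and unlike the shell pipeline the sketch skips any row whose position is not a plain integer (the header, unmapped rows, and SNP x SNP rows).

```python
import re

# 1-based column positions, as used in the awk command above:
# 8 = DISEASE/TRAIT, 12 = CHR_ID, 13 = CHR_POS, 15 = MAPPED_GENE
TRAIT, CHR_ID, CHR_POS, MAPPED_GENE = 8, 12, 13, 15


def clean(text):
    """Replace spaces with _ and drop everything except letters, digits and _."""
    return re.sub(r"[^A-Za-z0-9_]", "", text.replace(" ", "_"))


def row_to_bed(line):
    """Turn one tab-separated catalog row into a BED-like line, or None to skip it."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < MAPPED_GENE:
        return None
    chrom, pos = fields[CHR_ID - 1], fields[CHR_POS - 1]
    if not pos.isdigit():
        # assumption of this sketch: skip header, unmapped and "a x b" rows
        return None
    start = int(pos)
    return "\t".join([
        "chr" + chrom,
        str(start),
        str(start + 1),
        clean(fields[TRAIT - 1]),
        clean(fields[MAPPED_GENE - 1]),
    ])
```

It could be applied line by line, for example with `for line in open("full")`, printing every result that is not None.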
Some rows come from SNP-SNP interaction studies and contain two SNPs. The conversion mangles these rows, because they hold two variants separated by x and two genomic locations separated by x:

grep '[0-9]_x_[0-9]' full.mod
chr5_x_5 36423829_x_36425491 36423830 Hypertension_SNP_x_SNP_interaction RANBP3L__RNA5SP181_x_RANBP3L__RNA5SP181
chr6_x_6 75846902_x_75850781 75846903 Hypertension_SNP_x_SNP_interaction MYO6_x_MYO6
chr8_x_8 119341027_x_119341744 119341028 Hypertension_SNP_x_SNP_interaction MIR548AZ__RF00421_x_MIR548AZ__RF00421
chr8_x_8 134554324_x_134554803 134554325 Hypertension_SNP_x_SNP_interaction ZFAT_x_ZFAT
chr12_x_12 45531972_x_45537190 45531973 Hypertension_SNP_x_SNP_interaction AC0799501_x_AC0799501
chr20_x_20 15618886_x_15625776 15618887 Hypertension_SNP_x_SNP_interaction MACROD2_x_MACROD2
chr5_x_5 83137962_x_83140542 83137963 Hypertension_SNP_x_SNP_interaction XRCC4_x_XRCC4
chr2_x_2 83065441_x_83066256 83065442 Hypertension_SNP_x_SNP_interaction DHFRP3__AC1386231_x_DHFRP3__AC1386231
chr1_x_1 146039391_x_146018957 146039392 Coronary_heart_disease_SNP_X_SNP_interaction HJV__RNVU16_x_HJV
chr3_x_3 162443828_x_162449608 162443829 Coronary_heart_disease_SNP_X_SNP_interaction AC1312111__TOMM22P6_x_AC1312111__TOMM22P6
chr4_x_4 5366258_x_5366497 5366259 Coronary_heart_disease_SNP_X_SNP_interaction STK32B_x_STK32B
First, separate the rows with two variants from the rest:

awk -F"\t" '{if ($12!="") print $12"\t"$13"\t"$15"\t"$8}' full | 
grep -v " x "| 
awk -F"\t" '{print $1"\t"$2"\t"$2+1"\t"$3"\t"$4}'| 
sed -E 's/ /_/g' | sed -E 's/[^a-zA-Z0-9_\t]//g' |
awk '{print "chr"$1"\t"$2"\t"$3"\t"$5"\t"$4"\n"}'|
tail -n+2 > full.mod.nox

awk -F"\t" '{if ($12!="") print $12"\t"$13"\t"$15"\t"$8}' full | 
grep " x "| 
awk -F"\t" '{print $1"\t"$2"\t"$2+1"\t"$3"\t"$4}'| 
sed -E 's/ /_/g' | sed -E 's/[^a-zA-Z0-9_\t]//g' |
awk '{print "chr"$1"\t"$2"\t"$3"\t"$5"\t"$4"\n"}'|
tail -n+2 > full.mod.x



head full.mod.x

chr5_x_5 36423829_x_36425491 36423830 Hypertension_SNP_x_SNP_interaction RANBP3L__RNA5SP181_x_RANBP3L__RNA5SP181
chr6_x_6 75846902_x_75850781 75846903 Hypertension_SNP_x_SNP_interaction MYO6_x_MYO6
chr8_x_8 119341027_x_119341744 119341028 Hypertension_SNP_x_SNP_interaction MIR548AZ__RF00421_x_MIR548AZ__RF00421
chr8_x_8 134554324_x_134554803 134554325 Hypertension_SNP_x_SNP_interaction ZFAT_x_ZFAT
chr12_x_12 45531972_x_45537190 45531973 Hypertension_SNP_x_SNP_interaction AC0799501_x_AC0799501
Split _x_ only in the first two columns, so that other columns containing _x_ are left unchanged, by chaining three sed substitutions. First change every _x_ from the 3rd occurrence onward to \n (which can never already be present within a line), then change the remaining _x_ (only the first two occurrences) to a space, then change \n back to _x_. Finally, use awk to take each SNP and its location and convert it to BED format. The first command below extracts the first SNP of each pair, and the second command extracts the second SNP.

grep '[0-9XY]_x_[0-9]' full.mod.x| 
sed 's/_x_/\n/g3; s/_x_/ /g; s/\n/_x_/g' | 
awk '{print $1"\t"$3"\t"$5"\t"$6"\t"$7}'| head

chr5 36423829 36423830 Hypertension_SNP_x_SNP_interaction RANBP3L__RNA5SP181_x_RANBP3L__RNA5SP181
chr6 75846902 75846903 Hypertension_SNP_x_SNP_interaction MYO6_x_MYO6
chr8 119341027 119341028 Hypertension_SNP_x_SNP_interaction MIR548AZ__RF00421_x_MIR548AZ__RF00421
chr8 134554324 134554325 Hypertension_SNP_x_SNP_interaction ZFAT_x_ZFAT
chr12 45531972 45531973 Hypertension_SNP_x_SNP_interaction AC0799501_x_AC0799501
chr20 15618886 15618887 Hypertension_SNP_x_SNP_interaction MACROD2_x_MACROD2
chr5 83137962 83137963 Hypertension_SNP_x_SNP_interaction XRCC4_x_XRCC4
chr2 83065441 83065442 Hypertension_SNP_x_SNP_interaction DHFRP3__AC1386231_x_DHFRP3__AC1386231
chr1 146039391 146039392 Coronary_heart_disease_SNP_X_SNP_interaction HJV__RNVU16_x_HJV
chr3 162443828 162443829 Coronary_heart_disease_SNP_X_SNP_interaction AC1312111__TOMM22P6_x_AC1312111__TOMM22P6



grep '[0-9XY]_x_[0-9]' full.mod.x| 
sed 's/_x_/\n/g3; s/_x_/ /g; s/\n/_x_/g' | 
awk '{print "chr"$2"\t"$4"\t"$4+1"\t"$6"\t"$7}'| head

chr5 36425491 36425492 Hypertension_SNP_x_SNP_interaction RANBP3L__RNA5SP181_x_RANBP3L__RNA5SP181
chr6 75850781 75850782 Hypertension_SNP_x_SNP_interaction MYO6_x_MYO6
chr8 119341744 119341745 Hypertension_SNP_x_SNP_interaction MIR548AZ__RF00421_x_MIR548AZ__RF00421
chr8 134554803 134554804 Hypertension_SNP_x_SNP_interaction ZFAT_x_ZFAT
chr12 45537190 45537191 Hypertension_SNP_x_SNP_interaction AC0799501_x_AC0799501
chr20 15625776 15625777 Hypertension_SNP_x_SNP_interaction MACROD2_x_MACROD2
chr5 83140542 83140543 Hypertension_SNP_x_SNP_interaction XRCC4_x_XRCC4
chr2 83066256 83066257 Hypertension_SNP_x_SNP_interaction DHFRP3__AC1386231_x_DHFRP3__AC1386231
chr1 146018957 146018958 Coronary_heart_disease_SNP_X_SNP_interaction HJV__RNVU16_x_HJV
chr3 162449608 162449609 Coronary_heart_disease_SNP_X_SNP_interaction AC1312111__TOMM22P6_x_AC1312111__TOMM22P6
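
The same split can be sketched in Python. Instead of the sed placeholder trick, this version splits the line into fields first and then splits _x_ only inside the first two fields, so the trait and gene columns are never touched. The function name split_interaction is made up for this sketch, and it assumes every line has exactly five whitespace-separated fields, as in the full.mod.x lines shown above.

```python
def split_interaction(line):
    """Split one 'chrA_x_B  posA_x_posB  end  trait  gene' line into two BED-like lines."""
    chrom_pair, pos_pair, _end, trait, gene = line.split()
    chrom_a, chrom_b = chrom_pair.split("_x_")
    pos_a, pos_b = pos_pair.split("_x_")
    records = []
    for chrom, pos in ((chrom_a, pos_a), (chrom_b, pos_b)):
        # only the first chromosome of the pair carries the chr prefix
        if not chrom.startswith("chr"):
            chrom = "chr" + chrom
        records.append("\t".join([chrom, pos, str(int(pos) + 1), trait, gene]))
    return records
```

Each input line yields two records, one per SNP, which matches the two awk commands above taken together.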

Thursday, March 14, 2019

Python script to check if DNA sequence is palindrome or reverse palindrome

Check whether a DNA sequence is a palindrome (equal to its reverse) or a reverse palindrome (equal to its reverse complement):

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Mar 12 04:18:04 2019

@author: milospjanic
"""
# Program to check if a DNA sequence
#  is a palindrome or a reverse palindrome
def reverse(Pattern):
    # collect the characters from last to first
    rev = []
    x = len(Pattern)
    for i in Pattern:
        x = x - 1
        rev.append(Pattern[x])
    return ''.join(rev)


def complement(Nucleotide):
    # swap a<->t and g<->c; any other character is dropped
    comp = []
    for i in Nucleotide:
        if i == "t":
            comp.append("a")
        elif i == "a":
            comp.append("t")
        elif i == "g":
            comp.append("c")
        elif i == "c":
            comp.append("g")

    return ''.join(comp)


my_str = input("Enter a sequence: ")
# change this value for a different output
#my_str = 'aIbohPhoBiA'

# make it suitable for caseless comparison
my_str = my_str.casefold()

# complement, reverse, and reverse complement of the sequence
comp = complement(my_str)

rev = reverse(my_str)

revcomp = complement(reverse(my_str))

print("Complement:", comp)

print("Reversed:", rev)

print("Reversed Complement:", revcomp)


# check if the sequence is equal to its reverse
# or to its reverse complement

if my_str == rev:
   print(my_str, "is palindrome")
elif my_str == revcomp:
   print(my_str, "is reverse palindrome")
else:
   print(my_str, "is not a palindrome")
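
A more compact variant of the same check, sketched with str.translate and slicing. The names COMPLEMENT, reverse_complement and classify are made up for this sketch. One difference from the script above: characters other than a, c, g, t are kept unchanged here rather than dropped.

```python
# translation table: a<->t, c<->g
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, in lower case."""
    return seq.casefold().translate(COMPLEMENT)[::-1]


def classify(seq):
    """Label a sequence the same way the script above does."""
    seq = seq.casefold()
    if seq == seq[::-1]:
        return "palindrome"
    if seq == reverse_complement(seq):
        return "reverse palindrome"
    return "not a palindrome"
```

For example, classify("GAATTC") (the EcoRI recognition site) returns "reverse palindrome".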
   

Python script for Fibonacci sequence using while loop or recursion

Python script for Fibonacci sequence using while loop or recursion.

Using recursive function:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Mar 12 05:32:45 2019

@author: milospjanic
"""

# Python program to display the Fibonacci sequence up to n-th term using recursive functions

def recur_fibo(n):
   """Recursive function to
   return the n-th Fibonacci number"""
   if n <= 1:
       return n
   else:
       return(recur_fibo(n-1) + recur_fibo(n-2))

# Change this value for a different result
#nterms = 10

# uncomment to take input from the user
nterms = int(input("How many terms? "))

# check if the number of terms is valid
if nterms <= 0:
   print("Please enter a positive integer")
else:
   print("Fibonacci sequence:")
   for i in range(nterms):
       print(recur_fibo(i), end=" , ")
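
The plain recursive version recomputes the same terms many times, so it becomes slow for larger n. A minimal sketch of one way around this, assuming the standard library decorator functools.lru_cache is acceptable; the names fibo and first_terms are made up for this sketch.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fibo(n):
    """Return the n-th Fibonacci number; each value is computed only once."""
    if n <= 1:
        return n
    return fibo(n - 1) + fibo(n - 2)


def first_terms(nterms):
    """Return the first nterms Fibonacci numbers as a list."""
    return [fibo(i) for i in range(nterms)]
```

Calling first_terms(10) gives [0, 1, 1, 2, 3, 5, 8, 13, 21, 34].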


Using while loop:


#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Mar 12 05:22:28 2019

@author: milospjanic
"""

# Program to display the Fibonacci sequence up to n-th term where n is provided by the user

# change this value for a different result
#nterms = 10

# uncomment to take input from the user
nterms = int(input("How many terms? "))

# first two terms
n1 = 0
n2 = 1
count = 0

# check if the number of terms is valid
if nterms <= 0:
   print("Please enter a positive integer")
elif nterms == 1:
   print("Fibonacci sequence up to",nterms,":")
   print(n1)
else:
   print("Fibonacci sequence up to",nterms,":")
   while count < nterms:
       print(n1,end=" , ")
       n3 = n1 + n2
       # update values
       n1 = n2
       n2 = n3
       count += 1

Tuesday, July 17, 2018

Plot and compare two scatterplots on the same graph using R and ggplot2

If you have two sets of data located in two data frames in R, and you want to plot their correlation in a scatterplot, use the following ggplot2 code:

library(ggplot2)

ggplot(WT, aes(V1, EF, color=EF)) +
  geom_point(shape = 16, size = 5, show.legend = FALSE, alpha = .4) +
  theme_minimal() +
  scale_color_gradient(low = "#0091ff", high = "#f0650e") +
  theme(axis.text=element_text(size=24), axis.title=element_text(size=26)) +
  labs(title = "GENE WT", x="", y="Percent") +
  theme(plot.title = element_text(size = rel(2))) +
  geom_smooth(method=lm) +
  annotate(x=10, y=85, label=paste("R = ", round(cor(WT$V1, WT$EF), 2), ", p=0.008869"), geom="text", size=8, col="darkred")




ggplot(KO, aes(V1, EF, color=EF)) +
  geom_point(shape = 16, size = 5, show.legend = FALSE, alpha = .4) +
  theme_minimal() +
  scale_color_gradient(low = "#0091ff", high = "#f0650e") +
  theme(axis.text=element_text(size=24), axis.title=element_text(size=26)) +
  labs(title = "GENE KO", x="Day", y="Percent") +
  theme(plot.title = element_text(size = rel(2))) +
  geom_smooth(method=lm) +
  annotate(x=10, y=85, label=paste("R = ", round(cor(KO$V1, KO$EF), 2), ", p=1.655e-10"), geom="text", size=8, col="darkred")







However, plotting both data sets on the same graph gives a better visual comparison:

ggplot() +
  theme_minimal() +
  theme(axis.text=element_text(size=24), axis.title=element_text(size=26)) +
  labs(title = "GENE KO vs. WT", x="Day", y="Percent") +
  theme(plot.title = element_text(size = rel(2))) +
  geom_point(data = WT, aes(x = V1, y = EF), color = "red", shape = 16, size = 5, show.legend = FALSE, alpha = .4) +
  geom_point(data = KO, aes(x = V1, y = EF), color = "blue", shape = 16, size = 5, show.legend = FALSE, alpha = .4) +
  geom_smooth(data = WT, aes(x = V1, y = EF), method=lm, color="red", fill="red") +
  geom_smooth(data = KO, aes(x = V1, y = EF), method=lm, color="blue", fill="blue") +
  annotate(x=10, y=85, label=paste("R = ", round(cor(WT$V1, WT$EF), 2), ", p=0.008869"), geom="text", size=8, col="red") +
  annotate(x=7, y=47, label=paste("R = ", round(cor(KO$V1, KO$EF), 2), ", p=1.655e-10"), geom="text", size=8, col="blue")



Saturday, April 7, 2018

A bit of SED and regular expression magic to clear up your tables


Suppose you want to keep only the gene ID (2nd column) and the fold change (6th column) from a comma-separated table, and remove the quotation marks. This can be done with sed and regular expressions.



mpjanic@zoran:~/test$ cat SLC2A.txt

"6033","SLC2A1-AS1",24.8051071979286,27.7198330991446,21.8903812967126,0.789701049729193,-0.340621486785587,0.565579800713107,1
"15656","SLC2A1",21989.7607363,25370.4928449324,18609.0286276675,0.733491018144907,-0.447148795187931,0.0135793982312018,0.274890758345199
"15657","SLC2A10",116.660701272146,109.114399777756,124.207002766537,1.13831907630452,0.186905008681256,0.536100348167181,1
"15658","SLC2A11",153.584023702827,160.671774940703,146.496272464952,0.911773536571793,-0.133252558036828,0.610646274133104,1
"15659","SLC2A12",0,0,0,NA,NA,NA,NA
"15660","SLC2A13",91.7915789084436,88.0137320553238,95.5694257615633,1.08584675970211,0.118820516938777,0.718169926714343,1
"15661","SLC2A14",26.1757135597822,34.6019488604179,17.7494782591466,0.512961808328973,-0.963076678383474,0.0836580834259747,0.696210238779407
"15662","SLC2A2",0.946102725140454,0.450180696019295,1.44202475426161,3.20321321418857,1.67951983083656,0.80556985989175,1
"15663","SLC2A3",1801.59770325241,1801.22866521645,1801.96674128837,1.00040976256161,0.000591041330547107,0.91557362726645,1
"15664","SLC2A4",64.9152045965179,47.8981046043178,81.932304588718,1.71055421222935,0.774463827856209,0.0320174623595299,0.440346477750378
"15665","SLC2A4RG",2839.51257874418,2552.43962703032,3126.58553045805,1.22494005239048,0.292711146586849,0.106250785209328,0.755271734116694
"15666","SLC2A5",0.488574210205611,0.450180696019295,0.526967724391927,1.17056934926712,0.227210408076608,1,1
"15667","SLC2A6",849.67804667658,741.433588192089,957.922505161072,1.29198692966806,0.369591475141396,0.0690179423560097,0.640745754210812
"15668","SLC2A7",0,0,0,NA,NA,NA,NA
"15669","SLC2A8",306.652668694167,317.104693305773,296.20064408256,0.934078398508418,-0.0983844524549534,0.643201071003747,1
"15670","SLC2A9",312.030609161521,297.26095251567,326.800265807372,1.09937165659235,0.136679190181628,0.580291149287337,1


mpjanic@zoran:~/test$ sed -E "s/\"[0-9]*\",\"//g" SLC2A.txt | sed -E "s/\",[0-9.]*,[0-9.]*,[0-9.]*,/ /g" | sed -E "s/,.*//g"
SLC2A1-AS1 0.789701049729193
SLC2A1 0.733491018144907
SLC2A10 1.13831907630452
SLC2A11 0.911773536571793
SLC2A12 NA
SLC2A13 1.08584675970211
SLC2A14 0.512961808328973
SLC2A2 3.20321321418857
SLC2A3 1.00040976256161
SLC2A4 1.71055421222935
SLC2A4RG 1.22494005239048
SLC2A5 1.17056934926712
SLC2A6 1.29198692966806
SLC2A7 NA
SLC2A8 0.934078398508418
SLC2A9 1.09937165659235
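
As an alternative to the regular expressions, a minimal Python sketch using the standard csv module, which strips the quotation marks while parsing. The function name gene_and_fold_change is made up for this sketch; the column positions (2nd and 6th) are the ones named above.

```python
import csv


def gene_and_fold_change(lines):
    """Return (gene ID, fold change) pairs from quoted, comma-separated lines."""
    reader = csv.reader(lines)
    # row[1] is the 2nd column (gene ID), row[5] is the 6th column (fold change)
    return [(row[1], row[5]) for row in reader if len(row) >= 6]
```

It accepts any iterable of lines, so an open file handle for SLC2A.txt can be passed in directly. The fold change is returned as text, so NA values pass through unchanged.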

Tuesday, March 6, 2018

Code for parsing vcf files

Code to clean up vcf files so that only the genotype calls remain.

For example, in this vcf file each sample entry has three colon-separated parameters, and we need to keep only the first one, which holds the phased genotype.

cat rs
0|0:0:1,0,0 0|1:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 1|1:2:0,0,1 1|1:2:0,0,1 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|1:2:0,0,1 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 1|1:2:0,0,1 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0
0|0:0:1,0,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 1|1:2:0,0,1 1|1:2:0,0,1 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|1:2:0,0,1 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0
0|0:0:1,0,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 1|1:2:0,0,1 1|1:2:0,0,1 0|0:0:1,0,0 0|0:0:1,0,0 1|1:2:0,0,1 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 1|1:2:0,0,1 0|1:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 1|0:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|0:0:1,0,0 0|1:1:0,1,0 1|0:1:0,1,0 0|1:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0 1|0:1:0,1,0
1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|0:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 0|1:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 0|1:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|0:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 0|1:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 1|1:1.99:0,0.01,0.99 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|0:1:0,1,0 1|1:2:0,0,1 1|0:1:0,1,0 0|1:1:0,1,0 1|1:2:0,0,1 0|1:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 0|1:1:0,1,0 0|1:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 1|0:1.4:0,0.6,0.4 1|1:2:0,0,1 0|1:1:0,1,0 1|0:1:0,1,0 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|1:2:0,0,1 1|0:1:0,1,0

With sed, we can remove each : together with the numbers that follow it, matching the numbers with the character class [0-9] repeated any number of times (*). However, this does not clear out the decimal numbers present in some columns:

sed -E 's/:[0-9]*:[0-9]*,[0-9]*,[0-9]*//g' rs
0|0 0|1 0|0 1|0 0|1 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 0|0 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|0 0|0 0|1 1|0 1|1 0|0 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 0|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 1|1 1|0 0|1 1|0 1|0 0|0
0|0 0|1 0|0 0|0 0|1 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 0|0 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|0 0|0 0|1 1|0 1|1 0|0 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 1|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 0|1 1|0 0|1 1|0 1|0 0|0
0|0 0|1 0|0 0|0 0|0 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 1|1 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|1 0|0 0|1 1|0 1|1 0|1 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 1|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 0|1 1|0 0|1 1|0 1|0 1|0
1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1 1|1 0|1 1|1 1|1 0|1 1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|1 0|1 1|1 1|1 1|1:1.99:0,0.01,0.99 1|1 1|1 1|1 1|0 1|1 1|0 0|1 1|1 0|1 1|1 1|1 0|1 0|1 1|1 1|1 1|0:1.4:0,0.6,0.4 1|1 0|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|0

To match decimal numbers as well, add the decimal point to the character class, [0-9.], again with any number of repetitions (*):

sed -E 's/:[0-9.]*:[0-9.]*,[0-9.]*,[0-9.]*//g' rs
0|0 0|1 0|0 1|0 0|1 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 0|0 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|0 0|0 0|1 1|0 1|1 0|0 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 0|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 1|1 1|0 0|1 1|0 1|0 0|0
0|0 0|1 0|0 0|0 0|1 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 0|0 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|0 0|0 0|1 1|0 1|1 0|0 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 1|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 0|1 1|0 0|1 1|0 1|0 0|0
0|0 0|1 0|0 0|0 0|0 0|0 1|0 0|1 0|0 0|0 0|0 1|1 1|1 0|0 0|0 1|1 0|0 0|1 1|0 1|0 0|0 0|0 1|0 0|1 0|0 0|1 1|0 1|1 0|1 0|0 1|0 0|0 1|0 0|1 1|0 0|0 1|0 0|0 0|0 1|0 1|0 0|1 1|0 0|1 0|0 0|0 0|1 1|0 0|0 0|0 0|0 0|0 0|1 1|0 0|1 1|0 1|0 1|0
1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1 1|1 0|1 1|1 1|1 0|1 1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|1 0|1 1|1 1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|0 0|1 1|1 0|1 1|1 1|1 0|1 0|1 1|1 1|1 1|0 1|1 0|1 1|0 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|0
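
The same result can be sketched in Python without matching the numbers at all, by keeping only the text before the first colon of each sample entry. This assumes, as in the rs file above, that the genotype is always the first colon-separated parameter; the function name genotypes_only is made up for this sketch.

```python
def genotypes_only(line):
    """Keep only the part before the first ':' in every whitespace-separated entry."""
    return " ".join(sample.split(":", 1)[0] for sample in line.split())
```

Because nothing after the colon is inspected, decimal values such as 1.99 or 0.01 need no special handling.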