Wednesday, February 1, 2017

Code for plotting up- and down-regulated gene FC and p-val distributions in R

In differential expression analysis, use the DE output table:

> head(res)
          id baseMean baseMeanA baseMeanB foldChange log2FoldChange      pval
1 SEC24B-AS1 3.837647 3.0343409  4.640954   1.529477      0.6130381 0.4469177
2       A1BG 1.769987 0.7314147  2.808559   3.839900      1.9410687 0.3359994
3       A1CF 0.000000 0.0000000  0.000000        NaN            NaN        NA
4      GGACT 5.722972 3.5491983  7.896745   2.224938      1.1537650 0.1943902
5        A2M 0.000000 0.0000000  0.000000        NaN            NaN        NA
6      A2ML1 0.000000 0.0000000  0.000000        NaN            NaN        NA
  padj
1    1
2    1
3   NA
4    1
5   NA
6   NA
Subset and create two data frames with up and down-regulated genes. Use na.omit to remove row with NA.

> down<-na.omit(res[res$log2FoldChange<(-0.5)&res$padj<0.05,])
> head (down)
            id     baseMean   baseMeanA    baseMeanB foldChange log2FoldChange
139        ACE    81.090090   128.42927 3.375091e+01 0.26279761     -1.9279759
266   ADAMTS15   109.686212   142.19707 7.717536e+01 0.54273520     -0.8816796
315      ADH1B   224.356545   348.75853 9.995456e+01 0.28660106     -1.8028842
421     AHNAK2   493.239479   623.24438 3.632346e+02 0.58281245     -0.7788964
435       AIM1   311.990989   409.72136 2.142606e+02 0.52294227     -0.9352764
              pval         padj
139   3.009363e-13 6.979143e-10
266   1.234959e-04 1.774189e-02
315   9.575011e-18 5.181357e-14
421   2.785485e-05 5.652445e-03
435   1.215970e-05 2.860877e-03

> up<-na.omit(res[res$log2FoldChange>0.5&res$padj<0.05,])
> head(up)
            id    baseMean    baseMeanA   baseMeanB foldChange log2FoldChange
198      ACTG2    28.00957    13.571106    42.44803   3.127824      1.6451595
206      ACTN1  7221.10709  5591.227639  8850.98654   1.583013      0.6626732
783      APLP1    56.91524    36.938228    76.89226   2.081645      1.0577237
1416      BEX1   134.33394    83.364379   185.30350   2.222814      1.1523872
2832      CCL2  3879.93627  2577.263209  5182.60934   2.010896      1.0078388
              pval         padj
198   2.693990e-05 5.535979e-03
206   2.481289e-04 3.196924e-02
783   1.122944e-04 1.672465e-02
1416  1.364952e-07 5.831217e-05
2832  3.065576e-08 1.605373e-05
Next, concatenate log2FoldChange from up and down dataframes to the dat$dens, and concatenate and repeat DOWN and UP elements exactly length(down$log2FoldChange) and length(up$log2FoldChange) times to the dat$lines column of the dat data frame.

> dat <- data.frame(dens = c(down$log2FoldChange, up$log2FoldChange), lines = rep(c("DOWN", "UP"), c(length(down$log2FoldChange), length(up$log2FoldChange))))

> head (dat)
          dens lines
1   -1.9279759  DOWN
2   -0.8816796  DOWN
3   -1.8028842  DOWN
4   -0.7788964  DOWN
5   -0.9352764  DOWN

> tail (dat)
143  1.2182830    UP
144  1.0698604    UP
145  0.7518891    UP
146  1.1097739    UP
147  0.6468804    UP
148  0.9988166    UP
Plot dat as density plot using ggplot2,

> library(ggplot2)

> pdf("pvalues.pdf")
> ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5) + ggtitle("Upregulated and downregulated gene -logFC distribution") +xlab("-logFC")+ylab("Density")+theme(axis.text=element_text(size=18),
+         axis.title=element_text(size=16), plot.title=element_text(size=16))
> dev.off()



Similarly, make a density plot for adjusted p-values.

> dat <- data.frame(dens = c(-log(down$padj), -log(up$padj)), lines = rep(c("DOWN", "UP"), c(length(down$padj), length(up$padj))))

> head (dat)
       dens lines
1 21.082925  DOWN
2  4.031827  DOWN
3 30.591124  DOWN
4  5.175667  DOWN
5  5.856627  DOWN
6  4.568621  DOWN
> tail (dat)
         dens lines
143 10.512156    UP
144 12.505760    UP
145  6.208198    UP
146 14.345963    UP
147  3.691205    UP
148 10.043970    UP
> pdf("pvalues.pdf")
> ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5) + ggtitle("Upregulated and downregulated gene pvalues distribution") +xlab("-logPvalue")+ylab("Density")+theme(axis.text=element_text(size=18),
+         axis.title=element_text(size=16), plot.title=element_text(size=16))
> dev.off()

No comments:

Post a Comment