Sunday, October 30, 2016

Hierarchical clustering of correlation matrices to find co-regulated genes

If you have obtained a correlation matrix for a set of genes and would like to cluster it, you can use the following R code.

In this example, dissimilarity is calculated as 1 - abs(cor), that is, one minus the absolute value of the correlation. This formula gives the largest discrimination of the correlated pairs compared to other methods, such as 1-abs(cor^2), 1-cor, or (1-cor)/2. Ref.
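To make the difference between these formulas concrete, here is a minimal sketch that applies each of them to a few correlation values. The values in `r` and the column names are made up for illustration and are not from the original data.

```r
# Hypothetical correlation values, chosen only for illustration
r <- c(-0.9, -0.5, 0, 0.5, 0.9)

# The four dissimilarity formulas mentioned above, side by side
formulas <- data.frame(
  cor        = r,
  one_abs    = 1 - abs(r),    # the formula used in this post
  one_abs_sq = 1 - abs(r^2),
  one_minus  = 1 - r,
  half_minus = (1 - r) / 2
)
print(formulas)
```

Note that 1 - abs(cor) gives strongly negative and strongly positive correlations the same small dissimilarity, so anti-correlated genes end up in the same cluster. With 1-cor or (1-cor)/2, anti-correlated genes are instead treated as the most distant pairs.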


#load data matrix
data <- read.delim("data.txt", header = TRUE, row.names = 1)
data[] <- 1

#list matrix

#use cor to make a correlation matrix
#(cor correlates columns; use cor(t(data)) if the genes are in rows)
correlation <- cor(data)

#make dissimilarity matrix as 1 - absolute value of the correlation
dissimilarity <- 1 - abs(correlation)

#convert the dissimilarity matrix to a distance object
distance <- as.dist(dissimilarity)

#create pdf

#plot matrix using heatmap.2 from gplots
After plotting, the resulting graph shows clusters of correlated genes. The hierarchical clustering is done by heatmap.2, which itself uses the R function hclust for clustering.
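The steps above can be put together as a self-contained sketch. It is only an illustration under several assumptions: the expression data are simulated rather than read from data.txt, genes are placed in columns, the gene names, the output file name correlation_heatmap.pdf and the heatmap.2 arguments are all chosen for the example, and the tree is built explicitly with hclust on the 1 - abs(cor) distance and then passed to heatmap.2. The plotting step is skipped if gplots is not installed.

```r
set.seed(1)

# Simulated expression data (assumption): 20 samples in rows, 6 genes in columns.
# geneA, geneB and geneC share one signal (geneC inverted), geneD and geneE another.
n <- 20
base1 <- rnorm(n)
base2 <- rnorm(n)
expr <- cbind(
  geneA = base1 + rnorm(n, sd = 0.2),
  geneB = base1 + rnorm(n, sd = 0.2),
  geneC = -base1 + rnorm(n, sd = 0.2),
  geneD = base2 + rnorm(n, sd = 0.2),
  geneE = base2 + rnorm(n, sd = 0.2),
  geneF = rnorm(n)
)

# Gene-by-gene correlation matrix (cor works on columns)
correlation <- cor(expr)

# Dissimilarity as 1 - absolute value of the correlation, then a dist object
dissimilarity <- 1 - abs(correlation)
distance <- as.dist(dissimilarity)

# Hierarchical clustering on the precomputed distance
tree <- hclust(distance)
clusters <- cutree(tree, k = 3)  # k = 3 is an arbitrary choice for this example
print(clusters)

# Plot the correlation matrix, ordered by the tree built above
if (requireNamespace("gplots", quietly = TRUE)) {
  dend <- as.dendrogram(tree)
  pdf("correlation_heatmap.pdf")  # hypothetical output file name
  gplots::heatmap.2(correlation,
                    Rowv = dend, Colv = dend,
                    symm = TRUE,
                    trace = "none", density.info = "none")
  dev.off()
}
```

Passing the dendrogram through Rowv and Colv makes the heatmap use the 1 - abs(cor) distance. If heatmap.2 is called on the matrix alone, it computes its own distances with dist before calling hclust.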
