Random sampling of the initial data set provides a valuable baseline for genome-wide correlation analysis: it helps identify true causal correlations and reveal false positive ones. Here is an example of a false positive correlation between a transcription factor (TF) and a chromatin modification. The chromatin modification peak has the same height for TF-occupied genes as for random gene subsets of the same size.
Conversely, if the peak lies above the randomized ones, the correlation has a causal character for the two features tested. In the case below, the tested chromatin modification associates with transcription factor binding near gene promoters.
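The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the source code published with the article: the function names `peak_height` and `random_sampling_test`, the use of the mean signal as a stand-in for peak height, and the toy data are all hypothetical.

```python
import random

def peak_height(signal_by_gene, genes):
    """Mean modification signal over a gene set (assumed stand-in for peak height)."""
    genes = list(genes)
    return sum(signal_by_gene[g] for g in genes) / len(genes)

def random_sampling_test(signal_by_gene, target_genes, n_samples=1000, seed=0):
    """Compare the target set's peak with peaks of random gene subsets of the same size.

    Returns (observed peak, mean random peak, empirical one-sided p-value).
    """
    rng = random.Random(seed)
    all_genes = sorted(signal_by_gene)
    k = len(target_genes)
    observed = peak_height(signal_by_gene, target_genes)
    # Peak heights for random subsets drawn without replacement from all genes.
    random_peaks = [
        peak_height(signal_by_gene, rng.sample(all_genes, k))
        for _ in range(n_samples)
    ]
    # Fraction of random subsets whose peak is at least as high as the observed one
    # (the +1 terms keep the estimate away from an exact zero).
    n_at_least = sum(h >= observed for h in random_peaks)
    p_value = (n_at_least + 1) / (n_samples + 1)
    return observed, sum(random_peaks) / n_samples, p_value

# Toy data (hypothetical): 2000 genes, the first 200 treated as TF-occupied.
rng = random.Random(42)
genes = [f"gene{i}" for i in range(2000)]
tf_bound = genes[:200]
# Case 1: the modification signal is unrelated to TF occupancy.
unrelated = {g: rng.gauss(1.0, 1.0) for g in genes}
# Case 2: TF-occupied genes carry a higher modification signal.
enriched = {g: rng.gauss(3.0 if i < 200 else 1.0, 1.0) for i, g in enumerate(genes)}

for label, signal in [("unrelated", unrelated), ("enriched", enriched)]:
    obs, rand_mean, p = random_sampling_test(signal, tf_bound)
    print(f"{label}: observed={obs:.2f} random mean={rand_mean:.2f} p={p:.3f}")
```

In the unrelated case the observed peak should sit within the spread of the random peaks, while in the enriched case it should lie clearly above them. A real analysis would replace the per-gene mean with the actual signal profile around promoters.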
This kind of analysis is necessary to determine whether a correlation on a genome-wide scale is meaningful, i.e. whether it implies a causal relationship between the tested parameters. Researchers doing whole-genome analysis may be misled by correlation peaks and draw wrong conclusions unless this kind of approach is applied in parallel. Detailed information and the source code for the random sampling can be downloaded from the BMC Genomics web page.
Link to original article: http://www.biomedcentral.com/1471-2164/12/181