Thursday, May 11, 2017

Defining and calculating genome coverage, depth and breadth of coverage

The average coverage of the whole genome in a sequencing assay (i.e. the depth of coverage) is estimated with the formula:

coverage = (read count * read length) / total genome size
Similarly, you can estimate the coverage of an individual gene or gene locus:

coverage of the gene = (gene read count * read length) / gene size
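
As a quick illustration of the two formulas above, here is a minimal Python sketch. The function name `estimated_coverage` and all the numbers (read counts, read length, genome and gene sizes) are made up for the example and do not come from a real experiment.

```python
def estimated_coverage(read_count: int, read_length: int, region_size: int) -> float:
    """Average coverage = (read count * read length) / region size."""
    if region_size <= 0:
        raise ValueError("region_size must be a positive number of bases")
    return read_count * read_length / region_size


# Hypothetical whole-genome case: 1,000,000 reads of 100 bp on a 5,000,000 bp genome.
genome_cov = estimated_coverage(1_000_000, 100, 5_000_000)

# Hypothetical gene case: 450 reads of 100 bp assigned to a 1,500 bp gene.
gene_cov = estimated_coverage(450, 100, 1_500)

print(f"genome coverage: {genome_cov:.1f}x")
print(f"gene coverage:   {gene_cov:.1f}x")
```

The same function serves both cases because the only thing that changes is the region the reads are counted over.
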
In a real sequencing experiment, the mapped reads are distributed unevenly and, in addition, not all portions of the reads will map to the genome. Therefore, read coverage is calculated for each nucleotide (per-base read coverage), for example using bedtools genomecov with the -d option, which reports the chromosome, the 1-based position and the depth at that position:

bedtools genomecov -ibam file.bam -g my.genome -d | head

chr1  6  0
chr1  7  0
chr1  8  0
chr1  9  0
chr1  10 0
chr1  11 1
chr1  12 1
chr1  13 1
chr1  14 1
chr1  15 1
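
The per-base output can then be summarized into an average depth per chromosome. The following is only a sketch in Python: the helper name `mean_depth_per_chrom` is hypothetical, and the input is just the ten sample lines shown above, not a full genomecov run.

```python
from collections import defaultdict

# The ten sample lines from the genomecov output above (chromosome, position, depth).
sample_output = """\
chr1  6  0
chr1  7  0
chr1  8  0
chr1  9  0
chr1  10 0
chr1  11 1
chr1  12 1
chr1  13 1
chr1  14 1
chr1  15 1
"""


def mean_depth_per_chrom(lines):
    """Return {chromosome: mean per-base depth} from three-column genomecov -d lines."""
    totals = defaultdict(int)  # summed depth per chromosome
    counts = defaultdict(int)  # number of positions seen per chromosome
    for line in lines:
        fields = line.split()  # works for tab- or space-separated columns
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        chrom, _position, depth = fields
        totals[chrom] += int(depth)
        counts[chrom] += 1
    return {chrom: totals[chrom] / counts[chrom] for chrom in totals}


mean_depth = mean_depth_per_chrom(sample_output.splitlines())
print(mean_depth)
```

For a real BAM file the same function could read the command's output line by line instead of a pasted string, so the whole table never has to be held in memory.
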
Breadth of coverage, on the other hand, is a different term from depth of coverage: it is the proportion of the genome that is covered by reads. Both breadth and depth of coverage correlate with the sequencing depth, i.e. the number of reads generated in the sequencing experiment.
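
To make the difference concrete, breadth can be computed from per-base depths as the fraction of positions covered by at least some minimum number of reads. This is a rough Python sketch: the function name `breadth_of_coverage` and the `min_depth` threshold are assumptions made for illustration, and the depth values are the third column of the ten sample positions shown earlier in this post.

```python
def breadth_of_coverage(depths, min_depth=1):
    """Fraction of positions whose depth is at least min_depth."""
    depths = list(depths)
    if not depths:
        raise ValueError("no positions given")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Depths at chr1 positions 6 to 15 from the sample output in this post.
depths = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

breadth = breadth_of_coverage(depths)  # positions covered by at least 1 read
mean_depth = sum(depths) / len(depths)  # average depth over the same positions

print(f"breadth of coverage: {breadth:.0%}")
print(f"mean depth: {mean_depth:.1f}x")
```

Raising `min_depth` gives a stricter breadth, for example the fraction of the region covered by at least 10 reads.
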

For details see publication:

1 comment:

  1. This topic is probably suited to the interests of biologists or science enthusiasts. The content shows promise but exhibits disorganization. Hope that this is corrected in future posts.