Wednesday, April 20, 2016

Tools to convert bam to bigwig

These are two scripts I wrote, bam2bigwig and bamsplit2bigwig, to convert bam files to bigwig for UCSC Genome Browser viewing. The difference is that bamsplit2bigwig will separate reads according to the mapping strand and the separately convert them to bigwig, which may be useful in some cases.

bam2bigwig
This script will take a bam file mapped to the human genome hg19 and convert it to a bigwig file that could be loaded to UCSC Genome Browser. It depends on bedtools bamtobed (aka bamToBed), and it will attempt to download and execute three UCSC scripts, bedItemOverlapCount, bedGraphToBigWig and fetchChromSizes, so make sure your connection is working.

Usage
chmod 775 bam2bigwig
./bam2bigwig file.bam
Code:

#! /bin/bash

#create bed from bam, requires bedtools bamToBed
bamToBed -i $1 -split > accepted_hits.bed

#download bedItemOverlapCount
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedItemOverlapCount
chmod 775 bedItemOverlapCount

#fetch hg19 chromosome sizes
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes
chmod 775 fetchChromSizes
./fetchChromSizes hg19 > hg19.chrom.sizes

#create plus and minus strand bedgraph
cat accepted_hits.bed | sort -k1,1 | ./bedItemOverlapCount hg19 -chromSize=hg19.chrom.sizes stdin | sort -k1,1 -k2,2n > accepted_hits.bedGraph

#download bedGraphToBigWig
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedGraphToBigWig
chmod 755 bedGraphToBigWig

#create plus and minus strand bigwig
./bedGraphToBigWig accepted_hits.bedGraph hg19.chrom.sizes $1.bw

#removing intermediery files
rm accepted_hits.bed
rm accepted_hits.bedGraph
rm bedItemOverlapCount
rm bedGraphToBigWig
rm fetchChromSizes
rm hg19.chrom.sizes

bamsplit2bigwig
This script will take a bam file mapped to the human genome hg19 and convert it to two strand specific bigwig files that could be loaded to UCSC Genome Browser. It depends on bedtools bamtobed (aka bamToBed), and it will attempt to download and execute three UCSC scripts, bedItemOverlapCount, bedGraphToBigWig and fetchChromSizes, so make sure your connection is working. Converting bam to bigwig and separating according to strand could be useful to visualize nucleosomal or protein binding 'footprints' in ChIP-Seq experiments as at the boundaries of protein-DNA interactions will be marked with reads mapping on different DNA strands. 

Usage
chmod 775 bamsplit2bigwig
./bamsplit2bigwig file.bam
Code:

#! /bin/bash

#create bed from bam, requires bedtools bamToBed
bamToBed -i $1 -split > accepted_hits.bed

#download bedItemOverlapCount
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedItemOverlapCount
chmod 775 bedItemOverlapCount

#fetch hg19 chromosome sizes
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes
chmod 775 fetchChromSizes
./fetchChromSizes hg19 > hg19.chrom.sizes

#create plus and minus strand bedgraph
awk '{if($6=="+") print}' accepted_hits.bed | sort -k1,1 | ./bedItemOverlapCount hg19 -chromSize=hg19.chrom.sizes stdin | sort -k1,1 -k2,2n > accepted_hits.plus.bedGraph
awk '{if($6=="-") print}' accepted_hits.bed | sort -k1,1 | ./bedItemOverlapCount hg19 -chromSize=hg19.chrom.sizes stdin | sort -k1,1 -k2,2n | awk '{OFS="\t"; print $1,$2,$3,"-"$4}' > accepted_hits.minus.bedGraph

#download bedGraphToBigWig
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedGraphToBigWig
chmod 755 bedGraphToBigWig

#create plus and minus strand bigwig
./bedGraphToBigWig accepted_hits.plus.bedGraph hg19.chrom.sizes $1.plus.bw
./bedGraphToBigWig accepted_hits.minus.bedGraph hg19.chrom.sizes $1.minus.bw

#removing intermediery files
rm accepted_hits.bed
rm accepted_hits.plus.bedGraph
rm accepted_hits.minus.bedGraph
rm bedItemOverlapCount
rm bedGraphToBigWig
rm fetchChromSizes
rm hg19.chrom.sizes

3 comments:

  1. you can create a bedgraph directly from bam with bedtools alone:
    $ bedtools genomecov -bga -ibam file.bam > file.bedgraph
    you can then convert it to bigwig directly:
    $ bedGraphToBigWig file.bedgraph reference.fai file.bigwig
    the only requirement is the reference.fai file, that you should have already generated when you mapped the bam file.

    ReplyDelete

  2. An interesting and useful post, thank you very much for sharing clear and understandable concept scripting
    Richard Brown virtual data room pricing

    ReplyDelete
  3. Can you also filter out multimappers to keep only unique reads ?
    Thanks

    ReplyDelete