Wednesday, March 30, 2016

Finding SAM files in subfolders with find and regex, then converting to BAM, sorting, and indexing within each subfolder

Continuing from the previous post: to find all SAM files in subdirectories, convert them to BAM, and then sort and index the results inside those same subdirectories, use the commands below.
The -regex option of the Unix find command matches its pattern against the whole path, not just the file name, so start the pattern with .* or .*\/ to cover the leading directories.
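To see the full-path matching in isolation, here is a small sketch that builds a throwaway directory tree and counts matches for three patterns. The directory names (sampleA, sampleB) are made up for the demo and are not the real sample folders from this post.

```shell
# Scratch tree that mimics the layout in this post (directory names are made up).
demo=$(mktemp -d)
mkdir -p "$demo/sampleA/Pass2" "$demo/sampleB/Pass2"
touch "$demo/sampleA/Pass2/Aligned.out.sam" \
      "$demo/sampleB/Pass2/Aligned.out.sam" \
      "$demo/sampleA/Pass2/Log.out"

# No leading .*: the pattern has to match the whole path, so nothing is found.
bare=$(find "$demo" -regex 'Aligned\.out\.sam' | wc -l | tr -d ' ')

# A leading .* absorbs the directories in front of the file name.
prefixed=$(find "$demo" -regex '.*/Aligned\.out\.sam' | wc -l | tr -d ' ')

# -name compares against the base name only, so it needs no prefix.
by_name=$(find "$demo" -name '*.sam' | wc -l | tr -d ' ')

echo "bare=$bare prefixed=$prefixed by_name=$by_name"
rm -rf "$demo"
```

This prints bare=0 prefixed=2 by_name=2. When the only goal is to select files by extension, -name '*.sam' is a simpler alternative to -regex.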

List SAM files:

root@valkyr:~/tmp_rnaseq# sudo find ./  -regex '^.*\/.*sam$' -exec ls -l {} \;
-rw-r--r-- 1 root root 39437555424 Mar 29 03:07 ./9071501.8_26436_ATCACG/Pass2/Aligned.out.sam
-rw-r--r-- 1 root root 53567653787 Jan 15 20:01 ./59885590_26425_CGATGT/Pass2/Aligned.out.sam
-rw-r--r-- 1 root root 49754863117 Jan 15 21:01 ./8072501_26426_TGACCA/Pass2/Aligned.out.sam

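Convert to BAM (-execdir runs the command from each file's own directory, so Aligned.out.bam is written next to its SAM file):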

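sudo find ./ -regex '^.*\/.*sam$' -execdir sh -c 'samtools view -bS {} > Aligned.out.bam' \;

Sort BAM files (this is the legacy "samtools sort in.bam out.prefix" form; samtools 1.3 and later expect "samtools sort -o out.bam in.bam" instead):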

sudo find ./ -regex '^.*\/.*bam$' -execdir sh -c 'samtools sort {} {}.sort' \;

List sorted BAM files:

sudo find ./ -regex '^.*\/.*sort.bam$' -execdir sh -c 'ls -l {}' \;
-rw-r--r-- 1 root root 4516449286 Mar 30 13:05 ./Aligned.out.bam.sort.bam
-rw-r--r-- 1 root root 5993809569 Mar 30 13:44 ./Aligned.out.bam.sort.bam
-rw-r--r-- 1 root root 5404073437 Mar 30 14:18 ./Aligned.out.bam.sort.bam

Index BAM files:

sudo find ./ -regex '^.*\/.*sort.bam$' -execdir sh -c 'samtools index {}' \;
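The convert, sort, and index steps above can also be previewed in one pass before committing to hours of work on files this large. The sketch below only prints the samtools commands it would run. It makes several assumptions that differ from the commands in this post: plan_sam_pipeline is a hypothetical helper name, it uses -exec with the full path instead of -execdir, it uses the samtools 1.3+ "sort -o" syntax, and it names the sorted file Aligned.out.sort.bam rather than Aligned.out.bam.sort.bam.

```shell
# Print (not run) the samtools commands for every .sam file under a directory.
# plan_sam_pipeline is a hypothetical helper name, not part of samtools or find.
plan_sam_pipeline() {
    root=${1:-.}
    # Each file name reaches the inner shell as a positional parameter,
    # so spaces or quotes in a path are not re-parsed as shell code.
    find "$root" -type f -name '*.sam' -exec sh -c '
        for sam do
            base=${sam%.sam}    # e.g. ./sampleA/Pass2/Aligned.out
            echo samtools view -b -o "$base.bam" "$sam"
            echo samtools sort -o "$base.sort.bam" "$base.bam"   # samtools >= 1.3
            echo samtools index "$base.sort.bam"
        done
    ' sh {} +
}
```

Calling plan_sam_pipeline ./ from the top-level folder prints three lines per SAM file. Because the output path is derived from the input path, each BAM file would still land in the same subdirectory as its SAM file. Removing the three echo words turns the preview into the real run.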
