Tuesday, March 29, 2016

Find SAM files in subfolders and convert them to BAM format by executing samtools within the subfolders

Let's say you have SAM files in subfolders of the current folder and you would like to convert them to BAM format using samtools. One way is to copy them all to another folder with the find command and then run samtools in a loop from that folder. However, you can instead find each SAM file and run samtools inside the subfolder that contains it by using the -execdir option of find.

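If you don't specify -execdir but only -exec, the command runs in the folder from which you started find (here, the parent folder). The output BAM file is then written there, and because every output has the same name, each conversion overwrites the previous one.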
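The difference is easy to check without samtools. The sketch below builds a throwaway tree (the sampleA and sampleB folder names and the *_out.txt file names are made up for the demo) and uses cat as a stand-in for the conversion step. The file name is passed to sh as "$1".

```shell
# Throwaway tree that mimics the layout (sampleA/sampleB are made-up names).
demo=$(mktemp -d)
mkdir -p "$demo/sampleA/Pass1" "$demo/sampleB/Pass1"
echo "reads from A" > "$demo/sampleA/Pass1/Aligned.out.sam"
echo "reads from B" > "$demo/sampleB/Pass1/Aligned.out.sam"
cd "$demo" || exit 1

# -exec runs the command in the directory where find was started,
# so every match writes to the same ./exec_out.txt and earlier output is lost.
find . -name '*.sam' -exec sh -c 'cat "$1" > exec_out.txt' sh {} \;

# -execdir runs the command in the directory that holds each match,
# so each Pass1 folder gets its own execdir_out.txt.
find . -name '*.sam' -execdir sh -c 'cat "$1" > execdir_out.txt' sh {} \;

# List the files that were created.
find . -name '*_out.txt' | sort
```

Only one exec_out.txt should appear, in the top folder, while each Pass1 folder should contain its own execdir_out.txt.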

Find all SAM files in subfolders:

root@valkyr:~/tmp_rnaseq# sudo find ./  -regex '^.*\/.*sam$' -exec ls -l {} \;
-rw-r--r-- 1 root root 39416554017 Mar 29 02:23 ./9071501.8_26436_ATCACG/Pass1/Aligned.out.sam
-rw-r--r-- 1 root root 53346566430 Jan 15 19:11 ./59885590_26425_CGATGT/Pass1/Aligned.out.sam
-rw-r--r-- 1 root root 49704309954 Jan 15 20:13 ./8072501_26426_TGACCA/Pass1/Aligned.out.sam

Execute samtools within subfolders:

sudo find ./ -regex '^.*\/.*sam$' -execdir sh -c 'samtools view -bS "$1" > Aligned.out.bam' sh {} \;

Passing {} to sh as "$1", rather than embedding it in the command string, keeps unusual file names from being interpreted by the shell.

You could achieve the same with a for loop that goes into each subdirectory and runs samtools; however, writing such a loop can be tricky if the subfolder names are complicated. The find one-liner is quicker and easier.
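For comparison, a loop-based version might look like the sketch below. It assumes the <sample>/Pass1/ layout shown in the listing above, and convert_all is a hypothetical helper name, not part of samtools. Quoting every expansion is what keeps odd folder names from breaking it.

```shell
# Convert every SAM file found under <root>/<sample>/Pass1/ to BAM,
# writing each BAM next to its SAM file.
convert_all() {
    root=${1:-.}
    for sam in "$root"/*/Pass1/*.sam; do
        [ -e "$sam" ] || continue      # the glob matched nothing
        (
            # The subshell keeps the cd from leaking into the next iteration.
            cd "$(dirname "$sam")" || exit 1
            name=$(basename "$sam" .sam)
            samtools view -bS "$name.sam" > "$name.bam"
        )
    done
}
```

Calling convert_all with no argument processes the current folder. Unlike the find command, this only handles the one folder depth written into the glob.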

