Thursday, May 26, 2016

Looping samtools over multiple SAM files and converting SAM to BAM with find/xargs

You may have many SAM files spread across multiple subfolders and want to convert them all to BAM without visiting each folder by hand. This is particularly useful when the SAM files all share the same name and each converted BAM file should stay in its own subfolder. The three commands below pipe find into xargs to convert each SAM file to BAM, sort the BAM, and index the sorted BAM. The sort step uses the older samtools syntax (input file followed by an output prefix), which writes its result to <prefix>.bam:

find . -name '*sam' | xargs -I % sh -c 'echo %; samtools view -bS % > %.bam;'
find . -name '*sam.bam' | xargs -I % sh -c 'echo %; samtools sort % %.sorted;'
find . -name '*sam.bam.sorted.bam' | xargs -I % sh -c 'echo %; samtools index %;'
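
The conversion step can also be written so that it survives folder names containing spaces or quotes. The following is only a sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The function name convert_sam_tree and the SAMTOOLS override variable are made up for this example and are not part of samtools.

```shell
# Sketch: convert every *.sam under a root directory to a BAM file next to it.
# SAMTOOLS is a hypothetical override variable (not a samtools feature);
# it defaults to the real samtools binary.
convert_sam_tree() {
    root=${1:-.}
    # Quoting '*.sam' stops the shell from expanding the pattern before find sees it.
    # -print0 / -0 keep paths with spaces or quotes in one piece.
    # The path is handed to sh as "$1" instead of being pasted into the command text.
    find "$root" -type f -name '*.sam' -print0 |
        xargs -0 -I % sh -c 'echo "$1"; "$2" view -bS "$1" > "$1.bam"' sh % "${SAMTOOLS:-samtools}"
}
```

Calling convert_sam_tree . from the top folder does the same job as the first command above. Setting SAMTOOLS=echo beforehand gives a rough dry run: each .bam file then holds the samtools arguments instead of real BAM data, so delete those files afterwards.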

Here is the output of the second command. You can see that find descends into subfolders of any depth (here the Pass2 folders generated by STAR) and pipes each matching path to xargs.

find . -name '*sam.bam' | xargs -I % sh -c 'echo %; samtools sort % %.sorted;'
./mS10_1.fastq.gz_mS10_2.fastq.gz/Pass2/Aligned.out.sam.bam
[bam_sort_core] merging from 80 files...
./FBS2_S4_merged_R1_001.fastq.gz_FBS2_S4_merged_R2_001.fastq.gz/Pass2/Aligned.out.sam.bam
[bam_sort_core] merging from 101 files...
./FBS4_S5_merged_R1_001.fastq.gz_FBS4_S5_merged_R2_001.fastq.gz/Pass2/Aligned.out.sam.bam
[bam_sort_core] merging from 54 files...
./FBS6_S6_merged_R1_001.fastq.gz_FBS6_S6_merged_R2_001.fastq.gz/Pass2/Aligned.out.sam.bam
[bam_sort_core] merging from 52 files...
./SF4_S1_merged_R1_001.fastq.gz_SF4_S1_merged_R2_001.fastq.gz/Pass2/Aligned.out.sam.bam
[bam_sort_core] merging from 93 files...
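
The three passes over the folder tree can also be folded into one pass that converts, sorts, and indexes each file in turn. This is a sketch only, assuming samtools 1.3 or newer, where sort takes its output file via -o rather than a prefix. The function name sam_to_sorted_bam and the SAMTOOLS override variable are invented for this example.

```shell
# Sketch: convert, sort and index each SAM file in a single pass.
# Assumes samtools 1.3 or newer (sort writes to the file given with -o).
# SAMTOOLS is a hypothetical override variable, handy for dry runs.
sam_to_sorted_bam() {
    root=${1:-.}
    find "$root" -type f -name '*.sam' -print0 |
        xargs -0 -I % sh -c '
            st=$2
            echo "$1"
            # && stops the chain for this file as soon as one step fails
            "$st" view -b -o "$1.bam" "$1" &&
            "$st" sort -o "$1.bam.sorted.bam" "$1.bam" &&
            "$st" index "$1.bam.sorted.bam"
        ' sh % "${SAMTOOLS:-samtools}"
}
```

Running SAMTOOLS=echo and then sam_to_sorted_bam . prints the samtools calls that would be made for each file without touching any data. The output names follow the same pattern as above (Aligned.out.sam.bam and Aligned.out.sam.bam.sorted.bam).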
