Brad Chapman suggested to use
'samtools index'
as an unsorted BAM would return an error.On my side I wrote the following 'C' program using the Samtools API:
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
moved to: https://github.com/lindenb/samtools-utilities/blob/master/src/bamsorted.c |
Compilation
FLAG64= -m64
gcc ${FLAG64} -o bamsorted -O3 -I ${SAMDIR} -L ${SAMDIR} bamsorted.c -lbam -lz
gcc ${FLAG64} -o bamsorted -O3 -I ${SAMDIR} -L ${SAMDIR} bamsorted.c -lbam -lz
Test
Using samtools index
> for F in `find . -name "*.bam"`; do echo $F; samtools index ${F} ; done
GALAXY/galaxy-dist/test-data/sam_merge_out2.bam
GALAXY/galaxy-dist/test-data/sam_merge_in1.bam
GALAXY/galaxy-dist/test-data/srma_out1.bam
GALAXY/galaxy-dist/test-data/srma_out2.bam
GALAXY/galaxy-dist/test-data/sam_merge_in3.bam
GALAXY/galaxy-dist/test-data/sam_merge_in2.bam
GALAXY/galaxy-dist/test-data/1.bam
GALAXY/galaxy-dist/test-data/srma_in1.bam
GALAXY/galaxy-dist/test-data/sam_to_bam_out1.bam
GALAXY/galaxy-dist/test-data/3.bam
[bam_index_core] the alignment is not sorted (HWI-EAS91_1_30788AAXX:1:1:799:192): 14955 > 8420 in 46-th chr
GALAXY/galaxy-dist/test-data/srma_in3.bam
GALAXY/galaxy-dist/test-data/sam_pileup_in1.bam
GALAXY/galaxy-dist/test-data/sam_to_bam_out2.bam
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/1.bam
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/3.bam
[bam_index_core] the alignment is not sorted (HWI-EAS91_1_30788AAXX:1:1:799:192): 14955 > 8420 in 46-th chr
GALAXY/galaxy-dist/test-data/sam_merge_out2.bam
GALAXY/galaxy-dist/test-data/sam_merge_in1.bam
GALAXY/galaxy-dist/test-data/srma_out1.bam
GALAXY/galaxy-dist/test-data/srma_out2.bam
GALAXY/galaxy-dist/test-data/sam_merge_in3.bam
GALAXY/galaxy-dist/test-data/sam_merge_in2.bam
GALAXY/galaxy-dist/test-data/1.bam
GALAXY/galaxy-dist/test-data/srma_in1.bam
GALAXY/galaxy-dist/test-data/sam_to_bam_out1.bam
GALAXY/galaxy-dist/test-data/3.bam
[bam_index_core] the alignment is not sorted (HWI-EAS91_1_30788AAXX:1:1:799:192): 14955 > 8420 in 46-th chr
GALAXY/galaxy-dist/test-data/srma_in3.bam
GALAXY/galaxy-dist/test-data/sam_pileup_in1.bam
GALAXY/galaxy-dist/test-data/sam_to_bam_out2.bam
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/1.bam
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/3.bam
[bam_index_core] the alignment is not sorted (HWI-EAS91_1_30788AAXX:1:1:799:192): 14955 > 8420 in 46-th chr
Using my C code
> for F in `find . -name "*.bam"`; do ./bamsorted ${F} ; done
GALAXY/galaxy-dist/test-data/sam_merge_out2.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in1.bam OK
GALAXY/galaxy-dist/test-data/srma_out1.bam OK
GALAXY/galaxy-dist/test-data/srma_out2.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in3.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in2.bam OK
GALAXY/galaxy-dist/test-data/1.bam OK
GALAXY/galaxy-dist/test-data/srma_in1.bam OK
GALAXY/galaxy-dist/test-data/sam_to_bam_out1.bam OK
GALAXY/galaxy-dist/test-data/3.bam Unsorted: On Reference[45]="chrM" position=8420 after 14955.
GALAXY/galaxy-dist/test-data/srma_in3.bam OK
GALAXY/galaxy-dist/test-data/sam_pileup_in1.bam OK
GALAXY/galaxy-dist/test-data/sam_to_bam_out2.bam OK
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/1.bam OK
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/3.bam Unsorted: On Reference[45]="chrM" position=8420 after 14955.
GALAXY/galaxy-dist/test-data/sam_merge_out2.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in1.bam OK
GALAXY/galaxy-dist/test-data/srma_out1.bam OK
GALAXY/galaxy-dist/test-data/srma_out2.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in3.bam OK
GALAXY/galaxy-dist/test-data/sam_merge_in2.bam OK
GALAXY/galaxy-dist/test-data/1.bam OK
GALAXY/galaxy-dist/test-data/srma_in1.bam OK
GALAXY/galaxy-dist/test-data/sam_to_bam_out1.bam OK
GALAXY/galaxy-dist/test-data/3.bam Unsorted: On Reference[45]="chrM" position=8420 after 14955.
GALAXY/galaxy-dist/test-data/srma_in3.bam OK
GALAXY/galaxy-dist/test-data/sam_pileup_in1.bam OK
GALAXY/galaxy-dist/test-data/sam_to_bam_out2.bam OK
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/1.bam OK
GALAXY/galaxy-dist/lib/galaxy/datatypes/test/3.bam Unsorted: On Reference[45]="chrM" position=8420 after 14955.
That's it,
Pierre
Thank you Pierre. Very useful !
ReplyDeleteHi Pierre,
ReplyDeleteI am confused. Does our script double check the bam files being sorted? Because I didn't get any error using "samtools index". Could it not be sorted?
It should only print a message if the reads aren't sorted.
ReplyDeletevery useful script. Thank you!
ReplyDelete