23 April 2013

playing with BWA-MEM : my notebook

BWA-MEM, is a new tool that is part of the latest version of BWA. As said Heng Li on Biostar bwa-mem can be used to identify the viral integration sites in the human genome. Here I've used various short reads to explore how bwa maps the pairs. The sequences I used below are:

  • NOTCH2a and NOTCH2b two sequences on the chromosomes 1, on the same strand
  • NOTCH2del : same as NOTCH2a but with a deletion
  • ROXAN1a and ROXAN1b: two sequences on the chromosome 22, on the same strand
  • NOTCH2ins is NOTCH2a with and insertion of ROXAN1a
The following results were generated with the makefile below. Each pair of fastq was submitted twice (1_* and 2_*). I also added some pairs of reads from another source of FASTQs to let bwa infer the size of the fragments (see biostar: http://www.biostars.org/p/69694).

NOTCH2a vs NOTCH2b BWA-MEM options: none

NOTCH2a vs NOTCH2b
BWA-MEM options: none
NOT Properly mapped reads because they're on the same strand.
ReadFlagChromPosqualCigarExtra
1_Notch2p1=651120572000950MNM:i:0 AS:i:50 XS:i:45
1_Notch2p2=1291120572200950MNM:i:0 AS:i:50 XS:i:45
2_Notch2p11120572000950MNM:i:0 AS:i:50 XS:i:45
2_Notch2p21120572200950MNM:i:0 AS:i:50 XS:i:45

NOTCH2a vs revcomp(NOTCH2b) BWA-MEM options: none

NOTCH2a vs revcomp(NOTCH2b)
BWA-MEM options: none
Properly mapped reads.
ReadFlagChromPosqualCigarExtra
1_Notch2pPR1=9911205720003950MNM:i:0 AS:i:50 XS:i:45
1_Notch2pPr2=14711205722003950MNM:i:0 AS:i:50 XS:i:45
2_Notch2pPR111205720003950MNM:i:0 AS:i:50 XS:i:45
2_Notch2pPr211205722003950MNM:i:0 AS:i:50 XS:i:45

NOTCH2del vs revcomp(NOTCH2b) BWA-MEM options: none

NOTCH2del vs revcomp(NOTCH2b)
BWA-MEM options: none
Deletion in Read1
ReadFlagChromPosqualCigarExtra
1_Notch2delpPR1=9911205720003937M20D31MNM:i:20 AS:i:42 XS:i:37
1_Notch2delpPr2=14711205722003950MNM:i:0 AS:i:50 XS:i:45
2_Notch2delpPR111205720003937M20D31MNM:i:20 AS:i:42 XS:i:37
2_Notch2delpPr211205722003950MNM:i:0 AS:i:50 XS:i:45

NOTCH2ins vs revcomp(NOTCH2b) BWA-MEM options:

NOTCH2ins vs revcomp(NOTCH2b)
BWA-MEM options: non
Soft clipping of chr1: two hits on different chromosomes.
ReadFlagChromPosqualCigarExtra
1_NOTCH2insRoxanpR1=9722417215666023S50M27SNM:i:0 AS:i:50 XS:i:0
1_NOTCH2insRoxanpr2=1451120572200950MNM:i:0 AS:i:50 XS:i:45
2_NOTCH2insRoxanpR122417215666023S50M27SNM:i:0 AS:i:50 XS:i:0
2_NOTCH2insRoxanpr21120572200950MNM:i:0 AS:i:50 XS:i:45


NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: none

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)
BWA-MEM options: none
two hits on chromosome chr1 and chr22 for the 5' seq
nevertheless, properly paired was set.
ReadFlagChromPosqualCigarExtra
1_Notch2RoxanpPR1=991120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
1_Notch2RoxanpPR1=992241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
1_Notch2RoxanpPr2=1471120572200950MNM:i:0 AS:i:50 XS:i:45
2_Notch2RoxanpPR11120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
2_Notch2RoxanpPR12241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
2_Notch2RoxanpPr21120572200950MNM:i:0 AS:i:50 XS:i:45

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b+ROXAN1b) BWA-MEM options: none

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b+ROXAN1b)
BWA-MEM options: none
All fragments on chr1/chr22.
Reads are NOT properly paired.
ReadFlagChromPosqualCigarExtra
1_Notch2RoxanpR1=971120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
1_Notch2RoxanpR1=972241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
1_Notch2Roxanpr2=14522417217666050S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,-120572200,50M50S,9,0;
1_Notch2Roxanpr2=1451120572200950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,-41721766,50S50M,60,0;
2_Notch2RoxanpR11120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
2_Notch2RoxanpR12241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
2_Notch2Roxanpr222417217666050S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,-120572200,50M50S,9,0;
2_Notch2Roxanpr21120572200950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,-41721766,50S50M,60,0;

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: -C

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)
BWA-MEM options: -C
I cannot see the effect of the option '-C':
"append FASTA/FASTQ comment to SAM output"
ReadFlagChromPosqualCigarExtra
1_Notch2RoxanpPR1=991120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
1_Notch2RoxanpPR1=992241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
1_Notch2RoxanpPr2=1471120572200950MNM:i:0 AS:i:50 XS:i:45
2_Notch2RoxanpPR11120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
2_Notch2RoxanpPR12241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
2_Notch2RoxanpPr21120572200950MNM:i:0 AS:i:50 XS:i:45

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: -M

NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)
BWA-MEM options: -M
Set the duplicate flags for the multiple hits.
ReadFlagChromPosqualCigarExtra
1_Notch2RoxanpPR1=991120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
1_Notch2RoxanpPR1s=3552241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
1_Notch2RoxanpPr21120572200950MNM:i:0 AS:i:50 XS:i:45
2_Notch2RoxanpPR11120572000950M50SNM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0;
2_Notch2RoxanpPR1s2241721566950S50MNM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0;
2_Notch2RoxanpPr21120572200950MNM:i:0 AS:i:50 XS:i:45

NOTCH2ins vs revcomp(ROXAN1b) BWA-MEM options: none

NOTCH2ins vs revcomp(ROXAN1b)
BWA-MEM options: none
Reads are mapped on chr22.
ReadFlagChromPosqualCigarExtra
1_NOTCH2insRoxanpPR122417215666023S50M27SNM:i:0 AS:i:50 XS:i:0
1_NOTCH2insRoxanpPr222417217666050MNM:i:0 AS:i:50 XS:i:0
2_NOTCH2insRoxanpPR122417215666023S50M27SNM:i:0 AS:i:50 XS:i:0
2_NOTCH2insRoxanpPr222417217666050MNM:i:0 AS:i:50 XS:i:0

That's it,

Pierre

The makefile

No comments: