playing with BWA-MEM : my notebook
BWA-MEM, is a new tool that is part of the latest version of BWA. As said Heng Li on Biostar bwa-mem can be used to identify the viral integration sites in the human genome. Here I've used various short reads to explore how bwa maps the pairs. The sequences I used below are:
- NOTCH2a and NOTCH2b two sequences on the chromosomes 1, on the same strand
- NOTCH2del : same as NOTCH2a but with a deletion
- ROXAN1a and ROXAN1b: two sequences on the chromosome 22, on the same strand
- NOTCH2ins is NOTCH2a with and insertion of ROXAN1a
NOTCH2a vs NOTCH2b BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2 | p1=65 | 1 | 120572000 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
1_Notch2 | p2=129 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2 | p1 | 1 | 120572000 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2 | p2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2a vs revcomp(NOTCH2b) BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2 | pPR1=99 | 1 | 120572000 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
1_Notch2 | pPr2=147 | 1 | 120572200 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2 | pPR1 | 1 | 120572000 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2 | pPr2 | 1 | 120572200 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2del vs revcomp(NOTCH2b) BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2del | pPR1=99 | 1 | 120572000 | 39 | 37M20D31M | NM:i:20 AS:i:42 XS:i:37 |
1_Notch2del | pPr2=147 | 1 | 120572200 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2del | pPR1 | 1 | 120572000 | 39 | 37M20D31M | NM:i:20 AS:i:42 XS:i:37 |
2_Notch2del | pPr2 | 1 | 120572200 | 39 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2ins vs revcomp(NOTCH2b) BWA-MEM options:
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_NOTCH2insRoxan | pR1=97 | 22 | 41721566 | 60 | 23S50M27S | NM:i:0 AS:i:50 XS:i:0 |
1_NOTCH2insRoxan | pr2=145 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_NOTCH2insRoxan | pR1 | 22 | 41721566 | 60 | 23S50M27S | NM:i:0 AS:i:50 XS:i:0 |
2_NOTCH2insRoxan | pr2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2Roxan | pPR1=99 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
1_Notch2Roxan | pPR1=99 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
1_Notch2Roxan | pPr2=147 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2Roxan | pPR1 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
2_Notch2Roxan | pPR1 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
2_Notch2Roxan | pPr2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2a+ROXAN1a vs revcomp(NOTCH2b+ROXAN1b) BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2Roxan | pR1=97 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
1_Notch2Roxan | pR1=97 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
1_Notch2Roxan | pr2=145 | 22 | 41721766 | 60 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,-120572200,50M50S,9,0; |
1_Notch2Roxan | pr2=145 | 1 | 120572200 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,-41721766,50S50M,60,0; |
2_Notch2Roxan | pR1 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
2_Notch2Roxan | pR1 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
2_Notch2Roxan | pr2 | 22 | 41721766 | 60 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,-120572200,50M50S,9,0; |
2_Notch2Roxan | pr2 | 1 | 120572200 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,-41721766,50S50M,60,0; |
NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: -C
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2Roxan | pPR1=99 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
1_Notch2Roxan | pPR1=99 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
1_Notch2Roxan | pPr2=147 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2Roxan | pPR1 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
2_Notch2Roxan | pPR1 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
2_Notch2Roxan | pPr2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2a+ROXAN1a vs revcomp(NOTCH2b) BWA-MEM options: -M
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_Notch2Roxan | pPR1=99 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
1_Notch2Roxan | pPR1s=355 | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
1_Notch2Roxan | pPr2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
2_Notch2Roxan | pPR1 | 1 | 120572000 | 9 | 50M50S | NM:i:0 AS:i:50 XS:i:45 XP:Z:22,+41721566,50S50M,9,0; |
2_Notch2Roxan | pPR1s | 22 | 41721566 | 9 | 50S50M | NM:i:0 AS:i:50 XS:i:0 XP:Z:1,+120572000,50M50S,9,0; |
2_Notch2Roxan | pPr2 | 1 | 120572200 | 9 | 50M | NM:i:0 AS:i:50 XS:i:45 |
NOTCH2ins vs revcomp(ROXAN1b) BWA-MEM options: none
Read | Flag | Chrom | Pos | qual | Cigar | Extra |
---|---|---|---|---|---|---|
1_NOTCH2insRoxan | pPR1 | 22 | 41721566 | 60 | 23S50M27S | NM:i:0 AS:i:50 XS:i:0 |
1_NOTCH2insRoxan | pPr2 | 22 | 41721766 | 60 | 50M | NM:i:0 AS:i:50 XS:i:0 |
2_NOTCH2insRoxan | pPR1 | 22 | 41721566 | 60 | 23S50M27S | NM:i:0 AS:i:50 XS:i:0 |
2_NOTCH2insRoxan | pPr2 | 22 | 41721766 | 60 | 50M | NM:i:0 AS:i:50 XS:i:0 |
That's it,
Pierre
The makefile
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
.PHONY=all fastqs | |
REF=/path/to/human_g1k_v37.fasta | |
SAMTOOLS=/path/to/samtools-0.1.19 | |
BWA=/path/to/bwa-0.7.3a/bwa | |
ANOTHERFASTQ=/path/to/NA12878 | |
HTML=result | |
define complement | |
tr "atgcATGC" "TACGTACG" | |
endef | |
define revcomp | |
$(call complement) | rev | |
endef | |
define makefastq | |
echo -e "@$(1)\n$(2)\n+"; echo $(2)| tr "ATGCN" "A" | |
endef | |
define fastqfile | |
($(foreach N,1 2,$(call makefastq,$(N)_$(1),$(2));)) > $(3) | |
endef | |
define fastqfiles | |
$(call fastqfile,$(1)/1,$(2),$(4)_1.fq) | |
$(call fastqfile,$(1)/2,$(3),$(4)_2.fq) | |
endef | |
define callbwa | |
$(call fastqfiles,$(1),$(2),$(3),$(4)) | |
gunzip -c ${ANOTHERFASTQ}_1.fastq.gz | head -n 400 >> $(4)_1.fq | |
gunzip -c ${ANOTHERFASTQ}_2.fastq.gz | head -n 400 >> $(4)_2.fq | |
$(BWA) mem $(5) $(REF) $(4)_1.fq $(4)_2.fq 2> /dev/null | $(SAMTOOLS)/samtools view -SX - | grep -v -E "^@SQ" > $(4).sam | |
echo "<div><h3>${6} BWA-MEM options: ${5}</h3><table border='1' style='border:1px solid #000000;'>" >> $(HTML).html | |
echo "<caption>${6}<br/>BWA-MEM options: ${5}</caption>" >> $(HTML).html | |
echo "<tr><th>Read</th><th>Flag</th><th>Chrom</th><th>Pos</th><th>qual</th><th>Cigar</th><th>Extra</th></tr>\n" >> $(HTML).html | |
@cat $(4).sam | grep -E '^[12]_' | awk -F ' ' '{printf("<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>",$$1,$$2,$$3,$$4,$$5,$$6);for(i=12;i<=NF;i++) printf("%s ",$$i); printf("</td></tr>\n");}' >> $(HTML).html | |
echo "</table></div>" >> $(HTML).html | |
endef | |
define begintable | |
endef | |
define endtable | |
endef | |
NOTCH2a=GTCTTGCTCTGTCACCCAGGCTAGAGTGCAATGATGTGGTGTCAGCTCAC | |
NOTCH2b=CTGACCTCGTGATTCGCCCAACACAGCCTCCTGAAGTGCTGGGATTACCA | |
NOTCH2del=GTCTTGCTCTGTCACCCAGGCTAGAGTGCAATGATGTTCCGCCTCCTGGGTTCAAGTCACTCTCCTGC | |
ROXAN1a=AGGTCGACACTACCCCTAAAGCAAGAAGAATATGAGGTGAGTGTCAGCTG | |
ROXAN1b=CAATGATCTGTTCCGGGAGAAGGACTATAAGCAGGCTCTGGTGCAGTACA | |
NOTCH2ins=GTCTTGCTCTGTCACCCAGGCTA${ROXAN1a}GAGTGCAATGATGTGGTGTCAGCTCAC | |
all: | |
rm -f $(HTML).html | |
$(call callbwa,Notch2,$(NOTCH2a),$(NOTCH2b),tmp,,NOTCH2a vs NOTCH2b) | |
$(call callbwa,Notch2,$(NOTCH2a),`echo $(NOTCH2b)| $(call revcomp)`,tmp,,NOTCH2a vs revcomp(NOTCH2b)) | |
$(call callbwa,Notch2del,$(NOTCH2del),`echo $(NOTCH2b)| $(call revcomp)`,tmp,,NOTCH2del vs revcomp(NOTCH2b)) | |
$(call callbwa,Notch2del,$(NOTCH2del),`echo $(NOTCH2b)| $(call revcomp)`,tmp,-M,NOTCH2del vs revcomp(NOTCH2b)) | |
$(call callbwa,NOTCH2insRoxan,$(NOTCH2ins),`echo $(NOTCH2b)| $(call revcomp)`,tmp,,NOTCH2ins vs revcomp(NOTCH2b)) | |
$(call callbwa,Notch2Roxan,$(NOTCH2a)$(ROXAN1a),`echo $(NOTCH2b)| $(call revcomp)`,tmp,,NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)) | |
$(call callbwa,Notch2Roxan,$(NOTCH2a)$(ROXAN1a),`echo $(NOTCH2b)$(ROXAN1b)| $(call revcomp)`,tmp,,NOTCH2a+ROXAN1a vs revcomp(NOTCH2b+ROXAN1b)) | |
$(call callbwa,Notch2Roxan,$(NOTCH2a)$(ROXAN1a),`echo $(NOTCH2b)| $(call revcomp)`,tmp,-C,NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)) | |
$(call callbwa,Notch2Roxan,$(NOTCH2a)$(ROXAN1a),`echo $(NOTCH2b)| $(call revcomp)`,tmp,-M,NOTCH2a+ROXAN1a vs revcomp(NOTCH2b)) | |
$(call callbwa,NOTCH2insRoxan,$(NOTCH2ins),`echo $(ROXAN1b)| $(call revcomp)`,tmp,,NOTCH2ins vs revcomp(ROXAN1b)) | |
cat $(HTML).html | |
No comments:
Post a Comment