14 July 2011

A text alignment viewer using the samtools API

I wrote a modified version of samtools tview. The original code uses the curses API to display an interactive sequence viewer.

I wanted to generate this kind of screen for a large number of positions and to redirect the output to a Unix pipeline, so I changed the original code to write into an extensible two-dimensional array of characters instead of a curses window. The code I wrote is available on GitHub at:

.
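
To illustrate the idea of an extensible two-dimensional array of characters, here is a minimal sketch in C. It is not the code from bam_ttview.c: the names Grid, grid_put, grid_print and grid_free are hypothetical, and the growth strategy (grow to exactly the size needed, pad with spaces) is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a curses screen: a grid that grows on demand. */
typedef struct {
    char **rows;   /* rows[y] holds ncols characters, not NUL-terminated */
    size_t nrows;
    size_t ncols;
} Grid;

void grid_init(Grid *g) { g->rows = NULL; g->nrows = 0; g->ncols = 0; }

/* Make sure cell (y, x) exists; every new cell is filled with a space. */
int grid_reserve(Grid *g, size_t y, size_t x)
{
    assert(g != NULL);
    if (x >= g->ncols) {                      /* widen every existing row */
        size_t newcols = x + 1;
        for (size_t i = 0; i < g->nrows; i++) {
            char *p = realloc(g->rows[i], newcols);
            if (p == NULL) return -1;
            memset(p + g->ncols, ' ', newcols - g->ncols);
            g->rows[i] = p;
        }
        g->ncols = newcols;
    }
    if (y >= g->nrows) {                      /* append blank rows */
        size_t newrows = y + 1;
        char **r = realloc(g->rows, newrows * sizeof *r);
        if (r == NULL) return -1;
        g->rows = r;
        for (size_t i = g->nrows; i < newrows; i++) {
            g->rows[i] = malloc(g->ncols ? g->ncols : 1);
            if (g->rows[i] == NULL) { g->nrows = i; return -1; }
            memset(g->rows[i], ' ', g->ncols);
        }
        g->nrows = newrows;
    }
    return 0;
}

/* Plays the role of curses' mvaddch(): put one character at (y, x). */
int grid_put(Grid *g, size_t y, size_t x, char c)
{
    if (grid_reserve(g, y, x) != 0) return -1;
    g->rows[y][x] = c;
    return 0;
}

/* Write the grid to any stream, trimming trailing spaces on each row. */
void grid_print(const Grid *g, FILE *out)
{
    for (size_t y = 0; y < g->nrows; y++) {
        size_t len = g->ncols;
        while (len > 0 && g->rows[y][len - 1] == ' ') len--;
        fwrite(g->rows[y], 1, len, out);
        fputc('\n', out);
    }
}

void grid_free(Grid *g)
{
    for (size_t i = 0; i < g->nrows; i++) free(g->rows[i]);
    free(g->rows);
    grid_init(g);
}
```

A caller would run grid_init() once, call grid_put() wherever the curses version would draw a character, then call grid_print(&g, stdout) so the result can be piped to other tools.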

Compilation

cd samtools/
make #compile samtools
gcc -o bamttview -g -Wall -O2 -DSTANDALONE_VERSION -I. -Lbcftools bam_ttview.c bam2bcf.o errmod.o bam_color.o libbam.a -lbcf -lm -lz

Example


Print a position:
$ ./bamttview  -g "ref:5" examples/toy.bam examples/toy.fa |\
cat -n
1 11 21 31 41 51 61
2 TGTTAGATAA****GATA**GCTGTGCTAGTAGGCAG*TCAGCGCCATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
3 ........ .... ......K.K......K. ..........
4 ........AGAG....***... ,,,,, ,,,,,,,,,
5 ......GG**....AA
6 ..C...**** ...**...>>>>>>>>>>>>>>T.....

Print a list of positions:
$ cat positions.txt 
ref2:10-100
ref:10-15
ref2:11
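
Each line of this file is either name:start or name:start-end. As a rough sketch of how such a line could be split into its parts, here is a small standalone parser in C. The Region type and parse_region function are hypothetical and are not taken from bamttview or samtools. Setting end equal to start when no range is given is an assumption; the tool itself may interpret a single position differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed line of positions.txt. */
typedef struct {
    char name[256]; /* reference name, e.g. "ref2" */
    long start;     /* 1-based, as written in the file */
    long end;       /* 1-based; set to start when no "-end" part is present */
} Region;

/* Parse "name:start" or "name:start-end".
 * Returns 0 on success and -1 on a malformed line. */
int parse_region(const char *s, Region *r)
{
    assert(s != NULL && r != NULL);

    /* Split on the last ':' so that names containing ':' still work. */
    const char *colon = strrchr(s, ':');
    if (colon == NULL || colon == s) return -1;

    size_t n = (size_t)(colon - s);
    if (n >= sizeof r->name) return -1;
    memcpy(r->name, s, n);
    r->name[n] = '\0';

    char *p;
    r->start = strtol(colon + 1, &p, 10);
    if (p == colon + 1 || r->start < 1) return -1;   /* no digits, or < 1 */

    if (*p == '-') {
        const char *q = p + 1;
        r->end = strtol(q, &p, 10);
        if (p == q) return -1;                       /* "-" without a number */
    } else {
        r->end = r->start;
    }

    if (*p == '\n') p++;          /* tolerate the newline left by fgets() */
    if (*p != '\0' || r->end < r->start) return -1;
    return 0;
}
```

Reading the file would then be a loop over fgets() that calls parse_region() on each line and skips or reports the lines for which it returns -1.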

$ ./bamttview -f positions.txt examples/toy.bam examples/toy.fa |\
cat -n | head -n 25
1
2
3 > ref2:10-100
4
5 11 21 31 41 51 61 71
6 aaaac****aattaagtctacagagcaactaNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
7 ....Y ..W...................
8 .....****..A...
9 .....****..A...T.
10 .....AAAT.............
11 C...T****....................
12 ..T****.....................
13 T****......................
14
15
16
17 > ref:10-15
18
19 11 21 31 41 51 61 71
20 GATAA****GATA**GCTGTGCTAGTAGGCAG*TCAGCGCCATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
21 ..... .... ......K.K......K. ..........
22 .....AGAG....***... ,,,,, ,,,,,,,,,
23 .....GG**....AA
24 .C...**** ...**...>>>>>>>>>>>>>>T.....
25

Print the same list of positions with the -d option:
$ ./bamttview -f positions.txt -d examples/toy.bam examples/toy.fa |\
cat -n | head -n 25
1
2
3 > ref2:10-100
4
5 11 21 31 41 51 61 71
6 aaaac****aattaagtctacagagcaactaNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
7 ....Y ..W...................
8 AAAAC****AAATAA
9 AAAAC****AAATAATT
10 AAAACAAATAATTAAGTCTACA
11 CAAAT****AATTAAGTCTACAGAGCAAC
12 AAT****AATTAAGTCTACAGAGCAACT
13 T****AATTAAGTCTACAGAGCAACTA
14
15
16
17 > ref:10-15
18
19 11 21 31 41 51 61 71
20 GATAA****GATA**GCTGTGCTAGTAGGCAG*TCAGCGCCATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
21 ..... .... ......K.K......K. ..........
22 GATAAAGAGGATA***CTG taggc cagcgccat
23 GATAAGG**GATAAA
24 GCTAA**** ATA**GCT>>>>>>>>>>>>>>TTCAGC
25
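
Judging from the two outputs above, -d appears to print the read bases themselves instead of the dots and commas that mark matches to the reference. The sketch below shows that per-base choice in C, following the usual samtools pileup convention ('.' or uppercase for the forward strand, ',' or lowercase for the reverse strand). The function display_base and its show_bases parameter are hypothetical names, not taken from bam_ttview.c.

```c
#include <assert.h>
#include <ctype.h>

/* Choose the character to draw for one aligned read base.
 * read_base      base in the read
 * ref_base       base in the reference at the same column
 * reverse_strand non-zero if the read maps to the reverse strand
 * show_bases     non-zero to always print the base (assumed meaning of -d) */
char display_base(char read_base, char ref_base,
                  int reverse_strand, int show_bases)
{
    int up_read = toupper((unsigned char)read_base);
    int up_ref  = toupper((unsigned char)ref_base);

    /* Match to the reference: '.' on the forward strand, ',' on the reverse. */
    if (!show_bases && up_read == up_ref)
        return reverse_strand ? ',' : '.';

    /* Mismatch, or bases requested: uppercase forward, lowercase reverse. */
    return (char)(reverse_strand ? tolower(up_read) : up_read);
}
```

With show_bases set to 0 this gives rows such as ".....AGAG" and ",,,,,", and with it set to 1 it gives rows such as "GATAAAGAG" and "taggc".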




That's it,

Pierre

1 comment:

Nick Loman said...

Great! Just what I was after ...