This is my first notebook for developing a new Walker for the Genome Analysis Toolkit. This post was mostly inspired by the following PDF: kvg_20_line_lifesavers_mad_v2.pptx.pdf.
Get the sources
git clone http://github.com/broadgsa/gatk.git GATK.dev
The javac compiler also requires the following library from Google: http://code.google.com/p/cofoja/.
A first "Short-Reads" walker
The following ReadWalker class scans the reads and prints them as FASTA. The @Output annotation tells the GATK that we're going to channel our output through the java.io.PrintStream object. This field is automatically filled by the application runtime.
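To make the shape of such a walker easier to picture, here is a minimal plain-Java sketch of the map/reduce pattern: one call per read that prints a FASTA record, plus a running count. It deliberately does not use the GATK classes. `HelloReadSketch`, `Read`, `toFasta` and `traverse` are hypothetical stand-ins chosen so that the example compiles on its own. In a real walker the engine drives the traversal and fills the output field.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

/**
 * Plain-Java sketch of the map/reduce shape of a read walker.
 * Nothing here is a GATK class: Read and traverse() are hypothetical stand-ins.
 */
public class HelloReadSketch {

    /** Hypothetical stand-in for an aligned read: a name and its bases. */
    public static final class Read {
        public final String name;
        public final String bases;

        public Read(String name, String bases) {
            this.name = name;
            this.bases = bases;
        }
    }

    /** In the GATK, the equivalent field is annotated with @Output and filled by the runtime. */
    private final PrintStream out;

    public HelloReadSketch(PrintStream out) {
        this.out = out;
    }

    /** Format one read as a two-line FASTA record (header line, then sequence). */
    public static String toFasta(Read read) {
        return ">" + read.name + "\n" + read.bases;
    }

    /** Called once per read: print it as FASTA and return 1 so it can be counted. */
    public int map(Read read) {
        out.println(toFasta(read));
        return 1;
    }

    /** Initial value of the running total. */
    public int reduceInit() {
        return 0;
    }

    /** Fold the value returned by map() into the running total. */
    public int reduce(int value, int sum) {
        return value + sum;
    }

    /** Stand-in for the engine's traversal: visit every read and accumulate the count. */
    public static int traverse(HelloReadSketch walker, List<Read> reads) {
        int sum = walker.reduceInit();
        for (Read read : reads) {
            sum = walker.reduce(walker.map(read), sum);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Read> reads = Arrays.asList(
                new Read("read1", "ACGT"),
                new Read("read2", "TTGA"));
        int count = traverse(new HelloReadSketch(System.out), reads);
        System.err.println(count + " reads printed");
    }
}
```

Keeping the FASTA formatting in a small static method makes it easy to check on its own, independently of where the output stream comes from.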
Compilation
javac -cp /path/to/GenomeAnalysisTK.jar:/path/to/cofoja-1.0-r139.jar:. \
-sourcepath src \
-d tmp src/mygatk/HelloRead.java
jar cvf HelloRead.jar -C tmp .
Running
Here I'm using a BAM from the 'examples' folder of samtools. (The GATK requires read groups, so we first need to pre-process this BAM with Picard AddOrReplaceReadGroups.)
We then use our library as follows:
java -cp path/to/GenomeAnalysisTK.jar:HelloRead.jar \
org.broadinstitute.sting.gatk.CommandLineGATK -T HelloRead \
-I test.bam \
-R ${SAMTOOLS}/examples/ex1.fa
Result:
The Makefile
That's it,
Pierre
This is a great tutorial. If your readers are hungry for more ways to leverage the power of the GATK, the GATK team at the Broad Institute is planning a workshop for users this Fall. If you’re interested in attending the workshop, you can vote on the topics and activities that you’d like the workshop to include by filling in this survey: http://www.surveymonkey.com/s/T799FQK