30 June 2010

XSLT+NCBI-Taxonomy=Graphviz Dot

The following post was inspired by this question on Biostar.com: http://biostar.stackexchange.com/questions/1549: "lets say I want to know which taxonomic level groups Tribolium castaneum and Drosophila melanogaster. Insects, right? (...) Now lets say I have 10 pairs of such species and I want to see how close & distant they are... How can I do this easily?"
I suggested two solutions, both using an XSLT stylesheet. I then wondered whether an XSLT stylesheet could also draw a tree of life with the help of Graphviz. The stylesheet I wrote is available at:


Usage


xsltproc --novalid taxonomy2dot.xsl \
"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?id=7070,32351,9605,9606&db=taxonomy&retmode=xml" |\
dot -o/home/pierre/file.svg -Tsvg

The main difficulty with this stylesheet was to print one and only one connection between two nodes, even when that connection appears more than once in the XML file. The trick was to use the XPath axis preceding-sibling:: to check whether the connection had already been printed.
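To make the de-duplication idea concrete, here is a minimal Python sketch of the same logic. It is not the stylesheet itself: a set of already printed (parent, child) pairs stands in for the preceding-sibling:: test. The sample XML is a hand-trimmed, hypothetical fragment, and the element names (TaxaSet, Taxon, LineageEx, TaxId, ScientificName) are my assumption about the shape of the efetch response. The function name taxonomy_to_dot is invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample; the real efetch output is assumed
# to carry many more fields and much longer lineages.
SAMPLE_XML = """<TaxaSet>
  <Taxon>
    <TaxId>9606</TaxId>
    <ScientificName>Homo sapiens</ScientificName>
    <LineageEx>
      <Taxon><TaxId>9604</TaxId><ScientificName>Hominidae</ScientificName></Taxon>
      <Taxon><TaxId>9605</TaxId><ScientificName>Homo</ScientificName></Taxon>
    </LineageEx>
  </Taxon>
  <Taxon>
    <TaxId>9605</TaxId>
    <ScientificName>Homo</ScientificName>
    <LineageEx>
      <Taxon><TaxId>9604</TaxId><ScientificName>Hominidae</ScientificName></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>"""


def taxonomy_to_dot(xml_text):
    """Return a Graphviz digraph in which each parent -> child edge appears once."""
    root = ET.fromstring(xml_text)
    seen = set()  # (parent TaxId, child TaxId) pairs already printed
    lines = ["digraph G {"]
    for taxon in root.findall("Taxon"):
        # Path from the most distant ancestor down to the fetched taxon.
        path = [
            (t.findtext("TaxId"), t.findtext("ScientificName"))
            for t in taxon.findall("LineageEx/Taxon")
        ]
        path.append((taxon.findtext("TaxId"), taxon.findtext("ScientificName")))
        # Each consecutive pair in the path is one edge of the tree.
        for (parent_id, parent_name), (child_id, child_name) in zip(path, path[1:]):
            edge = (parent_id, child_id)
            if edge in seen:
                continue  # this connection was already printed
            seen.add(edge)
            lines.append(f'  "{parent_name}" -> "{child_name}";')
    lines.append("}")
    return "\n".join(lines)


print(taxonomy_to_dot(SAMPLE_XML))
```

The Hominidae -> Homo connection occurs in both lineages of the sample but is written only once. The sketch does not escape double quotes inside names, which a real converter would need to handle.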

Result



That's it!

Pierre
