27 May 2008

NCBI Blast + XSLT => XHTML + SVG

This post was inspired by the article "Processing and duplicons on human chromosomes" posted by Paulo Nuin yesterday, and by the short discussion that followed on FriendFeed. In this article, Paulo described how the Processing tool was used to display the output of NCBI Blast.

Here I show how an XSLT stylesheet can be used to transform a Blast XML output into an XHTML+SVG page.

The stylesheet described here is available at http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/blast2svg.xsl



Here is how it works. We start with a Blast output in XML (here the query was a murine histone gene searched against the human genome):
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">
<BlastOutput>
<BlastOutput_program>blastn</BlastOutput_program>
<BlastOutput_version>blastn 2.2.5 [Nov-16-2002]</BlastOutput_version>
<BlastOutput_reference>~Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, ~Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), ~&quot;Gapped BLAST and PSI-BLAST: a new generation of protein database search~programs&quot;, Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
<BlastOutput_db>HumanGenome</BlastOutput_db>
<BlastOutput_query-ID>lcl|QUERY</BlastOutput_query-ID>
<BlastOutput_query-def>gi|34556456|gb|AY158922.2| Mus musculus histone protein Hist2h2ab gene, complete cds</BlastOutput_query-def>
<BlastOutput_query-len>1680</BlastOutput_query-len>
<BlastOutput_param>
<Parameters>
<Parameters_expect>10</Parameters_expect>
<Parameters_sc-match>1</Parameters_sc-match>
<Parameters_sc-mismatch>-3</Parameters_sc-mismatch>
<Parameters_gap-open>5</Parameters_gap-open>
<Parameters_gap-extend>2</Parameters_gap-extend>
<Parameters_filter>D</Parameters_filter>
</Parameters>
</BlastOutput_param>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|BL_ORD_ID|32</Hit_id>
<Hit_def>chr6</Hit_def>
<Hit_accession>32</Hit_accession>
<Hit_len>170975699</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>444.541</Hsp_bit-score>
<Hsp_score>224</Hsp_score>
<Hsp_evalue>7.76885e-122</Hsp_evalue>
<Hsp_query-from>209</Hsp_query-from>
<Hsp_query-to>580</Hsp_query-to>
<Hsp_hit-from>27883967</Hsp_hit-from>
<Hsp_hit-to>27884338</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>335</Hsp_identity>
<Hsp_positive>335</Hsp_positive>
<Hsp_align-len>372</Hsp_align-len>
<Hsp_qseq>ATGTCTGGCCGTGGCAAACAGGGAGGCAAGGCCCGCGCCAAGGCCAAGTCGCGGTCTTCCCGGGCCGGGCTACAGTTCCC
GGTGGGGCGTGTGCACCGGCTGCTGCGCAAGGGCAACTACGCGGAGCGCGTGGGTGCCGGCGCGCCGGTATACATGGCGGCGGTGCTGGAGTACCTAACGGCCGAGATCC
TGGAGCTGGCGGGCAACGCGGCCCGCGACAACAAGAAGACGCGCATCATCCCGCGCCACCTGCAGCTGGCCATCCGCAACGACGAGGAGCTCAACAAGCTGCTGGGCAAA
GTGACGATCGCACAGGGCGGCGTCCTGCCCAACATCCAGGCCGTGCTGCTGCCCAAGAAGACCGAGAGCCAC</Hsp_qseq>
<Hsp_hseq>ATGTCTGGGCGTGGCAAGCAGGGAGGCAAAGCTCGCGCCAAGGCCAAGACCCGCTCTTCTCGGGCCGGGCTTCAGTTTCC
CGTAGGCCGAGTGCATCGCCTGCTCCGCAAAGGCAACTATGCGGAGCGGGTCGGTGCTGGAGCGCCGGTGTACCTGGCGGCGGTGCTGGAGTACCTGACCGCCGAGATCC
TGGAGCTGGCTGGCAACGCGGCCCGCGACAACAAGAAGACTCGCATCATCCCGCGTCACCTCCAGCTGGCCATCCGCAACGATGAGGAGCTCAACAAGCTTCTGGGCAAA
GTCACCATCGCACAGGGTGGCGTCCTGCCCAACATCCAGGCCGTGCTACTGCCCAAGAAGACCGAGAGCCAC</Hsp_hseq>
<Hsp_midline>|||||||| |||||||| ||||||||||| || ||||||||||||||| | || ||||| ||||||||||| |||||
|| || || || ||||| || ||||| ||||| |||||||| |||||||| || ||||| || |||||||| ||| |||||||||||||||||||||| || |||||||
||||||||||||| ||||||||||||||||||||||||||||| |||||||||||||| ||||| |||||||||||||||||||| ||||||||||||||||| ||||||
||||| || ||||||||||| ||||||||||||||||||||||||||||| ||||||||||||||||||||||||</Hsp_midline>
</Hsp>
<Hsp>
<Hsp_num>2</Hsp_num>
<Hsp_bit-score>420.753</Hsp_bit-score>
<Hsp_score>212</Hsp_score>
<Hsp_evalue>1.1255e-114</Hsp_evalue>
<Hsp_query-from>580</Hsp_query-from>
<Hsp_query-to>209</Hsp_query-to>
<Hsp_hit-from>27890126</Hsp_hit-from>
<Hsp_hit-to>27890497</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>-1</Hsp_hit-frame>
<Hsp_identity>332</Hsp_identity>
<Hsp_positive>332</Hsp_positive>
<Hsp_align-len>372</Hsp_align-len>
<Hsp_qseq>GTGGCTCTCGGTCTTCTTGGGCAGCAGCACGGCCTGGATGTTGGGCAGGACGCCGCCCTGTGCGATCGTCACTTTGCCCA
GCAGCTTGTTGAGCTCCTCGTCGTTGCGGATGGCCAGCTGCAGGTGGCGCGGGATGATGCGCGTCTTCTTGTTGTCGCGGGCCGCGTTGCCCGCCAGCTCCAGGATCTCG
GCCGTTAGGTACTCCAGCACCGCCGCCATGTATACCGGCGCGCCGGCACCCACGCGCTCCGCGTAGTTGCCCTTGCGCAGCAGCCGGTGCACACGCCCCACCGGGAACTG
TAGCCCGGCCCGGGAAGACCGCGACTTGGCCTTGGCGCGGGCCTTGCCTCCCTGTTTGCCACGGCCAGACAT</Hsp_qseq>
<Hsp_hseq>GTGGCTCTCAGTTTTCTTTGGCAGCAGCACGGCCTGGATGTTGGGCAGGACGCCACCCTGTGCGATGGTGACTTTGCCCA
GAAGCTTGTTGAGCTCCTCATCGTTGCGGATGGCCAGCTGGAGGTGACGCGGGATGATGCGAGTCTTCTTGTTGTCGCGGGCCGCGTTGCCAGCCAGCTCCAGGATCTCG
GCGGTCAGGTACTCCAGCACCGCCGCCAGGTACACCGGCGCTCCAGCACCGACCCGCTCCGCATAGTTGCCTTTGCGGAGCAGGCGATGCACTCGGCCTACGGGAAACTG
AAGCCCGGCCCGAGAAGAGCGGGTCTTGGCCTTGGCGCGAGCTTTGCCTCCCTGCTTACCACGCCCAGACAT</Hsp_hseq>
<Hsp_midline>||||||||| || ||||| ||||||||||||||||||||||||||||||||||| ||||||||||| || |||||||
|||| ||||||||||||||||| |||||||||||||||||||| ||||| |||||||||||||| ||||||||||||||||||||||||||||| |||||||||||||||
||||| || |||||||||||||||||||||| ||| |||||||| || ||||| || |||||||| |||||||| ||||| ||||| || ||||| || || || || ||
||| ||||||||||| ||||| || | ||||||||||||||| || ||||||||||| || ||||| ||||||||</Hsp_midline>
</Hsp>

(...)

</Hit_hsps>
</Hit>
</Iteration_hits>
<Iteration_stat>
<Statistics>
<Statistics_db-num>1</Statistics_db-num>
<Statistics_db-len>245522847</Statistics_db-len>
<Statistics_hsp-len>0</Statistics_hsp-len>
<Statistics_eff-space>5.13463e+12</Statistics_eff-space>
<Statistics_kappa>0.710605</Statistics_kappa>
<Statistics_lambda>1.37407</Statistics_lambda>
<Statistics_entropy>1.30725</Statistics_entropy>
</Statistics>
</Iteration_stat>
</Iteration>
</BlastOutput_iterations>
</BlastOutput>
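
To make the structure of this file concrete, here is a minimal Python sketch (not part of the original workflow) that reads the query length and counts the Hits and Hsp of the first iteration. The sample document is a trimmed, hand-written stand-in that only reuses the element names shown above; the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-written stand-in for the Blast XML above (same element names).
SAMPLE = """<BlastOutput>
  <BlastOutput_query-len>1680</BlastOutput_query-len>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>chr6</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_query-from>209</Hsp_query-from><Hsp_query-to>580</Hsp_query-to></Hsp>
            <Hsp><Hsp_query-from>580</Hsp_query-from><Hsp_query-to>209</Hsp_query-to></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def summarize(xml_text):
    """Return the query length and the Hit/Hsp counts of the first iteration."""
    root = ET.fromstring(xml_text)
    # find() returns the first matching element, i.e. the first Iteration
    first_iteration = root.find("BlastOutput_iterations/Iteration")
    hits = first_iteration.findall("Iteration_hits/Hit")
    hsps = first_iteration.findall("Iteration_hits/Hit/Hit_hsps/Hsp")
    return {
        "query_length": int(root.findtext("BlastOutput_query-len")),
        "hit_count": len(hits),
        "hsp_count": len(hsps),
    }


summary = summarize(SAMPLE)
print(summary)
```

These three values are the only facts about the whole document that are needed to size a drawing of the hits.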


And the XSLT stylesheet:
1 <?xml version="1.0" encoding="UTF-8"?>
2 <xsl:stylesheet
3 version="1.0"
4 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
5 xmlns:svg="http://www.w3.org/2000/svg"
6 xmlns:xlink="http://www.w3.org/1999/xlink"
7 xmlns:h="http://www.w3.org/1999/xhtml"
8 >


17 <!-- ========================================================================= -->
18 <xsl:output method='xml' indent='yes' omit-xml-declaration="no"/>
19 <!-- we preserve the spaces in that element -->
20 <xsl:preserve-space elements="svg:style h:style" />
21
22 <!-- ========================================================================= -->
23 <!-- the width of the SVG -->
24 <xsl:variable name="svg-width">800</xsl:variable>
25 <!-- height of a HSP -->
26 <xsl:variable name="hsp-height">10</xsl:variable>
27 <!-- total number of Hits in first blast iteration -->
28 <xsl:variable name="hit-count"><xsl:value-of select="count(BlastOutput/BlastOutput_iterations/Iteration[1]/Iteration_hits/Hit)"/></xsl:variable>
29 <!-- total number of HSP in first blast iteration -->
30 <xsl:variable name="hsp-count"><xsl:value-of select="count(BlastOutput/BlastOutput_iterations/Iteration[1]/Iteration_hits/Hit/Hit_hsps/Hsp)"/></xsl:variable>
31 <!-- query length (bases or amino acids ) -->
32 <xsl:variable name="query-length"><xsl:value-of select="BlastOutput/BlastOutput_query-len"/></xsl:variable>
33 <!-- margin between two hits -->
34 <xsl:variable name="space-between-hits"><xsl:value-of select="3* $hsp-height"/></xsl:variable>
35 <!-- height of all hits -->
36 <xsl:variable name="hits-height"><xsl:value-of select="$hsp-count * $hsp-height + ($hit-count + 1) * $space-between-hits"/></xsl:variable>
37 <!-- size of the top header -->
38 <xsl:variable name="header-height">50</xsl:variable>
39
40 <!-- ========================================================================= -->
41
42 <!-- matching the root node -->
43 <xsl:template match="/">
44 <!-- start XHTML -->
45 <h:html>
46 <h:head>
47 <h:style type="text/css">
48 body {
49 font-size:10px;
50 font-family:Helvetica;
51 background-color:rgb(150,150,150);
52 color:white;
53 }
54 </h:style>
55 <h:title><xsl:value-of select="BlastOutput/BlastOutput_query-def"/></h:title>
56 </h:head>
57 <h:body>
58 <h:h1>Blast Results</h:h1>
59 <h:div>
60 <h:h3>Parameters</h:h3>
61 <h:table>
62 <h:tr><h:th>Database</h:th><h:td><xsl:value-of select="BlastOutput/BlastOutput_db"/></h:td></h:tr>
63 <h:tr><h:th>Query ID</h:th><h:td><xsl:value-of select="BlastOutput/BlastOutput_query-ID"/></h:td></h:tr>
64 <h:tr><h:th>Query Def.</h:th><h:td><h:b><xsl:value-of select="BlastOutput/BlastOutput_query-def"/></h:b></h:td></h:tr>
65 <h:tr><h:th>Query Length</h:th><h:td><h:b><xsl:value-of select="BlastOutput/BlastOutput_query-len"/></h:b></h:td></h:tr>
66 <h:tr><h:th>Version</h:th><h:td><xsl:value-of select="BlastOutput/BlastOutput_version"/></h:td></h:tr>
67 <h:tr><h:th>Reference</h:th><h:td><h:a href="http://www.ncbi.nlm.nih.gov/pubmed/9254694"><xsl:value-of select="BlastOutput/BlastOutput_reference"/></h:a></h:td></h:tr>
68 </h:table>
69 </h:div>
70 <h:hr/>
71 <h:div style="text-align:center">
72
73 <!-- starts SVG figure -->
74 <xsl:element name="svg:svg">
75 <xsl:attribute name="version">1.0</xsl:attribute>
76 <xsl:attribute name="width"><xsl:value-of select="$svg-width"/></xsl:attribute>
77 <xsl:attribute name="height"><xsl:value-of select="$hits-height + $header-height "/></xsl:attribute>
78 <svg:title><xsl:value-of select="BlastOutput/BlastOutput_query-def"/></svg:title>
79 <svg:defs>
80 <svg:style type="text/css">
81 text.t1 {
82 fill:black;
83 font-size:<xsl:value-of select="$hsp-height - 2"/>px;
84 font-family:Helvetica;
85 }
86 text.t2 {
87 fill:blue;
88 font-size:<xsl:value-of select="$space-between-hits - 2"/>px;
89 font-family:Helvetica;
90 text-anchor:middle;
91 }
92 text.title {
93 fill:white;
94 stroke:black;
95 font-size:12px;
96 font-family:Helvetica;
97 text-anchor:middle;
98 alignment-baseline:middle;
99 }
100 line.grid {
101 stroke:lightgray;
102 stroke-width:1.5px;
103 }
104
105 rect.hit {
106 fill:none;
107 stroke:darkgray;
108 stroke-width:1px;
109 }
110 </svg:style>
111
112 <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="score1">
113 <svg:stop offset="5%" stop-color="red" />
114 <svg:stop offset="50%" stop-color="whitesmoke" />
115 <svg:stop offset="95%" stop-color="red" />
116 </svg:linearGradient>
117 <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="score2">
118 <svg:stop offset="5%" stop-color="orange" />
119 <svg:stop offset="50%" stop-color="whitesmoke" />
120 <svg:stop offset="95%" stop-color="orange" />
121 </svg:linearGradient>
122 <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="score3">
123 <svg:stop offset="5%" stop-color="green" />
124 <svg:stop offset="50%" stop-color="whitesmoke" />
125 <svg:stop offset="95%" stop-color="green" />
126 </svg:linearGradient>
127 <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="score4">
128 <svg:stop offset="5%" stop-color="blue" />
129 <svg:stop offset="50%" stop-color="whitesmoke" />
130 <svg:stop offset="95%" stop-color="blue" />
131 </svg:linearGradient>
132 <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="score5">
133 <svg:stop offset="5%" stop-color="black" />
134 <svg:stop offset="50%" stop-color="whitesmoke" />
135 <svg:stop offset="95%" stop-color="black" />
136 </svg:linearGradient>
137 </svg:defs>
138
139 <xsl:element name="svg:rect">
140 <xsl:attribute name="x">0</xsl:attribute>
141 <xsl:attribute name="y">0</xsl:attribute>
142 <xsl:attribute name="width"><xsl:value-of select="$svg-width - 1"/></xsl:attribute>
143 <xsl:attribute name="height"><xsl:value-of select="$hits-height + $header-height "/></xsl:attribute>
144 <xsl:attribute name="fill">whitesmoke</xsl:attribute>
145 <xsl:attribute name="stroke">blue</xsl:attribute>
146 </xsl:element>
147
148
149 <xsl:apply-templates select="BlastOutput"/>
150 </xsl:element>
151 <!-- end SVG figure -->
152
153 </h:div>
154 <h:hr/>
155 <xsl:apply-templates select="BlastOutput/BlastOutput_param/Parameters"/>
156 <h:hr/>
157 <h:p><h:b>SVG</h:b> figure generated with <h:a href="http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/blast2svg.xsl">blast2svg</h:a>. <h:a href="http://plindenbaum.blogspot.com">Pierre Lindenbaum PhD</h:a> <h:i>( plindenbaum at yahoo dot fr )</h:i></h:p>
158
159 </h:body>
160 </h:html>
161 </xsl:template>
162 <!-- ========================================================================= -->
163 <!-- display parameters in a HTML table -->
164 <xsl:template match="Parameters">
165 <h:div>
166 <h:h3>Parameters</h:h3>
167 <h:table>
168 <h:tr><h:th>Expect</h:th><h:td><xsl:value-of select="Parameters_expect"/></h:td></h:tr>
169 <h:tr><h:th>Sc-match</h:th><h:td><xsl:value-of select="Parameters_sc-match"/></h:td></h:tr>
170 <h:tr><h:th>Sc-mismatch</h:th><h:td><xsl:value-of select="Parameters_sc-mismatch"/></h:td></h:tr>
171 <h:tr><h:th>Gap-open</h:th><h:td><xsl:value-of select="Parameters_gap-open"/></h:td></h:tr>
172 <h:tr><h:th>Gap-extend</h:th><h:td><xsl:value-of select="Parameters_gap-extend"/></h:td></h:tr>
173 <h:tr><h:th>Filter</h:th><h:td><xsl:value-of select="Parameters_filter"/></h:td></h:tr>
174 </h:table>
175 </h:div>
176 </xsl:template>
177
178
179 <!-- ========================================================================= -->
180 <xsl:template match="BlastOutput">
181 <!-- paint header -->
182 <svg:g>
183 <xsl:element name="svg:rect">
184 <xsl:attribute name="x">0</xsl:attribute>
185 <xsl:attribute name="y">0</xsl:attribute>
186 <xsl:attribute name="width"><xsl:value-of select="$svg-width - 1"/></xsl:attribute>
187 <xsl:attribute name="height"><xsl:value-of select="$header-height - 2"/></xsl:attribute>
188 <xsl:attribute name="fill">url(#score5)</xsl:attribute>
189 <xsl:attribute name="stroke">black</xsl:attribute>
190 </xsl:element>
191
192 <xsl:element name="svg:text">
193 <xsl:attribute name="x"><xsl:value-of select="$svg-width div 2"/></xsl:attribute>
194 <xsl:attribute name="y"><xsl:value-of select="$header-height div 2"/></xsl:attribute>
195 <xsl:attribute name="class">title</xsl:attribute>
196 <xsl:value-of select="BlastOutput_query-def"/> (len=<xsl:value-of select="BlastOutput_query-len"/> )
197 </xsl:element>
198 </svg:g>
199 <xsl:apply-templates select="BlastOutput_iterations/Iteration[1]/Iteration_hits"/>
200 </xsl:template>
201
202 <!-- ========================================================================= -->
203
204 <xsl:template match="Iteration_hits">
205 <xsl:apply-templates select="Hit"/>
206 </xsl:template>
207
208 <!-- ========================================================================= -->
209
210 <xsl:template match="Hit">
211 <!-- count number of preceding hits -->
212 <xsl:variable name="preceding-hits"><xsl:value-of select="count(preceding-sibling::Hit)"/></xsl:variable>
213 <!-- count number of preceding hsp -->
214 <xsl:variable name="preceding-hsp"><xsl:value-of select="count(preceding-sibling::Hit/Hit_hsps/Hsp)"/></xsl:variable>
215 <!-- calculate height of this part -->
216 <xsl:variable name="height"><xsl:value-of select="count(Hit_hsps/Hsp)*$hsp-height"/></xsl:variable>
217 <!-- translate this part vertically -->
218 <xsl:element name="svg:g">
219 <xsl:attribute name="transform">translate(0,<xsl:value-of select="$header-height + $preceding-hsp * $hsp-height + ($preceding-hits + 1) * $space-between-hits "/>)</xsl:attribute>
220 <xsl:attribute name="id">hit-<xsl:value-of select="generate-id(.)"/></xsl:attribute>
221 <xsl:element name="svg:text">
222 <xsl:attribute name="x"><xsl:value-of select="$svg-width div 2"/></xsl:attribute>
223 <xsl:attribute name="y"><xsl:value-of select="-2"/></xsl:attribute>
224 <xsl:attribute name="class">t2</xsl:attribute>
225 <xsl:value-of select="Hit_def"/>
226 </xsl:element>
227 <xsl:element name="svg:rect">
228 <xsl:attribute name="x">0</xsl:attribute>
229 <xsl:attribute name="y">0</xsl:attribute>
230 <xsl:attribute name="width"><xsl:value-of select="$svg-width"/></xsl:attribute>
231 <xsl:attribute name="height"><xsl:value-of select="$height"/></xsl:attribute>
232 <xsl:attribute name="class">hit</xsl:attribute>
233 </xsl:element>
234
235 <xsl:call-template name="grid">
236 <xsl:with-param name="x" select="0"/>
237 <xsl:with-param name="d" select="20"/>
238 <xsl:with-param name="W" select="$svg-width"/>
239 <xsl:with-param name="H" select="$height"/>
240 </xsl:call-template>
241
242 <xsl:apply-templates select="Hit_hsps"/>
243
244
245
246 </xsl:element>
247 </xsl:template>
248
249 <!-- ========================================================================= -->
250 <!-- draw vertical lines , recursive template -->
251 <xsl:template name="grid">
252 <xsl:param name="x" select="0" />
253 <xsl:param name="d" select="20" />
254 <xsl:param name="W" select="0" />
255 <xsl:param name="H" select="0" />
256 <svg:line class="grid" x1="{$x}" x2="{$x}" y1="0" y2="{$H}"/>
257 <xsl:if test="$d + $x &lt; $W">
258 <xsl:call-template name="grid">
259 <xsl:with-param name="x" select="$d + $x"/>
260 <xsl:with-param name="d" select="$d"/>
261 <xsl:with-param name="W" select="$W"/>
262 <xsl:with-param name="H" select="$H"/>
263 </xsl:call-template>
264 </xsl:if>
265 </xsl:template>
266
267 <!-- ========================================================================= -->
268
269 <xsl:template match="Hit_hsps">
270 <xsl:apply-templates select="Hsp"/>
271 </xsl:template>
272
273
274 <!-- ========================================================================= -->
275 <xsl:template match="Hsp">
276 <!-- number of previous hsp in the same Hit -->
277 <xsl:variable name="preceding-hsp"><xsl:value-of select="count(preceding-sibling::Hsp)"/></xsl:variable>
278 <!-- get the 5' position of the hsp in the query -->
279 <xsl:variable name="hsp-left"><xsl:choose>
280 <xsl:when test="Hsp_query-from &lt; Hsp_query-to"><xsl:value-of select="Hsp_query-from"/></xsl:when>
281 <xsl:otherwise><xsl:value-of select="Hsp_query-to"/></xsl:otherwise>
282 </xsl:choose></xsl:variable>
283 <!-- get the 3' position of the hsp in the query -->
284 <xsl:variable name="hsp-right"><xsl:choose>
285 <xsl:when test="Hsp_query-from &lt; Hsp_query-to"><xsl:value-of select="Hsp_query-to"/></xsl:when>
286 <xsl:otherwise><xsl:value-of select="Hsp_query-from"/></xsl:otherwise>
287 </xsl:choose></xsl:variable>
288 <!-- 5' position on screen -->
289 <xsl:variable name="x1"><xsl:value-of select="($hsp-left div $query-length ) * $svg-width"/></xsl:variable>
290 <!-- 3' position on screen -->
291 <xsl:variable name="x2"><xsl:value-of select="($hsp-right div $query-length ) * $svg-width"/></xsl:variable>
292 <!-- label -->
293 <xsl:variable name="label"><xsl:value-of select="Hsp_hit-from"/> - <xsl:value-of select="Hsp_hit-to"/> (<xsl:choose>
294 <xsl:when test="Hsp_query-from &lt; Hsp_query-to">+</xsl:when>
295 <xsl:otherwise>-</xsl:otherwise></xsl:choose>) e=<xsl:value-of select="Hsp_evalue"/></xsl:variable>
296
297 <!-- translate this Hsp vertically in its Hit -->
298 <xsl:element name="svg:g">
299 <xsl:attribute name="transform">translate(0,<xsl:value-of select="$preceding-hsp * $hsp-height"/>)</xsl:attribute>
300 <xsl:attribute name="id">hsp-<xsl:value-of select="generate-id(.)"/></xsl:attribute>
301 <xsl:attribute name="title"><xsl:value-of select="Hsp_evalue"/></xsl:attribute>
302
303 <!-- paint the Hsp Rectangle -->
304 <xsl:element name="svg:rect">
305 <xsl:attribute name="x"><xsl:value-of select="$x1"/></xsl:attribute>
306 <xsl:attribute name="y">2</xsl:attribute>
307 <xsl:attribute name="width"><xsl:value-of select="$x2 - $x1"/></xsl:attribute>
308 <xsl:attribute name="height"><xsl:value-of select="$hsp-height - 4"/></xsl:attribute>
309 <!-- choose a color according to the e-value -->
310 <xsl:attribute name="fill"><xsl:choose>
311 <xsl:when test="Hsp_evalue &lt; 1E-100">url(#score1)</xsl:when>
312 <xsl:when test="Hsp_evalue &lt; 1E-10">url(#score2)</xsl:when>
313 <xsl:when test="Hsp_evalue &lt; 0.1">url(#score3)</xsl:when>
314 <xsl:when test="Hsp_evalue &lt; 1">url(#score4)</xsl:when>
315 <xsl:otherwise>url(#score5)</xsl:otherwise>
316 </xsl:choose></xsl:attribute>
317 </xsl:element>
318
319 <!-- paint the label according to the position of the Hsp on screen -->
320 <xsl:choose>
321 <xsl:when test="$x2 &lt; (0.75 * $svg-width)">
322 <xsl:element name="svg:text">
323 <xsl:attribute name="class">t1</xsl:attribute>
324 <xsl:attribute name="x"><xsl:value-of select="$x2 + 10 "/></xsl:attribute>
325 <xsl:attribute name="y"><xsl:value-of select="$hsp-height -1"/></xsl:attribute>
326 <xsl:attribute name="text-anchor">start</xsl:attribute>
327 <xsl:value-of select="$label"/>
328 </xsl:element>
329 </xsl:when>
330 <xsl:when test="$x1 &gt; (0.25 * $svg-width)">
331 <xsl:element name="svg:text">
332 <xsl:attribute name="class">t1</xsl:attribute>
333 <xsl:attribute name="x"><xsl:value-of select="$x1 - 10 "/></xsl:attribute>
334 <xsl:attribute name="y"><xsl:value-of select="$hsp-height -1"/></xsl:attribute>
335 <xsl:attribute name="text-anchor">end</xsl:attribute>
336 <xsl:value-of select="$label"/>
337 </xsl:element>
338 </xsl:when>
339 <xsl:otherwise>
340 <xsl:element name="svg:text">
341 <xsl:attribute name="class">t1</xsl:attribute>
342 <xsl:attribute name="x"><xsl:value-of select="($x1 + $x2) div 2 "/></xsl:attribute>
343 <xsl:attribute name="y"><xsl:value-of select="$hsp-height -1"/></xsl:attribute>
344 <xsl:attribute name="text-anchor">middle</xsl:attribute>
345 <xsl:value-of select="$label"/>
346 </xsl:element>
347 </xsl:otherwise>
348 </xsl:choose>
349
350 </xsl:element>
351 </xsl:template>
352 <!-- ========================================================================= -->
353
354
355 </xsl:stylesheet>
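
The e-value test used in the `Hsp` template to pick a gradient can be sketched outside XSLT as a small lookup. This is only an illustration in Python: the function name `evalue_gradient` is hypothetical, the first three thresholds are those of the stylesheet, and the last one (1.0) is an assumption of this sketch, since an e-value is never negative and a threshold of 0 could never match.

```python
# (upper bound, gradient id) pairs, tested in order like the xsl:when branches.
# The 1.0 bound for "score4" is an assumption made for this sketch.
THRESHOLDS = [
    (1e-100, "score1"),
    (1e-10, "score2"),
    (0.1, "score3"),
    (1.0, "score4"),
]


def evalue_gradient(evalue):
    """Return the id of the gradient used to fill an Hsp with this e-value."""
    for upper_bound, gradient_id in THRESHOLDS:
        if evalue < upper_bound:
            return gradient_id
    return "score5"  # fallback, like xsl:otherwise


# The two Hsp e-values of the sample Blast output, parsed from their text form
for text in ("7.76885e-122", "1.1255e-114"):
    print(text, evalue_gradient(float(text)))
```

Note that the XPath 1.0 number grammar does not include exponent notation, so how a value such as 7.76885e-122 is compared may depend on the XSLT processor.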

Some random notes:
Lines 22-38: I define a few variables, such as the number of Hsp and the number of Hits.
Line 43: we match the root node and start the XHTML document here.
Line 73: we start the SVG figure here. It is embedded in the XHTML document.
Lines 80-110: CSS can be used to style SVG.
Lines 112-137: we define a few gradients to colorize the Hsp. TODO: find a better method to colorize an Hsp according to its e-value/score.
Line 199: the <Hit> templates are applied for the first iteration only.
Line 211: we count the preceding Hits and Hsp to know how far this group of objects should be translated vertically.
Line 242: we loop over each Hsp in this Hit.
Lines 276-292: again, we count the preceding Hsp (in the same Hit) to translate this Hsp vertically. We also calculate the 5' and 3' positions of the Hsp in the query.
Lines 304-307: here, we paint the Hsp rectangle.
Lines 320-348: a lousy method: we paint a label for this Hsp, trying to find the best place (left/right/middle) to print it.
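
The layout arithmetic described in these notes can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the stylesheet: the function names are hypothetical, and the default sizes (800, 10, 50) are the values of the `svg-width`, `hsp-height` and `header-height` variables.

```python
def hsp_x_range(query_from, query_to, query_length, svg_width=800):
    """Map an Hsp's query coordinates to horizontal pixel positions (x1, x2)."""
    left = min(query_from, query_to)    # 5' position, whatever the strand
    right = max(query_from, query_to)   # 3' position
    return (left / query_length * svg_width,
            right / query_length * svg_width)


def hit_y_offset(preceding_hits, preceding_hsps, hsp_height=10, header_height=50):
    """Vertical translation of a Hit group, as in the transform attribute."""
    space_between_hits = 3 * hsp_height
    return (header_height
            + preceding_hsps * hsp_height
            + (preceding_hits + 1) * space_between_hits)


def label_anchor(x1, x2, svg_width=800):
    """Choose where the label goes: right of the Hsp, left of it, or centred."""
    if x2 < 0.75 * svg_width:
        return "start"   # room on the right: text starts after the rectangle
    if x1 > 0.25 * svg_width:
        return "end"     # room on the left: text ends before the rectangle
    return "middle"      # the Hsp covers the centre: print the label on it


# First Hsp of the sample: query 209..580 on a query of length 1680
x1, x2 = hsp_x_range(209, 580, 1680)
print(round(x1, 2), round(x2, 2), label_anchor(x1, x2), hit_y_offset(0, 0))
```

The second Hsp of the sample has its query coordinates reversed (580 to 209); taking the minimum and maximum gives it the same horizontal extent, and only the strand sign in the label differs.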

The blast file was processed with xsltproc:
xsltproc --novalid blast2svg.xsl blast.xml > ~/blast.xhtml


A sample output is displayed below:
[Figure: blast2svg sample output]



Pierre

25 May 2008

Leonard Colebrook: Creating a Biography in Wikipedia

Today I created a new article in Wikipedia about Leonard Colebrook, who was an "English medical researcher who introduced the use of Prontosil, the first sulfonamide drug, as a cure for puerperal, or childbed, fever, a condition resulting from infection after childbirth or abortion" (Encyclopaedia Britannica). (Let's be clear, I didn't know who that guy was until today.) Here is how I wrote this article.
First of all, I logged into Wikipedia with my login/password. In the article about Prontosil, I clicked on the link "edit this page" and added a reference to Colebrook:
... [[Leonard Colebrook]] introduced it as a cure for puerperal fever ...

and saved the page.

Clicking on this new link makes Wikipedia open an editor for a new article. The wiki-code below is what I wrote (please note that a few weeks ago I created a tool called XUL4Wikipedia, which I use as a source of shortcuts to edit such articles).



1 {{Infobox Scientist
2 |name = {{PAGENAME}}
3 |box_width =
4 |image =Replace_this_image_male.svg
5 |image_size =150px
6 |caption = {{PAGENAME}}
7 |birth_date = {{Birth date|1883|3|2}}
8 |birth_place = [[Guildford, Surrey]]
9 |death_date = {{Death date and age|1967|9|27|1883|3|2}}
10 |death_place = [[Farnham Common]], [[Buckinghamshire]]
11 |residence =
12 |citizenship =
13 |nationality = [[England]]
14 |ethnicity =
15 |field = [[medicine]]
16 |work_institutions =
17 |alma_mater = [[St Mary's Hospital, London]]
18 |doctoral_advisor =
19 |doctoral_students =
20 |known_for = [[Prontosil]]
21 |author_abbrev_bot =
22 |author_abbrev_zoo =
23 |influences = [[Almroth Wright]]
24 |influenced = [[Peter Medawar]]
25 |prizes = [[Blair Bell medal]] in [[1955]]
26 |religion =
27 |footnotes =
28 |signature =
29 }}
30 '''{{PAGENAME}}''' [[Fellow_of_the_Royal_Society|FRS]] ( {{Birth
31 date|1883|3|2}} – {{Death date|1967|9|27}}) was an
32 [[England|English]] [[physician]] who introduced the use of [[Prontosil]]
33 in [[1935]] as a cure for [[puerperal fever]].
34
35 ==References==
36 *{{cite journal
37 | quotes = yes
38 |last=Dunn
39 |first=P M
40 |authorlink=
41 |year=[[2008]]
42 |month=May
43 |title=Dr Leonard Colebrook, FRS (1883-1967) and the chemotherapeutic
44 conquest of puerperal infection
45 |journal=Arch. Dis. Child. Fetal Neonatal Ed.
46 |volume=93
47 |issue=3
48 |pages=F246-8
49 | publisher = | location = | issn =
50 | pmid = 18426926
51 |doi = 10.1136/adc.2006.104448
52 | bibcode = | oclc =| id = | url = | language = | format = | accessdate =
53 | laysummary = | laysource = | laydate = | quote =
54 }}
(...)
191 ==See also==
192 * [[Prontosil]]
193
194 {{Persondata
195 |NAME =Colebrook, Leonard
196 |ALTERNATIVE NAMES =
197 |SHORT DESCRIPTION = English physician who introduced the use of
198 [[Prontosil]] in [[1935]] as a cure for [[puerperal fever]]
199 |DATE OF BIRTH = [[2 March]] [[1883]]
200 |PLACE OF BIRTH = [[Guildford, Surrey]]
201 |DATE OF DEATH = {{Death date|1967|9|27}}
202 |PLACE OF DEATH = [[Farnham Common]], [[Buckinghamshire]]
203 }}
204
205
206 {{physician-stub}}
207
208 {{DEFAULTSORT:Colebrook, Leonard}}
209 [[Category:1883 births]]
210 [[Category:1967 deaths]]
211 [[Category:British physicians]]




  • 1-29 (Infobox Scientist) and 194-203 (Persondata) are respectively an infobox and a source of metadata about individuals. I guess those structures can be parsed and interpreted by other tools such as Freebase or DBpedia.

  • 4: I could not find any picture of Colebrook on http://commons.wikimedia.org, so I put this link to a placeholder SVG figure. It is then possible to answer the question: "who is missing a portrait?"

  • 7 and 9: these are templates (~macros) which format the dates. The latter template also calculates and prints the age of the individual at his death.

  • 36-190: I searched the references about Colebrook on PubMed using the query "Colebrook L[PS]" ([PS] stands for Personal Name as Subject). The nine articles found were saved as XML and transformed to wiki-code using xsltproc and my XSLT stylesheet pubmed2wiki.

  • 206: just a signal to say "this article needs to be improved". It is then possible to answer the question: "which medical biographies need to be improved?"

  • 208: this template gives Wikipedia the key ("Colebrook, Leonard") used to sort the article in category listings.

  • 209-211: "Categories provide automatic indexes that are useful as tables of contents. Together with links and templates they structure a project."
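
As an illustration of what the date template on line 9 computes, here is a minimal Python sketch of an age-at-death calculation. The function name `age_at_death` is hypothetical, and this is not the code Wikipedia runs, only the same arithmetic.

```python
from datetime import date


def age_at_death(birth, death):
    """Number of full years lived between the birth and death dates."""
    years = death.year - birth.year
    # One year less if the death falls before the birthday in the final year
    if (death.month, death.day) < (birth.month, birth.day):
        years -= 1
    return years


# Dates taken from the infobox above
age = age_at_death(date(1883, 3, 2), date(1967, 9, 27))
print(age)
```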



That's it
Pierre

16 May 2008

Twitter m'a tuer (Twitter killed me)

Just like Paweł Szczęsny (on http://freelancingscience.com), I'm using this blog less and less in favor of Twitter, especially for short posts.

For example, yesterday I sent this information on twitter: .

I'm also starting to use FriendFeed.


Image found here


Pierre