RDF/Jena: a simple extension for XSLT/XALAN. Testing with NCBI-Gene
In a previous post, I showed that the XALAN XSLT engine can be extended with a custom function returning a DOM Document that is then used by the XSLT stylesheet. Here, I'll create an extension for XALAN that fetches some RDF statements from a Jena/RDF model. The RDF model will be loaded in memory, but one could also use a persistent model (TDB or SDB). I'll download a record from NCBI Gene, transform it to HTML and use the Disease Ontology database, as RDF, to annotate it.
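To make the idea concrete before going through the files, here is a minimal sketch of what such an extension function could look like on the Java side. It is not the extension used in this post: the class name RdfQuerySketch, the in-memory list of triples and the add(..) helper are assumptions made for illustration, and a plain list stands in for the Jena model so that the example runs with the JDK alone. An empty string is treated as "match anything", and the result is a DOM Document made of reified rdf:Statement elements.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical, Jena-free stand-in for an RDF query extension. */
class RdfQuerySketch {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Each entry is {subject, predicate, object}; objects are kept as plain literals.
    private final List<String[]> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new String[] {subject, predicate, object});
    }

    // Assumption: a null or empty pattern acts as a wildcard.
    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.isEmpty() || pattern.equals(value);
    }

    private static Element resource(Document dom, String qName, String uri) {
        Element e = dom.createElementNS(RDF, qName);
        e.setAttributeNS(RDF, "rdf:resource", uri);
        return e;
    }

    /** Returns a DOM document holding one rdf:Statement per matching triple. */
    public Document query(String subject, String predicate, String object) {
        Document dom;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            dom = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = dom.createElementNS(RDF, "rdf:RDF");
        dom.appendChild(root);
        for (String[] t : triples) {
            if (!matches(subject, t[0]) || !matches(predicate, t[1]) || !matches(object, t[2])) {
                continue;
            }
            Element statement = dom.createElementNS(RDF, "rdf:Statement");
            statement.appendChild(resource(dom, "rdf:subject", t[0]));
            statement.appendChild(resource(dom, "rdf:predicate", t[1]));
            Element obj = dom.createElementNS(RDF, "rdf:object");
            obj.setTextContent(t[2]);
            statement.appendChild(obj);
            root.appendChild(statement);
        }
        return dom;
    }

    public static void main(String[] args) {
        RdfQuerySketch model = new RdfQuerySketch();
        model.add("http://purl.obolibrary.org/obo/DOID_0050721",
                  "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym",
                  "Phosphoserine phosphatase deficiency");
        Document dom = model.query("http://purl.obolibrary.org/obo/DOID_0050721", "", "");
        // Print how many statements matched the pattern.
        System.out.println(dom.getElementsByTagNameNS(RDF, "Statement").getLength());
    }
}
```

With Xalan, an instance method such as query(..) is normally called with the instance as its first XPath argument, which is presumably why $model comes first in the jena:query call used later in this post.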
A Gene record is downloaded as XML from NCBI gene:
curl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=4853&retmode=xml" > notch2.xml
The disease ontology is downloaded as RDF/XML:
curl -odoid.owl "http://www.berkeleybop.org/ontologies/doid.owl"
The XSLT Stylesheet
The stylesheet declares the extension jena, loads the RDF model ("$model"), searches for the OMIM identifiers in the Gene record and loads the RDF statements related to each OMIM ID.
For example, the following XPath expression:
jena:query( $model, $doiid, 'http://www.geneontology.org/formats/oboInOwl#hasExactSynonym', '' )
returns an RDF/XML document containing the RDF statements that have the subject $doiid, the property "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym" and any object (the empty string matches anything):
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Statement>
    <rdf:subject rdf:resource="http://purl.obolibrary.org/obo/DOID_0050721"/>
    <rdf:predicate rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"/>
    <rdf:object>Phosphoserine phosphatase deficiency</rdf:object>
  </rdf:Statement>
</rdf:RDF>
The stylesheet:
The Java code
This is the Java extension: the constructor loads the RDF model in memory. The function query(..) returns an RDF/XML document containing the statements that match the query.
Makefile
config.mk:
Result
java -cp ${class.path} org.apache.xalan.xslt.Process \
  -IN notch2.xml \
  -XSL gene2html.xsl -EDUMP -OUT result.html
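For readers who would rather drive the transformation from Java than from the command line, the same step can be sketched with the standard JAXP API. This is only an illustration under a few assumptions: the class name TransformSketch, the toy XML document and the toy stylesheet are all made up (they do not follow the NCBI Gene schema), and in-memory strings replace the notch2.xml and gene2html.xsl files. Running the real stylesheet with its jena extension would additionally need Xalan and the extension class on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: applies an XSLT stylesheet to an XML string through JAXP. */
class TransformSketch {
    static String transform(String xml, String xsl) {
        try {
            // Uses whichever XSLT processor JAXP finds (Xalan if it is on the classpath).
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Toy input and stylesheet, invented for this example.
        String xml = "<Gene><Name>NOTCH2</Name></Gene>";
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/Gene/Name\"/></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```

The listing that follows is the output of the original command-line run, not of this sketch.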
NOTCH2
Omim ID 610205
- Label: Alagille syndrome
- Synonym: Arteriohepatic dysplasia (disorder)
- Sub-Class Of:
  - Label: gastrointestinal system disease
  - Synonym: gastrointestinal disease
  - Sub-Class Of:
    - Label: disease of anatomical entity
    - Sub-Class Of:
      - Label: disease
Omim ID 102500
- Label: Hajdu-Cheney syndrome
- Synonym: Hajdu-Cheney syndrome (disorder)
- Sub-Class Of:
  - Label: autosomal dominant disease
  - Sub-Class Of:
    - Label: autosomal genetic disease
    - Sub-Class Of:
      - Label: monogenic disease
      - Sub-Class Of:
        - Label: genetic disease
        - Sub-Class Of:
          - Label: disease
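The nested "Sub-Class Of" entries in this output suggest that the stylesheet follows the sub-class links of each term up to the root of the ontology. As a rough sketch of that walk, the code below uses a simple child-to-parent map in place of the RDF model. The class name SubClassChainSketch and the single-parent map are assumptions made for illustration (an ontology term can have several parents), and the labels are the ones printed in the result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: walks child -> parent links up to the root term. */
class SubClassChainSketch {
    static List<String> ancestors(Map<String, String> parentOf, String start) {
        List<String> chain = new ArrayList<>();
        String current = parentOf.get(start);
        // Stop at the root (no parent) or when a term repeats, to avoid looping forever.
        while (current != null && !current.equals(start) && !chain.contains(current)) {
            chain.add(current);
            current = parentOf.get(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("Alagille syndrome", "gastrointestinal system disease");
        parentOf.put("gastrointestinal system disease", "disease of anatomical entity");
        parentOf.put("disease of anatomical entity", "disease");
        // Print each ancestor of the term, from the closest parent to the root.
        for (String label : ancestors(parentOf, "Alagille syndrome")) {
            System.out.println(label);
        }
    }
}
```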
That's it,
Pierre