Reasoning with the Variation Ontology using Apache Jena #OWL #RDF
The Variation Ontology (VariO) "is an ontology for standardized, systematic description of effects, consequences and mechanisms of variations".
In this post I will use the Apache Jena library for RDF to load this ontology. It will then be used to extract, from a set of variations, those whose type is a sub-class of a given class of Variation.
Loading the ontology
The OWL ontology is available for download here: http://www.variationontology.org/download/VariO_0.979.owl. A new RDF model for an OWL ontology is created and the OWL file is loaded:

    OntModel ontModel = ModelFactory.createOntologyModel();
    InputStream in = FileManager.get().open(VO_OWL_URL);
    ontModel.read(in, "");
    in.close();
Creating a Reasoner
An OWL reasoner is then created and bound to the previous model:

    Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
    reasoner = reasoner.bindSchema(ontModel);
Creating a random set of variations
A new RDF model is created to hold a few instances of random variations. For each instance, we add a random property 'my:chromosome' and a random property 'my:position', and we assign one of the following types:
- vo:VariO_0000029 "modified amino acid", a sub-Class of vo:VariO_0000028 ("post translationally modified protein")
- vo:VariO_0000030 "spliced protein", a sub-Class of vo:VariO_0000028 ("post translationally modified protein")
- vo:VariO_0000033 "effect on protein subcellular localization". It is NOT a sub-class of vo:VariO_0000028
    Random rand = new Random();
    com.hp.hpl.jena.rdf.model.Model instances = ModelFactory.createDefaultModel();
    instances.setNsPrefix("vo", VO_PREFIX);
    instances.setNsPrefix("my", MY_URI);
    for (int i = 0; i < 10; ++i) {
        Resource subject = null;
        Resource rdftype = null;
        switch (i % 3) {
            case 0: {
                // modified amino acid
                subject = instances.createResource(AnonId.create("modaa_" + i));
                rdftype = instances.createResource(VO_PREFIX + "VariO_0000029");
                break;
            }
            case 1: {
                // spliced protein
                subject = instances.createResource(AnonId.create("spliced_" + i));
                rdftype = instances.createResource(VO_PREFIX + "VariO_0000030");
                break;
            }
            default: {
                // effect on protein subcellular localization
                subject = instances.createResource(AnonId.create("subcell_" + i));
                rdftype = instances.createResource(VO_PREFIX + "VariO_0000033");
                break;
            }
        }
        instances.add(subject, RDF.type, rdftype);
        instances.add(subject, hasChromosome, instances.createLiteral("chr" + (1 + rand.nextInt(22))));
        instances.add(subject, hasPosition, instances.createTypedLiteral(rand.nextInt(1000000)));
    }
Reasoning
A new inference model is created from the reasoner and the variation instances. An iterator then lists only the variations whose type is a sub-class of vo:VariO_0000028 and that have both a 'my:chromosome' and a 'my:position' property:

    InfModel model = ModelFactory.createInfModel(reasoner, instances);
    ExtendedIterator<Statement> sti = model.listStatements(
        null, null, model.createResource(VO_PREFIX + "VariO_0000028"));
    sti = sti.filterKeep(new Filter<Statement>() {
        @Override
        public boolean accept(Statement stmt) {
            return stmt.getSubject().getProperty(hasChromosome) != null
                && stmt.getSubject().getProperty(hasPosition) != null;
        }
    });

Loop over the iterator and print the result:

    while (sti.hasNext()) {
        Statement stmt = sti.next();
        System.out.println("\t+ " + PrintUtil.print(stmt));
        Statement val = stmt.getSubject().getProperty(hasChromosome);
        System.out.println("\t\tChromosome:\t" + val.getObject());
        val = stmt.getSubject().getProperty(hasPosition);
        System.out.println("\t\tPosition:\t" + val.getObject());
    }
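To make the inference step less of a black box, here is a minimal plain-Java sketch of the rdfs:subClassOf rule that the selection relies on. It does not use Jena and is not how the OWL reasoner is implemented. The class name SubClassSketch and its methods are hypothetical, and the SUPER_OF map only contains the two sub-class links mentioned in this post, not the whole ontology.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration of the rdfs:subClassOf entailment; hypothetical, not Jena code. */
class SubClassSketch {
    // Direct sub-class -> super-class links, limited to the ones described in this post.
    static final Map<String, String> SUPER_OF = Map.of(
        "VariO_0000029", "VariO_0000028",   // modified amino acid
        "VariO_0000030", "VariO_0000028");  // spliced protein

    /** Returns true if 'type' is 'target' or reaches it by following super-class links. */
    static boolean isA(String type, String target) {
        Set<String> seen = new HashSet<>();  // guards against cycles in the links
        for (String t = type; t != null && seen.add(t); t = SUPER_OF.get(t)) {
            if (t.equals(target)) return true;
        }
        return false;
    }

    /** Labels of the ten instances (same i % 3 scheme as above) whose type is a 'target'. */
    static List<String> select(String target) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < 10; ++i) {
            String label;
            String type;
            switch (i % 3) {
                case 0:  label = "modaa_" + i;   type = "VariO_0000029"; break;
                case 1:  label = "spliced_" + i; type = "VariO_0000030"; break;
                default: label = "subcell_" + i; type = "VariO_0000033"; break;
            }
            if (isA(type, target)) kept.add(label);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Seven of the ten instances are kept; the three subcell_* ones are not.
        System.out.println(select("VariO_0000028"));
    }
}
```

The instances typed vo:VariO_0000033 drop out because no chain of sub-class links leads from that class to vo:VariO_0000028. The real reasoner reaches the same selection from the axioms in the OWL file rather than from a hard-coded map.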
Result
    + (spliced_7 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr7
        Position:   134172^^http://www.w3.org/2001/XMLSchema#int
    + (spliced_4 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr13
        Position:   674316^^http://www.w3.org/2001/XMLSchema#int
    + (spliced_1 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr22
        Position:   457596^^http://www.w3.org/2001/XMLSchema#int
    + (modaa_9 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr12
        Position:   803303^^http://www.w3.org/2001/XMLSchema#int
    + (modaa_6 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr15
        Position:   794137^^http://www.w3.org/2001/XMLSchema#int
    + (modaa_3 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr14
        Position:   34487^^http://www.w3.org/2001/XMLSchema#int
    + (modaa_0 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
        Chromosome: chr15
        Position:   536371^^http://www.w3.org/2001/XMLSchema#int
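The positions are printed as typed literals: the lexical form, then "^^", then the datatype URI (here xsd:int). If this text output has to be post-processed, the two parts can be separated as in the sketch below. TypedLiteralSketch and its methods are hypothetical names, and the sketch assumes the "value^^datatype" layout shown above. Inside the Jena program itself it is simpler to read the number straight from the statement, for example with Statement.getInt().

```java
/** Splits the printed form of a typed literal such as "134172^^http://www.w3.org/2001/XMLSchema#int". */
class TypedLiteralSketch {
    static final String XSD_INT = "http://www.w3.org/2001/XMLSchema#int";

    /** Returns {lexicalForm, datatypeUri}; datatypeUri is null when there is no "^^" marker. */
    static String[] split(String printed) {
        int idx = printed.lastIndexOf("^^");
        if (idx < 0) return new String[] { printed, null };
        return new String[] { printed.substring(0, idx), printed.substring(idx + 2) };
    }

    /** Parses a position, rejecting anything that is not declared as xsd:int. */
    static int position(String printed) {
        String[] parts = split(printed);
        if (!XSD_INT.equals(parts[1])) {
            throw new IllegalArgumentException("not an xsd:int literal: " + printed);
        }
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        System.out.println(position("134172^^http://www.w3.org/2001/XMLSchema#int"));
    }
}
```

The chromosome values carry no "^^" suffix because they were added with createLiteral, which produces a plain literal, whereas the positions were added with createTypedLiteral.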
Full source code
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Random;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import com.hp.hpl.jena.ontology.OntModel;
    import com.hp.hpl.jena.ontology.OntModelSpec;
    import com.hp.hpl.jena.rdf.model.AnonId;
    import com.hp.hpl.jena.rdf.model.InfModel;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.rdf.model.Property;
    import com.hp.hpl.jena.rdf.model.Resource;
    import com.hp.hpl.jena.rdf.model.Statement;
    import com.hp.hpl.jena.reasoner.Reasoner;
    import com.hp.hpl.jena.reasoner.ReasonerRegistry;
    import com.hp.hpl.jena.util.FileManager;
    import com.hp.hpl.jena.util.PrintUtil;
    import com.hp.hpl.jena.util.iterator.ExtendedIterator;
    import com.hp.hpl.jena.util.iterator.Filter;
    import com.hp.hpl.jena.vocabulary.RDF;

    public class VariationOntologyReasoner {
        private static final String VO_PREFIX = "http://purl.obolibrary.org/obo/";
        private static final String MY_URI = "urn:my:ontology";
        private static final String VO_OWL_URL = "http://www.variationontology.org/download/VariO_0.979.owl";

        private Reasoner reasoner;
        static final private Property hasChromosome = ModelFactory.createDefaultModel().createProperty(MY_URI, "chromosome");
        static final private Property hasPosition = ModelFactory.createDefaultModel().createProperty(MY_URI, "position");

        private VariationOntologyReasoner() throws IOException {
            OntModel ontModel = ModelFactory.createOntologyModel();
            InputStream in = FileManager.get().open(VO_OWL_URL);
            ontModel.read(in, "");
            in.close();
            this.reasoner = ReasonerRegistry.getOWLReasoner();
            this.reasoner = this.reasoner.bindSchema(ontModel);
        }

        private void run() {
            Random rand = new Random();
            com.hp.hpl.jena.rdf.model.Model instances = ModelFactory.createDefaultModel();
            instances.setNsPrefix("vo", VO_PREFIX);
            instances.setNsPrefix("my", MY_URI);
            for (int i = 0; i < 10; ++i) {
                Resource subject = null;
                Resource rdftype = null;
                switch (i % 3) {
                    case 0: {
                        // modified amino acid
                        subject = instances.createResource(AnonId.create("modaa_" + i));
                        rdftype = instances.createResource(VO_PREFIX + "VariO_0000029");
                        break;
                    }
                    case 1: {
                        subject = instances.createResource(AnonId.create("spliced_" + i));
                        rdftype = instances.createResource(VO_PREFIX + "VariO_0000030");
                        break;
                    }
                    default: {
                        // effect on protein subcellular localization
                        subject = instances.createResource(AnonId.create("subcell_" + i));
                        rdftype = instances.createResource(VO_PREFIX + "VariO_0000033");
                        break;
                    }
                }
                instances.add(subject, RDF.type, rdftype);
                instances.add(subject, hasChromosome, instances.createLiteral("chr" + (1 + rand.nextInt(22))));
                instances.add(subject, hasPosition, instances.createTypedLiteral(rand.nextInt(1000000)));
            }

            InfModel model = ModelFactory.createInfModel(reasoner, instances);
            ExtendedIterator<Statement> sti = model.listStatements(
                null, null, model.createResource(VO_PREFIX + "VariO_0000028"));
            sti = sti.filterKeep(new Filter<Statement>() {
                @Override
                public boolean accept(Statement stmt) {
                    return stmt.getSubject().getProperty(hasChromosome) != null
                        && stmt.getSubject().getProperty(hasPosition) != null;
                }
            });
            while (sti.hasNext()) {
                Statement stmt = sti.next();
                System.out.println("\t+ " + PrintUtil.print(stmt));
                Statement val = stmt.getSubject().getProperty(hasChromosome);
                System.out.println("\t\tChromosome:\t" + val.getObject());
                val = stmt.getSubject().getProperty(hasPosition);
                System.out.println("\t\tPosition:\t" + val.getObject());
            }
        }

        public static void main(String[] args) throws Exception {
            VariationOntologyReasoner app = new VariationOntologyReasoner();
            app.run();
        }
    }
That's it,
Pierre