10 April 2009

Resolving LSID: my notebook

This post is about LSID (The Life Science Identifier) and was inspired by the recent activity of Roderic Page on Twitter and by Roderic's paper "LSID Tester, a tool for testing Life Science Identifier resolution services".

OK.
At the beginning, there is an LSID:

urn:lsid:ubio.org:namebank:11815

ubio.org is the authority. It is followed by a database (namebank) and an id (11815).
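To make the structure explicit, here is a minimal Python sketch that splits an LSID into its parts. The names Lsid and parse_lsid are mine, not part of any LSID library, and the optional trailing revision field comes from the LSID specification rather than from the example above.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None  # optional in the spec, absent in the example


def parse_lsid(text: str) -> Lsid:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    is_lsid = (
        len(parts) in (5, 6)
        and parts[0].lower() == "urn"
        and parts[1].lower() == "lsid"
    )
    if not is_lsid:
        raise ValueError(f"not an LSID: {text!r}")
    return Lsid(*parts[2:])


print(parse_lsid("urn:lsid:ubio.org:namebank:11815"))
```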
We need to resolve this authority to find some metadata about this LSID object. On Unix, we put _lsid._tcp in front of the authority and use the host command to ask the "DNS for the lsid service record for pdb.org with TCP as the network protocol" (ubio.org in our case; I'm not really sure what this really means, and I guess it can be a problem for other bioinformaticians too).
%host -t srv _lsid._tcp.ubio.org
_lsid._tcp.ubio.org has SRV record 1 0 80 ANIMALIA.ubio.org.
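As far as I understand it, the four values in an SRV answer are the priority, the weight, the port and the target host. Here is a small Python sketch that reads the line printed by host and turns it into a base URL. The function names are hypothetical, and the use of plain http is an assumption on my side.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_host_srv_line(line: str) -> SrvRecord:
    """Parse one line of `host -t srv` output into its four SRV fields."""
    marker = "has SRV record"
    if marker not in line:
        raise ValueError(f"not an SRV answer: {line!r}")
    priority, weight, port, target = line.split(marker, 1)[1].split()
    # DNS names are case-insensitive; drop the trailing root dot.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip(".").lower())


def authority_base_url(record: SrvRecord) -> str:
    """Build an http:// base URL, leaving out the default port 80."""
    host = record.target if record.port == 80 else f"{record.target}:{record.port}"
    return f"http://{host}"


record = parse_host_srv_line(
    "_lsid._tcp.ubio.org has SRV record 1 0 80 ANIMALIA.ubio.org."
)
print(authority_base_url(record))
```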

So http://ANIMALIA.ubio.org is the location of the LSID service. We append /authority and we get a WSDL file at http://animalia.ubio.org/authority/ (this WSDL is another issue for me: are there that many bioinformaticians who know how to read such a format?).

<wsdl:definitions targetNamespace="http://www.hyam.net/lsid/Authority">
<import namespace="http://www.omg.org/LSID/2003/AuthorityServiceHTTPBindings"
location="LSIDAuthorityServiceHTTPBindings.wsdl"
/>

<wsdl:service name="MyAuthorityHTTPService">
<wsdl:port name="MyAuthorityHTTPPort" binding="httpsns:LSIDAuthorityHTTPBinding">
<httpsns:address location="http://animalia.ubio.org/authority/index.php"/>
</wsdl:port>
</wsdl:service>
</wsdl:definitions>
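Reading this WSDL by program is not too hard. The sketch below pulls the address of each port out of a trimmed copy of the listing above with Python's xml.etree.ElementTree. The xmlns declarations are assumptions I added so that the fragment parses (the listing does not show them), and the helper names are mine.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the WSDL above. The xmlns declarations are assumptions added
# so that the fragment parses; the listing in the post does not show them.
WSDL = """\
<wsdl:definitions
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:httpsns="http://www.omg.org/LSID/2003/AuthorityServiceHTTPBindings"
    targetNamespace="http://www.hyam.net/lsid/Authority">
  <wsdl:service name="MyAuthorityHTTPService">
    <wsdl:port name="MyAuthorityHTTPPort" binding="httpsns:LSIDAuthorityHTTPBinding">
      <httpsns:address location="http://animalia.ubio.org/authority/index.php"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def service_addresses(wsdl_text: str) -> dict:
    """Map each port name to the location of its address element."""
    root = ET.fromstring(wsdl_text)
    found = {}
    for port in root.iter():
        if local_name(port.tag) != "port":
            continue
        for child in port:
            if local_name(child.tag) == "address":
                found[port.get("name")] = child.get("location")
    return found


print(service_addresses(WSDL))
```

Matching on the local name only means the sketch does not depend on the namespace URIs I guessed.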

At http://animalia.ubio.org/authority/LSIDAuthorityServiceHTTPBindings.wsdl we get the HTTP bindings.
<definitions targetNamespace="http://www.omg.org/LSID/2003/AuthorityServiceHTTPBindings">
<import namespace="http://www.omg.org/LSID/2003/Standard/WSDL" location="LSIDPortTypes.wsdl"/>
<binding name="LSIDAuthorityHTTPBinding" type="sns:LSIDAuthorityServicePortType">
<http:binding verb="GET"/>
<operation name="getAvailableServices">
<http:operation location="/authority/"/>
<input>
<http:urlEncoded/>
</input>
<output>
<mime:multipartRelated>
<mime:part>
<mime:content part="wsdl" type="application/octet-stream"/>
</mime:part>
</mime:multipartRelated>
</output>
</operation>
</binding>
</definitions>

This WSDL tells us that http://animalia.ubio.org/authority/ is the URL where we can find some metadata about the LSID, using HTTP GET. The process of resolving the WSDL can be done once and cached. By appending metadata.php (why this php extension? It is not clear to me) you'll get the following RDF metadata about urn:lsid:ubio.org:namebank:11815 (very cool, I like this idea of getting RDF from one identifier).

<rdf:RDF>
<rdf:Description rdf:about="urn:lsid:ubio.org:namebank:11815">
<dc:identifier>urn:lsid:ubio.org:namebank:11815</dc:identifier>
<dc:creator rdf:resource="http://www.ubio.org"/>
<dc:subject>Pternistis leucoscepus (Gray, GR) 1867</dc:subject>
<ubio:taxonomicGroup>Aves</ubio:taxonomicGroup>
<ubio:recordVersion>4</ubio:recordVersion>
<ubio:canonicalName>Pternistis leucoscepus</ubio:canonicalName>
<dc:title>Pternistis leucoscepus</dc:title>
<dc:type>Scientific Name</dc:type>
<ubio:lexicalStatus>Unknown (Default)</ubio:lexicalStatus>
<gla:rank>Species</gla:rank>
<gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:954940"/>
<gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:954941"/>
<gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:1564236"/>
<gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:783787"/>
<gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:1580313"/>
<gla:mapping rdf:resource="http://starcentral.mbl.edu/microscope/portal.php?pagetitle=classification&amp;BLCHID=12-4498"/>
<gla:mapping rdf:resource="http://www.cbif.gc.ca/pls/itisca/next?v_tsn=553857&amp;taxa=&amp;p_format=&amp;p_ifx=cbif&amp;p_lang="/>
<gla:hasBasionym rdf:resource="urn:lsid:ubio.org:namebank:12292"/>
<gla:objectiveSynonym rdf:resource="urn:lsid:ubio.org:namebank:12292"/>
<gla:objectiveSynonym rdf:resource="urn:lsid:ubio.org:namebank:1762007"/>
<gla:objectiveSynonym rdf:resource="urn:lsid:ubio.org:namebank:1762032"/>
<gla:objectiveSynonym rdf:resource="urn:lsid:ubio.org:namebank:1762051"/>
<gla:objectiveSynonym rdf:resource="urn:lsid:ubio.org:namebank:3408791"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1116259"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1137821"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1173817"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1174615"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1416177"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1672192"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:2233032"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:13853963"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:1909656"/>
<ubio:hasCAVConcept rdf:resource="urn:lsid:ubio.org:classificationbank:2304281"/>
<dcterms:bibliographicCitation>Sclater, W.L., Systema Avium Ethiopicarum, p. 91</dcterms:bibliographicCitation>
</rdf:Description>
</rdf:RDF>
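Once we have the RDF, getting the statements out of it is straightforward. Here is a minimal Python sketch on a shortened copy of the record above. The rdf and dc namespace URIs are the standard ones; the gla URI is a hypothetical placeholder, because the listing does not show the real declaration. A real RDF library would be more robust, this is only the standard library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical placeholder: the post does not show the real URI for "gla".
GLA_NS = "http://example.org/gla#"

SAMPLE = f"""\
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}" xmlns:gla="{GLA_NS}">
  <rdf:Description rdf:about="urn:lsid:ubio.org:namebank:11815">
    <dc:title>Pternistis leucoscepus</dc:title>
    <gla:rank>Species</gla:rank>
    <gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:954940"/>
    <gla:vernacularName rdf:resource="urn:lsid:ubio.org:namebank:954941"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples(rdf_text):
    """Yield (subject, predicate, object) for each property of each rdf:Description."""
    root = ET.fromstring(rdf_text)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A property either points at a resource or carries literal text.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            yield subject, prop.tag, value


for s, p, o in triples(SAMPLE):
    print(s, p, o)
```

The vernacularName values are themselves LSIDs, so each of them could be resolved the same way.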


notebook EOF.

1 comment:

Mark Wilkinson said...

It's great to have a new convert to LSIDs! I've been an advocate of them for quite a while :-)

I don't think the point is for bioinformaticians to be able to read WSDL... nor even service providers for that matter! The LSID code stack will auto-generate the WSDL for you, so there's simply no need to ever design or read WSDL "by eye".

It all just works :-)

The DNS manipulation that you did is also ~unnecessary. It isn't a formal part of the spec, but it is supported by both the Java and Perl codebases that if you simply go to http://authority.uri/authority you will find an LSID authority server - there's no *requirement* to do the DNS record manipulation (though it's better practice to do so).