Showing posts with label jaxws. Show all posts

16 November 2011

"VCF annotation" with the NHLBI GO Exome Sequencing Project (JAX-WS)

The NHLBI Exome Sequencing Project (ESP) has released a web service to query their data. "The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein coding regions of the human genome across diverse, richly-phenotyped populations and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management and treatment of heart, lung and blood disorders."
In the current post, I'll show how I've used this web service to annotate a VCF file with this information.
The web service provided by the ESP is based on the SOAP protocol.
Here is an example of the XML response:
We can generate the Java classes for a client invoking this web service with ${JAVA_HOME}/bin/wsimport:

$ wsimport -keep "http://evs.gs.washington.edu/wsEVS/EVSDataQueryService?wsdl"

parsing WSDL...
generating code...
compiling code...

Here is the Java code running this client. It scans the VCF, calls the web service for each variation and inserts the annotation as JSON in a new column.
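
The scanning loop described above can be sketched as follows. This is only a minimal illustration, not the client itself: `EvsLookup` is a hypothetical stand-in for the wsimport-generated port, and `VcfAnnotator` is a name chosen for this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical stand-in for the generated web service client: returns a JSON string for one site. */
interface EvsLookup {
    String annotate(String chrom, int pos);
}

class VcfAnnotator {
    /** Copies a VCF from in to out, appending an EVS column to the header line and to each variant line. */
    static void annotate(Reader in, Writer out, EvsLookup evs) {
        try {
            BufferedReader r = new BufferedReader(in);
            PrintWriter w = new PrintWriter(out);
            String line;
            while ((line = r.readLine()) != null) {
                if (line.startsWith("##")) {
                    // meta-information lines are copied unchanged
                    w.println(line);
                } else if (line.startsWith("#")) {
                    // the #CHROM header line gets the name of the new column
                    w.println(line + "\tEVS");
                } else if (!line.isEmpty()) {
                    // variant line: CHROM is the first column, POS the second
                    String[] tokens = line.split("\t");
                    String json = evs.annotate(tokens[0], Integer.parseInt(tokens[1]));
                    w.println(line + "\t" + json);
                }
            }
            w.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the lookup as an interface keeps the loop independent of the generated classes, so it can be exercised with a stub before the real service is plugged in.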
... and the makefile:

Result (some columns have been cut)

curl -s "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/supporting/EUR.2of4intersection_allele_freq.20100804.sites.vcf.gz" |\
 gunzip -c |\
 java -jar evsclient.jar 



##fileformat=VCFv4.0
##filedat=20101112
##datarelease=20100804
##samples=629
##description="Where BI calls are present, genotypes and alleles are from BI.  In there absence, UM genotypes are used.  If neither are available, no genotype information is present and the alleles are from the NCBI calls."
(...)
#CHROM POS ID EVS
1 10469 rs117577454 {"start":10469,"chromosome":"1","stop":10470,"strand":"+","snpList":[],"setOfSiteCoverageInfo":[]}
1 10583 rs58108140 {"start":10583,"chromosome":"1","stop":10584,"strand":"+","snpList":[],"setOfSiteCoverageInfo":[]}
1 11508 . {"start":11508,"chromosome":"1","stop":11509,"strand":"
(...)
1 69511 . {"start":69511,"chromosome":"1","stop":69512,"strand":"+","snpList":[{"chromosome":"1","conservationScore":"1.0","conservationScoreGERP":"0.5","refAllele":"A","ancestralAllele":"G","filters":"PASS","clinicalLink":"unknown","positionString":"1:69511","chrPosition":69511,"alleles":"G/A","uaAlleleCounts":"1373/47","aaAlleleCounts":"880/600","totalAlleleCounts":"2253/647","uaAlleleAndCount":"G=1373/A=47","aaAlleleAndCount":"G=880/A=600","totalAlleleAndCount":"G=2253/A=647","uaMAF":3.3099,"aaMAF":40.5405,"totalMAF":22.3103,"avgSampleReadDepth":185,"geneList":"OR4F5","snpFunction":{"chromosome":"1","position":69511,"conservationScore":"1.0","conservationScoreGERP":"0.5","snpFxnList":[{"mrnaAccession":"NM_001005484","fxnClassGVS":"missense","aminoAcids":"THR,ALA","proteinPos":"141/306","cdnaPos":421,"pphPrediction":"benign","granthamScore":"58"}],"refAllele":"A","ancestralAllele":"G","firstRsId":75062661,"secondRsId":0,"filters":"PASS","clinicalLink":"unknown"},"altAlleles":"G","hasAtLeastOneAccession":"true","rsIds":"rs75062661"}],"setOfSiteCoverageInfo":[{"chromosome":"1","position":69511,"avgSampleReadDepth":185.0,"totalSamplesCovered":1452,"eaSamplesCovered":712,"avgEaSampleReadDepth":157.0,"aaSamplesCovered":740,"avgAaSampleReadDepth":211.0},{"chromosome":"1","position":69512,"avgSampleReadDepth":180.0,"totalSamplesCovered":1501,"eaSamplesCovered":739,"avgEaSampleReadDepth":153.0,"aaSamplesCovered":762,"avgAaSampleReadDepth":207.0}]}
(...)
1 901923 . {"start":901923,"chromosome":"1","stop":901924,"strand":"+","snpList":[{"chromosome":"1","conservationScore":"1.0","conservationScoreGERP":"5.0","refAllele":"C","ancestralAllele":"C","filters":"PASS","clinicalLink":"unknown","positionString":"1:901923","chrPosition":901923,"alleles":"A/C","uaAlleleCounts":"2/2542","aaAlleleCounts":"52/1934","totalAlleleCounts":"54/4476","uaAlleleAndCount":"A=2/C=2542","aaAlleleAndCount":"A=52/C=1934","totalAlleleAndCount":"A=54/C=4476","uaMAF":0.0786,"aaMAF":2.6183,"totalMAF":1.1921,"avgSampleReadDepth":35,"geneList":"PLEKHN1","snpFunction":{"chromosome":"1","position":901923,"conservationScore":"1.0","conservationScoreGERP":"5.0","snpFxnList":[{"mrnaAccession":"NM_032129","fxnClassGVS":"missense","aminoAcids":"SER,ARG","proteinPos":"4/612","cdnaPos":12,"pphPrediction":"probably-damaging","granthamScore":"110"}],"refAllele":"C","ancestralAllele":"C","firstRsId":0,"secondRsId":0,"filters":"PASS","clinicalLink":"unknown"},"altAlleles":"A","hasAtLeastOneAccession":"true","rsIds":"none"}],"setOfSiteCoverageInfo":[{"chromosome":"1","position":901923,"avgSampleReadDepth":35.0,"totalSamplesCovered":2280,"eaSamplesCovered":1272,"avgEaSampleReadDepth":32.0,"aaSamplesCovered":1008,"avgAaSampleReadDepth":38.0},{"chromosome":"1","position":901924,"avgSampleReadDepth":35.0,"totalSamplesCovered":2283,"eaSamplesCovered":1273,"avgEaSampleReadDepth":32.0,"aaSamplesCovered":1010,"avgAaSampleReadDepth":38.0}]}
1 902069 rs116147894 {"start":902069,"chromosome":"1","stop":902070,"strand":"+","snpList":[{"chromosome":"1","conservationScore":"0.0","conservationScoreGERP":"1.0","refAllele":"T","ancestralAllele":"T","filters":"PASS","clinicalLink":"unknown","positionString":"1:902069","chrPosition":902069,"alleles":"C/T","uaAlleleCounts":"2/320","aaAlleleCounts":"18/212","totalAlleleCounts":"20/532","uaAlleleAndCount":"C=2/T=320","aaAlleleAndCount":"C=18/T=212","totalAlleleAndCount":"C=20/T=532","uaMAF":0.6211,"aaMAF":7.8261,"totalMAF":3.6232,"avgSampleReadDepth":13,"geneList":"PLEKHN1","snpFunction":{"chromosome":"1","position":902069,"conservationScore":"0.0","conservationScoreGERP":"1.0","snpFxnList":[{"mrnaAccession":"NM_032129","fxnClassGVS":"intron","aminoAcids":"none","proteinPos":"NA","cdnaPos":-1,"pphPrediction":"unknown","granthamScore":"NA"}],"refAllele":"T","ancestralAllele":"T","firstRsId":0,"secondRsId":0,"filters":"PASS","clinicalLink":"unknown"},"altAlleles":"C","hasAtLeastOneAccession":"true","rsIds":"none"}],"setOfSiteCoverageInfo":[{"chromosome":"1","position":902069,"avgSampleReadDepth":13.0,"totalSamplesCovered":304,"eaSamplesCovered":169,"avgEaSampleReadDepth":13.0,"aaSamplesCovered":135,"avgAaSampleReadDepth":12.0},{"chromosome":"1","position":902070,"avgSampleReadDepth":12.0,"totalSamplesCovered":338,"eaSamplesCovered":190,"avgEaSampleReadDepth":13.0,"aaSamplesCovered":148,"avgAaSampleReadDepth":12.0}]}
1 902108 rs62639981 {"start":902108,"chromosome":"1","stop":902109,"strand":"+","snpList":[{"chromosome":"1","conservationScore":"0.0","conservationScoreGERP":"-8.7","refAllele":"C","ancestralAllele":"unknown","filters":"PASS","clinicalLink":"unknown","positionString":"1:902108","chrPosition":902108,"alleles":"T/C","uaAlleleCounts":"5/333","aaAlleleCounts":"0/248","totalAlleleCounts":"5/581","uaAlleleAndCount":"T=5/C=333","aaAlleleAndCount":"T=0/C=248","totalAlleleAndCount":"T=5/C=581","uaMAF":1.4793,"aaMAF":0.0,"totalMAF":0.8532,"avgSampleReadDepth":13,"geneList":"PLEKHN1","snpFunction":{"chromosome":"1","position":902108,"conservationScore":"0.0","conservationScoreGERP":"-8.7","snpFxnList":[{"mrnaAccession":"NM_032129","fxnClassGVS":"coding-synonymous","aminoAcids":"none","proteinPos":"36/612","cdnaPos":108,"pphPrediction":"unknown","granthamScore":"NA"}],"refAllele":"C","ancestralAllele":"unknown","firstRsId":62639981,"secondRsId":0,"filters":"PASS","clinicalLink":"unknown"},"altAlleles":"T","hasAtLeastOneAccession":"true","rsIds":"rs62639981"}],"setOfSiteCoverageInfo":[{"chromosome":"1","position":902108,"avgSampleReadDepth":13.0,"totalSamplesCovered":294,"eaSamplesCovered":170,"avgEaSampleReadDepth":13.0,"aaSamplesCovered":124,"avgAaSampleReadDepth":13.0},{"chromosome":"1","position":902109,"avgSampleReadDepth":13.0,"totalSamplesCovered":309,"eaSamplesCovered":177,"avgEaSampleReadDepth":13.0,"aaSamplesCovered":132,"avgAaSampleReadDepth":13.0}]}
(...)
That's it
Pierre

14 May 2009

WebServices/JAXWS for SNP, Glassfish, Taverna: my notebook

In this post I describe how to deploy a web service in the GlassFish server and how to use it from the Taverna workflow engine.

Server side


Classes


The JAX-WS API (the Java API for XML Web Services) was used here. Our web service is designed to
  • find the position of a SNP from its name
  • find the SNPs in a given region
First of all, a simple POJO (Plain Old Java Object) for a SNP (name, chromosome, position, ...) was created:
import java.io.Serializable;

public class SNP
implements Serializable
{
private static final long serialVersionUID = 1L;
private String name=null;
private String acn=null;
private String chromosome=null;
private String sequence=null;
private int position=-1;

(...)
//getters and setters here...
(...)

}

The interface of our web service "SnpTool" is then defined. It is decorated with the JAX-WS annotations defining the methods of the web service and the names of their parameters:
package fr.cephb.operon.server.ws;
import javax.jws.WebMethod;
import javax.jws.WebParam;
import javax.jws.WebResult;
import javax.jws.WebService;
import javax.jws.soap.SOAPBinding;
import javax.jws.soap.SOAPBinding.Style;

@WebService
@SOAPBinding(style=Style.DOCUMENT)
public interface SnpTool
{
@WebMethod
@WebResult(name="SnpDescriptorList")
public SNP[] getSNPByName(@WebParam(name="rsNumber")String name) throws Exception;

@WebMethod
@WebResult(name="SnpList")
public SNP[] getSNPByPosition(@WebParam(name="chromosome")String chrom,@WebParam(name="start")int start,@WebParam(name="end")int end) throws Exception;
}
The annotation @SOAPBinding(style=Style.DOCUMENT) was critical because Taverna doesn't seem to handle Style.RPC.
The service for SnpTool is then implemented. As I want to configure this web service at deployment time (for example to specify a maximum number of SNPs to be retrieved, the default assembly, etc.), we need a handle to the web container: it is obtained by injecting a WebServiceContext with the @Resource annotation. This context is then used to retrieve the initialization parameters of the web application. Warning: this context is not yet initialized in the SNPToolWeb constructor.
package fr.cephb.operon.server.ws;
import java.io.File;
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Resource;
import javax.jws.WebService;
import javax.servlet.ServletContext;
import javax.xml.ws.WebServiceContext;
import javax.xml.ws.handler.MessageContext;

import fr.cephb.joperon.core.bio.Assembly;

@WebService(endpointInterface="fr.cephb.operon.server.ws.SnpTool")
public class SNPToolWeb
implements SnpTool
{
@Resource
private WebServiceContext wsContext=null;

/** max number of SNP to be retrieved */
private Integer maxNumberOfSNP=null;

private Integer getMaxNumberOfSNP() throws Exception
{
if(maxNumberOfSNP!=null) return maxNumberOfSNP;
MessageContext ctxt = wsContext.getMessageContext();
ServletContext ctx = (ServletContext)ctxt.get(MessageContext.SERVLET_CONTEXT);
String s= ctx.getInitParameter("limit");
if(s!=null)
{
maxNumberOfSNP=Integer.parseInt(s);
}
return maxNumberOfSNP;
}



@Override
public SNP[] getSNPByName(String rsName) throws Exception {
//placeholder: an empty array until the lookup below is implemented
SNP[] snp = new SNP[0];
//get your data here from a database, a file, etc....
//yes it returns an array because some SNP may have been merged
return snp;
}

@Override
public SNP[] getSNPByPosition(String krom, int start, int end)
throws Exception {
List<SNP> list= new ArrayList<SNP>();

//get your data here....

return list.toArray(new SNP[list.size()]);
}

}
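
In getMaxNumberOfSNP() above, a missing "limit" parameter leaves the field null and a malformed value throws a NumberFormatException. A more forgiving parse might look like the sketch below; `InitParams`, `parseLimit` and the fallback value are hypothetical names introduced for this illustration, not part of the service.

```java
class InitParams {
    /**
     * Parses an init parameter such as "limit".
     * Returns fallback when the value is missing, not a number, or not positive.
     */
    static int parseLimit(String value, int fallback) {
        if (value == null) return fallback;
        try {
            int n = Integer.parseInt(value.trim());
            return n > 0 ? n : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

With such a helper the lookup could read `InitParams.parseLimit(ctx.getInitParameter("limit"), 1000)`, where 1000 is an arbitrary default chosen for the example.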

Compile and Deploy

The deployment descriptor (web.xml) looks like this:
<web-app>
<context-param>
<param-name>limit</param-name>
<param-value>1000000</param-value>
</context-param>
</web-app>

Ant is called to deploy this web service in GlassFish. Note that wsgen is invoked to generate the portable artifacts used by JAX-WS web services.
<project default="demoservlet" basedir=".">
<property environment="env"/>
<property name="home.dir" value="${env.HOME}"/>
<property name="rootdir" value="."/>
<property name="builddir" value="${rootdir}/build"/>
<property name="compiledir" value="${builddir}/compile"/>
<property file="${home.dir}/.project-properties"/>

<path id="libraries">
<pathelement path="lib1.jar"/>
<pathelement path="lib2.jar"/>
</path>

<path id="j2eelib">
<pathelement path="${appserver.dir}/lib/webservices-rt.jar"/>
<pathelement path="${appserver.dir}/lib/webservices-tools.jar"/>
</path>



<target name="demoservlet">
<mkdir dir="${compiledir}"/>
<mkdir dir="${builddir}"/>

<copy todir="${compiledir}" includeEmptyDirs="false">
<fileset dir="${rootdir}/src/java">
<filename name="**/*.java"/>
</fileset>
<fileset dir="${rootdir}/src/java">
<filename name="**/*.xml"/>
</fileset>
</copy>

<javac srcdir="${compiledir}" destdir="${compiledir}" debug="true" source="1.6" target="1.6">
<include name="**/SNPServlet.java"/>
<include name="**/SNPToolWeb.java"/>
<classpath>
<path refid="libraries"/>
<pathelement location="${appserver.dir}/lib/j2ee.jar"/>
</classpath>
</javac>


<echo message="Running wsgen"/>
<exec executable="${appserver.dir}/bin/wsgen">
<arg value="-cp"/> <arg value="${appserver.dir}/lib/j2ee.jar:lib1.jar:lib2.jar:${compiledir}"/>
<arg value="-verbose"/>
<arg value="-s"/> <arg value="${compiledir}"/>
<arg value="-d"/> <arg value="${compiledir}"/>
<arg value="-wsdl"/>
<arg value="-keep"/>
<arg value="fr.cephb.operon.server.ws.SNPToolWeb"/>
</exec>


<delete includeEmptyDirs="true">
<fileset dir="${compiledir}" includes="**/*.java"/>
</delete>

<war destfile="${builddir}/snp2rdf.war" webxml="${compiledir}/WEB-INF/web.xml">
<lib file="lib1.jar"/>
<classes dir="${compiledir}"/>
</war>

<delete dir="${compiledir}"/>

<exec executable="asadmin" failonerror="true">
<arg value="deploy"/>
<arg line="--user username"/>
<arg line="--passwordfile ${passwordfile}"/>
<arg value="--host"/>
<arg value="localhost"/>
<arg value="--port"/>
<arg value="${domain.admin.port}"/>
<arg value="--echo=true"/>
<arg value="--libraries"/><arg value="lib1.jar"/>
<arg value="${builddir}/snp2rdf.war"/>
</exec>
</target>
</project>

The WSDL

After this WebService was compiled and deployed, we can see its WSDL at http://www.example.org:8080/snp2rdf/SNPToolWebService?wsdl:
<definitions targetNamespace="http://ws.server.operon.cephb.fr/" name="SNPToolWebService">
<types>
<xsd:schema>
<xsd:import namespace="http://ws.server.operon.cephb.fr/" schemaLocation="http://www.example.org:8080/snp2rdf/SNPToolWebService?xsd=1"/>
</xsd:schema>
</types>
<message name="getSNPByName">
<part name="parameters" element="tns:getSNPByName"/>
</message>
<message name="getSNPByNameResponse">
<part name="parameters" element="tns:getSNPByNameResponse"/>
</message>
<message name="Exception">
<part name="fault" element="tns:Exception"/>
</message>
<message name="getAssemblyName">
<part name="parameters" element="tns:getAssemblyName"/>
</message>
<message name="getAssemblyNameResponse">
<part name="parameters" element="tns:getAssemblyNameResponse"/>
</message>
<message name="getSNPByPosition">
<part name="parameters" element="tns:getSNPByPosition"/>
</message>
<message name="getSNPByPositionResponse">
<part name="parameters" element="tns:getSNPByPositionResponse"/>
</message>
<portType name="SnpTool">
<operation name="getSNPByName">
<input message="tns:getSNPByName"/>
<output message="tns:getSNPByNameResponse"/>
<fault message="tns:Exception" name="Exception"/>
</operation>
<operation name="getAssemblyName">
<input message="tns:getAssemblyName"/>
<output message="tns:getAssemblyNameResponse"/>
<fault message="tns:Exception" name="Exception"/>
</operation>
<operation name="getSNPByPosition">
<input message="tns:getSNPByPosition"/>
<output message="tns:getSNPByPositionResponse"/>
<fault message="tns:Exception" name="Exception"/>
</operation>
</portType>
<binding name="SNPToolWebPortBinding" type="tns:SnpTool">
<soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="document"/>
<operation name="getSNPByName">
<soap:operation soapAction=""/>
<input>
<soap:body use="literal"/>
</input>
<output>
<soap:body use="literal"/>
</output>
<fault name="Exception">
<soap:fault name="Exception" use="literal"/>
</fault>
</operation>
<operation name="getAssemblyName">
<soap:operation soapAction=""/>
<input>
<soap:body use="literal"/>
</input>
<output>
<soap:body use="literal"/>
</output>
<fault name="Exception">
<soap:fault name="Exception" use="literal"/>
</fault>
</operation>
<operation name="getSNPByPosition">
<soap:operation soapAction=""/>
<input>
<soap:body use="literal"/>
</input>
<output>
<soap:body use="literal"/>
</output>
<fault name="Exception">
<soap:fault name="Exception" use="literal"/>
</fault>
</operation>
</binding>
<service name="SNPToolWebService">
<port name="SNPToolWebPort" binding="tns:SNPToolWebPortBinding">
<soap:address location="http://www.example.org:8080/snp2rdf/SNPToolWebService"/>
</port>
</service>
</definitions>


Client side


Creating a Client with wsimport


I described how to use wsimport to generate the client classes for a web service in a previous post (see "The EBI/IntAct Web-Service API, my notebook").
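
For readers who want to see what goes over the wire without generating any class, the getSNPByName request described by the WSDL above can be sketched by hand. The namespace and the operation name come from the WSDL and "rsNumber" from the @WebParam annotation; treating "rsNumber" as an unqualified child element is an assumption (it matches the unqualified SnpList elements in the saved response below, but the XSD is not shown). `SoapRequest` is a hypothetical helper name.

```java
class SoapRequest {
    /** Escapes the characters that are special in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds a document/literal SOAP 1.1 envelope for getSNPByName. */
    static String getSnpByName(String rsNumber) {
        return "<soapenv:Envelope"
            + " xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:ws=\"http://ws.server.operon.cephb.fr/\">"
            + "<soapenv:Body>"
            // assumption: the wrapper is namespace-qualified, its child is not
            + "<ws:getSNPByName><rsNumber>" + escape(rsNumber) + "</rsNumber></ws:getSNPByName>"
            + "</soapenv:Body>"
            + "</soapenv:Envelope>";
    }
}
```

Such an envelope would be sent as the body of an HTTP POST to the soap:address location given in the WSDL; the classes generated by wsimport do the same work behind a typed Java interface.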

Using the WS with Taverna



I've used this WSDL to run my first Taverna workflow: the input is the SNP "rs25". The web service is invoked once to find its position on the human genome, then again to find its neighbours within 100 bp. The XML result is then saved to a local file.
  • The green nodes are the WebServices.
  • The blue nodes are the constants (e.g. "rs25")
  • The orange node is a simple Java BeanShell script extending the position of the SNP:
    left=Math.max(0,Integer.parseInt(position)-Integer.parseInt(extend));
    right=Integer.parseInt(position)+Integer.parseInt(extend);
  • The purple nodes are the XML scavengers (a mysterious thing used to convert a structure to/from an XML file) and the processors (e.g. write to a file)
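
The BeanShell step in the list above receives its inputs as strings. The same computation in plain Java, as a small sketch (`SnpWindow` and `around` are names invented for this example):

```java
class SnpWindow {
    /**
     * Returns {left, right}: the position extended by `extend` bases
     * on each side, with the left bound clamped at 0.
     */
    static int[] around(String position, String extend) {
        int pos = Integer.parseInt(position);
        int ext = Integer.parseInt(extend);
        return new int[] { Math.max(0, pos - ext), pos + ext };
    }
}
```

For an illustrative position of 11549750 and an extension of 100, this gives the interval 11549650 to 11549850; a position closer than 100 bases to the start of the chromosome gets 0 as its left bound.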

When this workflow was invoked, it saved the following file:
<ns2:getSNPByPositionResponse>
<SnpList>
<acn>rs12699208</acn>
<chromosome>Chr7</chromosome>
<name>rs12699208</name>
<position>11549694</position>
<sequence>(...)TATAGCTTCAACATATATGAAAAAAATGTCCACTGA[R]TAGTTCCTGGTGGAGAACTCTCCCATCTCTTTTG</sequence>
</SnpList>
<SnpList>
<acn>rs27</acn>
<chromosome>Chr7</chromosome>
<name>rs27</name>
<position>11549750</position>
<sequence>(..)CCCCCATTTGAGATCCTTCTTCATCTCACCTG[S]TACCTCTCAATCCCGGTGAACCAAAAGAGATGGG(...)</sequence>
</SnpList>

(...)

<SnpList>
<acn>rs7458209</acn>
<chromosome>Chr7</chromosome>
<name>rs7458209</name>
<position>11551560</position>
<sequence>(...)GAAAACTTTAGGAAGCAAACAT[Y]GTTTTATTAAGAAAACAGGTTAAGCAAGATGGCTGACAGGAAGAGCTTCTCC(...)</sequence>
</SnpList>
</ns2:getSNPByPositionResponse>

This workflow was then uploaded and shared. Please note that this web service is under development and might be replaced and/or switched off soon.

That's it !
Pierre