17 May 2010

The poor state of the Java web services for Bioinformatics

In his latest post, Brad Chapman cited Jessica Kissinger, who wished the Galaxy community could access the web services listed in the BioCatalogue (http://www.biocatalogue.org/). This reminded me of a thread I started on http://biostar.stackexchange.com/ : "Anyone using 'Biomart + Java Web Services' ?", where Michael Dondrup and I realized that support for the Java Web Services API for Biomart was poor.

I wanted to test ${JAVA_HOME}/bin/wsimport against all the services in the BioCatalogue: I wrote a small Java program that uses the BioCatalogue API (see below) to extract the web services that have a WSDL file. Each WSDL URI was processed with ${JAVA_HOME}/bin/wsimport, and I checked whether any class was generated. 'wsimport -version' reported JAX-WS RI 2.1.6 in JDK 6.
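As a rough illustration of the per-WSDL step described above, here is a minimal sketch. It is not the program used for this post: the class name `WsimportCheck` and its methods are hypothetical, and it assumes a JDK that still ships wsimport (the tool was removed from the JDK in Java 11). It relies only on the `-d` (output directory) and `-quiet` options of wsimport.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: runs wsimport on one WSDL URI and reports
// whether any class file was generated.
class WsimportCheck {

    /** Builds the wsimport command line for one WSDL URI. */
    static List<String> buildCommand(String javaHome, Path outDir, String wsdlUri) {
        String exe = Paths.get(javaHome, "bin", "wsimport").toString();
        // -quiet suppresses progress output, -d sets the output directory
        return Arrays.asList(exe, "-quiet", "-d", outDir.toString(), wsdlUri);
    }

    /** Returns true if at least one .class file exists under dir. */
    static boolean hasGeneratedClasses(Path dir) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.anyMatch(p -> p.toString().endsWith(".class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs wsimport in a fresh temporary directory; true if a class was produced. */
    static boolean tryImport(String javaHome, String wsdlUri)
            throws IOException, InterruptedException {
        Path outDir = Files.createTempDirectory("wsimport");
        Process process = new ProcessBuilder(buildCommand(javaHome, outDir, wsdlUri))
                .redirectErrorStream(true)
                // keep the tool's messages in a log file for later inspection
                .redirectOutput(outDir.resolve("wsimport.log").toFile())
                .start();
        process.waitFor();
        return hasGeneratedClasses(outDir);
    }
}
```

A call such as `WsimportCheck.tryImport(System.getenv("JAVA_HOME"), wsdlUri)` would then be made once per WSDL URI returned by the BioCatalogue API.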

The result is available as a Google spreadsheet at :

Result


Number of services: 1644
  Can't access the service, something went wrong: 6
  No WSDL: 6
  Found a WSDL: 1590


Number of services where wsimport failed to parse the WSDL: 1179 (74%)

Common Errors:
  690 : [ERROR] rpc/encoded wsdls are not supported in JAXWS 2.0.
  119 : [ERROR] undefined simple or complex type 'soapenc:Array'
  96 : [ERROR] 'EndpointReference' is already defined
  7 : [ERROR] only one "types" element allowed in "definitions"
  6 : [ERROR] undefined simple or complex type 'apachesoap:DataHandler'
  4 : [ERROR] only one of the "element" or "type" attributes is allowed in part "inDoc"
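A tally like the one above can be obtained by grouping the captured wsimport output by error message. The sketch below is only one way to do it and makes an assumption the post does not state: that each failed service is counted once, under the first `[ERROR]` line of its output. The class name `ErrorTally` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: counts services per wsimport error message.
class ErrorTally {

    /** Returns the first line starting with "[ERROR]", if any. */
    static Optional<String> firstError(List<String> outputLines) {
        for (String line : outputLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("[ERROR]")) {
                return Optional.of(trimmed);
            }
        }
        return Optional.empty();
    }

    /**
     * Takes one list of output lines per service and counts how many
     * services share the same first error message (assumption: one
     * error counted per service).
     */
    static Map<String, Integer> tally(List<List<String>> outputsPerService) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> output : outputsPerService) {
            firstError(output).ifPresent(msg -> counts.merge(msg, 1, Integer::sum));
        }
        return counts;
    }
}
```

Services whose output contains no `[ERROR]` line are simply left out of the map.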


Number of services successfully parsed by wsimport: 411 (26%)

Count by host:


Source Code




That's it
Pierre

6 comments:

Egon Willighagen said...

Nice!

Joerg Kurt Wegner said...

Oh wow, you would hope that kind of testing is done automatically and used for flagging services ! Did you already ask the BioCatalogue guys if they can perform this test automatically and on a regular basis?

Pierre Lindenbaum said...

@Joerg, nice suggestion, I'll suggest it to the BioCatalogue

Jits said...

Hello from a BioCatalogue person,

Interesting findings! And nice use of the BioCatalogue API :-) Very pleased that you provided the source code.

@Joerg, intriguing suggestion! For service monitoring we currently have regular availability checks and run pre-approved test scripts that test the functionality of individual services. Testing for certain toolkits/platforms (in this case, Java/wsimport) could have major benefits to users and providers who want to know whether the services work or not.

We are planning on allowing service providers etc to host/run tests like these and then publish the test results back to us via an API. This can allow us to aggregate all kinds of monitoring information. (We would mark these as "external tests" and give appropriate credit).

What do you think?

Alternatively, we could possibly wrap tests like these into test scripts and run them on our existing infrastructure.

One question I had: we are wary of ever flagging services as being "dysfunctional", so in the case of Java/wsimport how mature is that stack for web services client work?

Cheers,
Jits

Pierre Lindenbaum said...

@jits : "How mature is that stack for web services client work ". I'm not sure how I can answer this question; wsimport is a 'standard' tool released in the Java SDK. As far as I can see, the EBI generates its web services using wsgen and then, on the client side, wsimport can read those WSDLs without any problem. Java people are also using some other tools such as Apache Axis or CXF. I know those tools also fail to parse the 'old' WSDLs (see http://stackoverflow.com/questions/2479069/importing-a-webservice ).

Jits said...

@pierre: thanks for the clarification. I guess the world of SOAP/WSDL parsing is fraught with issues!