Showing posts with label interaction. Show all posts

18 September 2012

Describing protein-protein interactions in XML: customizing the xsd-schema with JAXB, my notebook.

Say you want to describe a network of protein-protein interactions using an XML format. Your XML schema will contain a set of:

  • Articles/References
  • Proteins
  • Interactions
Here is a simple XSD schema for this model:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.example.org/" targetNamespace="http://www.example.org/" elementFormDefault="qualified">
  <complexType name="article">
    <sequence>
      <element name="title" type="string"/>
      <element name="year" type="gYear"/>
    </sequence>
    <attribute name="pmid" type="ID" use="required"/>
  </complexType>
  <complexType name="protein">
    <sequence>
      <element name="acn" type="ID"/>
      <element name="description" type="string"/>
    </sequence>
  </complexType>
  <complexType name="interaction">
    <sequence>
      <element name="pmids" type="int" minOccurs="1" maxOccurs="unbounded"/>
      <element name="proteins" type="IDREF" minOccurs="1" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
  <complexType name="interactome">
    <sequence>
      <element name="article" type="tns:article" minOccurs="0" maxOccurs="unbounded"/>
      <element name="protein" type="tns:protein" minOccurs="0" maxOccurs="unbounded"/>
      <element name="interaction" type="tns:interaction" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
  <element name="interactome" type="tns:interactome"/>
</schema>
Here, the attributes 'type="ID"' and 'type="IDREF"' are used to link the entities (one protein can be part of several interactions, ...).
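To make the ID/IDREF link concrete, here is a minimal sketch that resolves the references by hand with the DOM parser shipped with the JDK, without JAXB. The instance document, the accession numbers (PROT_A, PROT_B), the PMID value and the class name InteractomeLinks are all made up for illustration; only the element names and the namespace come from the schema above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class InteractomeLinks {
    // Target namespace declared in the schema above.
    static final String NS = "http://www.example.org/";

    // Hypothetical instance document: accession numbers and PMID are invented.
    static final String XML =
          "<interactome xmlns='http://www.example.org/'>"
        + "<protein><acn>PROT_A</acn><description>first protein</description></protein>"
        + "<protein><acn>PROT_B</acn><description>second protein</description></protein>"
        + "<interaction><pmids>12345</pmids>"
        + "<proteins>PROT_A</proteins><proteins>PROT_B</proteins></interaction>"
        + "</interactome>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the schema uses qualified elements
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    // Text of the first descendant element with the given local name.
    static String childText(Element parent, String localName) {
        return parent.getElementsByTagNameNS(NS, localName).item(0).getTextContent().trim();
    }

    // For each interaction, replace every IDREF by the description of the protein it points to.
    static List<List<String>> resolveInteractions(String xml) {
        Document doc = parse(xml);
        // Index the proteins by their ID (the 'acn' element).
        Map<String, String> descriptionByAcn = new LinkedHashMap<>();
        NodeList proteins = doc.getElementsByTagNameNS(NS, "protein");
        for (int i = 0; i < proteins.getLength(); i++) {
            Element protein = (Element) proteins.item(i);
            descriptionByAcn.put(childText(protein, "acn"), childText(protein, "description"));
        }
        List<List<String>> result = new ArrayList<>();
        NodeList interactions = doc.getElementsByTagNameNS(NS, "interaction");
        for (int i = 0; i < interactions.getLength(); i++) {
            NodeList refs = ((Element) interactions.item(i)).getElementsByTagNameNS(NS, "proteins");
            List<String> partners = new ArrayList<>();
            for (int j = 0; j < refs.getLength(); j++) {
                String acn = refs.item(j).getTextContent().trim();
                String description = descriptionByAcn.get(acn);
                if (description == null) {
                    throw new IllegalStateException("dangling IDREF: " + acn);
                }
                partners.add(description);
            }
            result.add(partners);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(resolveInteractions(XML));
    }
}
```

This lookup by identifier is the kind of work one would like the generated classes to do, which is what the rest of this post is about.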
One can generate the Java classes for those types using ${JAVA_HOME}/bin/xjc:
$ xjc  interactome.xsd
parsing a schema...
compiling a schema...
org/example/Article.java
org/example/Interaction.java
org/example/Interactome.java
org/example/ObjectFactory.java
org/example/Protein.java
org/example/package-info.java
Problem: xjc doesn't know the exact nature of the links created between ID and IDREF: what kind of object should the method 'getProteins' of the class 'Interaction' return? As a consequence, xjc generates the following code:

$ more org/example/Interaction.java

    (...)
    protected List<JAXBElement<Object>> proteins;
    (...)
    public List<JAXBElement<Object>> getProteins()

We can tell xjc about those links by creating a binding file (JXB). In the following file, we tell xjc that the entities linked by 'proteins' should be instances of 'Protein':
<?xml version="1.0" encoding="UTF-8"?>
<jxb:bindings xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" jxb:version="2.1">
  <jxb:bindings schemaLocation="interactome.xsd">
    <jxb:bindings node="/xs:schema/xs:complexType[@name='interaction']/xs:sequence">
      <jxb:bindings node="xs:element[@name='proteins']">
        <jxb:property>
          <jxb:baseType name="Protein"/>
        </jxb:property>
      </jxb:bindings>
    </jxb:bindings>
  </jxb:bindings>
</jxb:bindings>

Invoking XJC with the bindings:
$ xjc -b interactome.jxb  interactome.xsd
parsing a schema...
compiling a schema...
org/example/Article.java
org/example/Interaction.java
org/example/Interactome.java
org/example/ObjectFactory.java
org/example/Protein.java
org/example/package-info.java

The generated class 'Interaction.java' now contains the correct java type:


$ more org/example/Interaction.java

   (...)
    protected List<Protein> proteins;
    (...)
    public List<Protein> getProteins() {
     (....)

That's it,
Pierre

03 February 2011

Using the #Gephi toolkit to draw a graph from PSI-MI data, my notebook

Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs. Recently a Java library, the Gephi Toolkit, was released; it was used by LinkedIn to generate its InMaps.

I've been playing with the Toolkit, to generate a graph from a PSI-MI file downloaded from EMBL-Strings. The source code is available on github:



In short: the program reads the PSI-MI XML, uses XPath to find the nodes and the edges, creates the graph, applies a layout algorithm (YifanHuLayout) and outputs the result (SVG, PDF, GEXF...). I could also have used XSLT to convert the PSI-MI file into a GEXF file (the native file format for Gephi). However, it was not clear how I could change the output to insert a hyperlink, change the background color, etc. The online javadoc was missing a lot of information.
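The XPath step can be sketched with the JDK alone. This is not the program from the repository and it does not call the Gephi Toolkit: it only extracts node labels and edges. The input is a simplified, namespace-free fragment that I assume resembles PSI-MI XML (interactor, shortLabel, interactorRef); real files declare a namespace and contain many more elements. The partner name PARTNER1 and the class name PsimiXPathSketch are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PsimiXPathSketch {
    // Simplified, hypothetical PSI-MI-like document (no namespace, invented partner).
    static final String XML =
          "<entrySet><entry>"
        + "<interactorList>"
        + "<interactor id='1'><names><shortLabel>HOPX</shortLabel></names></interactor>"
        + "<interactor id='2'><names><shortLabel>PARTNER1</shortLabel></names></interactor>"
        + "</interactorList>"
        + "<interactionList><interaction><participantList>"
        + "<participant><interactorRef>1</interactorRef></participant>"
        + "<participant><interactorRef>2</interactorRef></participant>"
        + "</participantList></interaction></interactionList>"
        + "</entry></entrySet>";

    // Returns one "labelA--labelB" string per pair of participants of each interaction.
    static List<String> edges(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Nodes: map each interactor id to its short label.
            Map<String, String> labelById = new LinkedHashMap<>();
            NodeList interactors =
                (NodeList) xpath.evaluate("//interactor", doc, XPathConstants.NODESET);
            for (int i = 0; i < interactors.getLength(); i++) {
                Element interactor = (Element) interactors.item(i);
                labelById.put(interactor.getAttribute("id"),
                              xpath.evaluate("names/shortLabel", interactor));
            }

            // Edges: connect every pair of participants within one interaction.
            List<String> edges = new ArrayList<>();
            NodeList interactions =
                (NodeList) xpath.evaluate("//interaction", doc, XPathConstants.NODESET);
            for (int i = 0; i < interactions.getLength(); i++) {
                NodeList refs = (NodeList) xpath.evaluate(
                    "participantList/participant/interactorRef",
                    interactions.item(i), XPathConstants.NODESET);
                for (int a = 0; a < refs.getLength(); a++) {
                    for (int b = a + 1; b < refs.getLength(); b++) {
                        edges.add(labelById.get(refs.item(a).getTextContent().trim())
                            + "--" + labelById.get(refs.item(b).getTextContent().trim()));
                    }
                }
            }
            return edges;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot read document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(edges(XML));
    }
}
```

In the real program, each label would become a node and each pair an edge of the graph before the layout is applied.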

Compilation

javac -cp /path/to/gephi-toolkit.jar -sourcepath src -d src src/sandbox/PsimiWithGephi.java

Generating a PDF for HOPX

I downloaded the psi-mi.xml for HOPX from the EMBL-Strings database and ran the program:
java -cp /path/to/gephi-toolkit.jar:src sandbox.PsimiWithGephi -o ~/result.pdf file.xml

Result


Viewing the result with Flash

The API can export a GEXF file too and it can be visualized using a flash application named GEXF Explorer:

Note: "Sigma" is another viewer available for gexf: http://ofnodesandedges.com/sigma-neighborhoods-exploration/

That's it,

Pierre

03 September 2009

Building a naive Interactome Database with Hibernate.

This is my notebook for building a naive database of protein-protein interactions with Hibernate (a java object/relational persistence and query service).

Files and Directories


./project
./project/lib
./project/src
./project/src/hibernate.cfg.xml
./project/src/org
./project/src/org/lindenb
./project/src/org/lindenb/hbn01
./project/src/org/lindenb/hbn01/Journal.java
./project/src/org/lindenb/hbn01/mapping.hbm.xml
./project/src/org/lindenb/hbn01/Article.java
./project/src/org/lindenb/hbn01/Main.java
./project/src/org/lindenb/hbn01/PMID.java
./project/src/org/lindenb/hbn01/Interactor.java
./project/src/org/lindenb/hbn01/Complex.java
./project/src/org/lindenb/hbn01/Protein.java
./project/src/org/lindenb/hbn01/PMIDType.java
./project/src/log4j.properties
./project/build
./project/bin
./derby.log
./build
./build/db
./Makefile

The components / Java Classes


Interactor


A base class defining a protein or a complex: just a name and an ID. It is not meant to be instantiated directly (its constructors are protected).
package org.lindenb.hbn01;

public class Interactor
implements java.io.Serializable
{
private Long id;
private String name;
protected Interactor()
{
}

protected Interactor(String name)
{
setName(name);
}

private void setId(Long id)
{
this.id=id;
}

public Long getId()
{
return this.id;
}
public String getName()
{
return this.name;
}
public void setName(String name)
{
this.name=name;
}

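@Override
public boolean equals(Object o)
{
if(o==this) return true;
if(o==null || o.getClass()!=getClass()) return false;
// an unsaved object (null id) is only equal to itself
return getId()!=null && getId().equals(Interactor.class.cast(o).getId());
}

// equals() is overridden, so hashCode() must stay consistent with it
@Override
public int hashCode()
{
return getId()==null ? 0 : getId().hashCode();
}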

@Override
public String toString()
{
return getClass().getName()+":"+getName()+"("+getId()+")";
}
}

Protein

Protein is a concrete subclass of Interactor. This could be a UniGene entry.
package org.lindenb.hbn01;

public class Protein
extends Interactor
{
public Protein()
{
}

public Protein(String name)
{
super(name);
}

@Override
public String toString()
{
return "Protein:"+getName();
}
}

Complex

Complex is a concrete subclass of Interactor. It is a Set of Interactors. It also contains a Set of Articles holding the references for those interactions.
package org.lindenb.hbn01;
import java.util.*;

public class Complex
extends Interactor
{
private Set<Interactor> partners= new HashSet<Interactor>();
private Set<Article> articles= new HashSet<Article>();
public Complex()
{
}

public Complex(String name)
{
super(name);
}

public Set<Interactor> getPartners()
{
return this.partners;
}
public void setPartners(Set<Interactor> partners)
{
this.partners = partners;
}
public Set<Article> getArticles()
{
return this.articles;
}

public void setArticles(Set<Article> articles)
{
this.articles = articles;
}
@Override
public String toString()
{
String s="Complex:"+getName()+". ID:"+getId()+" interacts with";
for(Interactor i: getPartners())
{
s+=" "+i.getName();
}
return s;
}
}
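Because a Complex can contain other complexes, listing the proteins it is ultimately made of requires a recursive walk. Here is a minimal sketch, using simplified stand-ins for the classes above (no Hibernate, no ids); the class name ComplexFlattenSketch and the method proteinNames are mine and are not part of the project.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class ComplexFlattenSketch {
    // Simplified stand-ins for Interactor, Protein and Complex.
    static class Interactor {
        final String name;
        Interactor(String name) { this.name = name; }
    }

    static class Protein extends Interactor {
        Protein(String name) { super(name); }
    }

    static class Complex extends Interactor {
        final Set<Interactor> partners = new LinkedHashSet<>();
        Complex(String name) { super(name); }
    }

    // Names of all the proteins found under 'root', in order of first visit.
    static Set<String> proteinNames(Interactor root) {
        Set<String> names = new LinkedHashSet<>();
        collect(root, names, new HashSet<Interactor>());
        return names;
    }

    private static void collect(Interactor node, Set<String> names, Set<Interactor> visited) {
        // 'visited' protects against a complex that (indirectly) contains itself.
        if (!visited.add(node)) return;
        if (node instanceof Complex) {
            for (Interactor partner : ((Complex) node).partners) {
                collect(partner, names, visited);
            }
        } else {
            names.add(node.name);
        }
    }

    // A complex of complexes: cplx2 = {prot1, cplx1} and cplx1 = {prot1, prot2}.
    static Set<String> demo() {
        Protein prot1 = new Protein("prot1");
        Protein prot2 = new Protein("prot2");
        Complex c1 = new Complex("cplx1");
        c1.partners.add(prot1);
        c1.partners.add(prot2);
        Complex c2 = new Complex("cplx2");
        c2.partners.add(prot1);
        c2.partners.add(c1);
        return proteinNames(c2);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The visited set is a design choice: nothing in the model prevents a cycle between complexes, so the walk stops when it meets an interactor it has already seen.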

Article

An Article is a reference to a paper in PubMed. I wanted to try the custom data types of Hibernate, so I used the class PMID rather than an Integer. Each Article is linked to a Journal.
package org.lindenb.hbn01;

public class Article
implements java.io.Serializable
{
private PMID pmid;
private String title;
private Integer year;
private String doi;
private Journal journal;

public Article()
{
}

public Article(PMID pmid,Journal journal,Integer year,String title)
{
setPmid(pmid);
setJournal(journal);
setYear(year);
setTitle(title);
}

public Journal getJournal()
{
return journal;
}

public String getDoi()
{
return this.doi;
}

public void setDoi(String doi)
{
this.doi=doi;
}

public void setJournal(Journal journal)
{
this.journal=journal;
}

private void setPmid(PMID pmid)
{
this.pmid=pmid;
}
public PMID getPmid()
{
return this.pmid;
}
public void setTitle(String title)
{
this.title=title;
}
public String getTitle()
{
return this.title;
}
public void setYear(Integer year)
{
this.year=year;
}
public Integer getYear()
{
return this.year;
}

public String toString()
{
return "("+getYear()+")\""+getTitle()+"\"."+getJournal().getTitle();
}
}

PMID

A custom type holding a Pubmed identifier
package org.lindenb.hbn01;
import org.hibernate.*;
import java.io.Serializable;

public class PMID
implements java.io.Serializable
{
private long pmid;

public PMID(String pmid)
{
// Long.parseLong avoids the deprecated Long(String) constructor
this(Long.parseLong(pmid));
}

public PMID(long pmid)
{
this.pmid=pmid;
}

public long value()
{
return this.pmid;
}

public int hashCode()
{
return 31+(int)this.pmid;
}

public boolean equals(Object o)
{
if(o==this) return true;
if(o==null || !(o instanceof PMID)) return false;
return PMID.class.cast(o).pmid==this.pmid;
}

public String toString()
{
return String.valueOf(this.pmid);
}
}

Journal

A Journal is a NLM-Id and a title
package org.lindenb.hbn01;

public class Journal
implements java.io.Serializable
{
private long nlmId;
private String title;


public Journal()
{
}

public Journal(long nlmId,String title)
{
setNlmId(nlmId);
setTitle(title);
}

private void setNlmId(long nlmId)
{
this.nlmId=nlmId;
}
public Long getNlmId()
{
return this.nlmId;
}
public void setTitle(String title)
{
this.title=title;
}
public String getTitle()
{
return this.title;
}

public String toString()
{
return getTitle()+"["+getNlmId()+"]";
}
}

Using a Custom Type

PMIDType implements EnhancedUserType. Hibernate will use this class to manage the class PMID (how to read/write it from/to the database).
package org.lindenb.hbn01;
import org.hibernate.*;
import org.hibernate.usertype.EnhancedUserType;
import java.sql.*;
import java.io.Serializable;

public class PMIDType
implements EnhancedUserType
{


public int[] sqlTypes() {
return new int[]{Types.INTEGER};
}


public Object assemble(Serializable cached,
Object owner)
throws HibernateException
{
return cached;
}

public Serializable disassemble(Object value)
throws HibernateException
{
return Serializable.class.cast(value);
}

public boolean isMutable() { return false;}
public Object deepCopy(Object value)
{
return value;
}
public boolean equals(Object a, Object b)
{
return a==null?b==null:a.equals(b);
}

public int hashCode(Object x) throws HibernateException
{
return x==null?0:x.hashCode();
}

public Object nullSafeGet(ResultSet rs,
String[] names,
Object owner)
throws HibernateException,
SQLException
{
Object o = rs.getObject( names[0] );
if(rs.wasNull()) return null;
if(o instanceof Number)
{
return new PMID(Number.class.cast(o).longValue());
}
else if(o instanceof String)
{
return new PMID(String.class.cast(o));
}
throw new IllegalArgumentException("Bad class "+o.getClass());
}

public void nullSafeSet(PreparedStatement st,
Object value,
int index)
throws HibernateException, SQLException
{
if(value==null)
{
st.setNull( index, Types.INTEGER );
}
else
{
st.setLong(index,PMID.class.cast(value).value());
}
}


public Object replace(Object original,
Object target,
Object owner)
throws HibernateException
{
return original;
}

public Class<?> returnedClass()
{
return PMID.class;
}


public Object fromXMLString(String xmlValue)
{
return xmlValue==null? null: new PMID(Long.parseLong(xmlValue));
}
public String objectToSQLString(Object value)
{
return value==null? null: String.valueOf(PMID.class.cast(value).value());
}
public String toXMLString(Object value)
{
return value==null? null: String.valueOf(PMID.class.cast(value).value());
}
}

The mapping file

The file mapping.hbm.xml tells Hibernate how the classes are linked to each other.
<!DOCTYPE hibernate-mapping PUBLIC
"-//Hibernate/Hibernate Mapping DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping package="org.lindenb.hbn01" default-cascade="none" default-access="property" default-lazy="true" auto-import="true">

<class name="org.lindenb.hbn01.Article" table="Article" mutable="true" polymorphism="implicit" dynamic-update="false" dynamic-insert="false" select-before-update="false" optimistic-lock="version">
<meta attribute="class-description" inherit="true">A pubmed Article</meta>
<id name="pmid" column="pmid" type="org.lindenb.hbn01.PMIDType">
<meta attribute="field-description" inherit="true">Pubmed Identifier</meta>
<generator class="assigned"/>
</id>
<property name="title" not-null="true" unique="false" optimistic-lock="true" lazy="false" generated="never"/>
<property name="doi" unique="true" type="string" optimistic-lock="true" lazy="false" generated="never"/>
<property name="year" column="yearDate" type="integer" not-null="true" unique="false" optimistic-lock="true" lazy="false" generated="never"/>
<many-to-one name="journal" column="nlmId" not-null="true" unique="false" update="true" insert="true" optimistic-lock="true" not-found="exception" embed-xml="true"/>
</class>

<class name="Journal" mutable="true" polymorphism="implicit" dynamic-update="false" dynamic-insert="false" select-before-update="false" optimistic-lock="version">
<id name="nlmId" column="nlmId" type="long">
<generator class="assigned"/>
</id>
<property name="title" not-null="true" unique="false" optimistic-lock="true" lazy="false" generated="never"/>
</class>

<class name="Interactor" mutable="true" polymorphism="implicit" dynamic-update="false" dynamic-insert="false" select-before-update="false" optimistic-lock="version">
<id name="id" type="long">
<generator class="native"/>
</id>
<property name="name" not-null="true" unique="false" optimistic-lock="true" lazy="false" generated="never"/>

<joined-subclass name="Protein" dynamic-update="false" dynamic-insert="false" select-before-update="false">
<key column="interactorId" on-delete="noaction"/>
</joined-subclass>

<joined-subclass name="Complex" dynamic-update="false" dynamic-insert="false" select-before-update="false">
<key column="interactorId" on-delete="noaction"/>

<set name="partners" table="interactions" sort="unsorted" inverse="false" mutable="true" optimistic-lock="true" embed-xml="true">
<key column="complex_id" on-delete="noaction"/>
<many-to-many column="interactor_id" class="Interactor" embed-xml="true" not-found="exception" unique="false"/>
</set>

</joined-subclass>
</class>

</hibernate-mapping>

Configuring Hibernate


The file hibernate.cfg.xml describes the database we are using for persisting all the entities (driver, uri, login, password...). Here I've used JavaDB/Derby (note: the 'unique' directive seems to have been ignored by Derby).
<!DOCTYPE hibernate-configuration PUBLIC
"-//Hibernate/Hibernate Configuration DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>

<session-factory>

<!-- Database connection settings -->
<property name="connection.driver_class">org.apache.derby.jdbc.EmbeddedDriver</property>
<property name="connection.url">jdbc:derby:build/db/derby/hibernate;create=true</property>
<property name="connection.username">sa</property>
<property name="connection.password"/>

<!-- JDBC connection pool (use the built-in) -->
<property name="connection.pool_size">1</property>

<!-- SQL dialect -->
<property name="dialect">org.hibernate.dialect.DerbyDialect</property>

<!-- Enable Hibernate's automatic session context management -->
<property name="current_session_context_class">thread</property>

<!-- Disable the second-level cache -->
<property name="cache.provider_class">org.hibernate.cache.NoCacheProvider</property>

<!-- Echo all executed SQL to stdout -->
<property name="show_sql">true</property>

<!-- Drop and re-create the database schema on startup -->
<property name="hbm2ddl.auto">create</property>

<mapping resource="org/lindenb/hbn01/mapping.hbm.xml"/>

</session-factory>

</hibernate-configuration>

Running


Building a Session factory

sessionFactory = new Configuration().configure().buildSessionFactory();

Creating an Interactome

Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
Journal journal = new Journal(1L,"PNAS");
session.save(journal);
Article article= new Article(new PMID(12234),journal,1988,"Article title 1");
article.setDoi("1");
session.save(article);
article= new Article(new PMID(456789),journal,1989,"Article title 2");
article.setDoi("2");
session.save(article);


Protein prot1= new Protein("prot1");
session.save(prot1);
Protein prot2= new Protein("prot2");
session.save(prot2);
Protein prot3= new Protein("prot3");
session.save(prot3);
Complex c1= new Complex("cplx1");
c1.getPartners().add(prot1);
c1.getPartners().add(prot2);
session.save(c1);
Complex c2= new Complex("cplx2");
c2.getPartners().add(prot1);
c2.getPartners().add(c1);
c2.getArticles().add(article);
session.save(c2);

session.getTransaction().commit();

Querying

Listing the Journals
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Journal").list();

for(Object o:list)
{
System.out.println(o);
}
session.getTransaction().commit();

Listing the Articles
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Article").list();

for(Object o:list)
{
System.out.println(o);
}
session.getTransaction().commit();
Listing the Interactors
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Interactor").list();

for(Object o:list)
{
System.out.println("\n\n###\t"+o+"\n\n");
}
session.getTransaction().commit();

Full code

package org.lindenb.hbn01;

import org.hibernate.*;
import org.hibernate.cfg.*;
import java.util.*;

public class Main
{
private static final SessionFactory sessionFactory;

static
{
try
{
sessionFactory = new Configuration().configure().buildSessionFactory();
}
catch(Throwable err)
{
err.printStackTrace();
throw new ExceptionInInitializerError(err);
}
}

public static SessionFactory getSessionFactory()
{
return Main.sessionFactory;
}

private void listJournals()
{
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Journal").list();

for(Object o:list)
{
System.out.println(o);
}
session.getTransaction().commit();
}

private void listArticles()
{
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Article").list();

for(Object o:list)
{
System.out.println(o);
}
session.getTransaction().commit();
}

private void listInteractors()
{
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
List list = session.createQuery("from Interactor").list();

for(Object o:list)
{
System.out.println("\n\n###\t"+o+"\n\n");
}
session.getTransaction().commit();
}

public void run()
{
Session session= getSessionFactory().getCurrentSession();
session.beginTransaction();
Journal journal = new Journal(1L,"PNAS");
session.save(journal);
Article article= new Article(new PMID(12234),journal,1988,"Article title 1");
article.setDoi("1");
session.save(article);
article= new Article(new PMID(456789),journal,1989,"Article title 2");
article.setDoi("2");
session.save(article);


Protein prot1= new Protein("prot1");
session.save(prot1);
Protein prot2= new Protein("prot2");
session.save(prot2);
Protein prot3= new Protein("prot3");
session.save(prot3);
Complex c1= new Complex("cplx1");
c1.getPartners().add(prot1);
c1.getPartners().add(prot2);
session.save(c1);
Complex c2= new Complex("cplx2");
c2.getPartners().add(prot1);
c2.getPartners().add(c1);
c2.getArticles().add(article);
session.save(c2);

session.getTransaction().commit();

listJournals();
listArticles();
listInteractors();
}

public static void main(String args[])
{
try
{
Main app= new Main();
app.run();
}
catch(Throwable err)
{
err.printStackTrace();
}
finally
{
if(Main.sessionFactory!=null) Main.sessionFactory.close();
}
System.out.println("Done.");
}
}

Compiling

LIB=${HIBERNATE_HOME}/lib
LIBS=${LIB}/antlr-2.7.6.jar:${LIB}/cglib-2.1.3.jar:${LIB}/asm.jar:${LIB}/asm-attrs.jar:${LIB}/commons-collections-2.1.1.jar:${LIB}/commons-logging-1.0.4.jar:${HIBERNATE_HOME}/hibernate3.jar:${LIB}/jta.jar:${LIB}/dom4j-1.6.1.jar:${LIB}/log4j-1.2.11.jar:${DERBY_HOME}/derby.jar
test:
	cp -r project/src/* project/build
	javac -cp ${LIBS} -d project/build -sourcepath project/build project/build/org/lindenb/hbn01/*.java
	jar cvf project/bin/project.jar -C project/build .
	java -cp ${LIBS}:project/bin/project.jar org.lindenb.hbn01.Main

Output

21:49:15,396 INFO Environment:514 - Hibernate 3.2.6
21:49:15,402 INFO Environment:547 - hibernate.properties not found
21:49:15,405 INFO Environment:681 - Bytecode provider name : cglib
21:49:15,409 INFO Environment:598 - using JDK 1.4 java.sql.Timestamp handling
21:49:15,457 INFO Configuration:1432 - configuring from resource: /hibernate.cfg.xml
21:49:15,458 INFO Configuration:1409 - Configuration resource: /hibernate.cfg.xml
21:49:15,546 INFO Configuration:559 - Reading mappings from resource : org/lindenb/hbn01/mapping.hbm.xml
21:49:15,682 INFO HbmBinder:300 - Mapping class: org.lindenb.hbn01.Article -> Article
21:49:15,751 INFO HbmBinder:300 - Mapping class: org.lindenb.hbn01.Journal -> Journal
21:49:15,752 INFO HbmBinder:300 - Mapping class: org.lindenb.hbn01.Interactor -> Interactor
21:49:15,778 INFO HbmBinder:873 - Mapping joined-subclass: org.lindenb.hbn01.Protein -> Protein
21:49:15,780 INFO HbmBinder:873 - Mapping joined-subclass: org.lindenb.hbn01.Complex -> Complex
21:49:15,781 INFO HbmBinder:1419 - Mapping collection: org.lindenb.hbn01.Complex.partners -> interactions
21:49:15,783 INFO Configuration:1547 - Configured SessionFactory: null
21:49:15,802 INFO DriverManagerConnectionProvider:41 - Using Hibernate built-in connection pool (not for production use!)
21:49:15,803 INFO DriverManagerConnectionProvider:42 - Hibernate connection pool size: 1
21:49:15,803 INFO DriverManagerConnectionProvider:45 - autocommit mode: false
21:49:16,036 INFO DriverManagerConnectionProvider:80 - using driver: org.apache.derby.jdbc.EmbeddedDriver at URL: jdbc:derby:build/db/derby/hibernate;create=true
21:49:16,036 INFO DriverManagerConnectionProvider:86 - connection properties: {user=sa, password=****}
21:49:18,144 INFO SettingsFactory:89 - RDBMS: Apache Derby, version: 10.2.2.1 - (538595)
21:49:18,145 INFO SettingsFactory:90 - JDBC driver: Apache Derby Embedded JDBC Driver, version: 10.2.2.1 - (538595)
21:49:18,158 INFO Dialect:152 - Using dialect: org.hibernate.dialect.DerbyDialect
21:49:18,165 INFO TransactionFactoryFactory:31 - Using default transaction strategy (direct JDBC transactions)
21:49:18,167 INFO TransactionManagerLookupFactory:33 - No TransactionManagerLookup configured (in JTA environment, use of read-write or transactional second-level cache is not recommended)
21:49:18,167 INFO SettingsFactory:143 - Automatic flush during beforeCompletion(): disabled
21:49:18,167 INFO SettingsFactory:147 - Automatic session close at end of transaction: disabled
21:49:18,168 INFO SettingsFactory:162 - Scrollable result sets: enabled
21:49:18,168 INFO SettingsFactory:170 - JDBC3 getGeneratedKeys(): disabled
21:49:18,169 INFO SettingsFactory:178 - Connection release mode: auto
21:49:18,169 INFO SettingsFactory:205 - Default batch fetch size: 1
21:49:18,170 INFO SettingsFactory:209 - Generate SQL with comments: disabled
21:49:18,170 INFO SettingsFactory:213 - Order SQL updates by primary key: disabled
21:49:18,170 INFO SettingsFactory:217 - Order SQL inserts for batching: disabled
21:49:18,170 INFO SettingsFactory:386 - Query translator: org.hibernate.hql.ast.ASTQueryTranslatorFactory
21:49:18,172 INFO ASTQueryTranslatorFactory:24 - Using ASTQueryTranslatorFactory
21:49:18,173 INFO SettingsFactory:225 - Query language substitutions: {}
21:49:18,173 INFO SettingsFactory:230 - JPA-QL strict compliance: disabled
21:49:18,173 INFO SettingsFactory:235 - Second-level cache: enabled
21:49:18,173 INFO SettingsFactory:239 - Query cache: disabled
21:49:18,174 INFO SettingsFactory:373 - Cache provider: org.hibernate.cache.NoCacheProvider
21:49:18,174 INFO SettingsFactory:254 - Optimize cache for minimal puts: disabled
21:49:18,174 INFO SettingsFactory:263 - Structured second-level cache entries: disabled
21:49:18,178 INFO SettingsFactory:283 - Echoing all SQL to stdout
21:49:18,178 INFO SettingsFactory:290 - Statistics: disabled
21:49:18,178 INFO SettingsFactory:294 - Deleted entity synthetic identifier rollback: disabled
21:49:18,178 INFO SettingsFactory:309 - Default entity-mode: pojo
21:49:18,179 INFO SettingsFactory:313 - Named query checking : enabled
21:49:18,206 INFO SessionFactoryImpl:161 - building session factory
21:49:18,472 INFO SessionFactoryObjectFactory:82 - Not binding factory to JNDI, no JNDI name configured
21:49:18,477 INFO SchemaExport:154 - Running hbm2ddl schema export
21:49:18,477 DEBUG SchemaExport:170 - import file not found: /import.sql
21:49:18,478 INFO SchemaExport:179 - exporting generated schema to database
21:49:18,482 DEBUG SchemaExport:303 - alter table Article drop constraint FK379164D684F84236
21:49:18,754 DEBUG SchemaExport:303 - alter table Complex drop constraint FK9BDFFC90D86C24B8
21:49:18,798 DEBUG SchemaExport:303 - alter table Protein drop constraint FK50CD6F63D86C24B8
21:49:18,834 DEBUG SchemaExport:303 - alter table interactions drop constraint FK4F6EF4A127BFBBC5
21:49:18,903 DEBUG SchemaExport:303 - alter table interactions drop constraint FK4F6EF4A1EBB9AEEF
21:49:19,029 DEBUG SchemaExport:303 - drop table Article
21:49:19,173 DEBUG SchemaExport:303 - drop table Complex
21:49:19,276 DEBUG SchemaExport:303 - drop table Interactor
21:49:19,410 DEBUG SchemaExport:303 - drop table Journal
21:49:19,537 DEBUG SchemaExport:303 - drop table Protein
21:49:19,654 DEBUG SchemaExport:303 - drop table interactions
21:49:19,814 DEBUG SchemaExport:303 - drop table hibernate_unique_key
21:49:19,896 DEBUG SchemaExport:303 - create table Article (pmid integer not null, title varchar(255) not null, doi varchar(255), yearDate integer not null, nlmId bigint not null, primary key (pmid))
21:49:20,063 DEBUG SchemaExport:303 - create table Complex (interactorId bigint not null, primary key (interactorId))
21:49:20,224 DEBUG SchemaExport:303 - create table Interactor (id bigint not null, name varchar(255) not null, primary key (id))
21:49:20,359 DEBUG SchemaExport:303 - create table Journal (nlmId bigint not null, title varchar(255) not null, primary key (nlmId))
21:49:20,580 DEBUG SchemaExport:303 - create table Protein (interactorId bigint not null, primary key (interactorId))
21:49:20,725 DEBUG SchemaExport:303 - create table interactions (complex_id bigint not null, interactor_id bigint not null, primary key (complex_id, interactor_id))
21:49:20,858 DEBUG SchemaExport:303 - alter table Article add constraint FK379164D684F84236 foreign key (nlmId) references Journal
21:49:20,997 DEBUG SchemaExport:303 - alter table Complex add constraint FK9BDFFC90D86C24B8 foreign key (interactorId) references Interactor
21:49:21,055 DEBUG SchemaExport:303 - alter table Protein add constraint FK50CD6F63D86C24B8 foreign key (interactorId) references Interactor
21:49:21,086 DEBUG SchemaExport:303 - alter table interactions add constraint FK4F6EF4A127BFBBC5 foreign key (interactor_id) references Interactor
21:49:21,202 DEBUG SchemaExport:303 - alter table interactions add constraint FK4F6EF4A1EBB9AEEF foreign key (complex_id) references Complex
21:49:21,331 DEBUG SchemaExport:303 - create table hibernate_unique_key ( next_hi integer )
21:49:21,376 DEBUG SchemaExport:303 - insert into hibernate_unique_key values ( 0 )
21:49:21,500 INFO SchemaExport:196 - schema export complete
21:49:21,501 WARN JDBCExceptionReporter:54 - SQL Warning: 10000, SQLState: 01J01
21:49:21,501 WARN JDBCExceptionReporter:55 - Database 'build/db/derby/hibernate' not created, connection made to existing database instead.
21:49:21,724 WARN JDBCExceptionReporter:54 - SQL Warning: 10000, SQLState: 01J01
21:49:21,724 WARN JDBCExceptionReporter:55 - Database 'build/db/derby/hibernate' not created, connection made to existing database instead.
Hibernate: insert into Journal (title, nlmId) values (?, ?)
Hibernate: insert into Article (title, doi, yearDate, nlmId, pmid) values (?, ?, ?, ?, ?)
Hibernate: insert into Article (title, doi, yearDate, nlmId, pmid) values (?, ?, ?, ?, ?)
Hibernate: insert into Interactor (name, id) values (?, ?)
Hibernate: insert into Protein (interactorId) values (?)
Hibernate: insert into Interactor (name, id) values (?, ?)
Hibernate: insert into Protein (interactorId) values (?)
Hibernate: insert into Interactor (name, id) values (?, ?)
Hibernate: insert into Protein (interactorId) values (?)
Hibernate: insert into Interactor (name, id) values (?, ?)
Hibernate: insert into Complex (interactorId) values (?)
Hibernate: insert into Interactor (name, id) values (?, ?)
Hibernate: insert into Complex (interactorId) values (?)
Hibernate: insert into interactions (complex_id, interactor_id) values (?, ?)
Hibernate: insert into interactions (complex_id, interactor_id) values (?, ?)
Hibernate: insert into interactions (complex_id, interactor_id) values (?, ?)
Hibernate: insert into interactions (complex_id, interactor_id) values (?, ?)
Hibernate: select journal0_.nlmId as nlmId1_, journal0_.title as title1_ from Journal journal0_
PNAS[1]
Hibernate: select article0_.pmid as pmid0_, article0_.title as title0_, article0_.doi as doi0_, article0_.yearDate as yearDate0_, article0_.nlmId as nlmId0_ from Article article0_
Hibernate: select journal0_.nlmId as nlmId1_0_, journal0_.title as title1_0_ from Journal journal0_ where journal0_.nlmId=?
(1988)"Article title 1".PNAS
(1989)"Article title 2".PNAS

Hibernate: select interactor0_.id as id2_, interactor0_.name as name2_, case when interactor0_1_.interactorId is not null then 1 when interactor0_2_.interactorId is not null then 2 when interactor0_.id is not null then 0 else -1 end as clazz_ from Interactor interactor0_ left outer join Protein interactor0_1_ on interactor0_.id=interactor0_1_.interactorId left outer join Complex interactor0_2_ on interactor0_.id=interactor0_2_.interactorId


### Protein:prot1




### Protein:prot2




### Protein:prot3


Hibernate: select partners0_.complex_id as complex1_1_, partners0_.interactor_id as interactor2_1_, interactor1_.id as id2_0_, interactor1_.name as name2_0_, case when interactor1_1_.interactorId is not null then 1 when interactor1_2_.interactorId is not null then 2 when interactor1_.id is not null then 0 else -1 end as clazz_0_ from interactions partners0_ left outer join Interactor interactor1_ on partners0_.interactor_id=interactor1_.id left outer join Protein interactor1_1_ on interactor1_.id=interactor1_1_.interactorId left outer join Complex interactor1_2_ on interactor1_.id=interactor1_2_.interactorId where partners0_.complex_id=?


### Complex:cplx1. ID:4 interacts with prot2 prot1


Hibernate: select partners0_.complex_id as complex1_1_, partners0_.interactor_id as interactor2_1_, interactor1_.id as id2_0_, interactor1_.name as name2_0_, case when interactor1_1_.interactorId is not null then 1 when interactor1_2_.interactorId is not null then 2 when interactor1_.id is not null then 0 else -1 end as clazz_0_ from interactions partners0_ left outer join Interactor interactor1_ on partners0_.interactor_id=interactor1_.id left outer join Protein interactor1_1_ on interactor1_.id=interactor1_1_.interactorId left outer join Complex interactor1_2_ on interactor1_.id=interactor1_2_.interactorId where partners0_.complex_id=?


### Complex:cplx2. ID:5 interacts with cplx1 prot1


21:49:22,076 INFO SessionFactoryImpl:769 - closing
21:49:22,076 INFO DriverManagerConnectionProvider:147 - cleaning up connection pool: jdbc:derby:build/db/derby/hibernate;create=true
Done


That's it!
Pierre

15 December 2008

An idea: Twitter as a tool to build a protein-protein interactions database

In this post I describe how http://twitter.com could be used as a tool to build a collaborative database of protein-protein interactions. The idea was inspired by the recent creation of http://twitter.com/omnee: Omnee is said to be the "first organic directory for Twitter which you can control directly via your tweets". Using a tag-based structure in your tweets, it gives you the freedom to add yourself to multiple "groups" quickly and easily.

e.g.:


Chris Upton's tags
+informatics, +ipod touch, +genomics, +proteomics, +dnasequencing, + mac, +semanticweb, -ipodtoch, +bioinformatics, +virology, #omnee

How about building a collaborative biological database with this kind of tool? One could create a database of protein-protein interactions using Twitter. For example, say the @biotecher account is used as the core account to harvest the tweets: anybody could submit a new component of the interactome by sending a tweet to @biotecher with the gi of the two proteins, a PubMed id as the reference, and a special hashtag, say #interactome.

E.g.: Rotavirus protein NSP3 interacts with human EIF4G1 (view tweet)

Tweet
@biotecher gi:41019505 gi:255458 pmid:9755181 #interactome
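
A tweet following this convention is easy to take apart with two regular expressions. The sketch below is only an illustration and is not taken from TwitterOmics.java: the class name InteractomeTweet and its fields are hypothetical, and it assumes that a valid tweet carries exactly two gi: tokens, at least one pmid: token and the #interactome hashtag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses a tweet written with the "#interactome" convention. */
class InteractomeTweet {
    // at most 18 digits, so that Long.parseLong cannot overflow
    private static final Pattern GI = Pattern.compile("\\bgi:(\\d{1,18})\\b");
    private static final Pattern PMID = Pattern.compile("\\bpmid:(\\d{1,18})\\b");

    final List<Long> gis;
    final long pmid;

    private InteractomeTweet(List<Long> gis, long pmid) {
        this.gis = gis;
        this.pmid = pmid;
    }

    /** Returns the parsed tweet, or null if the text does not follow the convention. */
    static InteractomeTweet parse(String text) {
        if (text == null || !text.contains("#interactome")) return null;
        // collect every gi:NNN token
        List<Long> gis = new ArrayList<>();
        Matcher giMatcher = GI.matcher(text);
        while (giMatcher.find()) {
            gis.add(Long.parseLong(giMatcher.group(1)));
        }
        // assumption: exactly two proteins and one reference per tweet
        Matcher pmidMatcher = PMID.matcher(text);
        if (gis.size() != 2 || !pmidMatcher.find()) return null;
        return new InteractomeTweet(gis, Long.parseLong(pmidMatcher.group(1)));
    }

    public static void main(String[] args) {
        InteractomeTweet t = parse("@biotecher gi:41019505 gi:255458 pmid:9755181 #interactome");
        System.out.println(t.gis + " pmid=" + t.pmid); // prints: [41019505, 255458] pmid=9755181
    }
}
```

Returning null for anything that does not match keeps the harvester tolerant: unrelated tweets sent to the account are simply skipped.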


With such a system, the metadata (who gave this information? when?) is also recorded by twitter.com, so we can imagine filtering the information according to our network ("I don't trust the information supplied by this user, discard it").

I've also written a short piece of code as a proof of concept: the program searches for the tweets tagged #interactome and addressed to @biotecher. It then downloads some information from the NCBI (the organism and name of each protein, the title of the paper, etc.) and outputs the network as an RDF graph. The Java code of this program is available at: http://code.google.com/p/lindenb/source/browse/trunk/proj/tinytools/src/org/lindenb/tinytools/TwitterOmics.java.

Here is the output with 3 interactions. As you will see, each interaction is stored as an instance of the class <Interaction>. The interaction is identified by the URL of the tweet. Each interaction contains a reference to the author, the proteins, the date and the article in PubMed.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns="http://twitteromics.lindenb.org"
>

<foaf:Person rdf:about="http://twitter.com/yokofakun">
<foaf:name>yokofakun (Pierre Lindenbaum)</foaf:name>
</foaf:Person>

<Organism rdf:about="lsid:ncbi.nlm.nih.gov:taxonomy:4932">
<taxId>4932</taxId>
<dc:title>Saccharomyces cerevisiae</dc:title>
</Organism>

<Protein rdf:about="lsid:ncbi.nlm.nih.gov:protein:417441">
<gi>417441</gi>
<dc:title>RecName: Full=Polyadenylate-binding protein, cytoplasmic and nuclear; Short=Poly(A)-binding protein; Short=PABP; AltName: Full=ARS consensus-binding protein ACBP-67; AltName: Full=Polyadenylate tail-binding protein</dc:title>
<organism rdf:resource="lsid:ncbi.nlm.nih.gov:taxonomy:4932"/>
</Protein>

<Organism rdf:about="lsid:ncbi.nlm.nih.gov:taxonomy:9606">
<taxId>9606</taxId>
<dc:title>Homo sapiens</dc:title>
</Organism>

<Protein rdf:about="lsid:ncbi.nlm.nih.gov:protein:41019505">
<gi>41019505</gi>
<dc:title>RecName: Full=Eukaryotic translation initiation factor 4 gamma 1; Short=eIF-4-gamma 1; Short=eIF-4G 1; Short=eIF-4G1; AltName: Full=p220</dc:title>
<organism rdf:resource="lsid:ncbi.nlm.nih.gov:taxonomy:9606"/>
</Protein>

<bibo:Article rdf:about="http://www.ncbi.nlm.nih.gov/pubmed/9418852">
<bibo:pmid>9418852</bibo:pmid>
<dc:title>RNA recognition motif 2 of yeast Pab1p is required for its functional interaction with eukaryotic translation initiation factor 4G.</dc:title>
</bibo:Article>

<Interaction rdf:about="http://twitter.com/yokofakun/statuses/1058586293">
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:417441"/>
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:41019505"/>
<reference rdf:resource="http://www.ncbi.nlm.nih.gov/pubmed/9418852"/>
<dc:creator rdf:resource="http://twitter.com/yokofakun"/>
<dc:date>2008-12-15T14:51:42Z</dc:date>
</Interaction>

<Organism rdf:about="lsid:ncbi.nlm.nih.gov:taxonomy:10922">
<taxId>10922</taxId>
<dc:title>Simian rotavirus</dc:title>
</Organism>

<Protein rdf:about="lsid:ncbi.nlm.nih.gov:protein:255458">
<gi>255458</gi>
<dc:title>NS34=gene 7 nonstructural protein [simian rotavirus, SA114F, serotype G3, Peptide, 315 aa]</dc:title>
<organism rdf:resource="lsid:ncbi.nlm.nih.gov:taxonomy:10922"/>
</Protein>

<Protein rdf:about="lsid:ncbi.nlm.nih.gov:protein:6176338">
<gi>6176338</gi>
<dc:title>ubiquitous tetratricopeptide containing protein RoXaN [Homo sapiens]</dc:title>
<organism rdf:resource="lsid:ncbi.nlm.nih.gov:taxonomy:9606"/>
</Protein>

<bibo:Article rdf:about="http://www.ncbi.nlm.nih.gov/pubmed/15047801">
<bibo:pmid>15047801</bibo:pmid>
<dc:title>RoXaN, a novel cellular protein containing TPR, LD, and zinc finger motifs, forms a ternary complex with eukaryotic initiation factor 4G and rotavirus NSP3.</dc:title>
</bibo:Article>

<Interaction rdf:about="http://twitter.com/yokofakun/statuses/1058292539">
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:255458"/>
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:6176338"/>
<reference rdf:resource="http://www.ncbi.nlm.nih.gov/pubmed/15047801"/>
<dc:creator rdf:resource="http://twitter.com/yokofakun"/>
<dc:date>2008-12-15T11:01:10Z</dc:date>
</Interaction>

<bibo:Article rdf:about="http://www.ncbi.nlm.nih.gov/pubmed/9755181">
<bibo:pmid>9755181</bibo:pmid>
<dc:title>Rotavirus RNA-binding protein NSP3 interacts with eIF4GI and evicts the poly(A) binding protein from eIF4F.</dc:title>
</bibo:Article>

<Interaction rdf:about="http://twitter.com/yokofakun/statuses/1058290564">
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:41019505"/>
<interactor rdf:resource="lsid:ncbi.nlm.nih.gov:protein:255458"/>
<reference rdf:resource="http://www.ncbi.nlm.nih.gov/pubmed/9755181"/>
<dc:creator rdf:resource="http://twitter.com/yokofakun"/>
<dc:date>2008-12-15T10:59:19Z</dc:date>
</Interaction>

</rdf:RDF>
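
To give an idea of how such a file could be consumed, here is a minimal sketch that lists each interaction with its two interactors. It uses the plain DOM parser of the JDK rather than a real RDF library, so it only works for this particular RDF/XML layout. The class name InteractionLister is hypothetical, and the sample document in main is a shortened copy of the output above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical reader for the RDF/XML shown above (plain DOM, not an RDF parser). */
class InteractionLister {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String NS = "http://twitteromics.lindenb.org";

    /** Returns one tab-delimited line per Interaction: tweet URL, then the interactor URIs. */
    static List<String> list(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required by getElementsByTagNameNS
            Document dom = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rdfXml)));
            List<String> lines = new ArrayList<>();
            NodeList interactions = dom.getElementsByTagNameNS(NS, "Interaction");
            for (int i = 0; i < interactions.getLength(); i++) {
                Element interaction = (Element) interactions.item(i);
                // the tweet URL is stored in rdf:about
                StringBuilder line = new StringBuilder(interaction.getAttributeNS(RDF_NS, "about"));
                NodeList partners = interaction.getElementsByTagNameNS(NS, "interactor");
                for (int j = 0; j < partners.getLength(); j++) {
                    Element partner = (Element) partners.item(j);
                    line.append('\t').append(partner.getAttributeNS(RDF_NS, "resource"));
                }
                lines.add(line.toString());
            }
            return lines;
        } catch (Exception err) {
            throw new IllegalArgumentException("cannot read the RDF document", err);
        }
    }

    public static void main(String[] args) {
        String sample =
            "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns=\"" + NS + "\">"
            + "<Interaction rdf:about=\"http://twitter.com/yokofakun/statuses/1058290564\">"
            + "<interactor rdf:resource=\"lsid:ncbi.nlm.nih.gov:protein:41019505\"/>"
            + "<interactor rdf:resource=\"lsid:ncbi.nlm.nih.gov:protein:255458\"/>"
            + "</Interaction>"
            + "</rdf:RDF>";
        for (String line : list(sample)) {
            System.out.println(line);
        }
    }
}
```

A real application would rather load the graph with an RDF toolkit, which does not depend on how the XML happens to be serialized.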


What do you think ?

Pierre

30 October 2008

The EBI/IntAct Web-Service API, my notebook

This post covers my experience with the IntAct API at EBI. IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.

This web service is used to search for binary interactions. It is described (but not documented...) by a WSDL file at http://www.ebi.ac.uk/intact/binary-search-ws/binarysearch?wsdl

GlassFish, the Java application server from Sun, comes with a tool called wsimport. From the WSDL file, it generates a set of Java files used to call this web service.



Here are the generated Java files:
./uk/ac/ebi/intact/binarysearch/wsclient/generated/BinarySearchService.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/ObjectFactory.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/FindBinaryInteractionsResponse.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/FindBinaryInteractionsLimitedResponse.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/FindBinaryInteractionsLimited.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/SimplifiedSearchResult.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/GetVersionResponse.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/package-info.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/FindBinaryInteractions.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/GetVersion.java
./uk/ac/ebi/intact/binarysearch/wsclient/generated/BinarySearch.java


AFAIK, the WSDL file contains almost no documentation about this service, but Eclipse helped me find the correct methods thanks to the code completion of its editor.
Here is the short program I just wrote: it connects to the web service and retrieves all the binary interactions with NSP3:
import uk.ac.ebi.intact.binarysearch.wsclient.generated.BinarySearch;
import uk.ac.ebi.intact.binarysearch.wsclient.generated.BinarySearchService;
import uk.ac.ebi.intact.binarysearch.wsclient.generated.SimplifiedSearchResult;

/** Queries the IntAct binary-search web service through the classes generated by wsimport. */
public class IntActClient
{
    public static void main(String[] args) {
        try
        {
            final String query="NSP3";
            // get the port, i.e. the client-side proxy of the web service
            BinarySearchService service=new BinarySearchService();
            BinarySearch port=service.getBinarySearchPort();
            // ask for at most 500 results, starting from the first one (index 0)
            SimplifiedSearchResult ssr= port.findBinaryInteractionsLimited(query, 0,500);
            // print a short header describing the result set
            System.out.println("#first-result "+ssr.getFirstResult());
            System.out.println("#max-results "+ssr.getMaxResults());
            System.out.println("#total-results "+ssr.getTotalResults());
            System.out.println("#luceneQuery "+ssr.getLuceneQuery());
            // each interaction is returned as one tab-delimited line
            for(String line:ssr.getInteractionLines())
            {
                System.out.println(line);
            }
        }
        catch(Throwable err)
        {
            err.printStackTrace();
        }
    }
}


The result:
#first-result 0
#max-results 500
#total-results 7
#luceneQuery identifiers:nsp3 pubid:nsp3 pubauth:nsp3 species:nsp3 type:nsp3 detmethod:nsp3 interact
uniprotkb:Q8N5H7|intact:EBI-745980 uniprotkb:O43281|intact:EBI-718488 uniprotkb:SH2D3C uniprotkb:EFS
intact:EBI-1263954 uniprotkb:Q00721|intact:EBI-1263962 - uniprotkb:S7 - uniprotkb:NCVP4|uniprotkb:vn....
uniprotkb:Q00721|intact:EBI-1263962 intact:EBI-1263971 uniprotkb:S7 - uniprotkb:NCVP4|uniprotkb:vn34....
uniprotkb:Q00721|intact:EBI-1263962 uniprotkb:Q00721|intact:EBI-1263962 uniprotkb:S7 uniprotkb:S7 un...
uniprotkb:Q00721|intact:EBI-1263962 uniprotkb:Q9UGR2|intact:EBI-948845 uniprotkb:S7 uniprotkb:ZC3H7B....
uniprotkb:Q04637|intact:EBI-73711 uniprotkb:Q00721|intact:EBI-1263962 uniprotkb:EIF4G1 uniprotkb:S7....
uniprotkb:Q04637|intact:EBI-73711 uniprotkb:P03536|intact:EBI-296448 uniprotkb:EIF4G1 uniprotkb:S7 u...


OK, it was easy, but I'm a little disappointed because the result is 'just' a set of tab-delimited lines (and where is the documentation about those columns??); I would have expected a set of XML objects.
Update: the format of the columns is described here: ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psimitab/README.
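
With that format in hand, the lines can be split without any XML at all. The sketch below assumes the layout described in that README: the columns are separated by tabs, the first two columns hold the identifiers of the two interactors as 'database:identifier' pairs separated by '|', and '-' stands for an empty cell. The class name MitabLine is hypothetical, and the sample line is a shortened copy of one of the results above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for the tab-delimited lines returned by the web service. */
class MitabLine {
    /** Parses a "db:id|db:id" cell into a database -> identifiers map; "-" gives an empty map. */
    static Map<String, List<String>> parseXrefs(String cell) {
        Map<String, List<String>> xrefs = new LinkedHashMap<>();
        if (cell.equals("-")) return xrefs;
        for (String xref : cell.split("\\|")) {
            int colon = xref.indexOf(':');
            if (colon <= 0) continue; // skip anything that is not "db:id"
            String db = xref.substring(0, colon);
            String id = xref.substring(colon + 1);
            // a database may appear several times in the same cell
            xrefs.computeIfAbsent(db, k -> new ArrayList<>()).add(id);
        }
        return xrefs;
    }

    public static void main(String[] args) {
        // first four columns of one of the lines above; the remaining columns are omitted
        String line = "uniprotkb:Q04637|intact:EBI-73711\tuniprotkb:Q00721|intact:EBI-1263962"
                + "\tuniprotkb:EIF4G1\tuniprotkb:S7";
        String[] columns = line.split("\t");
        System.out.println("A: " + parseXrefs(columns[0]));
        System.out.println("B: " + parseXrefs(columns[1]));
    }
}
```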

That's it for tonight....

Pierre

22 January 2007

Launch of BMC Systems Biology


BMC Systems Biology, the first open access journal focussed solely on the entire emerging subject of systems biology, has just published its first articles.