21 July 2006

NAR, Nature Aggademia & Bioinformatics web services

Yesterday, I've read the 150 abstracts of the special web server issue from NAR. About those 150 articles see my previous post about this. All thoses abstracts where tagged with connotea here.

As a Bioinformatician (yes I am), I like automated operations and most of all those
servers require a 'clickodrome'/web interface, that is to say,
a manual input: for example, I cannot use it to perform multiple
sequence analysis. NAR and other publishers should now require from
authors that the ouput of those tools should be written in a fully
parsable xhtml syntax, or better, in XML/RDF backed with a XSLT/xhtml stylesheet.
See also: BioMoby.

In the issue of NAR, there was a paper about Hubmed written by Alf Eaton. Alf now works at Nature and he just have released an experimental and beta social site called aggademia which aggregates the 50 most popular scientific blogs. At first sight, it works like postgenomic. I'm a little puzzled now with all those social web sites at Nature: connotea, Nature Network Boston, and (perhaps) now aggademia. On each of those site if had to register a new profile, I could create a group of user and send invitation ("Oh Nooo Pierre, not AGAIN one of your CENSORED social network"). Why not ("...you just have to......") a large integrated social scientific network ?


17 July 2006

NAR: The Web Server Issue 2006

Nucleic Acids Research has published its annual special issue devoted to web servers. The current issue contains... 150 papers (argh !).


10 July 2006

Mysql user defined function (UDF) for Bioinformatics.

MYSQL allows to create User Defined Functions (UDF). Written in 'C/C++', this kind of function can be used to embed bioinformatics into mysql. Here is an example of a function used to translate a DNA sequence into a protein directly in mysql.

#include <my_global.h>
#include <m_ctype.h>
#include <mysql.h>
#include <m_string.h>

/* a function translating 3 bases into an amino acid */
static char translation(char a,char b,char c);

/* The initialization function */
my_bool translate_init(UDF_INIT *initid, UDF_ARGS *args, char *message);
/* The deinitialization function */
void translate_deinit(UDF_INIT *initid);
/* The main function. This is where the function result is computed */
char *translate(UDF_INIT *initid, UDF_ARGS *args, char *result,
unsigned long *length, char *is_null, char *error);

/* The initialization function */
my_bool translate_init(
UDF_INIT *initid,
UDF_ARGS *args,
char *message
)
{
/* check the args */
if (!(args->arg_count == 1 && args->arg_type[0] == STRING_RESULT ))
{
strncpy(message,"Bad parameter expected a DNA",MYSQL_ERRMSG_SIZE);
return 1;
}
initid->maybe_null=1;
initid->ptr= (char*)malloc(0);

if(initid->ptr==NULL)
{
strncpy(message,"Out Of Memory",MYSQL_ERRMSG_SIZE);
return 1;
}
return 0;
}

/* The deinitialization function */
void translate_deinit(UDF_INIT *initid)
{
/* free the memory **/
if(initid->ptr!=NULL) free(initid->ptr);
}

/* The main function. This is where the function result is computed */
char *translate(UDF_INIT *initid, UDF_ARGS *args, char *result,
unsigned long *length, char *is_null, char *error)
{
long i;
long dnaLength= args->lengths[0];
const char *dna=args->args[0];
char *ptr=NULL;

if (dna==NULL) /* Null argument */
{
*is_null=1;
return NULL;
}
*length=dnaLength/3;
ptr= (char*)realloc(initid->ptr,sizeof(char)*(*length));
if(ptr==NULL)
{
*is_null=1;
*error=1;
strncpy(error,"Out Of Memory",MYSQL_ERRMSG_SIZE);
return NULL;
}
initid->ptr=ptr;
/* loop over the codons of the sequence */
int j=0;
for(i=0;i+2< dnaLength;i+=3)
{
initid->ptr[j++]=translation(dna[i],dna[i+1],dna[i+2]);
}

return initid->ptr;
}

/************************************
*
* translation
* a function translating 3 bases into an amino acid
*/
static
char translation(char base1,char base2,char base3)
{
(...)/* so obvious.... */
}


And here is the Makefile for my machine...

/usr/lib/translate.so:translate.c
gcc -fPIC -shared -I/usr/include/mysql -DDBUG_OFF -O3 -lmysqlclient -o $@ $<


... and the fragment from a session

mysql> CREATE FUNCTION translate RETURNS STRING SONAME 'translate.so';
Query OK, 0 rows affected (0,03 sec)

mysql> select translate("ATGGAGTCTACTCAGCAGATGGCTTCTTCTATTATTAATTCTTCATTTGAAGCT
AATTGATGGGTATTCAATATGACTACAATGAGGTATATACTAGAGTAAAGAGTAAATTTGATTTAGTTATGGATGATTC
GCAATTACTATTGATCAAGCTTTGAATGGAAAATTTAGTTCAGCGATTAGGAATAGAAATTGGATGACTGACTCTCGAA
TAAACTAAGAATTATGCTATCATCAAAAGGAATCGATCAGAAAATGAGAGTGCTTAATGCTTGTTTTAGTGTCAAGAGA
AATGTACTAGACTGATGAAAGACAAATTAGAACGTGGTGAAGTTGAAGTTGATGATTCCTTTGTTGAAGAGAAAATGGA
TATGAACAGTTAGAAAAGAGATTTGAGTCACTGAAACATCGGGTTAATGAGAAGTATAATCATTGGGTTCTTAAAGCTA
TCAAAATGTGATTT
") as NSP3;
+------------------------------------------------------------------------------ -------------------------------------------------------------------------------
| NSP3
+------------------------------------------------------------------------------
-------------------------------------------------------------------------------
| MESTQQMASSIINSSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDLVMDDSGVKNNLIGKAITIDQALNGKFSS
SSKGIDQKMRVLNACFSVKRIPGKSSSIVKCTRLMKDKLERGEVEVDDSFVEEKMEVDTIDTKSRYEQLEKRFESLKHR

+------------------------------------------------------------------------------
-------------------------------------------------------------------------------
1 row in set (0,00 sec)


03 July 2006

Simile:Timeline

Timeline is an impressive DHTML-based AJAX widget written by David François Huynh for visualizing time-based events. It is like Google Maps for time-based information. Biologists might use this script to display seminars, publications. I also suggested the author that it could be used to display genomic traks (such as ensembl or the UCSC genome Browser) by just replacing "time-start" by "chromosome-start".

See also: [gbrowse] [genome browser with svg]

BTW, this reminds me something that was said at JavaDay 2006: someone wondered why so many interfaces use AJAX wereas a simple(r) applet could be written.