29 May 2013

Binding a C library with Javascript/ #mozilla. An example with the Tabix library

In this post I'll show how to bind a C API to JavaScript using the Mozilla XULRunner API, with the Tabix library as an example.

About xpcshell

XULRunner is a Mozilla runtime package. The SDK package contains xpcshell, a JavaScript shell application that lets you run JavaScript code. "Unlike the ordinary JS shell (js), xpcshell lets the scripts running in it access the Mozilla technologies (XPCOM)." I've tested the current code with:
$ xulrunner -v
Mozilla XULRunner 22.0 - 20130521223249
XULRunner is not installed by default on Ubuntu and needs to be downloaded.

The js-ctypes library

js-ctypes is a foreign-function library for Mozilla's privileged JavaScript. It provides C-compatible data types and allows JS code to call functions in shared libraries (dll, so, dylib) and implement callback functions.

Tabix

Heng Li's Tabix is "a generic tool that indexes position sorted files in TAB-delimited formats such as GFF, BED, PSL, SAM and SQL export, and quickly retrieves features overlapping specified regions." The code is available on GitHub at https://github.com/samtools/tabix.

Binding the Tabix library to javascript

First of all, the dynamic library for tabix must be compiled:
$ cd /path/to/tabix.dir
$ make libtabix.so.1
A JavaScript file tabix.js is created. At the top, we tell the JavaScript engine that we want to use the js-ctypes library:
Components.utils.import("resource://gre/modules/ctypes.jsm")
The dynamic library for tabix is loaded:
var lib = ctypes.open("libtabix.so.1");
We bind each required function of the tabix library to JavaScript. As an example we're going to bind ti_open. The C declaration for this function is:
tabix_t *ti_open(const char *fn, const char *fnidx);
Using js-ctypes, the call to that function is wrapped for JavaScript using declare():
var DLOpen= lib.declare("ti_open",/* method name */
 ctypes.default_abi,/* Application binary interface type */
 ctypes.voidptr_t, /* return type is a pointer 'void*' */
 ctypes.char.ptr,  /* first argument is 'const char*' */
 ctypes.char.ptr /* second argument is 'const char*' (index file name) */
 );
In JavaScript, the library is used by invoking DLOpen; passing null as the second argument gives the C function a NULL index file name:
function TabixFile(filename)
 {
 this.ptr= DLOpen(filename,null);
 if(this.ptr.isNull()) throw "I/O ERROR: Cannot open \""+filename+"\"";
 };
var tabix=new TabixFile("annotations.bed.gz");
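
The other calls used by the test script further down (query, next and close) can be bound in the same way. What follows is only a minimal sketch, not the original tabix.js: the C signatures quoted in the comments are assumed from tabix.h, the names declareTabix and openTabix are hypothetical, and ti_open is declared here with two 'const char*' arguments as in its C declaration, with null passed for the index file name. The ctypes module and the library handle are passed in as parameters so the functions do not depend on globals.

```javascript
// Sketch only: the C signatures quoted in the comments are assumed from tabix.h.
// 'ctypes' is the js-ctypes module, 'lib' the handle returned by ctypes.open().
function declareTabix(ctypes, lib) {
  var abi = ctypes.default_abi;
  var ptr = ctypes.voidptr_t; // tabix_t* and ti_iter_t are treated as opaque pointers
  return {
    // tabix_t *ti_open(const char *fn, const char *fnidx);
    open: lib.declare("ti_open", abi, ptr, ctypes.char.ptr, ctypes.char.ptr),
    // void ti_close(tabix_t *t);
    close: lib.declare("ti_close", abi, ctypes.void_t, ptr),
    // ti_iter_t ti_querys(tabix_t *t, const char *reg);
    querys: lib.declare("ti_querys", abi, ptr, ptr, ctypes.char.ptr),
    // const char *ti_read(tabix_t *t, ti_iter_t iter, int *len);
    read: lib.declare("ti_read", abi, ctypes.char.ptr, ptr, ptr, ctypes.int.ptr),
    // void ti_iter_destroy(ti_iter_t iter);
    iterDestroy: lib.declare("ti_iter_destroy", abi, ctypes.void_t, ptr)
  };
}

// Hypothetical wrapper exposing query()/next()/close() on top of the bindings.
function openTabix(ctypes, api, filename) {
  var handle = api.open(filename, null); // null: no explicit index file name
  if (handle.isNull()) throw new Error("I/O ERROR: Cannot open \"" + filename + "\"");
  return {
    query: function (region) {
      var iter = api.querys(handle, region);
      if (iter.isNull()) throw new Error("Cannot query \"" + region + "\"");
      var len = new ctypes.int(0); // receives the length of each line
      return {
        next: function () {
          if (iter === null) return null; // iterator already exhausted
          var line = api.read(handle, iter, len.address());
          if (line.isNull()) { // no more records: release the iterator
            api.iterDestroy(iter);
            iter = null;
            return null;
          }
          return line.readString(); // copy the C string into a JavaScript string
        }
      };
    },
    close: function () { api.close(handle); }
  };
}
```

In xpcshell, after importing ctypes.jsm, this would be used as openTabix(ctypes, declareTabix(ctypes, ctypes.open("libtabix.so.1")), filename). The string returned by ti_read is copied with readString() on each call because the sketch assumes the underlying buffer is owned by the library and may be reused.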

The tabix.js library

Putting it all together, I wrote the file tabix.js.

Testing


load("tabix.js");
var tabix=new TabixFile("/path/to/tabix-0.2.5/example.gtf.gz");
var iter=tabix.query("chr2:32800-35441");
var line;
while((line=iter.next())!=null)
 {
 print(line);
 }
tabix.close();

Set the dynamic library path (LD_LIBRARY_PATH) and invoke this script with xpcshell:
LD_LIBRARY_PATH=/path/to/xulrunner-sdk/bin:/path/to/tabix-0.2.5 /path/to/xulrunner-sdk/bin/xpcshell -f test.js
Output:
chr2 HAVANA transcript 28814 36385 . - . gene_id "ENSG00000184731"; transcript_id "ENST00000327669"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "FAM110C-001"; level 2; tag "CCDS"; ccdsid "CCDS42645"; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322220";
chr2 HAVANA gene 28814 36870 . - . gene_id "ENSG00000184731"; transcript_id "ENSG00000184731"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "FAM110C"; level 2; havana_gene "OTTHUMG00000151321";
chr2 HAVANA transcript 31220 32952 . - . gene_id "ENSG00000184731"; transcript_id "ENST00000460464"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "FAM110C-003"; level 2; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322222";
chr2 HAVANA transcript 31221 36870 . - . gene_id "ENSG00000184731"; transcript_id "ENST00000461026"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "FAM110C-002"; level 2; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322221";
chr2 HAVANA exon 32809 32952 . - . gene_id "ENSG00000184731"; transcript_id "ENST00000460464"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "processed_transcript"; transcript_status "KNOWN"; transcript_name "FAM110C-003"; level 2; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322222";
chr2 HAVANA CDS 35440 36385 . - 0 gene_id "ENSG00000184731"; transcript_id "ENST00000327669"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "FAM110C-001"; level 2; tag "CCDS"; ccdsid "CCDS42645"; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322220";
chr2 HAVANA exon 35440 36385 . - . gene_id "ENSG00000184731"; transcript_id "ENST00000327669"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "FAM110C"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "FAM110C-001"; level 2; tag "CCDS"; ccdsid "CCDS42645"; havana_gene "OTTHUMG00000151321"; havana_transcript "OTTHUMT00000322220";
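
Every record above overlaps the queried region chr2:32800-35441, including the CDS that only starts at position 35440. As a small illustration, the sketch below parses a region string and a GTF line in plain JavaScript and checks the overlap. The function names are hypothetical, the lines are assumed to be tab-delimited with the nine standard GTF columns, coordinates are assumed to be 1-based and inclusive, and the attribute parser is deliberately naive (it would break on values that contain a semicolon).

```javascript
// Parse a region string such as "chr2:32800-35441".
function parseRegion(region) {
  var m = /^([^:]+):(\d+)-(\d+)$/.exec(region);
  if (m === null) throw new Error("Bad region: " + region);
  return { chrom: m[1], start: parseInt(m[2], 10), end: parseInt(m[3], 10) };
}

// Turn 'key "value"; key value;' into an object (naive: splits on every ';').
function parseAttributes(text) {
  var attrs = {};
  text.split(";").forEach(function (item) {
    var m = /^\s*(\S+)\s+"?([^"]*)"?\s*$/.exec(item);
    if (m !== null) attrs[m[1]] = m[2];
  });
  return attrs;
}

// Split one tab-delimited GTF line into its nine columns.
function parseGtfLine(line) {
  var f = line.split("\t");
  if (f.length < 9) throw new Error("Expected 9 columns, got " + f.length);
  return {
    chrom: f[0], source: f[1], feature: f[2],
    start: parseInt(f[3], 10), end: parseInt(f[4], 10),
    score: f[5], strand: f[6], frame: f[7],
    attributes: parseAttributes(f[8])
  };
}

// Two closed intervals on the same sequence overlap when each one
// starts no later than the other one ends.
function overlaps(rec, region) {
  return rec.chrom === region.chrom &&
         rec.start <= region.end && rec.end >= region.start;
}
```

Inside the loop of the test script, parseGtfLine(line) could replace print(line) to work with individual fields, for example rec.attributes.gene_name.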

That's it,

Pierre
