Showing posts with label ceph. Show all posts

05 September 2008

Center for the Study of Human Polymorphisms: Week 1

I've started my first week at the Center for the Study of Human Polymorphisms, and today we had our first meeting with Mario Foglio and some others to define what my job will be in the following months. As I said, I will collaborate with the National Center of Genotyping on Operon, a feasible bioinformatics platform to centralize scientific software and biomedical data with internal results. It was curious because I found that nobody there uses most of the tools used/discussed with the biogang (RSS feeds, social bookmarking, etc.) and I hope I will present some slides about this later.

I will have to refactor the current 'C' code of operon (written over BerkeleyDB) to build a new, clean C API that will be used by some other people.

What is cool is that this is an open source project and we will host it on google (http://code.google.com/p/polymorphism/).
I've also created a mailing list on google.groups: http://groups.google.com/group/operon-dev, showed my collaborators how to share a calendar on google-calendar (to find possible dates for organizing a meeting) and we have already started to share some documents using google-docs. Thank you google.

The 'C' language was chosen because it is a low-level language and it seems that the developers at the CNG prefer it. I hope I will create some wrappers around this API for some other languages. I already know it is possible with java using the Java Native Interface (JNI, see my previous post about this). SWIG (http://www.swig.org/), a tool generating wrappers in various languages (python, perl...), might also be of help. Using a Java wrapper will allow us to deploy any application in a java web server such as tomcat.

I haven't played much with 'C' since 1998 (I then played with C++ for 4 years before switching to java) but I (hope I) still have some good skills and I know I now have better programming practices.

That's it for tonight.

Pierre

01 September 2008

I'm not looking for a job anymore: Welcome to the CEPH


Today was my first day as a bioinformatician at the Center for the Study of Human Polymorphisms (CEPH http://www.cephb.fr/en/cephdb) and I want to thank my former colleagues Christine K and Philippe Gesnouin (philguess on twitter/FF ) who helped me to find this position. It's a short term contract (one year).

The CEPH is located in Paris, near the St-Louis Hospital and the "Place de la République". It maintains a database of genotypes for genetic markers that have been typed on the CEPH reference family resource for linkage mapping (Genomics 6: 575-577, 1990; Science 265: 2049-2054, 1994). The CEPH works in conjunction with the National Center of Genotyping (CNG/Evry), where I also worked eight years ago, and both centers are managed by Dr Mark Lathrop. One of my first objectives is to develop a set of tools around OPERON with the help of its author, Mario Foglio.

As far as I've understood operon today (I may be wrong), it is a C program handling a large set of genotypes (among other things...) using BerkeleyDB as a storage engine (I blogged about BerkeleyDB a few posts ago). It seems that using this strategy, the genotypes can be quickly accessed using something like fseek(table,sizeof(genotype_t)*(sample_count*marker_index+sample_index),SEEK_SET).
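To make the offset arithmetic concrete, here is a minimal sketch of that kind of random access on a flat binary file. It is not Operon's code and it does not use the BerkeleyDB API: the genotype_t layout and the helper names genotype_offset and read_genotype are assumptions made up for the illustration. The only idea taken from the post is that fixed-size records stored marker by marker can be reached with a single seek.

```c
#include <stdio.h>

/* Hypothetical fixed-size record: the real Operon layout is not shown in this post. */
typedef struct {
    unsigned char allele1;
    unsigned char allele2;
} genotype_t;

/* Byte offset of the record for (marker_index, sample_index), assuming the
   table stores all the samples of marker 0, then all the samples of marker 1... */
long genotype_offset(long sample_count, long marker_index, long sample_index)
{
    return (long)sizeof(genotype_t) * (sample_count * marker_index + sample_index);
}

/* Seek to one genotype and read it. Returns 0 on success, -1 on failure. */
int read_genotype(FILE *table, long sample_count, long marker_index,
                  long sample_index, genotype_t *out)
{
    long offset = genotype_offset(sample_count, marker_index, sample_index);
    if (fseek(table, offset, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, table) == 1 ? 0 : -1;
}
```

Because every record has the same size, the cost of reading one genotype does not depend on the size of the table. The long offsets limit this sketch to files that fit in a long on the target platform.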

As a java programmer, I wish I could write a wrapper around the Operon C API, that would be useful to embed this model in a web container (servlet, jsp) or to write a Swing interface. My first ideas to achieve this are:
* using JNI (Java Native Interface, which allows calling C from java) to write a java wrapper around the C API
* reading the data in the berkeleyDB files using the BerkeleyDB Java API.
* ...

That's it for tonight.

Pierre

12 December 2007

IBD Status applet

I've just released an applet called IBDStatus. This applet (java 6 is required) is freely available at:




This applet takes as input the breakpoint analysis data (Nature. Dib et al.(1996); 380:152-154) from the 'Fondation Jean-Dausset' (CEPH) and displays the Identical By Descent (IBD) regions between a pair of related individuals. Two people share an allele identical by descent if the two copies of the allele were inherited from a common ancestor. A pair of siblings can share 0, 1, or 2 alleles IBD:
  • 0: no allele in common

  • 1: only one allele in common

  • 2: both alleles in common
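
The three cases above can be sketched in a few lines of C. This is not the code of the applet: the inheritance_t encoding and the ibd_status function are assumptions made up for the illustration. The sketch supposes that, at a given marker, we already know which of the father's two chromosomes and which of the mother's two chromosomes each sibling received (0 or 1, with -1 when it is unknown).

```c
/* Hypothetical encoding: which of each parent's two chromosomes (0 or 1)
   a child inherited at one marker; -1 means the origin is unknown. */
typedef struct {
    int from_father;
    int from_mother;
} inheritance_t;

/* Number of alleles (0, 1 or 2) shared IBD by two siblings at one marker,
   or -1 if the parental origin is unknown for either sibling. */
int ibd_status(inheritance_t a, inheritance_t b)
{
    if (a.from_father < 0 || a.from_mother < 0 ||
        b.from_father < 0 || b.from_mother < 0)
        return -1;
    /* One shared allele for each parent that transmitted the same chromosome. */
    return (a.from_father == b.from_father) + (a.from_mother == b.from_mother);
}
```

Two siblings who received the same paternal chromosome but different maternal ones would get a status of 1 with this function.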





Picture from Abel & Dessein


As an example, this IBD status can be used to design the controls of a CGH assay.







  • Top left pane: a linkage table with genotype=f(individual,marker)

  • Middle left pane: the list of individuals: using the Ctrl-key, select
    two related individuals and press the Add sib button. Your new pair is
    added in the bottom left table.

  • Bottom left pane: the list of sib-pairs: for each pair, the IBD status
    is displayed in the right table


  • Right table:

    • Marker index

    • chromosome

    • STS D-Number

    • Start-position (build 36)

    • End-position (build 36)

    • IBD status of each sib-pair(if any)

    • Count IBD with unknown status

    • Count IBD. 0


    • Count IBD. 1

    • Count IBD. 2






I wrote this software a few months ago but it was not much used, so I
was given permission to release this version to the community.
Enjoy.

Pierre