30 November 2009

Playing with Erlang (II)

transcripting a DNA to a RNA sequence with an anonymous function


lists:map(fun($T)->$U;(OTHER)->OTHER end,"AATAGCTGATCGACAATGTTAGCTAGGC").
>"AAUAGCUGAUCGACAAUGUUAGCUAGGC"

Testing if a symbol is an acid nucleic

> ACIDNUCLEICS="ATGCatgc".
"ATGCatgc"
> lists:member($A,ACIDNUCLEICS).
true
> lists:member($P,ACIDNUCLEICS).
false

Filtering a sequence with lists:filter

> IsAcidNucleic=fun(BASE)->lists:member(BASE,ACIDNUCLEICS) end.
#Fun<erl_eval.6.13229925>
> lists:filter( IsAcidNucleic, "1 ACGT ACGT ACGT ACGT ACGT ACGT 36").
"ACGTACGTACGTACGTACGTACGT"

Generating an array of numbers

> lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
> lists:seq(65,65+24).
"ABCDEFGHIJKLMNOPQRSTUVWXY"
The following notation says: add 32 to each item 'X' of the sequence 65 to 89.
> [X+32 || X<-lists:seq(65,65+24)].
"abcdefghijklmnopqrstuvwxy"
Create a list of structures {number,'X'} for each item 'X' of the sequence 1 to 5.
> [{number,X}|| X<-lists:seq(1,5)].
[{number,1},{number,2},{number,3},{number,4},{number,5}]
Create a list of structures {number,'X'} for each item 'X' of the sequence 1 to 100 where the modulus of X/15 is '0'.
> [{number,X}|| X<-lists:seq(1,100), X rem 15 =:= 0].
[{number,15},
{number,30},
{number,45},
{number,60},
{number,75},
{number,90}]
Create a list of all the pairs(x,y) for x and y between 1 and 4:
> [{pair,X,Y}|| X<-lists:seq(1,4),Y<-lists:seq(1,4)].
[{pair,1,1},
{pair,1,2},
{pair,1,3},
{pair,1,4},
{pair,2,1},
{pair,2,2},
{pair,2,3},
{pair,2,4},
{pair,3,1},
{pair,3,2},
{pair,3,3},
{pair,3,4},
{pair,4,1},
{pair,4,2},
{pair,4,3},
{pair,4,4}]
Create a list of all the pairs(x,y) for x and y between 1 and 4 having X==Y:
> [{pair,X,Y}|| X<-lists:seq(1,4),Y<-lists:seq(1,4),X=:=Y].
[{pair,1,1},{pair,2,2},{pair,3,3},{pair,4,4}]
A few is_XXXX functions
> is_integer("AAA").
false
> is_integer($A).
true
> is_list(1).
false
> is_list("AZAZAZ").
true
> is_atom(1).
false
> is_atom(hello).
true
> is_tuple(1).
false
> is_tuple({person,{name,"Pierre"}}).
true
A few functions:
% get the header of a list
> hd([100,99,98,97]).
100
% get the 3rd element of a list
> lists:nth(3,["EcoRI","HindIII","BamHI","PstI"]).
"BamHI"
>size({test,{name,"Hello"},{age,3}}).
3
> element(2,{test,{name,"Hello"},{age,3}}).
{name,"Hello"}
> abs(-99).
99
Translating a DNA to a protein.
The file bio.erl:
-module(bio).
-export([translate/1]).

%The standard genetic code table
geneticCode()->"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG".

%convert a base to an index in the table
base2index($T)-> 0;
base2index($C)-> 1;
base2index($A)-> 2;
base2index($G)-> 3.

%convert a codon to an index, lookup the table and return the amino acid
codon2AminoAcid(A1,A2,A3) -> lists:nth(
(base2index(A1)*16 + base2index(A2)*4 + base2index(A3))+1,
geneticCode())
.

%return an empty array if the argument is an empty sequence or a sequence with less than 3 bases
translate([])->[];
translate([_])->[];
translate([_,_])->[];
%translate the 3 first bases and append the remaining translation.
translate([A1,A2,A3|REMAIN])->[codon2AminoAcid(A1,A2,A3)|translate(REMAIN)].

Invoking bio:translate:
c(bio).

bio:translate("ATGGAGAGGCAGAAACGGAAGGCGGACATCGAGAAG").
"MERQKRKADIEK"

Records


A record file defines the structure of an object. Example: the following file snp.hrl defines the 'class' snp with its default values:
-record(snp,{
name="rs",
chrom="?",
chromStart=0,
chromEnd=0,
avHet= -1
}).
We can now use this class in an erlang shell:
%load the structure of a SNP
rr("snp.hrl").
[snp]
%create an empty instance of a snp
EMPTY_SNP= #snp{}.
#snp{name = "rs",chrom = "?",chromStart = 0,chromEnd = 0,
avHet = -1}

%create and fill an new instance of a snp
RS84=#snp{name="rs84",chrom="chr7",chromStart=25669317,chromEnd=25669318}.
#snp{name = "rs84",chrom = "chr7",chromStart = 25669317,
chromEnd = 25669318,avHet = -1}

%re-use the content of RS84 and fill the value of 'avHet'
RS84_2=RS84#snp{avHet=0.475045}.
#snp{name = "rs84",chrom = "chr7",chromStart = 25669317,
chromEnd = 25669318,avHet = 0.475045}

%extract the name and the avHet for RS84_2
#snp{name=NAME_RS84,avHet=AVHET_RS84}=RS84_2.
NAME_RS84.
"rs84"
AVHET_RS84.
0.475045


That's it
Pierre

No comments: