14 April 2010

Object Oriented Programming with R: My notebook

In this post, I describe how I've used the OOP features of R (its S4 classes) to create and use the following class hierarchy: Person is the base class, DeceasedPerson and Scientist both extend Person, and DeceasedScientist extends both Scientist and DeceasedPerson.


First, the class Person is defined. It contains four fields (called slots in R): firstName, lastName, birthDate and birthPlace.
setClass("Person",
    representation(
        firstName="character",
        lastName="character",
        birthDate="Date",
        birthPlace="character"
    ))

A validity function, which acts as a kind of constructor check, can be attached to Person to verify that both firstName and lastName are not empty:
setValidity("Person",
    function(object)
    {
        # TRUE only if both slots hold at least one value
        length(object@firstName)>0 &&
        length(object@lastName)>0
    }
)
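When the check above fails, the error only reports FALSE. As a minimal sketch of an alternative, a validity function can return a character vector describing each problem, and R then includes that text in the error message. The class name CheckedPerson is hypothetical and is used only so the example does not overwrite Person.

```r
library(methods)

# Hypothetical class, reduced to the two slots that are checked
setClass("CheckedPerson",
    representation(
        firstName="character",
        lastName="character"
    ),
    validity=function(object)
    {
        problems <- character(0)
        if (length(object@firstName) == 0) {
            problems <- c(problems, "firstName is empty")
        }
        if (length(object@lastName) == 0) {
            problems <- c(problems, "lastName is empty")
        }
        # TRUE means valid; a character vector lists what is wrong
        if (length(problems) == 0) TRUE else problems
    }
)

# Capture the error message instead of stopping the script
result <- tryCatch(
    new("CheckedPerson", lastName="Darwin"),
    error=function(e) conditionMessage(e)
)
print(result)
```

The printed message names the offending slot instead of a bare FALSE.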
DeceasedPerson is a subclass of Person. It contains two more fields, deathPlace and deathDate:
setClass("DeceasedPerson",
    representation(
        deathDate="Date",
        deathPlace="character"
    ),
    contains="Person"
)
Scientist is another subclass of Person. It contains one more field, knownFor:
setClass("Scientist",
    representation(
        knownFor="character"
    ),
    contains="Person"
)
Lastly, DeceasedScientist is a subclass of both Scientist and DeceasedPerson:
setClass("DeceasedScientist",
    contains=c("Scientist","DeceasedPerson")
)
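Because DeceasedScientist reaches Person through two parents, it can be useful to inspect what R actually built. The sketch below uses hypothetical stand-in classes (prefixed Demo, with fewer slots) so that it runs on its own; extends() and slotNames() are standard functions of the methods package.

```r
library(methods)

# Hypothetical stand-ins mirroring the shape of the hierarchy
setClass("DemoPerson", representation(firstName="character"))
setClass("DemoDeceasedPerson",
    representation(deathDate="Date"),
    contains="DemoPerson")
setClass("DemoScientist",
    representation(knownFor="character"),
    contains="DemoPerson")
setClass("DemoDeceasedScientist",
    contains=c("DemoScientist","DemoDeceasedPerson"))

# The class itself plus every class it extends
ancestors <- extends("DemoDeceasedScientist")
print(ancestors)

# Slots inherited through both parents; the shared ancestor's slot appears once
slots <- slotNames("DemoDeceasedScientist")
print(slots)

# Direct question: does the class extend the shared ancestor?
print(extends("DemoDeceasedScientist", "DemoPerson"))
```

The slot inherited from the common ancestor is listed only once, even though it is reachable through two paths.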
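Let's define a generic function 'age' that returns the age of an individual, computed from birthDate:
age <- function(individual)
{
    # approximate age in years: elapsed days divided by 365 (leap days are ignored)
    as.integer((Sys.Date()-individual@birthDate)/365)
}
# promote the existing function to an S4 generic; it becomes the default method
setGeneric("age")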
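Dividing by 365 ignores leap days, so the result can be off by one close to a birthday. Here is a minimal sketch of a calendar-based alternative; the helper wholeYears is hypothetical and not part of the code in this post.

```r
# Hypothetical helper: completed years between two dates
wholeYears <- function(from, to)
{
    from <- as.POSIXlt(from)
    to <- as.POSIXlt(to)
    years <- to$year - from$year
    # one year less if the anniversary has not been reached yet in the final year
    beforeAnniversary <- (to$mon < from$mon) |
        (to$mon == from$mon & to$mday < from$mday)
    as.integer(years - ifelse(beforeAnniversary, 1L, 0L))
}

# Darwin's dates: both approaches agree
print(wholeYears(as.Date("1809-02-12"), as.Date("1882-04-19")))  # 73

# One day before a tenth birthday: the two approaches disagree
from <- as.Date("2000-03-01")
to <- as.Date("2010-02-28")
naive <- as.integer(as.numeric(to - from) / 365)  # 3651 days / 365, truncated to 10
exact <- wholeYears(from, to)                     # 9
print(c(naive=naive, exact=exact))
```

The 365-day division stays within one year of the calendar result, so it is fine for a rough figure, but the calendar comparison is safer when the exact age matters.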
Polymorphism: for a DeceasedPerson another function is used, which calculates the age from both deathDate and birthDate:
age.of.death <- function(individual)
{
    as.integer((individual@deathDate-individual@birthDate)/365)
}
# register age.of.death as the 'age' method for DeceasedPerson and its subclasses
setMethod(age,signature=c("DeceasedPerson"),definition=age.of.death)
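To see how a subclass picks up a method defined on its parent, here is a small self-contained sketch. The classes and the generic demoAge are hypothetical stand-ins, named so that they do not clash with the definitions in this post; existsMethod() and hasMethod() come from the methods package.

```r
library(methods)

# Hypothetical stand-in classes
setClass("DemoLiving", representation(birthDate="Date"))
setClass("DemoDeceased",
    representation(deathDate="Date"),
    contains="DemoLiving")
setClass("DemoDeceasedResearcher", contains="DemoDeceased")

# Hypothetical generic with one method per class of interest
setGeneric("demoAge", function(individual) standardGeneric("demoAge"))
setMethod("demoAge", "DemoLiving", function(individual)
{
    as.integer((Sys.Date()-individual@birthDate)/365)
})
setMethod("demoAge", "DemoDeceased", function(individual)
{
    as.integer((individual@deathDate-individual@birthDate)/365)
})

# existsMethod() only sees methods defined directly for the class;
# hasMethod() also accepts inherited ones
direct <- existsMethod("demoAge", "DemoDeceasedResearcher")
inherited <- hasMethod("demoAge", "DemoDeceasedResearcher")
print(c(direct=direct, inherited=inherited))

darwin <- new("DemoDeceasedResearcher",
    birthDate=as.Date("1809-02-12"),
    deathDate=as.Date("1882-04-19"))
print(demoAge(darwin))  # uses the DemoDeceased method
```

No method is registered for the most specific class, yet the call still resolves to the method of its nearest ancestor.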
OK, let's play with our classes. We can first create a new instance of Scientist for Craig Venter:
craigVenter <- new(
    "Scientist",
    firstName="Craig",
    lastName="Venter",
    birthPlace="Salt Lake City",
    birthDate=as.Date("1946-10-14", "%Y-%m-%d"),
    knownFor=c("The Institute for Genomic Research","J. Craig Venter Institute")
)
... and Charles Darwin is a DeceasedScientist:
charlesDarwin <- new(
    "DeceasedScientist",
    firstName="Charles",
    lastName="Darwin",
    birthDate=as.Date("1809-02-12", "%Y-%m-%d"),
    deathDate=as.Date("1882-04-19", "%Y-%m-%d"),
    knownFor=c("Natural Selection","The Voyage of the Beagle")
)
Hey, we know where Charles was born! A slot can be set after creation with the @ operator:
charlesDarwin@birthPlace <- "Shrewsbury"
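One caveat with direct slot assignment: R checks the class of the new value but does not re-run the validity function. The sketch below illustrates this with a hypothetical stand-in class, DemoNamed, so that it runs on its own.

```r
library(methods)

# Hypothetical stand-in class with a validity check on its single slot
setClass("DemoNamed",
    representation(firstName="character"),
    validity=function(object)
    {
        if (length(object@firstName) > 0) TRUE else "firstName is empty"
    }
)

someone <- new("DemoNamed", firstName="Charles")

# Accepted: the value is a character vector, and validity is not re-checked here
someone@firstName <- character(0)

# An explicit call to validObject() reveals the problem
outcome <- tryCatch(
    validObject(someone),
    error=function(e) conditionMessage(e)
)
print(outcome)

# A value of the wrong class is rejected at assignment time
wrongType <- tryCatch(
    { someone@firstName <- 42; "accepted" },
    error=function(e) "rejected"
)
print(wrongType)
```

Calling validObject() after a series of slot assignments is a simple way to make sure the object is still consistent.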
The following statement fails because the firstName is empty:
> try(new("Person",lastName="Darwin",birthDate=as.Date("1809-02-12", "%Y-%m-%d")),FALSE)
Error in validObject(.Object) : invalid class "Person" object: FALSE
Is Darwin a valid object?
> validObject(charlesDarwin)
[1] TRUE
Print both individuals:
> charlesDarwin
An object of class “DeceasedScientist”
Slot "knownFor":
[1] "Natural Selection" "The Voyage of the Beagle"

Slot "firstName":
[1] "Charles"

Slot "lastName":
[1] "Darwin"

Slot "birthDate":
[1] "1809-02-12"

Slot "birthPlace":
[1] "Shrewsbury"

Slot "deathDate":
[1] "1882-04-19"

Slot "deathPlace":
character(0)

> craigVenter
An object of class “Scientist”
Slot "knownFor":
[1] "The Institute for Genomic Research" "J. Craig Venter Institute"

Slot "firstName":
[1] "Craig"

Slot "lastName":
[1] "Venter"

Slot "birthDate":
[1] "1946-10-14"

Slot "birthPlace":
[1] "Salt Lake City"
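The slot-by-slot listing above is R's default display for S4 objects. As a minimal sketch, a 'show' method can replace it with a one-line summary. The class DemoShown is a hypothetical stand-in with fewer slots, so the example runs on its own and leaves Scientist untouched.

```r
library(methods)

# Hypothetical stand-in class
setClass("DemoShown",
    representation(
        firstName="character",
        lastName="character",
        knownFor="character"
    ))

# 'show' is the generic that R calls when an S4 object is printed
setMethod("show", "DemoShown", function(object)
{
    cat(object@firstName, " ", object@lastName,
        " (known for: ", paste(object@knownFor, collapse="; "), ")\n",
        sep="")
})

venter <- new("DemoShown",
    firstName="Craig",
    lastName="Venter",
    knownFor=c("The Institute for Genomic Research","J. Craig Venter Institute"))

# Capture what would be printed at the console
printed <- capture.output(show(venter))
print(printed)
```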
Let's use the 'is' function to test class membership:
> is(craigVenter,"Person")
[1] TRUE
> is(craigVenter,"DeceasedScientist")
[1] FALSE
> is(charlesDarwin,"DeceasedScientist")
[1] TRUE
Finally, let's invoke the polymorphic function 'age' for both individuals:
> age(charlesDarwin)
[1] 73 #age.of.death was called
> age(craigVenter)
[1] 63 #generic 'age'


Full source code

setClass("Person",
    representation(
        firstName="character",
        lastName="character",
        birthDate="Date",
        birthPlace="character"
    ))

setValidity("Person",
    function(object)
    {
        length(object@firstName)>0 &&
        length(object@lastName)>0
    }
)

setClass("DeceasedPerson",
    representation(
        deathDate="Date",
        deathPlace="character"
    ),
    contains="Person"
)

setClass("Scientist",
    representation(
        knownFor="character"
    ),
    contains="Person"
)

age <- function(individual)
{
    as.integer((Sys.Date()-individual@birthDate)/365)
}

setGeneric("age")

age.of.death <- function(individual)
{
    as.integer((individual@deathDate-individual@birthDate)/365)
}

setClass("DeceasedScientist",
    contains=c("Scientist","DeceasedPerson")
)

setMethod(age,signature=c("DeceasedPerson"),definition=age.of.death)

craigVenter <- new(
    "Scientist",
    firstName="Craig",
    lastName="Venter",
    birthPlace="Salt Lake City",
    birthDate=as.Date("1946-10-14", "%Y-%m-%d"),
    knownFor=c("The Institute for Genomic Research","J. Craig Venter Institute")
)

charlesDarwin <- new(
    "DeceasedScientist",
    firstName="Charles",
    lastName="Darwin",
    birthDate=as.Date("1809-02-12", "%Y-%m-%d"),
    deathDate=as.Date("1882-04-19", "%Y-%m-%d"),
    knownFor=c("Natural Selection","The Voyage of the Beagle")
)

try(new("Person",lastName="Darwin",birthDate=as.Date("1809-02-12", "%Y-%m-%d")),FALSE)

charlesDarwin@birthPlace <- "Shrewsbury"

validObject(charlesDarwin)

charlesDarwin
craigVenter

is(craigVenter,"Person")
is(craigVenter,"DeceasedScientist")
is(charlesDarwin,"DeceasedScientist")
age(charlesDarwin)
age(craigVenter)


That's it!
Pierre

3 comments:

Paul Guermonprez said...

In self-respecting OO languages, as in religions, inbreeding within hierarchies is a sin!

Pierre Lindenbaum said...

@Paul hopefully Religion & Darwin don't mix :-)

Ricardo Pietrobon said...

Pierre, very nice post. Do you know of any ongoing efforts to create a package that would link R to ontology software? The two main candidates that come to mind are Protege (since it is open source) and TopBraid (since it is of really high quality). I was looking for something where I could both instantiate and query (SPARQL) the ontology from the R side.

thanks