Friday, August 19, 2011

Learning to use RPy2 (4)

This is a brief post about how to access the components of R's S3 and S4 classes (attributes for S3, slots for S4) using RPy2. These are just a few notes on what I've found, so YMMV, of course. There's a pretty detailed post about this kind of thing here. And of course, the R documentation is extensive (but more complex).

Note: the Bioconductor classes can be handled by using extensions to RPy2. That's for a later post. First some standard imports and a bit of setup:

>>> import rpy2.robjects as robjects
>>> from rpy2.robjects.packages import importr
>>> r = robjects.r
>>> g = robjects.globalenv
>>>

We define a toy S3 class starting from a vector, simply by giving it the "class" attribute, and then define a version of print that can work on it:

>>> r('''
... x <- rep(0:1, c(3,6))
... myS3_class <- x
... class(myS3_class) <- 'myS3_class'
... print.myS3_class <- function(x, ...) {
... cat("This is my vector:\n")
... cat(paste(x[1:5]), "...\n")
... }
... ''')
<SignatureTranslatedFunction - Python:0x1446648 / R:0xc5bf34>


>>> myS3_class = r['myS3_class']
>>> myS3_class.rclass
<rpy2.rinterface.SexpVector - Python:0x194550 / R:0x2c51048>
>>> myS3_class.r_repr()
'structure(c(0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L), class = "myS3_class")'
>>> attributes = r['attributes']
>>> aL = attributes(myS3_class)
>>> type(aL)
<class 'rpy2.robjects.vectors.ListVector'>
>>> aL.rx2('class')
<StrVector - Python:0x23b00a8 / R:0x2c51048>
['myS3_class']
>>> len(aL)
1
>>> aL.rx2('structure')
rpy2.rinterface.NULL
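
The attribute lookup above can be wrapped in a small helper. This is only a sketch: the function names attributes_to_dict and demo are made up for illustration, and the helper assumes nothing about its argument beyond a .names attribute and an .rx2() method, which is what RPy2's ListVector provides. The RPy2 import lives inside demo() so that the helper itself can be read (and tried with a stand-in object) without R installed.

```python
def attributes_to_dict(attr_list):
    """Copy an R attribute list into a plain Python dict.

    Relies only on .names (the element names) and .rx2(name)
    (R's [[name]] lookup), as found on RPy2's ListVector.
    """
    return {name: tuple(attr_list.rx2(name)) for name in attr_list.names}


def demo():
    """Hypothetical end-to-end use; needs rpy2 and a working R install."""
    import rpy2.robjects as robjects

    r = robjects.r
    # Same toy S3 object as in the post, built in one R expression.
    r('myS3_class <- structure(rep(0:1, c(3, 6)), class = "myS3_class")')
    attributes = r['attributes']
    return attributes_to_dict(attributes(r['myS3_class']))
```

Turning each value into a tuple mirrors the tuple(myS3_class) trick used in this post: iterating over an RPy2 vector yields ordinary Python values.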

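To get at the actual numeric data (as.integer() strips the class attribute, whereas structure() called with no further arguments hands the object back unchanged, class and all):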

>>> len(myS3_class)
9
>>> v = r('as.integer(myS3_class)')
>>> v
<IntVector - Python:0x23b0fa8 / R:0xb77c80>
[ 0, 0, 0, ..., 1, 1, 1]
>>> structure = r['structure']
>>> structure(myS3_class)
<IntVector - Python:0x23b71e8 / R:0x2864a00>
[ 0, 0, 0, ..., 1, 1, 1]
>>> tuple(myS3_class)
(0, 0, 0, 1, 1, 1, 1, 1, 1)

Now, look at symbols in the global "environment":

>>> import rpy2.rinterface as ri
>>> tuple(ri.globalenv)
('myS3_class', 'print.myS3_class', 'x')

Let's try a simple S4 class:

>>> r('''
... setClass(
... Class="myS4_class",
... representation=representation(
... times = "numeric",
... names = "character"
... )
... )
... ''')
<StrVector - Python:0x23b0af8 / R:0xdc3068>
['myS4_class']
>>>

and instantiate one of them:

>>> r('''
... myS4_class = new(
... Class="myS4_class",
... times=c(1,2,4),
... names=c('a','b')
... )
... ''')
<RS4 - Python:0x23b0ad0 / R:0xa4d550>
>>>

Now, to access its slots:

>>> myS4_class = r['myS4_class']
>>> slotNames = r['slotNames']
>>> slotNames(myS4_class)
<StrVector - Python:0x23b0a58 / R:0xae3de8>
['times', 'names']
>>> myS4_class.do_slot('times')
<FloatVector - Python:0x23b0ad0 / R:0x2c755b8>
[1.000000, 2.000000, 4.000000]
>>>
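
Reading every slot in one go can be sketched as below. The name slots_to_dict is hypothetical. The helper assumes only that the object has a do_slot() method, as the RS4 object above does, and that the slot names can be iterated over as strings, which holds for the StrVector returned by slotNames.

```python
def slots_to_dict(s4_obj, slot_names):
    """Read each named slot with do_slot() and return a dict of tuples."""
    return {name: tuple(s4_obj.do_slot(name)) for name in slot_names}


# With the objects from the session above, the call would look like:
#     slots_to_dict(myS4_class, slotNames(myS4_class))
```

Passing the slot names in explicitly keeps the helper independent of how they were obtained.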

Assign to one:

>>> v = robjects.StrVector('abcde')
>>> myS4_class.do_slot_assign('names', v)
>>> print(myS4_class)
An object of class "myS4_class"
Slot "times":
[1] 1 2 4

Slot "names":
[1] "a" "b" "c" "d" "e"

>>> v = myS4_class.do_slot('names')
>>> v

['a', 'b', 'c', 'd', 'e']
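
Note that the assignment above went through even though "names" now has five entries and "times" only three. A cautious pattern is to re-check the object after assigning. This is a minimal sketch, assuming do_slot_assign does no validity checking of its own (not verified here). The name assign_slot_checked is made up, and the validate argument can be any callable that raises on an invalid object. With RPy2, R's validObject, fetched as r['validObject'], is one candidate. The toy class in this post defines no validity function, so such a check would only cover what R checks by default.

```python
def assign_slot_checked(s4_obj, slot_name, value, validate):
    """Assign a slot in place, then run a validity check on the object.

    `validate` is any callable that raises if the object is invalid;
    its return value is ignored.
    """
    s4_obj.do_slot_assign(slot_name, value)
    validate(s4_obj)
    return s4_obj


# With the objects from the session above, the call would look like:
#     assign_slot_checked(myS4_class, 'names', v, r['validObject'])
```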


That's it for this example.