Thursday, July 9, 2009

The eigenvalue problem

In learning Bioinformatics, I've come across matrices and matrix operations repeatedly. For example, eigenvectors and eigenvalues are at the heart of principal component analysis. Since I quit math before I got that far, I've been trying to catch up. I'm using a simple linear algebra textbook (McMahon).

In keeping with the spirit I've described, I'll post my homework here.

Suppose we have a 2 x 2 matrix A (i.e. of order 2):

A = [ 5  2 ]
    [ 9  2 ]


The characteristic polynomial of A is given by:

det|A - kI|


I is the identity matrix of order 2:

I = [ 1  0 ]
    [ 0  1 ]


And k is a variable (I'm using k instead of λ, the traditional symbol).

kI = [ k  0 ]
     [ 0  k ]

A - kI = [ 5-k   2  ]
         [  9   2-k ]


Remembering how to evaluate the determinant of a 2 x 2 matrix:

M = [ a  b ]
    [ c  d ]

det|M| = (a * d) - (b * c)


so:

det|A| = 5 * 2 - 2 * 9 = 10 - 18 = -8
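
For anyone following along in R, base R's det() computes this directly (the snippet restates A here; the fuller R session appears at the end of the post):

v = c(5, 2, 9, 2)
A = matrix(v, nrow=2, byrow=TRUE)   # the matrix A from above
det(A)                              # -8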


to get the characteristic equation, set:

det |A - kI| = 0

A - kI = [ 5-k   2  ]
         [  9   2-k ]

det|A - kI| = (5 - k)(2 - k) - 18
            = 10 - 7k + k^2 - 18
            = k^2 - 7k - 8
            = (k - 8)(k + 1) = 0


The solutions for k are then 8 and -1. These are the eigenvalues of A. It is also true (this is the Cayley-Hamilton theorem: every square matrix satisfies its own characteristic equation) that:

A^2 - 7A - 8I = 0


(I'm using ^ to indicate a superscript.) So we can check our math:

A = [ 5  2 ]
    [ 9  2 ]

A^2 = [ 5  2 ] * [ 5  2 ] = [ 43  14 ]
      [ 9  2 ]   [ 9  2 ]   [ 63  22 ]

A^2 - 7A - 8I =
[ 43  14 ] - [ 35  14 ] - [ 8  0 ] = [ 0  0 ]
[ 63  22 ]   [ 63  14 ]   [ 0  8 ]   [ 0  0 ]
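
Both steps are easy to confirm in R, continuing with the A defined in the snippet above: %*% is matrix multiplication, diag(2) is the 2 x 2 identity, and base R's polyroot() takes a polynomial's coefficients in increasing order of power:

A %*% A - 7*A - 8*diag(2)   # the 2 x 2 zero matrix
polyroot(c(-8, -7, 1))      # 8 and -1 (returned as complex numbers)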


The eigenvectors of A satisfy this equation:

Av = kv 


where v is a vector with values x and y:

v = [ x ]
    [ y ]


So for the first eigenvalue we have:

A v1 = k1 v1

[ 5  2 ] [ x ] = 8 [ x ]
[ 9  2 ] [ y ]     [ y ]


That is:

5x + 2y = 8x => y = 3/2 x


and

9x + 2y = 8y => y = 3/2 x


We can choose x = 2, then y = 3 and

v1 = [ 2 ]
     [ 3 ]


Check our math:

A v1 = k1 v1

[ 5  2 ] [ 2 ] = 8 [ 2 ]
[ 9  2 ] [ 3 ]     [ 3 ]

[ 16 ] = [ 16 ]
[ 24 ]   [ 24 ]
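
The same check in R, again with A from the snippet above:

v1 = c(2, 3)   # the eigenvector we just found
A %*% v1       # 16, 24
8 * v1         # 16, 24 -- matches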


Similarly, for the second eigenvalue we have:

A v2 = k2 v2

[ 5  2 ] [ x ] = -1 [ x ]
[ 9  2 ] [ y ]      [ y ]


That is:

5x + 2y = -x => y = -3x


and

9x + 2y = -y => y = -3x


Choose x = 1, then y = -3

v2 = [  1 ]
     [ -3 ]


Check the math:

A v2 = k2 v2

[ 5  2 ] [  1 ] = -1 [  1 ]
[ 9  2 ] [ -3 ]      [ -3 ]

[ -1 ] = [ -1 ]
[  3 ]   [  3 ]
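
And the matching check in R for the second eigenpair:

v2 = c(1, -3)
A %*% v2    # -1, 3
-1 * v2     # -1, 3 -- matches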


The eigenvectors are usually ordered according to the magnitude of the corresponding eigenvalue. That's why 8 is the first eigenvalue. The eigenvectors can be normalized to have length = 1. Here, the length of v1:

v1 = [ 2 ]
     [ 3 ]


is

sqrt(2^2 + 3^2) = sqrt(13)


so, the normalized v1 is:

v1 = [ 2/sqrt(13) ] = [ 0.555 ]
     [ 3/sqrt(13) ]   [ 0.832 ]
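
In R, dividing v1 by its length gives the same normalized vector (it matches the first column of the eigen() output shown below):

v1 / sqrt(sum(v1^2))   # 0.5547002 0.8320503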


The trace of a matrix A is the sum of the entries along its diagonal:

tr(A) = 5 + 2 = 7


The determinant of A (computed above) is -8.

The trace of a matrix also equals the sum of its eigenvalues (the sum of the k's), and the determinant equals their product. With k1 = 8 and k2 = -1, the sum is 8 + (-1) = 7 = tr(A), and the product is 8 * (-1) = -8 = det(A), so both relationships check out for A.

In R:

v = c(5,2,9,2)
A = matrix(v,nrow=2,byrow=T)
A
     [,1] [,2]
[1,]    5    2
[2,]    9    2

eigen(A)

$values
[1]  8 -1

$vectors
          [,1]       [,2]
[1,] 0.5547002 -0.3162278
[2,] 0.8320503  0.9486833
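
Continuing in the same session, the trace and determinant relationships can be checked numerically:

sum(diag(A))           # 7, the trace
sum(eigen(A)$values)   # 7: 8 + (-1)
det(A)                 # -8
prod(eigen(A)$values)  # -8: 8 * (-1)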


Why do eigenvalues and eigenvectors matter to us? It turns out that if we have an n x p matrix like

B = [ x1  y1 ]
    [ x2  y2 ]
    [ x3  y3 ]
    [ x4  y4 ]


where the x and y values have been separately normalized: they are z-scores, obtained by subtracting the mean of each column and then dividing by its standard deviation.

We construct a covariance matrix:

          x      y
C =  x [ 1.00    z   ]
     y [  z     1.00 ]


where

z = (1/N) * [ the sum over i of (zx_i * zy_i) ]


Then the eigenvalues and eigenvectors of C identify the principal components of B.
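
A minimal sketch of this construction in R, using a made-up 4 x 2 data matrix (the numbers are arbitrary, purely for illustration). Note that scale() and cor() use the N - 1 convention for the standard deviation rather than the 1/N written above:

set.seed(7)                       # arbitrary toy data
B = matrix(rnorm(8), nrow=4)      # 4 observations of 2 variables
Z = scale(B)                      # column z-scores
C = crossprod(Z) / (nrow(B) - 1)  # t(Z) %*% Z / (N-1), same as cor(B)
eigen(C)                          # eigenvalues and eigenvectors of C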

[UPDATE: This last part is not correct. I confused the correlation and covariance matrix. See here for details.]