Thursday, August 13, 2009

Student's t test 3

The paired t test is used when two sets of values are related, for example because both measurements in each pair were made on the same subject.

In this case, it is the mean of the differences between the paired values, divided by its standard error, that follows the t distribution (with n - 1 degrees of freedom for n pairs) when the true mean difference is zero.

This example is from Dalgaard.

pre = c(5260,5470,5640,
6180,6390,6515,6805,
7515,7515,8230,8770)
post = c(3910,4220,3885,
5160,5645,4680,5265,
5975,6790,6900,7335)

plot(pre,post,pch=16,
col='blue',cex=2)

diff = post-pre




Not only are the values correlated, but the difference is always negative:

> diff
[1] -1350 -1250 -1755 -1020
[5] -745 -1835 -1540 -1540
[9] -725 -1330 -1435
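
Both observations can be checked numerically. This is a minimal sketch that repeats the data so it runs on its own; the names `d`, `r` and `all_negative` are introduced here for illustration only (`d` is used instead of `diff` to avoid masking R's built-in `diff` function).

```r
pre  <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335)

d <- post - pre              # same values as diff above

r <- cor(pre, post)          # Pearson correlation between the paired values
all_negative <- all(d < 0)   # TRUE if post is below pre for every subject

print(r)
print(all_negative)
```

A correlation close to 1 is what makes the paired test so much more sensitive here than an unpaired comparison of the two group means would be.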


t.test(pre,post,paired=TRUE)


> t.test(pre,post,paired=TRUE)

Paired t-test

data: pre and post
t = 11.9414, df = 10,
p-value = 3.059e-07
alternative hypothesis: true difference
in means is not equal to 0
95 percent confidence interval:
1074.072 1566.838
sample estimates:
mean of the differences
1320.455
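
Since the paired test works on the differences, it should give the same result as a one-sample t test of those differences against zero. The sketch below checks this; the names `paired` and `one_sample` are introduced here for illustration, and the data are repeated so the block runs on its own.

```r
pre  <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335)

# Paired test on the two vectors
paired <- t.test(pre, post, paired = TRUE)

# One-sample test on the per-subject differences, null hypothesis mean = 0
one_sample <- t.test(pre - post, mu = 0)

# The t statistics and p-values should agree
print(unname(paired$statistic))
print(unname(one_sample$statistic))
print(all.equal(paired$p.value, one_sample$p.value))
```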


We can do the test by hand, as follows:

> mean(diff)
[1] -1320.455
> sd(diff)
[1] 366.7455

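> # standard error of the mean difference; there are n = 11 pairs
> x = sd(diff)/sqrt(length(diff))
> x
[1] 110.5779
> abs(mean(diff))/x
[1] 11.94139


This matches the t value reported by t.test. The question now is, what fraction of the values from the t-distribution with df = 10 are greater than 11.94?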

S = seq(0,1,by=0.001)
w = rt(1000000,df=10)
y = quantile(w,S)
round(tail(y))


> round(tail(y))
99.5% 99.6% 99.7%
3 3 3
99.8% 99.9% 100.0%
4 4 11


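The short answer: not very many! Even the largest of a million simulated values is only about 11, which is consistent with the tiny p-value reported by t.test.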
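
The tail fraction can also be computed exactly with `pt` rather than estimated by simulation, and the confidence interval can be rebuilt with `qt`. This is a sketch under the assumption of a two-sided test, as in the t.test call above; the variable names and the seed are introduced here for illustration only, and the data are repeated so the block runs on its own.

```r
pre  <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
post <- c(3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335)

d      <- pre - post          # same sign convention as t.test(pre, post)
n      <- length(d)           # 11 pairs
df     <- n - 1               # 10 degrees of freedom
se     <- sd(d) / sqrt(n)     # standard error of the mean difference
t_stat <- mean(d) / se

# Exact two-sided tail probability: P(|T| >= t_stat)
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95 percent confidence interval for the mean difference
ci <- mean(d) + c(-1, 1) * qt(0.975, df = df) * se

# Simulated fraction of draws at least as extreme as t_stat
set.seed(1)                   # arbitrary seed, for reproducibility only
w    <- rt(1000000, df = df)
frac <- mean(abs(w) >= t_stat)

print(t_stat)
print(p_two_sided)
print(ci)
print(frac)
```

The exact p-value and interval should agree with the t.test output above, while the simulated fraction will usually be zero or very close to it, because a million draws are too few to sample a tail this small reliably.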