1. Introduction

  2. Eyeballing

  3. Significance

  4. Power

  5. Point of Stability

  6. Recap

Is it significant?

Typically if we observe a correlation between two variables, we want to know whether the correlation is statistically significant.

Usually our null hypothesis is that the true correlation in the population is zero:

  • ρ = 0

... and we would like to know how likely it would be to observe a correlation at least as large as ours, given the null hypothesis.

  • What factors do you think should affect the statistical significance of the result?
  • ?

It turns out that we can calculate a t-value for the correlation coefficient using this formula:

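$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} $$

where r is the sample correlation coefficient and n is the sample size. Under the null hypothesis, t follows a t distribution with n − 2 degrees of freedom.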

Is our correlation significant?

On the Matlab command line, enter the equation for t. Set the values of r and n to reflect the values we used in the sample.

?
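A minimal sketch of the command-line calculation is given below. The value r = 0.21 is the correlation this section compares against later on; n = 50 is only a placeholder assumption, because the sample size is not restated here, so substitute the n from your own sample.

```matlab
% Sketch: t-value for a sample correlation coefficient.
r = 0.21;   % sample correlation (the value referred to later in this section)
n = 50;     % ASSUMED placeholder sample size; replace with your own n

% t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom
df = n - 2;
t  = r * sqrt(df) / sqrt(1 - r^2);

fprintf('t = %.3f with %d degrees of freedom\n', t, df);
```

The resulting t grows with both r and n, so the same correlation can be significant in a large sample and not in a small one.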

The function tcdf(t, df) returns the area under the t distribution with df degrees of freedom to the left of a given t-value.

  • Use tcdf to find out if the correlation is significant
  • Is this value significant?
    ?
  • Did you expect it to be significant from eyeballing the data?
    ?
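One way the tcdf step might look is sketched here. It assumes a one-tailed test (we only ask about correlations of 0.21 or greater), a significance level of 0.05, and the same placeholder n = 50 as an assumption; tcdf comes from the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: one-tailed p-value for a correlation, using tcdf.
r = 0.21;   % sample correlation (value referred to later in this section)
n = 50;     % ASSUMED placeholder sample size; replace with your own n
alpha = 0.05;

df = n - 2;
t  = r * sqrt(df) / sqrt(1 - r^2);

pLeft      = tcdf(t, df);   % area to the left of t
pOneTailed = 1 - pLeft;     % area to the right: P(T >= t) under the null

isSignificant = pOneTailed < alpha;
fprintf('p = %.4f, significant at %.2f: %d\n', pOneTailed, alpha, isSignificant);
```

Because tcdf gives the area to the left, the p-value for a positive correlation is the remaining area to the right. For a two-tailed test this value would be doubled.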

What does significance mean again?

What exactly does it mean if the p value is 0.05?

?

Let's try it!

Generate 1000 samples from a population with ρ = 0, work out their correlation coefficients, and plot a histogram.

  • You can do this using sections 2 and 3 of the provided script
  • What proportion of samples have a correlation coefficient of 0.21 or greater?
  • Add the t distribution you calculated above to the plot, to check it matches
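The provided script is not reproduced here, so the following is an independent sketch of the same steps rather than the script itself. It assumes normally distributed, independent x and y (so ρ = 0) and a placeholder n = 50. The overlaid curve is the density of r implied by the t distribution, obtained by a change of variables from t to r; tpdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: sampling distribution of r under the null hypothesis (rho = 0).
rng(1);             % fix the seed so the run is repeatable
nSamples = 1000;    % number of simulated samples
n    = 50;          % ASSUMED placeholder sample size; replace with your own n
rObs = 0.21;        % correlation to compare against

rSim = zeros(nSamples, 1);
for i = 1:nSamples
    x = randn(n, 1);
    y = randn(n, 1);        % generated independently of x, so rho = 0
    c = corrcoef(x, y);     % 2-by-2 correlation matrix
    rSim(i) = c(1, 2);      % off-diagonal entry is the sample r
end

% Proportion of simulated samples with r at least as large as rObs
propExceed = mean(rSim >= rObs);
fprintf('Proportion with r >= %.2f: %.3f\n', rObs, propExceed);

% Histogram of r, scaled so that its area is 1
histogram(rSim, 30, 'Normalization', 'pdf');
hold on

% Density of r implied by the t distribution:
% f(r) = tpdf(t(r), n - 2) * dt/dr, where dt/dr = sqrt(n - 2) / (1 - r^2)^(3/2)
rGrid = linspace(-0.99, 0.99, 400);
tGrid = rGrid .* sqrt(n - 2) ./ sqrt(1 - rGrid.^2);
dtdr  = sqrt(n - 2) ./ (1 - rGrid.^2).^(3/2);
plot(rGrid, tpdf(tGrid, n - 2) .* dtdr, 'LineWidth', 1.5);
xlabel('sample correlation r');
ylabel('density');
hold off
```

If the simulation and the formula agree, the curve should follow the outline of the histogram, and propExceed should be close to the one-tailed p-value from tcdf.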

Try it again with a sample size n=10 or n=100.

  • For each value of n, how many sample correlation coefficients have r>0.21?
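The comparison across sample sizes could be sketched as follows, again assuming independent normal data. The correlation is computed directly from its definition for all samples at once, which is an implementation choice made here for speed and not something taken from the provided script; the subtraction of column means relies on implicit expansion (MATLAB R2016b or later).

```matlab
% Sketch: how often does r exceed 0.21 by chance, for different sample sizes?
rng(1);                 % fix the seed so the run is repeatable
nSamples = 1000;        % simulated samples per sample size
rObs     = 0.21;        % correlation to compare against
nValues  = [10 100];    % sample sizes suggested in the text
counts   = zeros(size(nValues));

for k = 1:numel(nValues)
    n = nValues(k);
    x = randn(n, nSamples);     % each column is one sample of x
    y = randn(n, nSamples);     % independent of x, so rho = 0

    % Column-wise Pearson correlation from centred data
    xc = x - mean(x, 1);
    yc = y - mean(y, 1);
    rSim = sum(xc .* yc, 1) ./ sqrt(sum(xc.^2, 1) .* sum(yc.^2, 1));

    counts(k) = sum(rSim > rObs);
    fprintf('n = %3d: %d of %d samples have r > %.2f\n', ...
            n, counts(k), nSamples, rObs);
end
```

With small samples, large correlations arise by chance much more often, which is why the same r can be unremarkable at n=10 and rare at n=100.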

What is the critical r value for significance?

Finally let's work out exactly what the minimum r value would have been to give us a significant effect.

To do this we first use the function tinv, which gives us the critical t-value corresponding to an input p value.

?

I make it t=1.64
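A sketch of the tinv call is shown below, assuming a one-tailed test at p = 0.05 and a placeholder n = 50. The critical value depends on the degrees of freedom n − 2, so the number it prints will only match the figure quoted above for the sample size actually used in this tutorial. tinv is part of the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: critical t-value for a one-tailed test.
alpha = 0.05;   % significance level
n     = 50;     % ASSUMED placeholder sample size; replace with your own n
df    = n - 2;

% tinv is the inverse of tcdf: it returns the t-value with area
% (1 - alpha) to its left, leaving area alpha in the upper tail.
tCrit = tinv(1 - alpha, df);
fprintf('Critical t (one-tailed, df = %d): %.3f\n', df, tCrit);
```

As the degrees of freedom increase, this critical value falls towards the normal-distribution value of about 1.645.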

Then we need to find the r value corresponding to t=1.64.

The problem is that it is not so easy to rearrange this formula:

$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} $$

... to get r

Instead, I would suggest working out t for a range of r values, using the formula above, and finding the t value nearest to 1.64. Then take the corresponding r value as rcrit.

?

I make rcrit = 0.24.
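The search over r values might be sketched like this. The grid spacing of 0.001 and the placeholder n = 50 are assumptions, so the result will differ from the value quoted above unless n matches the tutorial's sample. As an optional cross-check, squaring the formula and solving for r gives r = t / sqrt(t^2 + n - 2), which the sketch compares against the grid result.

```matlab
% Sketch: find the critical r by evaluating t over a grid of r values.
alpha = 0.05;   % significance level (one-tailed)
n     = 50;     % ASSUMED placeholder sample size; replace with your own n
df    = n - 2;

tCrit = tinv(1 - alpha, df);            % critical t-value

rGrid = 0:0.001:0.999;                  % candidate r values (assumed spacing)
tGrid = rGrid .* sqrt(df) ./ sqrt(1 - rGrid.^2);

% Pick the r whose t-value is nearest to the critical t
[~, idx] = min(abs(tGrid - tCrit));
rCrit = rGrid(idx);

% Cross-check using the algebraic rearrangement of the same formula
rCheck = tCrit / sqrt(tCrit^2 + df);

fprintf('rcrit from grid search: %.3f (closed form: %.4f)\n', rCrit, rCheck);
```

The grid result is only accurate to the grid spacing, so a finer grid gives a closer match to the closed-form value.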
