Data Analysis for Neuroscientists IV:
Sample vs. Population
Island of Shetland - population 22,210
Last week we generated a simulated height 'dataset':
- h - drawn from a Normal distribution with mean 179cm and standard deviation 7cm
- These are the true statistics for British men
Shetland is an island to the north of Scotland with a population of 22,210 and some impressive weather.
Generate a vector with simulated heights for 10,105 men (let's call this the entire male population of Shetland) with these statistics.
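One possible way to build this population, sketched in Python using only the standard library. The worksheet does not name a language, so Python is an assumption here, as are the fixed random seed and the constant names; only the vector name h and the numbers 10,105, 179 and 7 come from the text above.

```python
import random
import statistics

random.seed(0)  # assumption: fixed seed so the run is repeatable

N_MEN = 10_105   # the entire (simulated) male population of Shetland
MEAN_CM = 179    # mean height from the worksheet
SD_CM = 7        # standard deviation from the worksheet

# One simulated height per man, drawn from a Normal(179, 7) distribution
h = [random.gauss(MEAN_CM, SD_CM) for _ in range(N_MEN)]

# Sanity check: the simulated population should sit close to 179 and 7
print("number of men:", len(h))
print("population mean:", round(statistics.mean(h), 2))
print("population SD:", round(statistics.stdev(h), 2))
```

Because the heights are random draws, the printed mean and standard deviation will be close to, but not exactly, 179 and 7.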
Heights of Shetland Men
Whilst out walking on Shetland, I encounter our old friend Eric, with a group of 5 men.
From their height, I suspect they are not from around here. They seem to be a little taller than the local men.
Also, I don't like the look of them.
How can I determine whether these are bona fide Shetlanders or possibly an invading army?
Simulating samples from the null population
One way we can work out if Eric and his friends really belong to the local population is:
- find the mean height of my test group (Eric et al)
- draw lots of samples of size 5 from the true local population
- count how often I get a group whose mean height is at least as tall as that of Eric and co
- if this is very unusual, we may conclude that Eric and co are not as local as they claim to be
Let's explore what happens if we draw samples of 5 men from our population.
First, let's draw one random sample of 5 men from our population by:
- Creating an index vector ix with random numbers between 1 and 10,105
- Using it to pull out the heights of the indexed men into a new vector, s (for sample)
- Finding the mean m of the sample
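The three steps above might look like this in Python (an assumed language; the worksheet does not specify one). The names ix, s and m follow the text; the seed and SAMPLE_SIZE are illustrative. Note that Python counts from 0, so "between 1 and 10,105" becomes indices 0 to 10,104. Drawing random indices in this way samples with replacement, so the same man could in principle be picked twice, although with 10,105 men this is rare.

```python
import random
import statistics

random.seed(1)  # assumption: fixed seed so the run is repeatable

# Rebuild the simulated population so this sketch runs on its own
h = [random.gauss(179, 7) for _ in range(10_105)]

SAMPLE_SIZE = 5

# Index vector: 5 random positions in the population (0-based in Python)
ix = [random.randrange(len(h)) for _ in range(SAMPLE_SIZE)]

# Pull out the heights of the indexed men
s = [h[i] for i in ix]

# Mean height of this one sample
m = statistics.mean(s)

print("indices:", ix)
print("sample heights:", [round(x, 1) for x in s])
print("sample mean:", round(m, 1))
```

Running this with different seeds gives a different sample mean each time, which is the variability the next exercise explores.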
Now we are going to draw 1000 samples of 5 men each from the population of Shetlanders.
- Make a for loop to do this
- Plot a histogram of the sample means for the samples of 5 men
- The histogram shows the distribution of the sample mean, m, for samples of size 5
- What is the mean of the distribution of means?
- How does this compare to the mean height of the population of Shetlanders?
- What about the standard deviation of the distribution of means?
- The mean height of Eric's group was 184 cm.
What percentage of our samples of 5 true Shetlanders have a mean height at least as tall as that of Eric's group?
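A minimal sketch of the whole exercise in Python (an assumed language, standard library only). The seed, the constant names and the crude text histogram are illustrative choices, not part of the worksheet; a plotting library could be used instead of the text histogram.

```python
import math
import random
import statistics
from collections import Counter

random.seed(2)  # assumption: fixed seed so the run is repeatable

# Rebuild the simulated population so this sketch runs on its own
h = [random.gauss(179, 7) for _ in range(10_105)]

N_SAMPLES = 1000     # number of samples to draw
SAMPLE_SIZE = 5      # men per sample
ERIC_MEAN_CM = 184   # mean height of Eric's group

# Draw 1000 samples of 5 men and store each sample mean
sample_means = []
for _ in range(N_SAMPLES):
    ix = [random.randrange(len(h)) for _ in range(SAMPLE_SIZE)]
    s = [h[i] for i in ix]
    sample_means.append(statistics.mean(s))

# Mean and standard deviation of the distribution of sample means
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Percentage of samples whose mean is at least as tall as Eric's group
n_as_tall = sum(m >= ERIC_MEAN_CM for m in sample_means)
pct_as_tall = 100 * n_as_tall / N_SAMPLES

# Crude text histogram of the sample means in 2 cm bins
bins = Counter(2 * math.floor(m / 2) for m in sample_means)
for left_edge in sorted(bins):
    print(f"{left_edge}-{left_edge + 2} cm | " + "#" * (bins[left_edge] // 5))

print("population mean:", round(statistics.mean(h), 2))
print("mean of sample means:", round(mean_of_means, 2))
print("SD of sample means:", round(sd_of_means, 2))
print("% of samples with mean >= 184 cm:", round(pct_as_tall, 1))
```

For comparison, theory predicts that the mean of the sample means should be close to the population mean, and that their standard deviation should be close to the population standard deviation divided by the square root of the sample size, here 7/sqrt(5), roughly 3.13 cm.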