Exam code:1ST0
Population & Sample Types
What are populations, samples and sampling frames?
-
The population refers to the whole set of things which you are interested in
-
e.g. if a vet wanted to know how long a typical French bulldog sleeps for in a day
-
then the population would be all the French bulldogs in the world
-
-
Be careful – the word ‘population’ can mean different things in different contexts
-
e.g. ‘the population of the UK’ is usually used to refer to everyone in the UK
-
But if you’re studying UK dentists then the ‘population’ for your study would be restricted to all the dentists in the UK
-
-
-
A sample is a subset of the population from which data is collected
-
e.g. out of all the French bulldogs in the world (the population)
-
a vet might take a sample of French bulldogs from different cities and record how long they sleep in a day
-
-
-
A sampling frame (or sample frame) is a list of all members of the population
-
For example, a list of employees’ names within a company
-
Not every population will have an easily-accessible sampling frame
-
What’s the difference between a census and a sample?
-
A census collects data about all the members of a population
-
e.g. the government in the UK does a national census every 10 years to collect data about every person living in the UK at the time
-
The main advantage of a census is that it gives fully accurate results
-
The disadvantages of a census are:
-
It is time consuming and expensive to carry out
-
It can destroy or use up all the members of a population (imagine a company testing every single firework it produces)
-
-
-
Sampling is used to collect data from a subset of the population
-
The advantages of sampling are:
-
It is quicker and cheaper than a census
-
It leads to less data needing to be analysed
-
-
The disadvantages of sampling are:
-
It might not represent the population accurately
-
It could introduce bias, if some parts of the population are more represented in the sample than others
-
-
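The trade-off between a census and a sample can be illustrated with a small simulation. This is only a sketch: the population of 1000 sleep times is invented (drawn uniformly between 10 and 14 hours), and the names `population`, `census_mean` and `sample_means` are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: daily sleep time (hours) for 1000 dogs.
population = [rng.uniform(10, 14) for _ in range(1000)]

# A census uses every member of the population.
census_mean = statistics.mean(population)

# Five different samples of 20 members each, taken without replacement.
sample_means = [statistics.mean(rng.sample(population, 20)) for _ in range(5)]

print(f"census mean: {census_mean:.2f}")
for i, mean in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {mean:.2f}")
```

Each sample needs only 20 measurements instead of 1000, but the sample means differ from one another and from the census mean.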
What different sampling techniques do I need to know?
Random sampling methods
-
Simple random sampling: here every member of the population has an equal probability of being selected for the sample
-
To select a simple random sample of n members of the population
-
Uniquely number every member of the population
-
Then randomly select n different numbers using a random number generator (or other form of random selection)
-
-
-
Stratified sampling: the population is divided into separate groups (called strata) and then a random sample is taken from each group (stratum)
-
The proportion of a sample that belongs to a stratum is equal to the proportion of the population as a whole that belongs to that stratum
-
e.g. if 1/20 of the population belongs to a particular stratum
-
then 1/20 of the sample should come from that stratum
-
-
A population could be split into strata by age ranges, gender, occupation, etc.
-
-
See the spec points on ‘Random Samples’ and ‘Stratified Samples’ for more info on these two methods
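The proportional rule for stratified sampling can be sketched in a few lines. The school, its year-group sizes and the function name `stratified_sample` are all hypothetical, and rounding each stratum's share with `round()` is an assumption (with awkward numbers the rounded sizes may not add up exactly to the sample size).

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Take a proportional stratified sample.

    strata: dict mapping a stratum name to the list of its members.
    sample_size: total number of members wanted in the sample.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the sample = share of the population in this stratum.
        k = round(sample_size * len(members) / population_size)
        # Simple random sample (without replacement) within the stratum.
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 400: Year 9 makes up 20/400 = 1/20 of it.
strata = {
    "Year 7": [f"Y7-{i}" for i in range(1, 201)],
    "Year 8": [f"Y8-{i}" for i in range(1, 181)],
    "Year 9": [f"Y9-{i}" for i in range(1, 21)],
}
sample = stratified_sample(strata, sample_size=40, seed=1)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'Year 7': 20, 'Year 8': 18, 'Year 9': 2}
```

Year 9 is 1/20 of the population and supplies 2 of the 40 sampled members, which is 1/20 of the sample.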
Non-random sampling methods
Note: some of these methods include random elements, but the samples as a whole are not random
-
Judgement sampling: here you simply use your judgement to choose a sample of the population
-
You should attempt to make sure that the sample is representative of the population as a whole
-
-
Opportunity (convenience) sampling: a sample is formed using available members of the population who fit the study criteria
-
e.g. for a study of UK consumers you could stand on a street corner and interview the first 50 people who walk by
-
-
Cluster sampling: the population is divided into sensible ‘clusters’ and then a number of clusters are chosen at random to form the sample
-
e.g. a study of UK education might use schools as the clusters
-
then select 50 schools at random and use the people in those schools as the sample
-
-
Systematic sampling: a sample is formed by choosing members of a population at regular intervals using a list (sampling frame)
-
e.g. to select 1/10 of the students in a school as a sample
-
Start with a list of all students
-
Select one student at random as a ‘starting point’
-
Then also select every 10th student on the list after that starting point
-
(If necessary, wrap back around to the start of the list when you get to the end)
-
-
-
Quota sampling: the population is split into groups (like in stratified sampling) and a quota is specified for each group
-
The quota specifies how many members of the population are to be selected from each group
-
This will often be done in the same way as selecting the sizes of the strata in stratified sampling
-
Or other criteria could be used to set the quota for each group
-
-
Members of the population are selected until each quota is filled
-
If a member does not want to be included then another member is chosen instead
-
-
The members do not have to be selected randomly
-
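The quota sampling procedure can be sketched as follows. The list of passers-by, the group labels and the function name `quota_sample` are invented for illustration; the key point is that people are taken in the order they happen to arrive, not at random.

```python
def quota_sample(arrivals, quotas):
    """Fill each group's quota from people in the order they are met.

    arrivals: list of (person, group, agrees) tuples, in encounter order.
    quotas: dict mapping each group to the number of members required.
    """
    remaining = dict(quotas)
    sample = {group: [] for group in quotas}
    for person, group, agrees in arrivals:
        if not agrees:
            continue  # person declines, so move on to someone else
        if remaining.get(group, 0) > 0:
            sample[group].append(person)
            remaining[group] -= 1
        if all(count == 0 for count in remaining.values()):
            break  # every quota is filled
    return sample

# Hypothetical passers-by: every 3rd person is in group "B", the rest in "A",
# and every 4th person declines to take part.
arrivals = [
    (f"Person {i + 1}", "B" if i % 3 == 0 else "A", i % 4 != 3)
    for i in range(40)
]
quotas = {"A": 4, "B": 3}
sample = quota_sample(arrivals, quotas)
print({group: len(people) for group, people in sample.items()})
```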
What are the advantages and disadvantages of different sampling techniques?
-
In general
-
Most sampling techniques can be improved by taking a larger sample
-
You want to minimise the bias within a sample
-
This occurs when the sample is not representative of the population
-
The best way to avoid bias (when possible) is to use a random method
-
-
Sometimes the ‘best’ method would cost too much or take too much time
-
So you need to choose the ‘best method you can afford (or have the time for)’
-
-
A sample only gives information about the members in the sample
-
A different sample from the same population could lead to different conclusions about the population!
-
-
-
Simple random sampling:
-
This is the best sampling method for avoiding bias
-
Although it is possible that members of some groups in the population will not be represented in the sample
-
To avoid this, stratified sampling can be used instead
-
-
Most useful when you have a small population or want a small sample
-
e.g. children in a class
-
-
This cannot be used if it is not possible to number or list all the members of the population
-
e.g. the fish in a lake
-
-
-
Stratified sampling:
-
This should be used when the population can be split into obvious groups
-
Useful when there are very different groups of members within a population
-
The sample will be representative of the population structure
-
Members of every group (stratum) are guaranteed to be included in the sample
-
-
The members selected from each stratum are chosen randomly
-
This helps to avoid bias
-
-
This cannot be used
-
if the population cannot be split into groups
-
or if the groups overlap
-
-
-
Systematic sampling:
-
This is useful when you want a sample from a large population
-
You need access to a sampling frame (list of the population)
-
If the order of the sampling frame is random then the sample will also be random
-
-
This cannot be used if it is not possible to number or list all the members of the population
-
e.g. penguins in Antarctica
-
-
Be careful of periodic (i.e. regularly recurring) patterns in the sampling frame
-
e.g. a list of names where the names are grouped by 5-person teams with the team captain appearing first
-
If you selected every 5th name in the list you would end up with either all captains or no captains in your sample
-
-
-
Quota sampling:
-
This is useful when a small sample needs to be representative of the population structure
-
Useful when collecting data by asking people who walk past you in a public place or when a sampling frame is not available
-
Just keep asking people until the quota is filled for each group
-
-
This can introduce bias as some members of the population might choose not to be included in the sample
-
-
Cluster sampling:
-
This will usually require less time and be less expensive than simple random sampling or stratified sampling
-
e.g. if your clusters are schools, you will only need to collect data from the people in some of those schools
-
instead of having to collect data from a few people in every school in the country
-
-
However the clusters may not be representative of the population structure as a whole
-
This can make the sample biased
-
-
-
Opportunity (convenience) sampling:
-
This should be used when a sample is needed quickly
-
Useful when a list of the population is not available
-
But the sample is unlikely to be representative of the population structure
-
This can make the sample biased
-
-
-
Judgement sampling:
-
This can be used when a sample is needed quickly
-
The person choosing the sample should try to make it representative of the population
-
But intentionally or unintentionally the sample can end up being biased
-
Therefore this is rarely a preferred method
-
-
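The warning about periodic patterns in systematic sampling can be checked with a short sketch. The sampling frame here is invented: 6 teams of 5 people, with the captain listed first in each team.

```python
# Hypothetical sampling frame: 6 teams of 5, captain first in each team.
frame = []
for team in range(1, 7):
    frame.append(f"Team {team} captain")
    frame.extend(f"Team {team} player {p}" for p in range(1, 5))

def every_kth(frame, start, k):
    """Select every k-th name, beginning at index `start`."""
    return frame[start::k]

# Try each possible starting point with an interval of 5.
captain_counts = []
for start in range(5):
    sample = every_kth(frame, start, 5)
    captains = sum("captain" in name for name in sample)
    captain_counts.append(captains)
    print(f"start {start}: {captains} captains out of {len(sample)}")

print(captain_counts)  # [6, 0, 0, 0, 0]
```

Because the interval (5) matches the team size, the sample contains either only captains or no captains, depending on the starting point.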
Worked Example
Aaron, Belinda and Charlotte are writing an article about school uniforms for their school newsletter. They want to interview a sample of 30 students to find out their opinions about school uniforms.
(a) Write down the population for the survey, and suggest a possible sampling frame.
Be careful with the population here
They only want to interview students, so the population for their survey is only the students in the school
It does not include teachers or other staff members
The population is all the students in the school.
A sampling frame could be an alphabetical list of all the students in the school.
Aaron suggests that he could stand by the school gates in the morning and interview the first 30 students that come past him.
(b) Name this type of sampling and suggest a possible disadvantage.
Opportunity sampling
The sample could be biased. For example, Aaron could end up interviewing all people who have just arrived on the same bus, or groups of friends or siblings arriving at school together.
Belinda suggests that instead Aaron should interview students at the school gates until he has interviewed exactly 6 students from each of the school’s year groups (years 7 through 11).
(c) Name this type of sampling and suggest a reason why it would be an improvement over Aaron’s original plan.
Quota sampling
The sample would probably be more representative of all the students in the school, because it would be certain to include students from each year group.
In the end, Aaron, Belinda and Charlotte decide to use systematic sampling to select their sample.
(d) Given that there are 480 students in the school, suggest how they might go about choosing their sample.
They are going to need to select names from a list
But first we need to know what proportion of the students in the school they want to interview
Divide the number in their sample (30) by the total number of students (480): 30/480 = 1/16
So they want their sample to contain 1/16 of the students in the school
This means they need to choose every 16th name in the list (after the random starting point)
They will need a list of all the students in the school to use as a sampling frame.
They need to randomly select one student from the list as a starting point, then also select every 16th student from the list after that.
They may need to ‘wrap back around’ to the start of the list to get all 30 names for their sample.
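The arithmetic for part (d) can be checked with a short sketch. It assumes the sample size of 30 stated at the start of the question and the 480 students given in part (d). Students are represented only by their position (1 to 480) in a hypothetical list.

```python
import random
from fractions import Fraction

population_size = 480  # students in the school
sample_size = 30       # sample size stated in the question

# Proportion of the school to be sampled, and the matching interval.
proportion = Fraction(sample_size, population_size)
interval = population_size // sample_size
print(proportion, interval)  # 1/16 16

# Systematic selection: random starting point, then every 16th position,
# wrapping back around to the start of the list when the end is reached.
rng = random.Random(0)
start = rng.randrange(population_size)  # index 0..479
positions = [
    (start + i * interval) % population_size + 1  # list positions 1..480
    for i in range(sample_size)
]
print(sorted(positions))
```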
Random Samples
What do I need to know about random sampling?
-
In a simple random sample every member of the population has an equal probability of being selected for the sample
-
This means that the sample selection is fair and unbiased
-
Therefore the sample is likely to be representative of the population
-
-
To minimise bias this will usually be the best method
-
But it can also be expensive and time-consuming
-
And some groups in the population may end up not being represented in the sample
-
-
-
Some other sampling methods also include random selection
-
In stratified sampling, the members of the population chosen from each stratum (group) are chosen randomly
-
A ‘simple random sample’ is taken from each stratum
-
This leads to relatively unbiased samples that also reflect the population structure
-
-
In cluster sampling, the clusters to include in the sample are selected randomly
-
This can give good results if the clusters are representative of the population as a whole
-
-
In systematic sampling the ‘starting point’ member in the population list is chosen randomly
-
This is not considered a ‘random sample’ unless the ordering of the list is also random
-
-
How is a random sample selected?
-
To take a simple random sample you need to have access to a list of all members of the population (i.e. a sampling frame)
-
Every member in the sampling frame must be assigned a number
-
Usually this will mean starting at 1
-
and numbering the rest of the list in order: 2, 3, 4, etc.
-
-
-
To select a random sample of n members of the population, randomly select n different numbers from the numbered list using a random number generator (or other form of random selection)
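The numbering-and-selecting procedure can be sketched with Python's standard `random` module. The class list and the function name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n different members from a sampling frame at random."""
    rng = random.Random(seed)
    # Number every member of the sampling frame: 1, 2, 3, ...
    numbered = dict(enumerate(frame, start=1))
    # Randomly select n different numbers (no repeats).
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)
    # Look up the members that correspond to the chosen numbers.
    return [numbered[number] for number in chosen_numbers]

# Hypothetical sampling frame: a class list of 30 children.
frame = [f"Child {i}" for i in range(1, 31)]
sample = simple_random_sample(frame, n=5, seed=7)
print(sample)
```

`random.sample` selects without replacement, so no member can be chosen twice and every member has the same chance of being selected.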