Statistics GCSE Edexcel Foundation

Exam code: 1ST0

Outliers

What are outliers?

  • Outliers are extreme data values that do not fit with the general pattern of the data

  • Outliers in a data set can be due to

    • genuine extreme events

      • these are valid data, even if unusual

    • mistakes in the data collection

      • these should be identified and removed if possible

  • Outliers will affect some statistics that are calculated from the data

    • They can have a big effect on the mean

      • but usually little or no effect on the median

      • and usually no effect on the mode

    • The range can be changed dramatically by a single outlier

      • but the interquartile range is usually affected very little, if at all

  • When calculating the mean or the range it is important to decide whether any outlier(s) should be included in the calculations

    • An exam question will tell you whether to include outliers or not

      • But you may have to decide which value(s) are outliers

      • Look for values that are much bigger or smaller than the rest of the data set

  • In general outliers are

    • included if they are a valid piece of data

    • excluded if it is likely that they are erroneous
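The effect of a single outlier on each statistic can be checked with a short calculation. The sketch below uses Python's standard `statistics` module; the data values and the helper name `summarise` are invented for illustration and do not come from an exam question.

```python
from statistics import mean, median, mode


def summarise(values):
    """Return the mean, median, mode and range of a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "range": max(values) - min(values),
    }


# Hypothetical data, made up for this illustration
typical = [4, 5, 5, 6, 7, 8, 9]
with_outlier = typical + [40]  # 40 is much bigger than the rest of the data

print(summarise(typical))
print(summarise(with_outlier))
```

With these made-up values the mean rises from about 6.3 to 10.5 and the range from 5 to 36, while the median only moves from 6 to 6.5 and the mode stays at 5.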

Worked Example

The following data shows the ages of a group of students at the time they sat their GCSE Maths exam.

3 13 15 15 15 15 16 16 16 16 16 57

(a) Suggest possible outliers in the data set.

Most students sit their GCSEs when they are 15 or 16.
Some students sit them a bit younger, so the ‘13’ is not very unusual.
However, the ‘3’ and the ‘57’ are definitely extreme data values compared to the rest of the set!

3 and 57 should probably be considered to be outliers.

(b) For each outlier identified in part (a), suggest with a reason whether the data value should be kept in or excluded from the data set.

It is essentially impossible that a 3-year-old would be sitting a GCSE exam, so that data value is surely a mistake.

On the other hand, older people do sometimes sit GCSE exams, so the ‘57’ shouldn’t be excluded from the data set without further information.

The ‘3’ should be excluded. There is no way a 3-year-old would be sitting a GCSE exam, so that is almost certainly an error in the data collection.

The ‘57’ should be kept. It is unusual for older people to sit GCSEs, but it is not impossible, so it may be a valid data value.
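To see how much this decision matters, the statistics for the ages above can be compared with and without the ‘3’. This is only a sketch using Python's standard `statistics` module; the variable names are chosen for illustration, and it assumes that the ‘3’ is the only value treated as an error.

```python
from statistics import mean, median

# Ages from the worked example
ages = [3, 13, 15, 15, 15, 15, 16, 16, 16, 16, 16, 57]

# Assumption for this sketch: only the 3 is removed as a likely error,
# and the 57 is kept as a valid (if unusual) value
cleaned = [age for age in ages if age != 3]

for label, values in (("all values", ages), ("without the 3", cleaned)):
    print(
        label,
        "| mean:", round(mean(values), 2),
        "| median:", median(values),
        "| range:", max(values) - min(values),
    )
```

Removing the ‘3’ changes the mean from 17.75 to about 19.09 and the range from 54 to 44, while the median only moves from 15.5 to 16.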
