Statistics Chapter 3

Mean

1. It is the sum of the values, divided by the total number of values
2. Varies less than the median or the mode when all three measures are computed for samples taken from the same population
3. Used in computing variance
4. It is unique
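Example (a minimal sketch; the sample values are hypothetical and chosen only for illustration). It computes the mean as the sum of the values divided by the number of values and checks the result against Python's standard library:

```python
from statistics import mean

# Hypothetical sample values, not taken from the chapter.
values = [20, 26, 40, 36, 23, 42, 35, 24, 30]

# Point 1: sum of the values divided by the total number of values.
manual_mean = sum(values) / len(values)

# The standard library computes the same quantity.
library_mean = mean(values)

print(manual_mean, library_mean)
```

Here the mean (about 30.67) is a single number for the data set, and it does not have to be one of the data values.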

Median

1. It is the midpoint of the data array. The symbol for the median is MD
2. Used to find the center or middle value of a data set
3. Used when it is necessary to find out whether the data values fall into the upper half or the lower half of the distribution
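Example (a sketch with hypothetical data; median_of is an illustrative helper name). The data are first arranged in order (the data array), then the middle value is taken. With an even number of values, the two middle values are averaged:

```python
def median_of(values):
    """Return the midpoint (MD) of the sorted data array."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [7, 3, 9, 1, 5]
even_sample = [8, 2, 6, 4]

md_odd = median_of(odd_sample)
md_even = median_of(even_sample)

# Point 3: the median splits the values into an upper and a lower half.
upper_half = [x for x in odd_sample if x > md_odd]
lower_half = [x for x in odd_sample if x < md_odd]

print(md_odd, md_even, upper_half, lower_half)
```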

Mode

1. The value that occurs most often in a data set
2. Used when the most typical case is desired
3. Can be used when the data are nominal or categorical, such as religious preference, gender or political affiliation
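Example (a sketch with made-up categorical data). Because the mode only counts occurrences, it works on nominal data where a mean or median would be meaningless. statistics.multimode (Python 3.8+) also handles data sets with more than one mode:

```python
from collections import Counter
from statistics import multimode

# Hypothetical survey responses (nominal data).
affiliations = [
    "Independent", "Democrat", "Republican",
    "Democrat", "Independent", "Democrat",
]

# Frequency of each category.
counts = Counter(affiliations)

# The value(s) that occur most often.
modes = multimode(affiliations)

# A numeric data set with two modes (bimodal).
bimodal_modes = multimode([1, 1, 2, 2, 3])

print(counts, modes, bimodal_modes)
```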

Midrange

1. It is defined as the sum of the lowest and highest values in the data set divided by 2. The symbol MR is used for the midrange. MR = (lowest value + highest value) / 2
2. It gives a rough estimate of the middle of the data set
3. It is affected by extremely high or low values in a data set
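Example (a sketch; the data are hypothetical). It applies MR = (lowest value + highest value) / 2 and shows how a single extreme value moves the midrange:

```python
def midrange(values):
    """MR = (lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2

typical = [2, 3, 6, 8, 4, 1]
with_extreme = typical + [100]  # one extremely high value added

mr_typical = midrange(typical)       # (1 + 8) / 2
mr_extreme = midrange(with_extreme)  # (1 + 100) / 2

print(mr_typical, mr_extreme)
```

The midrange jumps from 4.5 to 50.5 even though only one value was added.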

Statistic

Characteristic or measure obtained by using the data values from a sample

Parameter

Characteristic or measure obtained by using all the data values from a specific population

In a positively skewed (right-skewed) distribution, where are the mode, the median and the mean?

Mean is to the right of the median and the mode is to the left of the median

In a symmetric distribution, where are the mode, the median and the mean?

When the distribution is unimodal, the mean, median and mode are the same and are at the center of the distribution

In a negatively skewed (left-skewed) distribution, where are the mode, the median and the mean?

Mean is to the left of the median, and the mode is to the right of the median
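Example (a sketch using small made-up data sets). The first list has a long tail to the right, and the second is its mirror image with a long tail to the left. The orderings match the skewed-distribution cards above. This is the typical pattern, not a guarantee for every data set:

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: a few large values stretch the right tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror image of the same data, giving a left-skewed shape.
left_skewed = [15 - x for x in right_skewed]

right_stats = (mode(right_skewed), median(right_skewed), mean(right_skewed))
left_stats = (mean(left_skewed), median(left_skewed), mode(left_skewed))

print(right_stats)  # mode, median, mean in increasing order
print(left_stats)   # mean, median, mode in increasing order
```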

Range

It is the highest value minus the lowest value. The symbol R is used for the range
R = highest value - lowest value

Coefficient of variation

Denoted by CVar, it is the standard deviation divided by the mean, with the result expressed as a percentage. CVar = (s / Xbar) x 100%
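Example (a sketch; the summary statistics are hypothetical). Because CVar is a percentage, it can compare the variation of two variables measured in different units:

```python
def coefficient_of_variation(s, xbar):
    """CVar = (s / xbar) * 100, expressed as a percentage."""
    return s / xbar * 100

# Hypothetical summary statistics for two variables in different units.
cvar_sales = coefficient_of_variation(s=5, xbar=87)              # number of items sold
cvar_commissions = coefficient_of_variation(s=773, xbar=5225)    # dollars

print(round(cvar_sales, 1), round(cvar_commissions, 1))
```

With these numbers the second variable is more variable relative to its mean (about 14.8% against about 5.7%).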

What is the Range Rule of Thumb?

A rough estimate of the standard deviation is
s ≈ range / 4
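Example (a sketch with hypothetical data). It computes the range R, applies the rule of thumb, and compares the estimate with the sample standard deviation from the standard library:

```python
from statistics import stdev

# Hypothetical sample.
data = [10, 60, 50, 30, 40, 20]

# R = highest value - lowest value.
data_range = max(data) - min(data)

# Range rule of thumb: s is roughly range / 4.
estimated_s = data_range / 4

# Sample standard deviation, for comparison.
actual_s = stdev(data)

print(data_range, estimated_s, round(actual_s, 2))
```

For this small sample the estimate (12.5) is noticeably lower than the computed value (about 18.71), which shows that the rule gives only a rough approximation.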

What does Chebyshev's Theorem say?

The proportion of values from a data set that will fall within k standard deviations of the mean will be at least 1 - 1/k^2, where k is a number greater than 1 (k is not necessarily an integer)
In summary, Chebyshev's theorem states:
1. At least 3/4 (75%) of all data values will fall within 2 standard deviations of the mean (k = 2)
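Example (a sketch; the function names and the sample are hypothetical). The first function evaluates the bound 1 - 1/k^2, and the second measures the proportion of a data set that actually lies within k standard deviations of its mean, so the two can be compared:

```python
from statistics import mean, stdev

def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

def proportion_within(values, k):
    """Observed proportion of values within k sample standard deviations."""
    center = mean(values)
    spread = stdev(values)
    inside = [x for x in values if abs(x - center) <= k * spread]
    return len(inside) / len(values)

# Hypothetical skewed sample; the theorem does not require a bell shape.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

bound_k2 = chebyshev_lower_bound(2)    # 1 - 1/4
bound_k3 = chebyshev_lower_bound(3)    # 1 - 1/9
bound_k15 = chebyshev_lower_bound(1.5) # k need not be an integer
observed_k2 = proportion_within(sample, 2)

print(bound_k2, bound_k3, bound_k15, observed_k2)
```

For this sample, 90% of the values lie within 2 standard deviations, which is consistent with the guaranteed minimum of 75%.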

The empirical rule

Chebyshev's theorem applies to any distribution regardless of its shape. However, when a distribution is bell-shaped (or what is called normal), the following statements, which make up the empirical rule, are true:
1. Approximately 68% of the data values will fall within 1 standard deviation of the mean
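Example (a sketch). For an exactly normal distribution, the proportion within k standard deviations of the mean equals erf(k / sqrt(2)). The 95% and 99.7% figures for k = 2 and k = 3 are the usual statement of the empirical rule and are included here as an assumption, since only the 68% figure appears in these notes:

```python
import math

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

within_1 = normal_proportion_within(1)  # about 68%
within_2 = normal_proportion_within(2)  # about 95%
within_3 = normal_proportion_within(3)  # about 99.7%

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

These values are much higher than the Chebyshev minimums for the same k, because the empirical rule assumes a bell-shaped distribution while Chebyshev's theorem assumes nothing about shape.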

Interquartile range (IQR)

The difference between the third and first quartiles
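Example (a sketch with hypothetical data). Quartile conventions differ between textbooks and software. This version assumes the "median of each half" method: Q1 is the median of the values below the overall median and Q3 is the median of the values above it:

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # When n is odd, the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [5, 6, 12, 13, 15, 18, 22, 50]
q1, q2, q3 = quartiles(data)

# IQR = Q3 - Q1
iqr = q3 - q1

print(q1, q2, q3, iqr)
```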

Outlier

An extremely high or an extremely low data value when compared with the rest of the data values
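Example (a sketch). The definition above does not say how extreme a value must be. A common convention, assumed here and not stated in these notes, flags values more than 1.5 x IQR below Q1 or above Q3. The quartiles come from statistics.quantiles (Python 3.8+) with method="inclusive", which may differ slightly from a textbook's hand method:

```python
from statistics import quantiles

# Hypothetical data with one suspiciously high value.
data = [10, 12, 13, 15, 16, 18, 19, 21, 60]

# Quartiles from the standard library (inclusive method).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr, lower_fence, upper_fence, outliers)
```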

What is the z score?

It tells how many standard deviations a data value is above or below the mean. If z is positive, the score is above the mean. If z is negative, the score is below the mean. If z = 0, the score is the same as the mean
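Example (a sketch; the scores, means and standard deviations are hypothetical). For a sample, the z score is the data value minus the mean, divided by the standard deviation: z = (X - Xbar) / s.

```python
def z_score(x, xbar, s):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - xbar) / s

z_above = z_score(65, xbar=50, s=10)  # above the mean
z_below = z_score(40, xbar=50, s=10)  # below the mean
z_equal = z_score(50, xbar=50, s=10)  # equal to the mean

print(z_above, z_below, z_equal)
```

Because z scores are unitless, they allow scores from different tests to be compared: a 65 on a test with mean 50 and s = 10 (z = 1.5) is relatively higher than a 30 on a test with mean 25 and s = 5 (z = 1.0).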