Chapter 3 Statistics

Sample Standard Deviation

preferred measure of variation when the mean is used as the measure of center
takes into account all the observations
first step is to find the deviations from the mean

Measures of center

value at the center or middle of the data set

mean

adding all of the values and dividing the total by the number of values

mode

most frequently occurring value
no modes if no repeating numbers

n denotes

number of data values in a sample

Median

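middle value when the data values are arranged in order; with an even number of values, the mean of the two middle values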

x bar

mean of a set of sample values

Range

difference between maximum and minimum

most frequently used measures of variation

range and sample standard deviation

standard deviation formula

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Sample Variance formula

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

relationship between variance and standard deviation

the more variation that there is in a data set, the larger is its standard deviation

Definition of descriptive measures

numbers that are used to describe data sets

Procedures in computing a sample standard deviation

1. Calculate sample mean (x bar)
2. Find the deviation from the mean of each observation: xi - x bar
3. Sum of squared deviations
4. Sample variance
5. Sample standard deviation
(just take the sample variance and then take the square root of it)
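
The five steps above can be sketched in Python. This is only an illustration under simple assumptions: the function name `sample_std` and the data list are made up for the example.

```python
import math

def sample_std(values):
    """Sample standard deviation, following the five steps in order."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n                          # step 1: sample mean
    deviations = [x - mean for x in values]         # step 2: deviations from the mean
    sum_sq = sum(d * d for d in deviations)         # step 3: sum of squared deviations
    variance = sum_sq / (n - 1)                     # step 4: sample variance
    return math.sqrt(variance)                      # step 5: sample standard deviation

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
print(sample_std(data))
```

For this sample the mean is 5 and the sum of squared deviations is 32, so the variance is 32/7 and the standard deviation is its square root.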

Two-Standard-Deviations Rule

almost all the observations in any data set lie within two standard deviations to either side of the mean

five-number summary

minimum, Q1, median (Q2), Q3, and maximum of a data set

bimodal

two data values with the same greatest frequency

multimodal

more than two data values occur with the same greatest frequency

no mode

no data is repeated
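
The mode cases (one mode, bimodal, multimodal, no mode) can be told apart by counting frequencies. A minimal sketch, assuming a hypothetical helper named `modes` that returns an empty list when no value repeats:

```python
from collections import Counter

def modes(values):
    """Return all values tied for the greatest frequency, or [] if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # no data value is repeated: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3, 3]))   # two values tied: bimodal
print(modes([1, 2, 3]))         # nothing repeats: no mode
```

A result of length 2 corresponds to bimodal data, and a longer result to multimodal data.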

midrange

value midway between the maximum and minimum values
$\mathrm{MR} = \frac{\max + \min}{2}$

round off rule for measures of center

carry one more decimal place than is present in original set of values

mean from frequency distribution equation

$\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$
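
As a rough illustration of this formula, the sketch below takes x to be the class midpoints and f the class frequencies. The function name and the numbers are invented for the example.

```python
def mean_from_frequencies(midpoints, freqs):
    """Approximate mean from a frequency distribution: sum(f * x) / sum(f)."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("frequencies must not sum to zero")
    return sum(f * x for f, x in zip(freqs, midpoints)) / total

# Hypothetical classes with midpoints 5, 15, 25 and frequencies 2, 5, 3
print(mean_from_frequencies([5, 15, 25], [2, 5, 3]))   # 160 / 10 = 16.0
```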

Weighted mean

mean computed when the data values are assigned different weights, w

weighted mean equation

$\bar{x} = \frac{\sum (w \cdot x)}{\sum w}$
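
A short sketch of the weighted mean, assuming the values and weights come as two parallel lists. The scores and weights below are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical scores weighted 3, 2 and 1
print(weighted_mean([90, 80, 70], [3, 2, 1]))   # 500 / 6
```

With equal weights this reduces to the ordinary mean.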

round off rule

carry one more decimal place than is present (only for final answer)

Standard deviation

for a set of sample values, denoted by "s", a measure of how much the data values deviate from the mean

shortcut formula SD

$s = \sqrt{\frac{n(\sum x^2) - (\sum x)^2}{n(n - 1)}}$
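
The shortcut formula needs only n, the sum of the values, and the sum of their squares. A minimal sketch (function name and data are assumptions for illustration):

```python
import math

def sample_std_shortcut(values):
    """Sample standard deviation from n, sum(x) and sum(x^2)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
print(sample_std_shortcut(data))
```

Here n = 8, the sum of x is 40 and the sum of x squared is 232, giving (1856 - 1600) / 56 = 32/7 under the square root, the same variance the deviation-based formula produces.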

range rule of thumb

the majority (about 95%) of sample values lie within two standard deviations of the mean

minimum (usual) value

$\bar{x} - 2s$ (mean minus two standard deviations)

maximum (usual) value

$\bar{x} + 2s$ (mean plus two standard deviations)

estimation range rule of thumb

$s \approx \frac{\text{range}}{4}$
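
A small sketch of the range rule of thumb. It assumes the usual bounds are taken as the mean plus or minus two standard deviations, which is how the rule is commonly stated; the function names are hypothetical.

```python
def estimate_std_from_range(values):
    """Rough estimate of s: range divided by 4."""
    return (max(values) - min(values)) / 4

def usual_bounds(mean, std):
    """Minimum and maximum usual values: mean - 2*std and mean + 2*std."""
    return mean - 2 * std, mean + 2 * std

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical sample
print(estimate_std_from_range(data))     # (9 - 2) / 4 = 1.75
print(usual_bounds(100, 15))             # (70, 130)
```

The estimate is crude: for this sample the exact sample standard deviation is about 2.14, not 1.75.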

population standard deviation equation

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
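
For a population, the squared deviations from the population mean are divided by N rather than n - 1. A minimal sketch with made-up data:

```python
import math

def population_std(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    N = len(values)
    if N == 0:
        raise ValueError("population must not be empty")
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)

population = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population
print(population_std(population))       # sqrt(32 / 8) = 2.0
```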

variance

for a set of values, a measure of variation equal to the square of the standard deviation

s

sample standard deviation

s^2

sample variance

$\sigma$

Population standard deviation

$\sigma^2$

Population variance

Empirical rule

for data with an approximately bell-shaped distribution:
-about 68% within 1 SD of the mean
-about 95% within 2 SD of the mean
-about 99.7% within 3 SD of the mean

Chebyshev's Theorem

Proportion of any data set lying within k SD of the mean is always at least $1 - \frac{1}{k^2}$, where k is any number greater than 1
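
The Chebyshev bound 1 - 1/k^2 is easy to tabulate. It is only informative for k greater than 1, so this sketch (with a hypothetical function name) rejects smaller values.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

print(chebyshev_lower_bound(2))   # 0.75, i.e. at least 75%
print(chebyshev_lower_bound(3))   # 8/9, i.e. at least about 89%
```

Unlike the empirical rule, these bounds hold for any data set, which is why they are lower (75% rather than about 95% within 2 SD).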

Coefficient of Variation (CV)

for a set of nonnegative sample or population data, the standard deviation expressed as a percentage of the mean
sample: $CV = \frac{s}{\bar{x}} \cdot 100\%$
population: $CV = \frac{\sigma}{\mu} \cdot 100\%$
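
A short sketch of the coefficient of variation, the standard deviation divided by the mean and multiplied by 100. The function name and the input values are assumptions for the example; the same function serves the sample and the population case.

```python
def coefficient_of_variation(std, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be nonzero")
    return std / mean * 100

# Hypothetical data set with standard deviation 2 and mean 5
print(coefficient_of_variation(2.0, 5.0))   # about 40 (percent)
```

Because the result has no units, it allows variation to be compared across data sets measured on different scales.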

z-score

standardized value; the number of standard deviations that a given value x is above or below the mean

z-score formulas

sample: $z = \frac{x - \bar{x}}{s}$
population: $z = \frac{x - \mu}{\sigma}$
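
A minimal sketch of the z-score: subtract the mean from x and divide by the standard deviation. The same function works for sample or population values; the numbers are hypothetical.

```python
def z_score(x, mean, std):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std == 0:
        raise ValueError("standard deviation must be nonzero")
    return (x - mean) / std

print(z_score(130, 100, 15))   # 2.0: two standard deviations above the mean
print(z_score(85, 100, 15))    # -1.0: one standard deviation below the mean
```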

interpreting z-scores

unusual: z < -2
ordinary: -2 <= z <= 2
unusual: z > 2

percentiles

measures of location, denoted P1, P2, ..., P99, which divide a set of data into 100 groups with about 1% of the values in each group

finding percentile

$\text{percentile of value } x = \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$
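
The percentile of a value is the count of values below x, divided by the total count, times 100. A sketch with an invented data set:

```python
def percentile_of_value(values, x):
    """Percentage of data values that are strictly less than x."""
    if not values:
        raise ValueError("data set must not be empty")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

data = list(range(1, 11))             # hypothetical data: 1 through 10
print(percentile_of_value(data, 7))   # 6 of 10 values are below 7 -> 60.0
```

In practice the result is usually rounded to the nearest whole number.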

converting from the kth percentile to the corresponding data value

$L = \frac{k}{100} \cdot n$

percentile notations

n= total # of values in a data set
k= percentile being used
L= locator that gives the position of a value in the sorted data
Pk= kth percentile
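
The locator L = (k/100) * n gives a position in the sorted data. The notes do not say what to do when L is or is not a whole number, so the sketch below assumes a convention used in many introductory texts: if L is a whole number, take the value midway between the Lth value and the next one; otherwise round L up and take the value at that position. Other textbooks and software use different conventions.

```python
def kth_percentile(values, k):
    """Value of the kth percentile (integer k, 1..99) under the assumed convention."""
    if not values:
        raise ValueError("data set must not be empty")
    if not 1 <= k <= 99:
        raise ValueError("k must be an integer from 1 to 99")
    data = sorted(values)
    n = len(data)
    whole, remainder = divmod(k * n, 100)   # L = k*n/100, kept in integer arithmetic
    if remainder == 0:
        # L is a whole number: average the Lth and (L+1)th values (1-based positions)
        return (data[whole - 1] + data[whole]) / 2
    # L is not whole: round up to position whole + 1 (1-based), index `whole` (0-based)
    return data[whole]

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data, n = 8
print(kth_percentile(data, 25))     # L = 2   -> (4 + 4) / 2 = 4.0
print(kth_percentile(data, 50))     # L = 4   -> (4 + 5) / 2 = 4.5
print(kth_percentile(data, 90))     # L = 7.2 -> 8th value = 9
```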

quartiles

measures of location, denoted Q1, Q2, Q3, which divide a set of data into four groups with about 25% of the values in each group

Q1

first quartile, separates the bottom 25% of sorted values from the top 75%

Q2

second quartile, same as the median; separates the bottom 50% of sorted values from the top 50%

Q3

third quartile, separates bottom 75% from top 25%

interquartile range (IQR)

Q3-Q1

semi-interquartile range

$\frac{Q_3 - Q_1}{2}$

midquartile

$\frac{Q_3 + Q_1}{2}$
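
Given Q1 and Q3, the three quartile-based statistics follow directly: the interquartile range is Q3 - Q1, the semi-interquartile range is half of that, and the midquartile is the average of Q3 and Q1. A small sketch with a hypothetical function name and made-up quartiles:

```python
def quartile_statistics(q1, q3):
    """Interquartile range, semi-interquartile range and midquartile from Q1 and Q3."""
    iqr = q3 - q1
    return {
        "iqr": iqr,                     # Q3 - Q1
        "semi_iqr": iqr / 2,            # (Q3 - Q1) / 2
        "midquartile": (q3 + q1) / 2,   # (Q3 + Q1) / 2
    }

print(quartile_statistics(4.0, 6.0))
# {'iqr': 2.0, 'semi_iqr': 1.0, 'midquartile': 5.0}
```

Note the parentheses: the difference or sum is formed first, then divided by 2.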

outliers

a value that lies very far away from the majority of the other values