Ch. 4 AP Statistics (Numerical Methods for Describing Data)

Sample Variance

Denoted by $s^2$; the sum of the squared deviations from the sample mean, divided by n-1. The quantity n-1 is the number of degrees of freedom.
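
A minimal sketch of the calculation described above. The function name and the data values are made up for illustration; the standard-library `statistics.variance` is used only as a cross-check.

```python
from statistics import variance

def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

sample = [2, 4, 4, 5, 10]          # hypothetical observations, mean = 5
s2 = sample_variance(sample)       # squared deviations 9 + 1 + 1 + 0 + 25 = 36, and 36 / 4 = 9.0
library_s2 = variance(sample)      # the standard library also divides by n - 1
```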

Comparative boxplots

A diagram that includes more than one boxplot using the same scale; allows the reader to find similarities and differences between data sets.

Median

The middle score in a distribution; half the scores are above it and half are below it. Easy to find when the sample size is odd, because there is a single middle value.

Mean

The sum of the observations divided by the number of observations. Usually used as a measure of center.

sample mean

Abbreviated with a lowercase x with a horizontal line over the top ($\bar{x}$, called 'x-bar'). The mean of a sample of data, often used to estimate the true mean of the entire population. It can be distorted if there is a large outlier in the sample.

population mean

The true mean of the entire population, often estimated using the sample mean. Abbreviated with the lowercase Greek letter mu ($\mu$).

Sample median

the middle value in an ordered list of sample observations (if n is even, the median is the average of the two middle values); the median is INSENSITIVE TO OUTLIERS
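
A small sketch of the odd/even rule, with a comparison to the mean when an outlier is present. The function name and the data are hypothetical.

```python
def sample_median(data):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = sample_median([3, 1, 7])        # ordered: 1, 3, 7
even_median = sample_median([3, 1, 7, 5])    # ordered: 1, 3, 5, 7

# One large outlier pulls the mean far more than the median.
with_outlier = [1, 3, 5, 7, 1000]
outlier_median = sample_median(with_outlier)
outlier_mean = sum(with_outlier) / len(with_outlier)
```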

trimmed mean

A measure of center in which the observations are first ordered from smallest to largest, one or more observations are deleted from each end, and the remaining ones are averaged. In terms of sensitivity to outliers, it is a compromise between the mean and the median.

Trimmed percentage

The percentage of values deleted from each end of the ordered list. Calculated as (number deleted from each end / sample size) x 100.
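
The trimmed mean and its trimming percentage can be sketched as follows. The function names, the parameter `k` (number deleted from each end), and the data are assumptions for illustration.

```python
def trimmed_mean(data, k):
    """Order the data, delete k observations from each end, average the rest."""
    ordered = sorted(data)
    if 2 * k >= len(ordered):
        raise ValueError("k is too large: nothing would be left to average")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def trimming_percentage(k, n):
    """Number deleted from each end, as a percentage of the sample size."""
    return 100 * k / n

values = [1, 3, 5, 7, 9, 11, 13, 15, 17, 100]    # hypothetical data with one outlier
ordinary_mean = sum(values) / len(values)
one_trimmed = trimmed_mean(values, 1)             # drops 1 and 100
percent_trimmed = trimming_percentage(1, len(values))
```

Here the ordinary mean is pulled up by the value 100, while the trimmed mean is not.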

sample proportion of successes

The number of responses designated as successes divided by the total number of responses in the sample; denoted by P.

Range

the difference between the greatest and least numbers in a set of data

Deviations from the sample mean

The differences between each sample observation and the sample mean, $x - \bar{x}$. A deviation is positive if x is greater than $\bar{x}$ and negative if x is less than $\bar{x}$.

Average deviation

The average of the differences between each individual value in the data set and the mean. Not useful as a measure of spread because the deviations always sum to zero.
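
A quick check that deviations from the mean cancel out, using made-up data. With other data, floating-point rounding may leave a sum that is extremely close to zero rather than exactly zero.

```python
sample = [2, 4, 4, 5, 10]                       # hypothetical observations
x_bar = sum(sample) / len(sample)               # sample mean
deviations = [x - x_bar for x in sample]        # positive above the mean, negative below
total_deviation = sum(deviations)               # positives and negatives cancel
average_deviation = total_deviation / len(sample)
```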

Sample standard deviation

The positive square root of the sample variance. Denoted by s.

Population Variance

The mean of the squares of the deviations from the population mean: $\sigma^2 = \sum (x - \mu)^2 / N$. Denoted by sigma squared ($\sigma^2$).

Population standard deviation

The square root of the population variance: $\sigma = \sqrt{\sigma^2} = \sqrt{\sum (x - \mu)^2 / N}$. Denoted by sigma ($\sigma$).
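
A sketch of the population formulas, dividing by N rather than n-1. The function names and the data are hypothetical, and the list is treated as the entire population.

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the population mean (divide by N)."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

def population_sd(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

population = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical population, mu = 5
sigma_squared = population_variance(population)   # squared deviations sum to 32, and 32 / 8 = 4.0
sigma = population_sd(population)
```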

Degrees of freedom

The number of individual scores that are free to vary once the sample mean is fixed. Usually written as 'n-1', where n represents the number of observations in the sample.

Inter-quartile Range

Abbreviated IQR. The difference between the scores (or estimated scores) at the 75th percentile and the 25th percentile. Often used instead of the range because it is not affected by extreme scores.

Lower Quartile

The 25th percentile. The median of the lower half of the ordered data values.

Upper Quartile

The 75th percentile. The median of the upper half of the ordered data values.
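
A sketch of the "median of each half" approach to the quartiles and the interquartile range. One assumption is made here: when n is odd, the overall median is left out of both halves. Other conventions exist and can give slightly different quartiles. The function name and the data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (lower quartile, upper quartile) as medians of the two halves.

    Assumption: for odd n, the middle value is excluded from both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[n - half:]
    return median(lower_half), median(upper_half)

q1_odd, q3_odd = quartiles([1, 3, 5, 7, 9, 11, 13])         # halves: [1, 3, 5] and [9, 11, 13]
iqr_odd = q3_odd - q1_odd

q1_even, q3_even = quartiles([1, 2, 3, 4, 5, 6, 7, 8])      # halves: [1, 2, 3, 4] and [5, 6, 7, 8]
iqr_even = q3_even - q1_even
```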

Population inter-quartile range

The difference between the upper and lower population quartiles. For a normal distribution, the standard deviation and the IQR are related approximately by sd = IQR/1.35.
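
The factor 1.35 can be checked for a normal distribution with the standard library's `statistics.NormalDist` (Python 3.8 or later). This is only a numerical check under the assumption of a standard normal population, not a derivation.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)          # population with sd = 1
lower_quartile = standard_normal.inv_cdf(0.25)       # 25th percentile
upper_quartile = standard_normal.inv_cdf(0.75)       # 75th percentile
iqr_in_sd_units = upper_quartile - lower_quartile    # close to 1.35, so sd is about IQR / 1.35
```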

Boxplot

Shows the spread of data based on the five-number summary. The box spans the quartiles and shows the central half of the distribution, the median is marked inside the box, and lines (whiskers) extend to the extremes.

modified boxplot

A boxplot that represents mild outliers by shaded circles and extreme outliers by open circles, and the whiskers extend on each end to the most extreme observations that are not outliers
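
A sketch of how the points in a modified boxplot could be sorted out. The cutoffs are an assumption not stated in the definition above: the common textbook rule that an observation is an outlier if it lies more than 1.5 IQR beyond the nearest quartile, and extreme if it lies more than 3 IQR beyond it. The function name and the data are hypothetical, and the quartiles are passed in as already known.

```python
def modified_boxplot_parts(data, q1, q3):
    """Return (mild outliers, extreme outliers, whisker ends).

    Assumed rule: mild if more than 1.5 * IQR beyond the nearest quartile,
    extreme if more than 3 * IQR beyond it.
    """
    iqr = q3 - q1
    mild, extreme, regular = [], [], []
    for x in sorted(data):
        beyond = max(q1 - x, x - q3)      # distance outside the box (zero or negative inside it)
        if beyond > 3 * iqr:
            extreme.append(x)
        elif beyond > 1.5 * iqr:
            mild.append(x)
        else:
            regular.append(x)
    whiskers = (regular[0], regular[-1])  # most extreme observations that are not outliers
    return mild, extreme, whiskers

data = [1, 8, 9, 10, 11, 11, 12, 12, 13, 13, 14, 16, 21, 30]   # hypothetical, Q1 = 10, Q3 = 14
mild, extreme, whiskers = modified_boxplot_parts(data, q1=10, q3=14)
```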

five number summary

The minimum, 1st quartile, median, 3rd quartile, and maximum; these are the values displayed in a boxplot.

Chebyshev's Rule

For any number $k \ge 1$, at least $100(1 - 1/k^2)\%$ of the observations in any data set are within k standard deviations of the mean; the percentage value is typically conservative in that the actual percentages often considerably exceed the stated lower bound.
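
A sketch comparing the guaranteed percentage with the actual percentage in a small made-up data set. The function names and the data are hypothetical, and the sample standard deviation is used for the spread.

```python
from statistics import mean, stdev

def chebyshev_lower_bound(k):
    """Guaranteed minimum percent of observations within k standard deviations."""
    if k < 1:
        raise ValueError("the rule is stated for k >= 1")
    return 100 * (1 - 1 / k ** 2)

def percent_within(data, k):
    """Actual percent of observations within k standard deviations of the mean."""
    center = mean(data)
    spread = stdev(data)
    inside = [x for x in data if abs(x - center) <= k * spread]
    return 100 * len(inside) / len(data)

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations
bound = chebyshev_lower_bound(2)           # 100 * (1 - 1/4) = 75.0
actual = percent_within(scores, 2)         # every score is within 2 sd here
```

In this example the actual percentage is well above the guaranteed 75%, which is the sense in which the rule is conservative.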

Empirical Rule

Gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve.
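
The three percentages can be reproduced from the normal curve itself. The sketch below uses the error function from the standard `math` module; the function name is made up for illustration.

```python
import math

def normal_percent_within(k):
    """Percent of a normal distribution within k standard deviations of its mean."""
    return 100 * math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints values close to 68, 95, and 99.7
    print(k, round(normal_percent_within(k), 1))
```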

z score

Tells how many standard deviations a value is from the mean. Calculated as (value - mean) / standard deviation. The z scores of a data set have a mean of zero and a standard deviation of one. Also called a standardized score.
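
A sketch of the calculation on made-up data, checking that the resulting z scores have mean zero and standard deviation one. The names are hypothetical, and the population standard deviation is used here as an assumption; with the sample standard deviation, the same check works if `stdev` is used throughout.

```python
from statistics import mean, pstdev

def z_score(value, center, spread):
    """How many standard deviations the value lies from the mean."""
    return (value - center) / spread

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations
center = mean(scores)                      # 5
spread = pstdev(scores)                    # 2.0
z_scores = [z_score(x, center, spread) for x in scores]

z_mean = mean(z_scores)                    # should be 0
z_spread = pstdev(z_scores)                # should be 1
```

For example, the score 9 is two standard deviations above the mean, so its z score is 2.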

standardized score

A derived score that uses standard deviation units to indicate an individual's performance relative to the norming group's performance. Calculated as a z score: (score - mean) / standard deviation.

rth percentile

a value such that r percent of the observations in the data set fall at or below that value
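
Reading the definition the other way around, the sketch below finds the percent of observations at or below a given value, which tells which percentile that value corresponds to. The function name and the scores are hypothetical.

```python
def percent_at_or_below(data, value):
    """Percent of the observations that fall at or below the given value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # hypothetical test scores
r = percent_at_or_below(scores, 80)                   # 6 of the 10 scores are at or below 80
```

Under this definition, a score of 80 is the 60th percentile of this data set.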