Test Review

Upper Fence Formula

UF = Q3+1.5(IQR)

Lower Fence Formula

LF = Q1-1.5(IQR)
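
The two fence formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the data set, and the quartiles Q1 = 4 and Q3 = 8 are assumptions chosen for the example, and the quartiles are taken as given because calculators and software use slightly different quartile conventions.

```python
def fences(q1, q3):
    """Return (lower_fence, upper_fence) computed from the two quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(data, q1, q3):
    """Return the values that fall outside the fences."""
    lower_fence, upper_fence = fences(q1, q3)
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical data; Q1 = 4 and Q3 = 8 are assumed (medians of the two halves).
data = [1, 4, 4, 5, 6, 7, 8, 8, 20]
print(fences(4, 8))          # (-2.0, 14.0)
print(outliers(data, 4, 8))  # [20]
```

Any value below the lower fence or above the upper fence is flagged as a possible outlier.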

Relative Frequency

The counts turned into percentages
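
As a rough sketch of turning counts into percentages, the snippet below tallies a small made-up list of eye colors. The function name and the data are hypothetical.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each category to its percentage of the total count."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}


# Hypothetical responses to "What is your eye color?"
eye_colors = ["brown", "brown", "blue", "green", "brown", "blue", "brown", "blue"]
print(relative_frequencies(eye_colors))
# {'brown': 50.0, 'blue': 37.5, 'green': 12.5}
```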

What will be the shape of the histogram when the mean is bigger than the median?

right skew

What will be the shape of the histogram when the mean is smaller than the median?

left skew

Z-Score Equation

z = (x - x̄)/s, where x̄ is the mean and s is the standard deviation
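
The z-score equation above, written as a small Python function. The function name and the numbers are assumptions for illustration only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical test with mean 80 and standard deviation 10.
print(z_score(95, 80, 10))  # 1.5
print(z_score(70, 80, 10))  # -1.0
```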

What happens to the mean and standard deviation when adding or subtracting a constant amount?

The mean changes by the same constant that is being added or subtracted, but the standard deviation stays the same

What happens to the mean and standard deviation when multiplying or dividing a constant amount?

Both the mean and standard deviation are multiplied/divided by the same constant
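
A quick way to check the two rules above (adding a constant versus multiplying by a constant) is to try them on a small data set. The values and the constants 10 and 3 below are made up for the sketch.

```python
import statistics

data = [2, 4, 6, 8]                 # hypothetical values
shifted = [x + 10 for x in data]    # add a constant to every value
scaled = [x * 3 for x in data]      # multiply every value by a constant

# Print the mean and sample standard deviation of each version.
for label, values in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    print(label, statistics.mean(values), round(statistics.stdev(values), 3))
```

The shifted list has a mean 10 higher with the same standard deviation, while the scaled list has both the mean and the standard deviation multiplied by 3.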

What is a percentile?

A number that indicates what percentage of data is at or below a given value. It compares one data point to the rest.

What is the 68-95-99.7 Rule?

For a normal distribution, about 68% of the values fall within one standard deviation of the mean, 95% fall within two standard deviations of the mean, and 99.7% fall within three standard deviations of the mean.
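
The three percentages in the rule can be checked against a standard normal model. The sketch below uses Python's `statistics.NormalDist`; the helper name `within` is an assumption for the example.

```python
from statistics import NormalDist


def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * within(k), 1))
```

The printed values are close to 68, 95, and 99.7, which is why the rule is stated with those rounded numbers.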

When the mean and median are the same

Symmetrical

Categorical

a variable is categorical if the responses fall into categories, such as eye color

Quantitative

a variable is quantitative if the variable is numerical, such as height, and usually includes units to tell how it is measured

Categorical data can be presented in a..

1. Pie Chart
2. Bar Chart
3. Side-by-Side Bar Chart
4. Segmented Bar Chart

Quantitative data can be presented in a..

1. Boxplot
2. Histogram
3. Dot plot
4. Stem-and-Leaf Plot

Who?

Who was measured?
Case- each individual being measured (row)
Observation- data values recorded about each individual (actual number)

What?

What was measured?
Variables- characteristics recorded about each individual, it may change from one individual to another (categorical vs. quantitative)

Where?

Where was the study conducted? The place where the data was collected

When?

When was the study performed?

Why?

Why was the study performed?

How?

How was the data collected?
Ex. sample surveys, observational studies, experiments

How to draw a "quick" boxplot

1. Draw a scale number line.
2. Draw rectangular box with ends at quartiles.
3. Draw line through box at median.
4. Draw two "whiskers" from corresponding ends of box to extreme values (min and max).
5. Find the fences to see if there are outliers

What is a histogram?

Displays a group of data at a glance through counts.

What is a relative frequency histogram?

Shows a group of data/information through percentages.

When describing a distribution, what four things should you always mention?

SOCS: Shape, outliers, center, and spread.

When describing the shape of a distribution what should you look for?

1. Mode (unimodal, bimodal, or multimodal).
2. If it is symmetric.
3. If there are any outliers.

What is the center of distribution?

The typical value. In a symmetric, unimodal histogram, it is directly in the middle.

What is the spread of distribution?

It answers the question, "How much do the data values vary around the center?" It describes a distribution numerically and can be measured by the range or IQR.

When is it appropriate to use a time plot?

When there is data measured over time, and you are looking for a pattern.

What is re-expressing or transforming data?

Applying a simple function to the data, for example to make a skewed distribution more symmetric. This makes the information easier to understand and the center easier to find.

Why is IQR a better indication of spread than range?

Range can be distorted by outliers, but since IQR only accounts for the middle 50% of the data, it isn't affected by them.

What does standard deviation demonstrate?

It is a measurement that describes how far the values in a data set typically are from the mean.

How are stem and leaf plots and histograms similar?

They show individual COUNTS.

What is the formula used to find the position of the median in a data set?

(n+1)/2 = the position (count) of the median. Count this many terms into the ordered data to find the actual VALUE of the median.
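
The counting rule above can be sketched as follows. The function names are hypothetical, and the sketch assumes the usual convention that a position ending in .5 means averaging the two middle values.

```python
def median_position(n):
    """Position of the median in an ordered list of n values (1-based)."""
    return (n + 1) / 2


def median(data):
    """Find the median by counting to the median position in the sorted data."""
    ordered = sorted(data)
    position = median_position(len(ordered))
    if position.is_integer():
        return ordered[int(position) - 1]  # convert 1-based position to 0-based index
    # Position ends in .5: average the two values on either side of it.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_position(5), median([7, 1, 5, 3, 9]))  # 3.0 5
print(median_position(4), median([4, 1, 3, 2]))     # 2.5 2.5
```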

How does standardizing a variable affect the shape, center, and spread of its distribution?

Shape: does not change
Center: changes, because the mean now equals 0
Spread: changes, because the standard deviation now equals 1
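
To see the effect on center and spread, the sketch below standardizes a small made-up data set and recomputes its mean and standard deviation. The function name and the data are assumptions.

```python
import statistics


def standardize(data):
    """Convert every value to a z-score using the data's own mean and sample sd."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [(x - mean) / sd for x in data]


scores = [70, 80, 90]            # hypothetical scores
z_scores = standardize(scores)
print(z_scores)                  # [-1.0, 0.0, 1.0]
print(statistics.mean(z_scores), statistics.stdev(z_scores))
```

The standardized values keep the same relative spacing (shape) as the original scores, but their mean is 0 and their standard deviation is 1.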

What does a box plot show?

Minimum, Q1, Median, Q3, Maximum.
5 NUMBER SUMMARY!

What are disadvantages of box plots?

You can't see the shape in a box plot.
Exact values are not recorded in a box plot

What is SOCS?

Shape, Outliers, Center, Spread
Always define SOCS when writing sentences about graphs

What are advantages of box plots?

Clear picture of middle half of data
box height= IQR
Clear display of outliers

5-number summary is...

min, Q1, median, Q3, max (the standard deviation is not part of the 5-number summary)

conditions of a normal distribution

Unimodal, symmetric, and bell-shaped

How do you find the percentile of a value?

([Number of scores lower + 0.5] ÷ Total number of scores) x 100
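
The percentile formula above, sketched in Python exactly as the card states it. The function name and the scores are hypothetical.

```python
def percentile_of(value, scores):
    """Percentile of value: (number of scores lower + 0.5) / total, times 100."""
    number_lower = sum(1 for s in scores if s < value)
    return (number_lower + 0.5) * 100 / len(scores)


# Hypothetical class of ten test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_of(80, scores))  # 55.0
```

Here five scores are lower than 80, so the formula gives (5 + 0.5) ÷ 10 x 100 = 55.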

IQR is...

Q3-Q1

How many steps must you have when normalizing?

4

What does normalizing do?

Normalizing makes the data comparable to each other through a constant unit

How do you normalize?

Z-score

What is a z-score?

It tells you how many standard deviations a value is from the mean

How do you make a list on your calculator?

STAT, edit, enter

How do you find the mean, median, min, max, Q's, and Standard deviation on your calculator?

STAT, CALC, 1-Var Stats, (Pick List), ENTER

How do you make a dot plot?

Draw a horizontal line and mark it with an appropriate measurement scale. Locate each value in the data set along the measurement scale, and represent it by a dot.

What does truncating mean?

Truncating is cutting off the digits past a chosen place without rounding. ex: 3.09----->3.0
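
A minimal sketch of truncating to one decimal place, compared with rounding. The function name is hypothetical, and because of floating-point representation this simple approach can be off for some inputs, so treat it as an illustration only.

```python
import math


def truncate(value, decimals=1):
    """Drop the digits past the given decimal place without rounding."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


print(truncate(3.09))   # 3.0
print(truncate(3.99))   # 3.9 (rounding to one decimal place would give 4.0)
```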

Standardizing

converting scores from their original values to standard deviation units
A standardized value is also known as a z-score.

Marginal Distribution

In the contingency table, the distribution of either variable alone

Conditional Distribution

The distribution of a variable restricting the Who to consider only a smaller group of individuals

Variance

sum of squared deviations from the mean, divided by the count minus one
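
The definition above translates directly into code. The sketch below uses made-up data and a hypothetical function name, and also takes the square root to get the standard deviation.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by the count minus one."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


data = [2, 4, 6, 8]  # hypothetical values; squared deviations sum to 20
variance = sample_variance(data)
print(round(variance, 3))             # 6.667
print(round(math.sqrt(variance), 3))  # 2.582, the standard deviation
```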

Parameter

Numerically valued attribute of a model

Statistics

Value calculated from data to summarize the aspects of the data

Contingency Table

displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables.

Independence

the conditional distribution for one variable is the same for each category of the other

Standard Error of the Sample Mean

standard deviation divided by the square root of n
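
The quantity described above, the standard deviation divided by the square root of n, can be computed as in this sketch. The function name and the data are assumptions for illustration.

```python
import math
import statistics


def sd_over_root_n(data):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(data) / math.sqrt(len(data))


data = [2, 4, 6, 8]  # hypothetical sample, n = 4
print(round(sd_over_root_n(data), 3))  # 1.291
```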

Skewed Right

Distributions with fewer observations on the right (toward higher values) are said to be skewed right

Skewed Left

Distributions with fewer observations on the left (toward lower values) are said to be skewed left.

Zip codes

Categorical

How do you make quantitative data categorical?

By grouping it

Where do the first standard deviations on a normal curve go?

At the inflection points

What are inflection points?

Where the graph shifts concavity

If the data is skewed right, the mean will be ________ than the median.

Greater

4 steps to do a normal distribution problem

1. State problem
2. Draw/shade a normal curve
3. Solve- use Table A
4. conclude- answer in the context of the problem
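
The four steps can be walked through in code. This is only a sketch: the height model (mean 170, standard deviation 10) and the cutoff 185 are invented for the example, and `statistics.NormalDist` stands in for looking up the z-score in Table A.

```python
from statistics import NormalDist

# Step 1: state the problem. Hypothetical model: heights are Normal with
# mean 170 and standard deviation 10; find the proportion of heights below 185.
mean, sd, value = 170, 10, 185

# Step 2: locate the value on the curve by converting it to a z-score
# (on paper, this is where you draw and shade the normal curve).
z = (value - mean) / sd

# Step 3: solve. NormalDist().cdf gives the area to the left of z,
# which is what Table A reports.
proportion_below = NormalDist().cdf(z)

# Step 4: conclude in the context of the problem.
print(f"About {proportion_below:.1%} of heights are below {value} (z = {z}).")
```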

Segmented Bar Chart

displays the conditional distribution of a categorical variable within each category of another variable (always adds up to 100%)

Independence

Variables are independent if the conditional distribution of one variable is the same for each category of the other
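
One way to check this definition on a contingency table is to compute the conditional distribution within each row and compare them. The table, the group labels, and the function names below are all hypothetical; with real data the distributions rarely match exactly, so "about the same" is usually a judgment call.

```python
# Hypothetical counts: each row is one category of the first variable,
# each column is one category of the second variable.
table = {
    "group A": {"yes": 20, "no": 30},
    "group B": {"yes": 40, "no": 60},
}


def conditional_distribution(row):
    """Proportions of the column variable within a single row."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}


def looks_independent(table, tolerance=1e-9):
    """True if every row has the same conditional distribution (within tolerance)."""
    distributions = [conditional_distribution(row) for row in table.values()]
    first = distributions[0]
    return all(
        abs(dist[category] - first[category]) <= tolerance
        for dist in distributions
        for category in first
    )


for name, row in table.items():
    print(name, conditional_distribution(row))
print(looks_independent(table))  # True: both groups are 40% yes, 60% no
```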

Uniform

A distribution that is relatively flat

Outliers

Extreme values that do not appear to belong with the rest of the data

A stem and leaf always needs a _____ with it

key/legend

When is it appropriate to use a pie chart?

1. When the data adds to 100%
2. When there are 5 categories or less
3. When data can apply to one category only

When is it more appropriate to use a histogram rather than a stem-and-leaf display?

When you have a lot of values in your data

What does sigma represent?

The standard deviation

What does mu (μ) represent?

The mean

Where on the normal curve are the inflection points located?

Where the bell shape changes from curving downward to curving back up.

Three methods for assessing whether or not a distribution is approximately normal:

1. Probability Plot
2. Histogram/Stem and leaf plot
3. 68-95-99.7 rule

Why are standardized units used to compare values with different scales, units or populations?

Because standardized values have no units. They simply measure the distance of each value from the mean in standard deviations.

What is the relationship between variance and standard deviation?

Variance is the sum of squared deviations from the mean divided by the count minus one, while standard deviation is the square root of the variance.

Equation for finding the count (position) of the median?

(n+1)/2