Standard Deviation as a ruler
Standard deviation tells us how the whole collection of values varies
Standardized values (z)
AKA z-scores. No units. Measures the distance of each data value from the mean in standard deviations.
Meaning of pos/neg z-scores
Negative z-score = data value is below the mean, positive z-score = data value is above the mean
Benefits of standardizing units
Can compare values that are measured on different scales, with different units, or from different populations
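A rough sketch of such a comparison in Python: two scores from different tests are put on the same unitless scale. The scores, means, and standard deviations are made-up numbers for illustration, and `z_score` is a hypothetical helper name.

```python
def z_score(value, mean, sd):
    """Distance of value from the mean, measured in standard deviations."""
    return (value - mean) / sd

# Hypothetical numbers, chosen only for illustration.
z_test_a = z_score(value=650, mean=500, sd=100)  # test A, its own scale
z_test_b = z_score(value=27, mean=21, sd=5)      # test B, a different scale

# z-scores have no units, so the two results can be compared directly.
better = "test A" if z_test_a > z_test_b else "test B"
print(z_test_a, z_test_b, better)
```

With these invented numbers the test A score sits 1.5 standard deviations above its mean and the test B score about 1.2, so the test A score is the more impressive one relative to its group.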
Shifting data
Add/subtract a constant amount to each value; measures of center and position shift by that constant, while measures of spread and the shape of the graph (histograms, box plots) are unchanged
Rescaling data
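Multiply/divide each data value by a constant; measures of center and spread are multiplied/divided by that same constant (a positive one), the shape is unchanged, and histograms get different bin lengths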
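The effect of shifting versus rescaling can be checked numerically. This is a minimal sketch using Python's standard `statistics` module; the data values and the constants 10 and 3 are made up.

```python
from statistics import mean, stdev

data = [2.0, 4.0, 4.0, 6.0, 9.0]   # made-up values

shifted = [y + 10 for y in data]   # shifting: add a constant
rescaled = [y * 3 for y in data]   # rescaling: multiply by a constant

# Shifting moves the center but leaves the spread alone.
print(mean(data), mean(shifted))
print(stdev(data), stdev(shifted))

# Rescaling multiplies both the center and the spread by the constant.
print(mean(rescaled), stdev(rescaled))
```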
How standardization to z-scores works
Shift the data by subtracting the mean, then rescale the values by dividing by the standard deviation
Effect of standardizing on graphs
Shape doesn't change; the center changes because the mean becomes 0, and the spread changes because the standard deviation becomes 1
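A small sketch of standardizing a whole data set, assuming the sample standard deviation is the one wanted. The helper name `standardize` and the data are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Return the z-score of each value: shift by the mean, rescale by the SD."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation
    return [(y - center) / spread for y in values]

data = [2.0, 4.0, 4.0, 6.0, 9.0]   # made-up values
z = standardize(data)

# After standardizing, the mean is 0 and the standard deviation is 1
# (up to floating-point rounding); the order of the values is unchanged.
print(mean(z), stdev(z))
```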
Larger z-score
The larger the z-score in absolute value, the more unusual the data value is (far above or far below the mean)
Normal model
AKA bell-shaped curves. Models for distributions whose shapes are unimodal and roughly symmetric. Provide a measure of how extreme a z-score is.
Parameters of the normal graph
N(μ, σ): mean and standard deviation, respectively
Standard Normal model (distribution)
N(0, 1): the Normal model with mean 0 and standard deviation 1
Nearly Normal Condition
The shape of the data's distribution is unimodal and roughly symmetric; checked with a histogram or a Normal probability plot
Normal cdf
Enter lower bound, upper bound, mean, standard deviation (in this order). Use to find the percentage of values between the two bounds; with mean 0 and standard deviation 1, the bounds are z-scores.
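For checking calculator answers, the same area can be sketched in Python with `statistics.NormalDist` from the standard library. The wrapper name `normal_cdf` is hypothetical and only mimics the calculator's argument order; `-1e99` stands in for negative infinity.

```python
from statistics import NormalDist

def normal_cdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the N(mean, sd) model between lower and upper."""
    model = NormalDist(mu=mean, sigma=sd)
    return model.cdf(upper) - model.cdf(lower)

# Proportion of a standard Normal model lying below z = 1.
below_one = normal_cdf(-1e99, 1.0)

# Proportion lying between z = -1 and z = 1.
middle = normal_cdf(-1.0, 1.0)

print(round(below_one, 4), round(middle, 4))
```

Multiply the result by 100 to read it as a percentage.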
Inverse Norm
Enter percentage (as a decimal), mean, standard deviation (in this order). Use to find the cutoff value; with mean 0 and standard deviation 1, the result is a z-score. The percentage entered is the area to the left of the z-score line.
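The reverse lookup can be sketched the same way in Python, using `NormalDist.inv_cdf` from the standard library. The wrapper name `inverse_norm` is hypothetical and only mirrors the calculator's argument order.

```python
from statistics import NormalDist

def inverse_norm(area_left, mean=0.0, sd=1.0):
    """Cutoff value with the given area (a proportion between 0 and 1) to its left."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(area_left)

# z-score that has 90% of a standard Normal model to its left.
z_cut = inverse_norm(0.90)
print(round(z_cut, 4))
```

To use an area to the right instead, subtract it from 1 before passing it in.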
68-95-99.7 Rule
In a Normal distribution, about 68% of values lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three (bands around the mean with widths of two, four, and six standard deviations)
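The three percentages can be reproduced from the standard Normal model. This is a quick check in Python with `statistics.NormalDist`; the variable names are arbitrary.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in coverage.items():
    print(f"within {k} SD of the mean: {share:.1%}")
```

The computed values are slightly more precise than the rule's rounded figures (for example, a bit over 95% within two standard deviations).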
Normal probability plot
Used to assess normality. If the points lie close to a straight line, the plot indicates that the data are approximately Normal.
Associations
Scatterplot; the relationship between one variable and another
Direction
Scatterplot; positive or negative? Do the points trend up or down from left to right?
Form
Scatterplot; linear relationship? Exponential? Logarithmic?
Strength
Scatterplot; strong, weak, or none? How tightly do the points follow the form (e.g., a line)?
Unusual features of a scatterplot
Look for the unexpected; outliers, clusters (subgroups), etc.
Explanatory/predictor variable
Plotted on the x-axis of the scatterplot
Response variable
Plotted on the y-axis of the scatterplot
Standardizing scatterplots
Plot the z-scores of both variables, so the origin sits at the mean of each variable and both axes are in standard deviations; gives a neutral way of drawing the scatterplot and a fairer impression of the strength of the association
Correlation coefficient (r)
Gives us a numerical measurement of the direction and strength of the linear relationship between the explanatory and response variables.
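One common way to compute r is to average the products of paired z-scores, dividing by n - 1. The Python below is a sketch of that formula, not the calculator's internal method; the function name `correlation` and the data are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(products) / (len(xs) - 1)

# Made-up data with a strong, positive, roughly linear association.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(x, y)
print(round(r, 3))
```

Because the formula uses z-scores, r has no units and does not change when either variable is shifted or rescaled by a positive constant.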
Finding the correlation coefficient (r)
List -> calc, LinReg(ax+b). Remember to turn diagnostics on first (use the catalog to activate) so that r is displayed.