Chapter 13 - Correlation & Linear Regression

When do we use statistical procedures such as correlation?

When variables are continuous rather than categorical

What are continuous variables?

The possible values can fall anywhere along a numeric continuum rather than into separate categories

How can two continuous variables be distinguished from each other?

Labelling as X and Y

What is a scatter plot?

A graphical display used for paired scores on two variables

What can a scatter plot tell us?

Provides an initial indication of how scores on two variables are associated with each other

Why do we use a scatter plot?

To initially see if the scores on one variable are associated with scores on another variable

What does it mean when variables being analyzed are continuous in nature?

They are measured along a numeric continuum and may be illustrated using a scatter plot

What does a dot on a scatter plot represent?

A pair of scores for the two variables

What is correlation?

A mutual or reciprocal relationship between two variables such that changes in one variable are accompanied by changes in another variable

How do we describe the relationship between variables?

1. Nature
2. Strength
3. Direction

What is a bivariate relationship?

The relationship between two variables

What does correlation describe?

The relationship between two variables

Does correlation equal causation?

NO!

What are the 4 key assumptions of correlation?

1. Ratio or interval scale of measurement
2. Both variables are continuous
3. Each variable is more or less normally distributed
4. Linear relationship between variables

What is another name for the correlation coefficient?

Pearson product moment correlation

What is an example of a non-linear relationship?

Curvilinear

Nature of the relationship

Manner in which changes in scores on one variable correspond to changes in scores on another variable

Linear relationship

A relationship between variables is appropriately represented by a straight line, such that increases or decreases in scores for one variable are associated with corresponding increases or decreases in scores for another variable

Nonlinear relationship

A relationship between variables which is not appropriately represented by a straight line

What does the angle of the line indicate?

That changes in one variable are associated with changes in the other variable

What do you need in order to relate one variable to another?

There must be differences in the scores of BOTH variables

Which are more common: linear or nonlinear relationships?

Linear

Direction of the relationship

The direction in which changes in one variable are associated with changes in another

What is a positive relationship?

Increases in the scores for one variable are associated with increases in the scores of another variable

What is a negative relationship?

Increases in the scores for one variable are associated with decreases in the scores of another variable

What does a negative relationship imply? That the relationship is worse?

No, simply that the scores on two variables move in opposite directions

What does a positive relationship imply? That the relationship is better?

No, simply that the scores on the two variables move in the same direction

Strength of the relationship

The extent to which scores on one variable are associated with scores on another variable

What is a perfect relationship?

A relationship in which each score for one variable is associated with one and only one score for the other variable

What would a perfect relationship look like?

All of the data in a perfect relationship would fall exactly on the line; it allows you to perfectly predict values

What does having a strong relationship between two variables allow us to do?

Knowing the score on one variable allows us to predict the score on the other variable within a small range

What does the scatter plot look like as the relationship between two variables becomes weaker?

More circular (the points spread out rather than clustering around a line)

What is a "zero" relationship?

Exists when all of the scores on one variable are associated with a wide range of scores on another variable; the points form a circle on a scatter plot

What does it mean for prediction when there is no relationship between two variables?

Knowing the score on one variable does not allow us to predict the score on the other variable with any degree of precision

What are correlational statistics?

Statistics designed to measure relationships between variables

What is the Pearson correlation coefficient represented by?

The symbol r

What is the Pearson correlation coefficient?

A measure of the linear relationship between two continuous variables measured at the interval and/or ratio level of measurement

What is the Pearson correlation coefficient designed to measure?

The nature, direction, and strength of the relationship between two variables

What does Pearson r assume?

That the relationship may be represented by a straight line, such that increases (or decreases) in one variable correspond to increases (or decreases) in the other variable

Do we use Pearson r for relationships that are nonlinear?

No

What does the sign (+ or -) of Pearson r indicate?

The direction of the relationship; can be either positive or negative

How is the strength of the relationship indicated by Pearson r?

By the absolute value of the correlation; r can range from -1.00 to 1.00, and values closer to -1.00 or 1.00 indicate a stronger relationship

What is the numerical value of the correlation for a perfect relationship?

-1.00 or 1.00. This is the strongest relationship possible

What does a correlation coefficient of 0 indicate?

No linear relationship

How can we interpret the Pearson r coefficient?

Using Cohen's guidelines for interpreting values of Pearson r

What does a Pearson r between 0.10 and 0.30 indicate?

A positive weak relationship

What does a Pearson r between -0.10 and -0.30 indicate?

A negative weak relationship

What does a Pearson r between 0.30 and 0.50 indicate?

A positive moderate relationship

What does a Pearson r between -0.30 and -0.50 indicate?

A negative moderate relationship

What does a Pearson r between 0.50 and 1.00 indicate?

A positive strong relationship

What does a Pearson r between -0.50 and -1.00 indicate?

A negative strong relationship

If r exceeds + or - 0.50, what is the strength of the relationship?

Strong
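
The cutoffs above can be sketched as a small Python function. This is only an illustration: the function name is made up, the "negligible" label for values under 0.10 is not part of these notes, and because the listed ranges share their endpoints (0.30 and 0.50), assigning a boundary value to the stronger category is an assumption.

```python
def describe_strength(r):
    """Label a Pearson r using the cutoffs listed in these notes."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and 1.00")
    size = abs(r)  # strength ignores the sign
    if size < 0.10:
        # Assumption: the notes give no label below 0.10.
        return "negligible"
    if size < 0.30:
        strength = "weak"
    elif size < 0.50:
        # Assumption: a value of exactly 0.30 is counted as moderate.
        strength = "moderate"
    else:
        # Assumption: a value of exactly 0.50 is counted as strong.
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{direction} {strength}"


print(describe_strength(0.25))   # positive weak
print(describe_strength(-0.42))  # negative moderate
```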

What do the guidelines for interpreting the values of Pearson r not take into account?

1. Sample size
2. Statistical significance

What are correlational statistics based on?

Variance

What does it mean when both variables contain a certain amount of variance?

That there are differences among the scores for the two variables

What is covariance?

Refers to the extent to which two variables vary together such that they have shared variance

What is shared variance?

Variance in common with each other

What does measuring the relationship between two variables involve?

The extent to which two variables vary on their own and the extent to which they vary together

Can variables have both variance and covariance?

Of course!

What is Pearson r a ratio of?

The covariance between two variables to the variance of the two variables

What does it mean if the amount of covariance is high relative to the variance of the two variables?

A strong relationship will exist between the two variables, and the value for r will be high

What does it mean if the amount of covariance is relatively low?

There will be a weak relationship between the two variables as reflected by a smaller value of r

What will Pearson r be if there is absolutely no covariance between two variables?

0

What are the 3 steps to calculating Pearson r?

i. Represent the covariance between X and Y (SPxy)
ii. Represent the variance of Variable X (SSx)
iii. Represent the variance of Variable Y (SSy)
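
The three quantities above can be combined in a short Python sketch. It assumes the standard definitional formula r = SPxy / sqrt(SSx * SSy), which these notes describe only in words (covariance relative to variance). The function names and the sample scores are made up for illustration.

```python
import math


def sums_for_r(x, y):
    """Return SPxy, SSx, and SSy for paired scores x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must hold the same number of scores (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products: how X and Y deviate from their means together.
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squares: how each variable deviates from its own mean.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp_xy, ss_x, ss_y


def pearson_r(x, y):
    """Pearson r as SPxy divided by the square root of SSx * SSy."""
    sp_xy, ss_x, ss_y = sums_for_r(x, y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("both variables must have differences in their scores")
    return sp_xy / math.sqrt(ss_x * ss_y)


# Made-up paired scores: SPxy = 6, SSx = 10, SSy = 6.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 2))  # 0.77
```

The check for zero sums of squares mirrors the earlier point that there must be differences in the scores of both variables before they can be related.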

What is the covariance between X and Y represented by?

Sum of products (SPxy)

What does a positive correlation coefficient mean?

One variable increases as the other increases

What does a negative correlation coefficient mean?

One variable increases as the other decreases

What does the correlation coefficient indicate?

The strength and direction of the relationship between two variables

When stating statistical hypotheses for correlation coefficients, how is Pearson r represented?

"rho" or ρ

What are the overall steps for testing a correlation coefficient?

1. State the null and alternative hypotheses
2. Make a decision about the null hypothesis
3. Draw a conclusion from the analysis
4. Relate the result of the analysis to the research hypothesis

What is the null hypothesis for correlation coefficients?

H0: ρ = 0
(There is no relationship between the two variables [i.e., population r is equal to 0])
Correlation between the two variables does not exist in the population!

What is the alternative hypothesis for correlation coefficients?

H1: ρ ≠ 0
(There is a relationship between the two variables [population r is not equal to 0])
Correlation between the two variables does exist in the population!

What does Greek "rho" represent?

The population correlation coefficient

What is the equation for degrees of freedom with correlation?

df = N - 2
N = number of pairs of scores (not individual scores)

What does N represent in the df formula?

The number of pairs of scores involved in the computation of the correlation
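
A brief Python sketch of the degrees of freedom calculation. The second function goes beyond these notes: it assumes the commonly used conversion of r to a t statistic, t = r * sqrt(df / (1 - r^2)), which can then be compared with a critical value from a t table. Both function names are made up for illustration.

```python
import math


def correlation_df(n_pairs):
    """Degrees of freedom for a correlation: df = N - 2 (N = pairs of scores)."""
    if n_pairs < 3:
        raise ValueError("at least 3 pairs of scores are needed for df >= 1")
    return n_pairs - 2


def t_from_r(r, n_pairs):
    """Assumed conversion (not in the notes): t = r * sqrt(df / (1 - r^2))."""
    if abs(r) >= 1.0:
        raise ValueError("t is undefined for a perfect correlation")
    df = correlation_df(n_pairs)
    return r * math.sqrt(df / (1 - r ** 2))


# 20 pairs of scores means 20 - 2 = 18 degrees of freedom.
print(correlation_df(20))  # 18
```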

Can the statistical hypotheses for correlation be two-tailed?

Yes - because correlation coefficients can be positive or negative

How is the variance of variable X represented?

Sum of squares X (SSx)

How is the variance of variable Y represented?

Sum of squares Y (SSy)

How do we calculate the effect size of Pearson r?

The square of the Pearson product-moment correlation coefficient (r^2)

What does r^2 represent?

The proportion of variance in one variable that is shared with (accounted for by) the other variable

What is a small effect for r^2?

0.01

What is a medium effect for r^2?

0.09

What is a large effect for r^2?

0.25

What are the different sizes of effect for r^2?

Small, medium, large
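
The r^2 benchmarks above can be sketched in Python as follows. Treating 0.01, 0.09, and 0.25 as lower bounds for each label is an assumption, as is the "below small" label; the function name is made up for illustration.

```python
def r_squared_effect(r):
    """Return r^2 and an effect-size label based on the benchmarks in these notes."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and 1.00")
    r2 = r ** 2  # squaring removes the sign, so direction does not matter
    if r2 >= 0.25:
        label = "large"
    elif r2 >= 0.09:
        label = "medium"
    elif r2 >= 0.01:
        label = "small"
    else:
        # Assumption: the notes give no label below 0.01.
        label = "below small"
    return r2, label


r2, label = r_squared_effect(-0.6)
print(round(r2, 2), label)  # 0.36 large
```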

What are the different strengths for Pearson r?

Weak, moderate, strong

What is linear regression?

A procedure in which a straight line is fitted to a set of data to best represent the relationship between two variables

What is the purpose of linear regression?

To predict one variable from another

What is regression?

May be defined as the use of a relationship between two or more correlated variables to predict values of one variable from values of other variables

What is the goal of regression?

To predict one variable from other variables

What is the slope of a line?

The angle or tilt of the line relative to the X-axis (horizontal axis) which is also referred to as rise/run

What is the formula for a straight line?

Y = mX + b

What is the Y-intercept?

The point at which the line crosses the Y-axis when X = 0

What does the linear regression equation do?

It predicts a score on one variable from a score on another variable based on the relationship between them

What is the linear regression equation?

Y' = a + bX

What does b represent?

Slope

What does a represent?

Y-intercept

What is Y' as opposed to Y?

Y' represents a predicted value for the Y variable rather than an actual Y value

What is the purpose of the linear regression equation?

To predict scores on one variable based on its relationship with another variable in a sample of data

What are the two main steps in calculating the linear regression equation?

1) Calculate slope (b)
2) Calculate Y-intercept (a)
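
The two steps can be sketched in Python. The notes do not give the formulas, so this assumes the usual least-squares versions: b = SPxy / SSx and a = mean of Y minus b times mean of X. The function name and the sample scores are made up for illustration.

```python
def regression_coefficients(x, y):
    """Return (a, b) for Y' = a + bX using assumed least-squares formulas."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must hold the same number of scores (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    if ss_x == 0:
        raise ValueError("X must vary to estimate a slope")
    b = sp_xy / ss_x          # step 1: slope
    a = mean_y - b * mean_x   # step 2: Y-intercept
    return a, b


# Made-up paired scores: SPxy = 6 and SSx = 10, so b = 0.6 and a = 4 - 0.6 * 3.
a, b = regression_coefficients([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

The slope is calculated first because the intercept formula depends on it.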

What information do you need in order to solve for a predicted Y value (Y')?

The slope, Y-intercept, X value...
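
As a small illustration in Python, a predicted value follows directly from Y' = a + bX once the slope, Y-intercept, and an X value are known. The function name and the numbers used here are hypothetical.

```python
def predict_y(a, b, x):
    """Predicted score Y' = a + bX for a given Y-intercept a, slope b, and X value."""
    return a + b * x


# Hypothetical values: Y-intercept 2.2, slope 0.6, and an X score of 6.
print(round(predict_y(2.2, 0.6, 6), 2))  # 5.8
```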

What are the assumptions of linear regression?

1. For each value of X there is an array of possible Y values
2. The mean of the distribution of possible Y values is on the regression line
3. The standard deviation of the distribution of possible Y values is constant regardless of the value of X

What is homoscedasticity?

This assumption means that the variance around the regression line is the same for all values of the predictor variable (X).

What is a violation of homoscedasticity?

Variability around the regression line that differs across values of X (for example, greater scatter at high values of X than at low values) violates homoscedasticity
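
One rough way to look for unequal spread, sketched in Python: split the cases at the middle of the X values and compare the average squared distance from the regression line in each half. This is an informal check assumed for illustration, not a formal test and not a procedure from these notes; the function name and data are made up.

```python
def residual_spread_by_half(x, y, a, b):
    """Mean squared residual for the lower and upper halves of X (informal check)."""
    if len(x) != len(y) or len(x) < 4:
        raise ValueError("need at least 4 paired scores")
    pairs = sorted(zip(x, y))  # order the cases by their X value
    # Residual: actual Y minus predicted Y' = a + bX.
    residuals = [yi - (a + b * xi) for xi, yi in pairs]
    mid = len(residuals) // 2
    low, high = residuals[:mid], residuals[mid:]

    def mean_square(values):
        return sum(v ** 2 for v in values) / len(values)

    return mean_square(low), mean_square(high)


# Made-up data where the scatter around the line Y' = 0 + 1X grows with X.
low_spread, high_spread = residual_spread_by_half(
    [1, 2, 3, 4], [1.1, 1.9, 5.0, 1.0], a=0.0, b=1.0
)
print(low_spread < high_spread)  # True
```

Very different values for the two halves suggest that the spread around the line is not constant across X.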

What is research literacy?

The ability to critically evaluate research and information to determine best practices

What is the main purpose of correlation?

To measure and test the relationship between variables