Statistics Chapter 3

The relationship between two variables can be _______ influenced by _____ variables that are _____ in the background.

strongly, other, lurking

Response Variable

measures an outcome of a study (dependent variable)

Explanatory Variable

helps explain or influences changes in a response variable (independent variable)

The ____ variable depends on the ______ variable.

response, explanatory

Remember that calling one variable explanatory and the other response ______ necessarily mean that changes in one _____ changes in the other.

doesn't, cause

What is the most effective way to display the relationship between two quantitative variables?

A scatterplot

Scatter plot

shows the relationship between two quantitative variables measured on the same individuals.


The overall pattern moves from upper left to lower right. Positive or negative association?

negative

How is the strength of a relationship in a scatter plot determined?

by how closely the points follow a clear form

How do you interpret a scatter plot?

-look for the overall pattern and for striking deviations from that pattern
-describe the overall pattern by the direction, form, and strength of the relationship
-an important kind of deviation is an outlier, an individual value that falls outside the overall pattern of the relationship

negatively associated

When above-average values of one tend to accompany below-average values of the other, and vice versa.

positively associated

When above-average values of one tend to accompany above-average values of the other, and below-average values also tend to occur together.

Our eyes are ___ good judges of how strong a linear relationship is.

not

correlation measures

the direction and strength of the linear relationship between two quantitative variables.
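
A minimal sketch of how r can be computed by hand, assuming the usual textbook formula: standardize each x and y, multiply the pairs, add them up, and divide by n - 1. The function name `correlation` and the data (hours studied, test score) are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r = correlation(hours, score)
print(round(r, 3))  # positive and close to 1: strong positive linear association
```

Flipping the sign of every score would flip the sign of r without changing its size, which is the sense in which r captures both direction and strength.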

Correlation is written as

r

correlation is not ____

resistant; r is strongly affected by a few outlying observations, so use r with caution if outliers appear
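
To see how much a single outlier can move r, here is a small sketch with made-up numbers. The helper `correlation` is a hypothetical name; it uses the sum-of-products form, which is algebraically equivalent to the standardized-value formula.

```python
from statistics import mean

def correlation(xs, ys):
    """Correlation r via sums of squares and cross-products."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data with a clear upward trend
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 5, 4, 5, 7]

r_without = correlation(xs, ys)
# Add one outlying observation far from the pattern
r_with = correlation(xs + [20], ys + [0])

print(round(r_without, 3), round(r_with, 3))
```

In this example one extra point is enough to turn a strong positive correlation into a negative one.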

Correlation makes __ distinction between explanatory and response variables.

no; it makes no difference which variable you call x and which variable you call y

Because r uses the standardized values of the observations, r does __ change when we change the units of measurement of x, y, or both.

not; the correlation r itself has no unit of measurement, it is just a number
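
A quick numerical check of the two properties above (swapping x and y, and changing units), using made-up heights and weights. The helper name `correlation` is hypothetical, and the pound conversion factor is approximate.

```python
from statistics import mean

def correlation(xs, ys):
    """Correlation r via sums of squares and cross-products."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical measurements
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 75, 88]

r_metric = correlation(heights_cm, weights_kg)

# Same individuals, different units
heights_in = [h / 2.54 for h in heights_cm]      # 2.54 cm per inch
weights_lb = [w * 2.20462 for w in weights_kg]   # approximate lb per kg
r_imperial = correlation(heights_in, weights_lb)

# Swap the roles of x and y
r_swapped = correlation(weights_kg, heights_cm)

print(r_metric, r_imperial, r_swapped)
```

All three values agree up to floating-point rounding.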

positive r indicates positive association between the variables and negative r indicates negative association


the correlation r is __ a number between -1 and 1.

always

Correlation requires that both variables be _______ so that it makes sense to do the arithmetic indicated by the formula for r.

quantitative

Correlation measures the strength of only the linear relationship between two variables.

correlation does not describe curved relationships between variables, no matter how strong they are
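
As an illustration, the sketch below uses made-up data that follow a perfect curve (y = x squared, symmetric about zero). The relationship is as strong as it can be, yet r comes out as 0. The helper name `correlation` is hypothetical.

```python
from statistics import mean

def correlation(xs, ys):
    """Correlation r via sums of squares and cross-products."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# A perfect curved relationship: every point lies exactly on y = x^2
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = correlation(xs, ys)
print(r_curved)
```

This is why a scatterplot should always be checked before relying on r.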

correlation is not a complete summary of two-variable data

give means and standard deviations of both x and y along with the correlation


linear relationships


the strength of a relationship is determined by how close the points in the scatterplot lie to a simple form such as a line

regression line

summarizes the relationship between two variables

What does regression require?

that you have an explanatory variable and a response variable

We use a regression line

to predict the value of y for a given value of x

Regression Line=

y = a + bx, where b is the slope and a is the intercept

Slope (b)

the amount by which y changes when x increases by one unit

You ___ say how important a relationship is by looking at how big the regression slope is.

cannot

Extrapolation

the use of a regression line for prediction outside the range of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.

least-squares regression line

of y on x is the line that makes the sum of the squared vertical distances of the data points from the line as small as possible

least squares regression line equation =

ŷ = a + bx, with slope b = r(standard deviation of y / standard deviation of x) and intercept a = ybar - b(xbar), so the line passes through the point (xbar, ybar)
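
The equation above can be sketched in a few lines. This assumes sample standard deviations and made-up data; the function name `least_squares_line` is hypothetical.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r * s_y / s_x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Correlation from standardized values
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept, so the line passes through (xbar, ybar)
    return a, b

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.3f} + {b:.3f}x")

# Predict y for a value of x inside the observed range
x_new = 3.5
print(a + b * x_new)
```

Because a is defined as ybar - b(xbar), plugging x = xbar into the line always returns ybar.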


Residual

the difference between the observed value of the response variable and the value predicted by the regression line: residual = observed y - predicted y

What is the sum of the least-squares residuals?

always zero

What do residual plots help you assess?

how well a regression line fits the data
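
A small sketch of computing residuals for a least-squares line, using made-up data. The function name `fit_line` is hypothetical; it computes the slope from sums of squares, which gives the same value as r times (standard deviation of y / standard deviation of x). Printing each residual next to its x value is a text stand-in for a residual plot.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = fit_line(xs, ys)

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.3f}")

# For a least-squares line the residuals add up to zero (up to rounding)
print(sum(residuals))
```

If the residuals showed a curve or a fan shape when listed against x, that would suggest a straight line is not a good description of the data.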

The residual plot should show no obvious __.

pattern

Increasing (or decreasing) spread about the line as x increases indicates that prediction of y will be less accurate for larger x (smaller x).


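Coefficient of Determination (r^2)

numerical quantity that tells us how well the least-squares line does at predicting values of the response variable y; it is the fraction of the variation in y that is explained by the least-squares regression of y on x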
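
One way to compute r^2 is as 1 minus (sum of squared residuals / total sum of squares of y), which for a least-squares line matches the square of the correlation. The sketch below checks this on made-up data; the names `fit_line` and `r_squared` are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.5, 5.0, 8.5, 9.0, 12.5]

a, b = fit_line(xs, ys)
y_bar = mean(ys)

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation in y
r_squared = 1 - sse / sst

# Square of the correlation, computed directly for comparison
x_bar = mean(xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
r = sxy / (sxx * sst) ** 0.5

print(round(r_squared, 4), round(r ** 2, 4))
```

A value near 1 means the line accounts for most of the variation in y; a value near 0 means it accounts for very little.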

you read slope as:

a change of one standard deviation in x corresponds to a change of r standard deviations in y

Correlation and regression describe only ___ relationships.

linear

An observation is influential for a statistical calculation

if removing it would markedly change the result of the calculation

Lurking Variable

A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two

Correlations based on averages are usually too ___ when applied to individuals.

high
