qbus ch.14

an independent variable, x, explains variation (that is, change) in another variable, which is called the

dependent variable, y

the sample correlation coefficient, r,

indicates both the strength and direction of the linear relationship between the independent and dependent variables
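
A minimal sketch of computing r by hand. The data are made up for illustration (hours of study and exam grades), and the helper name sample_correlation is an assumption, not something from the chapter.

```python
import math

def sample_correlation(x, y):
    """Sample correlation coefficient r for paired values x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of study (x) and exam grade (y)
hours = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
r = sample_correlation(hours, grades)
print(round(r, 4))
```

The sign of r gives the direction of the relationship, and its closeness to -1 or +1 gives the strength.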

the size of a television screen

the selling price of the television screen

the number of visitors per day on a web site

the amount of sales per day from the web site

the curb weight of a car

the car's gas mileage

the population correlation coefficient (ρ)

refers to the correlation between all values of two variables of interest in a population

the technique of simple regression analysis

enables us to describe a straight line that best fits a series of ordered pairs (x,y)

the difference between the actual data values and the predicted values is known as

the residual

the least squares method is

a mathematical procedure used to identify the linear equation that best fits a set of ordered pairs
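
A short sketch of the least squares calculation for one independent variable, using the usual slope and intercept formulas. The data and the function name least_squares_fit are hypothetical.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Hypothetical data: hours of study (x) and exam grade (y)
hours = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
b0, b1 = least_squares_fit(hours, grades)

# Residual = actual value minus predicted value
predicted = [b0 + b1 * h for h in hours]
residuals = [g - p for g, p in zip(grades, predicted)]
print(b0, b1, residuals)
```

The residuals of a least squares line with an intercept sum to zero, which is a quick check on the arithmetic.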

the regression line

is the line that best fits the data

the total sum of squares (SST)

measures the total variation in the dependent variable

the sum of squares error (SSE)

measures the variation in the dependent variable that is explained by variables other than the independent variable

the sum of squares regression (SSR)

measures the amount of variation in the dependent variable (e.g., exam grade) that is explained by the independent variable (e.g., hours of study)
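
The three sums of squares are linked by SST = SSR + SSE. A sketch under assumed data (the numbers and the name sums_of_squares are illustrative only):

```python
def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for a simple least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    predicted = [b0 + b1 * a for a in x]
    sst = sum((b - mean_y) ** 2 for b in y)                 # total variation
    ssr = sum((p - mean_y) ** 2 for p in predicted)         # explained by x
    sse = sum((b - p) ** 2 for b, p in zip(y, predicted))   # left unexplained
    return sst, ssr, sse

# Hypothetical data: hours of study (x) and exam grade (y)
hours = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
sst, ssr, sse = sums_of_squares(hours, grades)
print(sst, ssr, sse)
```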

the sample coefficient of determination, R²

measures the percentage of the total variation of our dependent variable that is explained by our independent variable from a sample
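
As a sketch, R² can be computed as SSR / SST. The data below are made up and the function name r_squared is an assumption.

```python
def r_squared(x, y):
    """Coefficient of determination, SSR / SST, for a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sst = sum((b - mean_y) ** 2 for b in y)
    ssr = sum((b0 + b1 * a - mean_y) ** 2 for a in x)
    return ssr / sst

# Hypothetical data: hours of study (x) and exam grade (y)
hours = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
r2 = r_squared(hours, grades)
print(round(r2, 4))
```

With a single independent variable, R² equals the square of the sample correlation coefficient r.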

the population coefficient of determination, ρ²

measures for an entire population the percentage of the total variation of a dependent variable that is explained by an independent variable

the standard error of the estimate, s_e

measures the amount of dispersion of observed data around a regression line
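
A sketch of the usual simple regression formula s_e = sqrt(SSE / (n - 2)), where n - 2 is the degrees of freedom left after estimating the slope and intercept. The data and the function name are hypothetical.

```python
import math

def standard_error_of_estimate(x, y):
    """s_e = sqrt(SSE / (n - 2)) for a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Sum of squared residuals around the fitted line
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    return math.sqrt(sse / (n - 2))

# Hypothetical data: hours of study (x) and exam grade (y)
hours = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
se = standard_error_of_estimate(hours, grades)
print(round(se, 4))
```

A smaller s_e means the observed points sit closer to the regression line.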

the first assumption of a regression analysis is that the relationship between the independent and dependent variables

is linear

the second assumption of a regression analysis is that the residuals exhibit no patterns across values for the

independent (age) variable

homoscedasticity

is a regression assumption that states that the variation of the dependent variable is the same across all values for the independent variable

a normal probability plot

is used to verify whether data follow the normal probability distribution by graphing the data on the y-axis and the corresponding z-scores on the x-axis
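
A sketch of how the points for such a plot might be built. It assumes the plotting position (i - 0.5) / n for the i-th smallest value, which is one common convention and may differ from the textbook's; the data and function name are hypothetical.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each sorted data value (y-axis) with a standard normal z-score (x-axis)."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value
    z_scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, ordered))

# Hypothetical residuals from a fitted regression line
residuals = [-1.0, 1.5, -1.0, 1.5, -1.0]
points = normal_plot_points(residuals)
for z, value in points:
    print(round(z, 3), value)
```

If the points fall roughly along a straight line, the data are consistent with a normal distribution.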

just because the linear relationship between the variables is statistically significant doesn't prove that the independent variable actually

caused the change in the dependent variable