Stats Final

Which of the following would be LEAST affected if a data set contained one outlier?

median
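
A minimal sketch of why the median is the least affected, using made-up data (the values below are illustrative assumptions, not from the course):

```python
from statistics import mean, median

# Hypothetical data set, then the same data with a single outlier appended.
data = [10, 12, 13, 15, 20]
with_outlier = data + [200]

# How far each measure of center moves when the outlier is added.
mean_shift = mean(with_outlier) - mean(data)        # 45 - 14
median_shift = median(with_outlier) - median(data)  # 14 - 13

print(f"mean moved by {mean_shift}, median moved by {median_shift}")
```

The mean is pulled strongly toward the outlier, while the median only moves to the next middle position.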

The numerator in the Bayes' rule formula is a

joint probability (the marginal probability is the denominator)

Gender would be an example of

a qualitative variable

Which of the following distributions would be likely to model the number of trucks arriving at
a warehouse in a particular time interval?

Poisson

A variance can never be

negative

The expected value of a random variable

need not be an observable value
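
As a quick check on how an expected value is computed, here is a sketch that assumes a fair six-sided die (an illustrative choice, not from the notes):

```python
from fractions import Fraction

# Assumed example: a fair die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

# E[X] = sum of x * P(X = x) over all possible values x.
expected_value = sum(x * prob for x in faces)

print(expected_value, expected_value in faces)
```

The result is 7/2, which is a long-run average and not one of the faces the die can actually show.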

Which of the following is not a discrete random variable?

The number of minutes required to run 1 mile

Statistical Inference

is the science of using a sample to make generalizations about the important aspects of a population.

binomial distribution

Each trial results in a success or failure.
Trials are independent of each other.
The probability of success remains constant from trial to trial.
The experiment consists of n identical trials.
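
Under these four conditions the number of successes follows the binomial formula. A small sketch, with n = 10 and p = 0.3 chosen only for illustration:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with constant success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical parameters for illustration.
n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # probabilities should sum to 1
mean = sum(x * px for x, px in enumerate(pmf))     # should equal n * p

print(round(total, 6), round(mean, 6))
```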

A probability distribution that describes the time or space between successive
occurrences of an event is an

Exponential probability distribution

A ratio variable has the following characteristic:

Inherently defined zero value

The set of all possible outcomes for an experiment is called a

Sample space

The area under the curve of a valid continuous probability distribution must

equal 1

The median is said to be resistant to

extreme values

If the mean, median, and mode for a given population all equal 25, then we know that the shape of the distribution of the population is

Symmetrical

If the mean is greater than the median, then the distribution is

Skewed right

In an observational study, the variable of interest is called a

response variable

The two types of quantitative variables are:

Interval and ratio

Jersey numbers of soccer players are an example of a

Nominative variable

The number of miles a truck is driven before it is overhauled is an example of a

Ratio variable

Beginning the vertical scale of a graph at a value different from zero

can cause increases to look more dramatic.

The stem-and-leaf display is advantageous because

it allows us to actually see the measurements in the data set.

All of the following are used to describe quantitative data

Histogram
Stem-and-leaf chart
Dot plot

A Pareto chart

can be used to differentiate the vital few causes of quality problems from the trivial many causes of quality problems.

A histogram that tails out toward larger values is

Skewed to the right

If events A and B are mutually exclusive, then

P(A|B) is always equal to zero.

Bayes' theorem uses prior probabilities with

additional information to compute posterior probabilities.
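
A worked sketch of the prior-to-posterior update. The event names and all three input probabilities are made-up assumptions for illustration:

```python
# Hypothetical inputs: event D with a prior, and a test result "pos".
prior = 0.01              # P(D)
p_pos_given_d = 0.95      # P(pos | D)
p_pos_given_not_d = 0.05  # P(pos | not D)

# Numerator of Bayes' rule: the joint probability P(pos and D).
joint = p_pos_given_d * prior

# Denominator: the marginal probability P(pos), summed over D and not D.
marginal = joint + p_pos_given_not_d * (1 - prior)

# Posterior probability P(D | pos).
posterior = joint / marginal

print(round(posterior, 4))
```

The additional information (a positive result) raises the probability of D above its prior, though here it stays well below 1 because D is rare.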

If two events are independent, we can

Multiply their probabilities to determine the intersection probability.
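
The multiplication rule can be checked by listing a sample space. This sketch assumes two fair dice and two events picked only for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair dice: 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a function of one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_even(o):
    return o[0] % 2 == 0

def second_is_six(o):
    return o[1] == 6

p_a = prob(first_is_even)
p_b = prob(second_is_six)
p_both = prob(lambda o: first_is_even(o) and second_is_six(o))

print(p_a, p_b, p_both)
```

Because the two dice do not influence each other, the intersection probability equals the product of the individual probabilities.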

A conditional probability

is the probability that one event will occur given that we know that another event already has occurred.

The mean and the variance of a Poisson random variable are

equal
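
A numerical check, assuming a Poisson mean of 4 (an arbitrary illustrative value) and truncating the infinite sum at 100 terms:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

mu = 4.0          # hypothetical mean
xs = range(100)   # truncation point; the tail beyond this is negligible for mu = 4

mean = sum(x * poisson_pmf(x, mu) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)

print(round(mean, 6), round(variance, 6))
```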

The mean and median are the same for a

normal distribution

The actual weight of hamburger patties is an example of a

continuous random variable

The z value tells us the number of standard deviations

that a value x is from the mean.
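
In symbols, z = (x - mean) / standard deviation. A short sketch with made-up numbers:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical values: mean 100, standard deviation 15, observation 130.
z = z_score(130, 100, 15)
print(z)
```

A negative z would mean the value lies below the mean.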

The exponential distribution

would most likely be used to describe the distribution of time between arrivals of customers at the grocery store.

If we have a sample size of 100 and the estimate of the population proportion is .10,

we can estimate the sampling distribution of p hat with a normal distribution.
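
A sketch of the usual check behind this answer. The threshold of 5 is an assumption here: textbooks differ, and some require 10 instead.

```python
from math import sqrt

n, p_hat = 100, 0.10
threshold = 5  # assumed rule of thumb; some texts use 10

# Both the expected successes and expected failures must be large enough.
normal_ok = n * p_hat >= threshold and n * (1 - p_hat) >= threshold

# Standard deviation of the sampling distribution of p hat.
std_error = sqrt(p_hat * (1 - p_hat) / n)

print(normal_ok, round(std_error, 4))
```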

The reason sample variance has a divisor of n minus 1 rather than n is that

it makes the variance an unbiased estimate of the population variance.
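
Unbiasedness can be verified exactly on a tiny example. This sketch assumes a three-value population and enumerates every ordered sample of size 2 drawn with replacement (both choices are illustrative):

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2, 4, 9]            # hypothetical population
sigma2 = pvariance(population)    # true population variance (divisor N)

# All 9 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Average of the sample variance over every possible sample.
avg_n_minus_1 = mean(variance(s) for s in samples)   # divisor n - 1
avg_n = mean(pvariance(s) for s in samples)          # divisor n

print(sigma2, avg_n_minus_1, avg_n)
```

Averaged over all samples, the n - 1 version matches the population variance, while the n version comes out too small.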

The Central Limit Theorem states that

as sample size increases, the sampling distribution of the sample mean more closely approximates a normal distribution, whatever the shape of the population distribution.

If the sampled population distribution is skewed,

then in most cases the sampling distribution of the mean can be approximated by the normal distribution if the sample size n is at least 30.

As the sample size increases

the variation of the sampling distribution of X bar decreases
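
The standard deviation of X bar is sigma divided by the square root of n. A sketch with an assumed population standard deviation of 12:

```python
from math import sqrt

sigma = 12.0  # hypothetical population standard deviation

# Standard deviation of the sampling distribution of X bar for several n.
standard_errors = {n: sigma / sqrt(n) for n in (4, 16, 64, 256)}

for n, se in standard_errors.items():
    print(n, se)
```

Quadrupling the sample size cuts the standard error in half.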

Whenever the population has a normal distribution, the sampling distribution of X bar

is a normal distribution for any sample size

If we wish to estimate a population parameter by using a sample statistic, we are using

Point estimation

When the level of confidence and sample standard deviation remain the same

a confidence interval for a population mean based on a sample of n = 100 will be narrower than a confidence interval for a population mean based on a sample of n = 50.
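
A sketch comparing the two margins of error. It assumes a large-sample z interval and a sample standard deviation of 8; a t interval would show the same direction of change.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.95):
    """Half-width of a large-sample z interval for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / sqrt(n)

s = 8.0  # hypothetical sample standard deviation
m50 = margin_of_error(s, 50)
m100 = margin_of_error(s, 100)

print(round(m50, 3), round(m100, 3))
```

Doubling n shrinks the interval by a factor of the square root of 2, not by half.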

When determining the sample size n for a confidence interval for p, if you have no idea what value p is

use p = .5.

The t distribution approaches the z distribution as

the sample size increases

When determining the sample size, if the value found is not an integer initially, you should

always choose the next highest integer value.
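
Both rules (use p = .5 when p is unknown, and round up) appear in the sample-size formula for a proportion, n = z² p(1 - p) / E². A minimal sketch; the function name and the margins used are illustrative assumptions:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n giving the requested margin of error for a proportion.

    p = 0.5 maximizes p * (1 - p), so it is the conservative default.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / margin**2)  # always round up

n_05 = sample_size_for_proportion(0.05)
n_03 = sample_size_for_proportion(0.03)

print(n_05, n_03)
```

Rounding down would leave the margin of error slightly larger than requested, which is why the result is always rounded up.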

When the sample size and the sample proportion P hat remain the same, a 90 percent confidence interval for a population proportion p will be

Narrower than the corresponding confidence interval at any higher level of confidence

If everything else is held constant, decreasing the margin of error causes the required sample size to

increase

In testing the equality of population variances, two assumptions are required

independent samples and normally distributed populations

In testing the difference between two population variances, it is a common practice to compute the F statistic

so that its value is always greater than or equal to one.
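
One common way to do this is to put the larger sample variance in the numerator. A sketch with two made-up samples:

```python
from statistics import variance

# Hypothetical independent samples.
sample_1 = [21, 24, 19, 30, 26]
sample_2 = [22, 23, 21, 24, 25]

s1_sq = variance(sample_1)  # sample variances (divisor n - 1)
s2_sq = variance(sample_2)

# Larger variance on top, so F is at least 1.
f_stat = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

print(s1_sq, s2_sq, f_stat)
```

The numerator degrees of freedom must then come from whichever sample supplied the larger variance.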

When comparing two population means based on independent random samples,

the pooled estimate of the variance is used when there is an assumption of equal population variances.
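
The pooled estimate weights each sample variance by its degrees of freedom. A sketch with made-up samples of different sizes:

```python
from statistics import variance

def pooled_variance(x, y):
    """Pooled variance, assuming the two population variances are equal."""
    n1, n2 = len(x), len(y)
    return ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)

# Hypothetical independent samples.
a = [21, 24, 19, 30, 26]
b = [22, 23, 21, 24, 25, 23]

sp2 = pooled_variance(a, b)
print(round(sp2, 4))
```

The pooled value always falls between the two individual sample variances.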

An experiment in which there is no relationship between the measurements on the different samples is an

Independent samples experiment

In testing the equality of population variances, what assumptions should be considered?

Independent samples and Normal distribution of the populations

When using a randomized block design, the interaction effect between the block and treatment factors

cannot be separated from the error term.

Different levels of a factor are called

Treatments

In one-way ANOVA, other factors being equal, the further apart the treatment means are from each other

the more likely we are to reject the null hypothesis associated with the ANOVA F test.

A Randomized block

design is an experimental design that compares v treatments by using d blocks, where each block is used exactly once to measure the effect of each treatment.

The dependent variable, the variable of interest in an experiment, is also called the

response variable

The residual is the difference between the

observed value of the dependent variable and the predicted value of the dependent variable.
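
A sketch that fits a least squares line by hand and computes the residuals. The x and y values are made up for illustration:

```python
from statistics import mean

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sxy / sxx            # least squares slope
b0 = y_bar - b1 * x_bar   # least squares intercept

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]  # observed minus predicted

print([round(e, 3) for e in residuals])
```

For a least squares line with an intercept, the residuals sum to zero apart from rounding error.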

In simple regression analysis, r² is a

percentage measure and measures the proportion of the variation explained by the simple linear regression model.

When using simple linear regression, we would like to use confidence intervals for the

Mean y-value and prediction intervals for the individual y-value at a given value of x

In a simple linear regression analysis, the correlation coefficient and the slope

always have the same sign
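
The slope is Sxy / Sxx and r is Sxy / sqrt(Sxx * Syy), so both take their sign from Sxy. A sketch on made-up data with a downward trend, which also computes r squared:

```python
from math import sqrt

# Hypothetical data with a negative relationship.
x = [1, 2, 3, 4, 5]
y = [9.8, 8.1, 6.2, 3.9, 2.0]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

slope = sxy / sxx
r = sxy / sqrt(sxx * syy)   # correlation coefficient
r_squared = r**2            # proportion of variation explained

print(round(slope, 3), round(r, 4), round(r_squared, 4))
```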

For a given data set, specific value of X, and confidence level, if all the other factors are constant, the confidence interval for the mean value of Y will

never be wider than the corresponding prediction interval for the individual value of Y.
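
Both intervals have the form (predicted y) ± t × s × factor, and only the factor differs. A sketch comparing the two factors at several x values, with made-up x data:

```python
from math import sqrt

# Hypothetical x values from the fitted data set.
x = [1, 2, 3, 4, 5]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

def interval_factors(x0):
    """Return (confidence interval factor, prediction interval factor) at x0."""
    distance = 1 / n + (x0 - x_bar) ** 2 / sxx
    return sqrt(distance), sqrt(1 + distance)

factors = {x0: interval_factors(x0) for x0 in (1, 3, 5, 8)}

for x0, (ci, pi) in factors.items():
    print(x0, round(ci, 3), round(pi, 3))
```

The extra 1 under the square root accounts for the variation of an individual y value around the mean, so the prediction interval is the wider of the two at every x.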