Which of the following would be LEAST affected if a data set contained one outlier?
median
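A quick check of this answer, as a minimal sketch with made-up numbers (the data values below are hypothetical, not from any course data set):

```python
from statistics import mean, median

# Hypothetical measurements; the value 200 plays the role of the single outlier.
data = [10, 12, 13, 15, 16]
with_outlier = data + [200]

# The mean is pulled strongly toward the outlier; the median barely moves.
print("mean:  ", mean(data), "->", mean(with_outlier))
print("median:", median(data), "->", median(with_outlier))
```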
The numerator in the Bayes' rule formula is a
joint probability, P(B|A)P(A); the marginal probability P(B) is the denominator
Gender would be an example of
a qualitative variable
Which of the following distributions would be likely to model the number of trucks arriving at
a warehouse in a particular time interval?
Poisson
A variance can never be
negative
The expected value of a random variable
is the long-run average (mean) of the variable and need not be an observable value
Which of the following is not a discrete random variable?
The number of minutes required to run 1 mile
Statistical Inference
is the science of using a sample to make generalizations about the important aspects of a population.
binomial distribution
Each trial results in a success or failure.
Trials are independent of each other.
The probability of success remains constant from trial to trial.
The experiment consists of n identical trials.
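Under those four conditions the number of successes x has probability C(n, x) p^x (1 - p)^(n - x). A small sketch of that formula, with n = 10 and p = 0.3 chosen only for illustration:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical values
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(binomial_pmf(3, n, p))  # probability of exactly 3 successes
print(sum(probs))             # the probabilities over x = 0..n should total 1
```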
A probability distribution that describes the time or space between successive
occurrences of an event is an
Exponential probability distribution
A ratio variable has the following characteristic:
Inherently defined zero value
The set of all possible outcomes for an experiment is called a
Sample space
The area under the curve of a valid continuous probability distribution must
equal 1
The median is said to be resistant to
extreme values
If the mean, median, and mode for a given population all equal 25, then we know that the shape of the distribution of the population is
Symmetrical
If the mean is greater than the median, then the distribution is
Skewed right
In an observational study, the variable of interest is called a
response variable
The two types of quantitative variables are:
Interval and ratio
Jersey numbers of soccer players are an example of a
Nominative variable
The number of miles a truck is driven before it is overhauled is an example of a
Ratio variable
Beginning the vertical scale of a graph at a value different from zero
can cause increases to look more dramatic.
The stem-and-leaf display is advantageous because
it allows us to actually see the measurements in the data set.
All of the following are used to describe quantitative data
Histogram
Stem-and-leaf chart
Dot plot
A Pareto chart
can be used to differentiate the vital few causes of quality problems from the trivial many causes of quality problems.
A histogram that tails out toward larger values is
Skewed to the right
If events A and B are mutually exclusive, then
P(A|B) is always equal to zero.
Bayes Theorem uses prior probabilities with
additional information to compute posterior probabilities.
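To make the prior-to-posterior update concrete, here is a rough sketch. The function name and the numbers (1% prior, 95% detection rate, 5% false-alarm rate) are assumptions chosen for illustration:

```python
def posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """Return P(A | evidence) from the prior P(A) and the two likelihoods."""
    joint = p_evidence_given_a * prior                        # P(evidence and A)
    evidence = joint + p_evidence_given_not_a * (1 - prior)   # P(evidence), by total probability
    return joint / evidence

# Hypothetical inputs: prior P(A) = 0.01, P(evidence | A) = 0.95, P(evidence | not A) = 0.05.
updated = posterior(prior=0.01, p_evidence_given_a=0.95, p_evidence_given_not_a=0.05)
print(round(updated, 3))
```

Even with strong evidence, the posterior stays far below the detection rate here because the prior is so small.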
If two events are independent, we can
Multiply their probabilities to determine the intersection probability.
A Conditional probability
is the probability that one event will occur given that we know that another event already has occurred.
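The last few cards (mutually exclusive events, independent events, conditional probability) can be checked by enumerating a small sample space. The two-dice experiment and the event names below are assumptions picked for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two fair dice: 36 equally likely ordered pairs.
sample_space = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(sample_space))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); assumes P(b) > 0."""
    return prob(a & b) / prob(b)

first_is_six = {o for o in sample_space if o[0] == 6}
second_is_six = {o for o in sample_space if o[1] == 6}
sum_is_two = {o for o in sample_space if sum(o) == 2}

# Independent events: the intersection probability equals the product.
print(prob(first_is_six & second_is_six), prob(first_is_six) * prob(second_is_six))
# Mutually exclusive events: the conditional probability is zero.
print(cond_prob(sum_is_two, first_is_six))
```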
The mean and the variance of a Poisson random variable are
equal
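A numerical check, as a sketch: sum the Poisson probabilities e^(-mu) mu^x / x! directly. The rate mu = 4 and the cutoff at x = 59 are assumptions; the probability beyond the cutoff is negligible for this rate.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

mu = 4.0          # hypothetical average number of arrivals per interval
xs = range(60)    # truncation point; the tail beyond it is negligible for mu = 4

expected = sum(x * poisson_pmf(x, mu) for x in xs)
spread = sum((x - expected) ** 2 * poisson_pmf(x, mu) for x in xs)

print(expected, spread)  # both should be very close to mu
```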
The mean and median are the same for a
normal distribution
The actual weight of hamburger patties is an example of a
continuous random variable
The z value tells us the number of standard deviations
that a value x is from the mean.
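In symbols, z = (x - mean) / standard deviation. A minimal sketch, using a hypothetical mean of 100 and standard deviation of 15:

```python
def z_value(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

print(z_value(130, 100, 15))  # two standard deviations above the mean
print(z_value(85, 100, 15))   # one standard deviation below the mean
```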
The exponential distribution
would most likely be used to describe the distribution of time between arrivals of customers at the grocery store.
If we have a sample size of 100 and the estimate of the population proportion is .10,
we can estimate the sampling distribution of p hat with a normal distribution.
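The usual justification is a rule of thumb on n p and n (1 - p). The sketch below assumes the common threshold of 5 (some texts use 10 instead) and also computes the standard error of p hat:

```python
from math import sqrt

n, p_hat = 100, 0.10

# Rule-of-thumb check; the threshold of 5 is an assumption that varies by textbook.
threshold = 5
large_enough = n * p_hat >= threshold and n * (1 - p_hat) >= threshold

# Standard deviation of the sampling distribution of p hat.
std_error = sqrt(p_hat * (1 - p_hat) / n)

print(large_enough, std_error)
```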
The reason sample variance has a divisor of n minus 1 rather than n is that
it makes the variance an unbiased estimate of the population variance.
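Unbiasedness can be verified exactly on a tiny example. The sketch below lists every ordered sample of size 2, drawn with replacement, from a hypothetical three-value population and averages the two versions of the sample variance:

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2, 4, 9]                          # hypothetical tiny population
samples = list(product(population, repeat=2))   # all 9 ordered samples of size 2

# statistics.variance divides by n - 1; statistics.pvariance divides by n.
avg_n_minus_1 = mean([variance(s) for s in samples])
avg_n = mean([pvariance(s) for s in samples])

print(pvariance(population))  # true population variance
print(avg_n_minus_1)          # matches the population variance on average
print(avg_n)                  # too small on average
```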
The Central Limit Theorem states that
as the sample size increases, the sampling distribution of the sample mean more closely approximates a normal distribution, whatever the shape of the population distribution.
If the sampled population distribution is skewed,
then in most cases the sampling distribution of the mean can be approximated by the normal distribution if the sample size n is at least 30.
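A simulation sketch of this behavior. The exponential population, the seed, and the replication counts are all assumptions; the point is only that means of samples of size 30 are far less skewed than the individual draws.

```python
import random
from statistics import mean

def skewness(values):
    """Moment-based sample skewness; roughly 0 for a symmetric distribution."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2**1.5

random.seed(0)  # fixed seed so the sketch is repeatable

# A strongly right-skewed population: exponential with mean 1.
draws = [random.expovariate(1.0) for _ in range(5000)]

# 2000 sample means, each from a sample of size n = 30.
sample_means = [mean([random.expovariate(1.0) for _ in range(30)]) for _ in range(2000)]

print("skewness of individual draws:", skewness(draws))
print("skewness of sample means:    ", skewness(sample_means))
```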
As the sample size increases
the variation of the sampling distribution of X bar decreases
Whenever the population has a normal distribution, the sampling distribution of X bar
is normally distributed for any sample size
If we wish to estimate a population parameter by using a sample statistic, we are using
Point estimation
When the level of confidence and sample standard deviation remain the same
a confidence interval for a population mean based on a sample of n = 100 will be narrower than a confidence interval for a population mean based on a sample of n = 50.
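The half-width of the interval is the critical value times s / sqrt(n), so a larger n shrinks it. In this sketch the critical value 1.96 and s = 10 are assumptions held fixed for both sample sizes (with a t-based interval the critical value would also shrink slightly as n grows):

```python
from math import sqrt

def margin_of_error(critical_value, s, n):
    """Half-width of a confidence interval for a population mean."""
    return critical_value * s / sqrt(n)

# Hypothetical inputs: same confidence level and sample standard deviation.
wide = margin_of_error(1.96, 10, 50)
narrow = margin_of_error(1.96, 10, 100)

print(wide, narrow)  # doubling n shrinks the half-width by a factor of sqrt(2)
```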
When determining the sample size n for a confidence interval for a population proportion p, if you have no idea what value p is
use p = .5.
The t distribution approaches the z distribution as
the sample size increases
When determining the sample size, if the value found is not an integer initially, you should
always round up to the next integer.
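A sketch of the sample-size calculation for a proportion, n = z^2 p (1 - p) / E^2, using the conservative p = 0.5 and rounding up. The function name, z = 1.96, and the margins are assumptions for illustration:

```python
from math import ceil

def sample_size_for_proportion(z, margin, p=0.5):
    """Smallest n whose margin of error is at most `margin`; p = 0.5 is the worst case."""
    return ceil(z**2 * p * (1 - p) / margin**2)

print(sample_size_for_proportion(1.96, 0.03))         # no idea about p, so use 0.5
print(sample_size_for_proportion(1.96, 0.03, p=0.2))  # a known smaller p needs fewer observations
print(sample_size_for_proportion(1.96, 0.015))        # a smaller margin of error needs a larger n
```

Rounding down would leave the margin of error slightly larger than requested, which is why the result is always rounded up.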
When the sample size and the sample proportion P hat remain the same, a 90 percent confidence interval for a population proportion p will be
Narrower than a confidence interval for p built at any higher level of confidence
If everything else is held constant, decreasing the margin of error causes the required sample size to
increase
In testing the equality of population variances, two assumptions are required
independent samples and normally distributed populations
In testing the difference between two population variances, it is a common practice to compute the F statistic
so that its value is always greater than or equal to one.
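One way to follow that convention, sketched with a hypothetical helper and made-up samples, is to put the larger sample variance in the numerator:

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Ratio of sample variances with the larger one on top, so F >= 1.

    Assumes both sample variances are positive.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical samples; the order of the arguments does not change the result.
print(f_statistic([1, 2, 3], [2, 4, 6]))
print(f_statistic([2, 4, 6], [1, 2, 3]))
```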
When comparing two population means based on independent random samples,
the pooled estimate of the variance is used when there is an assumption of equal population variances.
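The pooled estimate weights each sample variance by its degrees of freedom. A minimal sketch with made-up samples:

```python
from statistics import variance

def pooled_variance(sample_1, sample_2):
    """Pooled variance estimate, assuming the two population variances are equal."""
    n1, n2 = len(sample_1), len(sample_2)
    return ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)

# Hypothetical independent samples.
print(pooled_variance([1, 2, 3], [2, 4, 6, 8]))
```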
An experiment in which there is no relationship between the measurements on the different samples is an
Independent samples experiment
In testing the equality of population variance, what assumption should be considered?
Independent samples and Normal distribution of the populations
When using a randomized block design, the interaction effect between the block and treatment factors
cannot be separated from the error term.
Different levels of a factor are called
Treatments
In one-way ANOVA, other factors being equal, the further apart the treatment means are from each other
the more likely we are to reject the null hypothesis associated with the ANOVA F test.
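To see why, compute the one-way ANOVA F statistic (between-treatment mean square over within-treatment mean square) for two hypothetical data sets that have the same within-group spread but different separation between the treatment means:

```python
from statistics import mean

def one_way_f(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (n - k)) for k groups, n observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: identical spread inside each group, different gaps between group means.
close_means = [[9, 10, 11], [10, 11, 12], [11, 12, 13]]
far_means = [[9, 10, 11], [15, 16, 17], [21, 22, 23]]

print(one_way_f(close_means), one_way_f(far_means))  # the second F is much larger
```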
A Randomized block
design is an experimental design that compares v treatments by using d blocks, where each block is used exactly once to measure the effect of each treatment.
The dependent variable, the variable of interest in an experiment, is also called the
response variable
The residual is the difference between the
observed value of the dependent variable and the predicted value of the dependent variable.
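A small sketch of residuals from a least-squares line. The data points and the helper name are hypothetical:

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for simple linear regression."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)

# Residual = observed y minus the y predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)
print(sum(residuals))  # least-squares residuals sum to (numerically) zero
```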
In simple regression analysis, r² is a
proportion, often reported as a percentage, of the total variation in y that is explained by the simple linear regression model.
When using simple linear regression, we would like to use confidence intervals for the
Mean y-value and prediction intervals for the individual y-value at a given value of x
In a simple linear regression analysis, the correlation coefficient and the slope
always have the same sign
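The reason is that both share the numerator Sxy: the slope is Sxy / Sxx and r is Sxy / sqrt(Sxx Syy), and the denominators are positive. A sketch with made-up data, which also squares r to get the proportion of variation explained:

```python
from math import sqrt
from statistics import mean

def slope_and_r(xs, ys):
    """Least-squares slope and correlation coefficient for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                              # hypothetical data
up_slope, up_r = slope_and_r(xs, [2, 4, 5, 4, 5])     # upward trend
down_slope, down_r = slope_and_r(xs, [5, 4, 5, 4, 2]) # same values, downward trend

print(up_slope, up_r, up_r**2)
print(down_slope, down_r, down_r**2)
```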
For a given data set, specific value of X, and confidence level, if all the other factors are constant, the confidence interval for the mean value of Y will
never be wider than the corresponding prediction interval for the individual value of Y.
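The two half-widths differ only by an extra "1 +" under the square root, which accounts for the variability of an individual observation. In the sketch below the function name and every input value are assumptions for illustration:

```python
from math import sqrt

def interval_half_widths(t_crit, s, n, x0, x_bar, sxx):
    """Half-widths of the confidence interval (mean of y) and prediction interval (individual y) at x0."""
    distance = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * sqrt(distance)        # interval for the mean value of y
    pi_half = t_crit * s * sqrt(1 + distance)    # interval for an individual value of y
    return ci_half, pi_half

# Hypothetical inputs: t critical value, standard error s, n, x0, mean of x, and Sxx.
ci_half, pi_half = interval_half_widths(t_crit=3.182, s=0.9, n=5, x0=4, x_bar=3, sxx=10)
print(ci_half, pi_half)
```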