Math 120 Unit 1 Test Review

Define statistics.

The science of collecting, organizing, summarizing, and analyzing information to draw conclusions and answer questions, while providing a measure of confidence in any conclusion.

A(n) ______ is a person or object that is a member of the population being studied.

Individual

________ statistics consists of organizing and summarizing the information collected, while ________ statistics uses methods that generalize results obtained from a sample to the population and measure the reliability of the results.

Descriptive, Inferential

A ________ is a numerical summary of a sample. A ________ is a numerical summary of a population.

Statistic, Parameter

________ are the characteristics of the individuals of the population being studied.

Variables

A sample of professors is selected and it is found that 55% own a computer.

Statistic because the value is a numerical measurement describing a characteristic of a sample.

The average annual salary of 50 of a company's 800 employees is $54,000.

Statistic because the data set of salaries of 50 employees is a sample.
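
The statistic/parameter distinction can be sketched in Python. This is only an illustration: the 800 salaries below are simulated, made-up values (the card gives only the sample mean), and the names `population`, `parameter`, `sample`, and `statistic` are assumptions chosen for readability.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: salaries of all 800 employees (made-up values).
population = [random.gauss(54_000, 8_000) for _ in range(800)]

# Parameter: a numerical summary of the whole population.
parameter = statistics.mean(population)

# Statistic: a numerical summary of a sample of 50 employees.
sample = random.sample(population, 50)
statistic = statistics.mean(sample)

print(f"parameter (all 800 employees): {parameter:,.0f}")
print(f"statistic (sample of 50):      {statistic:,.0f}")
```

The two numbers are usually close but not equal, which is why a value computed from 50 of the 800 employees is a statistic rather than a parameter.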

In a national survey on substance abuse, 66.4% of respondents who were full-time college students aged 18 to 22 reported using alcohol within the past month.

The value is a statistic because the respondents who were full-time college students aged 18 to 22 are a sample.

Goals scored in a hockey game.

Quantitative, numerical measure

Height

Quantitative, numerical measure

Favorite song

Qualitative, attribute classification

Number of beats in a song

Discrete, countable

Weight carried by a soldier in combat

Continuous, not countable

Number of aces held in a poker hand

Discrete, countable

The rankings of songs in the top 100

Ordinal

Years of elections: 1988, 1990, 1992, 1994, and 1996

Interval

Highest degree conferred (high school, bachelors, and so on)

Ordinal

Movie ratings with 5 stars

Ordinal

The Gallup Organization contacts 2290 undergraduates who attend a university and live in the US and asks whether or not they spent more than $200 on food in the last month.
What is the population in the study?
What is the sample?

Undergrads who attend a university and live in the US.
The 2290 undergrads who attend a university and live in the US.

A quality-control manager randomly selects 20 bottles of bleach that were filled on November 11 to assess the calibration of the filling machine.
What is the population in the study?
What is the sample in the study?

Bottles of bleach produced in the plant on November 11.
The 20 bottles of bleach selected from those produced in the plant on November 11.

Study of widescreen high-definition Televisions
Setup: A, B, C, D, E
Size (in): 59, 41, 55, 55, 45
Screen type: Plasma, Projection, Projection, Projection, Plasma
Number of Channels: 298, 113, 425, 269, 290
What are the individuals?
What are the variables?

The setups A through E of widescreen high-def TVs
Size (59, 41, 55, 55, 45), screen type (Plasma, Projection, Projection, Projection, Plasma), and number of channels (298, 113, 425, 269, 290)
Size: continuous
Screen type: qualitative
Number of channels: discrete

If a researcher wanted to describe countries based on ISBN group identifier, what level of measurement would "ISBN group identifier" be?
Now researchers felt that certain countries with greater populations received higher identifying numbers. Does the lev

Nominal
Yes, changes to ordinal

What is an observational study?

Measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.

What is a designed experiment?

When a researcher assigns individuals to a certain group, intentionally changing the value of an explanatory variable and then recording the value of the response variable for each group.

What is meant by confounding?

Occurs when the effects of two or more explanatory variables are not separated. Any relation that exists between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.

What is a lurking variable?

An explanatory variable that was not considered in a study but that affects the value of the response variable in the study. Lurking variables are typically related to the explanatory variables considered in the study.

50 patients with brain cancer are divided into two groups. One group receives an experimental drug to fight cancer, the other a placebo. After two years, the spread of cancer is measured.

Experiment because the researchers control one variable to determine the effect on the response variable.

5th grade students randomly divided into two groups. One group is taught math using traditional techniques. The other taught math using a reform method. After 1 year, each group is given an achievement test to compare its proficiency with that of the othe

Experiment because the researchers control one variable to determine the effect on the response variable.

What does it mean when sampling is done without replacement?

Once an individual is selected, the individual cannot be selected again.
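
A short Python sketch of the difference, using the standard library: `random.sample` draws without replacement, while `random.choices` draws with replacement. The list of ten individuals is a made-up example.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

individuals = list(range(1, 11))  # hypothetical individuals numbered 1 to 10

# Without replacement: each individual can appear at most once.
without_replacement = random.sample(individuals, 5)

# With replacement: the same individual may be selected more than once.
with_replacement = random.choices(individuals, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```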

A(n) _________ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.

Cluster sample

A(n) ___________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.

Stratified sample
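
Cluster and stratified samples are easy to mix up, so here is a minimal Python sketch that contrasts them. The six groups of five individuals are hypothetical, and the sample sizes (2 groups, 2 individuals per group) are arbitrary choices for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: 6 groups of 5 individuals, labeled (group, member).
groups = {g: [(g, m) for m in range(5)] for g in "ABCDEF"}

# Cluster sample: randomly select whole groups, then take everyone in them.
chosen_groups = random.sample(sorted(groups), 2)
cluster_sample = [ind for g in chosen_groups for ind in groups[g]]

# Stratified sample: randomly select some individuals from every group.
stratified_sample = [ind for g in groups for ind in random.sample(groups[g], 2)]

print("cluster sample groups:   ", sorted({g for g, _ in cluster_sample}))
print("stratified sample groups:", sorted({g for g, _ in stratified_sample}))
```

The cluster sample contains all members of only a few groups, while the stratified sample contains a few members of every group.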

True or False?
A simple random sample (SRS) is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size.

False because other sampling techniques achieve alternative goals better than a simple random sample.

True or False?
Inferences based on voluntary response samples are generally not reliable.

True because it is often the case that the individuals who volunteer do not accurately represent the group.

What kind of sampling?
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Ford selects every 11th truck that comes off the assembly line starting with the 4th until she obtains a sample of 140 trucks.

Systematic
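
The selection rule in this card (start with the 4th truck, take every 11th, stop at 140 trucks) can be written out directly. A minimal Python sketch; the variable names are illustrative.

```python
start, step, size = 4, 11, 140  # values taken from the card above

# Positions on the assembly line of the trucks that get selected.
selected = [start + step * k for k in range(size)]

print(selected[:5])   # [4, 15, 26, 37, 48]
print(selected[-1])   # 1533
```

So the manager would need at least 1533 trucks to come off the line to complete the sample.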

What kind of sampling?
To determine customer opinion of their safety features, Toyota randomly selects 40 dealerships during a certain week and surveys all customers visiting the dealerships.

Cluster

What kind of sampling?
IBM wants to administer a satisfaction survey to its current customers. Using its customer database, the company randomly selects 50 customers and asks them about their level of satisfaction with the company.

Simple random

What kind of sampling?
A newspaper asks 18 readers to call in their opinion regarding the number of books they have read this month.

Convenience

What kind of sampling?
To determine her stress level, Jean divides her day into 3 parts: morning, afternoon, and evening. She then measures her stress level at 4 randomly selected times during each part of the day.

Stratified

What does it mean when a population is under-represented?

When it is proportionally smaller in a sample than in its population.

Nonsampling error

Results from the process of obtaining the data

Sampling error

Results from using a sample to estimate information about a population

Owner of a shopping mall...research survey the first 80 customers who come into the food court during weekend afternoons to determine the types of food the shoppers would like to see.
What is the cause of the bias?
What is a remedy to the problem?

Sampling bias
Ask customers throughout the day on both weekdays and weekends.

A polling organization conducts a study to estimate the percentage of households that has more than one computer. It mails a questionnaire to 1369 randomly selected households across the US and asks about the computers. Of the 1369 households selected, 32

Nonresponse

Magazine conducting a study on the effects of infidelity in a marriage. Editors randomly select 400 women whose husbands were unfaithful and ask "Do you believe a marriage can survive when the husband destroys the trust that must exist between husband an

Response bias
Reword the question

Survey regarding violence among teenagers in LA school districts. Cluster sample of 21 schools within the LA school district and sample all sophomore students at the schools.
What kind of bias does this study have?

Response bias

Phone survey about stamp design; if respondents are not home, the pollsters call again.
What tactic is used to increase response rate?

Callbacks

Poll being conducted at a mall to obtain a sample of the population of an entire country.
What is the frame for this type of sampling?
Who is excluded?

People who shop at the mall
Anyone who does not shop at the mall is excluded, which could result in sampling bias due to undercoverage.

A class is surveyed about pizza toppings using an SRS of 30 people. However, 10 don't participate.
Does this affect the ability to obtain accurate polling results?

Yes, especially if the people who don't want to participate have a trait that is not accurately represented by the remaining people in the class.

What are the advantages of having a presurvey with open-ended questions to assist in constructing a questionnaire that has closed questions?

Researchers can learn common answers.

Nominal level of measurement

If the values of the variable name, label, or categorize, and the naming scheme does not allow for the values of the variable to be arranged in a ranked or specific order.

Ordinal level of measurement

If it has the properties of the nominal level of measurement and the naming scheme allows for the values of the variable to be arranged in a ranked or specific order.

Interval level of measurement

If it has the properties of the ordinal level of measurement and the differences in the values of the variable have meaning. A value of zero does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed

Ratio level of measurement

If it has the properties of the interval level of measurement and the ratios of the values of the variable have meaning. A value of zero means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on the v

The sum of deviations about the mean always equals

Zero
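
A quick numerical check of this fact in Python, using the five TV sizes from the widescreen TV card earlier in this review. The variable names are illustrative.

```python
sizes = [59, 41, 55, 55, 45]  # TV sizes (in) from the earlier card

mean = sum(sizes) / len(sizes)           # 51.0
deviations = [x - mean for x in sizes]   # [8.0, -10.0, 4.0, 4.0, -6.0]

print(sum(deviations))  # 0.0
```

With data whose mean is not exactly representable in floating point, the printed sum may be a tiny nonzero number because of rounding, not because the fact fails.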

The standard deviation is used in conjunction with the _______ to numerically describe distributions that are bell shaped. The ______ measures the center of the distribution, while the standard deviation measures the _________ of the distribution.

Mean, Mean, Spread

True or False: When comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the two populations has the same unit of measure.

True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value and therefore more dispersed.

True or False: Chebyshev's inequality applies to all distributions regardless of shape, but the empirical rule holds only for distributions that are bell shaped.

True, Chebyshev's inequality is less precise than the empirical rule, but will work for any distribution, while the empirical rule only works for bell-shaped distributions.
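
To see how much less precise Chebyshev's inequality is, the sketch below compares its guaranteed minimum, 1 - 1/k^2 of the observations within k standard deviations of the mean (for k > 1), with the approximate empirical-rule percentages for bell-shaped data. The function name `chebyshev_lower_bound` is an assumption made for this example.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is informative only for k > 1")
    return 1 - 1 / k**2

# Approximate empirical-rule proportions (bell-shaped distributions only).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(f"k={k}: Chebyshev at least {chebyshev_lower_bound(k):.3f}, "
          f"empirical rule about {EMPIRICAL_RULE[k]}")
```

For k = 2, Chebyshev guarantees at least 75% for any distribution, while the empirical rule gives about 95% when the distribution is bell shaped.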

What makes the range less desirable than the standard deviation as a measure of dispersion?

The range does not use all the observations.
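
A small Python illustration: the two made-up data sets below share the same minimum and maximum, so the range cannot tell them apart, while the standard deviation can.

```python
import statistics

# Two hypothetical data sets with the same smallest and largest values.
a = [0, 5, 5, 5, 10]
b = [0, 0, 5, 10, 10]

range_a = max(a) - min(a)
range_b = max(b) - min(b)

sd_a = statistics.stdev(a)  # sample standard deviation
sd_b = statistics.stdev(b)

print("ranges:", range_a, range_b)
print("standard deviations:", round(sd_a, 2), round(sd_b, 2))
```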

A frequency distribution lists the _______ of occurrences of each category of data, while a relative frequency distribution lists the _______ of occurrences of each category.

Number, proportion

In a relative frequency distribution, what should the relative frequencies add up to?

One
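
As a sketch, the frequency and relative frequency distributions for the five screen types in the widescreen TV card can be built with `collections.Counter`. The variable names are illustrative.

```python
from collections import Counter

# Screen types of setups A through E from the earlier card.
screen_types = ["Plasma", "Projection", "Projection", "Projection", "Plasma"]

frequency = Counter(screen_types)   # number of occurrences per category
n = len(screen_types)
relative_frequency = {cat: count / n for cat, count in frequency.items()}

print(dict(frequency))
print(relative_frequency)
print(sum(relative_frequency.values()))
```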

What is a bar graph?

Horizontal or vertical representation of the frequency or relative frequency of the categories. The height of each rectangle represents the category's frequency or relative frequency.

What is a pareto chart?

Bar graph whose bars are drawn in decreasing order of frequency or relative frequency.

The results of a survey were claimed to indicate that 7.7% of adults in this country who own a camera plan to replace it within the next year. Is this inferential or descriptive, and why?

Inferential, because the survey reports on a sample of the country's population, so the claim requires an inference from the sample to the population.

Why shouldn't classes overlap when summarizing continuous data in a frequency or relative frequency distribution?

Classes shouldn't overlap so there is no confusion as to which class an observation belongs.

_______ are the categories by which data are grouped.

Classes

The ____ is the smallest value within the class and the _______ is the largest value within the class.

Lower class limit, Upper class limit

The ______ is the difference between consecutive lower class limits.

Class width.
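
A common slip is to subtract a class's lower limit from its own upper limit. The sketch below uses hypothetical classes (20-29, 30-39, 40-49) to show the difference.

```python
# Hypothetical classes as (lower class limit, upper class limit) pairs.
classes = [(20, 29), (30, 39), (40, 49)]

lower_limits = [lower for lower, _ in classes]

# Class width: difference between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Upper limit minus lower limit of one class is NOT the class width.
within_class = classes[0][1] - classes[0][0]

print(class_width)   # 10
print(within_class)  # 9
```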

What does it mean if a statistic is resistant?

Extreme values (very large or small) relative to the data do not affect its value substantially
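
A minimal Python check with made-up data: replacing the largest value with an extreme one changes the mean substantially but leaves the median alone, so the median is resistant and the mean is not.

```python
import statistics

data = [1, 2, 3, 4, 5]            # hypothetical data
with_extreme = [1, 2, 3, 4, 500]  # same data with one extreme value

print(statistics.mean(data), statistics.median(data))                  # 3 3
print(statistics.mean(with_extreme), statistics.median(with_extreme))  # 102 3
```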

A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median and why?

The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.

True or False: A data set will always have exactly one mode.

False
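
For example, a data set can be bimodal. A short sketch using `statistics.multimode` (Python 3.8 or later) on made-up data:

```python
import statistics

bimodal = [1, 1, 2, 2, 3]    # hypothetical data with two modes
unimodal = [4, 4, 4, 5, 6]   # hypothetical data with one mode

modes_bimodal = statistics.multimode(bimodal)
modes_unimodal = statistics.multimode(unimodal)

print(modes_bimodal)   # [1, 2]
print(modes_unimodal)  # [4]
```

Note that when every value occurs once, `multimode` returns all the values, whereas the usual textbook convention is to say the data set has no mode.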

In a certain city, the average 20-29 year old man is 69.6 inches tall, with a standard deviation of 3.2 inches, while the average 20-29 year old woman is 64.3 inches tall, with a standard deviation of 3.9 inches. Who is relatively taller, a 85 inch man or

The z score for the man is larger, so he is relatively taller.
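
The comparison rests on z = (x - mean) / standard deviation. The sketch below computes the man's z-score from the numbers in the card. The woman's height is cut off in the card, so it is not filled in here; the helper `z_score` (a name chosen for this example) can be called with her height and the women's mean of 64.3 and standard deviation of 3.9.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Man: 85 inches, population mean 69.6, standard deviation 3.2 (from the card).
z_man = z_score(85, 69.6, 3.2)

print(round(z_man, 4))
```

Whoever has the larger z-score is relatively taller within their own group.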

When is the IQR preferred? And what is an advantage of the standard deviation?

IQR is preferred when the data are skewed or have outliers. An advantage of the standard deviation is that it uses all the observations in its computation.
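
A rough Python sketch of why the IQR is preferred with outliers, using made-up data. It relies on `statistics.quantiles`, whose default quartile method may differ slightly from the hand method taught in class, so treat the quartile values as approximate.

```python
import statistics

def iqr(values):
    """Interquartile range, Q3 - Q1, using statistics.quantiles' default method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # largest value replaced by an outlier

print("IQR:  ", iqr(data), iqr(with_outlier))
print("stdev:", round(statistics.stdev(data), 2),
      round(statistics.stdev(with_outlier), 2))
```

The IQR is unchanged by the outlier, while the standard deviation, which uses every observation, increases sharply.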

Mean

Center of gravity. Use when the data are quantitative and the frequency distribution is roughly SYMMETRIC.

Median

Divides the bottom 50% of the data from the top 50% of the data. Use when the data are quantitative and the frequency distribution is skewed LEFT or RIGHT.

Mode

Most frequent observation. Use when the most frequent observation is the desired measure of central tendency or the data are qualitative.