Exam 1 Intro to Statistics

Suppose every student in a class is surveyed and it is found that 75% of the class plans to take another math class. It is then reported that 75% of all students at the school plan to take another math class. Is this an example of descriptive or inferential st

A. Inferential statistics: the results of the class sample are described without making any generalizations about the population of all students.
B. Inferential statistics: the results of the class sample are extended to make a generalization about the po

The variable "eye color" is an example of what type of variable?

Categorical

The process of representing categorical variables with numbers (such as letting 1 represent "smoker" and 0 represent "non-smoker") is called what?

Coding

A stats class is made up of 12 men and 25 women. What percentage of the class is male?

32.4%
(12 + 25 = 37; 12/37 = 0.324 = 32.4%)

A class has 271 students and 46.5% of them are men. How many men are in the class?

126 men
(46.5/100 = 0.465; 271 * 0.465 = 126.015, which rounds to 126)

A class is made up of 54% women and has 15 women in it. What is the total number of students in the class?

28 students
(54/100 = 0.54; 15/0.54 = 27.8, which rounds to 28)
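
The three percentage problems above all rearrange the same relationship: part = (percent / 100) * total. A minimal Python sketch of that arithmetic follows; the function names are made up for illustration and are not from the course material.

```python
def percent_of_total(part, total):
    """Return `part` as a percentage of `total`."""
    return 100 * part / total


def count_from_percent(percent, total):
    """Return the count that makes up `percent` percent of `total`."""
    return percent / 100 * total


def total_from_percent(count, percent):
    """Return the total when `count` items make up `percent` percent of it."""
    return count / (percent / 100)


# The three flashcard problems, in order.
pct_male = percent_of_total(12, 12 + 25)   # 12 men out of 37 students
men = count_from_percent(46.5, 271)        # 46.5% of 271 students
class_size = total_from_percent(15, 54)    # 15 women are 54% of the class

# Counts of people are rounded to whole numbers.
print(round(pct_male, 1), round(men), round(class_size))
```

The rounding happens only at the end, so the intermediate values keep their full precision.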

Two sections of statistics are offered, the first at 8am and the second at 10am. The 8am section has 25 women and the 10am section has 15 women. A student claims this is evidence that women prefer earlier statistics classes than men do. What information i

A. The professor may be female in one class and male in the other, which could affect the female students' class preference.
B. The age and class standing of the students is unknown. Older female students may prefer late classes, so it may be only you

Why are percentages or rates often better than counts for making comparisons?

A. They are more accurate than counts
B. They take into account possible differences among the sizes of the groups
C. They are more statistically significant than counts
D. Percentages allow us to compare groups that are not similar.
(B)

A student watched picnickers with a large cooler of soft drinks to see whether teenagers were less likely than adults to choose diet soft drinks over regular soft drinks.
Is this an observational study or a controlled experiment?

An observational study

A group of boys is randomly divided into two groups. One group watches violent cartoons for one hour, and the other group watches cartoons without violence for one hour. The boys are then observed to see how many violent actions they take in the next two

A controlled experiment

The outcome variable in a question about causality is also referred to as what?

The response variable

Of the following, which is the only method of data collection suitable for making conclusions about causal relationships?

A. Observational Studies
B. Controlled experiments
C. Anecdotes
D. All three are suitable
(B)

What is an identifying mark of an observational study?

Subjects in the study are put into the treatment group or the control group either by their own actions or by the decision of someone not involved in the research study.

In an observational study, what do we call a difference between the two groups that could explain why their outcomes were very different?

A confounding variable

A ________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.

Stratified sample
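
As a rough illustration of stratified sampling, the sketch below groups a population by stratum and then draws a simple random sample inside each group. The function name `stratified_sample`, the student list, and the class-standing strata are all hypothetical, and taking the same number from every stratum is just one possible allocation.

```python
import random


def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Randomly pick `per_stratum` individuals from each stratum."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    strata = {}
    for individual in population:
        # Group individuals by the stratum they belong to.
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for members in strata.values():
        # Simple random sample (without replacement) inside each stratum.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical population: 60 freshmen and 40 seniors.
students = [(f"s{i}", "freshman" if i < 60 else "senior") for i in range(100)]
picked = stratified_sample(students, lambda s: s[1], per_stratum=5)
print(len(picked))
```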

A simple random sample is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size. True or False?

False, because other sampling techniques may provide more information for less cost than a simple random sample.

A frequency distribution lists the ________ of each category of data, while a relative frequency distribution lists the ________ of occurrences of each category of data.

Number; proportion

After constructing any relative frequency distribution, what should be the sum of the relative frequencies?

1 or 100%
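
A small Python sketch can make the two kinds of distribution concrete. The eye-color values are invented for the example; the frequency distribution counts each category, and the relative frequency distribution divides each count by the number of observations, so the relative frequencies add up to 1.

```python
from collections import Counter

# Hypothetical categorical data.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "hazel", "brown"]

# Frequency distribution: number of occurrences of each category.
frequency = Counter(eye_colors)

# Relative frequency distribution: proportion of occurrences of each category.
n = len(eye_colors)
relative_frequency = {color: count / n for color, count in frequency.items()}

total = sum(relative_frequency.values())
for color in frequency:
    print(f"{color:<6} {frequency[color]:>2} {relative_frequency[color]:.3f}")
print("sum of relative frequencies:", total)
```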

A distribution of a variable in which most of the values are relatively small but that also has a few very large values is called what?

Right-skewed

What kind of distribution has mostly relatively small values but also a few very large values? (This makes the graph of the distribution appear to have a tail that extends to the right.)

Right-skewed distribution

The existence of multiple mounds in a distribution is sometimes a sign of which of the following?

A. The graph of the distribution was drawn incorrectly
B. All the values in the data are centered around one typical value
C. Two very different groups have been combined into a single collection
D. The data is not from a random sample
(C)

Values so large or so small that they do not fit into the pattern of the distribution are called what?

Outliers

What are two commonly used graphs to display the distribution of a sample of categorical data?

Bar graph and pie chart

What are Pareto charts?

They are bar charts that are sorted from most frequent to least frequent
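
The ordering behind a Pareto chart can be sketched in a few lines of Python. The complaint categories below are invented, and the text bars stand in for a real plot.

```python
from collections import Counter

# Hypothetical categorical data: customer complaint types.
complaints = (
    ["late"] * 4 + ["damaged"] * 2 + ["wrong item"] * 3 + ["rude"] * 1
)

# most_common() returns (category, count) pairs from most to least frequent,
# which is the bar order a Pareto chart uses.
pareto_order = Counter(complaints).most_common()

for category, count in pareto_order:
    print(f"{category:<10} {'#' * count}")
```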

Which of the following is not a difference between bar charts and histograms?

A. In a bar chart, sometimes the order in which you place the bars doesn't matter
B. A bar chart is used for numerical variables while a histogram is used for categorical variables
C. In a bar chart, it doesn't matter how wide or narrow the bars are.
D. I

True or False?
A bar chart is used for numerical variables while a histogram is used for categorical variables.

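False. A bar chart is used for categorical variables, while a histogram is used for numerical variables.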

When describing the distribution of a categorical variable, the category that appears most often is called the _____?

Mode

A categorical variable is only called bimodal under what circumstances?

A. Two categories are nearly tied for most frequent outcomes
B. One category occurs more frequently than any other
C. The data consists of exactly two categories
D. Two categories have exactly the same frequency
(A)

What is the most common trick to mislead readers of bar graphs?

A. Change the scale of the horizontal axis so that it does not start at 0.
B. Change the scale of the vertical axis so that it does not start at 0.
C. Change the width of the bars to make some categories appear more important than others
D. Change the col

The mean represents the typical value in a set of data for what type of distribution?

A. For all distributions
B. For distributions that are roughly symmetric
C. For distributions that are bimodal
D. For distributions that are skewed
(B)

Statistics

The science of collecting, organizing, summarizing, and analyzing data to answer questions and/or draw conclusions.

Population

The collection of all data values that will ever occur for a group.

Sample

A subset of the population; it represents the population at large, and its data are easier to obtain.

Categorical

Describes a quality or class; the values can be numbers, but arithmetic on them is not meaningful

Numerical

Describes a quantity or measurement

Coded data

Using numbers to record categorical data

Give an example of using coded data.

0= No, 1=Yes

Stacked data

Each row contains data for a single individual

Unstacked data

Each column is a variable from a different group; can only store data for two variables.
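
To show how the two layouts relate, here is a minimal sketch that converts stacked data into unstacked form and back. The groups and values are invented; each stacked row is one individual's (group, value) pair, and each unstacked column holds the values for one group.

```python
# Stacked: one row per individual, recording (group, value).
stacked = [("men", 72), ("women", 65), ("men", 80), ("women", 70), ("men", 68)]

# Unstacked: one column (list) of values per group.
unstacked = {}
for group, value in stacked:
    unstacked.setdefault(group, []).append(value)

print(unstacked)

# Going back: emit one (group, value) row per individual.
restacked = [(group, value) for group, values in unstacked.items() for value in values]
```

The round trip keeps every observation, although the original row order is not preserved.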

Two-Way Tables

Displays results from two potentially related variables; shows combinations of the various variables
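
A two-way table can be built by counting each combination of the two variables. The sketch below uses made-up treatment and outcome data purely for illustration.

```python
from collections import Counter

# Hypothetical data: one (group, outcome) pair per subject.
subjects = (
    [("treatment", "improved")] * 6
    + [("treatment", "not improved")] * 4
    + [("control", "improved")] * 3
    + [("control", "not improved")] * 7
)

# Count how often each combination of the two variables occurs.
table = Counter(subjects)

rows = ["treatment", "control"]
cols = ["improved", "not improved"]

# Print the table with one row per group and one column per outcome.
print(f"{'':<10}" + "".join(f"{c:>14}" for c in cols))
for r in rows:
    print(f"{r:<10}" + "".join(f"{table[(r, c)]:>14}" for c in cols))
```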

Treatment variable

Whether or not a specific treatment is used

Outcome (response) variable

Whether or not a certain outcome is seen

Treatment vs. outcome variable

A question about causality asks whether the treatment variable causes a change in the outcome variable

Treatment group

Group that receives the treatment (or has the characteristic of interest)

Control group

Group that does not receive the treatment (or does not have the characteristic of interest)

Placebo Effect

Reacting as if to a treatment after being told you are receiving the treatment when you actually are not

Controlled Experiment

A study in which researchers assign subjects to the treatment group or the control group; this design can show causality

What are the four key features to Controlled Experiments?

...

Confounding Variables

A variable that has not been accounted for but which is causing a difference in the groups being studied

What are the three main components of a Numerical Distribution?

1. Shape
2. Center
3. Variability

Describe histograms (3 things)

1. Group the data into intervals, called bins
2. Count how many data values fall into each bin
3. Draw a rectangle for each bin, such that
> rectangles for consecutive bins touch
> the first value in each bin is recorded on the horizontal axis
> the height of each rectangle corresponds to the count
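
The binning and counting steps can be sketched as follows. The scores, the bin width of 10, and the choice that each bin includes its left edge but not its right edge are assumptions made for the example; the text bars stand in for the rectangles.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall into each bin [left, left + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)  # which bin this value belongs to
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical exam scores.
scores = [52, 55, 61, 64, 67, 70, 71, 73, 78, 85, 88, 93]
counts = histogram_counts(scores, start=50, width=10, n_bins=5)

# The label is the first value in each bin; bar length matches the count.
for i, count in enumerate(counts):
    left = 50 + 10 * i
    print(f"{left:>3} {'#' * count}")
```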

According to the Empirical Rule ______ will be within two standard deviations of the mean.

approximately 95% of the observations

The Empirical Rule applies to distributions that are

Symmetric and unimodal
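
One way to see the Empirical Rule in action is to simulate symmetric, unimodal data and count the share of values within two standard deviations of the mean. This is only a sketch: the normal model, the mean of 170, the standard deviation of 8, and the seed are arbitrary choices, and the result should land near 95% rather than exactly on it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated symmetric, unimodal data (hypothetical heights in cm).
heights = [rng.gauss(170, 8) for _ in range(10_000)]

mean = statistics.mean(heights)
sd = statistics.pstdev(heights)

# Proportion of observations within two standard deviations of the mean.
within_two_sd = sum(abs(x - mean) <= 2 * sd for x in heights) / len(heights)
print(f"{within_two_sd:.1%} of values lie within two standard deviations")
```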

If an observation has a z-score of 0, this means which of the following?

A. The z-score was computed incorrectly
B. The observation is equal to the standard deviation
C. The observation is equal to the median
D. The observation is equal to the mean
(D)
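
A z-score measures how many standard deviations an observation lies from the mean: z = (observation - mean) / standard deviation. The sketch below uses a small invented data set and the population standard deviation (a course may use the sample standard deviation instead) to show that an observation equal to the mean has a z-score of 0.

```python
import statistics

# Hypothetical data set with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)
sd = statistics.pstdev(data)  # assumption: population SD, not sample SD


def z_score(x):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd


print(z_score(mean))  # an observation equal to the mean
print(z_score(9))     # an observation two standard deviations above the mean
```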

Name two measures of the center of a distribution, and state the conditions under which each is preferred for describing the typical value of a single data set.

Median and mean

Under what conditions is the median preferred?

A. The median is preferred when there are many data points
B. The median is preferred when the data is strongly skewed or has outliers
C. The median is preferred when the data is relatively symmetric
D. The median is preferred when there are few data poin

Under what conditions is the mean preferred?

A. The mean is preferred when there are many data points
B. The mean is preferred when there are few data points
C. The mean is preferred when the data is relatively symmetric
D. The mean is preferred when the data is strongly skewed or has outliers
(C)

When a distribution is skewed, the _____ is used to measure the center and the _____ is used to measure variation.

Median, Interquartile range
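
The sketch below shows why the median and interquartile range suit skewed data. The incomes are invented, with one very large value creating a right skew; note that `statistics.quantiles` uses its default "exclusive" method here, and other quartile conventions give slightly different values.

```python
import statistics

# Hypothetical right-skewed data (incomes in thousands); 400 is an outlier.
incomes = [28, 31, 33, 35, 36, 38, 40, 42, 45, 400]

mean = statistics.mean(incomes)      # pulled upward by the outlier
median = statistics.median(incomes)  # stays near the bulk of the data

q1, q2, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1                        # spread of the middle half of the data
sd = statistics.stdev(incomes)       # inflated by the outlier

print("mean:", mean, "median:", median)
print("IQR:", iqr, "standard deviation:", round(sd, 1))
```

The single outlier moves the mean and standard deviation a long way, while the median and IQR barely notice it.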