AP Stat Chapter 4

population

the entire group of individuals about which we want information

sample

part of the population from which we actually collect information. we use information from a sample to draw conclusions about the entire population.

the first step in planning a sample survey

to say what population we want to describe

second step in planning a sample survey

to say exactly what we want to measure

final step in planning a sample survey

to decide how to choose a sample from the population

convenience sample

choosing individuals who are easiest to reach; this often produces unrepresentative data

the design of a statistical study shows bias if

it systematically favors certain outcomes

voluntary response sample

consists of people who choose themselves by responding to a general appeal. voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.

simple random sample

consists of x individuals from the population chosen in such a way that every set of x individuals has an equal chance to be the sample actually selected
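
A minimal Python sketch of this definition, assuming the population is available as a list. The function name `simple_random_sample` and the student roster are made up for illustration. `random.sample` draws without replacement, so every set of x individuals is equally likely to be chosen.

```python
import random

def simple_random_sample(population, x, seed=None):
    """Return x distinct individuals; every set of x is equally likely."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    return rng.sample(population, x)   # sampling without replacement

# Hypothetical population of 30 students.
students = [f"student_{i:02d}" for i in range(1, 31)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```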

table of random digits

a long string of digits 0-9 with these properties:
-each entry in the table is equally likely to be any of the 10 digits 0-9
-the entries are independent of each other. that is, knowledge of one part of the table gives no information about any other part

how to choose an SRS using Table D

1. Label: give each member of the population a numerical label of the same length
2. Table: read consecutive groups of digits of the appropriate length from Table D
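
The two steps above can be sketched in Python. This is an illustration only: the digit string is generated by the computer as a stand-in for a line of Table D, and the function name is hypothetical. The sketch also assumes the usual convention of skipping groups that match no label and labels that were already chosen, which the steps above do not spell out.

```python
import random

def choose_srs_from_digits(digits, population_size, sample_size):
    """Read equal-length groups of digits; labels run from 1 to population_size."""
    width = len(str(population_size))   # every label has the same length
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Assumed convention: ignore unused labels and repeats.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Stand-in for a line of random digits (not the real Table D).
rng = random.Random(4)
digit_line = "".join(str(rng.randrange(10)) for _ in range(400))
labels = choose_srs_from_digits(digit_line, population_size=30, sample_size=5)
print(labels)
```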

to select a stratified random sample

first classify the population into groups of similar individuals, called strata. then choose a separate SRS in each stratum and combine these SRSs to form the full sample
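
A short Python sketch of the procedure, assuming each individual can be classified by a function (`stratum_of`, a hypothetical name). The same number is drawn from every stratum here for simplicity; that choice is an assumption of the example, not a requirement of the method.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Take a separate SRS within each stratum and combine the results."""
    rng = random.Random(seed)
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))  # SRS in this stratum
    return sample

# Hypothetical population: 40 freshmen and 25 seniors.
population = [("freshman", i) for i in range(40)] + [("senior", i) for i in range(25)]
sample = stratified_sample(population, stratum_of=lambda p: p[0], per_stratum=4, seed=2)
print(sample)
```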

if the individuals in each stratum are less varied than the population as a whole

a stratified random sample can produce better information about the population than an SRS of the same size

to take a cluster sample

first divide the population into smaller groups. ideally, these clusters should mirror the characteristics of the population. then choose an SRS of the clusters. all individuals in the chosen clusters are included in the sample.
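
For contrast with a stratified sample, a Python sketch of a cluster sample: the SRS is taken of the clusters, not of individuals. The homerooms and the function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters and keep every individual in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # SRS of cluster names
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical school: 10 homerooms of 20 students each.
clusters = {f"homeroom_{k}": [f"h{k}_s{i}" for i in range(20)] for k in range(1, 11)}
chosen, sample = cluster_sample(clusters, 3, seed=3)
print(chosen, len(sample))
```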

inference

the process of drawing conclusions about a population on the basis of sample data

the first reason to rely on random sampling

to eliminate bias in selecting samples from the list of available individuals

the second reason to use random sampling

the laws of probability allow trustworthy inference about the population

margin of error

tells us how much sampling variability to expect

sampling errors

mistakes made in the process of taking a sample that could lead to inaccurate information about the population

sampling frame

list of individuals from which we will draw our sample. ideally, it should list every individual in the population

undercoverage

occurs when some groups in the population are left out of the process of choosing a sample
ex: a sample survey of households will miss homeless people, prison inmates, and college students

nonresponse

occurs when an individual chosen for the sample can't be contacted or refuses to participate

response bias

systematic pattern of incorrect responses in a sample survey
ex: Calvin says that he spends $500 a week on bubble gum

most important influence on the answers given to a sample survey

wording of questions

observational study

observes individuals and measures variables of interest but does not attempt to influence the responses

experiment

deliberately imposes some treatment on individuals to measure their responses

goal of an observational study

to describe a group or situation, to compare groups, or to examine relationships between variables

purpose of an experiment

to determine whether the treatment causes a change in the response

when our goal is to understand cause and effect, __________ are the only source of fully convincing data

experiments

lurking variable

variable that is not among the explanatory or response variables in a study but may influence the response variable
ex: in car extending life example, amount of money is a lurking variable

confounding

occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other

with no association between the lurking variable and the explanatory variable, there can/can't be confounding?

can't be

observational studies of the effect of one variable on another often fail because of...

confounding between the explanatory variable and one or more lurking variables

treatment

a specific condition applied to the individuals in an experiment. if an experiment has several explanatory variables, a treatment is a combination of specific values of these variables.

experimental units

smallest collection of individuals to which treatments are applied

subjects

experimental units that are human beings

factors

another name for explanatory variables

multifactor experiment

each treatment is formed by combining a specific value (often called a level) of each of the factors
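
A small Python sketch of how treatments arise from factors. The factor names and levels are invented for illustration. `itertools.product` pairs every level of one factor with every level of the other, so 3 levels times 2 levels gives 6 treatments.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "dose_mg": [0, 50, 100],
    "schedule": ["morning", "evening"],
}

# One treatment per combination of levels.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for t in treatments:
    print(t)
print(len(treatments))  # 3 levels x 2 levels = 6 treatments
```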

experimental units--->treatment--->measure response

design of many laboratory experiments

badly designed experiments often yield worthless results because of

confounding

if treatments are given to groups that differ greatly when the experiment begins, ____ will result

bias

random assignment

experimental units are assigned to treatments at random, that is, using some sort of chance process

comparative design

compares two or more treatments

completely randomized design

treatments are assigned to all the experimental units completely by chance
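
A Python sketch of a completely randomized design, assuming equal group sizes are wanted (an assumption of this example). The function name and the plants are hypothetical. All units are shuffled together and then dealt out to the treatments in turn.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                      # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 12 plants, two treatments.
units = [f"plant_{i}" for i in range(12)]
groups = completely_randomized(units, ["fertilizer", "control"], seed=5)
print(groups)
```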

primary purpose of a control group

to provide a baseline for comparing the effects of the other treatments

when can you not use a control group?

if you simply want to compare the effects of several treatments and not to determine whether any of them works better than an inactive treatment

principles of experimental design

1. CONTROL for lurking variables that might affect the response: use a comparative design and ensure that the only systematic difference between the groups is the treatment administered
2. RANDOM ASSIGNMENT: use impersonal chance to assign experimental units to treatments

placebo effect

response to a dummy treatment

double-blind experiment

neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received

single-blind

either the subjects are unaware of which treatment they are receiving, or the people interacting with them and measuring the response variable do not know, but not both

statistically significant

an observed effect so large that it would rarely occur by chance
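
One way to see what "rarely occur by chance" means is to redo the random assignment many times and count how often a difference at least as large as the observed one shows up. The Python sketch below does this with invented response values. The function name, the data, and the 2000 trials are all assumptions of the example, and this simulation is only one of several ways to judge significance.

```python
import random

def chance_proportion(group_a, group_b, trials=2000, seed=None):
    """Share of random re-assignments with a mean difference at least as large as observed."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)                    # pretend the treatment had no effect
        new_a = pooled[:len(group_a)]
        new_b = pooled[len(group_a):]
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / trials

# Hypothetical responses for two groups of six units.
treated = [12, 15, 14, 16, 13, 17]
control = [9, 10, 8, 11, 10, 9]
p = chance_proportion(treated, control, seed=8)
print(p)
```

A small proportion means the observed difference would rarely arise from the chance assignment alone.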

a statistically significant association in data from a well-designed experiment does/does not imply causation

does

block

group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments; blocking is a form of control

randomized block design

the random assignment of experimental units to treatments is carried out separately within each block
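
A Python sketch of a randomized block design. The blocking variable (sex), the treatment names, and the function name are hypothetical. The key point is that the shuffle happens separately inside each block, so each block is split across the treatments.

```python
import random

def randomized_block(units, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for name in sorted(blocks):
        members = blocks[name][:]
        rng.shuffle(members)                   # randomize within this block only
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: 6 women and 6 men, blocked by sex.
units = [("female", i) for i in range(6)] + [("male", i) for i in range(6)]
assignment = randomized_block(units, lambda u: u[0], ["new_drug", "placebo"], seed=6)
print(assignment)
```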

matched pairs design

create blocks by matching pairs of similar experimental units and then use chance to decide which member of a pair gets which treatment
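
A Python sketch of the chance step in a matched pairs design, assuming the pairs have already been formed. The twins, the treatment labels "A" and "B", and the function name are made up for illustration.

```python
import random

def matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, use chance to decide who gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)                     # the "coin flip" for this pair
        assignment[first], assignment[second] = order
    return assignment

# Hypothetical matched pairs of similar units.
pairs = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
assignment = matched_pairs(pairs, seed=7)
print(assignment)
```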

in a matched pairs design when each pair consists of one individual being treated twice, the order of treatments can/can't influence the response

can

individuals were randomly selected and assigned to groups

inference about cause and effect and the population

individuals were randomly selected but not randomly assigned to groups

inference about the population but not cause and effect

individuals were not randomly selected but randomly assigned to groups

inference about cause and effect but not the population

individuals were not randomly selected or assigned to groups

no inferences about population or cause and effect

lack of realism

limits our ability to apply the conclusions of an experiment to the settings of greatest interest

what are the criteria for establishing causation when we can't do an experiment?

1. the association is strong
2. the association is consistent. many studies of different kinds of people in many countries link smoking to lung cancer. that reduces the chance that a lurking variable specific to one group explains the association.
3. larg