Political Science 3


Measures the effect of some variable (what we've been calling the independent variable, X) on another (what we've been calling the dependent variable, Y). We are now just changing up the vocabulary a bit: in terms of our causal inference framework, we wil

Causal effect

Holding all other explanatory factors identical, a causal effect measures the effect of some "treatment" (i.e., it compares the outcome for units that receive the treatment with the outcome for units that do not).

Fundamental problem of causal inference

We can't observe both worlds despite the fact that our definition of the causal effect requires it. We observe either the outcome we are interested in or the counterfactual, but never both at the same time. It is impossible to observe both the effect of
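
A minimal sketch of the problem, using made-up potential outcomes. The numbers and the names y_treated, y_control, and treated are illustrative assumptions, not data from any study. Each unit has two potential outcomes, but only the one that matches its actual treatment status is ever observed:

```python
# Hypothetical potential outcomes for four units (illustrative values only).
units = [
    {"y_treated": 7, "y_control": 5, "treated": True},
    {"y_treated": 6, "y_control": 6, "treated": False},
    {"y_treated": 9, "y_control": 4, "treated": True},
    {"y_treated": 5, "y_control": 3, "treated": False},
]

# Unit-level causal effects: computable here only because we invented both worlds.
true_effects = [u["y_treated"] - u["y_control"] for u in units]

# What a researcher actually sees: one outcome per unit, never its counterfactual.
observed = [u["y_treated"] if u["treated"] else u["y_control"] for u in units]

print(true_effects)  # [2, 0, 5, 2]
print(observed)      # [7, 6, 9, 3]
```

In real data only the observed list exists, so the unit-level effects cannot be computed directly.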

Four ways of thinking about potential causation

1) Regularity: is X correlated with Y? This approach examines the correlation between supposed causes and effects. Correlation does not imply causation; however, the absence of correlation is stronger evidence that there is no causation.
2) Counterfactual: is not X co

Four hurdles to the causal relationship

1) Credible mechanism: what links X and Y? WE WANT THIS!
2) Covariation: is there a relationship between X and Y? WE WANT THIS!
3) Reverse causation: is it possible that instead Y causes X? WE DON'T WANT THIS!
4) Confounding variables (Z): does Z cause X

Research design

These are the strategies scientists use to test whether X does indeed cause Y, that is, to overcome all four of the causal hurdles.

What are the two approaches to designing research?

1) Experimental design
2) Observational study


What is an experiment?

A research design in which the researcher both controls and randomly assigns values of the independent variable to the subjects. CONTROL and RANDOM ASSIGNMENT form a necessary and sufficient definition of an experiment.

What does it mean to say that a researcher "controls" the value of the independent variable that the subjects receive?

The values of the independent variable that the subjects receive are NOT determined by the subjects themselves, or by nature. Letting subjects or nature determine them would open the door to confounding variables, so the researcher must set the values.

What does it mean to say that a researcher "randomly assigns" the value of the independent variable that the subjects receive?

The researcher must not only control the values of the independent variable, but also randomly assign those values to subjects by using a mechanism that ensures the subjects are divided into a treatment group and a control group, e.g. tossing a coin or d
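
One possible way to carry out random assignment in code, as a rough sketch. The function name randomly_assign, the subject labels, and the seed are all hypothetical choices made for illustration; shuffling and splitting is just one chance mechanism among many:

```python
import random


def randomly_assign(subjects, seed=None):
    """Split subjects into treatment and control groups by chance alone."""
    rng = random.Random(seed)   # seeded only so the example is repeatable
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject pool.
subjects = [f"subject_{i}" for i in range(10)]
treatment, control = randomly_assign(subjects, seed=42)

print(len(treatment), len(control))  # 5 5
```

Nothing about a subject (and no choice made by the subject) influences which group it lands in.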

What is the benefit of randomization?

Random assignment ensures that the comparison we make between the treatment group and the control group is as pure as possible, and that some other cause (Z) of the dependent variable (Y) will not pollute that comparison. By taking a group of subjects and
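
A small simulation can illustrate why this works. It is only a sketch under stated assumptions: Z is a simulated confounder drawn from a normal distribution, and the self-selection rule (subjects with higher Z are more likely to opt in) is invented for contrast:

```python
import random
from statistics import mean

rng = random.Random(0)
n = 10_000

# Z is a hypothetical confounder (for example, prior political interest).
z = [rng.gauss(0, 1) for _ in range(n)]

# Random assignment: a coin flip that ignores Z entirely.
random_treat = [rng.random() < 0.5 for _ in range(n)]

# Self-selection: subjects with higher Z are more likely to opt in.
self_treat = [zi + rng.gauss(0, 1) > 0 for zi in z]


def z_gap(treated):
    """Difference in mean Z between the treatment and control groups."""
    in_treatment = [zi for zi, t in zip(z, treated) if t]
    in_control = [zi for zi, t in zip(z, treated) if not t]
    return mean(in_treatment) - mean(in_control)


gap_random = z_gap(random_treat)
gap_selected = z_gap(self_treat)
print(round(gap_random, 3), round(gap_selected, 3))
```

Under random assignment the two groups should have nearly the same average Z, so Z cannot explain a difference in Y between them. Under self-selection the groups differ sharply on Z before any treatment is delivered.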

Why are experiments said to have high internal validity?

Because they deal with confounding variables, the inferences we make about whether X causes Y or not are likely to be correct. Randomization allows us to claim that differences between the control and treatment groups are due to the treatment itself and n

What are internal and external validity?

Internal validity essentially addresses the question of whether we have unbiased estimates. That is, are there confounders we must worry about? Do we have reverse causation? External validity essentially addresses the question of whether our estimates app

Which types of research designs tend to have higher internal/external validity?

Experiments have higher internal validity than observational studies; laboratory experiments have higher internal validity than field experiments. Observational studies and field experiments tend to have higher external validity than laboratory experiment

What are the four types of experiments?

1) Laboratory experiment: where subjects are recruited to a common location, where the experiment is conducted, and the researcher controls the location's environment except for subjects' behavior. Example: testing different voting rules to see their effe

What is a natural experiment?

Where nature acts in a way that approximates the way a researcher would have intervened.

What is a "selection effect"?

The researcher has no control over who receives the treatment, because subjects self-select into the treatment and control groups.

What are the four drawbacks to experiments?

1) Not all X's are subject to EXPERIMENTAL MANIPULATION, i.e. a researcher cannot "assign" subjects such values as gender, wealth, age, political regime, etc. 2) Also, experiments do not require a random sample of the target population, but a sample of co

What is an observational study?

A research design in which the researcher does not have control over values of the independent variable, which occur naturally. There is no random assignment of the treatment. We worry a lot about selection bias here, but if we can show that this isn't an

Can we demonstrate causality without experiments?

No. As a rule, it is more difficult to draw valid causal inferences from observational studies than experiments. Because experiments control for confounding variables, they overcome the fourth causal hurdle and allow researchers to prove causation. Observ

Unit of analysis

The type of phenomenon that constitutes a case in a given research context, e.g. countries, parties, individuals (basically whoever or whatever is being studied).


Population

All the cases that an inference is said to apply to


Sample

The case(s) chosen for the study, referred to collectively. Because we want to apply results to the broader population, samples of convenience are problematic (they are not random samples).


An element of a case


Any observation that is intended to provide independent evidence of a proposition

N or n

The total number of observations in a given context (usually comprising the sample)

What are the two types of observational studies?

1) Cross-sectional: variation through space at one point in time (time stays constant, unit varies)
2) Time-series: variation through time in one spatial unit (unit stays constant, time varies)
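
The two designs can be pictured as two different slices of the same table. The sketch below uses placeholder (unit, year, value) records; the country names, years, and values are invented for illustration:

```python
# Hypothetical (unit, year, value) records; all values are placeholders.
records = [
    ("Country A", 2000, 1.0), ("Country A", 2010, 1.5), ("Country A", 2020, 2.0),
    ("Country B", 2000, 3.0), ("Country B", 2010, 3.5), ("Country B", 2020, 4.0),
]

# Cross-sectional: hold time constant, let the unit vary.
cross_section = [(unit, value) for unit, year, value in records if year == 2010]

# Time-series: hold the unit constant, let time vary.
time_series = [(year, value) for unit, year, value in records if unit == "Country A"]

print(cross_section)  # [('Country A', 1.5), ('Country B', 3.5)]
print(time_series)    # [(2000, 1.0), (2010, 1.5), (2020, 2.0)]
```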

What is the problem of measurement?

We need to be confident that the concepts in our theory correspond as closely as possible to our empirical observations. If we want to do a good job evaluating whether X causes Y, then we need to do a precise job measuring both X and Y. If we are sloppy i

Why do we have to rely on potentially imperfect measures of the concepts we care about?

The relationship that we care about most (X and Y) is one we cannot directly observe. We have to operationalize these variables.

What are the four issues of measurement?

1) Conceptual clarity: what is the exact nature of the concept we're trying to measure?
2) Reliability: does repeating the measurement yield the same result? An operational measure of a concept is said to be reliable to the extent that it is repeatable or

Differentiate between face validity and content validity.

Face validity: does the measure seem at first blush to get at the concept?
Content validity: does the measure reflect all the essential elements of the concept?

What are the two questions we need to ask about data once measurements have been conducted?

1) What do "typical" values for a variable look like?
2) How tightly clustered (or widely dispersed) are these values?

What is a variable's measurement metric?

The type of value that the variable takes on. We can think of each variable in terms of its label and its values.

What are a variable's label and values?

The label is the description of the variable, such as gender of survey respondent. The values are the denominations in which the variable occurs, such as male or female.

What are the three types of variables?

1) Categorical, e.g. religion
2) Ordinal, e.g. survey point scales (1=disagree, 2=neutral, 3=agree)
3) Continuous, e.g. income

Explain categorical variables.

Categorical variables are variables for which cases have values that are either different from or the same as the values for other cases, but whose values cannot be naturally rank-ordered from least to greatest. Example: religious identification: values could i

Explain ordinal variables.

Ordinal variables are also variables for which cases have values that are either different or the same as the values for other cases. But the distinction between ordinal and categorical variables is that we can rank-order across the values for ordinal var

Explain continuous variables.

An important characteristic that ordinal variables do not have is equal unit differences. The metric in which we measure a variable has equal unit differences if a one-unit increase in the value of that variable indicates the same amount of change across

What are equal unit differences?

A one-unit increase in the value of the variable always means the same thing. Continuous variables have equal unit differences, but ordinal variables do not.

What are the two types of descriptive statistics that are most relevant in the social sciences?

1) Measures of central tendency tell us about typical values for a particular variable.
2) Measures of variation (or dispersion) tell us about the distribution (or spread, or range) of values that a variable takes across the cases for which we measure it.
So diff

What are the three measures of central tendency?

Mode, median, mean

Define each.

Mode: most frequently occurring value
Median: value that falls in center after all values are ranked from smallest to largest (when we have an even number, we average the value of the two center-most ranked cases)
Mean: the arithmetic average (the sum of all values divided by the number of cases)
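
A quick illustration using Python's standard statistics module. The data values are arbitrary and chosen only to make the three measures differ:

```python
from statistics import mean, median, mode

values = [1, 2, 2, 3, 10]       # arbitrary illustrative data
print(mode(values))             # 2   (occurs most often)
print(median(values))           # 2   (middle value once sorted)
print(mean(values))             # 3.6 (18 divided by 5)

even_values = [1, 2, 3, 4]      # even number of cases
print(median(even_values))      # 2.5 (average of the two middle values)
```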

What is the measure of variation/dispersion?

Standard deviation

What is the standard deviation?

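The most commonly used measure of variation. Roughly, it is the typical distance between the values of Y and the mean of Y. Formally, it is the square root of the average squared difference between each value of Y and the mean of Y (the plain average of the differences is always zero, which is why they are squared first).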
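
A worked example, assuming the population form of the standard deviation (dividing by n). Many texts divide by n - 1 instead when working with a sample, which is what statistics.stdev does. The data values are arbitrary:

```python
from math import sqrt
from statistics import mean, pstdev

y = [2, 4, 4, 4, 5, 5, 7, 9]                 # arbitrary illustrative data
y_bar = mean(y)                              # 5
deviations = [value - y_bar for value in y]

# The raw deviations cancel out, so their plain average is uninformative.
print(sum(deviations))                       # 0

# Square the deviations, average them, then take the square root.
sd = sqrt(mean([d ** 2 for d in deviations]))
print(sd)                                    # 2.0

# The standard library's population standard deviation gives the same value.
sd_library = pstdev(y)
```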

Which measures of central tendency can we use for continuous variables?

Mode, median, mean

Which measures of central tendency can we use for ordinal variables?

Mode, median

Which measures of central tendency can we use for categorical variables?

Mode

What are outliers?

Cases for which the value of the variable is extremely high or low relative to the rest of the values for that variable.

Which types of distributions are the median and mean appropriate for?

The median is appropriate for skewed distributions with outliers. The mean is appropriate for symmetric distributions without outliers.
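
A short sketch of why this matters, using invented income figures (in thousands). A single extreme case pulls the mean far more than the median:

```python
from statistics import mean, median

incomes = [30, 35, 40, 45, 50]      # hypothetical, roughly symmetric
with_outlier = incomes + [1000]     # same data plus one extreme case

print(mean(incomes), median(incomes))            # 40 40
print(mean(with_outlier), median(with_outlier))  # 200 42.5
```

With the outlier included, the mean no longer describes a typical case, while the median barely moves.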

What are the advantages and drawbacks to experiments?


What are the advantages and drawbacks to observational studies?