PSYC 325 Research Test 1

Scientific inductivism

one observes nature, proposes a law to generalise an observed
pattern, confirms it by many observations, while discarding
disconfirmed laws

According to scientific inductivism...

science is based on observation, and the acceptance and rejection of
possible hypotheses based on these observations

Inductive reasoning makes

broad generalisations from specific observations ie. there are data,
then conclusions are drawn from the data

Example of inductivism

Swans I've seen are white, so I draw the conclusion that all swans
are white (jump in logic)

Falsificationism is based on...

deduction: you are trying to reject something


we cannot confirm hypotheses, only falsify them

Deductive inference

we hold a theory and, based on it, make a prediction of its
consequences (we predict what the observations should be if the theory
were correct). From general (theory) to specific (the observations)

Inductive inference

from specific (data) to general (theory)

Deductive hypothesis testing

Begin with a hypothesis (eg. all swans are white), and then collect
data. Data can falsify the hypothesis (eg. you run across a black swan).

Hypothesis testing t-test

P-value is below .05
We "reject the null hypothesis that there is no difference
between the two groups"
Falsificationism: we have "failed to falsify our
hypothesis" ie. "we have found support for our hypothesis"

Null hypothesis:

no difference between groups
if our hypothesis is supported: we want to reject this

P-value is above .05

We "fail to reject the null hypothesis"
The result isn't significant, so our hypothesis is not supported. This
does not prove that the null hypothesis is true.
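The .05 decision rule can be sketched in code. This is a minimal sketch that swaps in an exact permutation test on the difference in means instead of a t-test (the Python standard library has no t distribution), and the scores and the name `permutation_p_value` are made up for illustration.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled scores into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed difference.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical scores for two groups of four participants.
treatment = [5, 6, 7, 8]
control = [1, 2, 3, 4]

p = permutation_p_value(treatment, control)
if p < .05:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"
print(round(p, 4), decision)
```

With only four scores per group there are 70 possible relabellings, so the smallest p-value this test can produce is 2/70.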

Cyclical process

Cyclical process: general principle

you create a formal inductive rule

Cyclical process: deduction

you recognise that your informal observations aren't rooted in
science, so you desire to test your hypothesis

Cyclical process: prediction

you make a formal deductive hypothesis

Cyclical process: specific instances: individual events

you systematically collect data in an effort to falsify your
hypothesis. Based on your findings, you will probably revise your hypothesis

Cyclical process: Observations

you casually notice lots of white swans at a nearby lake

Cyclical process: induction

your use of logic leads you to believe that swans are white

Deductive/inductive hypothesis testing cycle

1. observations
2. induction
3. general principle
4. deduction
5. prediction
6. specific instances (individual events), which feed back into new observations

Descriptive statistics

eg. frequencies, means, and SDs
Aligned with inductive purposes because one can discern patterns
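A minimal sketch of the three descriptive statistics named above, using only the Python standard library. The ratings are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical wellbeing ratings from five participants.
scores = [2, 4, 4, 5, 5]

frequencies = Counter(scores)  # how often each rating occurs
average = mean(scores)         # arithmetic mean
sd = stdev(scores)             # sample SD (n - 1 in the denominator)

print(frequencies, average, round(sd, 2))
```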

Inferential statistics

eg. t-tests, MANOVAs
Does the statistical test reject or fail to reject the null hypothesis?

What is the role of insights, observations, and theories in setting
up a study?

They can play a large role in constructing inductive theories or
hypotheses. Qualitative research can fulfil this role.

How do we collect data that will be useful in elucidating a hypothesis?

Design a good study using strong methodology that prevents threats to
internal validity (i.e., biases such as social desirability)

How do we analyse that data so that we end up with good evidence?

Statistical methods that illuminate associations between variables or
differences between groups. If performed with inferential statistics,
these results can be considered reliable and valid.

How do we combine and collect the results of numerous studies in
order to make good conclusions?

Meta-analyses: combination of numerous studies by independent
researchers, to make valid conclusions about the data. eg. most
studies rejected the null hypothesis, or most studies failed to reject
the null hypothesis

Essential goal of ethics:

don't harm people who are helping you

Essential concepts of ethics

-stress and psychological harm
-deception
-informed consent
-debriefing
-privacy and confidentiality
-care of animals
-costs and benefits

Physical harm

One should warn the participants, and they have the right to refuse,
or to stop participation

Psychological harm

stress above that experienced on a daily basis is considered
to be too much (eg. seeing pictures of dead bodies; finding oneself in
a possible building fire; seeing explicit sexual images)

Stress and psychological harm

One can use these manipulations, but one should forewarn participants
of excessive stress to allow them to withdraw. As long as you gain
informed consent, ethics committees can approve that study.
Participants must know all details in order to give informed consent.


Deception

telling someone something that is not true, or leaving out something
that they should know

Harmless/ful deception

Harmless: telling participants that these photos are of convicted
murderers vs. harmful: false feedback that tells a subject that they
are deficient, flawed, or abnormal.
Only harmless deception may be used in psychological research

Even if you tell the subject later that you deceived them...

the manipulation may still have an unintended lasting effect. Person
may distrust psychologists afterwards, and form a bad opinion of research.

Informed consent

All participants should ideally be given an accurate description of
what they will experience in the study, and have the opportunity to
decide whether they wish to participate or not.
Must also be told that they can cease participation at any time
without penalty.

Informed consent + deception

You cannot describe the study accurately if you are performing
deception. Researchers usually deceive by omitting information or
being vague


Anonymous: one need not sign a consent form, participation is taken
as consent
Not anonymous: mandatory to obtain informed consent


Debriefing

Participants should be debriefed: told about the precise nature of the study

Why debrief?



Anonymity

the experimenter doesn't know who contributed data


Confidentiality

the experimenter knows who contributed data, but will not tell anyone
else. Experimenter protects identities

Why not make everything anonymous?

Not always feasible eg. interviewing families over time-- get to know
them intimately. Can store the data separately from the list of names,
and then destroy the list after it is no longer needed.

Aggregated data (grouped)

No identification of specific individuals


Pseudonyms

Used for individuals who are quoted or described. Initials can be used instead.

Special populations:

anyone younger than 16 years of age, elderly persons or anyone with a
cognitive deficit/ mental disorder should have a guardian sign for them
prisoners are a special case too

Who speaks for special populations?

IRB (Institutional Review Board) in the U.S.
Ethics Committees in NZ
They review applications to determine whether individuals are
sufficiently protected

NZ ethics

New Zealand Psychological Society's Ethics Code

Animal Research

All universities have an "ethical treatment of lab animals"
code of ethics: animals must be treated in a "humane" fashion, with no
unnecessary suffering

Costs v benefits

Each researcher must consider the balance of costs to participants
vs. benefits to society

Fraud in research

Changing data to get predicted results (replications expose the truth)

Theory is made up of ... whereas observations are based on ...

constructs, data
We understand constructs through capturing data that represents constructs


Formal representation of constructs

Conceptual variables and operationalizations

Can't measure constructs directly because they are hypothetical, so
have to measure them indirectly through variables

Conceptual variables and operationalizations example

Wellbeing (construct): (operationalizations) 5-item scale, no. of
smiles, brain scan

Three types of operationalization

1. self-report
2. observational
3. physiological

Conceptual variables and operationalizations example
Does savouring predict wellbeing?

We want to conduct research based on variables that are...

reliable and valid

We want to conduct research based on variables that are reliable and
valid. Why?

Then we have confidence that they are fairly representing the
construct and not something else.


Reliability

a measurement tool that consistently generates a similar empirical estimate

Most-least reliable variables:

stable demographic variables: most
psychological variables rooted in personality: moderate
quickly changing and highly variable variables such as mood: lowest

How do we assess reliability?

most measures of test-retest reliability are simply correlations of
scores for the same individuals at two or more points in time
the desired value depends on the situation eg. you don't expect a mood
measure to yield a high correlation, but you want gender to be very high
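Test-retest reliability as a correlation can be sketched like this. The scores are hypothetical and `pearson_r` is just an illustrative helper name; it computes an ordinary Pearson correlation between the two time points.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scale scores for the same five people at two time points.
time1 = [10, 12, 14, 16, 18]
time2 = [11, 12, 15, 15, 19]

r = pearson_r(time1, time2)
print(round(r, 2))
```

A value near 1 means people kept roughly the same rank order and spacing across the two occasions.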

Types of reliability

-test-retest reliability: correlation over time for the same individuals
-internal reliability (Cronbach's Alpha): average level of
intercorrelation among all of the items

Cronbach's alpha

close to 1: excellent
below 0.5: unacceptable

How to find Cronbach's alpha

Algebraic equation that combines number of items, average variance,
and average covariance to come up with final numerical value
More items increases alpha. A higher average covariance among the
items increases alpha.
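One common way to write that equation is alpha = (k x average covariance) / (average variance + (k - 1) x average covariance), where k is the number of items. A minimal sketch under that assumption, with hypothetical item responses and illustrative function names:

```python
from itertools import combinations
from statistics import mean, variance

def covariance(x, y):
    """Sample covariance between two items (n - 1 in the denominator)."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)

def cronbach_alpha(items):
    """Alpha from number of items, average variance, and average covariance."""
    k = len(items)
    avg_var = mean([variance(item) for item in items])
    avg_cov = mean([covariance(a, b) for a, b in combinations(items, 2)])
    return (k * avg_cov) / (avg_var + (k - 1) * avg_cov)

# Hypothetical responses of five people to a three-item scale
# (one list per item, one position per person).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]

alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Raising either k or the average covariance in the formula raises alpha, which matches the two rules of thumb above.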

Improving internal reliability?

You can remove items if removing them improves the overall alpha, esp. to
shorten the scale

Test-retest reliability

whether the scale yields similar numerical values for the SAME
individuals over time
low reliability: might mean scale is psychometrically poor, or
that your phenomenon is just inherently unstable

Internal reliability

how internally consistent the items of the measure are.
high Cronbach's alpha: indicates that the items on the scale tend
to correlate with each other to a high degree

A good scale:

will evidence reasonable stability over time, and it will be
internally consistent


Validity

our scale measures what we intend it to measure

Types of validity

content validity
criterion validity
construct validity

Content validity

Do the items on the scale relate to or tap the overall construct?
Does the following item assess what you are measuring?

Criterion validity

to what extent does the scale predict expected outcomes?

Construct validity

to what extent does the scale measure the intended hypothetical
construct (scale, not indiv. items. pay attention to definition
of construct)

More types of validity

Convergent validity, discriminant validity

Convergent validity

measures the extent to which the scale in question correlates with
scales that assess something similar

Discriminant validity

extent to which a scale does NOT correlate with scales that are
expected to be unrelated
looking for NON-significance, not a negative correlation (eg.
comparing to an opposite scale would be convergent validity)

Why are reliability and validity good?

We want "good scales," and these are defined as scales
possessing reliability and validity
-we want our scales to RELIABLY produce a similar score for the
same individuals for attributes that don't change much and those that
change moderately
-we want our scales to measure what they are intended to measure,
and nothing else.

If you are using a pre-existing scale, you need to be assured that:

-internal reliability is acceptable
-test-retest reliability is good
-the items of the scale seem to capture the intended construct
(content validity)
-it has been shown to predict expected outcomes (criterion validity)
-it has been shown to correlate with similar scales (convergent
validity)
-it does not correlate with dissimilar scales (discriminant validity)

Construct validity is the...

highest order, most abstract type of validity, and can only be
demonstrated through repeated demonstrations that the scale represents
the intended construct in numerous and various contexts-- the real world
good construct validity if numerous studies evidence all of the
above-mentioned types of validity

Types of variables/ scales of measurement

Nominal variables (categorical), ordinal variables,
interval (continuous) variables, ratio variables

nominal variables

numerical variables that indicate membership within a particular
group eg. men= 0, women= 1, other= 2

ordinal variables

based on rankings- only feasible with relatively small groups of comparisons

interval (continuous) variables

variables with numerous obtained numerical values between the maximum
and minimum

ratio variable

like interval variables, but has a true zero point.
minimum numerical value has a special meaning

In psychology, the most common type of variable is... Why?

Continuous/interval. Many statistical tests (t-test, ANOVA,
regression, etc.) rely on assumptions of equal spacing between points
on a scale and normal distributions. Interval and ratio data are more
likely to achieve normality than other types (it is impossible for
nominal and ordinal data)

Other types of analyses must be used if your outcome variable is...

nominal or ordinal

nominal or ordinal variables use:

non-parametric tests (nominal or ordinal variables can be used in
parametric tests, but only as predictors or IVs)

Ratio and interval variables use:

parametric tests

parametric test

based on normality

Self report measures advantages

-who knows better than the individual in question?
particularly useful for internal beliefs, attitudes, and emotions
that are not evident to other people (eg. depression, anxiety,
mindfulness, intentions)
-easier and more efficient, and maybe more accurate, than obtaining
observations of the person, other people's reports, or
neuropsychological indices

self-report measures problems

-awareness/memory
-response set/bias
-format of the question
-questions tailored for samples

levels of measurement

Categorical/ nominal, ordinal/ ranked, continuous: interval, ratio

Yes/no binary pros and cons

good for children/ simple, very restrictive, lacks richness that
other data can give you, limits participant responses


-gives participants lots of freedom and range, doesn't constrain
anything
-good for qualitative studies: when you're not worried about numbers
-not good for children: creative answers not useful sometimes
-wording must give you some sort of data that is useful to you

Fill-in-the-blank produces... data

categorical data in more of a variable form

Likert produces... data

interval/ ordinal?

Multiple choice pros and cons

gives participants options, but doesn't give them all of the options,
so you must be mindful of what answers you offer

Multiple choice produces... data


Yes/no binary produces... data


response scales

-yes/no (binary)
-fill-in-the-blank
-multiple choice
-Likert

Use of "don't know" in a survey

Must weigh up the need for data against the fact that it is nice to
give participants the option
Sometimes a "don't know" response is just as useful

Graded boxes

good for kids who aren't good with putting emotions into words

Smiley/frowny faces scale

commonly used to assess liking/disliking
also used to assess pain
easy for participants to respond to this because they're relevant
to lots of people: everyone knows what a smiley/frowny face is
BUT there is still ambiguity: the face in the middle

"Creative" formats:

good for unique populations (children, low literacy)
eg. visual analogues: good for pain (blank line, cursor along line)

Problems with digital administration

-Can't skim forward and backward quickly and easily
-Can't quickly determine how far through the survey you are
-Fonts can be small and hard to read
-Tied to a screen (typically a desk computer or laptop, although
tablets and smartphones can work well depending on the survey platform)
-Computers can die and data can be lost

Why digital administration?

-Point-and-click is easier and faster than using pen or pencil on
paper
-Can compile data (in Excel or SPSS formats) very quickly and
without error
-Can create a survey (a set of self-report questionnaires) more
easily, and can edit it more easily
-Almost everyone has a screen to read the questions (although
smartphones have small screens)
-Can enact "skip and branch" logic more easily than in paper surveys

Item wording: what to avoid (examples?)

-complexity
-technical terms
-ambiguity
-double-barreled questions
-double-negatives
-emotive language
-leading questions
-invasion of privacy
-sensitive topics with young people

Why sample?

It's usually not practical to obtain data from your entire
population, so take data from a representative sample and generalise outwards

External validity

to what degree can you generalise the findings to a larger group?

Sampling frame

If you can't afford to sample everyone in your population, focus on a
subset of the population to draw your sample
eg: population: children in NZ
sampling frame: children in Wellington
sample: a subset of children in Wellington

Population and sample examples

Probability Sampling tends to be...

expensive, time-consuming, but it's better than non-probability sampling

types of probability sampling

simple random, stratified random, cluster

Simple random sampling:

every person in the population has an equal chance of being sampled

stratified random sampling:

divide population along dimensions (eg. gender, SES, ethnicity, etc.)
and be sure that you sample proportionately across these dimensions

cluster sampling:

obtain participants from pre-existing groups or clusters. Try to get
a random sample of clusters
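The three probability sampling types can be sketched side by side. The population, the `region` and `school` fields, and the function names are all hypothetical; this is an illustration of the ideas rather than a recommended procedure.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 urban and 40 rural people, spread over 10 schools.
population = [
    {"id": i, "region": "urban" if i < 60 else "rural", "school": i % 10}
    for i in range(100)
]

def simple_random_sample(pop, n):
    """Every person has an equal chance of being sampled."""
    return random.sample(pop, n)

def stratified_random_sample(pop, n, key):
    """Sample each stratum in proportion to its share of the population."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n in general.
        share = round(n * len(members) / len(pop))
        sample.extend(random.sample(members, share))
    return sample

def cluster_sample(pop, n_clusters, key):
    """Randomly pick whole clusters and keep everyone inside them."""
    cluster_ids = sorted({person[key] for person in pop})
    chosen = set(random.sample(cluster_ids, n_clusters))
    return [person for person in pop if person[key] in chosen]

srs = simple_random_sample(population, 10)
stratified = stratified_random_sample(population, 10, "region")
clustered = cluster_sample(population, 2, "school")
print(len(srs), len(stratified), len(clustered))
```

The stratified sample always contains 6 urban and 4 rural people here, whereas the simple random sample only matches the 60/40 split on average.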

Non-probability sampling tends to be...

cheaper and easier, but you worry about representativeness

Types of nonprobability sampling:

convenience, quota, purposive, snowball


convenience sampling:

sample from readily available sources. handy for the researcher.
biases are introduced.

quota sampling:

obtain appropriate percentages of different types of participants
(eg. gender, ethnicity), but one is still obtaining these participants
from readily available sources

purposive sampling

you select individuals who fit within a particular category to fit a purpose

snowball sampling

recruit an initial group of participants, and then you obtain
referrals from them to obtain data from their friends and
acquaintances. Useful for rare types of participants (eg. surfers).

Most research psychologists use... because...

non-probability sampling, because it costs a lot of money, time, and
effort to obtain probability samples

What to pay attention to about samples when you read research:

who is the most commonly sampled population in psychology research?
are we missing out?

Who is the most commonly sampled population in Psychology research?

WEIRD: Western, educated, industrialised, rich, democratic
(+ lots of uni students)

Ethics bias

if you're studying children and adolescents and only obtain about 60%
parental permissions, what are the other 40% like?

self-selection bias

When you have a low response rate, who are the participants you get?

Nonrepresentativeness of the sample bias

when the sampling frame significantly differs from the population,
you have introduced biases


nonrepresentativeness of the sample, self-selection bias, ethics bias

creative approaches

-passive ethical consent for children and adolescents
-compensation and inducements
-interesting ways to collect data: laptops or iPads; internet;
testing on cell phones; diary studies; etc.
-interviews
-underutilised samples: eg. from school to after-school program, to
avoid survey fatigue (people don't want to answer lots of surveys) and
increase

The role of technology

increases our access to information:
-surveys over internet
-observations of naturally occurring behaviour
-through cell phones and tablets (multi-media portable computers)
-through surveillance of one's use of technological devices

Experience Sampling Method (ESM)

captures data on an hourly, daily, or weekly basis
good for rapidly changing variables on relatively small
"the experience sampling method, also referred to as daily
diary method, is an intensive longitudinal research methodology that
involves asking participants to report on their thoughts, feelings,
behaviours, and/or environment on multiple occasions over time"

Advantages of ESM

-capturing phenomena nearer the time that they occur: better
memory for events, feelings, and thoughts
-obtaining multiple assessments of variables of interest, and more
assessments of a construct yield better reliability and validity
-can identify contexts for important psychological states

Problems with ESM:

-difficult to recruit and retain individuals who don't mind that
their day is interrupted by signals to report states
-participants must be comfortable with the recording device
-lots of missing data
-difficult to analyse this type of data: numerous repeated measures

Statistical power

Probability of not making a Type 2 error
The more statistical power we have in our design, the better our
decision making

Type 1 error (alpha)

Incorrect decision: we reject null hypothesis, but null hypothesis
is true

Type 2 error (beta)

Incorrect decision: we do not reject null hypothesis, but null
hypothesis is not true

Values of power (1 - beta)

from 1 (perfect ability to avoid a Type 2 error) to 0 (making a Type 2
error all of the time). Ideally power is on high side (.80 or so)

Type 2: Not rejecting null hypothesis when it is false: Null
hypothesis is false

this means that there probably actually is a difference between your
means: you'd find a difference most of the time. Your test is one of
the few times you didn't find a difference.

Type 2: Do not reject null hypothesis when it is false: fail to reject

We fail to reject the null hypothesis, and say that there is no difference
between the means (our p-value is non-significant: above .05), when in
fact there is a difference (null hypothesis is false)

A type 2 error is "not rejecting the null hypothesis when it is
false." Translation

in the world, there is a true difference but your statistical test
yielded a p-value greater than .05, so you mistakenly do NOT reject
the null hypothesis.

Greater statistical power comes from...

larger sample sizes
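A small simulation can illustrate how sample size drives power. This is a sketch under simplifying assumptions: scores are normal with a known SD of 1 in both groups, so a z-test stands in for the t-test, and the effect size of 0.5, the group sizes, and the function name are made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_rejection_rate(n_per_group, true_difference,
                             n_sims=1000, alpha=.05, seed=1):
    """Share of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(n_sims):
        group_a = [rng.gauss(true_difference, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(group_a) / n_per_group - sum(group_b) / n_per_group
        # z-test with known SD = 1 in both groups (a simplification).
        z_stat = diff / sqrt(2 / n_per_group)
        p = 2 * (1 - standard_normal.cdf(abs(z_stat)))
        if p < alpha:
            rejections += 1
    return rejections / n_sims

# True difference exists: the rejection rate estimates power (1 - beta).
power_small = simulated_rejection_rate(20, 0.5)
power_large = simulated_rejection_rate(100, 0.5)
# No true difference: the rejection rate estimates the Type 1 error rate.
type1_rate = simulated_rejection_rate(20, 0.0)

print(power_small, power_large, type1_rate)
```

Under these assumptions the larger sample should reject the false null hypothesis far more often, while the Type 1 error rate stays close to the chosen alpha.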

Options for missing values in a dataset

Ignore the missing values and allow SPSS to perform listwise or
pairwise deletions
Impute the missing value with a proper method

Listwise deletion

an analysis drops every participant who has a missing value on any
variable involved in the analysis

Pairwise deletion

Each correlation uses every participant who has data on that pair of
variables. If you conduct correlations on a variety of variables that
are missing different values, you get different Ns.
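The difference between the two deletion strategies shows up in the Ns. A minimal sketch with a hypothetical dataset in which `None` marks a missing value; it imitates the idea only and is not how SPSS implements it.

```python
from itertools import combinations

# Hypothetical dataset: one dict per participant, None = missing value.
rows = [
    {"mood": 3, "stress": 4, "sleep": 7},
    {"mood": None, "stress": 2, "sleep": 8},
    {"mood": 5, "stress": None, "sleep": 6},
    {"mood": 4, "stress": 3, "sleep": None},
    {"mood": 2, "stress": 5, "sleep": 5},
    {"mood": None, "stress": 1, "sleep": 4},
]
variables = ["mood", "stress", "sleep"]

def complete_cases(data, names):
    """Keep participants with no missing value on any of the named variables."""
    return [row for row in data if all(row[name] is not None for name in names)]

# Listwise: one N for the whole analysis.
listwise_n = len(complete_cases(rows, variables))

# Pairwise: a separate N for each pair of variables.
pairwise_n = {
    pair: len(complete_cases(rows, pair))
    for pair in combinations(variables, 2)
}

print(listwise_n, pairwise_n)
```

Listwise deletion keeps only the participants with complete data on all three variables, while each pairwise correlation keeps more people and the Ns differ between pairs.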

Types of imputation

MI (multiple imputation), EM (expectation maximisation), and FIML
(full information maximum likelihood)

MI (multiple imputation)

can be computed in SPSS but it is unwieldy because it generates
multiple datasets

EM (expectation maximisation)

also can be computed in SPSS, and generates only a single dataset

FIML (full information maximum likelihood)

used in structural equation modelling

Why is imputation good?

it increases the number of participants who have complete data, so
your sample size reaches its maximum: it increases power
More power = lower chance that you'll mistakenly fail to reject the null hypothesis