Scientific inductivism

one observes nature, proposes a law to generalise an observed

pattern, confirms it by many observations, while discarding

disconfirmed laws

According to scientific inductivism...

science is based on observation, and the acceptance and rejection of

possible hypotheses based on these observations

Inductive reasoning makes

broad generalisations from specific observations ie. there are data,

then conclusions are drawn from the data

Example of inductivism

Swans I've seen are white, so I draw the conclusion that all swans

are white (jump in logic)

Falsificationism is based on...

deduction: you are trying to reject something

Falsificationism

we cannot confirm hypotheses, only falsify them

Deductive inference

hold a theory and based on it we make a prediction of its

consequences (we predict what the observations should be if the theory

were correct). From general (theory) to specific (the observations)

Inductive inference

from specific (data) to general (theory)

Deductive hypothesis testing

Begin with a hypothesis (eg. all swans are white), and then collect

data. Data can falsify (eg. run across black swan).

Hypothesis testing t-test

P-value is below .05

We "reject the null hypothesis that there is no difference

between the two groups"

Falsificationism: we have "failed to falsify our

hypothesis" ie. "we have found support for our hypothesis"

Null hypothesis:

no difference between groups

if our hypothesis is supported: we want to reject this

If the result isn't significant, we fail to reject the null hypothesis,

so we can't claim that our hypothesis is supported
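
A minimal sketch of the reject / fail-to-reject decision described above. It swaps in an exact permutation test for the t-test so that it runs with the standard library only. The scores, the function name permutation_p_value, and the ALPHA variable are illustrative assumptions, not part of the notes.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling the pooled scores into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical scores for two groups (made-up numbers).
control = [4.1, 5.0, 4.6, 5.2, 4.8]
treatment = [6.3, 6.9, 5.9, 7.1, 6.6]

ALPHA = 0.05  # conventional significance threshold
p = permutation_p_value(control, treatment)
if p < ALPHA:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"
print(f"p = {p:.4f}: {decision}")
```

Either outcome is a statement about the null hypothesis only: a small p-value rejects "no difference", while a large one leaves the question open.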

Cyclical process

Cyclical process: general principle

you create a formal inductive rule

Cyclical process: deduction

you recognise that your informal observations aren't rooted in

science, so you desire to test your hypothesis

Cyclical process: prediction

you make a formal deductive hypothesis

Cyclical process: specific instances: individual events

you systematically collect data in an effort to falsify your

hypothesis. Based on your findings, you will probably revise your hypothesis

Cyclical process: Observations

you casually notice lots of white swans at a nearby lake

Cyclical process: induction

your use of logic leads you to believe that swans are white

Deductive/inductive hypothesis testing cycle

1. observations

2. induction

3. general principle

4. deduction

5. prediction

Descriptive statistics

eg. frequencies, means, and SDs

Aligned with inductive purposes because one can discern patterns
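
A short illustration of the three descriptive statistics named above, using only the Python standard library. The scores are made-up numbers on an assumed 1-5 response scale.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical responses on a 1-5 scale (made-up numbers).
scores = [3, 4, 4, 5, 2, 4, 3, 5, 4, 4]

frequencies = Counter(scores)  # how often each value occurs
average = mean(scores)         # central tendency
spread = stdev(scores)         # sample standard deviation (n - 1 denominator)

for value in sorted(frequencies):
    print(value, frequencies[value])
print(f"mean = {average:.2f}, SD = {spread:.2f}")
```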

Inferential statistics

eg. t-tests, MANOVAs

Does the statistical test reject or fail to reject the null hypothesis?

What is the role of insights, observations, and theories in setting

up a study?

They can play a large role in constructing inductive theories or

hypotheses. Qualitative research can fulfil this role.

How do we collect data that will be useful in elucidating a hypothesis?

Design a good study using strong methodology that prevents threats to

internal validity (i.e., biases such as social desirability)

How do we analyse that data so that we end up with good evidence?

Statistical methods that illuminate associations between variables or

differences between groups. If performed with inferential statistics,

these results can be considered reliable and valid.

How do we combine and collect the results of numerous studies in

order to make good conclusions?

Meta-analyses: combination of numerous studies by independent

researchers, to make valid conclusions about the data. eg. most

studies rejected the null hypothesis, or most studies failed to reject the

null hypothesis

Essential goal of ethics:

don't harm people who are helping you

Essential concepts of ethics

stress and psychological harm

deception

informed consent

debriefing

privacy and confidentiality

care of animals

costs and benefits

Physical harm

One should warn the participants, and they have the right to refuse,

or to stop participation

Psychological harm

stress that is above that experienced on a daily basis is considered

to be too much (eg. seeing pictures of dead bodies; finding oneself in

a possible building fire; seeing explicit sexual images)

Stress and psychological harm

One can use these manipulations, but one should forewarn participants

of excessive stress to allow them to withdraw. As long as you gain

informed consent, ethics committees can approve that study.

Participants must know all details in order to give informed consent.

Deception

telling someone something that is not true, or leaving out something

that they should know

Harmless/harmful deception

harmless: eg. "these photos are of convicted murderers"

harmful: eg. false feedback that tells a subject that they are deficient, flawed, or abnormal.

Only harmless deception may be used in psychological research

Even if you tell the subject later that you deceived them...

the manipulation may still have an unintended lasting effect. Person

may distrust psychologists afterwards, and form a bad opinion of research.

Informed consent

All participants should ideally be given an accurate description of

what they will experience in the study, and have the opportunity to

decide whether they wish to participate or not.

Must also be told that they can cease participation at any time

without penalty.

Informed consent + deception

You cannot describe the study accurately if you are performing

deception. Researchers usually deceive by omitting information or

being vague

Anonymity

Anonymous: one need not sign a consent form, participation is taken

as consent

Not anonymous: mandatory to obtain informed consent

Debriefing

Participant should be debriefed: told about the precise nature of the study

Why debrief?

Anonymous:

the experimenter doesn't know who contributed data

Confidential:

the experimenter knows who contributed data, but will not tell anyone

else. Experimenter protects identities

Why not make everything anonymous?

Not always feasible eg. interviewing families over time-- get to know

them intimately. Can store the data separately from the list of names,

and then destroy the list after it is no longer needed.

Aggregated data (grouped)

No identification of specific individuals

Pseudonyms

Used for individuals who are quoted or described. Or initials.

Special populations:

anyone younger than 16 years of age, elderly persons or anyone with a

cognitive deficit/ mental disorder should have a guardian sign for them

prisoners are a special case too

Who speaks for special populations?

IRB (Institutional Review Board) in the U.S.

Ethics Committees in NZ

Review applications to determine whether individuals are

sufficiently protected

NZ ethics

New Zealand Psychological Society's Ethics Code

Animal Research

All universities have an "ethical treatment of lab animals"

code of ethics- must be treated in a "humane" fashion, no

unnecessary suffering

Costs v benefits

Each researcher must consider the balance of costs to participants

vs. benefits to society

Fraud in research

Changing data to get predicted results (replications expose the truth)

Plagiarism

Theory is made up of ... whereas observations are based on ...

constructs, data

We understand constructs through capturing data that represents constructs

Operationalisation

Formal representation of constructs

Conceptual variables and operationalizations

Can't measure constructs directly because they are hypothetical, so

have to measure them indirectly through variables

Conceptual variables and operationalizations example

Wellbeing (construct): (operationalizations) 5-item scale, no. of

smiles, brain scan

Three types of operationalization

1. self-report

2. observational

3. physiological

Conceptual variables and operationalizations example

Does savouring predict wellbeing?

We want to conduct research based on variables that are...

reliable and valid

We want to conduct research based on variables that are reliable and

valid. Why?

Then we have confidence that they are fairly representing the

construct and not something else.

Reliability

a measurement tool that consistently generates a similar empirical estimate

Most-least reliable variables:

stable demographic variables: most

psychological variables rooted in personality: moderate

quickly changing and highly variable variables such as mood: lowest

How do we assess reliability?

most measures of test-retest reliability are simply correlations of

scores for the same individuals at two or more points-in-time

value depends on situation eg. you don't want a mood measure to

yield a high correlation, but you want gender to be very high
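
A minimal sketch of test-retest reliability as a plain Pearson correlation between two time points. The helper pearson_r and the scores are illustrative assumptions (made-up data for five people).

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five people at two points in time.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 14, 19, 21]

r = pearson_r(time_1, time_2)
print(f"test-retest r = {r:.2f}")
```

Whether a given r counts as "good" depends on how stable the construct is expected to be, as noted above.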

Types of reliability

-test-retest reliability: correlation over time for the same individuals

-internal reliability (Cronbach's Alpha): average level of

intercorrelation among all of the items

Cronbach's alpha

close to 1: excellent

below 0.5: unacceptable

How to find Cronbach's alpha

Algebraic equation that combines number of items, average variance,

and average covariance to come up with final numerical value

More items increases alpha. Higher average covariance among the items increases alpha.
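
One common way to write the equation described above is alpha = (k * average covariance) / (average variance + (k - 1) * average covariance), where k is the number of items. The sketch below assumes that form; the function names and the item scores are illustrative, not taken from the notes.

```python
from itertools import combinations
from statistics import mean, variance


def covariance(x, y):
    """Sample covariance (n - 1 denominator) between two items."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in the same order."""
    k = len(items)
    avg_var = mean([variance(item) for item in items])
    avg_cov = mean([covariance(a, b) for a, b in combinations(items, 2)])
    return (k * avg_cov) / (avg_var + (k - 1) * avg_cov)


# Hypothetical 3-item scale answered by five respondents (made-up numbers).
item_1 = [4, 3, 5, 2, 4]
item_2 = [5, 3, 4, 2, 4]
item_3 = [4, 2, 5, 3, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

In this form it is easy to see both effects: a larger k or a larger average covariance pushes alpha towards 1.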

Improving internal reliability?

You can remove items if removing them improves the overall alpha, esp. to

shorten the scale

Test-retest reliability

whether the scale yields similar numerical values for the SAME

INDIVIDUALS over time

low reliability: might mean scale is psychometrically poor, or

that your phenomenon is just inherently unstable

Internal reliability

how internally consistent the items of the measure are.

high Cronbach's alpha: indicates that the items on the scale tend

to correlate with each other to a high degree

A good scale:

will evidence reasonable stability over time, and it will be

internally consistent

Validity

our scale measures what we intend it to measure

Types of validity

content validity

criterion validity

construct validity

Content validity

Do the items on the scale relate to or tap the overall construct?

Does the following item assess what you are measuring?

Criterion validity

to what extent does the scale predict expected outcomes?

Construct validity

to what extent does the scale measure the intended hypothetical

construct (scale, not indiv. items. pay attention to definition

of construct)

More types of validity

Convergent validity, discriminant validity

Convergent validity

measures the extent to which the scale in question correlates with

scales that assess something similar

Discriminant validity

extent to which a scale does NOT correlate with scales that are

expected to be unrelated

looking for NON-significance, not a negative correlation (eg.

comparing to an opposite scale would be convergent validity)

Why are reliability and validity good?

We want "good scales," and these are defined as scales

possessing reliability and validity

-we want our scales to RELIABLY produce a similar score for the

same individuals for attributes that don't change much and those that

change moderately

-we want our scales to measure what they are intended to measure,

and nothing else.

If you are using a pre-existing scale, you need to be assured that:

-internal reliability is acceptable

-test-retest reliability is good

-the items of the scale seem to capture the intended construct (content validity)

-it has been shown to predict expected outcomes (criterion validity)

-it has been shown to correlate with similar scales (convergent validity)

-it does not correlate with dissimilar scales (discriminant validity)

Construct validity is the...

highest order, most abstract type of validity, and can only be

demonstrated through repeated demonstrations that the scale represents

the intended construct in numerous and various contexts-- the real world

good construct validity if numerous studies evidence all of the

above-mentioned types of validity

Types of variables/ scales of measurement

nominal (categorical) variables, ordinal variables,

interval (continuous) variables, ratio variables

nominal variables

numerical variables that indicate membership within a particular

group eg. men= 0, women= 1, other= 2

ordinal variables

based on rankings- only feasible with relatively small groups of comparisons

interval (continuous) variables

variables with numerous obtained numerical values between the maximum

and minimum

ratio variable

like interval variables, but has a true zero point.

minimum numerical value has a special meaning

In psychology, the most common type of variable is... Why?

Continuous/interval. Many statistical tests (t-test, ANOVA,

regression, etc.) rely on assumptions of equal spacing between points

on a scale and normal distributions. Interval and ratio data are more

likely to achieve normality than other types (it is impossible for

nominal and ordinal data)

Other types of analyses must be used if your outcome variable is...

nominal or ordinal

nominal or ordinal variables use:

non-parametric tests (they can be used in parametric tests, but only as predictors or IVs)

Ratio and interval variables use:

parametric tests

parametric test

based on normality

Self report measures advantages

-who knows better than the individual in question?

particularly useful for internal beliefs, attitudes, and emotions

that are not evident to other people (eg. depression, anxiety,

mindfulness, intentions)

-easier and more efficient

and maybe more accurate than obtaining observations of the

person, other people's reports, or neuropsychological indices

self-report measures problems

awareness/memory

response set/bias

format of the question

questions tailored for samples

levels of measurement

Categorical/ nominal, ordinal/ ranked, continuous: interval, ratio

Yes/no binary pros and cons

good for children/ simple, very restrictive, lacks richness that

other data can give you, limits participant responses

fill-in-the-blank

gives participants lots of freedom and range, doesn't constrain

anything, good for qualitative studies: when you're not worried about

numbers, not good for children: creative answers not useful sometimes,

wording must give you some sort of data that is useful to you

Fill-in-the-blank produces... data

categorical data in more of a variable form

Likert produces... data

interval/ ordinal?

Multiple choice pros and cons

gives participants options, but doesn't give them all of the options,

must be mindful of what answers you offer

Multiple choice produces... data

categorical

Yes/no binary produces... data

categorical

response scales

yes/no (binary)

fill-in-the-blank

multiple choice

Likert

Use of "don't know" in a survey

Must weigh up the need for data against the fact that it is nice to give the option

Sometimes a "don't know" response is just as useful

Graded boxes

good for kids who aren't good with putting emotions into words

Smiley/frowny faces scale

commonly used to assess liking/disliking

also used to assess pain

easy for participants to respond to this because they're relevant

to lots of people: everyone knows what a smiley/frowny face is

BUT there is still ambiguity: guy in middle

"Creative" formats:

good for unique populations (children, low literacy)

eg. visual analogues: good for pain- blank line, cursor along line

Problems with digital administration

Can't skim forward and backward quickly and easily

Can't quickly determine how far through the survey you are

Fonts can be small and hard to read

Tied to a screen (typically a desk computer or laptop, although tablets and

smartphones can work well depending on the survey platform)

Computers can die and data can be lost

Why digital administration?

Point-and-click is easier and faster than using pen or pencil on paper

Can compile data (in Excel or SPSS formats) very quickly and without error

Can create a survey (a set of self-report questionnaires) more easily, and can edit it more easily

Almost everyone has a screen to read the questions (although smartphones have small screens)

Can enact "skip and branch" logic more easily than in paper surveys

Item wording: what to avoid (examples?)

complexity

technical terms

ambiguity

double-barreled questions

double-negatives

emotive language

leading questions

invasion of privacy

sensitive topics with young people

Why sample?

It's usually not practical to obtain data from your entire

population, so take data from a representative sample and generalise outwards

External validity

to what degree can you generalise the findings to a larger group?

Sampling frame

If you can't afford to sample everyone in your population, focus on a

subset of the population to draw your sample

eg: population: children in NZ

sampling frame: children in Wellington

sample: a subset of children in Wellington

Population and sample examples

Probability Sampling tends to be...

expensive, time-consuming, but it's better than non-probability sampling

types of probability sampling

simple random, stratified random, cluster

Simple random sampling:

every person in the population has an equal chance of being sampled

stratified random sampling:

divide population along dimensions (eg. gender, SES, ethnicity, etc.)

and be sure that you sample proportionately across these dimensions
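
A minimal sketch of simple random and proportionate stratified sampling, using Python's random module. The population, the "urban"/"rural" strata, the function names, and the seed are all illustrative assumptions.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being drawn."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n, rng):
    """Draw from each stratum in proportion to its share of the population."""
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n for uneven shares.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 urban and 40 rural people.
population = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda person: person[0], 10, rng)

print(sum(1 for person in strat if person[0] == "urban"), "urban in stratified sample")
```

The simple random sample may over- or under-represent a stratum by chance; the stratified one fixes the proportions by design.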

cluster sampling:

obtain participants from pre-existing groups or clusters. Try to get

a random sample of clusters

Non-probability sampling tends to be...

cheaper and easier, but you worry about representativeness

Types of nonprobability sampling:

convenience

quota

purposive

snowball

convenience sampling:

sample from readily available sources. handy for the researcher.

biases are introduced.

quota sampling:

obtain appropriate percentages of different types of participants

(eg. gender, ethnicity), but one is still obtaining these participants

from readily available sources

purposive sampling

you select individuals who fit within a particular category to fit a purpose

snowball sampling

recruit an initial group of participants, and then you obtain

referrals from them to obtain data from their friends and

acquaintances. Useful for rare types of participants (eg. surfers).

Most research psychologists use... because...

non-probability sampling, because it costs a lot of money, time, and

effort to obtain probability samples

What to pay attention to about samples when you read research:

who is the most commonly sampled population in psychology research?

are we missing out?

Who is the most commonly sampled population in Psychology research?

WEIRD:

Western, educated, industrialised, rich, democratic

(+ lots of uni students)

Ethics bias

if you're studying children and adolescents and only obtain about 60%

parental permissions, what are the other 40% like?

self-selection bias

When you have a low response rate, who are the participants you get?

Nonrepresentativeness of the sample bias

when the sampling frame significantly differs from the population,

you have introduced biases

Biases

nonrepresentativeness of the sample, self-selection bias, ethics bias

creative approaches

passive ethical consent for children and adolescents

compensation and inducements

interesting ways to collect data: laptops or iPads; internet; testing on cell phones; diary studies; etc.

interviews

underutilised samples: eg. from school to after-school program, to avoid survey fatigue (people don't want to answer lots of surveys) and increase motivation

The role of technology

increases our access to information:

-surveys over internet

-observations of naturally occurring behaviour

-through cell phones and tablets (multi-media portable computers)

-through surveillance of one's use of technological devices

**ETHICS IMPT**

Event Sampling Method (ESM)

captures data on an hourly, daily, or weekly basis

good for rapidly changing variables on relatively small

samples

"the experience sampling methods, also referred to as daily

diary method, is an intensive longitudinal research methodology that

involves asking participants to report on their thoughts, feelings,

behaviours, and/or environment on multiple occasions over time"

Advantages of ESM

capturing phenomena nearer the time that they occur: better memory for events, feelings, and thoughts

obtaining multiple assessments of variables of interest, and more assessments of a construct yield better reliability and validity

can identify contexts for important psychological states

Problems with ESM:

difficult to recruit and retain individuals who don't mind that their day is interrupted by signals to report states

participants must be comfortable with the recording device

lots of missing data

difficult to analyse this type of data: numerous repeated measures

Statistical power

Probability of not making a Type 2 error

The more statistical power we have in our design, the better our

decision making

Type 1 error (alpha)

Incorrect decision: we reject null hypothesis, but null hypothesis

was true

Type 2 error (beta)

Incorrect decision: we do not reject null hypothesis, but null

hypothesis is not true

Values of power (1 - beta)

from 1 (perfect ability to avoid this error) to 0 (totally wrong all

of the time). Ideally power is on high side (.80 or so)

Type 2: Not rejecting null hypothesis when it is false: Null

hypothesis is false

this means that there probably actually is a difference between your

means- you'd find a difference most of the time. Your test is one of

the only times you didn't find a difference.

Type 2: Do not reject null hypothesis when it is false: fail to reject

We accept the null hypothesis, and say that there is no difference

between the means (our p-value is non-significant: above .05), when in

fact there is a difference (null hypothesis is false)

A type 2 error is "not rejecting the null hypothesis when it is

false." Translation

in the world, there is a true difference but your statistical test

yielded a p-value greater than .05, so you mistakenly do NOT reject

the null hypothesis.

Greater statistical power comes from...

larger sample sizes
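
A rough sketch of how power grows with sample size. It uses a normal approximation for a two-sided comparison of two equal-sized groups rather than an exact t-test power calculation. The function name approx_power, the standardised effect size of 0.5, and the sample sizes are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) for a two-sided, two-group comparison.

    effect_size is the standardised mean difference (difference / SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for n in (20, 64, 200):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, roughly 64 participants per group are needed to reach power of about .80 for a medium-sized effect.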

Options for missing values in a dataset

Ignore the missing values and allow SPSS to perform listwise or

pairwise deletions

Impute the missing value with a proper method

Listwise deletion

an analysis drops only those participants that have a missing value

for a variable involved in the analysis

Pairwise deletion

Each correlation uses every participant who has data on both variables

in that pair, so if variables are missing different values, you get different Ns.
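
A small sketch of how the two deletion strategies lead to different sample sizes. The dataset and variable names (mood, sleep, stress) are made up, with None standing in for a missing value. This only counts cases; it does not reproduce SPSS output.

```python
from itertools import combinations

# Hypothetical dataset: None marks a missing value.
rows = [
    {"mood": 4, "sleep": 7, "stress": 2},
    {"mood": 3, "sleep": None, "stress": 5},
    {"mood": None, "sleep": 6, "stress": 4},
    {"mood": 5, "sleep": 8, "stress": None},
    {"mood": 2, "sleep": 5, "stress": 6},
    {"mood": 1, "sleep": 4, "stress": None},
]
variables = ["mood", "sleep", "stress"]

# Listwise: keep only participants with complete data on every variable analysed.
listwise = [r for r in rows if all(r[v] is not None for v in variables)]

# Pairwise: each pair of variables keeps whoever has data on both of them.
pairwise_n = {
    (a, b): sum(1 for r in rows if r[a] is not None and r[b] is not None)
    for a, b in combinations(variables, 2)
}

print("listwise N:", len(listwise))
for pair, n in pairwise_n.items():
    print(pair, "N =", n)
```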

Types of imputation

MI (multiple imputation), EM (expectation maximisation), and FIML

(full information maximum likelihood)

MI (multiple imputation)

can be computed in SPSS but it is unwieldy because it generates

multiple datasets

EM (expectation maximisation)

also can be computed in SPSS, and generates only a single dataset

FIML (full information maximum likelihood)

used in structural equation modelling

Why is imputation good?

it increases the number of participants who have complete data, so

your sample size reaches its maximum: it increases power

More power = lower chance that you'll mistakenly accept the null hypothesis