PSYC 532 - Lecture

SPRITE - Sample Parameter Reconstruction via Iterative Techniques

~ Takes reported stats and tells you about the raw data underlying them
~ Important when doing claim investigations
~ Peripheral information may not mean much
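
A minimal R sketch of the SPRITE idea (not the published algorithm; the function name, 1-7 scale bounds, and targets are made up for illustration): hunt for integer responses whose mean and SD match the reported values.

# Hypothetical helper illustrating the SPRITE idea
sprite_sketch <- function(n, target_mean, target_sd, lo = 1, hi = 7, iters = 10000) {
  x <- sample(lo:hi, n, replace = TRUE)
  target_sum <- round(target_mean * n)      # assumes the target mean is reachable
  while (sum(x) != target_sum) {            # match the mean first
    i <- sample(n, 1)
    if (sum(x) < target_sum && x[i] < hi) x[i] <- x[i] + 1
    else if (sum(x) > target_sum && x[i] > lo) x[i] <- x[i] - 1
  }
  for (k in seq_len(iters)) {               # then nudge the SD toward the target
    if (abs(sd(x) - target_sd) < 0.005) break
    i <- sample(n, 1); j <- sample(n, 1)
    if (i != j && x[i] < hi && x[j] > lo) {
      y <- x; y[i] <- y[i] + 1; y[j] <- y[j] - 1   # +1/-1 keeps the mean fixed
      if (abs(sd(y) - target_sd) < abs(sd(x) - target_sd)) x <- y
    }
  }
  x
}
sort(sprite_sketch(20, 3.5, 1.2))   # one candidate raw-data set for M = 3.5, SD = 1.2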

Evidential Value

~ Prereg
~ Design and Methods
~ Power and n planning
~ Effective visualizations
~ P<0.05
~ Reproducible
~ Appropriate analysis
~ CI and Effect Sizes
~ Open data

Inferential stats

~ Pop X --> sampling strategy --> sample pop X and collect data --> analyze and make claims about pop X

General Linear Model

~ Variance in individual DV scores (Y) explained by IV/predictors (X) and residual error variance (e)
~ e norm dist.
~ Y contin.

Testing Approach

Raw data --> descriptive stats --> test stat --> p value + 95% CI and Effect sizes --> conclusion

P-Value????

~ The probability of finding the observed, or more extreme, results when the null hypothesis (H 0) of a study question is true

Determining Significance

1) Determine desired long-run rate of false-positive (Type I) errors (α) --> some propose changing it to 0.005
2) Take an estimate of effect (e.g., a mean difference) and uncertainty, and convert them into a test statistic
3) Compare test statistic to the critical value associated with α
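
A quick R illustration of steps 1-3 (the α, df, and observed t are made-up numbers):

alpha <- 0.05                  # step 1: desired long-run Type I error rate
df <- 58                       # e.g., two groups of 30
t_crit <- qt(1 - alpha/2, df)  # two-tailed critical value (step 3's benchmark)
t_obs <- 2.10                  # step 2: hypothetical test statistic
abs(t_obs) > t_crit            # TRUE --> significant at this alpha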

For true significance, we're not looking for a lot of p values between 0.01 and 0.05 (a pile-up there is a warning sign).

P-value distribution when H0 is true

Flat (uniform)

P-value dist. with a true effect

Right (+) skew that gets more extreme as power increases
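
These two distributions are easy to simulate in R (the sample sizes and the 0.5 SD true effect are arbitrary choices):

set.seed(42)
p_null <- replicate(5000, t.test(rnorm(30), rnorm(30))$p.value)              # H0 true
p_true <- replicate(5000, t.test(rnorm(30, mean = 0.5), rnorm(30))$p.value)  # true effect
hist(p_null, breaks = 20)   # roughly flat
hist(p_true, breaks = 20)   # piled up near 0 (+ skew)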

Critical t value

~ α/2 of observed test stats fall beyond each critical value (two-tailed)

Three probabilities to keep straight: the one we want, the one we would be happy with, and the one we actually get ("|" reads as "given"; [T] means "is true"):

P(H0[T]) --> probability the null is true (what we want)
P(H0[T]|data) --> probability the null is true given the data (what we would be happy with)
P(data|H0[T]) --> probability of the data given a true null (what we actually get: the p value)
P(D given H): probability of obtaining some data given a hypothesis

p-value integrity maintained when

~ Assumptions underlying the test statistic are met
~ The researcher conducts/interprets one statistical test of a given hypothesis, without any post hoc changes to sample size, variables in the model, etc.

Bayesian Inference

~ An approach to statistical inference that takes:
--> What you initially think about H0/H1 ("prior" probabilities)
--> The strength of evidence for H0/H1 in your data (the "likelihood")
~ And combines them using probability axioms to calculate:
--> Updated ("posterior") probabilities of H0/H1

Bayes Factor

P(data|H1[T]) / P(data|H0[T])
How much more probable are the data if H1 were true than if H0 were true
~ Continuous measure of evidence
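
A sketch using the BayesFactor package (assumes it is installed; the simulated 0.4 SD effect is arbitrary):

library(BayesFactor)
set.seed(1)
bf <- ttestBF(x = rnorm(30, mean = 0.4), y = rnorm(30))
bf   # BF10: how many times more probable the data are under H1 than H0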

BF Limits

~ Can't control long run error rates
~ Smaller n --> selected prior can influence outcome

Benefit of Confidence Intervals

Range of plausible estimates of the effect
Can reject a multitude of nulls (any value outside the interval)
Some evidence of whether or not a study was precise enough

Calculate CI

1) Identify your effect of interest
2) Calculate the effect's standard error
3) Determine level of confidence and associated critical value for test statistic
4) Calculate Lower and Upper Limits of CI
LL: your effect + (negative critical value * SE)
UL: your effect + (positive critical value * SE)
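
Steps 1-4 by hand in R (the effect, SE, and df are made-up numbers):

m_diff <- 1.8                    # step 1: effect of interest (a mean difference)
se <- 0.6                        # step 2: its standard error
t_crit <- qt(0.975, df = 58)     # step 3: 95% confidence, two-tailed critical t
c(LL = m_diff - t_crit * se,     # step 4: lower limit
  UL = m_diff + t_crit * se)     #         upper limit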

Effect Sizes

~ Necessary for comparing effects
~ Planning sample size --> smallest effect size of interest :)

Limits of ES

1) Original scaling can sometimes be much more informative
2) Proposed cutoffs for "small", "medium", and "large" effects were/are
~ Unintentional
~ Subjective
~ Poorly intuited
3) In some cases, they distort the patterns of data across studies
4) Might be

Replicability Crisis

~ Most psych research has not been reproducible
~ Registered Replication Reports

Internal Reproducibility

~ Collaborators

External Reproducibility

~ Skeptical Readers

Misreporting

~50% of articles contain at least one misreported value; ~15% contain a "gross error" (i.e., mistakenly rejecting/failing to reject the null)

Type 1 Error

False positive
~ Not real effect
~ Reject Null

Type 2 Error

False Negative
~ Real effect
~ Fail to reject null

Power

1 - β
~ Probability of NOT making a Type II (false-negative) error
~ We aspire to achieve 80% power in psychological research
~ Influenced by α, effect size, and n
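
Base R's power.t.test() solves for any one of these quantities, e.g., the n per group needed for 80% power to detect d = 0.5:

power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
# --> roughly 64 per group for an independent-samples test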

Increasing power

Increase your sample size
Use repeated-measures designs
Use more reliable measures
Use more impactful manipulations
Use more sophisticated statistical analysis techniques
Use one-tailed tests*

P dist. with power

More concentrated near 0 (tighter) with higher power

What counts as p hacking

~ Adding more data after peeking
~ Identifying outliers post hoc
~ Adding covariates/interactions
~ Altering coding schemes of variables
~ Dropping a condition
~ Dropping a DV that didn't work
~ Changing to a 1-tailed test

Sequential Analysis

~ Plan "peeks" in advance and test at an adjusted (lower) α at each look, so the long-run Type I error rate stays controlled
If significant, stop: you have a significant effect and you saved a bunch of resources
If not, collect data until the next planned "peek": rinse-and-repeat
If necessary, continue until the largest planned n based on the power analysis

Why data visualization?

~ Quickly convey info
~ Help non-scientists understand
~ Prevent mistakes
~ Direct communication

Types of Visuals
Histograms
Density plots
Grouped Box and Whisker
Bar
Dot plots
Pirate plots
Scatter

~ Histograms --> distribution of a continuous variable (frequency)
--> Goldilocks with the number of bins (not too few, not too many)
~ Density Plots --> an intelligent smoothing algorithm for the histogram
~ Grouped Box-and-Whiskers --> Goldilocks option for a categorical predictor and a continuous outcome
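
Each of these can be sketched in base R (mtcars is just a convenient built-in dataset):

x <- rnorm(200)
hist(x, breaks = 20)                # Goldilocks the number of bins
plot(density(x))                    # the smoothed version of the histogram
boxplot(mpg ~ cyl, data = mtcars)   # grouped box-and-whiskers: cat. predictor, cont. outcome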

Types of T tests

~ Single
~ Paired
~ Independent

t-test equation

t = (M - μM) / sM
~ (sample mean - population mean) / estimated standard error

Z vs T

Z
~ When sigma known
~ No df --> have pop parameters
T
~ sigma unknown
~ df --> estimate values

Single Samp T

Compare sample mean to assumed null pop. mean
Assume normality

Paired Samp T

Compare within-subject mean difference to assumed null pop. mean difference
~ Assume normality

Independent Samp T

Compare between-group differences in means to assumed pop. difference in means
~ Assume normality and homogeneity of variance (HoV)
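
All three run through t.test() in R (the data below are simulated for illustration):

set.seed(2)
x <- rnorm(25, mean = 3.4)                         # one sample
pre <- rnorm(20); post <- pre + rnorm(20, 0.3)     # within-subject pair
d <- data.frame(score = c(rnorm(20), rnorm(20, 0.5)),
                group = rep(c("a", "b"), each = 20))
t.test(x, mu = 3)                                  # single sample vs. null mean of 3
t.test(post, pre, paired = TRUE)                   # paired
t.test(score ~ group, data = d, var.equal = TRUE)  # independent, assuming HoV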

Interpreting D

d = 0.2 (small) --> ~92% overlap
d = 0.5 (medium) --> ~80% overlap
d = 0.8 (large) --> ~69% overlap

Equivalence Testing

~ Used when you want to make stronger claims about H0
~ Flip NHST logic on its head: can you reject H0(s) of values within which you would consider an effect to be trivial
~ Traditional NHST: can I reject H0 of 0 difference
~ TOSTs (two one-sided tests): can I reject two H0s...
~ H0 that the effect is at or below the lower equivalence bound, and H0 that it is at or above the upper bound
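
A bare-bones TOST with base R (the ±0.3 equivalence bounds and data are made up; dedicated packages exist, but two one-sided t tests show the logic):

set.seed(3)
x <- rnorm(40, mean = 0.05)                    # raw-unit scores
t.test(x, mu = -0.3, alternative = "greater")  # reject "effect <= lower bound"?
t.test(x, mu =  0.3, alternative = "less")     # reject "effect >= upper bound"?
# both sig --> the effect is statistically equivalent to 0 within the bounds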

Write Up

What were you looking at (i.e., general overview), and how you did it (i.e., what kind of analysis did you conduct?)
Mention pre-registered (and one-tailed test, if used)
Outcome of evaluating your assumptions (and how they were evaluated)
A human-readable summary of the results

Types of ANOVA

Mixed
Within
Between
1 or 2 way (# of IVs analyzed)

ANOVA basics

~ Useful for IVs with 2+ levels
~ Relies on F
--> Ratio of between-group / within-group variance
Null: F = 1 (groups are about as different from each other as people are within groups)

1-way between-groups ANOVA

df_between = G - 1
MS = SS_between / df_between
F = MS_between / MS_within (ratio of variances)
SS_between = Σ(M - GM)² (each group mean vs. the grand mean)

1-way within-groups ANOVA

df_within = N - G
SS_within = Σ(X - M)² (each score vs. its group mean)
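
The pieces by hand in R, checked against aov() (PlantGrowth is a built-in 3-group dataset):

d <- PlantGrowth
gm <- mean(d$weight)                      # grand mean
m_g <- ave(d$weight, d$group)             # each score's group mean
ss_between <- sum((m_g - gm)^2)           # Σ(M - GM)² across scores
ss_within  <- sum((d$weight - m_g)^2)     # Σ(X - M)²
df_b <- nlevels(d$group) - 1              # G - 1
df_w <- nrow(d) - nlevels(d$group)        # N - G
(ss_between / df_b) / (ss_within / df_w)  # F; matches summary(aov(weight ~ group, data = d))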

Bonferroni correction

~ Run as many t tests as you want, but with a new t_crit based on an adjusted α (α / number of tests)

Tukey's HSD

Pairwise comparisons with familywise error control
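
Both corrections are one-liners in R (PlantGrowth again as a convenient example):

p.adjust(c(0.010, 0.020, 0.040), method = "bonferroni")  # adjusted p = p * number of tests
fit <- aov(weight ~ group, data = PlantGrowth)
TukeyHSD(fit)   # all pairwise comparisons with familywise error control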

Effect Size for ANOVA

η² (eta squared)
SS_effect / SS_total
Proportion of variance in DV explained by IV
~ Sums to 1 across all sources of variance

Limits of ANOVA

~ No continuous IVs
~ Not good for within-subject designs
~ No interactions with continuous variables
~ No cont. x cat. interactions

Zero-order correlation coefficients

~ Total association between X and Y

Semi-partial correlation

~ Unique association between X and Y, with influences from other variables partialled out of X only

Partial correlation

~ Unique association between X and Y, with influences from other variables partialled out of both X and Y

Equation for r

r = SP_xy / [sqrt(SS_x) * sqrt(SS_y)]

Zero-Order Correlation Coefficients

Pearson --> parametric, for linear relationships (continuous variables)
Spearman Rank --> nonparametric, for monotonic relationships (at least ordinal)
Kendall Rank --> nonparametric, for strength of dependence (at least ordinal); good for small n
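
All three (plus the r formula above) in R, using mtcars for convenience:

x <- mtcars$wt; y <- mtcars$mpg
cor(x, y, method = "pearson")
cor(x, y, method = "spearman")
cor(x, y, method = "kendall")
# the SP/SS formula by hand reproduces Pearson's r:
sum((x - mean(x)) * (y - mean(y))) / sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))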

Correlation as effect size

0.1 = small
0.3 = med
0.5 = large
0.7 = very large
Typical effects in psych fall between 0.1 and 0.3

Bivariate Regression Formula

Y = b0 + b1X + e
Y = scores on the DV
b0 = Y intercept; predicted Y when X = 0
b1 = slope of the line; change in Y for a one-unit increase in X
X = the predictor
e = unexplained variability (residual)
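
Fitting this model in R (mtcars as a stand-in for real data):

fit <- lm(mpg ~ wt, data = mtcars)   # Y = mpg, X = wt
coef(fit)      # b0 (intercept) and b1 (slope)
summary(fit)   # slopes, SEs, t tests, F, and R²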

Intercept only

~ Predict Y from M (the mean) alone; a baseline for comparison

Bivariate model

Predict Y from X

Assumptions for Regression

~ Linearity
~ Residuals normally distributed
~ Absence of outliers
~ Homoscedasticity (HoV) --> spread of residuals doesn't change across levels of X

Steps for regression

~ Calc slope
~ Calc intercept
~ Calc Y
~ Calc SSres
~ Calc F --> looking for sig
~ Calc R2
~ Calc SE of estimate
~ Calc SE of slope
~ Calc t test for slope
~ Calc CIs
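
The same steps by hand in R (again on mtcars), which lm() output should reproduce:

x <- mtcars$wt; y <- mtcars$mpg; n <- length(y)
b1 <- cov(x, y) / var(x)                      # slope
b0 <- mean(y) - b1 * mean(x)                  # intercept
yhat <- b0 + b1 * x                           # predicted Y
ss_res <- sum((y - yhat)^2)                   # SS residual
r2 <- 1 - ss_res / sum((y - mean(y))^2)       # R²
se_est <- sqrt(ss_res / (n - 2))              # SE of estimate
se_b1 <- se_est / sqrt(sum((x - mean(x))^2))  # SE of slope
t_b1 <- b1 / se_b1                            # t test for slope (df = n - 2)
b1 + c(-1, 1) * qt(0.975, n - 2) * se_b1      # 95% CI for the slope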

Evaluating Model Fit for Regression

~ Calc F
~ If F is sig --> interpret the t tests of the slopes
~ If not sig --> do not interpret them

Dummy Coding

~ Compare group means to a reference group
~ Reference shouldn't have a small n
~ Reference shouldn't be a catch-all group
~ Reference coded 0 on every dummy variable

Effects Code

~ Compares group means to the grand unweighted mean
~ Choose the least interesting group as the base
~ -1 for the base group on every code
~ 1 for the group of interest on its particular code, 0 otherwise

Contrast Code

~ Test more complicated hypotheses about mean differences between groups and combinations of groups
~ Weights (c's) sum to 0
~ Difference between the positive and negative weights should sum to 1
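
How the three schemes look in R for a 3-group factor (group labels are arbitrary; each assignment below overwrites the previous one):

g <- factor(rep(c("low", "med", "high"), each = 10),
            levels = c("low", "med", "high"))
contrasts(g) <- contr.treatment(3)                   # dummy: "low" is the reference (0, 0)
contrasts(g) <- contr.sum(3)                         # effects: last level gets -1 on every code
contrasts(g) <- cbind(c(-2, 1, 1)/3, c(0, -1, 1)/2)  # contrasts: each column sums to 0
contrasts(g)                                         # inspect the current coding matrix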

Moderation

~ In the context of regression, an opportunity to evaluate whether the slope of Y on X changes as a function of the moderator variable, M
~ Effect of X is conditional on M

Terms of Equation for Interactions

Y = b0 + b1X + b2M + b3(X*M) + e
~ b1 now gives us the slope for X when M is at 0
--> If M weren't at 0, the slope would be different
~ b2 is now the conditional effect of M when X is at 0
--> Again, if X weren't 0, the slope would be different

Probing a sig interaction

~ Sig b3 (difference between a group's slope and the average slope)
~ Slope of X changes across levels of M
~ Center the moderator on the level of interest and interpret b1

Interpreting interactions with coded variables

~ Dummy: b0 and b2 (slope of M) describe the baseline/reference group
~ Effects: b0 and b2 (slope of M) describe the average across groups
~ Coded variables: reflect differences in intercepts (b1)
~ Interaction terms: reflect differences in slopes (b3)
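
A simulated example of a dummy-coded group interacting with a continuous M (the coefficients below are arbitrary truth values for the simulation):

set.seed(4)
d <- data.frame(g = rep(c(0, 1), each = 50), m = rnorm(100))
d$y <- 1 + 0.5 * d$g + 0.3 * d$m + 0.6 * d$g * d$m + rnorm(100)
coef(lm(y ~ g * m, data = d))
# (Intercept) = b0: intercept for the reference group (g = 0)
# g   = b1: difference in intercepts between groups
# m   = b2: slope of m for the reference group
# g:m = b3: difference in slopes between groups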

With Interactions of Cont x Cont

~ Re-center at low, average, and high levels of M (e.g., ±1 SD)
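
Re-centering in R with simulated data (±1 SD as the "low"/"high" levels is a common but arbitrary choice):

set.seed(5)
d <- data.frame(x = rnorm(100), m = rnorm(100))
d$y <- 0.3 * d$x + 0.2 * d$m + 0.4 * d$x * d$m + rnorm(100)
d$m_high <- d$m - sd(d$m)   # puts "high M" (+1 SD) at 0
d$m_low  <- d$m + sd(d$m)   # puts "low M" (-1 SD) at 0
coef(lm(y ~ x * m_high, data = d))["x"]   # b1 = simple slope of x at high M
coef(lm(y ~ x * m_low,  data = d))["x"]   # b1 = simple slope of x at low M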

Curvilinear examples

Sexual frequency and relational well-being
Money and happiness

Nonlinearity in the Variable Equation

Y = b0 + b1X + b2X² + e
b0 --> predicted value of Y when X is 0
b1 --> slope of Y on X when X is 0
b2 --> the expected change in the slope of Y on X for every one-unit increase in X

Testing Curvilinear models

~ MR (multiple regression, simultaneous)
--> include both X and X²
--> sig F and a sig slope for X² are evidence
~ HR (hierarchical regression)
--> add X² at a later step
--> sig F test of the change in R² from adding X², plus a sig slope for X², are evidence
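
The HR version in R with simulated curvilinear data:

set.seed(6)
d <- data.frame(x = rnorm(150))
d$y <- 0.2 * d$x - 0.3 * d$x^2 + rnorm(150)
fit1 <- lm(y ~ x, data = d)           # step 1: linear only
fit2 <- lm(y ~ x + I(x^2), data = d)  # step 2: add the squared term
anova(fit1, fit2)                     # F test of the change in R² from adding x²
summary(fit2)                         # t test of the x² slope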

Mediation vs. Indirect Effect

~ Study design permits causal inferences? --> mediation
~ Study design does not permit causal inferences? --> indirect effect

Baron & Kenny, 1986 and steps

~ c path is the total effect
Must meet these expectations:
~ c path is sig
~ a path is sig
~ b path is sig
~ If X --> Y controlling for M (the c' path) is closer to 0 than the c path...
and non-sig --> M completely mediates X --> Y
and sig --> M partially mediates X --> Y
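
The a, b, c, and c' paths are just regressions; a sketch with simulated data:

set.seed(7)
n <- 200
x <- rnorm(n); m <- 0.5 * x + rnorm(n); y <- 0.4 * m + 0.1 * x + rnorm(n)
d <- data.frame(x, m, y)
coef(lm(y ~ x, data = d))["x"]      # c path: total effect
coef(lm(m ~ x, data = d))["x"]      # a path
coef(lm(y ~ m + x, data = d))["m"]  # b path
coef(lm(y ~ m + x, data = d))["x"]  # c' path: X --> Y controlling for M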

Limits of B and K method

~ Mediated effect (ab) never directly tested
~ 3 inferential tests must be passed to conclude mediation is present
~ Total effect must be sig (not always the case)
~ Without formal testing, you cannot compare mediation effects

Alt. to B and K

~ Bootstrapping
~ Monte Carlo
~ Normal Theory/Wald Test

Monte Carlo

~ Nonparametric simulation method
--> Simulate ab values many times and summarize them
--> Different from bootstrapping --> does not make assumptions about the distribution of ab, but assumes normality of residuals for a and b
--> Raw data not needed
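
A Monte Carlo CI needs only the path estimates and their SEs (all four numbers below are made up):

a <- 0.50; se_a <- 0.10
b <- 0.40; se_b <- 0.12
ab <- rnorm(10000, a, se_a) * rnorm(10000, b, se_b)  # simulate ab, assuming normal a and b
quantile(ab, c(0.025, 0.975))                        # 95% CI; excludes 0 --> indirect effect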

Normal Theory

~ Normality-of-residuals assumption
~ Assuming the distribution of ab is normal is often not true
~ Generic Sobel test better than the Baron & Kenny approach

Bootstrapping

~ Nonparametric resampling
~ Gold standard
~ Randomly resample observations with replacement, many times (e.g., N = 10,000), then summarize across samples
For each resample:
Estimate a
Estimate b
Estimate ab; then place the ab estimates in order of magnitude and take the 2.5th and 97.5th percentiles as the 95% CI
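
A percentile bootstrap of ab in base R, reusing simulated mediation data (10,000 resamples per the note):

set.seed(8)
n <- 200
x <- rnorm(n); m <- 0.5 * x + rnorm(n); y <- 0.4 * m + 0.1 * x + rnorm(n)
d <- data.frame(x, m, y)
ab <- replicate(10000, {
  i <- sample(n, replace = TRUE)                 # resample rows, with replacement
  a <- coef(lm(m ~ x, data = d[i, ]))["x"]       # estimate a
  b <- coef(lm(y ~ m + x, data = d[i, ]))["m"]   # estimate b
  a * b                                          # estimate ab
})
quantile(ab, c(0.025, 0.975))   # sorted ab values --> 95% percentile CI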

Interpreting R output

~ Intercept estimate --> holding everything else constant, the predicted DV score. With effects coding, that's the score at the grand average of the coded groups (the "average person"); if we dummy coded with females as the reference, it's the predicted score for females.