Statistics
A set of methods and rules for collecting, organizing, summarizing, analyzing, interpreting, and presenting information (data).
Descriptive Statistics
Statistical procedures to summarize, organize and simplify data.
Inferential Statistics
Techniques (using probability) that allow us to use sample data to draw conclusions and make generalizations about the population.
Data
Measurements or observations
Qualitative Data
(descriptive and non-numerical; e.g. hair color, blood type, ethnic group, the car a person drives)
Quantitative Data
(numerical; amount of money, pulse rate, number of people living in Houston, number of students in this class)
Dataset
Collections of measurements or observations
Scores
Typically, quantitative data
Population
Set of all individuals, things, or objects of interest in a study.
Sample
A subset of individuals, things, or objects selected from the population of interest, ideally at random, to represent it.
Parameter
Any characteristic of a population is called a parameter.
Statistic
Any characteristic of a sample is called a statistic.
Sampling Error
The discrepancy between a sample statistic and the corresponding population parameter that results from the process of sampling itself. It is related to, but not the same as, the margin of error, which gives a bound on the likely size of the sampling error. (The discrepancy between a statistic and a parameter might also be driven by non-sampling errors.)
Random Selection or Random Sampling
Every element of a population has the same chance of being selected for the sample. Prevents self-selection and selection bias.
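Sampling error under random selection can be illustrated with a small simulation. This is only a sketch: the population below is made-up data, and the names `population`, `mu`, `x_bar`, and `sampling_error` are chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up scores.
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)        # population parameter

# Simple random sample: every element has the same chance of selection.
sample = random.sample(population, 25)
x_bar = statistics.mean(sample)         # sample statistic

sampling_error = x_bar - mu             # statistic minus parameter
print(f"parameter = {mu:.2f}, statistic = {x_bar:.2f}, "
      f"sampling error = {sampling_error:+.2f}")
```

Drawing a different sample (for example, by changing the seed) gives a different statistic and therefore a different sampling error, even though the parameter stays fixed.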
Variable
A characteristic or condition that changes or can have different values for different elements of a population.
Constant
A characteristic or condition that does not vary and is the same for all elements of a population.
Theory
A set of statements or principles to organize, unify, and explain a group of observations and facts regarding some phenomena.
Experimental Hypothesis
A prediction about the relationship between variables.
Constructs
Hypothetical concepts in theories to organize observations.
Operational Definition
Define a construct in terms of specific operations/procedures and their resulting measurements
Correlational
Make observations of the variables as they exist naturally and measure their relationship.
Experimental
Manipulate a variable to assess its effect on another variable.
- Tests a causal relationship; random assignment is important when possible.
Independent Variable (IV)
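Experimental variable that is manipulated; also called the treatment, predictor, or explanatory variable (e.g. drug treatment, exercise regimen). A characteristic such as gender cannot be manipulated by the researcher, so it is strictly a quasi-independent variable.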
Dependent Variable
Variable that is measured for changes as a result of the IV (e.g. memory scores, cholesterol level, height). Can also be called response or outcome variable.
Discrete
Separate, indivisible categories. Can be qualitative (e.g. hair color, ethnic group, political affiliation) or quantitative (e.g. number of students in this class, number of phone calls you receive each day of the week); quantitative discrete values are countable whole numbers.
Continuous
Only quantitative values. Divisible into an infinite number of fractional parts (e.g. weight, time, distance).
Nominal
Consists of a set of categories of different names. It measures qualitative data that can be classified into two or more categories/levels/groups (e.g. blood type, hair color, ethnic group).
Ordinal
Consists of a set of categories that can be organized in an ordered sequence (e.g. class rank, finishing position in a race, T-shirt sizes S/M/L).
Interval
Consists of a set of ordered categories that form a series of intervals of exactly the same size (e.g. temperature in Celsius or Fahrenheit, IQ).
Ratio
An interval scale with an absolute zero point (e.g. temperature in Kelvin, height, weight, time).
Frequency Distribution
An ordered tabulation or graphical presentation of the number of individual scores located in each category on the scale of measurement
Relative Frequency
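The proportion of the total number of scores that falls in a given category: the frequency of the category divided by the total number of scores (f/N).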
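A frequency distribution and its relative frequencies can be tabulated in a few lines. The sketch below uses made-up quiz scores and takes relative frequency to be the frequency of a score divided by the total number of scores (f/N); the variable names are illustrative only.

```python
from collections import Counter

# Hypothetical quiz scores (made-up data for illustration).
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]

freq = Counter(scores)   # f: how many times each score occurs
n = len(scores)          # N: total number of scores

print(" X   f   f/N")
for x in sorted(freq, reverse=True):   # highest score first
    rel = freq[x] / n                  # relative frequency (a proportion)
    print(f"{x:2d}  {freq[x]:2d}  {rel:.2f}")
```

The frequencies add up to N and the relative frequencies add up to 1, which is a quick check that no score was dropped from the table.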
Histogram
Data are represented as a _____________ when measurements are on a continuous scale (interval or ratio)
Bar Graph
Data are represented as a ___________ when measurements are on a discrete scale (nominal or ordinal)
Polygon
Data can also be represented as a ___________ when measurements are on a continuous scale (interval or ratio)
Shape, Central Tendency, Variability
Three characteristics that completely describe any distribution.
Normal Distribution
A commonly occurring population distribution that is, loosely speaking, symmetrical, with the greatest frequency at its middle and progressively smaller frequencies toward its tails. Also referred to as a bell curve.
Stem and Leaf Plot
A display that comes from the field of exploratory data analysis. It is a quick way to picture small datasets.
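One way to build such a plot for two-digit scores is to treat the tens digit as the stem and the ones digit as the leaf. The sketch below assumes made-up scores between 10 and 99; other splits (for example, hundreds as the stem) are equally possible.

```python
from collections import defaultdict

# Hypothetical two-digit scores (made-up data for illustration).
scores = [83, 82, 63, 62, 93, 78, 71, 68, 33, 76, 52, 97,
          85, 42, 46, 32, 57, 59, 56, 73, 74, 74, 81, 76]

stems = defaultdict(list)
for x in sorted(scores):
    stem, leaf = divmod(x, 10)   # tens digit is the stem, ones digit the leaf
    stems[stem].append(leaf)

# Walk every stem from lowest to highest so that gaps would still show up.
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

Each row lists the leaves for one stem, so the length of a row plays the same role as the height of a bar in a histogram while still showing the individual scores.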
Line Graph
Typically used to plot change in data over time, i.e. longitudinal data (e.g. change in monthly high temperatures from January to December).
Scatterplot
Used to see whether there is a relationship (typically linear) between two variables. It indicates both the direction of the relationship between the x variable and the y variable and the strength of that relationship.
Variability
It provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered
Three Measures of Variability
The range, the interquartile range, and the standard deviation.
Range
It is the difference between the largest score and the smallest score in a distribution
Standard Deviation
A number that measures how far scores typically are from their mean; it is the square root of the variance.
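The three measures of variability can be compared on one small dataset. This is a sketch on made-up scores using Python's standard `statistics` module. Note two assumptions: textbooks differ on how quartiles are computed, so the interquartile range here follows the module's default ("exclusive") method, and the sample standard deviation divides by n - 1 (the degrees of freedom) rather than by N.

```python
import statistics

# Hypothetical scores (made-up data for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest score minus smallest score.
score_range = max(scores) - min(scores)

# Interquartile range: third quartile minus first quartile.
# statistics.quantiles uses the "exclusive" method by default; other
# quartile conventions can give slightly different values.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Standard deviation: population form divides by N, sample form by n - 1.
sd_population = statistics.pstdev(scores)
sd_sample = statistics.stdev(scores)

print(f"range = {score_range}, IQR = {iqr}")
print(f"population SD = {sd_population:.3f}, sample SD = {sd_sample:.3f}")
```

The sample standard deviation comes out slightly larger than the population version because dividing by n - 1 instead of N corrects for the tendency of a sample to underestimate population variability.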
Degrees of Freedom
The number of scores in a dataset that are free to vary when estimating a population parameter. For a sample of n scores used to estimate variability, df = n - 1, because the deviations from the sample mean must sum to zero.
Standardized Distribution
It is derived using the mean and standard deviation of a distribution to transform each score (X value) into a z-score or standard score.
Z-scores
Tell you where raw scores are located relative to the mean: the sign shows whether a score is above (+) or below (-) the mean, and the magnitude gives the distance in standard deviations.
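The transformation from raw scores to z-scores can be sketched as z = (X - mean) / standard deviation. The example below uses made-up scores and treats them as a complete population, so it uses the population standard deviation; the names `mu`, `sigma`, and `z_scores` are illustrative.

```python
import statistics

# Hypothetical scores, treated as a whole population (made-up data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.mean(scores)        # population mean
sigma = statistics.pstdev(scores)   # population standard deviation

# Standardize each raw score: distance from the mean in SD units.
z_scores = [(x - mu) / sigma for x in scores]

for x, z in zip(scores, z_scores):
    print(f"X = {x}  ->  z = {z:+.2f}")

# The standardized distribution has mean 0 and standard deviation 1.
print(f"mean of z = {statistics.mean(z_scores):.2f}, "
      f"SD of z = {statistics.pstdev(z_scores):.2f}")
```

A positive z-score marks a raw score above the mean and a negative one marks a score below it, and the shape of the distribution is unchanged by the transformation.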