Statistics

sample

selects a subset of a population

population

the entire group of individuals being studied; a census measures every individual

random

method where each individual has equal probability of being selected

stratified

sample in which each demographic group is represented in proportion to its size in the population
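
As a rough illustration of proportional allocation, the sketch below draws from each group a share of the sample equal to that group's share of the population. The function name `stratified_sample`, the grade labels, and the group sizes are all hypothetical and chosen only for the example.

```python
import random

def stratified_sample(strata, total_size, seed=0):
    """Draw a sample whose group sizes are proportional to the population's."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # This group's share of the population sets its share of the sample,
        # rounded to a whole number of individuals.
        k = round(total_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school of 1000 students, identified by number and split by grade.
strata = {
    "grade 9": list(range(0, 400)),
    "grade 10": list(range(400, 700)),
    "grade 11": list(range(700, 900)),
    "grade 12": list(range(900, 1000)),
}
sample = stratified_sample(strata, total_size=50)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'grade 9': 20, 'grade 10': 15, 'grade 11': 10, 'grade 12': 5}
```

Because each group size is rounded, the group counts may not add up exactly to the requested total for other inputs.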

voluntary

sample that contains only those individuals that choose to participate

cluster

method that randomly selects small groups defined by geography or association

convenience

sample that includes those individuals that are easiest to collect data from

systematic

method that uses a rule or pattern to select individuals
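
One common rule is to pick a random starting point and then take every k-th individual. The sketch below assumes that rule; the name `systematic_sample` and the roster of 100 numbered individuals are hypothetical.

```python
import random

def systematic_sample(population, step, seed=0):
    """Select every `step`-th individual after a random starting point."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random position within the first interval
    return population[start::step]

# Hypothetical roster of 100 individuals numbered 1 to 100.
roster = list(range(1, 101))
chosen = systematic_sample(roster, step=10)
print(len(chosen))  # 10
```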

destructive

method that destroys the individual in order to collect the data

bias

systematic error in an experiment that over- or under-measures the true value

validity

conclusions made from the data make credible predictions

reliable

experimental results are reproducible by others

primary

source of data collected or recorded first hand

secondary

source of data collected and interpreted by someone else

categorical

data recorded as a description or category rather than a measurement

continuous

variable of measured data (uncountable outcomes)

discrete

variable that contains data from a countable set of outcomes

histogram

graphical depiction of continuous one-variable data

bar graph

graphical depiction of discrete data

normal

data where measures of central tendency are equal or sufficiently close

skew

the mean is pulled away from the other central measures

outlier

an anomalous datum that is separated from or doesn't match the pattern in the rest of the data set

mean

average of a data set: the sum of the values divided by the number of values

median

middle value of a sorted data set or the 50th percentile

mode

most frequent value of a data set
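
The three measures of central tendency can be checked with Python's standard `statistics` module. The data set here is made up for illustration.

```python
import statistics

# Made-up data set, already sorted for readability.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)  # average of the two middle values: (3 + 5) / 2 = 4.0
mode = statistics.mode(data)      # 3 appears most often

print(mean, median, mode)
```

In this example the single large value 10 pulls the mean above the median.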

range

the minimum value subtracted from the maximum value

variance

average of the squared differences between each piece of data and the mean of the data set

standard deviation

square root of the variance
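
A short worked sketch of both calculations, assuming the population form of the variance (dividing by the number of values); the sample form divides by one less than the number of values instead. The data set is made up for illustration.

```python
import math

# Made-up data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                        # 40 / 8 = 5.0
squared_diffs = [(x - mean) ** 2 for x in data]     # 9, 1, 1, 1, 0, 0, 4, 16
variance = sum(squared_diffs) / len(data)           # 32 / 8 = 4.0 (population variance)
std_dev = math.sqrt(variance)                       # 2.0

print(variance, std_dev)  # 4.0 2.0
```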

inter-quartile range

the first quartile subtracted from the third quartile

quartiles

divides the data set into four equally sized subgroups
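
Several conventions exist for computing quartiles. The sketch below assumes the median-of-halves method, in which the first and third quartiles are the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd). The function name `quartiles` and the data are hypothetical.

```python
import statistics

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + n % 2:]  # skip the middle value when n is odd
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Made-up data set with nine values.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # inter-quartile range

print(q1, q2, q3, iqr)  # 3.5 7 11.0 7.5
```

Other conventions, including those used by spreadsheet software, may give slightly different quartile values for the same data.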

box and whisker

graphical depiction of quartiles with the median

bimodal

distribution with two distinct modes (peaks), typically on either side of the mean and median

accuracy

how close the data is to the correct value

precision

how close the data points are in relation to each other

inferential

statistics that extrapolate to the entire population

descriptive

statistics used to describe a data set

quantitative

numerical measures of discrete or continuous variables

qualitative

anecdotal observations or categorical data

variable

an attribute that is measured or counted

coefficient of determination

r^2 value that measures how much of the variation in the dependent variable can be explained by variation in the independent variable

linear regression

an algorithm to find the equation of the line of best fit for a set of data

correlation coefficient

the PPMC for a data set: a value between -1 and 1 that indicates the strength and direction of the linear relationship

PPMC

Pearson product-moment correlation coefficient, also known as the correlation coefficient or r-value
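
The line of best fit, the correlation coefficient, and the coefficient of determination can all be computed from the same sums of deviations. The sketch below assumes ordinary least-squares regression with one independent variable; the name `linear_fit` and the data points are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for a least-squares line of best fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient, between -1 and 1
    return slope, intercept, r

# Made-up data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r = linear_fit(xs, ys)
r_squared = r ** 2  # coefficient of determination

# Approximately: slope 0.6, intercept 2.2, r 0.775, r^2 0.6
print(round(slope, 3), round(intercept, 3), round(r, 3), round(r_squared, 3))
```

Here about 60% of the variation in the dependent variable is explained by the line.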

cause and effect

a relationship in which change in the independent variable causes change in the dependent variable

influential point

an extreme data point that noticeably changes the slope of the line of best fit and the coefficient of determination

percentile

breaks a data set into 100 equal parts and is used to rank data

weighted mean

an average calculation where each piece of data has a component factor used to amplify or diminish it
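
As a small illustration, the sketch below multiplies each value by its weight, adds the products, and divides by the total weight. The name `weighted_mean`, the marks, and the weights are made up for the example.

```python
def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)

# Hypothetical course marks with weights of 20%, 30%, and 50%.
marks = [80, 70, 90]
weights = [20, 30, 50]
final_mark = weighted_mean(marks, weights)  # (1600 + 2100 + 4500) / 100

print(final_mark)  # 82.0
```

The unweighted mean of the same marks is 80, so the heavier weight on the highest mark raises the result.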

index

an actual, relative, or subjective value that is tracked over time

real value

used to compare the value of investments after inflation has been discounted

sample bias

occurs when the participant group does not reflect the population that is being studied or a sample is too small

non-response bias

occurs when subgroups are under-represented because of low participation rates

measurement bias

occurs when the device used for the experiment is not calibrated accurately

response bias

occurs when participants purposely provide false or misleading answers

reverse cause and effect

when the independent and dependent variables are reversed, so the presumed effect is actually the cause

presumed relationship

when the dependent variable is assumed to be related to the independent variable

accidental relationship

when a correlation occurs by coincidence and the independent variable is not actually related to the dependent variable

common cause

when the changes in both the independent and dependent variables are caused by an outside factor