Statistics

Data

collections of observations

Statistics

the science of planning studies and experiments; obtaining data; organizing, summarizing, presenting, analyzing, and interpreting those data; and then drawing conclusions based on them

population

complete collection of all measurements or data that are being considered

census

collection of data from every member of the population

sample

a sub-collection of members selected from a population

voluntary response sample (self-selected sample)

the respondents themselves decide whether to be included

parameter

numerical measurement describing some characteristic of a population

statistic

numerical measurement describing some characteristic of a sample

Quantitative (numerical) data

consist of numbers representing counts or measurements

Qualitative (categorical) data

consist of names or labels that are not numbers representing counts or measurements

Quantitative data can be further described by?

discrete and continuous

Discrete data

result when the data values are quantitative and the number of values is finite or "countable"

Continuous data

result from infinitely many possible quantitative values, where the collection of values is not countable

Nominal level of measurement

categories only; data cannot be arranged in order

Ordinal level of measurement

data can be arranged in order, but differences either cannot be found or are meaningless

Interval level of measurement

differences are meaningful, but there is no natural zero starting point and ratios are meaningless

Ratio level of measurement

there is a natural zero starting point and ratios are meaningful

example of ratio

heights, lengths, distances, volumes

example of interval

body temperatures in degrees Fahrenheit or Celsius

example of ordinal

ranks of colleges in US News & World Report

example of nominal

eye colors

Observational study

we observe and measure specific characteristics but we don't attempt to modify the subjects being studied

Experiment

we apply some treatment and then proceed to observe its effects on the subjects

Simple random sample

a sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen
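
A minimal Python sketch of a simple random sample. The population of 100 numbered subjects, the sample size, and the seed are all assumptions made for illustration. `random.sample` draws without replacement, so each subset of size n is equally likely.

```python
import random

# Hypothetical population: subjects numbered 1 through 100.
population = list(range(1, 101))
n = 10  # illustrative sample size

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Draw n distinct members; every possible subset of size n
# has the same chance of being chosen.
simple_random_sample = rng.sample(population, n)
print(simple_random_sample)
```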

Systematic sampling

we select some starting point and then select every kth (such as every 50th) element in the population
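
One way to sketch systematic sampling in Python. The 500-element population is hypothetical, and choosing the starting point at random among the first k elements is an assumption, since the definition only says "some starting point".

```python
import random

# Hypothetical population: 500 numbered elements.
population = list(range(1, 501))
k = 50  # select every 50th element

rng = random.Random(0)    # fixed seed only so the sketch is repeatable
start = rng.randrange(k)  # assumed: random start among the first k positions

# Slice from the starting index, stepping by k.
systematic_sample = population[start::k]
print(systematic_sample)
```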

Convenience sampling

we simply use results that are very easy to get

Stratified sampling

we subdivide the population into at least two different subgroups (strata) so that subjects within the same subgroup share the same characteristics, then we draw a sample from each subgroup
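
A rough Python sketch of stratified sampling. The subjects, the two strata ("freshman" and "senior"), and the choice to draw a simple random sample of three from each stratum are assumptions for illustration.

```python
import random
from collections import defaultdict

# Hypothetical subjects as (id, stratum) pairs.
subjects = [(i, "freshman" if i % 2 else "senior") for i in range(1, 41)]
per_stratum = 3  # illustrative number drawn from each subgroup

# Group subject ids by the characteristic they share.
strata = defaultdict(list)
for subject_id, stratum in subjects:
    strata[stratum].append(subject_id)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Draw a sample from every subgroup, not just from some of them.
stratified_sample = {
    name: rng.sample(members, per_stratum) for name, members in strata.items()
}
print(stratified_sample)
```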

Cluster sampling

we divide the population area into sections (clusters), then randomly select some of those clusters and choose all the members from those selected clusters
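
A small Python sketch of cluster sampling, assuming four hypothetical precincts as the clusters. In contrast with stratified sampling, only some clusters are chosen, and every member of a chosen cluster is kept.

```python
import random

# Hypothetical clusters: each precinct lists its members.
clusters = {
    "precinct_A": ["a1", "a2", "a3"],
    "precinct_B": ["b1", "b2"],
    "precinct_C": ["c1", "c2", "c3", "c4"],
    "precinct_D": ["d1", "d2"],
}

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Randomly select some of the clusters...
chosen_clusters = rng.sample(sorted(clusters), 2)

# ...then take all the members of each selected cluster.
cluster_sample = [member for name in chosen_clusters for member in clusters[name]]
print(chosen_clusters, cluster_sample)
```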

Cross-sectional study

data are observed, measured, and collected at one point in time, not over a period of time

retrospective (case-control) study

data are collected from a past time period by going back in time (through examination of records, interviews, and so on)

prospective study

data are collected in the future from groups (cohorts) that share common factors

randomization

used when subjects are assigned to different groups through a process of random selection
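
Random assignment can be sketched in Python as a shuffle followed by a split. The 20 subject labels and the two equal-sized groups are assumptions for illustration.

```python
import random

# Hypothetical subjects enrolled in an experiment.
subjects = [f"subject_{i}" for i in range(1, 21)]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Shuffle a copy so that chance alone decides the group membership.
shuffled = subjects[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
treatment_group = shuffled[:half]
placebo_group = shuffled[half:]
print(treatment_group, placebo_group)
```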

replication

the repetition of an experiment on more than one subject

blinding

subject does not know whether he or she is receiving a treatment or a placebo

placebo effect

occurs when an untreated subject reports an improvement in symptoms

double-blind

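blinding occurs at two levels: (1) the subject does not know whether he or she is receiving the treatment or a placebo, and (2) the experimenter does not know either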

sampling error

occurs when the sample has been selected with a random method but there is a discrepancy between a sample result and the true population result
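
To make the discrepancy concrete, here is a small Python sketch under assumed data: a hypothetical population of the integers 1 through 1000 and a simple random sample of 25. The difference between the sample mean (a statistic) and the population mean (a parameter) is one way to express sampling error.

```python
import random
import statistics

# Hypothetical population and its true mean (a parameter).
population = list(range(1, 1001))
parameter = statistics.mean(population)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# A randomly selected sample and its mean (a statistic).
sample = rng.sample(population, 25)
statistic = statistics.mean(sample)

# The sample was chosen randomly, yet its result still differs
# from the true population result.
sampling_error = statistic - parameter
print(parameter, statistic, sampling_error)
```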

non-sampling error

result of human error, such as wrong data entries, computing errors, or questions with biased wording

nonrandom sampling error

result of using a sampling method that is not random

lower class limits

smallest numbers that can belong to the different classes

upper class limits

largest numbers that can belong to the different classes

class boundaries

numbers used to separate the classes, but without the gaps created by class limits

class midpoints

the values in the middle of the classes; each is found by adding the lower class limit to the upper class limit and dividing by 2

class width

difference between two consecutive lower class limits in a frequency distribution
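
The class terms above can be worked through in Python. The limits below (50-59, 60-69, 70-79, 80-89) are a hypothetical frequency distribution for whole-number data, not values from any particular data set.

```python
# Hypothetical class limits for whole-number data.
lower_limits = [50, 60, 70, 80]
upper_limits = [59, 69, 79, 89]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]  # 10

# Class midpoints: (lower limit + upper limit) / 2 for each class.
class_midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
# [54.5, 64.5, 74.5, 84.5]

# Class boundaries: split the gap between one class's upper limit and the
# next class's lower limit, and extend the same pattern at both ends.
gap = lower_limits[1] - upper_limits[0]  # 1
class_boundaries = [lower_limits[0] - gap / 2] + [hi + gap / 2 for hi in upper_limits]
# [49.5, 59.5, 69.5, 79.5, 89.5]

print(class_width, class_midpoints, class_boundaries)
```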

frequency distribution

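a table that shows how data are partitioned among several classes by listing each class along with the number (frequency) of data values in it; helpful in organizing and summarizing data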
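
A minimal Python sketch of building a frequency distribution. The data values and the classes of width 10 starting at 50 are assumptions for illustration; `class_label` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical whole-number data values.
data = [52, 57, 61, 64, 68, 70, 73, 75, 79, 81, 88]
lower_limits = [50, 60, 70, 80]
class_width = 10

def class_label(value):
    """Return the class (as 'lower-upper') that a value belongs to."""
    index = (value - lower_limits[0]) // class_width
    lo = lower_limits[index]
    return f"{lo}-{lo + class_width - 1}"

# Count how many data values fall in each class.
frequencies = Counter(class_label(v) for v in data)

for label in sorted(frequencies):
    print(label, frequencies[label])
# 50-59 2
# 60-69 3
# 70-79 4
# 80-89 2
```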

histogram

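a graph of adjacent bars of equal width, with classes of quantitative data on the horizontal scale and frequencies on the vertical scale; often a better tool than a frequency distribution for seeing the shape of the data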

skewed data

a distribution of data is skewed if it is not symmetric and extends more to one side than to the other

skewed right (positively skewed)

have a longer right tail

skewed left (negatively skewed)

have a longer left tail

normal distribution

normal if the pattern of the points in the normal quantile plot is reasonably close to a straight line

not a normal distribution

if the points in the normal quantile plot do not lie reasonably close to a straight line
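
A rough Python sketch of the numbers behind a normal quantile plot. The data are hypothetical. The cumulative areas 1/(2n), 3/(2n), and so on are one common convention, and using the correlation between data values and z scores as a stand-in for "close to a straight line" is an assumption, not a formal test.

```python
import math
from statistics import NormalDist

# Hypothetical data, sorted from low to high.
data = sorted([61, 64, 66, 67, 68, 69, 70, 71, 73, 76])
n = len(data)

# z score paired with each sorted value, using cumulative areas
# 1/(2n), 3/(2n), 5/(2n), ... under the standard normal curve.
z_scores = [NormalDist().inv_cdf((2 * i + 1) / (2 * n)) for i in range(n)]

def pearson_r(xs, ys):
    """Linear correlation coefficient of paired values."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# The plot's points are the (data value, z score) pairs; r near 1
# suggests they fall reasonably close to a straight line.
points = list(zip(data, z_scores))
r = pearson_r(data, z_scores)
print(points, r)
```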

measure of center

a value at the center or middle of a data set

mean

the average: the sum of the data values divided by the number of data values

median

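the middle value when the data are arranged in order; half of the values fall at or below it and half at or above it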

mode

the value that occurs with the greatest frequency (the most repeated value)
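
The three measures of center can be checked with Python's standard `statistics` module. The six data values are made up for illustration.

```python
import statistics

# Hypothetical data values, already sorted.
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # (2+3+3+5+7+10) / 6 = 5
median_value = statistics.median(data)  # middle of 3 and 5 = 4.0
mode_value = statistics.mode(data)      # 3 appears most often

print(mean_value, median_value, mode_value)
```

With an even number of values, the median is the mean of the two middle values, which is why it comes out as 4.0 here even though 4 is not in the data.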