Statistics
The science of planning studies and experiments; obtaining data; and then interpreting those data in order to draw conclusions based on them
Population
The complete collection of ALL measurements or data that are being considered
Census
The collection of data from every member of the population
Sample
A subcollection of members selected from a population
Voluntary response sample
The respondents themselves decide whether to be included.
Parameter
Numerical measurement describing some characteristic of a POPULATION
Statistic
Numerical measurement describing some characteristic of a SAMPLE
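The parameter/statistic distinction can be sketched in a few lines of Python. This is only an illustration: the population of ages is made up, and the names `population_ages`, `population_mean`, `sample_ages`, and `sample_mean` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
population_ages = [random.randint(18, 80) for _ in range(1000)]

# Parameter: a number computed from EVERY member of the population.
population_mean = statistics.mean(population_ages)

# Statistic: the same kind of number, computed from a SAMPLE only.
sample_ages = random.sample(population_ages, 50)
sample_mean = statistics.mean(sample_ages)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, since the statistic depends on which 50 members happened to be drawn.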
Quantitative Data
numbers representing counts or measurements
(ages of survey respondents)
Quantitative Data is also known as
Numerical data
categorical data
names or labels that are not numbers representing counts or measurements
(political party affiliations of survey respondents)
categorical data is also known as
qualitative or attribute data
Discrete Data
values are quantitative and the number of values is finite or countable.
(Number of eggs in a carton - you can't have half an egg)
Continuous data
infinitely many possible quantitative values, where the collection of values is not countable.
(amount of gasoline in the tank)
What are the levels of measurement?
Nominal
Ordinal
Interval
Ratio
Nominal Level of Measurement
Data that consist of names, labels, or categories only. Cannot be arranged in an ordering scheme
(eye color)
Ordinal Level of Measurement
can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless.
(college ranking)
Interval Level of Measurement
can be arranged in order, and differences between data values can be found and are meaningful. Do not have a natural zero starting point at which none of the quantity is present.
(body temp)
Ratio Level of Measurement
can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point. Differences and ratios are both meaningful.
(Heights, lengths, distances, volumes)
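One way to see why ratios are meaningful only at the ratio level is to change units and check whether the ratio survives. The sketch below uses made-up values; the helper `celsius_to_fahrenheit` and the constant `METERS_TO_FEET` are names introduced here for illustration.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval level: temperature in degrees Celsius (the zero point is arbitrary).
a_c, b_c = 20.0, 40.0
ratio_c = b_c / a_c
ratio_f = celsius_to_fahrenheit(b_c) / celsius_to_fahrenheit(a_c)

# Ratio level: length (zero means "no length at all").
METERS_TO_FEET = 3.28084  # approximate conversion factor
a_m, b_m = 2.0, 4.0
ratio_m = b_m / a_m
ratio_ft = (b_m * METERS_TO_FEET) / (a_m * METERS_TO_FEET)

print("temperature ratio in C:", ratio_c)   # 2.0
print("temperature ratio in F:", ratio_f)   # about 1.53, so "twice as hot" is not meaningful
print("length ratio in m: ", ratio_m)       # 2.0
print("length ratio in ft:", ratio_ft)      # still about 2.0
```

The temperature "ratio" changes with the unit because neither scale has a natural zero, while the length ratio stays the same in any unit.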
Observational study
observe and measure specific characteristics, but don't attempt to modify the subjects being studied
Experiment
apply some treatment and then proceed to observe its effects on the subjects.
What are the two basic ways to obtain data?
Observational study
Experiments
Simple Random Sample
A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
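The "every possible sample of the same size" idea can be checked on a tiny example. The sketch below assumes a made-up population of four labels and uses Python's standard `random.sample`; the names `population`, `possible_samples`, and `counts` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

random.seed(1)  # fixed seed so the sketch is repeatable

population = ["A", "B", "C", "D"]  # hypothetical population
n = 2                              # sample size

# All possible samples of size n (order ignored): 6 of them.
possible_samples = list(combinations(population, n))

# Draw many simple random samples and count how often each one appears.
counts = Counter()
for _ in range(6000):
    drawn = tuple(sorted(random.sample(population, n)))
    counts[drawn] += 1

for sample in possible_samples:
    print(sample, counts[sample])
```

Each of the six possible samples should appear roughly 1,000 times, which is what "same chance of being chosen" looks like in practice.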
Random Sampling
Each member of the population has an equal chance of being selected. Computers are often used to generate random telephone numbers.
Systematic sampling
Select some starting point and then select every kth element in the population (for example, every 3rd person).
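A minimal sketch of systematic sampling, assuming a made-up population of 30 numbered people and k = 3. Choosing the starting point at random within the first k elements is an assumption of this sketch; the definition above only says "some starting point".

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

population = list(range(1, 31))  # hypothetical population: people numbered 1..30
k = 3                            # select every 3rd person

# Assumption: pick the starting position at random among the first k elements.
start = random.randrange(k)

# Slice with step k: the start element, then every kth element after it.
systematic_sample = population[start::k]

print("start position:", start)
print("sample:", systematic_sample)
```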
Convenience sampling
Use results that are easy to get
Stratified sampling
Subdivide the population into at least two different subgroups (strata) so that subjects within the same subgroup share the same characteristics, then draw a sample from each subgroup.
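Stratified sampling can be sketched as "group first, then sample inside every group". The population below (students tagged by class year) is made up, and drawing the same number from each stratum is an assumption; a proportional allocation would be another reasonable choice.

```python
import random
from collections import defaultdict

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical population: (person_id, class_year) pairs.
years = ["freshman"] * 40 + ["sophomore"] * 30 + ["junior"] * 20 + ["senior"] * 10
population = list(enumerate(years))

# Step 1: subdivide the population into strata that share a characteristic.
strata = defaultdict(list)
for person_id, year in population:
    strata[year].append(person_id)

# Step 2: draw a random sample from EVERY stratum (here, 5 from each).
per_stratum = 5
stratified_sample = {
    year: random.sample(members, per_stratum) for year, members in strata.items()
}

for year, chosen in stratified_sample.items():
    print(year, chosen)
```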
Cluster sampling
Divide the population into sections (clusters), then randomly select some of those clusters and choose all members from the selected clusters.
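For contrast with stratified sampling, here is a minimal sketch of cluster sampling: only some groups are selected, but everyone in a selected group is included. The classrooms and student labels are made up for illustration.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10 classrooms (clusters) with 5 students each.
clusters = {
    f"room_{r}": [f"student_{r}_{s}" for s in range(5)] for r in range(10)
}

# Step 1: randomly select some of the clusters (here, 3 of the 10).
chosen_clusters = random.sample(sorted(clusters), 3)

# Step 2: take ALL members of each selected cluster.
cluster_sample = [
    student for room in chosen_clusters for student in clusters[room]
]

print("chosen clusters:", chosen_clusters)
print("sample size:", len(cluster_sample))
```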
Retrospective study
Go back in time to collect data over some past period.
Cross-sectional study
Data are measured at one period in time
Prospective study
Go forward in time and observe groups sharing common factors, such as smokers and nonsmokers
Prospective study is also known as
Longitudinal or cohort study
Confounding
Occurs in an experiment when the investigators are not able to distinguish among the effects of different factors
Sampling error
Occurs when the sample has been selected with a random method, but there is a discrepancy between a sample result and the true population result
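Sampling error can be illustrated by drawing several random samples from the same population and comparing each sample mean with the true population mean. The heights below are simulated, not real data, and the names `population`, `true_mean`, and `errors` are hypothetical.

```python
import random
import statistics

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Each sample is chosen with a random method, yet its mean still differs
# from the population mean. That discrepancy is the sampling error.
errors = []
for _ in range(5):
    sample = random.sample(population, 100)
    errors.append(statistics.mean(sample) - true_mean)

for i, error in enumerate(errors, start=1):
    print(f"sample {i}: sampling error = {error:+.2f} cm")
```

The errors vary in sign and size from sample to sample even though nothing was done wrong, which is what separates sampling error from nonsampling error.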
Nonsampling error
Is the result of human error, including such factors as wrong data entries, computing errors, questions with biased wording, false data provided by respondents, forming biased conclusions, or applying statistical methods that are not appropriate for the circumstances
Nonrandom sampling error
Is the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample