Stats Ch. 1

Statistics

The science of planning studies and experiments; obtaining data; and then interpreting those data in order to draw conclusions based on them

Population

The complete collection of ALL measurements or data that are being considered

Census

The collection of data from every member of the population

Sample

A subcollection of members selected from a population

Voluntary response sample

The respondents themselves decide whether to be included.

Parameter

numerical measurement describing some characteristic of a POPULATION

Statistic

Numerical measurement describing some characteristic of a SAMPLE
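
The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is made-up data (hypothetical ages of 1,000 club members), and the names `population`, `sample`, `population_mean`, and `sample_mean` are illustrative only.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
rng = random.Random(0)
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every member of the POPULATION.
population_mean = statistics.mean(population)

# Statistic: computed from a SAMPLE drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, because the statistic is based on only part of the population.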

Quantitative Data

numbers representing counts or measurements
(ages of survey respondents)

Quantitative Data is also known as

Numerical data

categorical data

names or labels that are not numbers representing counts or measurements
(political party affiliations of survey respondents)

categorical data is also known as

qualitative or attribute data

Discrete Data

values are quantitative and the number of values is finite or countable.
(Number of eggs in a carton - you can't have half an egg)

Continuous data

infinitely many possible quantitative values, where the collection of values is not countable.
(amount of gasoline in the tank)

What are the levels of measurement?

Nominal
Ordinal
Interval
Ratio

Nominal Level of Measurement

Data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme.
(eye color)

Ordinal Level of Measurement

can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless.
(college ranking)

Interval Level of Measurement

can be arranged in order, and differences between data values can be found and are meaningful. Do not have a natural zero starting point at which none of the quantity is present.
(body temperature in degrees Fahrenheit or Celsius)

Ratio Level of Measurement

can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point. Differences and ratios are both meaningful.
(Heights, lengths, distances, volumes)
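
A small worked example may help separate the interval and ratio levels. The temperatures and heights below are made-up values. The Kelvin conversion is added only for illustration, since Kelvin is a temperature scale that does have a true zero.

```python
# Interval level: Celsius temperatures have no natural zero,
# so differences are meaningful but ratios are not.
temp_a_c, temp_b_c = 10.0, 20.0
difference_c = temp_b_c - temp_a_c   # meaningful: 10 degrees warmer
naive_ratio = temp_b_c / temp_a_c    # 2.0, but 20 C is NOT "twice as hot" as 10 C

# The same two temperatures on the Kelvin scale (true zero) show why:
temp_a_k, temp_b_k = temp_a_c + 273.15, temp_b_c + 273.15
kelvin_ratio = temp_b_k / temp_a_k   # roughly 1.035, not 2

# Ratio level: heights have a natural zero, so ratios are meaningful.
height_a_cm, height_b_cm = 90.0, 180.0
height_ratio = height_b_cm / height_a_cm  # 2.0: genuinely twice as tall

print(difference_c, naive_ratio, round(kelvin_ratio, 3), height_ratio)
```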

Observational study

observe and measure specific characteristics, but don't attempt to modify the subjects being studied

Experiment

apply some treatment and then proceed to observe its effects on the subjects.

What are the two basic ways to obtain data?

Observational study
Experiments

Simple Random Sample

A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
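
As a minimal sketch, Python's standard `random.sample` draws without replacement so that every subset of the requested size is equally likely. The five names and the sample size of 2 are hypothetical.

```python
import random
from itertools import combinations

population = ["Ann", "Ben", "Cam", "Dee", "Eli"]  # hypothetical population
n = 2                                             # sample size

# Every possible sample of size n. In a simple random sample,
# each of these has the same chance of being chosen.
all_samples = list(combinations(population, n))
print(len(all_samples), "possible samples of size", n)

# Draw one simple random sample (without replacement).
rng = random.Random(42)  # seeded only so the sketch is repeatable
srs = rng.sample(population, n)
print("selected:", srs)
```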

Random Sampling

Each member of the population has an equal chance of being selected. Computers are often used to generate random telephone numbers.

Systematic sampling

Select some starting point and then select every kth element in the population (for example, every 3rd person).
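
Systematic sampling maps naturally onto list slicing. The helper below is a hypothetical sketch, not a standard library function. It assumes the starting point is picked at random from the first k elements, which is one common choice but is not required by the definition.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a starting point among the first k elements, then take every kth element."""
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    start = rng.randrange(k)      # assumption: random start in positions 0..k-1
    return population[start::k]   # every kth element from the starting point

# Hypothetical population of 30 people, identified by number.
people = list(range(1, 31))
print(systematic_sample(people, 3, seed=1))  # every 3rd person
```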

Convenience sampling

Use results that are easy to get

Stratified sampling

Subdivide the population into at least two different subgroups (strata) so that subjects within the same subgroup share the same characteristics, then draw a sample from each subgroup.
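
A rough sketch of stratified sampling: group the subjects by a shared characteristic, then draw a random sample from every group. The function name, the `stratum_of` parameter, and the student data are all hypothetical. Drawing the same number from each stratum is an assumption made for simplicity.

```python
import random
from collections import defaultdict

def stratified_sample(subjects, stratum_of, per_stratum, seed=None):
    """Group subjects into strata, then randomly sample from EACH stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    sample = []
    for members in strata.values():
        # Take per_stratum members, or the whole stratum if it is smaller.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical students tagged with their class year.
students = [(f"student{i}", "freshman" if i % 2 else "senior") for i in range(20)]
print(stratified_sample(students, lambda s: s[1], per_stratum=3, seed=0))
```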

Cluster sampling

Divide the population into sections (clusters), then randomly select some of those clusters and choose all members from the selected clusters.
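
Cluster sampling differs from stratified sampling in that only some groups are chosen, and then everyone in the chosen groups is included. The sketch below uses a hypothetical helper and made-up precinct data to show that pattern.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters, then keep ALL members of the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # pick cluster names
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical voters grouped by precinct.
precincts = {
    "precinct_a": ["a1", "a2", "a3"],
    "precinct_b": ["b1", "b2"],
    "precinct_c": ["c1", "c2", "c3", "c4"],
    "precinct_d": ["d1"],
}
chosen, sample = cluster_sample(precincts, 2, seed=0)
print(chosen, sample)
```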

Retrospective study

Go back in time to collect data over some past period.

Cross-sectional study

Data are observed, measured, and collected at one point in time.

Prospective study

Go forward in time and observe groups sharing common factors, such as smokers and nonsmokers.

Prospective study is also known as

Longitudinal or cohort study

Confounding

Occurs in an experiment when the investigators are not able to distinguish among the effects of different factors

Sampling error

Occurs when the sample has been selected with a random method, but there is a discrepancy between a sample result and the true population result
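
Sampling error can be illustrated with a short simulation. The population here is made-up data (hypothetical heights in centimeters), and `sampling_error` is an illustrative helper rather than a library function.

```python
import random
import statistics

def sampling_error(population, n, rng):
    """Return (sample mean - population mean) for one random sample of size n."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - statistics.mean(population)

rng = random.Random(7)
# Hypothetical population: 10,000 heights in cm (made-up data).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# The sample is chosen randomly each time, yet the sample mean still
# differs from the true population mean. That discrepancy is sampling error.
for n in (10, 100, 1000):
    print(n, round(sampling_error(population, n, rng), 3))
```

Larger samples tend to give smaller discrepancies, although any single sample can be an exception.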

Nonsampling error

Is the result of human error, including such factors as wrong data entries, computing errors, questions with biased wording, false data provided by respondents, forming biased conclusions, or applying statistical methods that are not appropriate for the circumstances

Nonrandom sampling error

Is the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample