Statistics Vernia 1.1-1.4


Data

observations (such as measurements, genders, or survey responses) that have been collected


Statistics

a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data


Population

the complete collection of ALL elements to be studied


Census

the collection of data from EVERY element in a population


Sample

a SUBCOLLECTION of elements selected from a population

(Statistical thinking) When analyzing collected data, the following factors should be considered:

Context, Source, Sampling Method, Conclusions and Practical implications

Bad sample examples:

Self-selected survey, GIGO, small samples

Self-selected survey/voluntary response sample

one in which the respondents themselves decide whether to be included


GIGO

garbage in, garbage out

(Statistical Significance) is achieved when

you get a result that is very unlikely to occur by chance (for example, winning the lottery twice)


Parameter

a numerical measurement describing some characteristic of a POPULATION


Statistic

a numerical measurement describing some characteristic of a SAMPLE
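The difference between a population measurement and a sample measurement can be seen in a small sketch. The population below (ages in an imaginary club) is made up for illustration, and the sample size of 50 is an arbitrary assumption.

```python
import random
from statistics import mean

# Hypothetical population: ages of all 1,000 members of an imaginary club.
random.seed(0)
population = [random.randint(18, 80) for _ in range(1000)]

# Measurement computed from EVERY element of the population.
population_mean = mean(population)

# Measurement computed from a sample (a subcollection of the population).
sample = random.sample(population, 50)
sample_mean = mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two values are usually close but not identical, because the sample leaves out most of the population.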

Quantitative data

consist of NUMBERS representing counts or measurements

Discrete data

data that result from either a finite number of possible values or a countable number of possible values (1, 2, 3, ...)

Continuous data

data that result from infinitely many possible values that can be associated with points on a continuous scale in such a way that there are no gaps, interruptions, or jumps (measurements such as weight or height)

Qualitative (categorical or attribute) data:

nonnumeric data that can be separated into different CATEGORIES


Nominal level of measurement

characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme, or the order is not meaningful (favorite color, Social Security number, etc.)


Ordinal level of measurement

involves data that may be arranged in some order, but differences (subtraction) between data values either cannot be determined or are meaningless (T-shirt sizes, low/medium/high ratings)


Interval level of measurement

like the ordinal level, with the additional property that we can determine meaningful amounts of differences between data. However, data at this level do not have a natural zero starting point (subtraction is meaningful but ratios are not, ex: temperature)


Ratio level of measurement

the interval level modified to include the natural zero starting point, where zero indicates that none of the quantity is present. Differences and ratios are both meaningful. Ratio test: if one number is twice the other, is the quantity being measured also twice as much?
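The ratio test can be tried on the temperature example. The sketch below assumes two made-up Celsius readings and converts them to kelvins, a scale whose zero is a natural zero.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to kelvins (zero kelvin is a natural zero)."""
    return c + 273.15

a_c, b_c = 20.0, 10.0  # illustrative readings, not real data

# Differences are meaningful on both scales: a 10-degree gap either way.
diff_celsius = a_c - b_c
diff_kelvin = celsius_to_kelvin(a_c) - celsius_to_kelvin(b_c)

# Ratios are not: 20 is "twice" 10 only because of where Celsius puts its zero.
ratio_celsius = a_c / b_c
ratio_kelvin = celsius_to_kelvin(a_c) / celsius_to_kelvin(b_c)

print(f"difference: {diff_celsius:.1f} C, {diff_kelvin:.1f} K")
print(f"ratio on Celsius scale: {ratio_celsius:.3f}")
print(f"ratio on Kelvin scale:  {ratio_kelvin:.3f}")
```

The Celsius ratio of 2 disappears once the zero point is moved, so 20 degrees Celsius is not twice as hot as 10 degrees Celsius.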

Observational study

we observe and measure specific characteristics without attempting to modify the subjects being studied


Experiment

we apply some treatment and then observe its effects on the subjects

systematic sampling

select some starting point and then select every kth element in the population (e.g., calling every kth person listed in the phone book)
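A minimal sketch of this procedure, assuming the population is held in a list. The function name `systematic_sample` and the `phone_book` entries are hypothetical.

```python
import random

def systematic_sample(population, k, start=None):
    """Return every k-th element, beginning at a starting index below k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if start is None:
        start = random.randrange(k)  # random starting point among the first k
    return population[start::k]

# Hypothetical phone book with 100 listings.
phone_book = [f"person_{i}" for i in range(1, 101)]

# Fixed start so the result is reproducible: the 3rd listing, then every 10th after it.
picked = systematic_sample(phone_book, k=10, start=2)
print(picked)
```

Leaving `start` unset picks the starting point at random, which is the usual way to avoid always favoring the same positions in the list.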

convenience sampling

use results that are readily available or very easy to get (family and friends)

stratified sampling

subdivide the population into subgroups (strata) whose members share the same characteristics, then draw a simple random sample from each subgroup (e.g., sampling students at each of several different schools in Indiana)
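One way to sketch this in code, assuming each record carries the characteristic used to form the subgroups. The `stratified_sample` helper and the student records are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, n_per_stratum, seed=None):
    """Group records by key, then draw a simple random sample from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for members in strata.values():
        size = min(n_per_stratum, len(members))  # never ask for more than exist
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population: 100 students, 25 in each class year.
years = ["freshman", "sophomore", "junior", "senior"]
students = [{"id": i, "year": y} for i, y in enumerate(years * 25)]

chosen = stratified_sample(students, key=lambda s: s["year"], n_per_stratum=5, seed=1)
print(len(chosen))
```

Every subgroup is guaranteed to appear in the result, which a single simple random sample of the whole population does not promise.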

cluster sampling

divide the population into sections (clusters), randomly select some of those clusters, and then choose all members from the selected clusters
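A short sketch of the same steps, assuming the sections are stored as a dictionary from section name to members. The `cluster_sample` helper and the classroom data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep ALL members of the picked clusters."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen_names for m in clusters[name]]
    return chosen_names, members

# Hypothetical population: 8 classrooms with 5 students each.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(1, 6)]
    for r in range(1, 9)
}

chosen_rooms, surveyed = cluster_sample(classrooms, n_clusters=2, seed=3)
print(chosen_rooms)
print(len(surveyed))
```

The contrast with stratified sampling is that here only some sections are used, but everyone inside a selected section is included.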


Replication

the repetition of an experiment on a sample of subjects that is large enough so that we can see the true nature of any effects


Confounding

occurs in an experiment when you are not able to distinguish among the effects of different factors

sampling error

the difference between a sample result and the true population result; such an error results from chance sample fluctuations
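Chance fluctuation can be illustrated with simulated data. The population of heights below is generated at random purely for illustration, and the sample size of 100 is an arbitrary assumption.

```python
import random
from statistics import mean

random.seed(42)

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [random.gauss(170, 10) for _ in range(10_000)]
true_mean = mean(population)

# Draw several random samples and record (sample result - population result).
errors = []
for _ in range(5):
    sample = random.sample(population, 100)
    errors.append(mean(sample) - true_mean)

for i, err in enumerate(errors, start=1):
    print(f"sample {i}: sampling error = {err:+.2f} cm")
```

Each sample is drawn correctly, yet each gives a slightly different error; that variation comes from chance alone.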

nonsampling error

occurs when sample data are incorrectly collected, recorded, or analyzed (e.g., collecting a biased sample)

nonrandom sampling error

the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample