Statistics
Deals with the collection, classification, and interpretation of data in order to draw conclusions
Descriptive statistics
Describes a data set (sample)
STATISTIC -- SAMPLE
Inferential statistics
Draws conclusions about a larger set (population) -- obtained from a part (sample) of the population
PARAMETER -- POPULATION
Experimental unit
Unit, individual
Population
A set (or totality or collection) of all units of interest [ex. totality of all registered voters]
Totality
Proportion or percentage
Population size (N)
Number of units in a population [ex. 120 million]
Variable
Characteristic or property of an individual population unit [ex. opinion of a voter]
Measurement
Process to assign numbers (labels) to variables of interest
Parameter
Summary measure computed to describe a characteristic of a population [as opposed to opinion of each voter]
Census
Measurement of every unit in a population [feasible when the population is small enough to measure every unit]
Representative sample
Exhibits properties typical of those possessed by the target population
Random sample
Every individual in the population has the same chance of selection
Simple random sampling
Every possible subset of size n of the population has the same chance of selection -- n units drawn from a population of N
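A minimal sketch of simple random sampling, assuming the population is a list of hypothetical unit IDs (the function name, the IDs, and the seed are illustrative, not from the notes). Python's random.sample draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every size-n subset is equally likely."""
    rng = random.Random(seed)  # private generator so the draw is reproducible
    return rng.sample(population, n)

# Hypothetical population of N = 100 unit IDs.
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```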
Stratified sample
Obtained by separating the population into homogeneous, non-overlapping groups (strata) and then obtaining a simple random sample from each stratum
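One way to sketch stratified sampling, assuming each unit carries a label that identifies its stratum. The voter records, the "urban"/"rural" strata, and the helper name are hypothetical and only serve the illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Split units into strata, then take a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # non-overlapping groups
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical voters: (id, stratum label).
voters = [("v%d" % i, "urban" if i % 3 else "rural") for i in range(30)]
picked = stratified_sample(voters, stratum_of=lambda v: v[1], n_per_stratum=4, seed=1)
for name, members in picked.items():
    print(name, members)
```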
Systematic sample
Obtained by selecting every kth individual from population
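A small sketch of systematic sampling. The random starting position is an assumption added here (the definition above only says "every kth individual"); it is a common way to avoid always starting at the first unit.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed random start in 0..k-1
    return population[start::k]

# Hypothetical ordered population of N = 100 unit IDs.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=0)
print(sample)
```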
Cluster sample
Obtained by selecting all individuals within randomly selected groups (clusters) of individuals
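A sketch of cluster sampling, assuming the population is already organized as a mapping from a cluster name to its members. The cluster names and members are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every individual inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters, e.g. city blocks and their households.
clusters = {
    "block_A": ["a1", "a2", "a3"],
    "block_B": ["b1", "b2"],
    "block_C": ["c1", "c2", "c3", "c4"],
    "block_D": ["d1"],
}
chosen, units = cluster_sample(clusters, n_clusters=2, seed=5)
print(chosen, units)
```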
Convenience sample
One in which the individuals in the sample are easily obtained
Statistic (estimate)
Summary measure that is computed to describe a characteristic from only a sample of the population
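To contrast a parameter with a statistic, the sketch below computes the same summary measure (the mean) once on a whole population and once on a sample. The ages are randomly generated, hypothetical data, and the variable names are illustrative.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: ages of N = 1000 units.
population = [rng.randint(18, 90) for _ in range(1000)]
parameter = statistics.mean(population)  # describes the population

# A sample of n = 50 units drawn from that population.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)      # estimate computed from the sample only

print("parameter (population mean):", parameter)
print("statistic (sample mean):", statistic)
```

The statistic will usually be close to the parameter but not equal to it, which is why an inference needs a measure of reliability.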
Statistical inference
Generalization about a population based on information contained in a sample -- using info contained in the smaller sample to draw conclusions about the larger population
Measure of reliability
Statement about the degree of uncertainty associated with a statistical inference
Data
List of measurements (observations) of a variable
Ex. observations - M or F; variable - gender
Classification of variables and data
Single variable - univariate data set
Two variables - bivariate data set
More than two variables - multivariate data set
Quantitative (numerical) data
Measurements that can be recorded on a numerical scale; arithmetic operations provide meaningful results
Ex. Age; GPA; Salary
Qualitative (categorical) data
Measurements that cannot be recorded on a numerical scale [instead - categories]; arithmetic operations do not provide meaningful results
Nominal data
Qualitative
Data that consist of names, labels, or categories - UNORDERED
Ex. gender (M or F)
Ordinal data
Qualitative
Data that consist of names, labels, categories - ORDERED
Ex. health status (poor, good, very good, etc.)
Discrete variable
Quantitative
Countable number of possible values (e.g. only the whole numbers in [0,6])
Ex. spilled marbles; number of emails received by one student
Continuous variable
Quantitative
Infinitely many possible values that cannot be counted; can take on any value in a certain interval (e.g. every number in [0,6])
Ex. spilled coffee; AGE; HEIGHT; WEIGHT; SPEED OF A CAR
Interval level of measurement
Has the properties of nominal (categories) and ordinal (categories can be arranged in some order) data, and in addition differences in values of the variable are meaningful (addition + subtraction); zero does not mean absence of the quantity
Ratio level of measurement
Has the properties of nominal (categories), ordinal (categories can be arranged in some order), and interval (differences are meaningful; add/sub) data, and in addition ratios of values are meaningful (mul + div); zero means the absence of the quantity
Summary of levels of measurement
Nominal - categories only
Ordinal - categories with some order
Interval - differences are meaningful, but no natural starting point
Ratio - differences and ratios are meaningful and there is a natural starting point
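The interval/ratio distinction can be illustrated with temperature, an example added here rather than taken from the notes: Celsius is an interval scale (0 does not mean "no temperature"), while Kelvin is a ratio scale (0 is a natural starting point). Differences agree on both scales, but ratios computed on Celsius values are not meaningful.

```python
def celsius_to_kelvin(c):
    """Shift Celsius to Kelvin, whose zero is a true absence of the quantity."""
    return c + 273.15

a_c, b_c = 10.0, 20.0  # two temperatures in degrees Celsius

# Differences are meaningful on an interval scale: same gap on both scales.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios are not: the Celsius ratio suggests "twice as hot",
# but on the ratio scale (Kelvin) the ratio is only slightly above 1.
naive_ratio = b_c / a_c
kelvin_ratio = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)

print(diff_c, diff_k)
print(naive_ratio, kelvin_ratio)
```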
Observational study
Observe units in natural settings without trying to influence the outcome of the study
Designed experiment
Strict control over the experiment, the units and the values of variables in the experiment
Lurking variable
Additional variable, not included in the study, that influences the two variables being studied
Explanatory variable
Ex. frequency/level of cellphone usage
Response variable
Ex. whether or not brain cancer was contracted
Sample without replacement
Individual selected is then removed from population and cannot be chosen again
Sample with replacement
Selected individual is placed back into population
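A short sketch contrasting the two schemes with Python's standard library: random.sample never repeats a unit (without replacement), while random.choices may select the same unit more than once (with replacement). The five-unit population and the seed are hypothetical.

```python
import random

rng = random.Random(3)
population = ["a", "b", "c", "d", "e"]  # hypothetical units

# Without replacement: each unit can be chosen at most once,
# so the sample size cannot exceed the population size.
without = rng.sample(population, 5)

# With replacement: each unit goes back after selection,
# so repeats are possible and the sample can be larger than the population.
with_repl = rng.choices(population, k=8)

print(without)
print(with_repl)
```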