statistics
branch of science that deals with DATA ANALYSIS: the science of collecting, organizing, and analyzing data for the purpose of estimation and making inferences.
population
all subjects possessing a common characteristic that is being studied
sample
A subgroup or subset of the population.
parameter
characteristic or measure obtained from a population
statistic
characteristic or measure obtained from a sample
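The parameter/statistic distinction can be illustrated with a short sketch. The population below is made up for illustration (heights generated from random numbers), and the variable names are assumptions, not part of the definitions above.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 people, generated for illustration only.
rng = random.Random(0)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: a measure obtained from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure obtained from a sample (a subset of the population).
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal; inferential statistics is about using the second to say something about the first.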
descriptive statistics
collection, organization, summary, and presentation of DATA. Use GRAPHICAL DISPLAYS and NUMERIC summarizations to represent DATA.
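A minimal sketch of the numeric-summary side of descriptive statistics, using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

# Hypothetical data set: eight quiz scores (illustrative values only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = statistics.mean(scores)       # arithmetic average
median_score = statistics.median(scores)   # middle value of the sorted data
spread = statistics.pstdev(scores)         # standard deviation, treating the data as a whole population

print(mean_score, median_score, spread)
```

Graphical displays (histograms, bar charts) summarize the same data visually; they are omitted here to keep the sketch dependency-free.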
inferential statistics
deals with procedures used to make inferences about a population parameter from information contained in a sample.
data
values that arise from observing characteristics of a selected group of individuals. The characteristics that are observed are called VARIABLES.
variables
characteristic or attribute that can assume different values e.g. major, height, age, weight, gender.
qualitative
variables which assume non-numerical values. e.g. eye color, first name, favorite movie
quantitative
variables which assume numerical values. e.g. height, weight, income. Either DISCRETE or CONTINUOUS.
discrete
variables which assume a finite or countable number of possible values. obtained by counting.
continuous
variables which can assume any value within an interval, so there are infinitely many possible values. obtained by measurement. e.g. height of a person, amount of time spent studying, weight of an apple
levels of measurement
nominal, ordinal, interval, ratio
nominal
qualitative only. data values serve as labels, but labels have no meaningful order. e.g. hair color (blond, black, red, brown)
ordinal
qualitative or quantitative. Data values serve as labels, and the labels have a natural meaningful order. Differences between values are meaningless. e.g. class (freshman, sophomore, junior, senior)
interval
always quantitative. numerical data values, so they have a natural meaningful order, and differences are meaningful. ratios are meaningless because there is no true zero. e.g. temperature in degrees Celsius
ratio
always quantitative. numerical data values with a natural order and a true zero, so both differences and ratios between values are meaningful. e.g. height, weight
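A small sketch of why ratios are meaningless on an interval scale but meaningful on a ratio scale. The temperature and weight values are made up, and the helper `celsius_to_fahrenheit` and the constant `LB_PER_KG` (an approximate conversion factor) are introduced here only for illustration.

```python
# Interval scale: temperature in degrees Celsius (the zero point is arbitrary).
c_low, c_high = 10.0, 20.0

def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

f_low, f_high = celsius_to_fahrenheit(c_low), celsius_to_fahrenheit(c_high)

# Differences are meaningful: a 10-degree C gap is always an 18-degree F gap.
print(c_high - c_low, f_high - f_low)   # 10.0 18.0

# Ratios depend on the unit chosen, so "twice as hot" is not meaningful.
print(c_high / c_low, f_high / f_low)   # 2.0 1.36

# Ratio scale: weight has a true zero, so the ratio survives a change of unit.
kg_low, kg_high = 10.0, 20.0
LB_PER_KG = 2.20462  # approximate conversion factor
ratio_kg = kg_high / kg_low
ratio_lb = (kg_high * LB_PER_KG) / (kg_low * LB_PER_KG)
print(ratio_kg, ratio_lb)
```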
sampling methods
random, systematic, stratified, cluster, convenience
random sampling
data collected using chance methods or random numbers. every member of the population has the same chance of being selected for the sample. e.g. telephone polling, drawing names from a hat.
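Drawing names from a hat can be sketched with Python's standard `random` module. The list of names and the sample size are made up for illustration.

```python
import random

# Hypothetical population: 100 names in the hat.
population = [f"person{i}" for i in range(100)]

rng = random.Random(42)  # seeded only so the illustration is repeatable

# Draw 10 names without replacement; each name has the same chance of being chosen.
sample = rng.sample(population, 10)
print(sample)
```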
systematic sampling
data obtained by selecting every kth object. e.g. choosing a sample of voters by taking every 25th voter from the county voter roll, or testing every 300th product from an assembly line.
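The voter-roll example can be sketched as follows. The roll of 1,000 numbered voters is hypothetical, and choosing the starting position at random is a common practice assumed here, not something the definition above requires.

```python
import random

# Hypothetical county voter roll: voters numbered 1 to 1000.
voter_roll = list(range(1, 1001))
k = 25  # take every 25th voter

rng = random.Random(0)
start = rng.randrange(k)  # assumed: random starting position within the first k voters

# Slice with a step of k to pick every kth voter from the starting position.
sample = voter_roll[start::k]
print(len(sample), sample[:3])
```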
stratified sampling
population is divided into groups (strata) according to some characteristic. Each stratum is sampled using one of the other sampling techniques.
example of stratified sampling
choosing 200 men and 200 women for a sample; grouping the population by income level and choosing a sample of low-, middle-, and high-income individuals.
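The income-level example can be sketched like this. The population, the way income levels are assigned, and the per-stratum sample size are all made up for illustration; simple random sampling is used inside each stratum, which is one of several possible choices.

```python
import random
from collections import defaultdict

# Hypothetical population: 3,000 people, each tagged with an income level.
levels = ["low", "middle", "high"]
population = [(person_id, levels[person_id % 3]) for person_id in range(3000)]

# Step 1: divide the population into strata by the chosen characteristic.
strata = defaultdict(list)
for person_id, level in population:
    strata[level].append(person_id)

# Step 2: sample each stratum separately (here, a simple random sample of 100).
rng = random.Random(1)
per_stratum = 100
sample = {level: rng.sample(members, per_stratum) for level, members in strata.items()}

for level in levels:
    print(level, len(sample[level]))
```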
cluster sampling
population is divided into groups (often geographically). Some groups are randomly selected, and all elements in those groups are included in the sample. e.g. randomly choose 10 polling stations in a city and exit poll all voters at those stations
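The polling-station example can be sketched as below. The city with 50 stations of 20 voters each is hypothetical.

```python
import random

# Hypothetical city: 50 polling stations (clusters), each with 20 voters.
stations = {
    station: [f"station{station}-voter{v}" for v in range(20)]
    for station in range(50)
}

rng = random.Random(7)

# Step 1: randomly select 10 whole clusters.
chosen_stations = rng.sample(sorted(stations), 10)

# Step 2: include every voter from each selected cluster.
sample = [voter for station in chosen_stations for voter in stations[station]]
print(len(chosen_stations), len(sample))
```

Unlike stratified sampling, only some groups are used, but every member of a chosen group is included.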
convenience sampling
choose individuals who are easy to include in the sample. e.g. internet polls, mail-in customer surveys.
observational study
observations and measurements of individuals conducted in a way that does not change the RESPONSE or the variable being measured.
experiment
a TREATMENT is deliberately imposed on the EXPERIMENTAL UNITS in order to observe a possible change in the response or variable being measured. A common way to assign treatments to experimental units is by using a random process.
treatment
level (amount) of factor applied to the experimental units
response
measured or observed traits in the experiment
experimental units
person, animal, plant, or thing studied by a researcher
completely randomized design
one in which treatments are randomly assigned to experimental units
CRD
treatments are randomly assigned to experimental units
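A minimal sketch of a completely randomized design. The 12 plants and three treatments are made up, and giving each treatment the same number of units is an assumption made here for simplicity.

```python
import random
from collections import Counter

# Hypothetical experimental units and treatments.
units = [f"plant{i}" for i in range(12)]
treatments = ["A", "B", "C"]

rng = random.Random(3)

# Shuffle the units, then deal treatments out in turn so that chance alone
# decides which unit receives which treatment.
shuffled = units[:]
rng.shuffle(shuffled)
assignment = {
    unit: treatments[i % len(treatments)]
    for i, unit in enumerate(shuffled)
}

print(Counter(assignment.values()))
```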
randomized complete block design
individuals are first sorted into BLOCKS, and treatments are randomly assigned to units within each block.
RCBD
individuals are first sorted into BLOCKS, and treatments are randomly assigned to units within each block.
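A minimal sketch of a randomized complete block design. The blocks (field locations), unit names, and treatments are hypothetical; each block holds exactly as many units as there are treatments, so every treatment appears once per block.

```python
import random

treatments = ["A", "B", "C"]

# Hypothetical blocks: units grouped by a shared characteristic (here, field location).
blocks = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2", "s3"],
    "east": ["e1", "e2", "e3"],
}

rng = random.Random(5)
assignment = {}

# Randomize separately inside each block.
for block_name, block_units in blocks.items():
    order = treatments[:]
    rng.shuffle(order)  # random order of treatments for this block only
    for unit, treatment in zip(block_units, order):
        assignment[unit] = (block_name, treatment)

for unit, (block_name, treatment) in assignment.items():
    print(block_name, unit, treatment)
```

Compared with a completely randomized design, the randomization is restricted: a unit can only be compared with units that share its block.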
data sources
secondary & primary
secondary data
data already available. e.g. Statistical Abstract of the United States. Advantage: less expensive. Disadvantage: may not satisfy your needs.
primary data
data which must be collected
methods of collecting primary data
telephone interview, mail questionnaires, door-to-door survey, mall intercept, new product registration, personal interview, experiments