Understandable Statistics - Chapter 1

Statistics

is the study of how to collect, organize, analyze, and interpret numerical information from data. You can say the 'power' of statistics is that they measure uncertainty.

Individuals

are the people or objects included in the study.

Variable

is a characteristic of the individual to be measured or observed.

Quantitative variable

has a value or numerical measurement for which operations such as addition or averaging make sense.

Qualitative variable

describes an individual by placing the individual into a category or group, such as male or female. The values are classifications, not numbers.

Population data

the data are from every individual of interest

Sample data

the data are from only some of the individuals of interest; sub-set of the population

Parameter

is a numerical measure that describes an aspect of a population; typically a fixed value that holds for the population at a given instant

Statistic

is a numerical measure that describes an aspect of a sample

Level of measurement: Nominal

applies to data that consist of names, labels, or categories (classifications). There are no implied criteria by which the data can be ordered from smallest to largest, and the data are not suitable for numeric calculations.

Level of measurement: Ordinal

applies to data that can be arranged in order (where the order is meaningful). However, differences between data values either cannot be determined or are meaningless.

Level of measurement: Interval

applies to data that can be arranged in order. In addition, differences between data values are meaningful. Zero may not be the correct starting point, so ratios are meaningless. Ex: 100 degrees is not twice as hot as 50 degrees. Celsius and Fahrenheit are in

Level of measurement: Ratio

applies to data that can be arranged in order. In addition, both differences between data values and ratios of data values are meaningful. Data at the ratio level have a true zero. Ex: time, money, distance
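
The contrast between the interval and ratio levels can be checked with a little arithmetic. The sketch below is only an illustration: the helper `fahrenheit_to_celsius` and the chosen values are assumptions, not part of the chapter. It shows that a temperature ratio changes when the scale changes, while a distance ratio does not.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

hot_f, cool_f = 100.0, 50.0
hot_c = fahrenheit_to_celsius(hot_f)
cool_c = fahrenheit_to_celsius(cool_f)

# Interval level: the ratio depends on where the scale puts its zero.
ratio_f = hot_f / cool_f   # 2.0 on the Fahrenheit scale
ratio_c = hot_c / cool_c   # about 3.78 on the Celsius scale

# Ratio level: distance has a true zero, so the ratio survives a unit change.
KM_PER_MILE = 1.609344
ratio_miles = 100.0 / 50.0
ratio_km = (100.0 * KM_PER_MILE) / (50.0 * KM_PER_MILE)

print(ratio_f, round(ratio_c, 2), ratio_miles, round(ratio_km, 2))
```

Because the two temperature ratios disagree, "twice as hot" is not a meaningful statement at the interval level.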

Descriptive statistics

involves methods of organizing, picturing, and summarizing information from samples or populations.

Inferential statistics

involves methods of using information from a sample to draw conclusions regarding the population.

Simple random sample

Every member of the population has an equal chance of being selected, and every sample (subset) of size n has an equal chance of being selected.
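
A minimal sketch of drawing a simple random sample, assuming the population can be listed in full. The function name `simple_random_sample` and the numbered population are made up for illustration; `random.sample` from the Python standard library draws without replacement so that every subset of the requested size is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct members chosen so every size-n subset is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: individuals numbered 1 through 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```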

Random-number table

make one yourself by writing the digits 0 through 9 on separate cards and mixing up these cards in a hat. Then draw a card, record the digit, return the card, and mix up the cards again. Draw another card, record the digit, and so on.
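
The card-drawing procedure above can be imitated in code. This is a sketch under the assumption that a pseudo-random generator is an acceptable stand-in for the hat; the function name `random_digit_table` and the table dimensions are illustrative choices.

```python
import random

def random_digit_table(rows, cols, seed=None):
    """Build a table of digits 0-9, each drawn independently (the card is returned each time)."""
    rng = random.Random(seed)
    return [[rng.randrange(10) for _ in range(cols)] for _ in range(rows)]

table = random_digit_table(5, 10, seed=42)
for row in table:
    print("".join(str(digit) for digit in row))
```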

Simulation

is a numerical facsimile or representation of a real-world phenomenon

Sampling with replacement

means that although a number is selected for the sample, it is not removed from the population. Therefore, the same number may be selected for the sample more than once.
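
A short sketch of the difference between sampling with and without replacement, using the standard library's `random.choices` (with replacement) and `random.sample` (without replacement). The five-number population is an assumption for illustration.

```python
import random

rng = random.Random(7)
population = [1, 2, 3, 4, 5]

# With replacement: a selected number stays in the population,
# so ten draws from five numbers must contain repeats.
with_replacement = rng.choices(population, k=10)

# Without replacement: each number can be selected at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```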

Stratified sample

Divide the entire population into distinct subgroups called strata. The strata are based on a specific characteristic such as age, income, education level, and so on. All members of a stratum share the specific characteristic. Draw random samples from eac
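
One possible way to sketch stratified sampling, assuming a random sample of the same size is drawn from every stratum. The function `stratified_sample`, the `age_group` field, and the sample size per stratum are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Group individuals into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = min(n_per_stratum, len(members))  # guard against small strata
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 30 people split into two age strata.
people = [
    {"id": i, "age_group": "under 30" if i % 3 else "30 and over"}
    for i in range(30)
]
sample = stratified_sample(people, lambda p: p["age_group"], 4, seed=0)
print(sample)
```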

Systematic sample

Number all members of the population sequentially. Then, from a starting point selected at random, include every kth member of the population in the sample.
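
A minimal sketch of systematic sampling, assuming the random starting point is chosen among the first k members. The function name `systematic_sample` and the value k = 10 are illustrative.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then take every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical population: members numbered 1 through 100 in sequence.
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=3)
print(sample)
```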

Cluster sample

a method of sampling used extensively by government agencies and certain private research organizations. Divide the entire population into pre-existing segments or clusters. The clusters are often geographic. Make a random selection of clusters. Include e
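
A rough sketch of cluster sampling. Since the last step of the definition above is cut off, this code assumes that every member of each randomly selected cluster goes into the sample. The function `cluster_sample` and the district names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every member of each (assumed final step)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical geographic clusters and their members.
clusters = {
    "district A": ["a1", "a2", "a3"],
    "district B": ["b1", "b2"],
    "district C": ["c1", "c2", "c3", "c4"],
    "district D": ["d1"],
}
chosen, sample = cluster_sample(clusters, 2, seed=5)
print(chosen, sample)
```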

Multistage sample

Use a variety of sampling methods to create successively smaller groups at each stage. The final sample consists of clusters.

Convenience sample

a method of sampling that simply uses results or data that are conveniently and readily obtained. It runs the risk of being severely biased.

Sampling frame

a list of individuals from which a sample is actually selected.

Undercoverage

results from omitting population members from the sampling frame.

Sampling error

is the difference between measurements from a sample and corresponding measurements from the respective population. It is caused by the fact that the sample does not perfectly represent the population.
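
The idea can be made concrete with a simulated population. This is only a sketch: the population of 10,000 heights, its mean and spread, and the sample size of 50 are invented for illustration. The population mean plays the role of the parameter, the sample mean is the statistic, and their difference is the error for this one sample.

```python
import random
from statistics import mean

rng = random.Random(11)

# Hypothetical population: 10,000 heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(10_000)]
parameter = mean(population)          # describes the whole population

sample = rng.sample(population, 50)   # simple random sample of 50
statistic = mean(sample)              # describes only the sample

# The difference between the sample measurement and the population measurement.
sampling_error = statistic - parameter
print(round(parameter, 2), round(statistic, 2), round(sampling_error, 2))
```

A different seed gives a different sample and a different error, which is why the error is attributed to the sample not perfectly representing the population.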

Nonsampling error

is the result of poor sample design, sloppy data collection, faulty measuring instruments, bias in questionnaires, and so on.

Census

In a ____, measurements or observations from the entire population are used.

Observational study

In this kind of study, observations and measurements of individuals are conducted in a way that doesn't change the response or the variable being measured.

Experiment

In an _____, a treatment is deliberately imposed on the individuals in order to observe a possible change in the response or variable being measured.

Placebo effect

occurs when a subject receives no treatment but (incorrectly) believes he or she is in fact receiving treatment and responds favorably.

Completely randomized experiment

an experiment in which a random process is used to assign each individual to one of the treatments.
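
One way such a random assignment could be sketched, assuming groups of equal size are wanted. The function `completely_randomized`, the subject labels, and the two treatments are hypothetical.

```python
import random

def completely_randomized(individuals, treatments, seed=None):
    """Shuffle the individuals, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(individuals)
    rng.shuffle(shuffled)
    assignment = {}
    for i, person in enumerate(shuffled):
        assignment[person] = treatments[i % len(treatments)]
    return assignment

subjects = [f"subject_{i}" for i in range(1, 13)]
assignment = completely_randomized(subjects, ["drug", "placebo"], seed=2)
for person in subjects:
    print(person, assignment[person])
```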

Block

is a group of individuals sharing some common features that might affect the treatment.

Randomized block experiment

In a ________, individuals are first sorted into blocks, and then a random process is used to assign each individual in the block to one of the treatments.
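
A sketch of the same idea with blocking, assuming the randomization is done separately inside each block and that equal group sizes are wanted. The function `randomized_block`, the smoker/nonsmoker blocks, and the treatments are made-up examples.

```python
import random
from collections import defaultdict

def randomized_block(individuals, block_of, treatments, seed=None):
    """Sort individuals into blocks, then randomly assign treatments within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for person in individuals:
        blocks[block_of(person)].append(person)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # the random process runs inside the block
        for i, person in enumerate(members):
            assignment[person] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects as (name, block) pairs: 6 smokers and 6 nonsmokers.
subjects = [(f"s{i}", "smoker" if i < 6 else "nonsmoker") for i in range(12)]
assignment = randomized_block(subjects, lambda s: s[1], ["drug", "placebo"], seed=4)
for subject in subjects:
    print(subject, assignment[subject])
```

Each block ends up with both treatments represented, so a feature shared within a block cannot line up entirely with one treatment.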

Double-blind experiment

An experiment in which neither the individual in the study nor the observers know which subjects are receiving the treatment

Control group

This group received a dummy treatment, enabling the researchers to control for the placebo effect. In general, it is used to account for the influence of lurking variables.

Treatment group

The group receiving the treatment

Confounding variable

Two variables are confounded when the effects of one cannot be distinguished from the effects of the other. Confounding variables may be part of the study, or they may be outside lurking variables.

Lurking variable

A variable for which no data have been collected but that nevertheless has influence on other variables in the study

Randomization

is used to assign individuals to the two treatment groups. This helps prevent bias in selecting members for each group.

Replication

____ of the experiment on many patients reduces the possibility that the differences in pain relief for the two groups occurred by chance alone.

Survey

A means of gathering data about people by asking them questions

Nonresponse

Individuals either cannot be contacted or refuse to participate. Nonresponse can result in significant undercoverage of a population.

Voluntary response

Individuals with strong feelings about a subject are more likely than others to respond. Such a study is interesting but not reflective of the population.

Hidden bias

The question may be worded in such a way as to elicit a specific response. The order of questions might lead to biased responses. Also, the number of responses on a Likert scale may force responses that do not reflect the respondent's feelings or experien

Sampling technique: Random sample

Use a simple random sample from the entire population.

Sample

In a ____, measurements or observations from part of the population are used.

Truthfulness of response

Respondents may not accurately remember when or whether an event took place.

Vague wording

Words such as "often," "seldom," and "occasionally" mean different things to different people.

Interviewer influence

Factors such as tone of voice, body language, dress, gender, authority, and ethnicity of the interviewer might influence responses.

Population

all individuals of interest