Statistics

Sampling Distribution of Sample Means

Distribution of the sample means that is obtained when we repeatedly draw samples of the same size from the same population
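
A small simulation can make this definition concrete. The sketch below assumes a hypothetical population (the faces of a fair die), a sample size of 30, and 2000 repetitions; none of these values come from the definition itself.

```python
import random
import statistics

# Hypothetical population: the faces of a fair six-sided die.
population = [1, 2, 3, 4, 5, 6]

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Draw n values with replacement and return their mean."""
    return statistics.mean(rng.choices(population, k=n))


# Repeatedly draw samples of the same size (n = 30) and record each mean.
sample_means = [sample_mean(30) for _ in range(2000)]

# The recorded means cluster around the population mean (3.5) and vary
# less than the individual population values do.
print(statistics.mean(sample_means))
print(statistics.stdev(sample_means))
```

The list `sample_means` is an empirical approximation of the sampling distribution of sample means.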

Probability distribution

Collection of values of a random variable along with their corresponding probabilities

statistic

Measured characteristic of a sample

Density Curve

Graph of a continuous probability distribution

Hypothesis

statement or claim about some property of a population

Hypothesis test

Method for testing claims made about populations; also called test of significance

Linear Correlation Coefficient

measure of the strength of the linear relationship between two variables
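
One common way to compute this measure is the Pearson formula. The sketch below is a minimal illustration; the function name `linear_correlation` and the paired data are hypothetical.

```python
import math


def linear_correlation(xs, ys):
    """Pearson linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = linear_correlation(hours, scores)
print(round(r, 3))  # a value near +1 indicates a strong positive linear relationship
```

The result always lies between -1 and +1.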

Normal Distribution

bell-shaped probability distribution described algebraically by a formula involving the mean and standard deviation

Degrees of freedom

number of values that are free to vary after certain restrictions have been imposed on all values

Measure of Variation

any of several measures designed to reflect the amount of variation or spread for a set of values

Bimodal

having 2 modes

Census

collection of data from every element in a population

class mid-points

In a class of a frequency distribution, the value midway between the lower class limit and the upper class limit

class width

The difference between 2 consecutive lower class limits in a frequency distribution
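
Class midpoints and class width follow directly from the class limits. The sketch below assumes hypothetical classes of test scores written as (lower class limit, upper class limit).

```python
# Hypothetical classes given as (lower class limit, upper class limit).
classes = [(60, 69), (70, 79), (80, 89), (90, 99)]

# Class midpoint: the value midway between the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Class width: the difference between two consecutive lower class limits.
class_width = classes[1][0] - classes[0][0]

print(midpoints)    # [64.5, 74.5, 84.5, 94.5]
print(class_width)  # 10
```

Note that the width is 10, not 69 - 60 = 9, because it is measured between consecutive lower limits.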

Expected Value

For a discrete random variable, the mean value of the outcomes
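
For a discrete random variable, this mean is the sum of each value multiplied by its probability. The sketch below assumes a hypothetical distribution: the number of heads in two fair coin tosses.

```python
# Hypothetical distribution: number of heads in two fair coin tosses.
# Keys are the values of the random variable, values are their probabilities.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean of the outcomes: sum of (value * probability).
expected_value = sum(x * p for x, p in distribution.items())

print(expected_value)  # 1.0
```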

Simple Random Sample

Sample of a particular size selected so that every possible sample of the same size has the same chance of being chosen

Ordinal

Level of measurement of data; characterizes data that may be arranged in order, but differences between data values either cannot be determined or are meaningless

Ogive

Graphical representation of a cumulative frequency distribution

Observational Study

study in which we observe and measure but don't attempt to manipulate or modify the subjects being studied

Mode

Value that occurs most frequently

Range

the measure of variation that is the difference between the highest and lowest values

Standard deviation

measure of variation; equal to the square root of the variance
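
The relationship between the two measures can be checked with Python's standard `statistics` module. The data values are hypothetical, and the sketch assumes the sample versions of both measures (dividing by n - 1).

```python
import math
import statistics

# Hypothetical sample data.
values = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)      # sample standard deviation

# The standard deviation is the square root of the variance.
print(math.isclose(std_dev, math.sqrt(variance)))  # True
```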

Frequency Polygon

Graphical representation of the distribution of data using connected straight lines

Discrete data

Data with the property that the number of possible values is either a finite number or a countable number

Data

Number or information describing some characteristic

Convenience

Sampling in which data is selected because it is readily available

Experiment

Application of some treatment followed by observation of its effect on the subjects

Histogram

Vertical bar graph representing the frequency distribution of a set of data

Quantitative data

data consisting of numbers representing counts or measurements

Qualitative data

Data that can be separated into different categories distinguished by some nonnumeric characteristic

P value

probability, assuming the null hypothesis is true, that a test statistic in a hypothesis test is at least as extreme as the one actually obtained
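
As an illustration, the sketch below computes a two-tailed P value under one specific assumption: that the test statistic follows a standard normal distribution when the null hypothesis is true. The function name `two_tailed_p_value` and the test statistic 1.96 are hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z (assumed null distribution)."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return 2 * (1 - standard_normal.cdf(abs(z)))


p_value = two_tailed_p_value(1.96)
print(round(p_value, 3))  # 0.05
```

Other tests use other distributions (t, chi-square, F) for the same calculation.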

Cluster

Dividing the population area into sections (or clusters), randomly selecting a few of those sections, then selecting all the members from the selected sections

Null hypothesis

claim made about some population characteristic; usually involving the case of no difference

Dependent Sample

Sample whose values are related to the values in another sample

Variance

measure of variation; equal to the square of the standard deviation

Relative frequency histogram

bar graph (histogram) in which frequencies are replaced by relative frequencies

Upper class limits

Largest number that can belong to the different classes in a frequency distribution

Continuity Correction

Adjustment made when a discrete random variable is being approximated by a continuous random variable

Correlation

Statistical association between two variables

Test statistic

statistic based on the sample data; used in making the decision about rejection of the null hypothesis

Systematic sampling

Sampling in which every "k"th element is selected
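
Selecting every kth element maps naturally onto a list slice. The sketch below assumes a hypothetical population of 100 numbered elements and k = 10; choosing the starting point at random is a common practice assumed here, not part of the definition.

```python
import random

# Hypothetical population of 100 numbered elements.
population = list(range(1, 101))
k = 10

rng = random.Random(1)    # fixed seed so the run is repeatable
start = rng.randrange(k)  # assumed: random starting point among the first k

# Take every kth element from the starting point onward.
sample = population[start::k]

print(sample)
```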

Stratified sampling

Sampling in which samples are drawn from each stratum (class)

Simple event

Experimental outcome that cannot be further broken down

Critical region

the set of values of the test statistic that would cause rejection of the null hypothesis

Critical value

Value separating the critical region from the values of the test statistic that would not lead to rejection of the null hypothesis

Confidence Level

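Probability that a population parameter is contained within a confidence interval; interpreted as the proportion of such intervals that would contain the parameter if the estimation process were repeated many times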

Voluntary response sample

Sample in which the respondents decide whether to be included

Range rule of thumb

Rule based on the principle that for typical data, the difference between the lowest value and the highest value is approximately 4 standard deviations
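
Rearranged, the rule gives a rough estimate of the standard deviation: range divided by 4. The sketch below compares that estimate with the computed sample standard deviation for hypothetical data.

```python
import statistics

# Hypothetical sample data.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)  # highest value minus lowest value
estimated_sd = value_range / 4           # range rule of thumb estimate
actual_sd = statistics.stdev(values)     # sample standard deviation

print(estimated_sd)  # 1.75
print(actual_sd)     # somewhat larger here; the rule is only a rough guide
```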

Pareto Chart

bar graph for qualitative data with the bars arranged in order, according to frequency

Parameter

measured characteristic of a population

Significance level

Probability of making a type I error when conducting a hypothesis test

Event

Outcome of an experiment

Factorial Rule

"N" different items can be arranged in "N!" different ways
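
The rule can be checked by listing every arrangement of a small set. The four items below are hypothetical.

```python
import math
from itertools import permutations

# Hypothetical set of 4 different items.
items = ["A", "B", "C", "D"]

# Enumerate every possible ordering of the items.
arrangements = list(permutations(items))

print(len(arrangements))           # 24
print(math.factorial(len(items)))  # 24
```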

Fundamental counting rule

For a sequence of two events in which the 1st event can occur in "n" ways; the 2nd can occur in "m" ways; together they can occur in "m*n" ways
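
Enumerating the pairs shows where the product comes from. The sketch below assumes a hypothetical choice of 3 shirts (n = 3) followed by 2 pairs of pants (m = 2).

```python
from itertools import product

# Hypothetical choices for the two events.
shirts = ["red", "blue", "green"]  # 1st event: n = 3 ways
pants = ["jeans", "khakis"]        # 2nd event: m = 2 ways

# Every (shirt, pants) combination.
outfits = list(product(shirts, pants))

print(len(outfits))              # 6
print(len(shirts) * len(pants))  # 6
```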

Interval level

level of measurement of data; data can be arranged in order and differences between data values are meaningful

Lower class limit

Smallest number that can belong to the different classes in a frequency distribution

measure of center

value intended to indicate the center value in a data set

Multimodal

having more than 2 modes

Nominal

level of measurement of data; data consisting of names, labels or categories

nonsampling errors

errors from external factors not related to sampling

binomial probability distribution

discrete probability distribution of the number of successes in a fixed number of independent trials
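
The probabilities in this distribution can be computed from the standard binomial formula. The sketch below assumes n independent trials with the same success probability p on each trial; the function name `binomial_probability` and the example values n = 4, p = 0.5 are hypothetical.

```python
import math


def binomial_probability(n, p, x):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical example: number of heads in 4 fair coin tosses.
n, p = 4, 0.5
distribution = {x: binomial_probability(n, p, x) for x in range(n + 1)}

print(distribution[2])  # 0.375
```

The probabilities over all possible numbers of successes (0 through n) sum to 1.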

discrete random variable

random variable with either a finite number of values or a countable number of values

arithmetic mean

the sum of a set of values divided by the number of values

Statistics

collection of methods for planning experiments, obtaining data, organizing, summarizing, presenting, analyzing, interpreting and drawing conclusions, based on data

Standard normal distribution

normal distribution with a mean of 0 and a standard deviation of 1

Outlier

values that are very unusual; they are very far away from most of the data

median

Middle value of a set of values arranged in order of magnitude

Midrange

one half of the sum of the highest value and the lowest value
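
The midrange can be compared with the other measures of center on the same data. The values below are hypothetical.

```python
import statistics

# Hypothetical data set.
values = [3, 7, 8, 5, 12]

mean = statistics.mean(values)              # sum of values / number of values
median = statistics.median(values)          # middle value of the sorted data
midrange = (max(values) + min(values)) / 2  # halfway between the extremes

print(mean)      # 7
print(median)    # 7
print(midrange)  # 7.5
```

Because it uses only the two extreme values, the midrange is more sensitive to outliers than the mean or the median.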

Complement

outcomes in which the original event does not occur

disjoint

events that can NOT occur at the same time

class boundaries

values obtained from a frequency distribution by increasing the upper class limit and decreasing the lower class limit by the same amount, so there are NO GAPS between consecutive classes

Continuous data

data resulting from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps

Cumulative frequency

sum of the frequencies for a class and all preceding classes
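
A running total over the class frequencies gives the cumulative frequencies. The frequencies below are hypothetical.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order.
frequencies = [2, 5, 8, 4, 1]

# Each entry is the frequency of its class plus all preceding classes.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [2, 7, 15, 19, 20]
```

The last cumulative frequency equals the total number of data values.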

Frequency distribution

listing of data values (either individually or by groups) along with their corresponding counts

Continuous random variables

random variable with infinitely many values that can be associated with points on a continuous line interval

Placebo effect

occurs when an untreated subject incorrectly believes that he/she is receiving a real treatment and reports improvement in symptoms

Population

complete and entire collection of elements to be studied

random sample

sample selected in such a way that EVERY member of the population has the same chance of being chosen

Characteristics of Data

Center, Variation, Distribution, Outliers, changes over Time

Sampling error

difference between a sample result and the true population result; caused by chance sample fluctuations

scatter plot

Graphical display of paired (x,y) data

Random variable

variable (usually x) that has a single numerical value (determined by chance) for each outcome of an experiment

Confidence Interval

range of values used to estimate some population parameter with a specific level of confidence

confidence interval limits

2 numbers that are used as the high & low boundaries of a confidence interval

sample size

number of items in a sample

ratio level

level of measurement of data; can be arranged in order; differences are meaningful; there is an inherent zero starting point

relative frequency distribution

basic frequency distribution in which the frequency for each class is divided by the total of all frequencies
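
The conversion is one division per class. The class labels and frequencies below are hypothetical.

```python
# Hypothetical frequency distribution: class label -> frequency.
frequencies = {"60-69": 2, "70-79": 5, "80-89": 8, "90-99": 5}

total = sum(frequencies.values())

# Divide each class frequency by the total of all frequencies.
relative = {label: count / total for label, count in frequencies.items()}

print(relative)  # {'60-69': 0.1, '70-79': 0.25, '80-89': 0.4, '90-99': 0.25}
```

The relative frequencies sum to 1 (or 100% when expressed as percentages).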

sample

subset of a population

sample space

all possible simple events in an experiment, that is, all outcomes that cannot be further broken down

Point Estimate

single value that serves as an estimate of a population parameter

standard score

number of standard deviations that a given value is above or below the mean; also called z-score
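
The calculation is a subtraction followed by a division. In the sketch below, the function name `standard_score` and the example mean of 70 with standard deviation 10 are hypothetical.

```python
def standard_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


# Hypothetical example: test scores with mean 70 and standard deviation 10.
z = standard_score(85, 70, 10)

print(z)  # 1.5
```

A negative z-score indicates a value below the mean.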