Bimodal Distribution
When a data set has TWO modes
Bar Chart
A chart that uses bars to show comparisons between categories of data
Box Plot
A graphical rendition of statistical data based on the minimum, first quartile, median, third quartile and the maximum
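The five values a box plot is built from can be computed directly. This is a minimal sketch using Python's standard `statistics` module; the data and the function name `five_number_summary` are made up for illustration, and quartile conventions differ between textbooks and tools (the module's default "exclusive" method is assumed here).

```python
import statistics

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Hypothetical data, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
summary = five_number_summary(data)
print(summary)
```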
Central Tendency
* (Mean/Median/Mode)
* "the average case"
* Uses a single number to describe data.
* Helps to understand the data in terms of the "average"
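The three measures named above can be computed with Python's standard `statistics` module. A small sketch; the data values are hypothetical.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 8, 14]

mean_value = statistics.mean(scores)      # arithmetic average
median_value = statistics.median(scores)  # middle value of the sorted data
mode_value = statistics.mode(scores)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Each result is a single number summarizing the same data, and the three need not agree.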
Confidence Interval
Measures the probability that a population parameter will fall between two set values
Degrees of Freedom (DF)
* The number of values in a calculation that are FREE to VARY
* For a single sample of size n, Degrees of Freedom are typically = to "n - 1"
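One place the "n - 1" shows up is the sample variance, where the sum of squared deviations is divided by the degrees of freedom rather than by n. A minimal sketch with hypothetical data, cross-checked against `statistics.variance`:

```python
import statistics

# Hypothetical sample, chosen only for illustration.
sample = [4, 8, 6, 5, 7]

n = len(sample)
degrees_of_freedom = n - 1

sample_mean = sum(sample) / n
sum_of_squares = sum((x - sample_mean) ** 2 for x in sample)

# Sample variance divides by the degrees of freedom (n - 1), not by n.
sample_variance = sum_of_squares / degrees_of_freedom

print(degrees_of_freedom, sample_variance, statistics.variance(sample))
```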
Descriptive Statistics
* NUMERIC measurements of central tendency & variability
* HELPS explain the data more accurately & in GREATER detail than graphical displays
* (It is good to begin with a graphical display to inspect the distribution & confirm what was seen in the numerical
Frequency Distribution
* Common way to present data
* shows a possible value of a variable along with the corresponding frequency of that value
* Ungrouped or Grouped
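Both forms can be sketched with `collections.Counter`. The data and the class width of 5 are assumptions made only for this example.

```python
from collections import Counter

# Hypothetical data, chosen only for illustration.
values = [1, 2, 2, 3, 3, 3, 7, 8, 12]

# Ungrouped: one frequency per distinct value.
ungrouped = Counter(values)

# Grouped: one frequency per class; each class is labelled by its lower bound.
class_width = 5  # assumed class width
grouped = Counter((v // class_width) * class_width for v in values)

for lower in sorted(grouped):
    print(f"{lower}-{lower + class_width - 1}: {grouped[lower]}")
```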
Histogram
* Display of statistical information that uses rectangles to show the FREQUENCY of data items in SUCCESSIVE NUMERICAL intervals of equal size
* Most common histogram: the independent variable is plotted along the horizontal axis and the dependent variable along the vertical axis
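The counting behind a histogram can be sketched without a plotting library: place each value into one of several successive intervals of equal size, then draw one bar per interval. The helper name `histogram_counts`, the data, and the bin settings are all hypothetical.

```python
def histogram_counts(values, start, bin_width, n_bins):
    """Count values in successive equal-width intervals [lo, lo + bin_width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // bin_width)
        if 0 <= index < n_bins:  # values outside the covered span are ignored
            counts[index] += 1
    return counts

# Hypothetical heights in cm, chosen only for illustration.
heights = [150, 152, 158, 161, 163, 164, 166, 171, 179]
counts = histogram_counts(heights, start=150, bin_width=10, n_bins=3)

# Text rendering: one row of '#' marks per interval.
for i, count in enumerate(counts):
    lo = 150 + i * 10
    print(f"{lo}-{lo + 10}: {'#' * count}")
```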
Interquartile Range (IQR)
* Difference between the 75th percentile & the 25th percentile
* The IQR is LESS SENSITIVE to OUTLIERS or EXTREMES since it uses only the values between the 25th & 75th percentiles
* Does NOT use ALL the values in a data set
* Does NOT use the smallest or largest values in a data set
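A short sketch of why this measure resists extremes: replacing the largest value with a far larger one changes the range but not the quartile difference. The data and the function name are hypothetical, and the quartile method is the default of `statistics.quantiles`, which may differ from a given textbook's convention.

```python
import statistics

def interquartile_range(values):
    """75th percentile minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data, chosen only for illustration.
typical = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

print(interquartile_range(typical), interquartile_range(with_extreme))
print(max(typical) - min(typical), max(with_extreme) - min(with_extreme))
```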
Line Chart
* Type of chart which displays information as a series of data points called 'MARKERS' connected by straight line segments
* Also used to visualize a trend in data over intervals of time
* A time series
* Often drawn chronologically
Mean
* 1 of the 3 common measures of central tendency
* Often called "the average"
* Requires QUANTITATIVE data at the interval or ratio level of measurement
* THIS should NOT be used when unusual or outlying data values are present
* EXT
Mode
* 1 of the 3 common measures of central tendency
* MOST FREQUENTLY occurring number in a given data set
* Useful for the nominal level of measurement
Multimodal Distribution
The data set has more than 2 modes (3 or more)
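Counting modes can be sketched with `statistics.multimode`, which returns every value tied for the highest frequency. The function name `modality` and the sample lists are hypothetical.

```python
from statistics import multimode

def modality(values):
    """Label a data set by how many modes it has."""
    # Caveat: if every value occurs equally often, multimode returns all of
    # them, so such data is labelled by its number of distinct values.
    mode_count = len(multimode(values))
    if mode_count == 1:
        return "unimodal"
    if mode_count == 2:
        return "bimodal"
    return "multimodal"

print(modality([1, 2, 2, 3]))
print(modality([1, 1, 2, 3, 3]))
print(modality([1, 1, 2, 2, 3, 3]))
```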
Normal Distribution
A symmetric, bell-shaped distribution: the percentages of data values are equal at equal distances on either side of the center (the mean)
Outlier
* An observation point that is distant from other observations
* May be due to variability in the measurement or it may indicate experimental error
* In the latter case, the observation is sometimes excluded from the data set
Percentile
* Splits the data so that a stated % of the values fall below it and the rest fall above it
* A measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall.
What are the percentile & quartile of the median?
The median is the 50th percentile, which is also the 2nd quartile
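The equivalence can be checked numerically. A minimal sketch with hypothetical data, using the default method of `statistics.quantiles`:

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = [3, 7, 8, 12, 13, 14, 18, 21, 22, 25, 30]

median = statistics.median(data)

# n=4 gives the 3 quartile cut points; index 1 is the 2nd quartile.
second_quartile = statistics.quantiles(data, n=4)[1]

# n=100 gives the 99 percentile cut points; index 49 is the 50th percentile.
fiftieth_percentile = statistics.quantiles(data, n=100)[49]

print(median, second_quartile, fiftieth_percentile)
```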
Pie Chart
* Circular statistical graphic which is divided into slices to illustrate numerical proportion
* The arc length of each slice is proportional to the quantity it represents
Point Estimates
Involves the use of sample data to calculate a single value (a statistic) which is to serve as a "best guess" or "best estimate" of an unknown (fixed or random) population parameter
* Predicts a parameter by a single number
Point Estimate GRAPHIC
Interval Estimate
* An interval within which the value of a parameter of a population has a stated probability of occurring
* An interval of numbers that are believable values for the parameter
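The contrast between a point estimate and an interval estimate can be sketched for a population mean. This example assumes a 95% confidence level and a normal approximation; for a sample this small a t-based interval is usually preferred, so treat the numbers as illustrative only. The data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical measurements, chosen only for illustration.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]
n = len(sample)

# Point estimate: a single number used as the best guess for the population mean.
point_estimate = statistics.mean(sample)

# Interval estimate: point estimate plus/minus a margin of error.
confidence_level = 0.95  # assumed
standard_error = statistics.stdev(sample) / math.sqrt(n)
z_critical = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
margin = z_critical * standard_error

lower, upper = point_estimate - margin, point_estimate + margin
print(point_estimate, (lower, upper))
```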
Interval Estimate GRAPHIC
Range
* Difference between the largest & the smallest values in the data set
* Provides a ROUGH estimate of the variability of a data set
* Does NOT USE ALL of the data values in a computation
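Because only the two extreme values enter the computation, data sets with very different spreads can share the same range. A small sketch; both lists and the function name `data_range` are hypothetical.

```python
import statistics

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical data sets with identical extremes but different interiors.
clustered = [1, 5, 5, 5, 5, 5, 9]
spread_out = [1, 2, 4, 5, 6, 8, 9]

print(data_range(clustered), data_range(spread_out))
print(statistics.pstdev(clustered), statistics.pstdev(spread_out))
```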
Interquartile Range and Range GRAPHIC
Scatterplot
* A graphic tool used to display the relationship between 2 quantitative variables
* Consists of an X-axis (HORIZONTAL), & a Y-axis (VERTICAL), & a series of dots
* Each dot represents one observation from a data set
Skewness
* Characterization of the data
* A measure of symmetry, or more precisely, the LACK of symmetry
Skewness Example GRAPHIC
Kurtosis
Measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution
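Skewness and kurtosis can both be sketched from central moments. The formulas below are one common (population, moment-based) definition; statistical packages often apply sample-size corrections, so their results may differ slightly. Function names and data are hypothetical.

```python
def central_moment(values, k):
    """Average of the k-th power of deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Third standardized moment: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical data, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

A negative excess kurtosis suggests lighter tails than a normal distribution; a positive one suggests heavier tails.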
Standard Deviation
* Equal to the square root of the variance!
* Numerical value used to indicate how widely individuals in a group vary
* If individual observations vary greatly from the group mean, this is big, and vice versa
* Theoretical curve divided into normal standa
Standard Deviation GRAPHIC
Standard Normal Distribution
* A special case of the normal distribution
* A "BELL CURVE" (visual)
* The distribution that occurs when a normal random variable has a mean of ZERO and a standard deviation of one
* Normal random variable of a standard normal distribution is called a st
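The standard normal distribution is available in Python's standard library as `statistics.NormalDist` with a mean of 0 and a standard deviation of 1. The sketch below computes the share of values expected within one and two standard deviations of the mean.

```python
from statistics import NormalDist

# Mean of zero and standard deviation of one.
standard_normal = NormalDist(mu=0, sigma=1)

# Probability of falling within 1 and within 2 standard deviations of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(within_one_sd, 4), round(within_two_sd, 4))
```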
Stem and Leaf Plot
A plot where each data value is split into a "leaf" (usually the last digit) and a "stem" (the other digits)
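The split into stems and leaves can be sketched with `divmod`. This version assumes non-negative whole numbers and uses the last digit as the leaf; the function name and data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data, chosen only for illustration.
data = [23, 12, 38, 15, 21, 30, 23]
plot = stem_and_leaf(data)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```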
Unimodal Distribution
Only ONE MODE or only ONE reoccurring number or the MOST reoccurring number (ONE)
Variability
The range and spread of the data from the center
Variance
* The expectation of the squared deviation of a random variable from its mean.
* Measures how far a set of (random) numbers is spread out from their average value
* Numerical value used to indicate how widely individuals in a group vary.
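The link between variance and standard deviation (the standard deviation is the square root of the variance) can be checked directly. This sketch uses the population versions from the `statistics` module and hypothetical data.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)

# Variance: average squared deviation from the mean (population form).
variance = statistics.pvariance(data)

# Standard deviation: square root of the variance, in the data's own units.
std_dev = statistics.pstdev(data)

print(mean, variance, std_dev)
```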
Variation
Different levels among data values in a data set
Variable
Can be Qualitative or Quantitative
Z Scores
* aka STANDARD SCORE
* Indicates how many standard deviations an element is from the mean
* and in what direction
* The further away this score is from zero, the more "surprising" the value of the statistic is
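A z score is the distance from the mean measured in standard deviations, with the sign giving the direction. In this sketch the mean of 100 and standard deviation of 15 are assumed values, and the "surprise" is expressed as a two-sided tail probability under a normal model, which is itself an assumption.

```python
from statistics import NormalDist

def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Assumed population mean and standard deviation, for illustration only.
z_high = z_score(130, mean=100, std_dev=15)
z_low = z_score(85, mean=100, std_dev=15)

# Two-sided probability of a score at least this far from the mean.
tail_probability = 2 * (1 - NormalDist().cdf(abs(z_high)))

print(z_high, z_low, round(tail_probability, 4))
```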
Mean and Standard Deviation
Are ALWAYS reported together!
Mean and Measure of Dispersion
Should be reported together