Frequency Distribution
Table that shows classes or intervals of data entries with a count of the number of entries in each class
Lower Class Limit
Least number that can belong to a class
Upper Class Limit
Greatest number that can belong to a class
Class Width
Distance between the lower (or upper) limits of consecutive classes
Ex: Lower limit of class 2 - lower limit of class 1 = class width
Upper limit of class 2 - upper limit of class 1 = class width
Range
Difference between the maximum and minimum data entries
Midpoint
Sum of the lower and upper limits of the class divided by 2
Relative Frequency
The portion or percentage of the data that falls in that class
Relative Frequency = class frequency/sample size = f/n
Cumulative Frequency
Sum of the frequencies of that class and all previous classes
Cumulative frequency of the last class should equal the sample size
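The definitions above (class limits, class width, midpoint, relative frequency, cumulative frequency) can be sketched together in a few lines of Python. The data values and the choice of five classes starting at 1 are hypothetical, picked only to illustrate the arithmetic, and the upper-limit formula assumes integer data.

```python
# Hypothetical data set of 10 integer entries (not from the source).
data = [3, 7, 8, 12, 13, 14, 18, 21, 22, 25]

# Assumed lower class limits; classes are 1-5, 6-10, 11-15, 16-20, 21-25.
lower_limits = [1, 6, 11, 16, 21]
class_width = lower_limits[1] - lower_limits[0]  # lower limit 2 - lower limit 1
n = len(data)  # sample size

rows = []
cumulative = 0
for lower in lower_limits:
    upper = lower + class_width - 1  # upper class limit (integer data assumed)
    f = sum(1 for x in data if lower <= x <= upper)  # class frequency
    cumulative += f  # this class plus all previous classes
    rows.append({
        "class": (lower, upper),
        "midpoint": (lower + upper) / 2,
        "frequency": f,
        "relative": f / n,
        "cumulative": cumulative,
    })

for row in rows:
    print(row)
```

The last row's cumulative frequency equals the sample size, and the relative frequencies sum to 1, which makes a quick check on any table built this way.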
Frequency Histogram
A bar graph that represents the frequency distribution of a data set
1. The horizontal scale is quantitative and measures the data entries
2. The vertical scale measures the frequencies of the classes
3. Consecutive bars must touch
Class boundaries
Numbers that separate classes without forming gaps between them
Frequency Polygon
A line graph that emphasizes the continuous change in frequencies
Relative Frequency Histogram
A bar graph that represents the relative frequency distribution of a data set
Cumulative Frequency Graph
Ogive
Line graph that displays the cumulative frequency of each class at its upper class boundary
Stem-and-leaf plot
Each number is separated into a stem and a leaf; the plot has as many leaves as there are entries in the original data set, and each leaf is a single digit
Stem
Entry's leftmost digits
Leaf
Entry's rightmost digit
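As a rough illustration of how stems and leaves are formed, the sketch below splits two-digit entries with `divmod`. The data values are hypothetical, and the code assumes non-negative integers whose leaf is the ones digit.

```python
from collections import defaultdict

# Hypothetical two-digit entries (not from the source).
data = [12, 15, 21, 23, 23, 30, 34, 38, 41]

stems = defaultdict(list)
for x in sorted(data):
    stem, leaf = divmod(x, 10)  # stem = leftmost digits, leaf = rightmost digit
    stems[stem].append(leaf)

# One row per stem; every original entry contributes exactly one leaf.
for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```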
Dot Plot
Each data entry is plotted using a point above a horizontal axis
Pareto Chart
Vertical bar graph in which the height of each bar represents the frequency or relative frequency, positioned in order of decreasing height with the tallest bar at the left
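The ordering step behind a Pareto chart can be sketched as a sort by decreasing frequency. The category names and counts below are hypothetical, and the sketch only prepares the bar order rather than drawing the chart.

```python
# Hypothetical category counts (not from the source).
counts = {"late": 12, "damaged": 30, "wrong item": 7, "other": 3}
total = sum(counts.values())

# Tallest bar first: sort categories by frequency, largest to smallest.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for category, f in ordered:
    print(f"{category:<12} frequency={f:>3}  relative={f / total:.3f}")
```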
Paired Data sets
When each entry in one data set corresponds to one entry in a second data set
Scatter Plot
Ordered pairs are graphed as points in a coordinate plane to show the relationship between two quantitative variables
Time Series
Data set that is composed of quantitative entries taken at regular intervals
Time Series chart
Line chart used to graph a time series, with time on the horizontal axis
Year / Number / Average Number
Measure of central tendency
Value that represents a typical, or central, entry of a data set
Mean, median, and mode
Mean
Sum of the data entries divided by the number of entries
Median
Value that lies in the middle of the data when the data set is ordered
Mode
A data entry that occurs with the greatest frequency
Bimodal
When a data set has two entries occurring with the same greatest frequency
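The three measures of central tendency can be checked with Python's standard `statistics` module. The data set below is hypothetical and was chosen so that two entries tie for the greatest frequency.

```python
import statistics

# Hypothetical data set (not from the source); 3 and 7 each occur twice.
data = [2, 3, 3, 5, 7, 7, 8]

mean = statistics.mean(data)         # sum of entries / number of entries
median = statistics.median(data)     # middle value of the ordered data
modes = statistics.multimode(data)   # all entries with the greatest frequency

print("mean:", mean)
print("median:", median)
print("modes:", modes)
```

Because `multimode` returns every entry tied for the greatest frequency, a result with two values indicates that the data set has two modes.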
Outlier
A data entry that is far removed from the other data entries in a data set
Gaps
Space in a distribution caused by outliers
Weighted mean
Mean of a data set whose entries have varying weights
Ex: x bar = sigma(x * W)/sigma(W)
When finding your grade in a class with sections (like tests) weighted differently
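A minimal sketch of the weighted mean formula, using the course-grade situation as the setting. The scores and weights are hypothetical.

```python
# Hypothetical section scores and their weights (not from the source).
scores = [86, 96, 82]           # e.g. tests, homework, final exam
weights = [0.50, 0.20, 0.30]    # weights need not sum to 1; the formula divides by their sum

# x bar = sigma(x * w) / sigma(w)
weighted_mean = sum(x * w for x, w in zip(scores, weights)) / sum(weights)

print(round(weighted_mean, 2))
```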
Mean of a frequency distribution
x bar = sigma(x * f)/n
where x and f are midpoint and frequency of each class
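The same formula can be applied directly to class midpoints and frequencies. The values below are hypothetical, and the result is an approximation of the mean because each entry is represented by its class midpoint.

```python
# Hypothetical class midpoints and frequencies (not from the source).
midpoints = [3, 8, 13, 18, 23]
frequencies = [1, 2, 3, 1, 3]

n = sum(frequencies)  # sample size is the sum of the class frequencies

# x bar = sigma(x * f) / n
mean_estimate = sum(x * f for x, f in zip(midpoints, frequencies)) / n

print(mean_estimate)
```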
Symmetric
When a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images
Uniform/rectangular
When all entries, or classes, in the distribution have equal or about equal frequencies
Skewed
When the "tail" of a graph elongates more to one side than the other
Negatively skewed
Skewed left; the tail extends to the left
Positively skewed
Skewed right; the tail extends to the right
Deviation
The difference between the entry and the mean of the data set
Sum of squares
Sum of the squares of the deviations
Population variance
Average of the squares of the deviations
Standard deviation
Square root of the variance
Sample Variance
Sum of squares divided by n - 1, where n is the sample size
Sample Standard Deviation
Square root of the sample variance
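The chain from deviations to standard deviation can be traced step by step. The data set is hypothetical, and the sample versions use the standard convention of dividing the sum of squares by n - 1 instead of n.

```python
import math

# Hypothetical data set (not from the source).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

deviations = [x - mean for x in data]            # entry minus the mean
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / n         # treats data as a whole population
population_sd = math.sqrt(population_variance)

sample_variance = sum_of_squares / (n - 1)       # treats data as a sample
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)
print(sample_variance, sample_sd)
```

The deviations always sum to zero, which is why they are squared before being averaged.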
Empirical Rule
68-95-99.7 Rule
For data sets with distributions that are approximately symmetric and bell-shaped
1. About 68% of the data lie within 1 standard deviation of the mean
2. About 95% lie within 2 standard deviations of the mean
3. About 99.7% lie within 3 standard deviations of the mean
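To see what the rule says for a concrete distribution, the sketch below computes the three intervals around the mean. The mean and standard deviation are hypothetical, and the percentages are approximations that apply only to roughly bell-shaped data.

```python
# Hypothetical mean and standard deviation (not from the source).
mean, sd = 100.0, 15.0

# Approximate share of data within k standard deviations of the mean.
approx_share = {1: 0.68, 2: 0.95, 3: 0.997}

intervals = {k: (mean - k * sd, mean + k * sd) for k in approx_share}

for k, (low, high) in intervals.items():
    print(f"about {approx_share[k]:.1%} of data between {low} and {high}")
```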
Chebychev's Theorem
Gives an inequality statement that applies to all distributions
1. The portion of any data set lying within k standard deviations (k > 1) of the mean is at least 1 - 1/k^2
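The inequality is easy to evaluate for specific values of k. The function name below is hypothetical, and it rejects k <= 1 because the bound gives no useful information there.

```python
def chebychev_min_portion(k):
    """Minimum portion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, chebychev_min_portion(k))
```

For k = 2 the guaranteed portion is at least 75%, and for k = 3 it is at least about 88.9%, regardless of the shape of the distribution.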
Fractiles
Numbers that partition, or divide, an ordered data set into equal parts
Quartiles
Divide an ordered data set into four equal parts
First Quartile
Q1; about 1/4 of the data fall on or below it
Second Quartile
Q2; about 1/2 of the data fall on or below it
Same as median
Third Quartile
Q3; about 3/4 of the data fall on or below it
Interquartile Range
A measure of variation that gives the range of the middle portion (about half) of the data
IQR= Q3-Q1
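One common way to find the quartiles by hand is to take the median of the ordered data as Q2, then the medians of the lower and upper halves as Q1 and Q3. The sketch below assumes that convention (software packages often use other interpolation rules and may give slightly different values), and the data and the function name `quartiles` are hypothetical.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + (n % 2):]  # skip the median itself when n is odd
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )

# Hypothetical data set of 15 entries (not from the source).
data = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # IQR = Q3 - Q1

print(q1, q2, q3, iqr)
```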
Box-and-whisker plot
Boxplot
Exploratory data analysis tool that highlights the important features of a data set
Five number summary
1. The minimum entry
2. First quartile
3. Median (Q2)
4. Third Quartile
5. Maximum entry
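The five values used to draw a boxplot can be collected in one small function. This is a sketch under the same median-of-halves assumption often used in hand calculation; the function name `five_number_summary` and the data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a data set."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + (n % 2):]  # exclude the median when n is odd
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )

# Hypothetical data set with an even number of entries (not from the source).
data = [15, 1, 9, 4, 11, 3, 8, 6]

summary = five_number_summary(data)
print(summary)
```

In a boxplot, the box spans Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum entries.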