frequency distribution
a table that shows classes, or intervals of data entries with a count of the number of entries in each class; the frequency (f) of a class is the number of data entries in the class
lower class limit
the least number that can belong to the class
upper class limit
greatest number that can belong to the class
class width
the distance between lower (or upper) limits of consecutive classes
range
difference between the max and min data entries
mispoint
class mark; average
relative frequency
the portion of percentage of data that falls in that class; to get the ____ of a class, divide the frequency by the sample size
cumulative frequency
the sum of the frequencies of that class and all previous classes; the ____ of the last class is equal to the sample size (n)
frequency histogram
a bar graph that represents the frequency distribution of a data set; has three properties:
1. the horizontal scale is quantitative and measures the data values
2. the vertical scale measures the frequencies of the classes
3. the consecutive bars must tou
class boundaries
the numbers that separate classes without forming gaps between them
frequency polygon
is a line graph that emphasizes the continuous change in frequencies
relative frequency histogram
has the same shape and the same horizontal scale as the corresponding frequency histogram
cumulative frequency graph
ogive; a line graph that displays the cumulative frequency of each class at its upper class boundary; the upper boundaries are marked on the horizontal axis, and the cumulative frequencies are marked on the vertical axis
stem and leaf plot
example of exploring data analysis; each number is separated into a stem and a leaf; similar to a histogram, but has the advantage that the graph still contains the original data values; also an easy way to sort data
dot plot
(to graph quantitative data) each entry is plotted, using a point, above a horizontal axis; like a stem & leaf plot, a ___ allows you to see how data is distributed, determine specific data entries, and identify unusual data values
pie charts
provide a convenient way to present qualitative data graphically as percents of a whole; a chart divided into sectors that represent categories; the area of each sector is proportional to the frequency of each category; in most cases you will be interpret
Pareto chart
another way to graph qualitative data; is a vertical bar graph in which the height of each bar represents frequency or relative frequency; the bars are positioned in order of decreasing height, with the tallest bar positioned at the left; helps highlight
paired data sets
when each entry in one data set corresponds to one entry in a second data set
scatter plot
a way to graph paired data sets; where the ordered pairs are graphed as points in a coordinate plane; used to show the relationship between two quantitative variables
time series
data set that is composed of quantitative entries taken at regular intervals over a period of time
measure of central tendency
a value that represents a typical, or central, entry of a data set; commonly used: mode, mean, and median
mean
average
population mean
...
sample mean
...
median
data that lies in the middle of the data when that data set is ordered; it measures the center of an ordered data set by dividing it into two equal parts; if the data set is even, the median is the mean of the two numbers in the center
mode
data entry that occurs the greatest frequency; if two entries occur the same greatest frequency, each entry is a ___ and the data set is called bimodal
outliers
a data entry that is far removed from the other entries in the data set; usually greatly affects the mean; conclusions that contain ___ may be flawed
gaps
when a data set can have one or more outliers causing ___ in a distribution
mean of frequency distribution
...
symmetric
when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images
uniform (rectangular)
when all entries, or classes, in the distribution have equal of approximately equal frequencies; a ____ distribution is also symmetric
skewed left (negatively skewed)
the tail extends to the left
skewed right (positively skewed)
the tail extends to the right
range
is the difference between the max and min data entries in the set; the data must be quantitative to find the ___
deviation
of an entry "x" in a population data set- is the difference between the entry and the mean ? of the data set
sum of squares
when you add the squares of the deviations (ss)
population variance
the mean of the squares of deviations in a population data set (N)
population standard deviation
of a population data set (N) entries- is the square root of the population variance
sample variance
...
sample standard deviation
...
empirical rule
for data with a symmetric bell-shaped distribution, the standard deviation has the following characteristics
1. about 68% of the data lie within one standard deviation of the mean
2. about 95% of the data lie within two standard deviations of the mean
3.
Chebychev's theorem
the portion of any data set lying within k standard deviation (k>1) of the mean is at least :
fractiles
numbers that partition, or divide, and ordered data set into equal parts
quartiles
Q1, Q2, and Q3 approximately divide and ordered data set into 4 equal parts
- about 1/4 of the data fall on or below the 1st quartile
- about 1/2 of the data fall on of below the 2nd quartile
^ same as the median of the data set
- about 3/4 of the data fa
inter-quartile range (IQR)
is a measure of variance that gives the range of the middle 50% of the data; the difference between the third and first quartiles
IQR= Q3-Q1
box-and-whisker plot (box plot)
and exploratory data analysis tool that highlights the important features of a data set; to graph a box plot, you must know the following values:
1. the min entry
2. the 1st quartile Q1
3. the median Q2
4. the 3rd quartile Q3
5. the max entry
^these numbe
standard score (z-score)
represents the number of standard deviations a given value "x" falls from the mean ?; __ can be negative/positive/or zero; can be used to identify an unusual value of a data set that is approximately bell-shaped
if the z-score is negative
the corresponding x-value is less than the mean
if the z-score is positive
the corresponding x-value is equal to the mean
Hawthorne effect
occurs in an experiment when subjects change their behavior simply because they know they are participating in an experiment.