Statistics Chapter 2 Test 1

What are the five important characteristics of data?

center, variation, distribution, outliers, time (mnemonic CVDOT: "Computer Viruses Destroy Or Terminate")

What is a center?

a representative or average value that indicates where the middle of the data set is located

What is variation?

a measure of the amount that the data values vary among themselves

What is distribution?

the nature or shape of the distribution of the data (such as bell-shaped, uniform or skewed)

What are outliers?

sample values that lie very far away from the vast majority of other sample values

What is time?

changing characteristics of the data over time

What is a method whose objective is to summarize or describe the important characteristics of a set of data?

descriptive statistics

What is a method that is used with sample data to make inferences (or generalizations) about a population that goes beyond the data?

inferential statistics

What is a table that lists data values (either individually or by groups of intervals), along with their corresponding frequencies (or counts)?

frequency distribution

What is the number of original values that fall into a particular class?

frequency
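
As a rough illustration of the two cards above, the sketch below tallies a frequency distribution from raw values. The data values, the class width of 10, and the starting lower limit of 60 are made-up assumptions, not values from the chapter.

```python
from collections import Counter

# Hypothetical raw data and class settings (assumptions for illustration).
values = [62, 67, 71, 74, 75, 78, 81, 83, 88, 95]
class_width = 10
lowest_lower_limit = 60


def lower_limit(value):
    """Return the lower class limit of the class that contains value."""
    steps = (value - lowest_lower_limit) // class_width
    return lowest_lower_limit + steps * class_width


# Frequency = number of original values that fall into each class.
frequency = Counter(lower_limit(v) for v in values)

for lo in sorted(frequency):
    print(f"{lo}-{lo + class_width - 1}: {frequency[lo]}")
```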

What are the smallest numbers that can belong to the different classes in a frequency distribution?

lower class limits

What are the largest numbers that can belong to the different classes in a frequency distribution?

upper class limits

What are the numbers used to separate classes, but without the gaps created by class limits? How are they found?

class boundaries / find the gap between the upper limit of one class and the lower limit of the next, divide it by 2, then add that amount to each upper class limit and subtract it from each lower class limit

What are the midpoints of the classes in a frequency distribution?

class midpoints/ add the lower class limit to the upper class limit and divide the sum by 2

What is the difference between two consecutive lower class limits or two consecutive lower class boundaries in a frequency distribution?

class width
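
The three cards above (class boundaries, class midpoints, class width) can be sketched as a short calculation. The class limits 60-69, 70-79, 80-89 are hypothetical and chosen only to keep the arithmetic easy to check.

```python
# Hypothetical (lower limit, upper limit) pairs for three classes.
class_limits = [(60, 69), (70, 79), (80, 89)]

# Gap between one class's upper limit and the next class's lower limit.
gap = class_limits[1][0] - class_limits[0][1]   # 70 - 69 = 1
half_gap = gap / 2                               # 0.5

# Boundaries: subtract half the gap from lower limits, add it to upper limits.
class_boundaries = [(lo - half_gap, hi + half_gap) for lo, hi in class_limits]

# Midpoint: (lower class limit + upper class limit) / 2.
class_midpoints = [(lo + hi) / 2 for lo, hi in class_limits]

# Width: difference between two consecutive lower class limits.
class_width = class_limits[1][0] - class_limits[0][0]

print(class_boundaries)  # [(59.5, 69.5), (69.5, 79.5), (79.5, 89.5)]
print(class_midpoints)   # [64.5, 74.5, 84.5]
print(class_width)       # 10
```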

What are 3 reasons to construct frequency distributions?

To summarize a large data set, gain some insight into the nature of data and have a basis for constructing important graphs

What is found by dividing each class frequency by the total of all frequencies?

relative frequencies

What is the difference between a relative frequency distribution and a frequency distribution?

a relative frequency distribution uses the same class limits as a frequency distribution, but relative frequencies (often shown as percents) replace the actual frequencies

What is the sum of the frequencies for that class and all previous classes called?

cumulative frequency
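
A minimal sketch of relative and cumulative frequencies, assuming a made-up list of class frequencies (the numbers are not from the chapter):

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order.
frequencies = [2, 5, 8, 4, 1]
total = sum(frequencies)  # 20

# Relative frequency = class frequency / total of all frequencies,
# shown here as a percent.
relative_percents = [100 * f / total for f in frequencies]

# Cumulative frequency = this class's frequency plus all previous ones.
cumulative_frequencies = list(accumulate(frequencies))

print(relative_percents)       # [10.0, 25.0, 40.0, 20.0, 5.0]
print(cumulative_frequencies)  # [2, 7, 15, 19, 20]
```

The last cumulative frequency always equals the total number of values, which is a quick way to check the table.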

What do the horizontal and vertical scales of a histogram represent?

a histogram is a bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies

What do the heights of the bars on the histogram correspond to?

frequency values, and the bars are drawn adjacent to each other (without gaps)

What has the same shape and horizontal scale as a histogram, but its vertical scale is marked with relative frequencies?

relative frequency histogram

What uses line segments connected to points located directly above class midpoint values?

frequency polygon

What is a line graph that depicts cumulative frequency, just as the cumulative frequency distribution lists cumulative frequencies?

ogive

What is a graph where each data value is plotted as a point along a scale of values?

dotplot

Who saved lives with statistics by showing people that most soldiers died due to unsanitary hospitals?

Florence Nightingale

What represents data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit)?

stem-and-leaf plot
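
One possible way to build a stem-and-leaf plot for two-digit values is sketched below. The data are invented, and splitting with the tens digit as stem and the ones digit as leaf is an assumption that only suits values like these.

```python
from collections import defaultdict

# Hypothetical two-digit data values.
values = [23, 12, 47, 15, 30, 21, 34, 23]

stems = defaultdict(list)
for value in sorted(values):
    stem, leaf = divmod(value, 10)  # tens digit is the stem, ones digit the leaf
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```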

What is a bar graph for qualitative data with the bars arranged in order according to frequencies?

Pareto chart

What is a graph depicting qualitative data as slices of a pie?

pie chart

What is a plot of paired (x, y) data with a horizontal x-axis and a vertical y-axis, matching values from two different data sets?

scatter diagram or scatterplot

What are data that have been collected at different points in time?

Time-series data

What is a value at the center or middle of a data set?

measure of center

What are the different ways to define the measure of center?

mean, median, mode and midrange

What is the measure of center found by adding the values and dividing the total by the number of values?

arithmetic mean

What does Σ mean?

the sum (addition) of a set of values

What does x mean?

the variable usually used to represent the individual data values

What does n mean?

number of values in a sample

What does N mean?

number of values in a population
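
Put together, the notation cards above give the sample mean as the sum of the x values divided by n. A small sketch with invented sample values:

```python
import statistics

# Hypothetical sample values (the x values).
x = [2, 4, 4, 5, 10]

n = len(x)            # number of values in the sample
sum_x = sum(x)        # the summation symbol (Σ) applied to x: 25
mean = sum_x / n      # sample mean: 25 / 5

print(mean)                 # 5.0
print(statistics.mean(x))   # the standard library gives the same value
```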

What is the measure of center that is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude?

median

What is the value of a data set that occurs most frequently?

mode

What is a data set called when two values occur with the same greatest frequency? When more than two values do? When no value is repeated?

bimodal, multimodal, no mode

What is the measure of center that is the value midway between the highest and lowest values in the original data set? How is it found?

midrange/ add the maximum and minimum value and then divide the sum by 2
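
The median, mode, and midrange cards above can be checked with Python's standard `statistics` module. The data sets are invented. Note one difference in convention: `statistics.multimode` returns every value when nothing repeats, whereas the card above calls that case "no mode".

```python
import statistics

# Hypothetical data set with an odd number of values.
values = [1, 3, 3, 6, 7, 8, 9]

median = statistics.median(values)           # middle value of the sorted list
modes = statistics.multimode(values)         # most frequent value(s)
midrange = (max(values) + min(values)) / 2   # (9 + 1) / 2

print(median)    # 6
print(modes)     # [3]
print(midrange)  # 5.0

# Two values tie for the greatest frequency: bimodal.
print(statistics.multimode([1, 1, 2, 2, 3]))  # [1, 2]

# No repeated value: every value is returned, which the card treats as "no mode".
print(statistics.multimode([4, 5, 6]))        # [4, 5, 6]
```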

What is the round-off rule?

carry one more decimal place than is present in the original set of values

What is a mean computed with the different values assigned different weights?

weighted mean (e.g., finding a course average from differently weighted scores)
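
A sketch of a weighted mean, computed as the sum of (weight × value) divided by the sum of the weights. The scores and weights are hypothetical.

```python
# Hypothetical scores and their weights (for example, homework, quiz, exam).
scores = [85, 90, 70]
weights = [2, 3, 5]

# Weighted mean = sum of (weight * value) / sum of weights.
weighted_sum = sum(w * x for w, x in zip(weights, scores))  # 170 + 270 + 350
weighted_mean = weighted_sum / sum(weights)                 # 790 / 10

print(weighted_mean)  # 79.0
```

For comparison, the unweighted mean of the same scores is about 81.7, so giving the lowest score the largest weight pulls the result down.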

What is it called if a distribution of data is not symmetric and extends more to one side than the other?

skewed (if not skewed it is symmetric)

What does it mean when something is skewed to the left?

it is negatively skewed: the left tail is longer, and the mean and median are typically to the left of the mode

What does it mean when something is symmetric?

there is zero skewness and the mean, median and mode are the same

What does it mean when something is skewed to the right?

it is positively skewed: the right tail is longer, and the mean and median are typically to the right of the mode

What is the difference between the maximum and minimum value in a set of data?

range

What is a measure of variation of values about the mean?

standard deviation, an average deviation of values from the mean

What is a measure of variation equal to the square of the standard deviation?

variance

What is s?

sample standard deviation

What is s squared?

sample variance
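
The range, sample variance, and sample standard deviation cards above can be sketched as follows. The data are invented, and the sample formulas divide by n - 1, which is the standard definition of s squared for a sample.

```python
import math
import statistics

# Hypothetical sample values.
values = [2, 4, 4, 5, 10]
n = len(values)
mean = sum(values) / n  # 5.0

# Range: maximum value minus minimum value.
data_range = max(values) - min(values)  # 10 - 2

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - mean) ** 2 for x in values) / (n - 1)  # 36 / 4

# Sample standard deviation: square root of the sample variance.
sample_std_dev = math.sqrt(sample_variance)

print(data_range)       # 8
print(sample_variance)  # 9.0
print(sample_std_dev)   # 3.0

# The standard library functions agree with the manual calculation.
print(statistics.variance(values), statistics.stdev(values))
```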

What is expressed as a percent, applies to a set of non-negative sample or population data, and describes the standard deviation relative to the mean?

coefficient of variation (CV)
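
A minimal sketch of the coefficient of variation for a sample, taken as (standard deviation / mean) × 100%. The data values are hypothetical.

```python
import statistics

# Hypothetical non-negative sample values.
values = [2, 4, 4, 5, 10]

mean = statistics.mean(values)   # 5
s = statistics.stdev(values)     # sample standard deviation, 3.0

# Coefficient of variation: standard deviation relative to the mean, as a percent.
cv_percent = 100 * s / mean

print(f"CV = {cv_percent:.1f}%")  # CV = 60.0%
```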

What is based on the principle that for many data sets the vast majority (such as 95%) of sample values lie within 2 standard deviations of the mean?

range rule of thumb

What can be used to roughly estimate standard deviation?

s ≈ range/4 (a rough estimate only)

How can you find rough estimates of the minimum and maximum usual sample values?

minimum usual value = mean - (2 x std dev)
maximum usual value = mean + (2 x std dev)
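
The range rule of thumb cards above can be sketched as below. The minimum, maximum, and mean are made-up numbers, and the results are rough estimates only.

```python
# Hypothetical summary values for a data set.
minimum, maximum = 40, 100
mean = 72

# Rough estimate of the standard deviation: range / 4.
estimated_s = (maximum - minimum) / 4  # 60 / 4

# Rough limits for "usual" values: mean plus or minus 2 standard deviations.
minimum_usual = mean - 2 * estimated_s
maximum_usual = mean + 2 * estimated_s

print(estimated_s)                   # 15.0
print(minimum_usual, maximum_usual)  # 42.0 102.0
```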

What rule states that for data sets having a bell-shaped distribution, about 68% of values are within 1 std dev of the mean, about 95% within 2 std dev of the mean, and about 99.7% within 3 std dev of the mean?

empirical rule
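
As a quick check of the 68%, 95%, and 99.7% figures, the sketch below assumes the bell-shaped distribution is exactly normal (an assumption made here for illustration) and uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} std dev: {share:.1%}")
```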

What theorem applies to any data set and says at least 75% of all values are within 2 std dev of the mean and at least 89% are within 3 std dev of the mean?

Chebyshev's theorem (found by using 1 - 1/K², where K > 1 is the number of standard deviations)
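
The 75% and 89% figures come from evaluating 1 - 1/K² at K = 2 and K = 3. A short sketch (the function name is a hypothetical helper, not from the chapter):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 3))  # 0.889
```

These are lower bounds that hold for any distribution, which is why they are weaker than the empirical rule's figures for bell-shaped data.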

What is the number of standard deviations that a given value x is above or below the mean?

standardized score or z score, round z to 2 decimal places
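
A z score is found by subtracting the mean from x and dividing by the standard deviation. The sketch below uses invented numbers and rounds to 2 decimal places as the card says.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return round((x - mean) / std_dev, 2)


# Hypothetical mean and standard deviation.
mean, std_dev = 100, 15

print(z_score(120, mean, std_dev))  # 1.33
print(z_score(85, mean, std_dev))   # -1.0
```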

What do percentiles measure?

relative standing

What is the process of using statistical tools (such as graphs, measures of center and variation) to investigate data sets in order to understand their important characteristics?

exploratory data analysis

What is a value that is located very far away from almost all of the other values?

outlier