Elementary Statistics
Thirteenth Edition
Chapter 2
Summarizing and
Graphing Data
Copyright © 2018, 2014, 2012 Pearson Education, Inc. All Rights Reserved
Summarizing and Graphing Data
2-1 Frequency Distributions for Organizing and
Summarizing Data
2-2 Histograms
2-3 Graphs that Enlighten and Graphs that Deceive
2-4 Scatterplots, Correlation, and Regression
Key Concept
While a frequency distribution is a useful tool for
summarizing data and investigating the distribution of
data, an even better tool is a histogram, which is a
graph that is easier to interpret than a table of numbers.
Histogram
• Histogram
– A graph consisting of bars of equal width drawn adjacent
to each other (unless there are gaps in the data)
The horizontal scale represents classes of quantitative
data values, and the vertical scale represents
frequencies. The heights of the bars correspond to
frequency values.
Important Uses of a Histogram
• Visually displays the shape of the distribution of
the data
• Shows the location of the center of the data
• Shows the spread of the data
• Identifies outliers
Relative Frequency Histogram
• Relative Frequency Histogram
– It has the same shape and horizontal scale as a
histogram, but the vertical scale is marked with
relative frequencies instead of actual frequencies.
Time (seconds)    Frequency
75-124            11
125-174           24
175-224           10
225-274           3
275-324           2
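As a minimal sketch, the relative frequencies for the frequency table above can be found by dividing each class frequency by the total number of values. The variable names are illustrative and do not come from the text.

```python
# Class frequencies copied from the frequency table above (times in seconds).
frequencies = {
    "75-124": 11,
    "125-174": 24,
    "175-224": 10,
    "225-274": 3,
    "275-324": 2,
}

total = sum(frequencies.values())

# Relative frequency = class frequency / sum of all frequencies.
relative = {label: count / total for label, count in frequencies.items()}

for label, count in frequencies.items():
    # One '*' per data value gives a rough sideways histogram.
    print(f"{label:>8}  {relative[label]:.2f}  {'*' * count}")
```

The bars keep the same shape whether they are scaled by frequency or by relative frequency; only the vertical scale changes.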
Critical Thinking Interpreting
Histograms
Explore the data by analyzing the histogram to see
what can be learned about “CVDOT”:
• the Center of the data,
• the Variation,
• the shape of the Distribution,
• whether there are any Outliers,
• and any changes in the data over Time.
Common Distribution Shapes
Normal Distribution
Because this histogram is roughly bell-shaped, we
say that the data have a normal distribution.
Skewness (1 of 3)
• Skewness
– A distribution of data is skewed if it is not symmetric
and extends more to one side than to the other.
Skewness (2 of 3)
Data skewed to the right (positively skewed) have
a longer right tail.
Skewness (3 of 3)
Data skewed to the left (negatively skewed) have
a longer left tail.
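The slides describe skewness visually. One possible numeric check, which is an assumption here and not a method given in the slides, is the moment coefficient of skewness: positive for a longer right tail, negative for a longer left tail, and near zero for symmetric data. The data sets below are hypothetical.

```python
from statistics import mean

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, made up for illustration.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]
left_skewed = [-v for v in right_skewed]  # mirror image of the above
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```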
Assessing Normality with Normal
Quantile Plots (1 of 5)
Criteria for Assessing Normality with a Normal
Quantile Plot
• Normal Distribution: The pattern of the points in the
normal quantile plot is reasonably close to a straight line,
and the points do not show some systematic pattern that
is not a straight-line pattern.
Assessing Normality with Normal
Quantile Plots (2 of 5)
Criteria for Assessing Normality with a Normal
Quantile Plot
• Not a Normal Distribution: The population distribution is
not normal if the normal quantile plot has either or both of
these two conditions:
– The points do not lie reasonably close to a straight-line
pattern.
– The points show some systematic pattern that is not a
straight-line pattern.
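The slides do not show how the points of a normal quantile plot are obtained. A rough sketch of one common construction pairs each sorted data value with the standard normal quantile for its position. The plotting position (i − 0.5)/n is an assumed convention (software packages differ), and the sample is hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(values):
    """Return (z, x) pairs: expected standard normal quantile vs. sorted data value."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is one common convention; others exist.
    return [(std_normal.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Hypothetical sample, made up for illustration.
sample = [62, 65, 67, 68, 70, 71, 73, 75, 78]
points = normal_quantile_points(sample)
for z, x in points:
    print(f"{z:6.2f}  {x}")
```

If the (z, x) points fall reasonably close to a straight line with no other systematic pattern, the criteria above point toward a normal distribution.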
Assessing Normality with Normal
Quantile Plots (3 of 5)
Normal Distribution: The points are reasonably
close to a straight-line pattern, and there is no other
systematic pattern that is not a straight-line pattern.
Assessing Normality with Normal
Quantile Plots (4 of 5)
Not a Normal Distribution: The points do not lie
reasonably close to a straight line.
Assessing Normality with Normal
Quantile Plots (5 of 5)
Not a Normal Distribution: The points show a
systematic pattern that is not a straight-line pattern.
2-3 Graphs that Enlighten and Graphs that Deceive
Key Concept
Introduce other common graphs that foster
understanding of data.
Discuss some graphs that are deceptive because
they create impressions about data that are somehow
misleading or wrong.
Technology now provides us with powerful tools for
generating a wide variety of graphs.
Graphs that Enlighten: Dotplots (1 of 2)
• Dotplots
– A graph of quantitative data in which each data value
is plotted as a point (or dot) above a horizontal scale of
values. Dots representing equal values are stacked.
Graphs that Enlighten: Dotplots (2 of 2)
• Dotplots
– Features of a Dotplot
▪ Displays the shape of distribution of data.
▪ It is usually possible to recreate the original list of
data values.
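A minimal text sketch of a dotplot, assuming small integer values from 0 to 9 so that each scale label is one character wide. The data are hypothetical. The last two statements illustrate the second feature: the original list can be rebuilt by counting the dots in each column.

```python
from collections import Counter

def text_dotplot(values):
    """Return the rows of a text dotplot for single-digit non-negative integers."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)
    rows = []
    # Build from the tallest stack down so equal values appear stacked.
    for level in range(max(counts.values()), 0, -1):
        rows.append(" ".join("o" if counts[v] >= level else " " for v in scale))
    rows.append(" ".join(str(v) for v in scale))  # horizontal scale
    return rows

data = [3, 5, 5, 6, 6, 6, 7, 7, 9]  # hypothetical values
rows = text_dotplot(data)
print("\n".join(rows))

# Recreate the data from the plot: column i of the scale sits at index 2 * i.
scale = [int(label) for label in rows[-1].split()]
recovered = [v for i, v in enumerate(scale) for row in rows[:-1] if row[2 * i] == "o"]
```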
Stemplots (1 of 2)
• Stemplots (or stem-and-leaf plot)
– Represents quantitative data by separating each value
into two parts: the stem (such as the leftmost digit) and
the leaf (such as the rightmost digit).
Stemplots (2 of 2)
• Stemplots (or stem-and-leaf plot)
– Features of a Stemplot
▪ Shows the shape of the distribution of the data.
▪ Retains the original data values.
▪ The sample data are sorted (arranged in order).
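As an illustrative sketch, a stemplot for non-negative integers can be built by taking the last digit as the leaf and the remaining digits as the stem. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digits)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting first keeps the leaves in order
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    # Include empty stems so gaps in the data stay visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}

data = [64, 80, 56, 72, 88, 75, 68, 96, 72, 81]  # hypothetical values
plot = stemplot(data)
for stem, leaves in plot.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")

# The original values are retained: each one is 10 * stem + leaf.
rebuilt = [10 * stem + leaf for stem, leaves in plot.items() for leaf in leaves]
```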
Time-Series Graph (1 of 2)
• Time-Series Graph
– A graph of time-series data, which are quantitative data
that have been collected at different points in time, such
as monthly or yearly
Time-Series Graph (2 of 2)
• Time-Series Graph
– Feature of a Time-Series Graph
▪ Reveals information about trends over time.
Bar Graph (1 of 2)
• Bar Graphs
– A graph of bars of equal width to show frequencies of
categories of categorical (or qualitative) data. The bars
may or may not be separated by small gaps.
Bar Graph (2 of 2)
• Bar Graphs
– Feature of a Bar Graph
▪ Shows the relative distribution of categorical data so
that it is easier to compare the different categories.
Pareto Chart (1 of 3)
• Pareto Charts
– A Pareto chart is a bar graph for categorical data, with
the added stipulation that the bars are arranged in
descending order according to frequencies, so the
bars decrease in height from left to right.
Pareto Chart (2 of 3)
• Pareto Charts
– Features of a Pareto Chart
▪ Shows the relative distribution of categorical data so
that it is easier to compare the different categories.
▪ Draws attention to the more important categories.
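A small sketch of the ordering step behind a Pareto chart: sort the categories by descending frequency. The cumulative share column is an optional extra that the slides do not mention, and the complaint counts are hypothetical.

```python
def pareto_order(frequencies):
    """Return (category, count, cumulative share) tuples sorted by descending count."""
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    total = sum(frequencies.values())
    result, running = [], 0
    for category, count in ordered:
        running += count
        result.append((category, count, running / total))
    return result

# Hypothetical complaint counts, made up for illustration.
complaints = {"Billing": 12, "Delivery": 30, "Quality": 45, "Other": 3}
chart = pareto_order(complaints)
for category, count, cumulative in chart:
    print(f"{category:<10} {'#' * count} ({cumulative:.0%})")
```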
Pareto Chart (3 of 3)
Pie Chart (1 of 3)
• Pie Charts
– A very common graph that depicts categorical data
as slices of a circle, in which the size of each slice is
proportional to the frequency count for the category
Pie Chart (2 of 3)
• Pie Charts
– Feature of a Pie Chart
▪ Shows the distribution of categorical data in a commonly
used format.
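Because each slice is proportional to its frequency count, the central angle of a slice is 360° times that category's share of the total. A minimal sketch with hypothetical categories and counts:

```python
def slice_angles(frequencies):
    """Central angle in degrees for each category: 360 * count / total."""
    total = sum(frequencies.values())
    return {category: 360 * count / total for category, count in frequencies.items()}

counts = {"A": 10, "B": 25, "C": 15}  # hypothetical category counts
angles = slice_angles(counts)
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```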
Pie Chart (3 of 3)
Frequency Polygon (1 of 3)
• Frequency Polygon
– A graph using line segments connected to points
located directly above class midpoint values
– A frequency polygon is very similar to a histogram, but a
frequency polygon uses line segments instead of bars.
Frequency Polygon (2 of 3)
• Frequency Polygon
– A variation of the basic frequency polygon is the relative
frequency polygon, which uses relative frequencies
(proportions or percentages) for the vertical scale.
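As a sketch, the points of a frequency polygon can be computed from class limits by taking each class midpoint as the average of its lower and upper limits. The classes and frequencies below reuse the time data (in seconds) from the histogram section; the variable names are illustrative.

```python
# ((lower class limit, upper class limit), frequency), times in seconds.
classes = [
    ((75, 124), 11),
    ((125, 174), 24),
    ((175, 224), 10),
    ((225, 274), 3),
    ((275, 324), 2),
]

total = sum(freq for _, freq in classes)

# Frequency polygon: one point above each class midpoint.
polygon = [((lo + hi) / 2, freq) for (lo, hi), freq in classes]

# Relative frequency polygon: same midpoints, proportions on the vertical scale.
relative_polygon = [(mid, freq / total) for mid, freq in polygon]

for (mid, freq), (_, rel) in zip(polygon, relative_polygon):
    print(f"{mid:6.1f}  {freq:3d}  {rel:.2f}")
```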
Frequency Polygon (3 of 3)
Relative Frequency Polygon
Graphs That Deceive (1 of 4)
• Nonzero Vertical Axis
– A common deceptive graph involves using a vertical
scale that starts at some value greater than zero to
exaggerate differences between groups.
Graphs That Deceive (2 of 4)
• Nonzero Vertical Axis
Always examine a graph carefully to see whether a vertical
axis begins at some point other than zero so that differences
are exaggerated.
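A rough way to quantify the exaggeration is to compare the ratio of the drawn bar heights with the ratio of the actual values. The function and the numbers below are hypothetical, chosen only to illustrate the effect.

```python
def drawn_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights (b to a) when the vertical axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Hypothetical values: two groups measured at 50 and 55 units.
true_ratio = drawn_ratio(50, 55)                 # axis starts at zero
distorted = drawn_ratio(50, 55, axis_start=45)   # axis starts at 45

print(f"true ratio: {true_ratio:.2f}, drawn ratio: {distorted:.2f}")
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as tall.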
Graphs That Deceive (3 of 4)
• Pictographs
– Drawings of objects, called pictographs, are often
misleading. Data that are one-dimensional in nature
(such as budget amounts) are often depicted with
two-dimensional objects (such as dollar bills) or
three-dimensional objects (such as stacks of coins, homes,
or barrels).
Graphs That Deceive (4 of 4)
• Pictographs
– By using pictographs, artists can create false
impressions that grossly distort differences by using
these simple principles of basic geometry:
▪ When you double each side of a square, its area
doesn’t merely double; it increases by a factor of four.
▪ When you double each side of a cube, its volume
doesn’t merely double; it increases by a factor of eight.
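The two geometric principles above follow from scaling every side length s by a factor k (the symbols s and k are introduced here for illustration):

```latex
\[
A(ks) = (ks)^2 = k^2 s^2, \qquad V(ks) = (ks)^3 = k^3 s^3
\]
\[
k = 2: \quad A(2s) = 4s^2, \qquad V(2s) = 8s^3
\]
```

So a pictograph that doubles every dimension to show a doubled amount makes the object look four times (area) or eight times (volume) as large.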
Pictographs
Concluding Thoughts (1 of 2)
In addition to the graphs we have discussed in this
section, there are many other useful graphs - some of
which have not yet been created. The world needs more
people who can create original graphs that enlighten us
about the nature of data.
Concluding Thoughts (2 of 2)
In The Visual Display of Quantitative Information,
Edward Tufte offers these principles:
• For small data sets of 20 values or fewer, use a table
instead of a graph.
• A graph of data should make us focus on the true nature
of the data, not on other elements, such as eye-catching
but distracting design features.
• Do not distort data; construct a graph to reveal the true
nature of the data.
• Almost all of the ink in a graph should be used for the
data, not for other design elements.
2-4 Scatterplots, Correlation, and Regression
Key Concept
Introduce the analysis of paired sample data.
Discuss correlation and the role of a graph called a
scatterplot, and provide an introduction to the use of
the linear correlation coefficient.
Provide a very brief discussion of linear regression,
which involves the equation and graph of the straight
line that best fits the sample paired data.
Scatterplot and Correlation (1 of 2)
• Correlation
– A correlation exists between two variables when the
values of one variable are somehow associated with
the values of the other variable.
• Linear Correlation
– A linear correlation exists between two variables
when there is a correlation and the plotted points of
paired data result in a pattern that can be approximated
by a straight line.
Scatterplot and Correlation (2 of 2)
• Scatterplot (or Scatter Diagram)
– A scatterplot (or scatter diagram) is a plot of paired
(x, y) quantitative data with a horizontal x-axis and a
vertical y-axis. The horizontal axis is used for the first
variable (x), and the vertical axis is used for the second
variable (y).
Example: Waist and Arm
Correlation (1 of 2)
• Correlation: The distinct pattern of the plotted points
suggests that there is a correlation between waist
circumferences and arm circumferences.
Example: Waist and Arm
Correlation (2 of 2)
• No Correlation: The plotted points do not show a distinct
pattern, so it appears that there is no correlation between
weights and pulse rates.
Linear Correlation Coefficient r
• Linear Correlation Coefficient r
– The linear correlation coefficient is denoted by r,
and it measures the strength of the linear association
between two variables.
Using r for Determining Correlation
The computed value of the linear correlation
coefficient, r, is always between −1 and 1, inclusive.
• If r is close to −1 or close to 1, there appears to be a
linear correlation.
• If r is close to 0, there does not appear to be a linear
correlation.
Example: Correlation between Shoe
Print Lengths and Heights? (1 of 2)
Shoe Print Length (cm)   29.7    29.7    31.4    31.8    27.6
Height (cm)              175.3   177.8   185.4   175.3   172.7
Example: Correlation between Shoe
Print Lengths and Heights? (2 of 2)
It isn’t very clear whether
there is a linear correlation.
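The slides do not give a formula for r. As a sketch, the standard Pearson formula can be applied to the five pairs above, assuming the values are paired in the order listed (first shoe print length with first height, and so on).

```python
from math import sqrt

def linear_correlation(xs, ys):
    """Pearson linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Paired sample data from the example (pairing by position is an assumption).
shoe_print_cm = [29.7, 29.7, 31.4, 31.8, 27.6]
height_cm = [175.3, 177.8, 185.4, 175.3, 172.7]

r = linear_correlation(shoe_print_cm, height_cm)
print(round(r, 3))
```

The result is neither close to 0 nor close to 1, which matches the impression that the scatterplot alone is inconclusive.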
P-Value
• P-Value
– If there really is no linear correlation between two
variables, the P-value is the probability of getting
paired sample data with a linear correlation coefficient
r that is at least as extreme as the one obtained from
the paired sample data.
Interpreting a P-Value from the
Previous Example
The P-value of 0.294 is high. It
shows there is a high chance of
getting a linear correlation
coefficient of r = 0.591 (or more
extreme) by chance when there
is no linear correlation between
the two variables.
Interpreting a P-Value from the
Example Where n = 5
Because the likelihood of getting r = 0.591 or a more
extreme value by chance is so high (29.4% chance), there is
not sufficient evidence to conclude that there is a
linear correlation between shoe print lengths and heights.
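The slides do not say how the P-value was computed. One plausible route, assumed here, is the usual t test for a correlation coefficient, with t = r·√(n − 2)/√(1 − r²) and n − 2 degrees of freedom. For n = 5 the Student t distribution has 3 degrees of freedom, and its cumulative distribution has a closed form, so the sketch below needs no statistics library. It does not generalize to other sample sizes.

```python
from math import atan, pi, sqrt

def p_value_df3(r):
    """Two-tailed P-value for r from n = 5 pairs (3 degrees of freedom only)."""
    t = r * sqrt(3) / sqrt(1 - r * r)  # t = r * sqrt(n - 2) / sqrt(1 - r^2), n = 5
    u = abs(t) / sqrt(3)
    # Closed-form Student t CDF, valid only for 3 degrees of freedom.
    cdf = 0.5 + (atan(u) + u / (1 + u * u)) / pi
    return 2 * (1 - cdf)

p = p_value_df3(0.591)
print(round(p, 3))
```

Under these assumptions the result should agree closely with the P-value of 0.294 quoted in the example.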
Interpreting a P-Value
Only a small P-value, such as 0.05 or less (or a 5%
chance or less), suggests that the sample results are
not likely to occur by chance when there is no linear
correlation, so a small P-value supports a conclusion
that there is a linear correlation between the two
variables.
Example: Correlation between Shoe
Print Lengths and Heights (n = 40)
Example: Correlation between Shoe
Print Lengths and Heights
The scatterplot shows a distinct pattern. The value of the
linear correlation coefficient is r = 0.813, and the P-value
is 0.000 (rounded to three decimal places). Because this
P-value is small, we have sufficient evidence to conclude
there is a linear correlation between shoe print lengths
and heights.
Regression
• Regression
– Given a collection of paired sample data, the regression
line (or line of best fit, or least-squares line) is the
straight line that “best” fits the scatterplot of the data.
Example: Regression Line (1 of 2)
Example: Regression Line (2 of 2)
The regression equation has the general form
ŷ = b0 + b1x, with a y-intercept of b0 = 80.9
and a slope of b1 = 3.22.
Using variable names, the equation is:
Height = 80.9 + 3.22 (Shoe Print Length)
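A minimal sketch of the least-squares computation follows, using the standard formulas b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1·x̄, which are not spelled out in the slides. The full data set behind the equation above is not listed in this section, so the fit is demonstrated on the five pairs from the earlier n = 5 example and gives different coefficients. The helper `predict_height` is hypothetical and simply evaluates the equation quoted above.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def predict_height(shoe_print_cm, b0=80.9, b1=3.22):
    """Evaluate Height = b0 + b1 * (Shoe Print Length); defaults are the quoted values."""
    return b0 + b1 * shoe_print_cm

# Five pairs from the earlier example, not the data behind the quoted equation.
b0, b1 = least_squares_line(
    [29.7, 29.7, 31.4, 31.8, 27.6],
    [175.3, 177.8, 185.4, 175.3, 172.7],
)
print(f"five-pair fit: Height = {b0:.1f} + {b1:.2f} (Shoe Print Length)")
print(f"quoted equation at 30 cm: {predict_height(30.0):.1f} cm")
```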