Module 6 Exercises
Find examples of data presentations that you consider to be particularly insightful and/or powerful. Discuss their strengths and weaknesses.
Find examples of data presentations that you consider to be particularly misleading and/or useless. Discuss their strengths and weaknesses.
How do you think new technologies (e.g. virtual or augmented reality, 3D-printing, wearable computing) will influence data presentations?
Consider the following datasets: GlobalCitiesPBI.csv, 2016collisionsfinal.csv, polls_us_election_2016.csv, and HR_2016_Census_simple.csv.
Create a data dictionary for this dataset. Establish a list of variables that you think are crucial to a good understanding of the dataset. Justify your choices.
Create (at least) 5 bivariate/univariate visualizations that can help you understand the dataset.
Produce (at least) 3 “definitive” visualizations for the dataset. Use the principles discussed in class (including documentation, legends, annotations, Multiple I’s, etc.). Emphasis should be placed on content AND on presentation (suggestions: consider creating a reasonably high number of charts using a random selection of a random number of variables in order to minimize the odds of missing out on useful information).
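As a starting point for the data dictionary and the "random selection of variables" suggestion above, here is a minimal R sketch. The data frame `df`, its columns, and the helper names `make_dictionary()` and `random_subsets()` are all hypothetical stand-ins; in practice `df` would be read from one of the CSV files with `read.csv()`. It only produces a skeleton: variable descriptions and units still have to be written by hand.

```r
# Hypothetical stand-in data (made-up values); in practice, read one of the
# exercise files instead, e.g. df <- read.csv("GlobalCitiesPBI.csv").
df <- data.frame(
  city       = c("A", "B", "C", "D", "E"),
  population = c(1.2, 3.4, NA, 0.8, 2.1),
  coastal    = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Skeleton data dictionary: one row per variable, with its type, the number
# of missing values, the number of distinct values, and an example value.
make_dictionary <- function(d) {
  data.frame(
    variable   = names(d),
    type       = vapply(d, function(x) class(x)[1], character(1)),
    n_missing  = vapply(d, function(x) sum(is.na(x)), integer(1)),
    n_distinct = vapply(d, function(x) length(unique(x[!is.na(x)])), integer(1)),
    example    = vapply(d, function(x) as.character(x[!is.na(x)][1]), character(1)),
    row.names  = NULL,
    stringsAsFactors = FALSE
  )
}

dict <- make_dictionary(df)
print(dict)

# Random subsets of variables to chart: each draw picks a random number of
# variables (here 1 or 2), then picks that many variable names at random.
random_subsets <- function(d, n_draws = 5, max_vars = 2) {
  lapply(seq_len(n_draws), function(i) {
    k <- sample(seq_len(min(max_vars, ncol(d))), 1)
    sample(names(d), k)
  })
}

set.seed(1)  # for reproducibility of the random draws
subsets <- random_subsets(df)
```

Each element of `subsets` names the variables for one exploratory chart (a univariate chart when one name is drawn, a bivariate chart when two are drawn).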
Repeat the previous question with any dataset of your liking.
Identify a scenario for which a dashboard could prove useful. Determine specific questions that the dashboard could help answer or insights that it could provide. Identify data sources and data elements that could be fed into your dashboard. Design a display (with pen and paper) with mock charts. What are the strengths and limitations of your dashboard? Is it functional? Elegant?
The remaining exercises use the Gapminder Tools (there is also an offline version).
At what point in the data science workflow do you think that visualizations of this nature could be useful?
What are the ways in which observations could be anomalous? Have you found any such anomalies? Do you have explanations for them? In particular, consider the case of South Africa in 2012, which appears to be a clear outlier. Follow the path of the South African bubble from 1975 to 2020, in relation to the general pattern. Does the apartheid/income inequity explanation suggested in the text still make sense?
Pick 2+ “definitive” visualizations (methods, variables, etc.) other than the default configuration. What are some important insights?
How would you describe the insights of step 3 without resorting to visual vocabulary?
Can you think of ways in which the data of interest to you in your day-to-day activities could benefit from the same treatment? What situations could you explore in such a scenario? How would that help your team better understand the system under consideration?
Consider the following Australian population figures, by state (in 1000s):
Graph the New South Wales (NSW) population with all defaults using plot(). Redo the graph by adding a title, a line to connect the points, and some colour.
Compare the population of New South Wales (NSW) and the Australian Capital Territory (ACT) by using the functions plot() and lines(), then add a legend to appropriately display your graph.
Use a bar chart to graph the population of Queensland (QLD), add an appropriate title to your graph, and display the years from 1917 to 2017 on the appropriate bars.
Create a light blue histogram for the population of South Australia (SA).
Year | NSW | Vic. | Qld | SA | WA | Tas. | NT | ACT | Aust. |
1917 | 1904 | 1409 | 683 | 440 | 306 | 193 | 5 | 3 | 4941 |
1927 | 2402 | 1727 | 873 | 565 | 392 | 211 | 4 | 8 | 6182 |
1937 | 2693 | 1853 | 993 | 589 | 457 | 233 | 6 | 11 | 6836 |
1947 | 2985 | 2055 | 1106 | 646 | 502 | 257 | 11 | 17 | 7579 |
1957 | 3625 | 2656 | 1413 | 873 | 688 | 326 | 21 | 38 | 9640 |
1967 | 4295 | 3274 | 1700 | 1110 | 879 | 375 | 62 | 103 | 11799 |
1977 | 5002 | 3837 | 2130 | 1286 | 1204 | 415 | 104 | 214 | 14192 |
1987 | 5617 | 4210 | 2675 | 1393 | 1496 | 449 | 158 | 265 | 16264 |
1997 | 6274 | 4605 | 3401 | 1480 | 1798 | 474 | 187 | 310 | 18532 |
2007 | 6889 | 5205 | 4182 | 1585 | 2106 | 493 | 215 | 340 | 21017 |
2017 | 7861 | 6324 | 4928 | 1723 | 2580 | 521 | 246 | 410 | 24599 |
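To work with these figures in R, the table first has to be entered as a data frame. The sketch below transcribes it (the object name `aus` and the column names without trailing periods are choices made here, not part of the exercise) and illustrates the `plot()`, `lines()`, `legend()`, and `barplot()` pattern on states other than the ones the questions ask about, so that the same calls can be adapted.

```r
# Population figures (in 1000s), transcribed from the table above.
aus <- data.frame(
  Year = seq(1917, 2017, by = 10),
  NSW  = c(1904, 2402, 2693, 2985, 3625, 4295, 5002, 5617, 6274, 6889, 7861),
  Vic  = c(1409, 1727, 1853, 2055, 2656, 3274, 3837, 4210, 4605, 5205, 6324),
  Qld  = c(683, 873, 993, 1106, 1413, 1700, 2130, 2675, 3401, 4182, 4928),
  SA   = c(440, 565, 589, 646, 873, 1110, 1286, 1393, 1480, 1585, 1723),
  WA   = c(306, 392, 457, 502, 688, 879, 1204, 1496, 1798, 2106, 2580),
  Tas  = c(193, 211, 233, 257, 326, 375, 415, 449, 474, 493, 521),
  NT   = c(5, 4, 6, 11, 21, 62, 104, 158, 187, 215, 246),
  ACT  = c(3, 8, 11, 17, 38, 103, 214, 265, 310, 340, 410),
  Aust = c(4941, 6182, 6836, 7579, 9640, 11799, 14192, 16264, 18532, 21017, 24599)
)

# Transcription check: the state figures should add up to roughly the
# national total (small discrepancies are expected in published figures).
states <- c("NSW", "Vic", "Qld", "SA", "WA", "Tas", "NT", "ACT")
gap <- aus$Aust - rowSums(aus[, states])
print(gap)

# Pattern for a titled, coloured line chart comparing two series.
plot(aus$Year, aus$Vic, type = "b", col = "darkgreen",
     main = "Population of Victoria and Western Australia",
     xlab = "Year", ylab = "Population (1000s)",
     ylim = c(0, max(aus$Vic)))
lines(aus$Year, aus$WA, type = "b", col = "purple")
legend("topleft", legend = c("Vic.", "WA"),
       col = c("darkgreen", "purple"), lty = 1, pch = 1)

# Pattern for a bar chart with the years displayed under the bars.
barplot(aus$Tas, names.arg = aus$Year,
        main = "Population of Tasmania (1000s)")
```

Setting `ylim` explicitly in the first `plot()` call matters when a second series is added with `lines()`, because the axis range is fixed by the first call.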
Create a useful dashboard out of the Gapminder.csv dataset.
Use the “Accounting Transactions” set-up of Section 5.3.
Create a measure to show the ratio of Minor to Major Capital.
Create a line chart using this measure; use “Effective Date” for the “X Axis”.
Create a slicer that uses “Effective Date”, and explore its effect on the chart’s shape.
Add the slicer to some of the previous charts from the section and explore its effects on their shape.
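The exercise itself is meant to be done in Power BI, but the logic of a ratio measure that reacts to a date slicer can be sketched in R. Everything below is an assumption made for illustration: the data frame `tx`, its column names, and its values are hypothetical and are not taken from the Section 5.3 files. The point is that the ratio is recomputed from whichever rows survive the date filter, which is why the chart changes shape when the slicer moves.

```r
# Hypothetical transactions; column names and amounts are illustrative only.
tx <- data.frame(
  effective_date = as.Date(c("2020-01-15", "2020-01-20", "2020-02-10",
                             "2020-02-25", "2020-03-05", "2020-03-18")),
  capital_type   = c("Minor", "Major", "Minor", "Major", "Minor", "Major"),
  amount         = c(100, 200, 150, 300, 50, 500)
)

# "Measure": total Minor capital divided by total Major capital, computed
# only on the rows inside the selected date range (the "slicer").
minor_major_ratio <- function(d,
                              from = min(d$effective_date),
                              to   = max(d$effective_date)) {
  d <- d[d$effective_date >= from & d$effective_date <= to, ]
  sum(d$amount[d$capital_type == "Minor"]) /
    sum(d$amount[d$capital_type == "Major"])
}

overall <- minor_major_ratio(tx)                                # all dates
feb_mar <- minor_major_ratio(tx, from = as.Date("2020-02-01"))  # sliced

# The same measure evaluated month by month, as a line chart would show it.
month <- format(tx$effective_date, "%Y-%m")
is_minor <- tx$capital_type == "Minor"
minor <- tapply(tx$amount[is_minor], month[is_minor], sum)
major <- tapply(tx$amount[!is_minor], month[!is_minor], sum)
ratio_by_month <- minor / major[names(minor)]
print(ratio_by_month)
```

Note that the ratio of totals over a period is not the average of the monthly ratios, which is one reason a slicer can change the picture more than expected.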
Use the “Accounting Transactions” set-up of Section 5.3.
Create a new bar chart.
Use “Journal Voucher Type Code” as the Axis.
Use “Item Amount” as the Value.
Change the chart so that it is sorted using a “Custom Sort Order” column (ordering according to some custom order).
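The idea behind a custom sort order column can also be sketched outside Power BI. In the hedged R example below, the voucher codes, amounts, and the lookup table `sort_order` are all hypothetical; the lookup table plays the role of the "Custom Sort Order" column, and the bars are drawn in the order of its key rather than alphabetically.

```r
# Hypothetical voucher codes and amounts; not taken from the Section 5.3 files.
jv <- data.frame(
  code   = c("B", "A", "C", "A", "B", "C"),
  amount = c(20, 10, 5, 30, 15, 25)
)

# Lookup table acting as the custom sort order: lower keys are drawn first.
sort_order <- data.frame(code = c("C", "A", "B"), sort_key = c(1, 2, 3))

# Total amount per code (aggregate() returns the codes in alphabetical order).
totals <- aggregate(amount ~ code, data = jv, FUN = sum)

# Attach the sort key to each code, then reorder the rows by that key.
totals <- merge(totals, sort_order, by = "code")
totals <- totals[order(totals$sort_key), ]
print(totals)

barplot(totals$amount, names.arg = totals$code,
        main = "Item amount by voucher type (custom order)")
```

Keeping the order in a separate lookup table, rather than hard-coding it in the chart, means the same ordering can be reused by every chart that shares the axis.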
Use the “Accounting Transactions” set-up of Section 5.3.
Create a chart that contains values from two or more of the tables with “Project Identifier” as an axis.
Go into any one of the data files and add a new project on the bottom with a NEW Project Identifier that is not on the list.
Save the Excel file and then update the Power BI data model. What happens?
Use the “Accounting Transactions” set-up of Section 5.3, with the appended tables.
Create a chart that contains values from two or more of the tables with “Project Identifier” on the axis.
Go into any one of the data files and add a new project on the bottom with a NEW Project Identifier that is not found in the list.
Save the Excel file and then update the Power BI data model. What happens?
Determine if the following are stories, as discussed in Visualization and Storytelling.
Two identical infants lay in a cradle. “One you bore, the other is a Changeling. Choose wisely,” the Fae’s voice echoed from the shadow. “I’m taking both my children,” the mother said defiantly.
Solomon was required to decide which of two women was the mother of a baby, when each of them claimed parenthood. Both had recently given birth, but one child had died. Solomon announced that the child should be cut in two, so that each mother should have half. The real mother, unable to bear her son being killed, immediately offered it to the other woman, to save the child’s life, whereas the other agreed to the proposal. The false mother was thus exposed, and Solomon returned the living child to its real mother.
Yankel and Moishe ride a train across Poland, thinking about their brides – whom they’re about to meet for the first time. Suddenly, Yankel stands up and says, “I’m not ready for marriage. I’m not getting married!” He grabs his suitcase and runs off the train at the next stop. Moishe watches him go. A day later he reaches his destination, where the two mothers of the prospective brides are shocked to discover there’s only one groom on the train. “He’s mine!” “Not on your life! He’ll marry my daughter!” Moishe lets his prospective mothers-in-law argue over him for a while, then suggests they all go to the Rebbe for a ruling. “The solution is plain, per the wisdom of King Solomon. Cut the boy in two, and each of you take half.” The first mother looks shocked. The second mother says, “Well, Rebbe, I didn’t think that you had it in you. Yes! Cut him in half.” The Rebbe points to the second mother and says, “That’s the real mother-in-law!”
For sale: baby shoes. Never worn.
Doctors think that they may have improved the diagnosis of liver disease by 1%.
Scientists claim cure for cancer.
Spiritualist medium claims cure for cancer.
Sens rally after blowing lead; beat Leafs to gain on Habs.
Macbeth and his wife
Want to become the royals
So they kill ’em all.
The following chart from FiveThirtyEight:
The following news report, from the Associated Press:
Find examples of stories that are about individuals; about organizations; about cultures/societies.
Find examples of stories that are used to persuade; to educate; to entertain.
Find examples of data stories that are about individuals; about organizations; about cultures/societies.
Find examples of data stories that are used to persuade; to educate; to entertain.
In your organization, who makes up the audience? Is there only one audience? What are the storytelling goals? Is the storytelling context clear? Constant? Universal?
The Death of the Author: Isaac Asimov once sat anonymously in a class where the topic of discussion was one of his stories. He introduced himself afterwards to the teacher, saying that he had found his interpretation of the story interesting, but it wasn’t really what he had meant at all. The teacher’s response was “Just because you wrote it, what makes you think you have the slightest idea what it’s about?” How could this come into play when telling stories with data?
Guess the story:
A group of friends spends 9 hours returning jewellery.
A talking frog convinces a son to kill his father.
A young woman talks to furniture and marries her kidnapper.
A depressed, widowed father teams up with a mentally-ill woman to find his disabled son.
In the following charts, who is the intended audience? What are the goals? Are the outcomes universal?
Select a few stories of your liking (all genres, media, formats are on the table).
Build the corresponding story spines.
Identify some of their storytelling tropes.
Build the corresponding story molecules.
How would you fix the following stories, using the concepts presented in this module?
One day I woke up. I went outside and there was weather. I came back inside and did things.
Once upon a time there was a hero. She left on an adventure and fought a giant dragon. Then she fought another giant dragon. Then she fought another giant dragon. Then she fought another dragon. Then she went home, victorious.
The rain was bucketing down in immense impenetrable sheets of torrential freezing water. “Oh woe is me!”, the brave but sobbing child screamed. How shall I ever manage to dash as quickly as humanly possible into the terrible wreck of an ancient school bus without completely mangling my magnificent hair. Indeed, the desolate child failed. She was the laughingstock of all of the pompous fools on the school bus.
Start by taking a few moments to think of a very short story you can tell (context, events, outcome).
Pair up and tell the story (one of you will be the storyteller, the other person is the audience).
Pair up again and tell the story (the storytellers should become the audience for someone who was an audience of another group in part a.)
Pair up a third time and repeat parts a. and b., with different teammates if possible.
How did the story change the second time you told it? What made you change it?
Think of a work story. Create a sketch that could illustrate this work story. What visual storytelling choices and combinations would you consider using? Would accessibility considerations change the way in which the story is presented to the audience?
Re-cast the data stories presented in Visualization and Storytelling (or other data stories, as required) using different visual storytelling choices and combinations.
Find good candidates for the type of chart that could be used with the following:
A dataset with information about financial transactions throughout the year.
A dataset with the results of a survey of employee work satisfaction.
A dataset showing levels of regulatory compliance with a set of government regulations (e.g., regulations relating to environmental reporting).
Any other dataset of your choosing.
Identify instances of scoping, exploration, explanation, persuasion among the dashboards and charts presented in Visualization and Storytelling (or other data stories, as required).
What do you think the underlying dataset structure and limitations are?
What analytical and data focus choices are at play?
Are the charts falsifiable?
What do you think is missing to turn this evolved data chart into a data story?
Evolve the following charts into data stories. Focus on the message and how to avoid misleading the audience. Use data storytelling tropes as needed.

Evolution 5
Consider a data question of interest to you personally, your organization, or your society.
Identify the target audience and the goals for your dashboard.
Do you require an exploration dashboard? A storybook? A situational awareness dashboard? Some combination of the above?
Identify the presentation requirements for your dashboard.
Create a storyboard for your dashboard.
What type of narrative and logic do you think would best serve your needs?
Comment on the aesthetics of the following charts, according to the Gestalt principles, their use of colours, clutter (or lack thereof), size and positioning, etc. Provide suggestions for improvements.
Consider the following examples of charts found in the wild. Are they examples of exploration, storytelling, situational awareness? Are they data stories? If not, how would you turn them into stories? If so, are they good stories? Bad ones? Ugly ones? If they are not good stories, how would you improve them?
Identify uses of the Gestalt principles in the charts presented in the previous question.
Deconstruct the charts introduced in the book in terms of the grammar of graphics. What do some of the most effective charts have in common? What about the least effective ones? Does that suggest a strategy?
Create some simple ggplot2 visualizations with data available in R (the available built-in datasets are listed by running data() at the R command prompt). The emphasis should be on becoming familiar with various geometries, their aesthetics, and the use of facets. You may use the examples found in the text as the basis of your work.
The custdata.tsv file is derived from U.S. Census PUMS data. This synthetic dataset contains customer information for individuals whose health insurance status is known. Recreate the following ggplot2 charts (ignore missing values in this example).
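For the exercise on simple ggplot2 visualizations with built-in data, the following minimal sketch shows three geometries, a few aesthetics, and one facet specification. It assumes the ggplot2 package is installed and uses the built-in `mtcars` dataset; the object names (`cars`, `p_scatter`, `p_hist`, `p_box`) and the particular variable choices are made here for illustration only.

```r
library(ggplot2)

# mtcars ships with R. Recode two numeric columns as factors so that they
# can be mapped to discrete aesthetics and used for faceting.
cars <- transform(
  mtcars,
  cyl = factor(cyl),
  am  = factor(am, labels = c("automatic", "manual"))
)

# Point geometry: x, y, and colour aesthetics, one panel per transmission type.
p_scatter <- ggplot(cars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point(size = 2) +
  facet_wrap(~ am) +
  labs(title = "Fuel efficiency against weight",
       x = "Weight (1000 lbs)", y = "Miles per gallon",
       colour = "Cylinders")

# Histogram geometry: a single variable, with fixed (unmapped) fill and colour.
p_hist <- ggplot(cars, aes(x = mpg)) +
  geom_histogram(bins = 10, fill = "steelblue", colour = "white") +
  labs(x = "Miles per gallon", y = "Count")

# Boxplot geometry: a numeric variable summarized within each category.
p_box <- ggplot(cars, aes(x = cyl, y = mpg)) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per gallon")

print(p_scatter)
```

Aesthetics placed inside `aes()` are mapped from the data (as with `colour = cyl`), whereas those set outside it (as with `fill = "steelblue"`) are constants; mixing up the two is a common source of unexpected legends.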