2.2 Workhorse Data Visualizations

Some types of simple charts are used extensively for data exploration (and even for data communication and storytelling).

2.2.1 Rug Plots

Gaps in the number line indicate an absence of those numeric values in the dataset.

Remember: this is (possibly) different from the order in which values appear in the dataset. Since it is a number line, it shows where the values fall numerically.

If values are exactly the same, their marks will be on top of each other, so repeated values look like a single mark.
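The three points above can be illustrated with a minimal text-only sketch. The helper `text_rug` and the sample values are hypothetical, chosen only for illustration, and the sketch assumes integer data so that each position on the number line maps to one character.

```python
from collections import Counter

def text_rug(values, lo, hi):
    """Draw a one-line rug for integer values on the number line lo..hi."""
    counts = Counter(values)
    # One character per integer position: '|' where at least one value falls,
    # '.' where the dataset has no value (a gap in the number line).
    return "".join("|" if counts[x] else "." for x in range(lo, hi + 1))

data = [7, 2, 9, 2, 3, 9, 9]   # hypothetical values, in dataset order
rug = text_rug(data, 0, 10)
print(rug)

# Positions with no mark are values absent from the dataset.
gaps = [x for x in range(0, 11) if x not in data]
print("absent values:", gaps)

# Seven observations produce only four marks: duplicates overlap.
print(len(data), "values,", rug.count("|"), "marks")
```

Sorting or shuffling `data` leaves the rug unchanged, since a mark's position depends only on the value and not on where that value sits in the dataset.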

2.2.2 Bar Charts and Histograms

Pros: known by many non-technical individuals; easy to read (looks like something right out of high school); can be adorned with added information (median, mean, hairs, etc.)

Cons: somewhat depressingly, a vast majority of the population does not know how to read them, even though they are conceivably among the simplest graphical representations.
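As a rough sketch of what a histogram computes, the snippet below counts values in equal-width bins and prints a text bar for each bin, then reports the mean and median that could be used to adorn the chart. The function `histogram`, the bin settings, and the `scores` data are all assumptions made for illustration; real plotting libraries handle bin edges in their own ways.

```python
from statistics import mean, median

def histogram(values, start, width, n_bins):
    """Count values in n_bins equal-width bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:        # values outside the binned range are ignored
            counts[i] += 1
    return counts

# Hypothetical integer scores, used only to exercise the function.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91]
counts = histogram(scores, start=50, width=10, n_bins=5)

# One text bar per bin; the bar length is the bin count.
for i, c in enumerate(counts):
    lo = 50 + 10 * i
    print(f"{lo}-{lo + 9}: {'#' * c}")

# Summary statistics that could be overlaid on the chart.
print(f"mean = {mean(scores):.1f}, median = {median(scores)}")
```

The bin width is a design choice: wider bins smooth the shape, narrower bins show more detail but also more noise.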

2.2.3 Line Charts

Pros: classic look; easy to read (looks like something right out of high school)

Cons: the right look does not necessarily convey a valid point (but that’s an analysis problem, not necessarily a visualization one)

2.2.4 Scatterplots

Pros: easy to spot (suspected) outliers or clusters; provides a detailed visual representation of two (or more) variables at a time; in a scatterplot matrix, the diagonal and upper triangular elements can be used to showcase other dataset features

Cons: not as easy to determine centrality, spread and skewness (still possible, nevertheless); complexity increases with the number of variables; what about relationships between 3 or more variables?
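To make the remarks about scatterplot matrices more concrete, here is a small sketch that assigns a role to each cell of a k-by-k grid and counts how many distinct scatterplots are needed. The function `splom_layout`, the variable names, and the choice of what goes in each region are assumptions for illustration, not a prescribed layout.

```python
def splom_layout(names):
    """Assign a role to every cell of a k-by-k scatterplot matrix."""
    layout = {}
    for i, row in enumerate(names):
        for j, col in enumerate(names):
            if i == j:
                # A variable plotted against itself carries no information,
                # so the diagonal can hold a single-variable summary instead.
                layout[(row, col)] = "diagonal: single-variable summary"
            elif i > j:
                layout[(row, col)] = "lower: scatterplot"
            else:
                # The upper triangle mirrors the lower one, so it is free
                # to show other dataset features.
                layout[(row, col)] = "upper: free for other features"
    return layout

variables = ["height", "weight", "age", "income"]   # hypothetical names
layout = splom_layout(variables)

# Number of distinct pairwise scatterplots: k * (k - 1) / 2.
n_scatter = sum(role.startswith("lower") for role in layout.values())
print(len(variables), "variables ->", n_scatter, "scatterplots")
```

The pair count grows quadratically with the number of variables, which is one way to see why complexity increases quickly. Each panel still shows only two variables at a time, so relationships among three or more are not displayed directly.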