2.3 Representing Multivariate Observations

At most two fields can be represented by position in the plane. How can we then represent other crucial elements on a flat computer screen?

Potential solutions include:

  • third dimension

  • marker size

  • marker colour

  • colour intensity and value

  • marker texture

  • line orientation

  • marker shape

  • motion/movie

These elements do not always mix well – efficient design is as much art as it is science.
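As a rough illustration of how several of these channels can be combined, the following Python sketch maps one observation onto position, marker size, and marker colour. The field names, the palette, and the area-proportional sizing rule are assumptions made for the example; they are not taken from the charts discussed in this section.

```python
import math

# Hypothetical palette: one colour per category (illustrative values only).
PALETTE = {"group_a": "tab:blue", "group_b": "tab:orange"}

def encode(obs, max_count, max_radius=30.0):
    """Map one observation (a dict) onto visual channels.

    Assumed fields: 'x' and 'y' (position), 'count' (marker size),
    and 'group' (marker colour).
    """
    # Scale the radius with the square root of the value so that the
    # marker AREA, not its radius, is proportional to the value.
    radius = max_radius * math.sqrt(obs["count"] / max_count)
    return {
        "x": obs["x"],
        "y": obs["y"],
        "radius": radius,
        "colour": PALETTE[obs["group"]],
    }

marker = encode({"x": 1.5, "y": 70.0, "count": 25, "group": "group_a"}, max_count=100)
print(marker)
```

With this rule, quadrupling a value doubles the marker's radius; scaling the radius directly would visually exaggerate large values.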

The following examples, along with concise descriptions of key components and lists of questions that they could help answer, highlight charts’ strengths (and limitations). Some additional diagrams showcasing the four presentation types discussed above are also provided.

Bubble Chart

Example: Health and Wealth of Nations (see Figure 1.3)

Figure 1.3: Health and Wealth of Nations, in 2012 (Gapminder Foundation).

  • Data:

    • 2012 life expectancy in years

    • 2012 inflation-adjusted GDP/capita in USD

    • 2012 population for 193 UN members and 5 other countries

  • Some Questions and Comparisons:

    • Can we predict the life expectancy of a nation given its GDP/capita?
      (The trend is roughly linear in the logarithm of GDP/capita: \(\mbox{Expectancy}\approx 6.8 \times \ln (\mbox{GDP/capita}) + 10.6\))

    • Are there outlier countries? Botswana, South Africa, and Vietnam, at a glance.

    • Are countries with a smaller population healthier? Bubble size seems uncorrelated with the axes’ variates.

    • Is continental membership an indicator of health and wealth levels? There is a clear divide between Western nations (and Japan), most of Asia, and Africa.

    • How do countries compare against world values for life expectancy and GDP per capita? The vast majority of countries fall in three of the quadrants. There are very few wealthy countries with low life expectancy. China sits near the world values, which is expected for life expectancy, but more surprising when it comes to GDP/capita – compare with India.

  • Multivariate Elements:

    • positions for health and wealth

    • bubble size for population

    • colour for continental membership

    • labels to identify the nations

  • Comments:

    • Are life expectancy and GDP/capita appropriate proxies for health and wealth?

    • A fifth element could also be added to a screen display: the passage of time. In this case, how do we deal with countries coming into existence (and ceasing to exist as political entities)?
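The approximate trend quoted in the questions above can be evaluated directly. The sketch below is only a convenience for reading values off that fitted curve; the function name is an assumption, and the coefficients are the rounded ones given in the text.

```python
import math

def predicted_life_expectancy(gdp_per_capita):
    """Life expectancy (years) from the approximate trend quoted above.

    gdp_per_capita is in inflation-adjusted USD and must be positive.
    """
    return 6.8 * math.log(gdp_per_capita) + 10.6

for gdp in (1_000, 10_000, 100_000):
    print(gdp, round(predicted_life_expectancy(gdp), 1))

# Under this trend, doubling GDP/capita always adds the same number of years.
gain_per_doubling = 6.8 * math.log(2)
```

Because the relationship is logarithmic, each doubling of GDP/capita corresponds to a gain of roughly 4.7 years, whatever the starting level. This is why such charts are usually drawn with a logarithmic horizontal axis.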

Choropleth Map

Example: Mean Elevation by U.S. State (see Figure 2.5)

Figure 2.5: Mean elevation by U.S. state, in feet (source unknown); contrast with the high-resolution elevation map (by Twitter user @cstats).

  • Data: 50 observations (one per state), binned into classes ranging from near sea level (0–250 ft) to 6000+ ft

  • Some Questions and Comparisons:

    • Can the mean elevation of the U.S. states tell us something about the global topography of the U.S.? The West has higher mean elevations, owing to the presence of the Rockies; Eastern coastal states are more likely to suffer from rising water levels, for instance.

    • Are there any states that do not “belong” in their local neighbourhood, elevation-wise? West Virginia and Oklahoma seem to have the “wrong” shade – is that an artifact of the colour gradient and scale in use?

  • Multivariate Elements: geographical distribution and purple-blue colour gradient (as the marker for mean elevation)

  • Comments:

    • Is the ‘mean’ the right measurement to use for this map? It depends on the author’s purpose.

    • Would there be ways to include other variables in this chart? Population density with texture, for instance.
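Whether a state appears to have the "wrong" shade can depend on where the class breaks fall. The sketch below, which assumes a classed colour scale, assigns a mean elevation to a class and a colour under two sets of breaks. Only the lowest (250) and highest (6000) bounds come from the data description above; the intermediate breaks, the colour ramp, and the sample elevation are invented for illustration.

```python
from bisect import bisect_right

# Hypothetical class breaks in feet. Only the first (250) and last (6000)
# bounds appear in the data description; the others are invented.
BREAKS_A = [250, 500, 1000, 2000, 4000, 6000]
BREAKS_B = [250, 750, 1500, 3000, 4500, 6000]

# Illustrative light-to-dark purple-blue ramp, one colour per class.
RAMP = ["#f1eef6", "#d0d1e6", "#a6bddb", "#74a9cf", "#3690c0", "#0570b0", "#034e7b"]

def elevation_class(mean_elevation_ft, breaks):
    """Return the index of the class containing the value (0 = lowest)."""
    return bisect_right(breaks, mean_elevation_ft)

def shade(mean_elevation_ft, breaks):
    """Return the ramp colour for the class containing the value."""
    return RAMP[elevation_class(mean_elevation_ft, breaks)]

# A made-up mean elevation of 1,200 ft lands in different classes,
# hence different shades, depending on the breaks in use.
print(elevation_class(1200, BREAKS_A), shade(1200, BREAKS_A))
print(elevation_class(1200, BREAKS_B), shade(1200, BREAKS_B))
```

The same value is drawn one shade darker under the first set of breaks than under the second, so an apparent anomaly between neighbouring states may reflect the choice of scale and not the terrain.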

Network Diagram

Example: Lexical Distances (see Figure 2.6)

Figure 2.6: Lexical distance of European languages (T. Elms, [26]).

  • Data:

    • speakers and language groups for 43 European languages

    • lexical distances between languages

  • Some Questions and Comparisons:

    • Are there languages that are lexically closer to languages in other lexical groups than to languages in their own groups? French is lexically closer to English than it is to Romanian, say.

    • Which language has the most links to other languages? English has 10 links.

    • Are there languages that are lexically close to multiple languages in other groups? Greek is lexically close to 5 groups.

    • Is there a correlation between the number of speakers and the number of languages in a language group? Language groups with more speakers tend to have more languages.

    • Does the bubble size refer only to European speakers? The Portuguese bubble appears to be as large as the French one.

  • Multivariate Elements:

    • colour and cluster for language group

    • line style for lexical distance

    • bubble size for number of speakers

  • Comments:

    • How is lexical distance computed?

    • Some language pairs are not joined by links – does this mean that their lexical distance is large enough not to be rendered?

    • Are the actual geometrical distances meaningful? For instance, Estonian is closer to French in the chart than it is to Portuguese – is it also lexically closer?
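Questions such as "which node has the most links?" are easy to answer once the diagram is stored as an edge list. The sketch below uses placeholder nodes and made-up distances, not readings from the figure. It also assumes, purely for illustration, that links above some distance cut-off are not drawn; the comments above raise this as a possibility without confirming it.

```python
from collections import Counter

# Made-up edges (node, node, distance); the labels and values are
# placeholders, not readings from the figure.
EDGES = [("A", "B", 20), ("A", "C", 35), ("A", "D", 60), ("B", "C", 50), ("C", "D", 80)]

def rendered_edges(edges, max_distance):
    """Keep only the links short enough to be drawn (assumed cut-off)."""
    return [(u, v, d) for u, v, d in edges if d <= max_distance]

def link_counts(edges):
    """Count how many drawn links touch each node."""
    counts = Counter()
    for u, v, _ in edges:
        counts[u] += 1
        counts[v] += 1
    return counts

drawn = rendered_edges(EDGES, max_distance=70)
counts = link_counts(drawn)
print(counts.most_common(1))
```

Changing `max_distance` changes which links are drawn, and therefore the link counts, so "the most connected language" depends on the rendering rule as well as on the data.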

References

[26] T. Elms, Lexical Distance of European Languages. Etymologikon, 2008.