Effective Data Communication With {ggplot2}

Part II - Putting theory into practice

Renata Diaz

2024-06-13

Code-along script

Script for today’s examples.

Learning Objectives

  • Implement principles of effective design using {ggplot2}, including:
    • Maximize data:ink ratio using themes and facets.
    • Customize marks and channels using geoms and aes(thetics).
    • Use helper tools including cols4all and esquisse to make thoughtful visualization choices.
    • Work through an example as a group.

Maximizing data:ink ratio

The {ggplot2} default

library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data used throughout

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() 

{theme_*}

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme_bw()

{theme_*}

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme_void()

{theme_*}

For a complete list of preset themes:

?theme_void

Or online here.

{theme_*}

Important

theme_bw() and theme_minimal() can remove a lot of unnecessary ink in one fell swoop.

Use theme_set(theme_bw()) outside your plot code to set the theme for a whole script or notebook.
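
As a minimal sketch of this tip, the snippet below sets theme_bw() once and then builds a plot with no theme_*() call of its own. It assumes that penguins comes from the {palmerpenguins} package; the names old_theme and p are only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# theme_set() returns the previous default theme, so it can be restored later
old_theme <- theme_set(theme_bw())

# No theme_*() call here: the plot picks up the default set above
p <- ggplot(penguins, aes(body_mass_g,
                          bill_depth_mm,
                          color = species)) +
  geom_point()
p

# Restore the original default when you are done
theme_set(old_theme)
```

Keeping the return value of theme_set() is optional, but it makes it easy to undo the change at the end of a script.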

Modify a theme

  • Preset themes often either leave unnecessary ink or remove necessary information.
  • The theme() function allows you to modify individual elements of a theme.
  • The theme options are vast. We’ll break it down.

Anatomy of theme

  • theme( <part of the plot> = <element_*(element_options)>)
  • Parts of the plot: axis, legend, panel, plot, strip.
  • Elements: element_blank, element_text, element_line, element_rect.
  • You can modify all lines/rectangles at once using theme(line = element_line()), etc.
  • See the rendered documentation to help you figure out how to change specific elements.
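
To make the pattern above concrete, here is a small sketch that pairs several parts of the plot with each of the four element types. The particular colors and sizes are arbitrary choices for illustration, and penguins is assumed to come from {palmerpenguins}.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

p <- ggplot(penguins, aes(body_mass_g,
                          bill_depth_mm,
                          color = species)) +
  geom_point() +
  theme(
    # <part of the plot> = <element_*(element_options)>
    panel.background = element_blank(),                          # draw nothing
    axis.title       = element_text(face = "bold", size = 12),   # text element
    axis.line        = element_line(color = "grey30"),           # line element
    plot.background  = element_rect(fill = "white", color = NA), # rectangle element
    legend.key       = element_blank()
  )
p
```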

Remove the background panel

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) 

Re-add grid lines

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank(),
        panel.grid.major = element_line(color = "black", linewidth = .5))

Hierarchical modifications

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  theme(panel.grid.major = element_line(color = "black", linewidth = .5))  +
  theme(panel.grid.major.x = element_line(color = "red", linewidth = .1))
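
Theme elements inherit from their parents: panel.grid.major.x and panel.grid.major.y both fall back to panel.grid.major unless they are set directly. One way to inspect this, sketched here with ggplot2's calc_element(), is to resolve each element against a theme and compare the results. The object names custom, x_grid, and y_grid are illustrative.

```r
library(ggplot2)

# Same modifications as in the plot above, applied to the default theme
custom <- theme_grey() +
  theme(panel.grid.major   = element_line(color = "black", linewidth = 0.5),
        panel.grid.major.x = element_line(color = "red", linewidth = 0.1))

# Resolve the final settings for each set of major grid lines
x_grid <- calc_element("panel.grid.major.x", custom)  # set directly
y_grid <- calc_element("panel.grid.major.y", custom)  # inherited from panel.grid.major

x_grid
y_grid
```

Printing the two elements should show that only the x grid lines picked up the more specific setting, while the y grid lines keep the values from panel.grid.major.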

Maximizing use of space

facet_wrap

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_wrap(vars(island))

facet_wrap

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_wrap(vars(island), 
              ncol = 1)

facet_grid

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(island), 
              cols = vars(sex))

Modify facet label placement

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(island), 
              cols = vars(sex), 
              switch = "y")

Modify facet scales

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(island), 
              cols = vars(sex), 
              switch = "y", 
              scales = "free_y")

Important

Use judiciously!

Modify facet strip options

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(island), 
              cols = vars(sex), 
              switch = "y") +
  theme(strip.background = element_blank())

Move the legend

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm,
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(island), 
              cols = vars(sex), 
              switch = "y") +
  theme(legend.position = "bottom")

Marks and channels

  • Marks correspond to geoms:
    • geom_point, geom_line, geom_col, ...
  • We map data variables to channels using aes (aesthetics):
    • x and y for position
    • color and fill for color scale (discrete or continuous)
    • alpha for opacity
    • size for size
  • Aesthetics can be set manually, if not mapped to data.
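
The difference between mapping and setting an aesthetic can be sketched as follows. This assumes penguins comes from {palmerpenguins}; "steelblue" and the alpha and size values are arbitrary illustrative choices.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Mapped: color varies with the data, so it goes inside aes()
p_mapped <- ggplot(penguins, aes(body_mass_g, bill_depth_mm)) +
  geom_point(aes(color = species))

# Set: one fixed value for every point, so it goes outside aes()
p_set <- ggplot(penguins, aes(body_mass_g, bill_depth_mm)) +
  geom_point(color = "steelblue", alpha = 0.6, size = 2)

p_mapped
p_set
```

A mapped aesthetic gets a scale and a legend; a manually set one does not, because it carries no information about the data.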

Counterintuitive marks and channels

Size is much worse than position for showing continuous data!

ggplot(penguins, aes(sex, species, size = body_mass_g)) +
  geom_jitter() +
  theme(panel.background = element_blank())

Counterintuitive marks and channels

Size is much worse than position for showing continuous data!

ggplot(penguins, aes(body_mass_g)) +
  geom_histogram() +
  theme(panel.background = element_blank()) +
  facet_grid(rows = vars(species),
             cols = vars(sex), 
             switch = "y")

Marks and channels: Options

Shape is less effective than color for differentiating categories!

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm, 
                     shape = species)) +
  geom_point() +
  theme(panel.background = element_blank()) 

Marks and channels: Options

Shape is less effective than color for differentiating categories!

ggplot(penguins, aes(body_mass_g, 
                     bill_depth_mm, 
                     color = species)) +
  geom_point() +
  theme(panel.background = element_blank()) 

Effective color scales

  • There are multitudes of color palettes available for R.
  • Color palettes can be sequential, categorical, diverging, or bivariate.
  • Not all color palettes will be equally accessible to all viewers.
  • cols4all helps you explore color palettes and find one to meet your needs.
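
Once you have picked a palette type, it is applied with a scale_color_*() or scale_fill_*() function. The sketch below does not use cols4all itself; it uses two scales that ship with ggplot2 (ColorBrewer's "Dark2" and viridis) as stand-ins for a categorical and a sequential palette. It assumes penguins comes from {palmerpenguins}.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data

# Categorical variable: use a qualitative palette
p_cat <- ggplot(penguins, aes(body_mass_g,
                              bill_depth_mm,
                              color = species)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2") +
  theme(panel.background = element_blank())

# Continuous variable: use a sequential palette
p_seq <- ggplot(penguins, aes(body_mass_g,
                              bill_depth_mm,
                              color = flipper_length_mm)) +
  geom_point() +
  scale_color_viridis_c() +
  theme(panel.background = element_blank())

p_cat
p_seq
```

The viridis scales were designed to stay readable under common forms of color vision deficiency, which makes them a reasonable starting point before exploring further options in cols4all.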

cols4all

install.packages("cols4all", dependencies = TRUE)

library(cols4all)

c4a_gui()

A data-to-viz helper: Esquisse

  • esquisse is a ggplot2 extension for exploring data-viz pairings
  • Stay tuned for session 3 for more on extensions!

Esquisse demo

install.packages("esquisse")

esquisse::esquisser(penguins)

Let’s put it together

  • Take a look at the “Measles” data table here.
  • Let’s think through a visualization we could make of these data. Ask:
    • Which data variables will tell a story? What types of data are they?
    • Which chart type(s) would be appropriate for these data types?
    • What marks and channels shall we use?
    • What additional design decisions could we make to improve the graphic?

Resources

Next up: ggplot2 extensions!

More information and register here.