Presentation ready plots I

Lecture 10

Dr. Mine Çetinkaya-Rundel

Duke University
STA 313 - Spring 2024

Warm up

Announcements

  • HW 3 is posted. It isn’t due until after the project, but get started early!
  • RQ 3 is due next Tuesday
  • Project 1 is due next Wednesday

Setup

# load packages
library(countdown)
library(tidyverse)
library(ggrepel)
library(patchwork)
library(tidytext)
library(hrbrthemes)
library(scales)
library(textdata)

# set theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 16))

# no plot sizing defaults for this slide deck

Telling a story

Multiple ways of telling a story

  • Sequential plots: Motivation, then resolution

  • A single plot: Resolution, with the motivation hidden in it

Project note: you’re asked to create two plots per question. One possible approach: Start with a plot showing the raw data, and show derived quantities (e.g. percent increases, averages, coefficients of fitted models) in the subsequent plot.
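A minimal sketch of this raw-then-derived approach, assuming the built-in mtcars data stands in for your project data. The object names p_raw, mpg_by_cyl, and p_derived are illustrative only.

```r
library(ggplot2)
library(dplyr)

# Plot 1: the raw data, one point per car
p_raw <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.1, height = 0) +
  labs(x = "Number of cylinders", y = "Miles per gallon")

# Derived quantity: average mpg within each cylinder group
mpg_by_cyl <- mtcars |>
  group_by(cyl) |>
  summarize(avg_mpg = mean(mpg), .groups = "drop")

# Plot 2: the derived averages
p_derived <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = avg_mpg)) +
  geom_col() +
  labs(x = "Number of cylinders", y = "Average miles per gallon")
```

The first plot motivates the question (how spread out is mpg within each group?) and the second resolves it with a single summary per group.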

Simplicity vs. complexity

When you try to show too much data at once, you may end up showing nothing.

  • Never assume your audience can rapidly process complex visual displays

  • Don’t add variables to your plot that are tangential to your story

  • Don’t jump straight to a highly complex figure; first show an easily digestible subset (e.g., show one facet first)

  • Aim for memorable, but clear

Project note: Make sure to leave time to iterate on your plots after you practice your presentation. If certain plots are getting too wordy to explain, take time to simplify them!

Consistency vs. repetitiveness

Be consistent but don’t be repetitive.

  • Use consistent features throughout plots (e.g., same color represents same level on all plots)

  • Aim to use a different type of visualization for each distinct analysis
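One way to keep colors consistent is to define a named palette once and reuse it in every plot. The sketch below assumes mtcars as example data. The palette values and the names trans_colors and mtcars_trans are placeholders, not part of the course materials.

```r
library(ggplot2)

# One named palette, defined once and reused in every plot
# (the hex codes are arbitrary placeholders)
trans_colors <- c("automatic" = "#1b9e77", "manual" = "#d95f02")

# In mtcars, am == 1 means manual and am == 0 means automatic
mtcars_trans <- transform(
  mtcars,
  trans = ifelse(am == 1, "manual", "automatic")
)

# Analysis 1: a scatterplot, colored by transmission
p_points <- ggplot(mtcars_trans, aes(x = wt, y = mpg, color = trans)) +
  geom_point() +
  scale_color_manual(values = trans_colors)

# Analysis 2: a different plot type, same level-to-color mapping
p_bars <- ggplot(mtcars_trans, aes(x = trans, fill = trans)) +
  geom_bar(show.legend = FALSE) +
  scale_fill_manual(values = trans_colors)
```

Because the palette is a named vector, each level keeps its color even if a plot shows only a subset of the levels.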

Designing effective visualizations

Keep it simple

Judging relative area

Use color to draw attention



Tell a story

Leave out non-story details

Order matters

Clearly indicate missing data

Reduce cognitive load

Use descriptive titles

Annotate figures

Project workflow overview

Demo

project-1

  • Rendering individual documents
  • Write-up:
    • Cross referencing
    • Citations
  • Presentation:
    • Pauses
    • Smaller text
  • Website: https://vizdata-s24.github.io/project-1-YOUR_TEAM_NAME/
    • Rendering site
    • Making sure your website reflects your latest changes
    • Customizing the look of your website

Plot layout

Sample plots

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2)
p_box <- ggplot(mtcars, aes(x = factor(vs), y = mpg)) +
  geom_boxplot()
p_scatter <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point()
p_text <- mtcars |>
  rownames_to_column() |>
  ggplot(aes(x = disp, y = mpg)) +
  geom_text_repel(aes(label = rowname)) +
  coord_cartesian(clip = "off")

Slide with single plot, little text

The plot will fill the empty space in the slide.

p_hist

Slide with single plot, lots of text

  • If there is more text on the slide

  • The plot will shrink

  • To make room for the text

p_hist

Small fig-width

For a zoomed-in look

```{r}
#| fig-width: 3
#| fig-asp: 0.618

p_hist
```

Large fig-width

For a zoomed-out look

```{r}
#| fig-width: 10
#| fig-asp: 0.618

p_hist
```

fig-width affects text size
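Theme text sizes are set in points, so they stay the same physical size while the canvas around them grows or shrinks. When the rendered image is then scaled to fit the slide, a small fig-width makes the text look large and a large fig-width makes it look small. A rough way to see this outside of Quarto, assuming files written by ggsave() are a fair stand-in for chunk output (the file names are hypothetical):

```r
library(ggplot2)

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2)

# Hypothetical output files, written to a temporary directory
small_file <- file.path(tempdir(), "hist-small.pdf")
large_file <- file.path(tempdir(), "hist-large.pdf")

# Same plot and same text size in points; only the canvas size changes.
# Height is width * 0.618 to mimic fig-asp: 0.618.
ggsave(small_file, p_hist, width = 3, height = 3 * 0.618)
ggsave(large_file, p_hist, width = 10, height = 10 * 0.618)
```

Opening the two files side by side at the same on-screen width shows the axis text taking up much more of the small figure.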

Multiple plots on a slide

First, ask yourself, must you include multiple plots on a slide? For example, is your narrative about comparing results from two plots?

  • If no, then don’t! Move the second plot to the next slide!

  • If yes,

    • Insert columns using the Insert anything tool

    • Use layout-ncol chunk option

    • Use the patchwork package

    • Possibly, use pivoting to reshape your data and then use facets
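The pivot-then-facet option can be sketched as follows, assuming mtcars and three of its columns as example predictors. The names mtcars_long and p_facets are illustrative.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Reshape so that each predictor becomes a level of a single column
mtcars_long <- mtcars |>
  select(mpg, disp, hp, wt) |>
  pivot_longer(
    cols = c(disp, hp, wt),
    names_to = "predictor",
    values_to = "value"
  )

# One panel per predictor; x scales are freed because the units differ
p_facets <- ggplot(mtcars_long, aes(x = value, y = mpg)) +
  geom_point() +
  facet_wrap(~predictor, scales = "free_x")
```

The result is a single plot object, so it is sized and aligned as one figure rather than as several separate ones.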

Columns

Insert > Slide Columns

Quarto will automatically resize your plots to fit side-by-side.

layout-ncol

```{r}
#| fig-width: 5
#| fig-asp: 0.618
#| layout-ncol: 2

p_hist
p_scatter
```

patchwork

```{r}
#| fig-width: 7
#| fig-asp: 0.4

p_hist + p_scatter
```

patchwork layout I

(p_hist + p_box) /
  (p_scatter + p_text)

patchwork layout II

p_text / (p_hist + p_box + p_scatter)

patchwork layout III

p_text + p_hist + p_box + p_scatter + 
  plot_annotation(title = "mtcars", tag_levels = c("A"))

patchwork layout IV

p_text + {
  p_hist + {
    p_box + p_scatter + plot_layout(ncol = 1) + plot_layout(tag_level = 'new')
  }
} + 
  plot_layout(ncol = 1) +
  plot_annotation(tag_levels = c("1","a"), tag_prefix = "Fig ")
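The layouts above give each plot an equal share of the space. As a small additional sketch, plot_layout() also takes a widths argument for uneven splits. Two of the sample plots are redefined here so the block runs on its own, and the name p_uneven is illustrative.

```r
library(ggplot2)
library(patchwork)

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2)
p_scatter <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point()

# The scatterplot gets twice the width of the histogram
p_uneven <- p_scatter + p_hist +
  plot_layout(widths = c(2, 1))
```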

More patchwork


Learn more at https://patchwork.data-imaginist.com.

Want to replicate something you saw in my slides?


Look into the source code at https://github.com/vizdata-s24/vizdata-s24/tree/main/slides.

Take a sad plot, and make it better

Opinion pieces from The Chronicle

chronicle <- read_csv("data/chronicle.csv")
chronicle
# A tibble: 500 × 6
   title                                 author date       abstract column url  
   <chr>                                 <chr>  <date>     <chr>    <chr>  <chr>
 1 All the world’s a stage               Anna … 2024-02-22 If we a… STUDE… http…
 2 Words that matter: For Alexei Navalny Carol… 2024-02-22 In some… STUDE… http…
 3 Which would you save: Friend or roma… Jess … 2024-02-22 Love sh… STUDE… http…
 4 Happiness is not what you’re looking… Paul … 2024-02-21 We hing… STUDE… http…
 5 Closing Duke's Herbarium: A fear of … Matth… 2024-02-21 Without… LETTE… http…
 6 CS Majors launch 'ambiguous and labe… Monda… 2024-02-20 Unlike … STUDE… http…
 7 The fear of being single              Heidi… 2024-02-20 But it … STUDE… http…
 8 Save the Duke Herbarium               Henry… 2024-02-17 The Duk… LETTE… http…
 9 What Duke can learn from retiring ex… Rober… 2024-02-17 In Duke… GUEST… http…
10 Love, love                            Gabri… 2024-02-16 Somehow… STUDE… http…
# ℹ 490 more rows

Step 1

chronicle_to_plot <- chronicle |>
  tidytext::unnest_tokens(word, abstract) |>
  anti_join(tidytext::stop_words) |>
  left_join(tidytext::get_sentiments("afinn")) |> 
  group_by(author, title) |>
  summarize(total_sentiment = sum(value, na.rm = TRUE), .groups = "drop") |>
  group_by(author) |>
  summarize(
    n_articles = n(),
    avg_sentiment = mean(total_sentiment, na.rm = TRUE),
  ) |>
  filter(n_articles > 1 & !is.na(author)) |>
  arrange(desc(avg_sentiment)) |>
  slice(c(1:10, 49:58)) |>
  mutate(
    author = fct_reorder(author, avg_sentiment),
    neg_pos = if_else(avg_sentiment < 0, "neg", "pos"),
    label_position = if_else(neg_pos == "neg", 0.25, -0.25)
  )

Step 1

Joining with `by = join_by(word)`
Joining with `by = join_by(word)`
chronicle_to_plot
# A tibble: 20 × 5
   author                  n_articles avg_sentiment neg_pos label_position
   <fct>                        <int>         <dbl> <chr>            <dbl>
 1 Alex Berkman                     2          5.5  pos              -0.25
 2 Amy Unell                        2          4    pos              -0.25
 3 Gabrielle Mollin                 2          2.5  pos              -0.25
 4 Miranda Straubel                 2          2.5  pos              -0.25
 5 Anna Sorensen                    4          2.25 pos              -0.25
 6 Monday Monday                   17          1.53 pos              -0.25
 7 Duke Climate Coalition           2          1.5  pos              -0.25
 8 Susan Chemmanoor                 6          1.5  pos              -0.25
 9 Jess Jiang                       5          1.4  pos              -0.25
10 Angikar Ghosal                   9          1.33 pos              -0.25
11 Viktoria Wulff-Andersen          7         -1    neg               0.25
12 Pilar Kelly                      9         -1.22 neg               0.25
13 Billy Cao                        5         -1.4  neg               0.25
14 Valerie Tan                     11         -1.45 neg               0.25
15 Dan Reznichenko                  3         -1.67 neg               0.25
16 Matthew Arakaky                  3         -1.67 neg               0.25
17 Sydney Brown                     2         -2    neg               0.25
18 Spencer Chang                    3         -2.33 neg               0.25
19 Ayesham Khan                     2         -4    neg               0.25
20 Carol Apollonio                  3         -4    neg               0.25
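The hard-coded slice(c(1:10, 49:58)) in the Step 1 pipeline appears to rely on the filtered data having 58 rows. A hedged alternative sketch that does not depend on the row count combines slice_max() and slice_min(). The tibble author_sentiment and its values are made up for illustration and are not the Chronicle data.

```r
library(dplyr)
library(tibble)

# Made-up stand-in for the per-author summary (not the real data)
author_sentiment <- tibble(
  author = paste("Author", LETTERS[1:8]),
  avg_sentiment = c(5.5, 4, 2.5, 1.5, -1, -1.4, -2, -4)
)

n_each <- 2 # the slides keep 10 on each side

# Highest and lowest averages, without assuming the number of rows.
# Note: if the data has fewer than 2 * n_each rows, some rows appear twice.
extremes <- bind_rows(
  slice_max(author_sentiment, avg_sentiment, n = n_each, with_ties = FALSE),
  slice_min(author_sentiment, avg_sentiment, n = n_each, with_ties = FALSE)
) |>
  arrange(desc(avg_sentiment))
```

With with_ties = FALSE each call returns exactly n_each rows, which keeps the later labeling steps predictable.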

Step 2

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col()

How would you improve this visualization?

Step 3

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos))

Step 4

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE)

Step 5

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91"))

Step 6

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91"))
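The hjust = c(rep(1,10), rep(0, 10)) vector above depends on the rows being sorted with the ten positive authors first. As a hedged alternative sketch, geom_text() also accepts hjust as an aesthetic, so the alignment can be computed from neg_pos itself. The tibble toy and the column label_hjust are invented for illustration.

```r
library(ggplot2)
library(dplyr)
library(tibble)

toy <- tibble(
  author = c("A", "B", "C", "D"),
  avg_sentiment = c(2, 1, -1, -3)
) |>
  mutate(
    neg_pos = if_else(avg_sentiment < 0, "neg", "pos"),
    label_position = if_else(neg_pos == "neg", 0.25, -0.25),
    # Right-align labels for positive bars, left-align for negative bars
    label_hjust = if_else(neg_pos == "neg", 0, 1)
  )

p_labels <- ggplot(toy, aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(
      x = label_position, label = author,
      color = neg_pos, hjust = label_hjust
    ),
    show.legend = FALSE,
    fontface = "bold"
  )
```

Because the alignment travels with each row, the labels stay correct if the data is re-sorted or the number of authors changes.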

Step 7

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  geom_text(
    aes(label = round(avg_sentiment, 1)),
    hjust = c(rep(1.25,10), rep(-0.25, 10)),
    color = "white",
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91"))

Step 8

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  geom_text(
    aes(label = round(avg_sentiment, 1)),
    hjust = c(rep(1.25,10), rep(-0.25, 10)),
    color = "white",
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_x_continuous(breaks = -5:5, minor_breaks = NULL) +
  scale_y_discrete(breaks = NULL) +
  coord_cartesian(xlim = c(-5, 5))

Step 9

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  geom_text(
    aes(label = round(avg_sentiment, 1)),
    hjust = c(rep(1.25,10), rep(-0.25, 10)),
    color = "white",
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_x_continuous(breaks = -5:5, minor_breaks = NULL) +
  scale_y_discrete(breaks = NULL) +
  coord_cartesian(xlim = c(-5, 5)) +
  labs(
    x = "negative  ←     Average sentiment score (AFINN)     →  positive",
    y = NULL,
    title = "The Chronicle - Opinion pieces\nAverage sentiment scores of abstracts by author",
    subtitle = "Top 10 average positive and negative scores",
    caption = "Source: Data scraped from The Chronicle on Feb 21, 2024"
  )

Step 10

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  geom_text(
    aes(label = round(avg_sentiment, 1)),
    hjust = c(rep(1.25,10), rep(-0.25, 10)),
    color = "white",
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_x_continuous(breaks = -5:5, minor_breaks = NULL) +
  scale_y_discrete(breaks = NULL) +
  coord_cartesian(xlim = c(-5, 5)) +
  labs(
    x = "negative  ←     Average sentiment score (AFINN)     →  positive",
    y = NULL,
    title = "The Chronicle - Opinion pieces\nAverage sentiment scores of abstracts by author",
    subtitle = "Top 10 average positive and negative scores",
    caption = "Source: Data scraped from The Chronicle on Feb 21, 2024"
  ) +
  theme_void(base_size = 16) +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, margin = margin(t = 0.5, r = 0, b = 1, l = 0, unit = "lines")),
    axis.text.y = element_blank(),
    plot.caption = element_text(color = "gray30")
  )

Step 11

```{r}
#| output-location: slide
#| code-line-numbers: "|4-6"
#| fig-width: 8
#| fig-asp: 0.75
#| fig-align: center

chronicle_to_plot |>
  ggplot(aes(y = author, x = avg_sentiment)) +
  geom_col(aes(fill = neg_pos), show.legend = FALSE) +
  geom_text(
    aes(x = label_position, label = author, color = neg_pos),
    hjust = c(rep(1,10), rep(0, 10)),
    show.legend = FALSE,
    fontface = "bold"
  ) +
  geom_text(
    aes(label = round(avg_sentiment, 1)),
    hjust = c(rep(1.25,10), rep(-0.25, 10)),
    color = "white",
    fontface = "bold"
  ) +
  scale_fill_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_color_manual(values = c("neg" = "#4d4009", "pos" = "#FF4B91")) +
  scale_x_continuous(breaks = -5:5, minor_breaks = NULL) +
  scale_y_discrete(breaks = NULL) +
  coord_cartesian(xlim = c(-5, 5)) +
  labs(
    x = "negative  ←     Average sentiment score (AFINN)     →  positive",
    y = NULL,
    title = "The Chronicle - Opinion pieces\nAverage sentiment scores of abstracts by author",
    subtitle = "Top 10 average positive and negative scores",
    caption = "Source: Data scraped from The Chronicle on Feb 21, 2024"
  ) +
  theme_void(base_size = 16) +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, margin = margin(t = 0.5, r = 0, b = 1, l = 0, unit = "lines")),
    axis.text.y = element_blank(),
    plot.caption = element_text(color = "gray30")
  )
```
