R and RStudio in Digital Scholarship

This guide introduces R and RStudio for research and instruction and describes the resources available in the Freedman Center.

Data Visualization with ggplot2

ggplot2 is the go-to tool for data visualization in R. It is part of the tidyverse and is built around the principles of the Grammar of Graphics, which lets you create complex plots by layering components such as data, aesthetics, and geometric objects. This approach encourages you to think about the structure of a plot rather than just the end result, making visualizations easier to build, customize, and understand.

To begin visualizing data with ggplot2, you first define the dataset and the aesthetic mappings: which variables go on the x- and y-axes and, optionally, which variables define color, size, shape, or other properties. Then you add geoms (geometric objects) to represent the data, such as geom_point() for scatterplots, geom_bar() for bar charts, or geom_boxplot() for comparing distributions. You can further enhance your plots with faceting (using facet_wrap() or facet_grid()), which creates multiple subplots based on the values of a categorical variable. This is especially helpful when comparing groups. ggplot2 also makes it easy to add layers like trend lines (for example, with geom_smooth()), annotations, and custom themes to improve the clarity and aesthetics of your plots.
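The steps described above can be sketched in a few lines. This is a minimal example, not the only way to structure a plot. It assumes the mpg example dataset that ships with ggplot2; the variable choices (displ, hwy, class, drv) and the labels are illustrative picks, not something this guide prescribes.

```r
library(ggplot2)

# mpg is an example dataset bundled with ggplot2 (assumed here for illustration)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +   # dataset and x/y aesthetic mappings
  geom_point(aes(color = class)) +            # scatterplot layer, colored by vehicle class
  geom_smooth(method = "lm", formula = y ~ x, # linear trend line, one per facet
              se = FALSE, color = "black") +
  facet_wrap(~ drv) +                         # one subplot per drive type
  labs(
    x = "Engine displacement (L)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  ) +
  theme_minimal()                             # a built-in theme

print(p)
```

Each `+` adds one layer or setting, so you can build the plot up a line at a time and rerun it to see the effect of each addition.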

Because it integrates well with dplyr, ggplot2 fits naturally into a tidyverse workflow, allowing for seamless transitions from data wrangling to visualization.
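As a rough sketch of that workflow, the example below summarizes data with dplyr and passes the result straight to ggplot2. It again assumes the mpg dataset bundled with ggplot2; the filter on year and the object names class_summary and p2 are hypothetical choices made for illustration.

```r
library(dplyr)
library(ggplot2)

# Wrangle: keep 2008 models and compute mean highway mileage per vehicle class
class_summary <- mpg %>%
  filter(year == 2008) %>%
  group_by(class) %>%
  summarise(mean_hwy = mean(hwy), .groups = "drop")

# Visualize: geom_col() draws bars from values that are already computed,
# whereas geom_bar() counts rows by default
p2 <- ggplot(class_summary, aes(x = reorder(class, mean_hwy), y = mean_hwy)) +
  geom_col() +
  coord_flip() +
  labs(x = "Vehicle class", y = "Mean highway miles per gallon")

print(p2)
```

Keeping the summary in its own object makes it easy to inspect the numbers before plotting them.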

The free ebook ggplot2: Elegant Graphics for Data Analysis (3e) is an excellent resource for learning more about how to use ggplot2. And don't forget about the cheat sheets.