Scatterplots

Getting Started

First, be sure you have installed ggformula. Remember, you only need to install the package once on your machine.
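If you are not sure whether the package is already installed, a minimal sketch like the one below checks first and installs only when needed. It assumes you have an internet connection and a CRAN mirror configured. Run it in the Console rather than in your document, since it only needs to happen once.

```r
# One-time setup: install ggformula only if it is not already available.
# requireNamespace() returns FALSE (instead of erroring) when a package is missing.
if (!requireNamespace("ggformula", quietly = TRUE)) {
  install.packages("ggformula")
}
```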

Then, be sure to load the package ggformula. Remember, you need to do this with each new Quarto/RMarkdown document or R Session.

#| label: setup
library(ggformula) # for graphs

Data for Examples

As a reminder (see Overview of Data Visualization), we will be using the penguins data from the palmerpenguins package:

library(palmerpenguins)

Here is a snippet of the data:

Palmer Penguins
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
Chinstrap Dream 46.0 18.9 195 4150 female 2007
Adelie Torgersen 38.9 17.8 181 3625 female 2007
Chinstrap Dream 53.5 19.9 205 4500 male 2008
Adelie Dream 36.6 18.4 184 3475 female 2009
Adelie Torgersen 39.5 17.4 186 3800 female 2007
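
To look at a few rows yourself, something like the following should work. This is only a sketch: the name snippet and the seed value are illustrative choices, and the random rows you get will not necessarily match the five shown above.

```r
library(palmerpenguins)

# First five rows of the data
head(penguins, 5)

# Five randomly chosen rows; set.seed() makes the draw repeatable
# (the seed value 1 is an arbitrary, illustrative choice)
set.seed(1)
snippet <- penguins[sample(nrow(penguins), 5), ]
snippet
```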

Scatterplots

Basic Code

For two quantitative variables, x and y, here is the general structure for a scatterplot.

gf_point(y ~ x, data = mydata)

Run the code below to see an example using the quantitative variables bill_length_mm and bill_depth_mm from the penguins data. Then replace bill_length_mm with another quantitative variable from the penguins data, or swap the two variables in the formula, to see how the plot changes.

#| label: scatterplot-one
gf_point(bill_length_mm ~ bill_depth_mm, 
         data = penguins)

Warnings in R

Notice the warning produced when you run the code. It tells you that some rows (penguins) were left out of the plot because they have missing values for the variables being plotted. A warning is simply R's way of telling you about a decision it made on its own; the code still ran.
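
If you want to see how many penguins were dropped, or avoid the warning altogether, one possible approach is to count and remove the incomplete rows yourself before plotting. The sketch below uses base R; the names missing_bill, n_missing, and penguins_complete are made up for illustration and are not part of either package.

```r
library(ggformula)
library(palmerpenguins)

# TRUE for each penguin missing at least one of the two plotted variables
missing_bill <- is.na(penguins$bill_length_mm) | is.na(penguins$bill_depth_mm)

# How many rows R would drop from the scatterplot
n_missing <- sum(missing_bill)
n_missing

# Keep only the penguins with both measurements recorded
penguins_complete <- penguins[!missing_bill, ]

# Same scatterplot as before; no rows need to be dropped this time
gf_point(bill_length_mm ~ bill_depth_mm,
         data = penguins_complete)
```

Filtering first makes the decision yours rather than R's, which can be helpful when you later need to report how many observations a graph is based on.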

Adding a Regression Line