tl;dr
In September 2018, I began tracking email solicitations from the Trump campaign. I have noticed a striking increase in fundraising activity that started just after the July 4 weekend, and I wanted to verify it across the full span of the data. In short, something is up.
The Data
I will use the wonderful gmailr package to access my Gmail account. You need a key and an id; the package vignette gives guidance on setting them up.
tidyTuesday beyonce_lyrics
Load the data.
beyonce_lyrics <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-29/beyonce_lyrics.csv')
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   line = col_character(),
##   song_id = col_double(),
##   song_name = col_character(),
##   artist_id = col_double(),
##   artist_name = col_character(),
##   song_line = col_double()
## )
str(beyonce_lyrics)
## spec_tbl_df [22,616 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ line : chr [1:22616] "If I ain't got nothing, I got you" "If I ain't got something, I don't give a damn" "'Cause I got it with you" "I don't know much about algebra, but I know 1+1 equals 2" .
The datasaurus dozen
The Datasaurus Dozen is a fantastic teaching resource for demonstrating the importance of data visualization. Let’s have a look.
datasaurus <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-10-13/datasaurus.csv')
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   dataset = col_character(),
##   x = col_double(),
##   y = col_double()
## )
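Before digging into the Datasaurus data itself, the underlying point can be sketched with Anscombe's quartet, an older dataset in the same spirit that ships with base R as `anscombe`. This is my substitution for illustration, not a dataset this post otherwise uses, and the names `quartet` and `quartet_summary` are hypothetical. The sketch reshapes the quartet into the same long layout as `datasaurus` (columns `dataset`, `x`, `y`) and computes per-group means, standard deviations, and correlations, which come out nearly identical across groups even though the scatterplots look nothing alike.

```r
library(dplyr)
library(tidyr)

# Anscombe's quartet is stored wide: columns x1..x4 and y1..y4.
# Reshape to one row per point with columns dataset, x, y,
# mirroring the layout of the datasaurus data.
quartet <- anscombe %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "dataset"),
    names_pattern = "([xy])(\\d)"
  )

# Per-group summary statistics: these are nearly the same for all four
# groups, which is exactly why plotting the data matters.
quartet_summary <- quartet %>%
  group_by(dataset) %>%
  summarise(
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    cor_xy = cor(x, y),
    .groups = "drop"
  )

print(quartet_summary)
```

Because `datasaurus` already has `dataset`, `x`, and `y` columns, the same `group_by(dataset) %>% summarise(...)` step should apply to it directly, without the reshaping.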
Two libraries to make our work easy.
library(tidyverse)
library(skimr)
First, the summary statistics.
datasaurus %>% group_by(dataset) %>% skim()
Table 1: Data summary
Name                     Piped data
Number of rows           1846
Number of columns        3
_______________________
Column type frequency:
  numeric                2
________________________
Group variables          dataset
Variable type: numeric