My new blog will be rww-science.website.
I am going to host the blogdown part under a new URL. Keeping all of the content in one static site was never something I completely thought through, and making it a bit more modular makes sense on a whole bunch of levels. Besides, new versions of Hugo have made it far easier to do a variety of things than it was when this started in 2018, and I would like to take advantage of a lot of new features in an updated blog.
tl;dr
In September of 2018, I began to track email solicitations by the Trump Campaign. I noticed a striking pattern of increasing fundraising activity starting just after the July 4 weekend, but I wanted to verify it over the full span of the data. In short, something is up.
The Data
I will use the wonderful gmailr package to access my Gmail. You need a client key and id; the package vignette gives guidance on obtaining them.
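As a sketch of what that access looks like (the credentials path and the search query below are placeholders, not the actual values used here):

```r
# Requires an OAuth client (the key and id) downloaded from the Google
# API console, as described in the gmailr vignette.
# "credentials.json" is a placeholder path; the search string is an
# assumed example of a Gmail-style query.
library(gmailr)

gm_auth_configure(path = "credentials.json")  # register the client
gm_auth()                                     # interactive OAuth flow

# Pull the ids of messages matching the search
solicitations <- gm_messages(search = "from:donaldtrump.com",
                             num_results = 500)
```

The result holds message ids; each message can then be fetched and its date extracted to build the time series.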
Oregon COVID data
I wanted to create a self-updating visualization of the data on COVID-19 in the state of Oregon provided by OHA. I still have yet to do that but decided to build this one to visualize the New York Times data.
There is a separate page of daily maps. Oregon reports daily snapshots, so tracking the progression requires ingesting new data each day; I began tracking on March 20, and the process of scraping it is detailed in a separate file.
The RSocrata package makes it easy to work with API calls built around SODA for open data access. If you read the CSV endpoint directly and skip the Socrata part, you usually get only a fraction of the available data because the API pages its responses; read.socrata handles the paging for you.
install.packages("RSocrata")
library(RSocrata)
SchoolSpend <- read.socrata("https://data.oregon.gov/resource/c7av-ntdz.csv")
The first dataset that I found details spending and student counts in Oregon school districts. I want to look at a few basics of it.
library(tigris)
library(rgdal)
library(htmltools)
library(viridis)
library(sf)
library(ggrepel)
library(ggthemes)
library(gganimate)
library(patchwork)
library(hrbrthemes)
load(url(paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")))
Verifying: the loaded dataset shows these data are current as of 2020-12-01.
A base map
To build a map to work from, I need a map library. Load the tigris library, then grab the map as an sf object; ggplot2's geom_sf makes such objects easy to work with. Finally, I join the map to the data.
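Those steps can be sketched as follows; the join is shown as a comment because `Oregon.COVID` and its `County` column are placeholder names for the loaded data, not the actual object names.

```r
library(tigris)
library(sf)
library(dplyr)
library(ggplot2)
options(tigris_use_cache = TRUE)  # cache the shapefile download

# Oregon counties as an sf object; cb = TRUE fetches the lighter
# cartographic boundary file.
OR.counties <- counties(state = "Oregon", cb = TRUE, class = "sf")

# Join the map to the COVID data (placeholder names):
# Map.data <- OR.counties %>%
#   left_join(Oregon.COVID, by = c("NAME" = "County"))

# geom_sf renders the base map directly from the sf object.
ggplot(OR.counties) + geom_sf() + theme_minimal()
```

With the join in place, mapping a variable is just `aes(fill = ...)` inside `geom_sf`.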
New York Times data for the US
The New York Times has a wonderful compilation of United States data on the novel coronavirus. The data update automatically, so the following graphics were generated with data retrieved at 2020-11-30 16:51:46.
The Basic State of Things
options(scipen=9)
library(tidyverse); library(hrbrthemes); library(patchwork); library(plotly); library(ggdark); library(ggrepel); library(lubridate)
CTP <- read.csv("https://covidtracking.com/api/v1/states/daily.csv")
state.data <- read_csv(url("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv"))
Rect.NYT <- complete(state.data, state, date)
# Create new cases and new deaths as day-over-day first differences by
# state (a likely completion of the truncated line; the new variable
# names are assumptions)
Rect.NYT <- Rect.NYT %>% group_by(state) %>% arrange(date) %>% mutate(New.Cases = cases - lag(cases), New.Deaths = deaths - lag(deaths)) %>% ungroup()
Spending on Kids
First, let me import the data.
kids <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-15/kids.csv')
# kids <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-15/kids.csv')
Now let me summarise it and show a table of the variables.
summary(kids)
##     state             variable              year           raw
##  Length:23460       Length:23460       Min.   :1997   Min.   : -60139
##  Class :character   Class :character   1st Qu.:2002   1st Qu.:  71985
##  Mode  :character   Mode  :character   Median :2006   Median : 252002
##                                        Mean   :2006   Mean   :1181359
##                                        3rd Qu.
Bonds
A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators, sufficient to show what we might wish.
Loading the Data
Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv"))
A Summary
library(skimr)
Bonds %>% skim()
Table 1: Data summary
Name: Piped data
Number of rows: 184
Number of columns: 9
Column type frequency: character: 4, numeric: 5
Group variables: None
Variable type: character
How To Import a Microsoft Excel File
The go-to tool comes from the readxl library in R. We can install it with:
install.packages("readxl")
To use it, the Markdown must load the library – make it active – just as we would at the command line. The Files pane makes this easier: we can right-click a file to import it and copy the code from the resulting dialog.
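A minimal sketch of the code that interaction produces, assuming a workbook named BondFunds.xlsx in the working directory (the filename is a placeholder):

```r
library(readxl)

# "BondFunds.xlsx" is a placeholder filename; sheet can be a name or a
# position, and range = "A1:I185" style arguments can restrict the
# import to specific cells.
Bonds <- read_excel("BondFunds.xlsx", sheet = 1)
head(Bonds)
```

read_excel returns a tibble, so the result drops straight into a tidyverse pipeline.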