Oregon COVID data
I wanted to create a self-updating visualization of the data on COVID-19 in the state of Oregon provided by OHA. I still have yet to do that but decided to build this one to visualize the New York Times data.
There is a separate page of daily maps. Oregon reports a set of daily snapshots while progression requires ingesting new data each day so I began tracking it March 20; the process of scraping it is detailed in a separate file.
library(tigris)
library(rgdal)
library(htmltools)
library(viridis) library(sf)
library(ggrepel)
library(ggthemes)
library(gganimate)
library(patchwork)
library(hrbrthemes)
load(url(paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")))
Verifying, these data are current as of 2020-12-01 according to the loaded dataset.
A base map
To build a map to work from, I need a map library. Load the tigris library then grab the map as an sf object; there is a geom_sf that makes them easy to work with. Finally, I join the map to the data.
cars data
I will work with R’s internal dataset on cars: cars. There are two variables in the dataset, this is what they look like.
plot(cars)
An Hypothesis Test
I will work with the speed variable. The hypothesis to advance is that 17 or greater is the true average speed. The alternative must then be that the average speed is less than 17. Knowing only the sample size, I can figure out what \(t\) must be to reject 17 or greater and conclude that the true average must be less with 90% probability.
Bonds A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators sufficient to show what we might wish.
Loading the Data Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv")) A Summary library(skimr) Bonds %>% skim() Table 1: Data summary Name Piped data Number of rows 184 Number of columns 9 _______________________ Column type frequency: character 4 numeric 5 ________________________ Group variables None Variable type: character
How To Import a Microsoft Excel File
The go to tool comes from the readxl library in R. We can install it with:
install.packages("readxl")
To use it, the Markdown must call it – make it active – just as we must at the command line to make it work. The Files pane will make this easier, we can right click to import and get code from the subsequent interaction.
Bonds A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators sufficient to show what we might wish.
Loading the Data Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv")) A Summary library(skimr) Bonds %>% skim() Table 1: Data summary Name Piped data Number of rows 184 Number of columns 9 _______________________ Column type frequency: character 4 numeric 5 ________________________ Group variables None Variable type: character
Bonds A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators sufficient to show what we might wish.
Loading the Data Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv")) A Summary library(skimr) Bonds %>% skim() Table 1: Data summary Name Piped data Number of rows 184 Number of columns 9 _______________________ Column type frequency: character 4 numeric 5 ________________________ Group variables None Variable type: character
Bonds A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators sufficient to show what we might wish.
Loading the Data Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv")) A Summary library(skimr) Bonds %>% skim() Table 1: Data summary Name Piped data Number of rows 184 Number of columns 9 _______________________ Column type frequency: character 4 numeric 5 ________________________ Group variables None Variable type: character
R Natural Earth Map of Europe
library(tidyverse)
library(rnaturalearth)
Europe <- ne_countries(scale = 'medium', type = 'map_units', returnclass = 'sf', continent="Europe")
ggplot(Europe) + geom_sf()
Woops. Because they come from an entire world map, latitude and longitude are too big. To cut them off, we need to crop the map. The returnclass above means we need a tool for sf data, that tool is st_crop out of the sf library.
NB: This was last updated on March 25, 2020.
Building Oregon COVID data
I have a few days of data now. To rebuild it, I will have to use the waybackmachine. The files that I need to locate and follow updates to this page from Oregon’s OHA.
A Scraper
Let me explain the logic for the scraper. NB: I had to rewrite it; the original versions of the website had three tables without data on hospitalizations.