R

Visualizing Two Qualitative Variables

Bonds
A dataset for illustrating the various available visualizations needs a certain degree of richness with manageable size. The dataset on Bonds contains three categorical and a few quantitative indicators sufficient to show what we might wish.
Loading the Data
Bonds <- read.csv(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/BondFunds.csv"))
A Summary
library(skimr)
Bonds %>% skim()
Table 1: Data summary
Name: Piped data
Number of rows: 184
Number of columns: 9
Column type frequency: character 4, numeric 5
Group variables: None
Variable type: character
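As a rough illustration of what visualizing two qualitative variables can look like, here is a minimal base R sketch. It does not download the Bonds file; it builds a tiny made-up data frame, and the column names Type and Risk (and their values) are assumptions for illustration rather than something confirmed by the excerpt above.

```r
# Toy stand-in for the Bonds data. The column names (Type, Risk) and
# all values are assumptions made up for this sketch.
toy_bonds <- data.frame(
  Type = c("Intermediate Government", "Short Term Corporate",
           "Short Term Corporate", "Intermediate Government",
           "Short Term Corporate", "Intermediate Government"),
  Risk = c("Below average", "Average", "Above average",
           "Average", "Average", "Below average"),
  stringsAsFactors = FALSE
)

# Two-way table of counts: rows are fund types, columns are risk levels
counts <- table(toy_bonds$Type, toy_bonds$Risk)

# Row proportions: the share of each risk level within a fund type
row_shares <- prop.table(counts, margin = 1)

# Grouped bar chart: one cluster per fund type, one bar per risk level
barplot(t(counts), beside = TRUE, legend.text = TRUE,
        xlab = "Type", ylab = "Funds")
```

The same two-way table could feed a ggplot2 bar chart instead; base R is used here only to keep the sketch free of package dependencies.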

Visualizing One Qualitative Variable


Cocktails

The Data
cocktails <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-26/cocktails.csv')
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   row_id = col_double(),
##   drink = col_character(),
##   date_modified = col_datetime(format = ""),
##   id_drink = col_double(),
##   alcoholic = col_character(),
##   category = col_character(),
##   drink_thumb = col_character(),
##   glass = col_character(),
##   iba = col_character(),
##   video = col_logical(),
##   ingredient_number = col_double(),
##   ingredient = col_character(),
##   measure = col_character()
## )
boston_cocktails <- readr::read_csv('https://raw.

Non-Profits in Oregon: Socrata is Cool

Socrata: The Open Data Portal
I did not previously know much about precisely how open data portals had evolved. Oregon’s is quite nice and I will take the opportunity to map and summarise non-profits throughout the state. Here is the data.
library(RSocrata)
Oregon.Nonprofits <- read.socrata("https://data.oregon.gov/resource/8kyv-b2kw.csv")
glimpse(Oregon.Nonprofits)
## Rows: 163,489
## Columns: 18
## $ registry_number             <int> 299818, 299818, 299818, 299818, 299818, 5…
## $ business_name               <chr> "UNITED METHODIST CHURCH, OREGON CITY, OR…
## $ entity_type                 <chr> "DOMESTIC NONPROFIT CORPORATION", "DOMEST…
## $ registry_date               <chr> "1850-05-17 00:00:00", "1850-05-17 00:00:…
## $ nonprofit_type              <chr> "RELIGIOUS WITH MEMBERS", "RELIGIOUS WITH…
## $ associated_name_type        <chr> "MAILING ADDRESS", "PRESIDENT", "PRINCIPA…
## $ first_name                  <chr> "", "MIKE", "", "MIKE", "CHRISTA", "", "S…
## $ middle_name                 <chr> "", "", "", "", "", "", "E", "", "", "", …
## $ last_name                   <chr> "", "BENISCHEK", "", "BENISCHEK", "PALMER…
## $ suffix                      <chr> "", "", "", "", "", "", "", "", "", "", "…
## $ not_of_record_entity        <chr> "", "", "", "", "", "", "", "", "", "", "…
## $ entity_of_record_reg_number <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ entity_of_record_name       <chr> "", "", "", "", "", "", "", "", "", "", "…
## $ address                     <chr> "18955 S SOUTH END RD", "18955 S SOUTH EN…
## $ address_continued           <chr> "", "", "", "", "", "", "", "", "", "", "…
## $ city                        <chr> "OREGON CITY", "OREGON CITY", "OREGON CIT…
## $ state                       <chr> "OR", "OR", "OR", "OR", "OR", "OR", "OR",…
## $ zip_code                    <chr> "97045", "97045", "97045", "97045", "9704…
A basic zip code map
or_zips <- zctas(cb = TRUE, starts_with = "97", class="sf")
or_zips %>% ggplot(.
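The glimpse output shows the same registry_number repeated across several rows, one per associated_name_type, so counting rows per zip code would overstate the number of non-profits. A minimal base R sketch of one way to get a per-zip count before joining to a map follows. The toy rows and registry numbers are invented, and keeping the first row per registry_number is a simplifying assumption; the original post may handle this differently.

```r
# Toy rows mimicking the structure shown by glimpse(); all values invented.
toy_nonprofits <- data.frame(
  registry_number = c(1001, 1001, 1001, 1002, 1003),
  associated_name_type = c("MAILING ADDRESS", "PRESIDENT",
                           "PRINCIPAL PLACE OF BUSINESS",
                           "MAILING ADDRESS", "MAILING ADDRESS"),
  zip_code = c("97045", "97045", "97045", "97045", "97201"),
  stringsAsFactors = FALSE
)

# Keep one row per organization (assumption: the first row is good enough)
one_per_org <- toy_nonprofits[!duplicated(toy_nonprofits$registry_number), ]

# Count organizations per zip code; the result has columns
# zip_code and nonprofits, ready to merge onto zip code geometries
zip_counts <- as.data.frame(
  table(zip_code = one_per_org$zip_code),
  responseName = "nonprofits",
  stringsAsFactors = FALSE
)
zip_counts
```

Keeping zip_code as character matters here, since a numeric conversion would drop leading zeros in other states and complicate a later join.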

A GeoFacet of Credit Quality

In previous work with Skip Krueger, we conceptualized bond ratings as a multiple-rater problem and extracted a measure of state-level creditworthiness. I had always had it on my list to do something like this and recently ran across a package called geofacet that makes it simply too easy to do. So here goes. The code is below the post.
library(haven)
library(dplyr)
Pew.Data <- read_dta(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/Pew/modeledforprediction.dta"))
library(tidyverse)
load(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/Pew/Scaled-BR-Pew.RData"))
state.ratings <- data.

A Quick tidyTuesday on Beer, Breweries, and Ingredients

Beer Distribution
The #tidyTuesday for March 31, 2020 is on beer. The essential elements and a method for pulling the data are shown in the image: [image hosted on Imgur]
A Comment on Scraping .pdf
The Tweet
The details on how the data were obtained give a nice overview of scraping .pdf files. The code for doing it is at the bottom of the page. @thomasmock has done a great job commenting his way through it.

Mapping COVID-19 in Oregon

Oregon COVID data
The Oregon data are available from OHA here. I cut and pasted the first two days because it was easy with datapasta. As time went on, it became easier to write a self-updating script, which I detail elsewhere.
urbnmapr
The Urban Institute has an excellent state and county mapping package. I want to make use of the county-level data and plot the starter map.

tidyTuesday on the Office

The Office
library(tidyverse)
office_ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-17/office_ratings.csv')
A First Plot
The number of episodes of The Office by season.
library(janitor)
TableS <- office_ratings %>% tabyl(season)
# guides(fill="none") replaces the deprecated guides(fill=FALSE)
p1 <- TableS %>%
  ggplot(., aes(x=as.factor(season), y=n, fill=as.factor(season))) +
  geom_col() +
  labs(x="Season", y="Episodes", title="The Office: Episodes") +
  guides(fill="none")
p1
Ratings
How are the various seasons and episodes rated?
p2 <- office_ratings %>%
  ggplot(., aes(x=as.factor(season), y=imdb_rating, fill=as.factor(season), color=as.factor(season))) +
  geom_violin(alpha=0.3) +
  guides(fill="none", color="none") +
  labs(x="Season", y="IMDB Rating") +
  geom_point()
p2
Patchwork
Using patchwork, we can combine multiple plots.
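The excerpt stops before the patchwork code, so here is a minimal sketch of how two ggplot objects can be combined, assuming the ggplot2 and patchwork packages are installed. The toy data and the plot names p_count and p_rating are invented for illustration; this is not the original post's code.

```r
library(ggplot2)
library(patchwork)

# Toy ratings data; values are invented for this sketch
toy_ratings <- data.frame(
  season = factor(c(1, 1, 2, 2, 2)),
  imdb_rating = c(7.5, 8.1, 8.3, 8.0, 8.7)
)

# Episodes per season
p_count <- ggplot(toy_ratings, aes(x = season)) +
  geom_bar() +
  labs(x = "Season", y = "Episodes")

# Ratings by season
p_rating <- ggplot(toy_ratings, aes(x = season, y = imdb_rating)) +
  geom_point() +
  labs(x = "Season", y = "IMDB Rating")

# `+` places the plots side by side; `/` stacks them vertically
combined <- p_count + p_rating
stacked <- p_count / p_rating
```

Printing combined or stacked draws the composite figure, and both still accept ggplot2 additions such as a shared theme via patchwork's `&` operator.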

Tracking COVID-19 2020-03-24

R to Import COVID Data
library(tidyverse)
library(gganimate)
COVID.states <- read.csv(url("http://covidtracking.com/api/states/daily.csv"))
COVID.states <- COVID.states %>% mutate(Date = as.Date(as.character(date), format = "%Y%m%d"))
The Raw Testing Incidence
I want to use patchwork to show the testing rate by state in the United States. Then I want to show where things currently stand. In both cases, a base-10 log is used on the number of tests.
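To make the two transformations concrete, here is a small base R sketch of the date conversion and the base-10 log on a toy data frame. The column name total and all of the numbers are invented for illustration; they are not real testing counts.

```r
# Toy data: dates stored as yyyymmdd integers, as in the import code.
# The column name `total` and the counts are assumptions for this sketch.
toy_covid <- data.frame(
  date = c(20200322L, 20200323L, 20200324L),
  state = "OR",
  total = c(100, 1000, 10000)
)

# Convert the integer to character first, then parse with %Y%m%d
toy_covid$Date <- as.Date(as.character(toy_covid$date), format = "%Y%m%d")

# Base-10 log of the number of tests; each unit is a tenfold increase
toy_covid$log10_tests <- log10(toy_covid$total)

toy_covid
```

Note that log10() returns -Inf for zero counts, so rows with no tests would need to be filtered or offset before plotting.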

A Look at VIX: 2020-10-19

Get Some VIX data
NB: I originally wrote this on February 27, 2020, so there is commentary surrounding that date. It was done quickly, out of curiosity. I will update it by recompiling it with new data and will update the commentary, noting when each update took place. The Chicago Board Options Exchange (CBOE) makes data available regularly. To judge the currency of the data, I have tailed it below after converting the dates to a valid date format.
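The check described above, converting the dates and then looking at the tail, might look like the following base R sketch. The toy values are invented, and both the column names and the month/day/year text format are assumptions; the actual CBOE file layout should be checked before reusing this.

```r
# Toy stand-in for the downloaded VIX data; values are invented and the
# "%m/%d/%Y" text format is an assumption about the source file.
toy_vix <- data.frame(
  Date = c("02/25/2020", "02/26/2020", "02/27/2020"),
  Close = c(20, 21, 22),
  stringsAsFactors = FALSE
)

# Convert the text dates into proper Date objects
toy_vix$Date <- as.Date(toy_vix$Date, format = "%m/%d/%Y")

# The last rows show how current the data are
tail(toy_vix, 2)

# The most recent date, regardless of row order
latest <- max(toy_vix$Date)
latest
```

If the format string does not match the file, as.Date() returns NA rather than an error, so a quick check with anyNA() on the converted column is worthwhile.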