R Markdown
There is detailed help for all that Markdown can do under Help in RStudio. The key to it is knitting documents with the Knit button in RStudio. If we use helpers like the R Commander, Radiant, or esquisse, we will need the R code embedded in the Markdown document in particular ways. I will use Markdown for everything; I even use a close relation of Markdown in my scholarly pursuits.
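As a minimal illustration of what "embedded in particular ways" means, R code lives in fenced chunks that the Knit button executes; the chunk label below is made up for the example.

````markdown
```{r cars-summary}
# Knitting runs this chunk and weaves its output into the document.
summary(cars)
```
````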
Scraping NFL data
Note: An original version of this post had issues induced by overtime games. There is a better way to handle all of this that I learned from a brief analysis of a tie game between Cleveland and Pittsburgh in Week One.
Scraping NFL data with nflscrapR
The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from GitHub.
# install devtools first if needed: install.packages("devtools")
devtools::install_github(repo = "maksimhorowitz/nflscrapR")
The GitHub page for nflscrapR is quite informative. It has a lot of useful insight for working with the data; the dataset itself is quite large.
Getting Some Data
Following the guide to the package on GitHub, let me try their example.
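A sketch of that example, following my reading of the package README; the function names and arguments (`scrape_game_ids()`, `scrape_season_play_by_play()`) are assumptions to verify against the GitHub page.

```r
library(nflscrapR)

# Game ids for one week of the 2018 regular season.
week2_games <- scrape_game_ids(2018, weeks = 2)
head(week2_games)

# Play-by-play for a full regular season (a large download).
pbp_2018 <- scrape_season_play_by_play(2018, type = "reg")
dim(pbp_2018)
```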
Archigos
Archigos is an amazing collaboration that produced a comprehensive dataset of world leaders going back quite far; see Archigos on the web. For thinking about leadership, it is quite natural. In this post, I want to do some reshaping into country-year and leader-year datasets and explore the basic confines of Archigos. I also want to use gganimate for a few things. So what do we know?
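As a sketch of the leader-year reshaping, assuming one row per leader spell; the column names (`leader`, `ccode`, `startdate`, `enddate`) and the rows are illustrative placeholders, not Archigos's actual variables.

```r
library(tidyverse)
library(lubridate)

# Illustrative stand-in for Archigos: one row per leader spell.
spells <- tribble(
  ~leader,    ~ccode, ~startdate,   ~enddate,
  "Leader A", 2,      "1993-01-20", "2001-01-20",
  "Leader B", 2,      "2001-01-20", "2009-01-20"
)

# Expand each spell into leader-year rows.
leader_year <- spells %>%
  mutate(across(c(startdate, enddate), ymd),
         year = map2(year(startdate), year(enddate), seq)) %>%
  unnest(year)

leader_year
```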
FRED via fredr
The Federal Reserve Economic Database (FRED) is a wonderful public resource for data, and the R API that connects to it is very easy to use for the things I have previously needed. For example, one of my students was interested in commercial credit default data. I used the FRED search instructions in the fredr vignette to find those data; my first step was that vignette.
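A minimal sketch of that workflow with fredr, assuming you have a FRED API key; the search text and the series id are illustrative guesses, not taken from the original post.

```r
library(fredr)

# An API key from https://fred.stlouisfed.org is required.
fredr_set_key("YOUR_FRED_API_KEY")

# Search FRED for candidate series, as the vignette describes.
candidates <- fredr_series_search_text("commercial loan delinquency")
head(candidates[, c("id", "title")])

# Pull one series by id; DRBLACBS (delinquency rate on business loans)
# is my guess at a relevant id -- verify via the search above.
delinq <- fredr(series_id = "DRBLACBS")
head(delinq)
```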
tidyquant
tidyquant automates a lot of equity research and calculation using tidy concepts. Here, I will first use it to get the components of the S&P 500 and pick out those with index weights over 1.25 percent. In the next step, I download the data, and finally I calculate daily returns and a cumulative wealth index.
library(tidyquant)
library(tidyverse)
tq_index("SP500") %>% filter(weight > 0.0125) %>% select(symbol,company) -> Tickers
Tickers <- Tickers %>% filter(symbol!
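The download, daily-return, and wealth-index steps promised above might look like the following sketch; the start date is an assumption.

```r
# Daily prices for the selected tickers; the start date is illustrative.
prices <- tq_get(Tickers$symbol, get = "stock.prices", from = "2015-01-01")

# Daily returns by ticker, then a cumulative wealth index of $1 invested.
returns <- prices %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "daily",
               col_rename = "daily_return") %>%
  mutate(wealth_index = cumprod(1 + daily_return))

head(returns)
```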
Trump’s Tone
A cool post on sentiment analysis can be found here. I will now get at the time series characteristics of his tweets and the sentiment stuff.
I start by loading the tml.djt object (stored in TMLS.RData) that I created in the previous post.
Trump’s Overall Tweeting
What does it look like?
library(tidyverse)
library(tidytext)
library(SnowballC)
library(tm)
library(syuzhet)
library(rtweet)
load(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/TMLS.RData"))
names(tml.djt)
## [1] "user_id" "status_id" ## [3] "created_at" "screen_name" ## [5] "text" "source" ## [7] "display_text_width" "reply_to_status_id" ## [9] "reply_to_user_id" "reply_to_screen_name" ## [11] "is_quote" "is_retweet" ## [13] "favorite_count" "retweet_count" ## [15] "hashtags" "symbols" ## [17] "urls_url" "urls_t.
The tidyTuesday for this week is coffee chain locations
For this week:
1. The basic link for this #tidyTuesday points to an original article for Week 6.
First, let's import the data; it is a single Excel spreadsheet (a sketch of the import follows the library calls below). The page notes that Starbucks, Tim Hortons, and Dunkin' Donuts have raw data available.
library(readxl)
library(tidyverse)
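A minimal import sketch, assuming the week's spreadsheet has been downloaded locally; the file name and sheet choice are assumptions from the #tidyTuesday repository layout, not verified.

```r
# Path and sheet are assumptions; adjust to the downloaded file.
coffee <- read_excel("week6_coffee_chains.xlsx", sheet = 1)
glimpse(coffee)
```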