mathart
A cool package for math-generated art that I recently discovered. Here is the install code for it:
install.packages(c("devtools", "mapproj", "tidyverse", "ggforce", "Rcpp"))
devtools::install_github("marcusvolz/mathart")
devtools::install_github("marcusvolz/ggart")
devtools::install_github("gsimchoni/kandinsky")
Load some libraries
library(mathart)
library(ggart)
library(ggforce)
library(Rcpp)
library(tidyverse)
Generate some Art?
This is quite fun to do. The example below is adapted from the mathart README: starting from 10,000 random points, each path iterates toward the geometric median of ten random terminal points via weiszfeld().
set.seed(12341)
points <- data.frame(x = runif(10000, 0, 10000), y = runif(10000, 0, 10000))  # starting points; missing from the original snippet
terminals <- data.frame(x = runif(10, 0, 10000), y = runif(10, 0, 10000))
df <- 1:10000 %>%
  map_df(~weiszfeld(terminals, c(points$x[.], points$y[.])), .id = "id")
p <- ggplot() +
  geom_point(aes(x, y), points, size = 1, alpha = 0.25) +  # alpha assumed; the original line was cut off
  geom_point(aes(x, y), terminals, size = 5) +
  geom_line(aes(x, y, group = id), df, size = 0.03) +
  coord_equal() +
  theme_void()
p
The Economist’s Errors and Credit Where Credit is Due
The Economist is serious about its use of data visualization and has occasionally owned up to errors in its charts. Visualizations can be deceptive, uninformative, confusing, or excessively busy, and they can present a host of other barriers to clean communication. The Economist's blog post on its errors is great.
I have drawn the following example from a #tidyTuesday challenge earlier this year that explores these errors.
Scraping NFL data
Note: An original version of this post had issues induced by overtime games. There is a better way to handle all of this, which I learned from a brief analysis of a Week One tie game between Cleveland and Pittsburgh.
The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from GitHub.
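Something like the following should work, with devtools already installed as above (the repository name maksimhorowitz/nflscrapR is my assumption):
# install nflscrapR from GitHub (repository name assumed)
devtools::install_github("maksimhorowitz/nflscrapR")
library(nflscrapR)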
tidyquant
tidyquant automates a lot of equity research and calculation using tidy concepts. Here, I first use it to get the components of the S&P 500 and pick out those with index weights over 1.25 percent. In the next step, I download the price data and finally calculate daily returns and a cumulative wealth index.
library(tidyquant)
library(tidyverse)
tq_index("SP500") %>% filter(weight > 0.0125) %>% select(symbol, company) -> Tickers
Tickers <- Tickers %>% filter(symbol != "BRK.B")  # the original filter was truncated; excluding BRK.B is an assumption (tickers containing dots often fail to download)
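The download and return calculations described above might look something like this (the start date and column names are my assumptions):
# download daily prices for the selected tickers (start date assumed)
Prices <- Tickers %>%
  tq_get(get = "stock.prices", from = "2018-01-01")
# daily returns from adjusted closes, then a cumulative wealth index per symbol
Returns <- Prices %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted, mutate_fun = periodReturn,
               period = "daily", col_rename = "daily_return") %>%
  mutate(wealth_index = cumprod(1 + daily_return))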
Variance in the Outcome: The Black Box
Regression models engage in an exercise in variance accounting. How much of the outcome is explained by the inputs, individually (a slope divided by its standard error gives t) and collectively (average explained variance over average unexplained variance, with each average taken over its degrees of freedom, gives F)? This, of course, assumes normal errors. This document provides a function for making use of the black box. Just as in common parlance, the black box is the unexplained part.
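A minimal sketch of such a function, assuming it simply splits the outcome's total sum of squares from a fitted lm into explained and unexplained pieces (the name black_box and the output layout are mine):
# split the outcome's total sum of squares into explained and unexplained parts
black_box <- function(model) {
  y   <- model$model[[1]]          # the response from the fitted lm
  tss <- sum((y - mean(y))^2)      # total sum of squares
  rss <- sum(residuals(model)^2)   # unexplained (residual) sum of squares
  data.frame(explained = tss - rss, unexplained = rss, r_squared = 1 - rss / tss)
}
black_box(lm(mpg ~ wt + hp, data = mtcars))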
Correlations and the Impact on Sums and Differences
I will use a simple R function to illustrate the effect of correlation on sums and differences of random variables. In general, the variance of a sum of two random variables is the sum of the individual variances plus twice the covariance, Var(X + Y) = Var(X) + Var(Y) + 2*Cov(X, Y), while the variance of a difference subtracts twice the (signed) covariance, Var(X - Y) = Var(X) + Var(Y) - 2*Cov(X, Y). The standard deviation is the square root of the corresponding variance, so it inherits the effect of the covariance but is not itself additive.
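A minimal sketch of such a function, assuming a simulation-based illustration (the function name and defaults are mine):
# simulate two standard normals with correlation rho and compare the sample
# variances of their sum and difference against the covariance identity
sum_diff_var <- function(rho, n = 1e5) {
  x <- rnorm(n)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # cor(x, y) is approximately rho
  c(var_sum       = var(x + y),
    var_diff      = var(x - y),
    identity_sum  = var(x) + var(y) + 2 * cov(x, y),
    identity_diff = var(x) + var(y) - 2 * cov(x, y))
}
sum_diff_var(0.5)  # var_sum near 3 and var_diff near 1 when rho = 0.5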