This is a more convenient place than the WISE ecosystem to host materials for my data class in the MBA for Professionals. I will post solution sets, exercises, and bits of help here, along with those posted in the R directory.
The Basic Idea
Two classes of statistics have known distributions.
Sample means, standardized by their estimated standard error, follow a t distribution.
Sample proportions are approximately normal if the expected count in both categories exceeds 5.
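As a rough illustration of the rule for proportions, here is a minimal Python sketch that uses only the standard library. The helper name `proportion_interval` and the example counts are assumptions made up for this illustration, and the observed counts stand in for the expected counts in the rule of thumb.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a proportion (hypothetical helper)."""
    # Rule of thumb from the text: both categories should have more than 5.
    # Assumption: observed counts are used as a stand-in for expected counts.
    if successes <= 5 or n - successes <= 5:
        raise ValueError("counts too small for the normal approximation")
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return p_hat - z * se, p_hat + z * se


# Made-up example: 40 of 100 in the first category.
low, high = proportion_interval(successes=40, n=100)
print(round(low, 3), round(high, 3))
```

The function refuses to produce an interval when either category is too sparse, which mirrors the condition stated above rather than silently returning an unreliable answer.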
The Mean: t
The t distribution
- is entirely defined by degrees of freedom.
- has as its metric the standard error [in this case, of the mean].
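The equations, with $\bar{x}$ the sample mean, $s$ the sample standard deviation, and $n$ the sample size:
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
and
$$ \mu = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} $$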
The interval for the true mean is symmetric about the sample mean, with t defining the number of standard errors of the mean above and below it.
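A small Python sketch may make the interval concrete. It assumes the critical value of t is looked up elsewhere (for example in a t table) and passed in; the helper name `mean_interval` and the sample values are made up for illustration and are not course data.

```python
from math import sqrt
from statistics import mean, stdev


def mean_interval(sample, t_crit):
    """Interval for the true mean: sample mean plus or minus t_crit standard errors."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / sqrt(n)  # standard error of the mean (stdev divides by n - 1)
    return x_bar - t_crit * se, x_bar + t_crit * se


# Made-up illustrative values, not course data.
sample = [12.1, 9.8, 11.4, 10.3, 10.9, 11.7, 9.5, 10.6, 11.2, 10.5]
# Two-sided 95% critical value for n - 1 = 9 degrees of freedom, from a t table.
T_CRIT_DF9 = 2.262
low, high = mean_interval(sample, T_CRIT_DF9)
print(f"{low:.2f} to {high:.2f}")
```

Passing the critical value in keeps the sketch free of third-party packages; the degrees of freedom enter only through that one number.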
Probability: The Logic of Science
Jaynes presents a few core ideas and requirements for his rational system. Probability emerges as the representation of circumstances in which any given realization of a process is either TRUE or FALSE, but both are possible and expressible by probabilities that
- sum to one over all events
- are greater than or equal to zero for any given event
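The two requirements can be checked mechanically. The following Python sketch is only an illustration; the function name `is_valid_distribution` and the example probabilities are assumptions, not anything taken from Jaynes.

```python
from math import isclose


def is_valid_distribution(probs, tol=1e-9):
    """True if every probability is >= 0 and the values sum to one."""
    values = list(probs.values())
    return all(p >= 0 for p in values) and isclose(sum(values), 1.0, abs_tol=tol)


# Made-up probabilities for a TRUE/FALSE event.
print(is_valid_distribution({"TRUE": 0.3, "FALSE": 0.7}))
print(is_valid_distribution({"TRUE": 0.6, "FALSE": 0.6}))
```

The tolerance is there because floating-point sums rarely land on exactly one.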
General Representation of Probability
Is of necessity two-dimensional,
Fast Food Data
These data come courtesy of a Tidy Tuesday release from a while ago. The data consist of menu items for a selection of fast food chains; the unit of observation is the menu item. We have the chain [restaurant], item [the item name], and a series of variables (columns) recording sodium, cholesterol, fat, calories, and other information. Some values are missing. The data can be imported from the tidytuesday website on github as .
Variance in the Outcome: The Black Box
Regression models engage in an exercise in variance accounting: how much of the outcome is explained by the inputs, individually (slope divided by standard error is t) and collectively (average explained divided by average unexplained, averaging over degrees of freedom, is F)? This, of course, assumes normal errors. This document provides a function for making use of the black box. Just as in common parlance, a black box is the unexplained.
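To make the variance accounting concrete, here is a minimal Python sketch for a regression with a single input. It is not the function this document refers to; the name `variance_accounting` and the data are made up for illustration.

```python
from math import sqrt


def variance_accounting(x, y):
    """One-input regression: return the slope's t statistic and the overall F."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    ss_explained = sum((fi - y_bar) ** 2 for fi in fitted)
    ss_unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    df_model, df_resid = 1, n - 2  # one slope; n minus slope and intercept
    ms_unexplained = ss_unexplained / df_resid  # average unexplained
    t_stat = slope / sqrt(ms_unexplained / sxx)  # slope divided by its standard error
    f_stat = (ss_explained / df_model) / ms_unexplained  # average explained / average unexplained
    return t_stat, f_stat


# Made-up illustrative data.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
t_stat, f_stat = variance_accounting(x, y)
print(round(t_stat, 2), round(f_stat, 2))
```

With only one input, the individual and collective views coincide: F equals t squared, which is a handy sanity check on the arithmetic.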