NFL ScrapR

Scraping NFL data with nflscrapR

The nflscrapR package is designed to make data on NFL games more easily available. To install the package, we need to grab it from GitHub.

devtools::install_github(repo = "maksimhorowitz/nflscrapR")

The GitHub page for nflscrapR is quite informative, with a lot of useful guidance for working with the data; the data set itself is quite large.

Getting Some Data

Following the guide to the package on GitHub, let me try their example.

library(nflscrapR)
all_2018_games <- scrape_game_ids(2018) # Default is regular season

I saved a local copy of it to work with; this is the 2018 season.

library(emo)
library(tidyverse)
library(kableExtra)
library(nflscrapR)
library(RCurl)
all_2018_games <- readRDS(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/NFLGames2018.rds"))
all_2018_games %>% select(week, game_id, home_team, home_score, away_team, away_score) %>% 
    kable(format = "html") %>% kable_styling() %>% scroll_box(width = "600px", height = "500px")
week game_id home_team home_score away_team away_score
1 2018090600 PHI 18 ATL 12
1 2018090900 BAL 47 BUF 3
1 2018090907 NYG 15 JAX 20
1 2018090906 NO 40 TB 48
1 2018090905 NE 27 HOU 20
1 2018090904 MIN 24 SF 16
1 2018090903 MIA 27 TEN 20
1 2018090902 IND 23 CIN 34
1 2018090901 CLE 21 PIT 21
1 2018090908 LAC 28 KC 38
1 2018090911 DEN 27 SEA 24
1 2018090910 CAR 16 DAL 8
1 2018090909 ARI 6 WAS 24
1 2018090912 GB 24 CHI 23
1 2018091000 DET 17 NYJ 48
1 2018091001 OAK 13 LA 33
2 2018091300 CIN 34 BAL 23
2 2018091600 ATL 31 CAR 24
2 2018091608 WAS 9 IND 21
2 2018091607 TEN 20 HOU 17
2 2018091606 TB 27 PHI 21
2 2018091605 PIT 37 KC 42
2 2018091604 NYJ 12 MIA 20
2 2018091601 BUF 20 LAC 31
2 2018091602 GB 29 MIN 29
2 2018091603 NO 21 CLE 18
2 2018091610 SF 30 DET 27
2 2018091609 LA 34 ARI 0
2 2018091612 JAX 31 NE 20
2 2018091611 DEN 20 OAK 19
2 2018091613 DAL 20 NYG 13
2 2018091700 CHI 24 SEA 17
3 2018092000 CLE 21 NYJ 17
3 2018092300 ATL 37 NO 43
3 2018092309 WAS 31 GB 17
3 2018092308 PHI 20 IND 16
3 2018092307 MIN 6 BUF 27
3 2018092306 MIA 28 OAK 20
3 2018092301 BAL 27 DEN 14
3 2018092302 CAR 31 CIN 21
3 2018092303 HOU 22 NYG 27
3 2018092304 JAX 6 TEN 9
3 2018092305 KC 38 SF 27
3 2018092310 LA 35 LAC 23
3 2018092312 SEA 24 DAL 13
3 2018092311 ARI 14 CHI 16
3 2018092313 DET 26 NE 10
3 2018092400 TB 27 PIT 30
4 2018092700 LA 38 MIN 31
4 2018093005 JAX 31 NYJ 12
4 2018093006 NE 38 MIA 7
4 2018093007 TEN 26 PHI 23
4 2018093004 IND 34 HOU 37
4 2018093003 GB 22 BUF 0
4 2018093002 DAL 26 DET 24
4 2018093001 CHI 48 TB 10
4 2018093000 ATL 36 CIN 37
4 2018093008 ARI 17 SEA 20
4 2018093009 OAK 45 CLE 42
4 2018093011 NYG 18 NO 33
4 2018093010 LAC 29 SF 27
4 2018093012 PIT 14 BAL 26
4 2018100100 DEN 23 KC 27
5 2018100400 NE 38 IND 24
5 2018100700 BUF 13 TEN 12
5 2018100707 PIT 41 ATL 17
5 2018100706 NYJ 34 DEN 16
5 2018100705 KC 30 JAX 14
5 2018100704 DET 31 GB 23
5 2018100703 CLE 12 BAL 9
5 2018100701 CAR 33 NYG 31
5 2018100702 CIN 27 MIA 17
5 2018100708 LAC 26 OAK 10
5 2018100710 SF 18 ARI 28
5 2018100709 PHI 21 MIN 23
5 2018100711 SEA 31 LA 33
5 2018100712 HOU 19 DAL 16
5 2018100800 NO 43 WAS 19
6 2018101100 NYG 13 PHI 34
6 2018101400 ATL 34 TB 29
6 2018101408 WAS 23 CAR 17
6 2018101407 OAK 3 SEA 27
6 2018101406 NYJ 42 IND 34
6 2018101405 MIN 27 ARI 17
6 2018101401 CIN 21 PIT 28
6 2018101402 CLE 14 LAC 38
6 2018101403 HOU 20 BUF 13
6 2018101404 MIA 31 CHI 28
6 2018101409 DEN 20 LA 23
6 2018101411 TEN 0 BAL 21
6 2018101410 DAL 40 JAX 7
6 2018101412 NE 43 KC 40
6 2018101500 GB 33 SF 30
7 2018101800 ARI 10 DEN 45
7 2018102100 LAC 20 TEN 19
7 2018102104 JAX 7 HOU 20
7 2018102108 PHI 17 CAR 21
7 2018102107 NYJ 17 MIN 37
7 2018102102 CHI 31 NE 38
7 2018102103 IND 37 BUF 5
7 2018102109 TB 26 CLE 23
7 2018102106 MIA 21 DET 32
7 2018102101 BAL 23 NO 24
7 2018102110 WAS 20 DAL 17
7 2018102111 SF 10 LA 39
7 2018102105 KC 45 CIN 10
7 2018102200 ATL 23 NYG 20
8 2018102500 HOU 42 MIA 23
8 2018102800 JAX 18 PHI 24
8 2018102805 KC 30 DEN 23
8 2018102807 PIT 33 CLE 18
8 2018102806 NYG 13 WAS 20
8 2018102804 DET 14 SEA 28
8 2018102803 CIN 37 TB 34
8 2018102802 CHI 24 NYJ 10
8 2018102801 CAR 36 BAL 21
8 2018102808 OAK 28 IND 42
8 2018102809 ARI 18 SF 15
8 2018102810 LA 29 GB 27
8 2018102811 MIN 20 NO 30
8 2018102900 BUF 6 NE 25
9 2018110100 SF 34 OAK 3
9 2018110401 BUF 9 CHI 41
9 2018110402 CAR 42 TB 28
9 2018110403 CLE 21 KC 37
9 2018110404 MIA 13 NYJ 6
9 2018110400 BAL 16 PIT 23
9 2018110405 MIN 24 DET 9
9 2018110406 WAS 14 ATL 38
9 2018110407 DEN 17 HOU 19
9 2018110408 SEA 17 LAC 25
9 2018110409 NO 45 LA 35
9 2018110410 NE 31 GB 17
9 2018110500 DAL 14 TEN 28
10 2018110800 PIT 52 CAR 21
10 2018111101 CIN 14 NO 51
10 2018111102 CLE 28 ATL 16
10 2018111100 CHI 34 DET 22
10 2018111105 KC 26 ARI 14
10 2018111108 TEN 34 NE 10
10 2018111107 TB 3 WAS 16
10 2018111106 NYJ 10 BUF 41
10 2018111104 IND 29 JAX 26
10 2018111109 OAK 6 LAC 20
10 2018111110 LA 36 SEA 31
10 2018111103 GB 31 MIA 12
10 2018111111 PHI 20 DAL 27
10 2018111200 SF 23 NYG 27
11 2018111500 SEA 27 GB 24
11 2018111801 BAL 24 CIN 21
11 2018111800 ATL 19 DAL 22
11 2018111805 NYG 38 TB 35
11 2018111810 JAX 16 PIT 20
11 2018111806 WAS 21 HOU 23
11 2018111804 IND 38 TEN 10
11 2018111803 DET 20 CAR 19
11 2018111808 LAC 22 DEN 23
11 2018111807 ARI 21 OAK 23
11 2018111809 NO 48 PHI 7
11 2018111802 CHI 25 MIN 20
11 2018111900 LA 54 KC 51
12 2018112200 DET 16 CHI 23
12 2018112201 DAL 31 WAS 23
12 2018112202 NO 31 ATL 17
12 2018112503 CIN 20 CLE 35
12 2018112507 TB 27 SF 9
12 2018112501 BUF 24 JAX 21
12 2018112500 BAL 34 OAK 17
12 2018112502 CAR 27 SEA 30
12 2018112505 NYJ 13 NE 27
12 2018112506 PHI 25 NYG 22
12 2018112508 LAC 45 ARI 10
12 2018112504 IND 27 MIA 24
12 2018112509 DEN 24 PIT 17
12 2018112510 MIN 24 GB 17
12 2018112600 HOU 34 TEN 17
13 2018112900 DAL 13 NO 10
13 2018120200 ATL 16 BAL 26
13 2018120209 TB 24 CAR 17
13 2018120207 NYG 30 CHI 27
13 2018120206 MIA 21 BUF 17
13 2018120205 JAX 6 IND 0
13 2018120204 HOU 29 CLE 13
13 2018120201 CIN 10 DEN 24
13 2018120202 DET 16 LA 30
13 2018120203 GB 17 ARI 20
13 2018120210 OAK 33 KC 40
13 2018120211 TEN 26 NYJ 22
13 2018120213 SEA 43 SF 16
13 2018120212 NE 24 MIN 10
13 2018120208 PIT 30 LAC 33
13 2018120300 PHI 28 WAS 13
14 2018120600 TEN 30 JAX 9
14 2018120900 BUF 23 NYJ 27
14 2018120908 WAS 16 NYG 40
14 2018120907 TB 14 NO 28
14 2018120906 MIA 34 NE 33
14 2018120905 KC 27 BAL 24
14 2018120904 HOU 21 IND 24
14 2018120903 GB 34 ATL 20
14 2018120902 CLE 26 CAR 20
14 2018120910 SF 20 DEN 14
14 2018120909 LAC 26 CIN 21
14 2018120911 ARI 3 DET 17
14 2018120913 OAK 24 PIT 21
14 2018120912 DAL 29 PHI 23
14 2018120901 CHI 15 LA 6
14 2018121000 SEA 21 MIN 7
15 2018121300 KC 28 LAC 29
15 2018121500 NYJ 22 HOU 29
15 2018121501 DEN 16 CLE 17
15 2018121603 CHI 24 GB 17
15 2018121602 BUF 14 DET 13
15 2018121601 BAL 20 TB 12
15 2018121600 ATL 40 ARI 14
15 2018121604 CIN 30 OAK 16
15 2018121608 NYG 0 TEN 17
15 2018121607 MIN 41 MIA 17
15 2018121606 JAX 13 WAS 16
15 2018121605 IND 23 DAL 0
15 2018121609 SF 26 SEA 23
15 2018121610 PIT 17 NE 10
15 2018121611 LA 23 PHI 30
15 2018121700 CAR 9 NO 12
16 2018122200 TEN 25 WAS 16
16 2018122201 LAC 10 BAL 22
16 2018122305 CLE 26 CIN 18
16 2018122306 DAL 27 TB 20
16 2018122307 DET 9 MIN 27
16 2018122308 NE 24 BUF 12
16 2018122309 NYJ 38 GB 44
16 2018122310 PHI 32 HOU 30
16 2018122304 CAR 10 ATL 24
16 2018122300 IND 28 NYG 27
16 2018122302 MIA 7 JAX 17
16 2018122311 ARI 9 LA 31
16 2018122312 SF 9 CHI 14
16 2018122313 NO 31 PIT 28
16 2018122314 SEA 38 KC 31
16 2018122400 OAK 27 DEN 14
17 2018123001 BUF 42 MIA 17
17 2018123010 TB 32 ATL 34
17 2018123008 NYG 35 DAL 36
17 2018123007 NO 14 CAR 33
17 2018123006 NE 38 NYJ 3
17 2018123003 HOU 20 JAX 3
17 2018123002 GB 0 DET 31
17 2018123012 WAS 0 PHI 24
17 2018123013 DEN 9 LAC 23
17 2018123004 KC 35 OAK 3
17 2018123005 MIN 10 CHI 24
17 2018123000 BAL 26 CLE 24
17 2018123014 LA 48 SF 32
17 2018123009 PIT 16 CIN 13
17 2018123015 SEA 27 ARI 24
17 2018123011 TEN 17 IND 33

Those are all of the regular season games in the 2018 season. The package authors suggest that it is straightforward to grab an entire season of play-by-play data, and it is. 👏

full_season_2018 <- scrape_season_play_by_play(2018, "reg")
saveRDS(full_season_2018, file="../../../data/2018NFLSeason.rds")
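Because a full-season scrape runs for a long time and can report errors along the way, one option is to scrape game by game and record which games failed. The sketch below is not part of nflscrapR: `scrape_safely` is a hypothetical helper, and its `scraper` argument stands for whatever per-game function you use (check what your installed version of the package offers). A fake scraper is used here so the example runs offline.

```r
# Hypothetical helper: call `scraper` on each game id, keep the successes,
# and record the error message for each failure instead of stopping.
scrape_safely <- function(game_ids, scraper) {
  results <- list()
  failures <- character(0)
  for (id in game_ids) {
    key <- as.character(id)
    out <- tryCatch(scraper(id), error = function(e) e)
    if (inherits(out, "error")) {
      failures[key] <- conditionMessage(out)
    } else {
      results[[key]] <- out
    }
  }
  list(data = do.call(rbind, results), failures = failures)
}

# Fake scraper for illustration only: it fails on one game id.
fake_scraper <- function(id) {
  if (id == "2018090910") stop("connection timed out")
  data.frame(game_id = id, plays = 1L, stringsAsFactors = FALSE)
}

res <- scrape_safely(c("2018090600", "2018090910", "2018091613"), fake_scraper)
res$failures  # named character vector of the games that need a retry
```

With something like this, the failed game IDs can be retried on their own rather than repeating the whole season.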

That gets the data, though it took over an hour to acquire it all and it threw two error messages; I do not yet know whether they are consequential. My goal here is to use this package, and its ability to plot win probability charts, to summarise an entire Dallas Cowboys season. The Cowboys are the team I grew up with, along with the former Houston Oilers, and I was never really okay with the Oilers moving. 🔻

full_season_2018 <- readRDS(url("https://github.com/robertwwalker/academic-mymod/raw/master/data/2018NFLSeason.rds"))  # Downloaded it once.  It sits in rww.science/data

Let me skim the data for an idea of what it looks like. I will also load three packages that I am using now or will use later.

library(skimr)  # Summary
library(teamcolors)  # NFL Team Colors
library(gganimate)  # Animation made easy
full_season_2018 %>% head() %>% kableExtra::kable() %>% kable_styling() %>% scroll_box(width = "700px", 
    height = "400px")
play_id game_id home_team away_team posteam posteam_type defteam side_of_field yardline_100 game_date quarter_seconds_remaining half_seconds_remaining game_seconds_remaining game_half quarter_end drive sp qtr down goal_to_go time yrdln ydstogo ydsnet desc play_type yards_gained shotgun no_huddle qb_dropback qb_kneel qb_spike qb_scramble pass_length pass_location air_yards yards_after_catch run_location run_gap field_goal_result kick_distance extra_point_result two_point_conv_result home_timeouts_remaining away_timeouts_remaining timeout timeout_team td_team posteam_timeouts_remaining defteam_timeouts_remaining total_home_score total_away_score posteam_score defteam_score score_differential posteam_score_post defteam_score_post score_differential_post no_score_prob opp_fg_prob opp_safety_prob opp_td_prob fg_prob safety_prob td_prob extra_point_prob two_point_conversion_prob ep epa total_home_epa total_away_epa total_home_rush_epa total_away_rush_epa total_home_pass_epa total_away_pass_epa air_epa yac_epa comp_air_epa comp_yac_epa total_home_comp_air_epa total_away_comp_air_epa total_home_comp_yac_epa total_away_comp_yac_epa total_home_raw_air_epa total_away_raw_air_epa total_home_raw_yac_epa total_away_raw_yac_epa wp def_wp home_wp away_wp wpa home_wp_post away_wp_post total_home_rush_wpa total_away_rush_wpa total_home_pass_wpa total_away_pass_wpa air_wpa yac_wpa comp_air_wpa comp_yac_wpa total_home_comp_air_wpa total_away_comp_air_wpa total_home_comp_yac_wpa total_away_comp_yac_wpa total_home_raw_air_wpa total_away_raw_air_wpa total_home_raw_yac_wpa total_away_raw_yac_wpa punt_blocked first_down_rush first_down_pass first_down_penalty third_down_converted third_down_failed fourth_down_converted fourth_down_failed incomplete_pass touchback interception punt_inside_twenty punt_in_endzone punt_out_of_bounds punt_downed punt_fair_catch kickoff_inside_twenty kickoff_in_endzone kickoff_out_of_bounds kickoff_downed kickoff_fair_catch fumble_forced fumble_not_forced 
fumble_out_of_bounds solo_tackle safety penalty tackled_for_loss fumble_lost own_kickoff_recovery own_kickoff_recovery_td qb_hit rush_attempt pass_attempt sack touchdown pass_touchdown rush_touchdown return_touchdown extra_point_attempt two_point_attempt field_goal_attempt kickoff_attempt punt_attempt fumble complete_pass assist_tackle lateral_reception lateral_rush lateral_return lateral_recovery passer_player_id passer_player_name receiver_player_id receiver_player_name rusher_player_id rusher_player_name lateral_receiver_player_id lateral_receiver_player_name lateral_rusher_player_id lateral_rusher_player_name lateral_sack_player_id lateral_sack_player_name interception_player_id interception_player_name lateral_interception_player_id lateral_interception_player_name punt_returner_player_id punt_returner_player_name lateral_punt_returner_player_id lateral_punt_returner_player_name kickoff_returner_player_name kickoff_returner_player_id lateral_kickoff_returner_player_id lateral_kickoff_returner_player_name punter_player_id punter_player_name kicker_player_name kicker_player_id own_kickoff_recovery_player_id own_kickoff_recovery_player_name blocked_player_id blocked_player_name tackle_for_loss_1_player_id tackle_for_loss_1_player_name tackle_for_loss_2_player_id tackle_for_loss_2_player_name qb_hit_1_player_id qb_hit_1_player_name qb_hit_2_player_id qb_hit_2_player_name forced_fumble_player_1_team forced_fumble_player_1_player_id forced_fumble_player_1_player_name forced_fumble_player_2_team forced_fumble_player_2_player_id forced_fumble_player_2_player_name solo_tackle_1_team solo_tackle_2_team solo_tackle_1_player_id solo_tackle_2_player_id solo_tackle_1_player_name solo_tackle_2_player_name assist_tackle_1_player_id assist_tackle_1_player_name assist_tackle_1_team assist_tackle_2_player_id assist_tackle_2_player_name assist_tackle_2_team assist_tackle_3_player_id assist_tackle_3_player_name assist_tackle_3_team assist_tackle_4_player_id 
assist_tackle_4_player_name assist_tackle_4_team pass_defense_1_player_id pass_defense_1_player_name pass_defense_2_player_id pass_defense_2_player_name fumbled_1_team fumbled_1_player_id fumbled_1_player_name fumbled_2_player_id fumbled_2_player_name fumbled_2_team fumble_recovery_1_team fumble_recovery_1_yards fumble_recovery_1_player_id fumble_recovery_1_player_name fumble_recovery_2_team fumble_recovery_2_yards fumble_recovery_2_player_id fumble_recovery_2_player_name return_team return_yards penalty_team penalty_player_id penalty_player_name penalty_yards replay_or_challenge replay_or_challenge_result penalty_type defensive_two_point_attempt defensive_two_point_conv defensive_extra_point_attempt defensive_extra_point_conv
37 2018090600 PHI ATL ATL away PHI PHI 35 2018-09-06 900 1800 3600 Half1 0 1 0 1 NA 0 15:00 PHI 35 0 73 J.Elliott kicks 65 yards from PHI 35 to end zone, Touchback. kickoff 0 0 0 0 0 0 0 NA NA NA NA NA NA NA NA NA NA 3 3 0 NA NA 3 3 0 0 NA NA NA 0 0 0 0.0013737 0.1626315 0.0044407 0.2541786 0.2330812 0.0036557 0.3406386 0 0 0.8149985 0.0000000 0.0000000 0.0000000 0.000000 0.000000 0.0000000 0.0000000 NA NA 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 NA NA NA NA NA NA NA 0.0000000 0.0000000 0.0000000 0.0000000 NA NA 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA J.Elliott 00-0033787 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA NA NA 0 NA NA 0 0 0 0
52 2018090600 PHI ATL ATL away PHI ATL 75 2018-09-06 900 1800 3600 Half1 0 1 0 1 1 0 15:00 ATL 25 10 73 (15:00) PENALTY on ATL-L.Paulsen, False Start, 5 yards, enforced at ATL 25 - No Play. no_play 0 0 0 0 0 0 0 NA NA NA NA NA NA NA NA NA NA 3 3 0 NA NA 3 3 0 0 0 0 0 0 0 0 0.0013737 0.1626315 0.0044407 0.2541786 0.2330812 0.0036557 0.3406386 0 0 0.8149985 -0.7217597 0.7217597 -0.7217597 0.000000 0.000000 0.0000000 0.0000000 NA NA 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5000068 0.4999932 0.4999932 0.5000068 -0.0210159 0.5210091 0.4789909 0.0000000 0.0000000 0.0000000 0.0000000 NA NA 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 ATL 00-0027215 L.Paulsen 5 0 NA False Start 0 0 0 0
75 2018090600 PHI ATL ATL away PHI ATL 80 2018-09-06 900 1800 3600 Half1 0 1 0 1 1 0 15:00 ATL 20 15 73 (15:00) M.Ryan pass short right to J.Jones pushed ob at ATL 30 for 10 yards (M.Jenkins). pass 10 0 0 1 0 0 0 short right 8 2 NA NA NA NA NA NA 3 3 0 NA NA 3 3 0 0 0 0 0 0 0 0 0.0014756 0.1856642 0.0069365 0.2902967 0.2234354 0.0038931 0.2882984 0 0 0.0932388 0.8183435 -0.0965838 0.0965838 0.000000 0.000000 -0.8183435 0.8183435 0.0422408 0.7761027 0.0422408 0.7761027 -0.0422408 0.0422408 -0.7761027 0.7761027 -0.0422408 0.0422408 -0.7761027 0.7761027 0.4789909 0.5210091 0.5210091 0.4789909 0.0261091 0.4949000 0.5051000 0.0000000 0.0000000 -0.0261091 0.0261091 0.0015623 0.0245468 0.0015623 0.0245468 -0.0015623 0.0015623 -0.0245468 0.0245468 -0.0015623 0.0015623 -0.0245468 0.0245468 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 00-0026143 M.Ryan 00-0027944 J.Jones NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA PHI NA 00-0026990 NA M.Jenkins NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA NA NA 0 NA NA 0 0 0 0
104 2018090600 PHI ATL ATL away PHI ATL 70 2018-09-06 862 1762 3562 Half1 0 1 0 1 2 0 14:22 ATL 30 5 73 (14:22) J.Jones left end pushed ob at ATL 41 for 11 yards (D.Barnett). run 11 0 0 0 0 0 0 NA NA NA NA left end NA NA NA NA 3 3 0 NA NA 3 3 0 0 0 0 0 0 0 0 0.0015177 0.1613336 0.0034158 0.2489321 0.2321779 0.0039908 0.3486320 0 0 0.9115823 1.3625068 -1.4590906 1.4590906 -1.362507 1.362507 -0.8183435 0.8183435 NA NA 0.0000000 0.0000000 -0.0422408 0.0422408 -0.7761027 0.7761027 -0.0422408 0.0422408 -0.7761027 0.7761027 0.5051000 0.4949000 0.4949000 0.5051000 0.0432767 0.4516233 0.5483767 -0.0432767 0.0432767 -0.0261091 0.0261091 NA NA 0.0000000 0.0000000 -0.0015623 0.0015623 -0.0245468 0.0245468 -0.0015623 0.0015623 -0.0245468 0.0245468 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NA NA NA NA 00-0027944 J.Jones NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA PHI NA 00-0033876 NA D.Barnett NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA NA NA 0 NA NA 0 0 0 0
125 2018090600 PHI ATL ATL away PHI ATL 59 2018-09-06 826 1726 3526 Half1 0 1 0 1 1 0 13:46 ATL 41 10 73 (13:46) D.Freeman right end to PHI 39 for 20 yards (M.Jenkins). run 20 0 0 0 0 0 0 NA NA NA NA right end NA NA NA NA 3 3 0 NA NA 3 3 0 0 0 0 0 0 0 0 0.0012985 0.1109844 0.0011514 0.1725676 0.2906260 0.0036329 0.4197392 0 0 2.2740891 1.3437998 -2.8028905 2.8028905 -2.706307 2.706307 -0.8183435 0.8183435 NA NA 0.0000000 0.0000000 -0.0422408 0.0422408 -0.7761027 0.7761027 -0.0422408 0.0422408 -0.7761027 0.7761027 0.5483767 0.4516233 0.4516233 0.5483767 0.0456144 0.4060090 0.5939910 -0.0888910 0.0888910 -0.0261091 0.0261091 NA NA 0.0000000 0.0000000 -0.0015623 0.0015623 -0.0245468 0.0245468 -0.0015623 0.0015623 -0.0245468 0.0245468 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NA NA NA NA 00-0031285 D.Freeman NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA PHI NA 00-0026990 NA M.Jenkins NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA NA NA 0 NA NA 0 0 0 0
146 2018090600 PHI ATL ATL away PHI PHI 39 2018-09-06 790 1690 3490 Half1 0 1 0 1 1 0 13:10 PHI 39 10 73 (13:10) M.Ryan pass incomplete short right to C.Ridley (J.Mills, J.Hicks). pass 0 0 0 1 0 0 0 short right 4 NA NA NA NA NA NA NA 3 3 0 NA NA 3 3 0 0 0 0 0 0 0 0 0.0008716 0.0623621 0.0001931 0.0965497 0.3456556 0.0032664 0.4911014 0 0 3.6178889 -0.5220527 -2.2808378 2.2808378 -2.706307 2.706307 -0.2962908 0.2962908 -0.2967783 -0.2252744 0.0000000 0.0000000 -0.0422408 0.0422408 -0.7761027 0.7761027 0.2545376 -0.2545376 -0.5508284 0.5508284 0.5939910 0.4060090 0.4060090 0.5939910 -0.0171052 0.4231141 0.5768859 -0.0888910 0.0888910 -0.0090039 0.0090039 -0.0096399 -0.0074653 0.0000000 0.0000000 -0.0015623 0.0015623 -0.0245468 0.0245468 0.0080776 -0.0080776 -0.0170816 0.0170816 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00-0026143 M.Ryan 00-0034837 C.Ridley NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 00-0032803 J.Mills 00-0032129 J.Hicks NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0 NA NA NA NA 0 NA NA 0 0 0 0
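With this many columns, a summary of a handful of them is easier to read than the raw `head()`. Here is a minimal sketch on a toy stand-in for `full_season_2018`; the name `pbp_toy` is made up, and its values are copied from the six plays printed above.

```r
library(dplyr)

# Toy stand-in for full_season_2018, using values from the first six plays.
pbp_toy <- tibble(
  game_id = "2018090600",
  posteam = "ATL",
  play_type = c("kickoff", "no_play", "pass", "run", "run", "pass"),
  yards_gained = c(0, 0, 10, 11, 20, 0),
  home_wp = c(NA, 0.4999932, 0.5210091, 0.4949000, 0.4516233, 0.4060090)
)

# Count plays by type and average the yards gained for each type.
play_summary <- pbp_toy %>%
  group_by(play_type) %>%
  summarise(plays = n(), mean_yards = mean(yards_gained)) %>%
  arrange(desc(plays), play_type)

play_summary
```

The same pipeline should work on the full data, since it only relies on the `play_type` and `yards_gained` columns shown in the header above.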

Fixing up the color scheme

nfl_teamcolors <- teamcolors %>% filter(league == "nfl")
dal_color <- nfl_teamcolors %>% filter(name == "Dallas Cowboys") %>% pull(primary)
car_color <- nfl_teamcolors %>% filter(name == "Carolina Panthers") %>% pull(primary)

dal_color is now a single value holding the Cowboys' primary color.
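The filter-then-pull pattern repeats for every team, so it can be wrapped in a small function. This is only a sketch: `team_primary` is a hypothetical helper, and `colors_toy` is a toy stand-in for the NFL rows of `teamcolors`, with hex codes copied from the `Home_primary` values printed further down in this post.

```r
library(dplyr)

# Toy stand-in for the NFL rows of teamcolors (assumed columns: name, primary).
colors_toy <- tibble(
  name = c("Dallas Cowboys", "Carolina Panthers", "Seattle Seahawks"),
  primary = c("#002244", "#0085ca", "#002244")
)

# Hypothetical helper: return one team's primary color, failing loudly
# if the name does not match exactly one row.
team_primary <- function(colors, team_name) {
  hit <- colors %>% filter(name == team_name)
  stopifnot(nrow(hit) == 1)
  hit %>% pull(primary)
}

dal_toy_color <- team_primary(colors_toy, "Dallas Cowboys")
car_toy_color <- team_primary(colors_toy, "Carolina Panthers")
```

The `stopifnot()` guard is there because a misspelled team name would otherwise return an empty vector and only surface later as a confusing plotting error.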

The Dallas Cowboys Games

Now let me get the Dallas games. The easiest way I can find is to filter for games in which Dallas, abbreviated DAL, is either the home team or the away team.

dal_season <- full_season_2018 %>% filter(home_team == "DAL" | away_team == "DAL")
CAR_DAL <- full_season_2018 %>% filter(home_team == "CAR" & away_team == "DAL")

Cool. Now let me plot the first game, the one that the CAR_DAL filter above picks out. Then I will facet out the season's games, but it will take some work to get the labels and other details to generate.

# DFS <- dal_season %>% left_join(all_2018_games)
# Keep plays that have win probabilities and key each row by time and game.
cardal_wp <- CAR_DAL %>% filter(!is.na(home_wp), !is.na(away_wp)) %>% unite(GIDGSR, 
    game_seconds_remaining, game_id, sep = ":")
# Reshape to one row per team per play; the wpa column holds the win probability.
cardal_wp <- cardal_wp %>% dplyr::select(GIDGSR, home_wp, away_wp) %>% gather(team, 
    wpa, -GIDGSR) %>% separate(., GIDGSR, c("GSR", "game_id"), sep = ":") %>% mutate(game_seconds_remaining = as.integer(GSR))
# Name the colors by series so each line gets its own team's color: Carolina is
# the home team (home_wp) and Dallas is the away team (away_wp).
cardal_wp %>% ggplot(aes(x = game_seconds_remaining, y = wpa, color = team)) + geom_line(size = 2) + 
    geom_hline(yintercept = 0.5, color = "gray", linetype = "dashed") + scale_color_manual(values = c(home_wp = car_color, 
    away_wp = dal_color), guide = "none") + scale_x_reverse(breaks = seq(0, 
    3600, 300)) + annotate("text", x = 3000, y = 0.75, label = "CAR", color = car_color, 
    size = 8) + annotate("text", x = 3000, y = 0.25, label = "DAL", color = dal_color, 
    size = 8) + geom_vline(xintercept = 900, linetype = "dashed", color = "black") + 
    geom_vline(xintercept = 1800, linetype = "dashed", color = "black") + geom_vline(xintercept = 2700, 
    linetype = "dashed", color = "black") + geom_vline(xintercept = 0, linetype = "dashed", 
    color = "black") + labs(x = "Time Remaining (seconds)", y = "Win Probability", 
    title = "Week 1 Win Probability Chart", subtitle = "Carolina Panthers vs. Dallas Cowboys", 
    caption = "Data from nflscrapR") + theme_bw()
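The reshaping step before the plot can also be written without the unite/separate round trip. The sketch below assumes tidyr 1.0 or later, which provides `pivot_longer()`. It runs on a toy table, `wp_toy`, built from four plays of the PHI v. ATL opener printed earlier rather than on the Carolina game, and it names the value column `win_prob` instead of `wpa`.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for one game's play-by-play, values copied from the output above.
wp_toy <- tibble(
  game_id = "2018090600",
  game_seconds_remaining = c(3600, 3562, 3526, 3490),
  home_wp = c(0.5210091, 0.4949000, 0.4516233, 0.4060090),
  away_wp = c(0.4789909, 0.5051000, 0.5483767, 0.5939910)
)

# One row per team per play; game_id and the clock stay as ordinary columns.
wp_long <- wp_toy %>%
  filter(!is.na(home_wp), !is.na(away_wp)) %>%
  pivot_longer(c(home_wp, away_wp), names_to = "team", values_to = "win_prob")

# Sanity check: the two probabilities for each play should sum to one.
wp_check <- wp_long %>%
  group_by(game_seconds_remaining) %>%
  summarise(total = sum(win_prob))
```

Because `game_seconds_remaining` is never pasted into a string, it keeps its numeric type and needs no `as.integer()` conversion afterwards.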

Now I want to build a plot of all of the Dallas Cowboys' games for the season. First, I am going to make a table that contains the names and colors for all of the game IDs. Even though I only need those for Dallas, a Shiny app built on this could let the user select a team, so keeping every game makes it extensible. At the end, I will separate off the Dallas rows.

# One row per game with the home and away abbreviations.
ntable <- all_2018_games %>% group_by(game_id) %>% summarise(home_team = first(home_team), 
    away_team = first(away_team)) %>% ungroup()
# Home side: join team details by abbreviation, then prefix every column with Home_.
ntableH <- ntable
ntableH <- ntableH %>% left_join(nflteams, by = c(home_team = "abbr"))
names(ntableH) <- paste0("Home_", names(ntableH))
# Restore the original names of the three key columns.
ntableH <- ntableH %>% rename(., game_id = Home_game_id, home_team = Home_home_team, 
    away_team = Home_away_team)
# Away side: the same steps with the Away_ prefix.
ntableA <- ntable
ntableA <- ntableA %>% left_join(nflteams, by = c(away_team = "abbr"))
names(ntableA) <- paste0("Away_", names(ntableA))
ntableA <- ntableA %>% rename(., game_id = Away_game_id, home_team = Away_home_team, 
    away_team = Away_away_team)
# Combine both sides on the shared key columns, then keep the Dallas games.
My.NFL.Table <- ntableH %>% inner_join(ntableA)
Dallas.Table <- My.NFL.Table %>% filter(home_team == "DAL" | away_team == "DAL")
Dallas.Table %>% head()
## # A tibble: 6 x 15
##   game_id home_team away_team Home_team Home_primary Home_secondary
##   <chr>   <chr>     <chr>     <chr>     <chr>        <chr>         
## 1 201809… CAR       DAL       Carolina… #0085ca      #000000       
## 2 201809… DAL       NYG       Dallas C… #002244      #b0b7bc       
## 3 201809… SEA       DAL       Seattle … #002244      #69be28       
## 4 201809… DAL       DET       Dallas C… #002244      #b0b7bc       
## 5 201810… HOU       DAL       Houston … #03202f      #a71930       
## 6 201810… DAL       JAX       Dallas C… #002244      #b0b7bc       
## # … with 9 more variables: Home_tertiary <chr>, Home_quaternary <chr>,
## #   Home_division <chr>, Away_team <chr>, Away_primary <chr>,
## #   Away_secondary <chr>, Away_tertiary <chr>, Away_quaternary <chr>,
## #   Away_division <chr>
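The prefix-everything-then-rename-back step can be avoided by prefixing only the lookup columns before the join. This is a sketch under assumptions: it needs dplyr 1.0 or later for `rename_with()`, `add_side` is a hypothetical helper, and `games_toy` and `teams_toy` are toy stand-ins whose values are taken from the table printed above.

```r
library(dplyr)

games_toy <- tibble(
  game_id = c("2018090910", "2018092312"),
  home_team = c("CAR", "SEA"),
  away_team = c("DAL", "DAL")
)

# Toy lookup keyed by abbreviation (assumed columns: abbr, team, primary).
teams_toy <- tibble(
  abbr = c("CAR", "DAL", "SEA"),
  team = c("Carolina Panthers", "Dallas Cowboys", "Seattle Seahawks"),
  primary = c("#0085ca", "#002244", "#002244")
)

# Hypothetical helper: prefix the lookup columns (all but the key), then
# join them onto the games by the chosen side's abbreviation column.
add_side <- function(games, teams, side, prefix) {
  side_teams <- teams %>% rename_with(~ paste0(prefix, .x), -abbr)
  games %>% left_join(side_teams, by = setNames("abbr", side))
}

lookup <- games_toy %>%
  add_side(teams_toy, "home_team", "Home_") %>%
  add_side(teams_toy, "away_team", "Away_")
```

The result has the same `Home_` and `Away_` column naming as the table above, without the intermediate `Home_game_id` style names.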

One nice by-product here is the season schedule.

Dallas.Table %>% select(game_id, Home_team, Away_team)
## # A tibble: 16 x 3
##    game_id    Home_team           Away_team           
##    <chr>      <chr>               <chr>               
##  1 2018090910 Carolina Panthers   Dallas Cowboys      
##  2 2018091613 Dallas Cowboys      New York Giants     
##  3 2018092312 Seattle Seahawks    Dallas Cowboys      
##  4 2018093002 Dallas Cowboys      Detroit Lions       
##  5 2018100712 Houston Texans      Dallas Cowboys      
##  6 2018101410 Dallas Cowboys      Jacksonville Jaguars
##  7 2018102110 Washington Redskins Dallas Cowboys      
##  8 2018110500 Dallas Cowboys      Tennessee Titans    
##  9 2018111111 Philadelphia Eagles Dallas Cowboys      
## 10 2018111800 Atlanta Falcons     Dallas Cowboys      
## 11 2018112201 Dallas Cowboys      Washington Redskins 
## 12 2018112900 Dallas Cowboys      New Orleans Saints  
## 13 2018120912 Dallas Cowboys      Philadelphia Eagles 
## 14 2018121605 Indianapolis Colts  Dallas Cowboys      
## 15 2018122306 Dallas Cowboys      Tampa Bay Buccaneers
## 16 2018123008 New York Giants     Dallas Cowboys

The first thing to notice is that this color scheme is going to cause trouble when Dallas plays a team that apparently has the same primary color; Seattle, for example, shows the same #002244 in the table above. For now, I think that I will just make the opponent some constant color.
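One way to see how often the clash happens is to compare the two primary colors game by game. The sketch below uses a toy table, `sched_toy`, with three road games whose home colors are copied from the output above; `fallback_color` is an arbitrary choice, and swapping the color only on a clash is a variation on the constant-color idea, not what the plotting code that follows does.

```r
library(dplyr)

# Toy stand-in for three Dallas road games from Dallas.Table.
sched_toy <- tibble(
  game_id = c("2018090910", "2018092312", "2018100712"),
  home_team = c("CAR", "SEA", "HOU"),
  Home_primary = c("#0085ca", "#002244", "#03202f"),
  Away_primary = "#002244"  # Dallas is the away team in all three
)

fallback_color <- "gray40"  # assumption: any constant color would do

# Flag games where both teams share a primary color and give the
# opponent the fallback color in those games only.
sched_toy <- sched_toy %>%
  mutate(
    clash = Home_primary == Away_primary,
    opponent_color = if_else(clash, fallback_color, Home_primary)
  )
```

In this toy table only the Seattle game is flagged, so the other opponents keep their own colors.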

dal_wp <- dal_season %>% 
  filter(!is.na(home_wp),
         !is.na(away_wp)) %>% unite(GIDGSR, game_seconds_remaining, game_id, sep=":")
dal_wp <- dal_wp %>% dplyr::select(GIDGSR,
                home_wp,
                away_wp)  %>%
  gather(team, wpa, -GIDGSR) %>% separate(., GIDGSR, c("GSR", "game_id"), sep=":") %>% 
  mutate(game_seconds_remaining = as.integer(GSR))
dal_plt <- dal_wp %>% left_join(My.NFL.Table)
# dal_plt$colors1 <- NA
dal_plt[dal_plt$team=="home_wp","colors1"] <- dal_plt[dal_plt$team=="home_wp","Home_primary"]
dal_plt[dal_plt$team=="away_wp","colors1"] <- dal_plt[dal_plt$team=="away_wp","Away_primary"]
dal_plt[dal_plt$team=="home_wp","team"] <- dal_plt[dal_plt$team=="home_wp","home_team"]
dal_plt[dal_plt$team=="away_wp","team"] <- dal_plt[dal_plt$team=="away_wp","away_team"]
dal_plt$dateG <- substring(dal_plt$game_id, 1, 8)
dal_plt$titleS <- with(dal_plt, paste0(dateG,": ",Home_team," v. ",Away_team))
p <- dal_plt %>% ggplot() + aes(x = game_seconds_remaining, y = wpa, color = team) +
  geom_line(size = 2) +
  geom_hline(yintercept = 0.5, color = "gray", linetype = "dashed") +
  scale_color_viridis_d(guide = "none") +
  scale_x_reverse(breaks = seq(0, 3600, 300)) + 
  annotate("text", x = 3000, y = .75, label = "Home", size = 8) + 
  annotate("text", x = 3000, y = .25, label = "Away", size = 8) +
  geom_vline(xintercept = 900, linetype = "dashed", color="black") + 
  geom_vline(xintercept = 1800, linetype = "dashed", color="black") + 
  geom_vline(xintercept = 2700, linetype = "dashed", color="black") + 
  geom_vline(xintercept = 0, linetype = "dashed", color="black") + 
  labs(
    x = "Time Remaining (seconds)",
    y = "Win Probability",
    title = "{closest_state}",
#    subtitle = "Carolina Panthers vs. Dallas Cowboys",
    caption = "Data from nflscrapR"
  ) + theme_bw() + transition_states(titleS, transition_length=6, state_length = 10)
animate(p, nframes=400)
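The animation titles above start with the raw eight-digit date taken from `game_id`. A small sketch of turning that into a proper date, using only base R; the names `game_ids`, `game_dates`, and `titles` are made up for the example, and the two games are the first and last rows of the schedule printed earlier.

```r
# First and last Dallas games from the schedule above.
game_ids <- c("2018090910", "2018123008")
home <- c("Carolina Panthers", "New York Giants")
away <- c("Dallas Cowboys", "Dallas Cowboys")

# The first eight characters of a game id are the date as YYYYMMDD.
game_dates <- as.Date(substring(game_ids, 1, 8), format = "%Y%m%d")

# ISO dates still sort chronologically as text, which matters if the
# animation states are ordered alphabetically by title.
titles <- paste0(format(game_dates), ": ", home, " v. ", away)
titles
```

Keeping the year-month-day order in the title means the states should still play in schedule order, assuming they are sorted as character strings.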

I will stop there for this blog post. I did a bit more with this and also built a Shiny app, linked there, that does the same for all of the teams.

Robert W. Walker
Associate Professor of Quantitative Methods

My research interests include causal inference, statistical computation and data visualization.
