Women’s National Basketball Association Shots
Motivation
The Women’s National Basketball Association (WNBA) is the top professional women’s basketball league in the world. The league records every shot players take, along with contextual information such as the shot's location, a description of the shot type, and the outcome. With this dataset, you can predict the success of each shot attempt in order to compute the expected value of shot types and compare team decision making.
Data
This dataset contains information about 41,497 shots during the 2021-2022 WNBA season.
The data was collected using the wehoop package in R.
Variable | Description |
---|---|
game_id | Unique integer ID for each WNBA game |
game_play_number | Integer indicating the recorded play number for the shot attempt, where 1 indicates the first play of the game |
desc | String detailed description of shot attempt |
shot_type | String description of the shot type (e.g., dunk, layup, jump shot, etc.) |
made_shot | Boolean denoting if the shot was made (TRUE) or not (FALSE) |
shot_value | Numeric value of the shot outcome (0 for shots that were not made, and a positive value for made shots) |
coordinate_x | Horizontal location in feet of shot attempt where the hoop would be located at 25 feet |
coordinate_y | Vertical location in feet of shot attempt with respect to the target hoop (the hoop should be a little in front of 0 but the coordinate system is not exact) |
shooting_team | String name of the team taking the shot |
home_name | String name of the home team |
away_name | String name of the away team |
home_score | Integer value of the home team score after the shot |
away_score | Integer value of the away team score after the shot |
qtr | Integer denoting the quarter/period in the game |
quarter_seconds_remaining | Integer number of seconds remaining in the quarter/period |
game_seconds_remaining | Integer number of seconds remaining in the game |
Questions
Build a classification model to predict the shot outcome based on the spatial x,y coordinates of the shot.
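One possible starting point for the classification question is a logistic regression on the distance from the shot to the hoop. The sketch below is only an illustration under several assumptions: the hoop is placed at (25, 0) based on the variable descriptions above (the y location is only approximate), distance is a feature choice of this example rather than something the dataset prescribes, the function names are hypothetical, and the data are synthetic stand-ins for `coordinate_x`, `coordinate_y`, and `made_shot` (with TRUE/FALSE mapped to 1/0).

```python
import math
import random

# Assumption: hoop at x = 25 ft and y = 0 ft (the y location is only approximate).
HOOP_X, HOOP_Y = 25.0, 0.0

def distance_to_hoop(x, y):
    """Euclidean distance in feet from a shot location to the assumed hoop."""
    return math.hypot(x - HOOP_X, y - HOOP_Y)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(shots, made, lr=0.5, steps=1000):
    """Fit P(made) = sigmoid(b0 + b1 * distance / 10) by batch gradient descent."""
    feats = [distance_to_hoop(x, y) / 10.0 for x, y in shots]
    n = len(feats)
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for f, m in zip(feats, made):
            err = sigmoid(b0 + b1 * f) - m  # gradient of the log loss
            g0 += err
            g1 += err * f
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def predict(params, x, y):
    """Predicted probability that a shot from (x, y) is made."""
    b0, b1 = params
    return sigmoid(b0 + b1 * distance_to_hoop(x, y) / 10.0)

# Synthetic stand-in data (NOT the real WNBA shots): make probability
# decreases with distance from the assumed hoop location.
random.seed(0)
shots = [(random.uniform(0, 50), random.uniform(0, 30)) for _ in range(1000)]
made = [
    int(random.random() < sigmoid(1.0 - 0.12 * distance_to_hoop(x, y)))
    for x, y in shots
]

params = fit_logistic(shots, made)
print("intercept, slope:", params)
print("P(made) near hoop:", predict(params, 25, 2))
print("P(made) far from hoop:", predict(params, 25, 25))
```

With the real data, the synthetic `shots` and `made` lists would be replaced by the recorded coordinates and outcomes. Using the raw x and y coordinates as separate features, or a more flexible model, may fit better than distance alone.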
Create a visualization displaying the joint frequency of shot locations. Do there appear to be any clear modes of frequently taken shots? Create a conditional version of this display by shot outcome. Does the distribution shape vary by shot outcome? (You can also perform a similar analysis by team and shot type.)
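Before drawing a heatmap, it can help to tabulate the joint frequency directly by binning shot locations into square cells. The following is a minimal sketch, assuming 5-foot bins and a handful of made-up rows in place of the real `coordinate_x`, `coordinate_y`, and `made_shot` columns; the function names are hypothetical and chosen only for this example.

```python
from collections import Counter

def cell(x, y, bin_ft=5):
    """Map a location in feet to the index of its bin_ft x bin_ft cell."""
    return (int(x // bin_ft), int(y // bin_ft))

def joint_counts(rows, bin_ft=5):
    """Count shots per cell, ignoring the outcome."""
    return Counter(cell(x, y, bin_ft) for x, y, _ in rows)

def counts_by_outcome(rows, bin_ft=5):
    """Count shots per cell separately for made and missed shots."""
    by_outcome = {True: Counter(), False: Counter()}
    for x, y, made in rows:
        by_outcome[bool(made)][cell(x, y, bin_ft)] += 1
    return by_outcome

# Made-up rows of (coordinate_x, coordinate_y, made_shot), not real data.
rows = [
    (25.0, 1.0, True), (24.0, 2.0, True), (26.0, 3.0, False),
    (3.0, 1.0, False), (47.0, 2.0, True), (25.0, 24.0, False),
]

joint = joint_counts(rows)
by_outcome = counts_by_outcome(rows)

print("most frequent cell:", joint.most_common(1))
print("made:", dict(by_outcome[True]))
print("missed:", dict(by_outcome[False]))
```

The resulting counts can be passed to any plotting tool as a grid. Comparing the made and missed tables cell by cell is one way to check whether the distribution shape differs by outcome.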
References
Gilani S, Hutchinson G (2022). wehoop: Access Women’s Basketball Play by Play Data. R package version 1.5.0, https://CRAN.R-project.org/package=wehoop.