Justin Verlander Pitches
Motivation
After nearly two full seasons due to injury, at the age of 39 Justin Verlander returned for the 2022 season to win the American League Cy Young award and World Series with the Houston Astros. Leading the league in a variety of statistics, Verlander dominated in his starts throughout the season. Pitch selection has played a key role into Verlander’s recent success with the Astros. Verlander throws four types of pitches (using MLB’s abbreviation): fastball (FF), slider (SL), curveball (CB), and changeup (CH). However, pitches are thrown in the context of an at-bat where the ball-strike count starts 0-0, and progresses until either the batter strikes out (reaches three strikes), is walked (reaches four balls), or is either hit-by-pitch or hits the ball in-play. As the count varies, pitchers often decide to favor certain pitches over others, e.g, with three balls (i.e., 3-X counts) the pitcher may favor throwing more accurate fastballs relative to out-of-the-zone offspeed pitches that are favored with two strikes (i.e., X-2 counts).
Data
Below are two datasets containing information about pitches thrown by Justin Verlander during the 2019 and 2022 seasons (with separate datasets for each season). Each dataset includes the pitch type, ball-strike count, release speed, and a description of what happened on the pitch.
Data courtesy of baseballsavant.mlb.com and accessed using the baseballr package.
Variable | Description |
---|---|
pitch_type | The type of pitch Verlander threw, either: fastball (FF), slider (SL), curveball (CU), or changeup (CH). |
count | The ball-strike count when Verlander threw the pitch. |
release_speed | Pitch speed (MPH) at the time of release. |
description | A brief text description of the result of the pitch, such as a ball, foul, called_strike, swinging_strike, etc. |
Questions
Using the appropriate statistical test, assess whether or not Justin Verlander’s
pitch_type
is independent of thecount
. Perform this analysis for both the 2019 and 2022 datasets. Are the conclusions similar or different?Which combinations of
pitch_type
andcount
appear more than expected under the assumption of independence? Which combinations appear fewer than expected? Perform this analysis for both the 2019 and 2022 datasets. Are the conclusions similar or different?
References
Petti B, Gilani S (2022). baseballr: Acquiring and Analyzing Baseball Data. R package version 1.3.0, https://CRAN.R-project.org/package=baseballr.