Statistics of PGA Tournaments During the 2022 Season

Correlation
Linear regression
Measures of Putting, Driving, and Success for individual golfers in 19 professional tournaments in 2022
Author

Alysssa Bigness

Published

May 19, 2023

Motivation

A common expression among golfers is “Drive for show, putt for dough.” This implies that the long initial tee shots (drives) on each hole are impressive, but the real key to success is the final strokes rolling the ball along the green into the hole (putts). Do data support this adage? The dataset for this activity was obtained from the PGA statistics website. Cases include all golfers who made the cut in each of 19 PGA tournaments in 2022. The dataset includes variables for driving ability, putting ability, and measuring success in the tournament. The “driving” variables include average driving distance (avgDriveDist), driving accuracy percentage (drivePct), and strokes gained off the tee (driveSG). The “putting” variables are average putts per round (avgPuttsPerRound), one putt percentage (onePuttPct), and strokes gained putting (puttsSG). The variables to measure success are scoring average (avgScore), official money won (Money), and Fedex Cup points (Points).

Data

Each row of data gives the measures for one golfer in one tournament. The dataset covers 19 PGA tournaments from the 2022 season with 1387 cases in all. Each tournament consists of four rounds of golf. Some golfers are eliminated after the first two (or sometimes three) rounds. Only players who competed in all four rounds (i.e. those that made the “cut”) are included in this dataset.

PGA2022.csv
Variable Description
playerName Name of the player
country The country where the player is from
avgDriveDist Average driving distance (in yards)
drivePct Percentage of times a tee shot comes to rest in the fairway
driveSG Strokes gained off the tee measures player performance off the tee on all par 4s and par 5s of how much better or worse a player’s drive is than the PGA Tour average
avgPuttsPerRound Average number of total putts per round
onePuttPct Percentage of times it took one putt to get the ball into the hole
puttsSG Strokes gained putting measures how many strokes a player gains or loses on the greens compared to the PGA Tour average
avgScore The scoring average is the total strokes divided by the total rounds
Money The official money is the prize money award to the Professional members
Points FedexCup Regular Season Points are awarded points by finished position for performance in each tournament
Tournament The tournament where the PGA Tour is taking place

Questions

  1. Are driving or putting statistics generally more effective for predicting success in a PGA tournament?

  2. Which measure of success (avgScore, Money, or Points) appears to be easiest to predict with either driving or putting information?

  3. For each measure of success, generate a scatterplot with one the putting or driving variables. Which of these plots suggest we might want to use a transformation of one of the variables to improve linearity?

  4. Can you find a transformation that improves the linearity of the relationship for one of the problematic cases in question #3?

References

The data were obtained from the PGA statistics website.