General Clinton Canoe Regatta
Motivation
Downriver paddling is a difficult and competitive sport. The General Clinton Canoe Regatta (nicknamed “the 70-miler”) is a prominent race in the sport that has been running since the 1960s. The race is a 63-mile course down the Susquehanna River in New York, from Cooperstown to Bainbridge. Paddlers compete to post the fastest time in their class, with the fastest boats finishing about 7 to 8 hours after they start.
Understanding Downriver Racing
Downriver racing is a sport in which competitors paddle their watercraft (usually a canoe, kayak, or stand-up paddle board) down a race course, sometimes navigating rapids, in an attempt to post the fastest time either overall or in their class. Classes are split up by boat and paddler type. Times are tracked by race officials on shore.
Classes
The number of classes and how they are separated vary by race. Boat type is the primary separation: kayaks and canoes (and possibly stand-up paddle boards) have completely separate categories that are then split further. For canoes, the General Clinton splits classes by the boat’s material and build (classic aluminum boats versus small, light pro boats) as well as by the paddlers in the boat: both the number of paddlers and their gender matter.
General Clinton Canoe Regatta
As mentioned, the General Clinton is 63 miles long and can take paddlers anywhere from 7 to 13 hours to complete. Because of the difficulty, almost all paddlers have a pit crew of some kind that supplies water and fuel refills at various points along the course and keeps them informed about their pace and performance. The effort this race demands makes it imperative that race officials know as much as possible about what to expect on race day, to keep everyone involved safe.
Data
Each row in the clinton_canoe_regatta data set represents a specific paddler in one year of the General Clinton. The data set contains results from 2014 to 2025, excluding 2020 and 2021, when the race was cancelled due to the COVID-19 pandemic.
Variable Descriptions
Variable | Description |
---|---|
year | Year raced |
boatType | Type of boat raced (canoe, kayak, SUP) |
bib | Bib number |
classID | Class (oc4stock, oc2pro, etc) |
place | Where the boat placed in their class’s results |
time | Time to complete the race (in seconds) |
overall | Overall rank (out of all boats ever) |
byBoat | Rank out of all boats of the same type |
byClass | Rank out of all boats in class (all years) |
byYrClass | Rank out of the boats in their class in their year |
byYrOver | Rank out of all the boats that year (all classes) |
byYrBoat | Rank out of all boats of the same type that year |
courseID | Version of the course raced (all the same here) |
notes | Extra information (usually about class) |
personID | The person’s PaddleStats ID |
displayName | Name of the racer |
sortName | Name of the racer with last name first |
city | Racer’s home city (as given at registration) |
state | Racer’s home state or province (as given at registration) |
country | Racer’s home country (as given at registration) |
teamName | Racer’s team name |
status | Finishing status (DNF, DNS, DQ, etc.) |
Download Clinton Data: clinton_canoe_regatta.csv
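Because `time` is stored in seconds, a first step in most analyses is reading the file and converting times to something easier to interpret. The following is a minimal sketch using only the Python standard library. The helper names `load_rows` and `format_seconds` are hypothetical, the sample rows are made up for illustration, and the sketch assumes that boats without a finishing time have a blank `time` field, which should be checked against the actual file.

```python
import csv
import io


def format_seconds(total_seconds):
    """Convert a time in seconds to an H:MM:SS string."""
    total = int(round(float(total_seconds)))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def load_rows(handle):
    """Read regatta rows from an open CSV file handle into a list of dicts."""
    return list(csv.DictReader(handle))


# Made-up sample rows for illustration only; these are not real results.
sample = io.StringIO(
    "year,classID,bib,time,status\n"
    "2024,oc2pro,12,27000,\n"
    "2024,oc2pro,12,27000,\n"
    "2024,oc4stock,40,,DNF\n"
)
rows = load_rows(sample)

# Assumption: rows with no recorded finish have an empty time field.
finished = [r for r in rows if r["time"]]
print(format_seconds(finished[0]["time"]))  # 7:30:00
```

With the downloaded file, the same helper could be called as `with open("clinton_canoe_regatta.csv", newline="") as f: rows = load_rows(f)`.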
Questions
Which class has the most boats?
Are times significantly different based on year?
Are paddlers with an existing PaddleStats ID faster than those without?
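One point to keep in mind for the first question: each row is a paddler, not a boat, so a two-person canoe appears twice. The sketch below shows one possible way to count boats rather than rows. It assumes that the combination of `year`, `classID`, and `bib` identifies a single boat, which is an assumption to verify against the data, and it uses made-up rows for illustration. The function name `boats_per_class` is hypothetical.

```python
from collections import Counter


def boats_per_class(rows):
    """Count distinct boats per class.

    Assumption: (year, classID, bib) identifies one boat.
    """
    boats = {(r["year"], r["classID"], r["bib"]) for r in rows}
    return Counter(class_id for _, class_id, _ in boats)


# Made-up rows for illustration only; these are not real results.
example_rows = [
    {"year": "2023", "classID": "oc2pro", "bib": "5"},
    {"year": "2023", "classID": "oc2pro", "bib": "5"},  # second paddler, same boat
    {"year": "2023", "classID": "oc2pro", "bib": "8"},
    {"year": "2024", "classID": "oc4stock", "bib": "40"},
]

counts = boats_per_class(example_rows)
print(counts.most_common(1))  # [('oc2pro', 2)]
```

Counting rows directly would instead give three entries for the `oc2pro` class here, which overstates the number of boats in multi-paddler classes.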
References
The data are compiled from PaddleStats, an online database of paddling results.