Better CrossFit Benchmarking


Gotta love benchmark day at the gym. You’ve been working hard and you get hyped up with your friends, ready to crush a workout like “Fran” or “Grace”, excited to see how much improvement you’ve made. At the end of the WOD you are spent, knowing that you left it all on the floor, and you go home holding your head high. Once you’re in the shower, it hits you: did you really improve? If you are smart, you track your progress on lifts and benchmark workouts (as hopefully your coaches nag you to do) and can simply go back and check. Let’s say you beat your old time using the RX weight. Great! You’re faster and better at CrossFit. But what does this mean? Did you perform better simply because you got stronger and the weight got relatively lighter? Did your conditioning improve? Did your technique? What if you used a lighter weight on the last benchmark? Now it really gets confusing, as you have no idea how to interpret your results. All you know is that you are faster than last time but slower than Rich Froning, so you go to bed abandoning these questions with the simple assertion: I got better, that’s all that counts. You fall asleep and hit it hard the next day.

Unless you’re me. Then you stare at the ceiling pondering these questions further. How could I measure which factor led to the greatest improvement? How can I compare myself to other athletes if they are using different weights? Are they just stronger or weaker? Faster or slower? What factors of my training will have the biggest effect on my overall improvement? These questions and many others race through my mind until the early dawn peeks through the window and I find myself huddled in a pile questioning my own existence and the eternal question, “What is CrossFit?”

Hopefully, you are not like me. However, many others are, and they have started tackling questions like these. My obsession with answering these questions has led to the development of WODRight and a change in the way we approach benchmarking at our gym. The first question I set out to answer was what makes the biggest difference in CrossFit workouts: raw power or your engine? I began my quest the same way I tackle most problems in life: I googled it. I found this fantastic NIH study, “Do physiological measures predict selected CrossFit benchmark performance”. To summarize the study, they had 14 athletes complete “Fran” (21-15-9 thrusters 95/65, pullups), “Grace” (30 clean and jerks 135/95), “Cindy”, “The CrossFit Total”, a VO2 max test, and the Wingate power/capacity test on different days and then looked for correlations between the scores. They also measured height, weight, BMI, and age. What they found was that the CrossFit Total, the sum of your 1 rep max back squat, deadlift, and overhead press, was the best predictor of your performance on Grace and Fran. Your CF Total explains 65% and 88% of the variance of your Fran and Grace times, respectively. The linear regression analysis is below and I highly encourage you to go read the article.



This got me thinking about something I already suspected: how strong you are matters the most in barbell workouts for CrossFit. A “duh” moment for many, but a really important one to prove. I knew I was onto something when I came across “What is a good Grace time?”. The author, Mr Brant M Cebulla, did a wonderful job of scraping the CrossFit Open site for data on CrossFit athletes and compared their self-reported Grace times to their 1 rep max clean and jerk. What he found mirrored the NIH study: your 1RM clean and jerk was a great predictor of your Grace time, accounting for 60-64% of the variability in Grace times. He fit an exponential relationship instead of a linear one. His graphs of men’s and women’s times are below (and once again go read his post, it is amazing and all credit for these graphs goes to him):

[Figure: Grace times, men]

[Figure: Grace times, women]

It turns out that how strong you are makes up a huge amount of your Grace time. This led me to a revelation: if you can control for strength, you should be able to measure each athlete’s conditioning and technique. This could really inform your benchmarks, as you would actually know what an athlete needed to work on to improve in the sport of fitness. Here is my reasoning: in the Cebulla article he finds the average 1 rep max clean and jerk for the men to be 225 and the RX weight to be 135. That means the “average” athlete is using 60% of their max for the workout (135 / 225 = 0.60). If you had everyone use 60% of their clean and jerk max, I would expect to see a roughly normal distribution of times, with about 68% of the athletes within 1 standard deviation of the mean time, 95% within 2 standard deviations, and 99.7% within 3. You could then use their time to identify what they needed to work on.
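To make the arithmetic concrete, here is a minimal Python sketch of scaling the Grace weight to a fixed fraction of each athlete's 1RM. The function name, the rounding to the nearest 5 (I am assuming pounds and 5 lb plate jumps), and the two sample athletes are my own illustrative assumptions, not something taken from either study.

```python
def scaled_weight(one_rep_max, fraction=0.60, increment=5):
    """Return the working weight at `fraction` of a 1RM, rounded to the
    nearest loadable `increment` (assumed here to be 5 lb)."""
    raw = one_rep_max * fraction
    return increment * round(raw / increment)


# The "average" male athlete from the Cebulla data: 225 1RM clean and jerk.
print(scaled_weight(225))  # 135, which matches the RX weight

# Hypothetical athletes, only to show the scaling at both ends.
for name, max_lift in [("Athlete A", 185), ("Athlete B", 275)]:
    print(name, scaled_weight(max_lift))
```

The point of the sketch is just that everyone lifts the same relative load, so a difference in time is less likely to be explained by raw strength alone.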

So that’s what I did. I took our athletes on a Saturday and had them test their 1RM clean and jerk, then do Grace at 60% of that weight. I then constructed a histogram of the data, presented below:


I found that the average time for the 12 athletes was 209 seconds, the average weight used was 81.6, and the distribution of their times approximated a normal distribution (bi-modal, I know, but it is a small sample size). I then measured each athlete’s score against the mean and talked to the people more than 1 standard deviation faster than the mean. I found two things: either they sandbagged their 1RM and used a significantly lower weight (which is the case for Fernando, with a time of 123 seconds, and some of the very new people who don’t yet have the technique to lift heavy), or their engine was solid. These people probably need to work on getting stronger and developing technique to really succeed in the sport of fitness, holding all else constant.
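If you want to run the same comparison on your own group, the idea can be sketched in a few lines of Python. The times below are made up for illustration (they are NOT our gym's data), and the function name, labels, and 1 standard deviation cutoff are my own choices for the sketch.

```python
from statistics import mean, stdev


def classify(times, threshold=1.0):
    """Label each athlete by how many sample standard deviations their
    time sits from the group mean (a lower time means faster)."""
    values = list(times.values())
    avg = mean(values)
    sd = stdev(values)
    labels = {}
    for name, t in times.items():
        z = (t - avg) / sd
        if z < -threshold:
            labels[name] = "fast: check for a sandbagged 1RM, else solid engine"
        elif z > threshold:
            labels[name] = "slow: conditioning may be the limiter"
        else:
            labels[name] = "typical"
    return labels


# Hypothetical Grace times in seconds at 60% of 1RM, for illustration only.
times = {"A": 120, "B": 180, "C": 200, "D": 210, "E": 220, "F": 300}
for name, label in classify(times).items():
    print(name, label)
```

With a group this small the standard deviation is noisy, so treat the labels as a prompt for a conversation with the athlete rather than a verdict.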

I then talked to the people more than 1 standard deviation slower than the mean. I found lots of excuses: hangovers, sleepless nights, and not really “feeling it” that day. All of these are fine and I would expect factors like these to affect your performance. However, the slowest time came from the strongest athlete. Marcus had a 1RM of 275 and was throwing around 165. By his own admission he was hungover, but he also admitted he hadn’t been challenged like that in a workout in a long time. Marcus is an all-around strong guy and his strength has allowed him to coast through many workouts. We failed to keep challenging him as he got stronger, and his conditioning has suffered. This really informed our approach to training Marcus, as we now know to step up the weights on him above RX and start working on building a bigger engine as we approach the Open.

I plan on repeating this benchmark test in a few months and seeing what kind of results are produced. Furthermore, improvement in this test will really inform each athlete’s training, as we will know where they have improved and where they have fallen back. Hopefully in January we will be able to produce a follow-up report on the results. I’m most curious to see if we can really get people both faster and stronger simultaneously, as CrossFit seems to do.

If you run this benchmark on your own, I would love to have your results. You can send us any data you have to be included in the study. The bigger the data set, the more accurate our assessment is. We will publish all of our results and you will be able to see where you stack up in the community. (I might even do a little percentile graph so you can see how much better you are than all the other CrossFitters.) If you have questions or feedback, send them to us as well.