After over ten years of enjoying craft beer, and five years of really thinking about what I was drinking, I’ve realized that there is minimal difference between most beers. My observation is that the vast majority of beer is very good, a bit of it is great, and very few things rank as the greatest. Luckily, very few beers are bad, though they are out there. I could just write about my anecdotal evidence, personal thoughts, observations, and opinions, but I decided to collect some data to see if my observation is correct.
I contacted Untappd and got shot down, then I emailed RateBeer and never heard back, though RateBeer does have a page that covers the acceptable use of their data. Hopefully, I didn’t break those rules. Copying and pasting data from the style pages doesn’t work because they only give you the top 50 of each style, and the brewery pages didn’t work too well either thanks to lots of additional fields. In the end, I found they have a section of tags that covers all levels of ratings for each beer with that tag, all in a very easy to copy/paste format. Per the RateBeer data usage guidelines, I collected all ratings in March 2016.
I started going through the tags, testing one or two at first, then picking tags that I felt would bring in a large number of unique beers. In the end, I pulled in 12,709 records for 11,711 unique beers. Many of the beers had very few ratings, so I decided to set my minimum rating threshold at ten ratings. This resulted in a dataset of 6,200 beers, having a total of 1,351,020 ratings, which I’ll be using to examine this topic.
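For anyone who wants to try the same cleanup on their own data, here is a rough Python sketch of the two steps described above: dropping duplicate records, then dropping beers with fewer than ten ratings. It is only an illustration. The `Beer` record, the sample rows, and the idea of spotting duplicates by name are all assumptions, not a description of how the data for this post was actually processed.

```python
from dataclasses import dataclass

@dataclass
class Beer:
    # Hypothetical record layout, invented for this sketch.
    name: str
    avg_score: float
    num_ratings: int

MIN_RATINGS = 10  # the minimum rating threshold stated in the post

def dedupe_and_filter(records, min_ratings=MIN_RATINGS):
    """Keep the first record per beer name, then drop beers with too few ratings."""
    unique = {}
    for beer in records:
        # Assumption: two records with the same name are the same beer.
        unique.setdefault(beer.name, beer)
    return [b for b in unique.values() if b.num_ratings >= min_ratings]

# Made-up rows, not real RateBeer data.
records = [
    Beer("Example Stout", 3.9, 120),
    Beer("Example Stout", 3.9, 120),  # same beer pulled in through a second tag
    Beer("Example Pils", 3.2, 9),     # below the threshold, dropped
    Beer("Example IPA", 3.6, 10),     # exactly at the threshold, kept
]

kept = dedupe_and_filter(records)
print([b.name for b in kept])            # ['Example Stout', 'Example IPA']
print(sum(b.num_ratings for b in kept))  # 130
```

The threshold is inclusive here, so a beer with exactly ten ratings stays in.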
Now we get to the data. We can clearly see that the majority of beers in the set fall between 3 and 4, with slightly more of those landing between 3.5 and 4 than between 3 and 3.5.
What are we looking at here? The aggregate ratings of hundreds of thousands of people across thousands of beers. The score for each beer is calculated from the ratings of at least ten individuals.
Good, Great, Greatest
So far we’ve been dealing with numbers, but at this point, I need to associate those numbers with words. I think it’s simple and reasonable to assign the word good to 3, great to 4, and greatest to 5.
But What About The Whalez Bro?
I suspect some people will quickly say, “but did your data include [insert your favorite beer here]?” I don’t know if it did; I wasn’t after specific beers, I was after a huge amount of data. That said, here’s a list of “whalez” that were, and weren’t, included.
Included:
- Founders CBS
- Russian River Pliny the Elder
- Three Floyds Dark Lord (plus eight variants)
- Westy 12
Not included:
- Alchemist Heady Topper
- Cigar City Hunahpu (though eight variants were included)
- Founders KBS
- Russian River Pliny the Younger
- Westbrook Mexican Cake (though three variants were included)
Before you say “those missing beers are the best ever, and the lack of their ratings invalidates all your data,” know that I went and collected those ratings:
- Heady Topper – Average score 4.24, 1,432 ratings
- Hunahpu – Average score 4.29, 903 ratings
- KBS – Average score 4.28, 2,159 ratings
- Pliny the Younger – Average score 4.35, 751 ratings
- Mexican Cake – Average score 4.05, 504 ratings
While all of these are in the range of great beers, none beat the highest score in my data set (see below). Also, five great scores are not going to significantly impact 6,200 other scores when only 216 of those scores are in the great range. Even if we add in these five, it’s still 221 great beers vs. 5,472 good beers.
Yes, there are some great beers out there (216), but the vast majority are good (5,472). I set good as a rating of 3 and great as a rating of 4, but let’s set anything between 3.5 and 4 as very good. In that case, we end up with 2,659 good beers and 2,813 very good beers.
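The bucketing above can be sketched in a few lines of Python. The `label` function is hypothetical, and exactly which side of each boundary a score falls on (is a flat 3.5 "good" or "very good"?) is an assumption, since the cutoffs are only described loosely here. The counts at the bottom are the ones quoted in this post.

```python
from collections import Counter

def label(score):
    """Map an average score to the post's words; boundary handling is assumed."""
    if score >= 5.0:
        return "greatest"
    if score >= 4.0:
        return "great"
    if score >= 3.5:
        return "very good"
    if score >= 3.0:
        return "good"
    return "below good"

# A few scores quoted in this post, plus 3.0 and 3.5 to show the assumed boundaries.
scores = [2.86, 3.0, 3.4, 3.5, 3.63, 4.04, 4.49]
counts = Counter(label(s) for s in scores)
print(dict(counts))  # {'below good': 1, 'good': 2, 'very good': 2, 'great': 2}

# Sanity check on the counts from the post.
good, very_good, great, total = 2659, 2813, 216, 6200
print(good + very_good)                      # 5472
print(f"{(good + very_good) / total:.1%}")   # 88.3%
print(f"{great / total:.1%}")                # 3.5%
```

Put another way, roughly 88% of the 6,200 beers land between 3 and 4, and about 3.5% score 4 or higher.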
The average rating across all 6,200 beers is 3.4, which means even those rare, sought-after beers score less than 1 point above the average. All told, there were only three beers more than 1 point above the average: Närke Kaggen Stormaktsporter (4.49), Westy 12 (4.43), and BCBS Rare (4.41).
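Here is a small Python sketch of that "more than 1 point above the average" check. It uses only scores quoted in this post, and it assumes the rounded 3.4 average is precise enough for the comparison. The `standouts` helper is made up for illustration.

```python
AVERAGE = 3.4  # average rating quoted in the post (rounded)

# Scores quoted elsewhere in this post; not the full 6,200-beer data set.
scores = {
    "Närke Kaggen Stormaktsporter": 4.49,
    "Westy 12": 4.43,
    "BCBS Rare": 4.41,
    "Rochefort 10": 4.3,
    "Dogfish Head 90 Minute IPA": 4.04,
    "Chimay Blue": 4.0,
}

def standouts(scores, average=AVERAGE, margin=1.0):
    """Return (name, score) pairs more than `margin` above `average`, best first."""
    hits = [(name, s) for name, s in scores.items() if s > average + margin]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

for name, score in standouts(scores):
    print(f"{name}: {score}")
# Närke Kaggen Stormaktsporter: 4.49
# Westy 12: 4.43
# BCBS Rare: 4.41
```

Rochefort 10 at 4.3 is a great beer by any measure, but it still falls short of the 4.4 cutoff.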
We can see that these great beers aren’t scored significantly better than the good beers, so why do we perceive them to be so much better? Luckily, Bryan Roth wrote a fantastic article this morning dealing with that.
Other Fun Facts From Looking at 6,200 Beers
- Most Ratings – So I was only going to do number 1 until I saw how much I love the top 5.
  - Guinness Draught – 4,984 ratings, average score 3.4
  - Rochefort 10 – 4,943 ratings, average score 4.3 (one of my favorite beers ever, available at any better bottle shop and only $6/bottle)
  - Chimay Blue – 4,565 ratings, average score 4
  - Dogfish Head 90 Minute IPA – 4,517 ratings, average score 4.04
  - Sierra Nevada Pale Ale – 4,458 ratings, average score 3.63
- Highest Score – Närke Kaggen Stormaktsporter with 550 ratings and an average score of 4.49. This 9.5% Imperial Stout jibes with what Bryan Roth (again) has seen in his analysis of RateBeer’s Best New Beers list.
- Lowest Score – Natural Ice with 879 ratings and an average score of 1.08
- Highest ABV – BrewDog End of History at 55%, 60 ratings and average score of 3.37
- Lowest ABV – I’m honestly shocked, but it is, again, BrewDog! This time with Nanny State, 0.5% ABV, 530 ratings, and an average score of 2.86.
- Average ABV – 7.4%
- Average Rating – 3.4
- Average Number of Ratings – 217
What do you think of all this information? Anyone have any other takes on the data? Do you even care if the reality is that whatever you’re chasing isn’t actually that much better than something easily available? Leave a comment and start a discussion.
Ed. Note: To anyone at RateBeer who reads this, I reached out to you but received no response. I tried to follow the rules I found for using the data. If I used anything incorrectly, please let me know and I will fix it.