How I Rate Beers

1  How I Rate – The Short Version

When I started publishing ratings of homebrews at this website, I had been rating commercial brews on Ratebeer for quite some time. I therefore chose to rate homebrews the same way as I rate commercial brews, using the criteria and the grading scales outlined by Ratebeer:

Aroma:       1 - 10
Appearance:  1 -  5
Taste:       1 - 10
Palate:      1 -  5
Overall:     1 - 20

The final Ratebeer score is the sum of the categories above divided by 10.

In this way the final Ratebeer score ends up somewhere between 0.5 and 5.0. I tend to be rather strict when rating, so the brews I rate on Ratebeer (and on Beer With Me) are seldom awarded a higher score than 4.0.
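The arithmetic above can be sketched in a few lines of Python. The function name, the dictionary layout and the range check are assumptions made for illustration; they are not something Ratebeer provides.

```python
# Allowed range for each Ratebeer criterion, as listed above.
SCALES = {
    "aroma": (1, 10),
    "appearance": (1, 5),
    "taste": (1, 10),
    "palate": (1, 5),
    "overall": (1, 20),
}


def ratebeer_score(marks):
    """Return the final score: the sum of the five marks divided by 10.

    `marks` is a dict with one integer per criterion (hypothetical layout).
    """
    for name, (low, high) in SCALES.items():
        if not low <= marks[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return sum(marks[name] for name in SCALES) / 10


example = {"aroma": 7, "appearance": 4, "taste": 7, "palate": 4, "overall": 14}
print(ratebeer_score(example))  # 3.6
```

The lowest possible marks (all 1s) give 0.5 and the highest possible marks give 5.0, which matches the range described above.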

Taking Ratebeer’s advice, I judge a beer according to how much it pleases my nose, eyes and tongue, and not according to style. Therefore, a perfectly crafted pale lager will probably never get a very high rating from me, simply because I tend not to like pale lagers that much.

I always use a beer scoring form when I rate beers. The one I’m currently using is a personalised version of a form I found at

A thorough walkthrough of my interpretation of the five rating criteria can be found in Section 2, and a statistical overview of my Ratebeer ratings is given in Section 3.

2  How I Rate – The Long Version

There are many criteria one can use when judging a beer. Some will want “Aftertaste” as a separate criterion, some will claim that the Appearance of a beer is unimportant, while others think that what really matters is what the beer tastes like and would want a much stronger emphasis on the Taste criterion than Ratebeer seems to prefer.

Being a Ratebeer user I don’t have much of a choice, so I need to use the five predefined Ratebeer criteria and grading scales. However, the exact definitions of the criteria aren’t written in stone, and this applies especially to Taste. In the following subsections I will therefore explain how I interpret the different criteria.

2.1  Aroma [1-10]

Some aroma elements seem to disappear or grow weaker very quickly after a beer is poured. Therefore, the first thing I do after I’ve poured my daily beer is to take a couple of sniffs. I try to put a label on every element I pick up, and I try to describe how strong the aroma is as a whole, which aroma elements are in front and which are further in the background. After I’ve smelled the beer for a couple of minutes I decide on an Aroma mark. Sometimes I go back and adjust Aroma after I’ve started tasting the beer, but most of the time Aroma is set before I’ve taken the first sip.

As a starting point I use the aroma of a standard, boring, industrial pale lager without any obvious off-aromas: this will get a 4. If there are unpleasant elements the Aroma will end up lower. Most beers I rate get a 7, which means that I find the aroma very pleasing and inviting.

As for all of the 5 criteria I try not to judge aroma according to style. I still must admit that I probably go harder on a beer that has a very faint aroma with very few hoppy notes if the beer is supposed to be an IPA than if it claims to be a pale lager.

2.2  Appearance [1-5]

I don’t find this criterion very important, and I often end up giving 3 or 4. The size, lifetime and visual texture of the head, in addition to its ability to stick to the glass (lacing), all count as positives. I also reward beers that look like they actually taste of something, while a very pale body might lead to a lower score. I tend not to punish or reward the amount of haziness of a beer, but I don’t find beers with lots of huge bits floating around particularly appealing.

2.3  Palate [1-5]

Palate simply refers to the mouthfeel: how does it feel to have the beer in your mouth, using only the sense of touch? What is the texture like? Is it oily, dry, creamy or maybe watery? Does the body feel thin or full? What is the carbonation level like? Does the beer feel warming due to alcohol, or maybe hot due to chilli? I also include any additional sensations I get in my throat when swallowing the beer. I give most beers 3 or 4.

2.4  Taste [1-10]

Now it’s getting interesting: what is taste? Or rather, what is taste in the Ratebeer context? What sensations and experiences should be considered when determining the Ratebeer Taste mark? Just the basic tastes bitter, sweet, acidic, salty and umami? Or could we use all the words used to describe the aroma when describing the taste as well? If you go for the latter answer, does it make sense to claim that something tastes of “tar”, “wet dog” or “horse blanket” when you probably (and hopefully) haven’t tasted any of them?

In order to solve this problem the concept of flavour comes in handy. According to the International Organization for Standardization (ISO), the definition of flavour is:

The complex combination of the olfactory, gustatory and trigeminal sensations perceived during tasting. Flavor may be influenced by tactile, thermal, painful, and/or kinesthetic effects.

Phew! In plain English, olfactory refers to smell, gustatory to the sensations picked up by the tongue and mouth, and trigeminal to nerve impulses caused by touch in the mouth region. The trigeminal part of flavour is dealt with by the Palate criterion (see Section 2.3), so we are left with the olfactory and gustatory aspects.

Some claim that the olfactory part is taken care of by the Aroma criterion, but I beg to differ. To me Aroma deals with the scent of the beer that is detected before the liquid enters the mouth, or to use a word the ISO guys would appreciate: the orthonasal smell. However, once the beer is in the mouth, magical things start to happen. After you have taken a sip of your brew and breathe out, the retronasal smell is detected, and this smell is perceived differently, and might even be processed differently by the brain, than the orthonasal smell. In addition, chemical reactions between the beer and the drinker’s saliva may produce aroma elements that were not even present in the beer when it was still in the glass, and these may be picked up retronasally. New odorants may also be produced simply by the fact that the beer is warmed up when in the oral cavity.

Considering these aspects, I find it natural to include the retronasal smell as a part of the Taste criterion, in addition to the five basic tastes. I do understand that this is not scientifically correct, since detection of chemical compounds by the olfactory system is by definition aroma. But again, given the five fixed Ratebeer rating criteria, which don’t include the broader concept of flavour, this is the best way to do it in my opinion. To me, Taste in the Ratebeer sense therefore includes all olfactory and gustatory associations I get when I have the beer in my mouth, including things I probably don’t eat on a regular basis, like “barnyard” or “earth”.

In other words, I consider Taste to be the ISO definition of flavour, but leaving out trigeminal effects.

A side note to mess things up a bit: in my ratings I often mention both taste and flavour. In these cases taste will refer to the basic tastes and flavour will refer to the retronasal smell. Combined, these will make up the Taste mark.



The Essence of Gastronomy – Understanding the Flavor of Foods and Beverages by Peter Klosse

Differences between orthonasal and retronasal olfactory functions in patients with loss of the sense of smell, Landis BN et al

Differential Neural Responses Evoked by Orthonasal versus Retronasal Odorant Perception in Humans, Dana M. Small et al

2.5  Overall [1-20]

Given the rating guidelines I’ve outlined in the sections above, the criterion that by far is the most important to me is of course Taste. A beer may look nice, smell delicious while in the glass and even have a perfect mouthfeel, but if the beer doesn’t give me oral pleasure (sorry, I just had to) it simply isn’t a good beer! Therefore, I set the Overall mark almost entirely based on Taste. Most of the time I end up giving an Overall equal to twice the Taste or twice the Taste minus 1. My most frequent Taste mark is 7, and consequently the most frequent Overall marks are 13 and 14.


3  Statistical analysis of my Ratebeer ratings

I’ve been asked a number of times if I can give a detailed statistical analysis of my Ratebeer ratings. OK, the only one who has asked about this is myself, but I’ve asked myself this question a number of times.

The Ratebeer site has a nice and simple statistics page for each rater. Here you can study how your ratings are distributed, both for individual beer styles and for all your rated beers as a whole. My Ratebeer score histogram for all registered ratings looks like this:

Figure 1: My Ratebeer ratings. The score on the x axis, the number of beers on the y axis.

3.1  Appearance, Aroma, Taste, Palate and Overall

Now, it might be interesting to break down the Ratebeer scores into the individual criteria Aroma, Taste, Appearance, Palate and Overall. That isn’t possible to do on the Ratebeer site, but you may download files with your compiled ratings if you want to do the job yourself.

The figures below are based on my ratings as of January 2015: 1220 ratings of beers, ciders and meads (only 22 of them are ciders and meads, so from now on I’ll talk about “beers”, even though that’s not strictly true).

First let’s take a look at some plots showing how many beers I’ve given the available marks for Aroma, Taste, Appearance and Palate:

Figure 2: Histograms of Aroma, Taste, Appearance and Palate. Based on 1220 ratings.

The Aroma and Taste distributions are not symmetric around a midpoint, so they are not normally distributed. This doesn’t necessarily mean that my rating is skewed; the reason is probably that I simply tend to buy good beers. If my goal was to rate all available pale lagers and pilseners, the distributions would probably be more symmetric. Note that I’ve never given a 10 in either category yet.

I find Appearance and Palate more difficult to quantify than Aroma and Taste, and as you can see I end up at 3 or 4 most of the time, and I hardly ever give 1.
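For anyone who wants to do the same with their own downloaded ratings, the tallies behind a histogram like those in Figure 2 can be sketched with Python’s standard library. The marks below are made up for illustration; they are not my real ratings, and the layout of the downloaded file is not assumed here.

```python
from collections import Counter

# Made-up Aroma marks, for illustration only (not the real ratings).
aroma_marks = [7, 6, 7, 8, 4, 7, 5, 8, 7, 6]

# Count how many beers received each mark.
counts = Counter(aroma_marks)

# Print one text bar per possible mark on the 1-10 scale.
for mark in range(1, 11):
    print(f"{mark:2d} {'#' * counts[mark]}")
```

A Counter returns 0 for marks that were never given, so unused parts of the scale simply show up as empty bars.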

The shape of the Overall histogram looks pretty much like the Aroma and Taste histograms:

Figure 3: Histogram of Overall score. Based on 1220 ratings.

As for Aroma and Taste I don’t use the full scale: I’ve never given 19 or 20 for Overall score.

Figure 2 showed that both the Aroma and Taste distributions peaked at 7. However, the two histograms aren’t able to say anything about a possible correlation between Aroma and Taste. Therefore in Figure 4 I’ve plotted Taste as a function of Aroma. The size of a blue blob is an approximate visual measure of the number of beers having that specific Aroma/Taste combination.

Figure 4: Taste as function of Aroma. Blob size indicates the number of beers with the individual combinations of Aroma and Taste values. The straight line marks where Taste equals Aroma.

Figure 4 has two striking features: the first is the huge blob at Aroma=7, Taste=7, i.e. this is by far the most common combination of Aroma and Taste. The other thing to note is that the blobs tend to follow the straight Aroma=Taste line, meaning that I often give approximately the same mark for both Aroma and Taste. For the statistically inclined: the linear Pearson correlation coefficient between Aroma and Taste is 0.86 (for the not-so-statistically inclined: if the correlation coefficient is 1 the two variables are perfectly correlated, so 0.86 is pretty close!).
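For reference, the Pearson correlation coefficient can be computed directly from its definition. The sketch below is a minimal illustration; the function name and the sample marks are assumptions and are not taken from my actual ratings.

```python
from math import sqrt


def pearson(xs, ys):
    """Linear Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up marks, for illustration only (not the real ratings).
aroma = [4, 6, 7, 7, 8, 9]
taste = [4, 6, 7, 8, 8, 8]
print(round(pearson(aroma, taste), 2))
```

A value of 1 means the points lie exactly on a rising straight line, -1 on a falling one, and values near 0 mean there is no linear relationship.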

The shapes of the Aroma, Taste and Overall distributions look very similar (Figures 2 and 3). Figure 5 shows Overall as a function of Taste, and it’s pretty obvious that there is a strong correlation between the two: the correlation coefficient is 0.96. Aroma is slightly less correlated with Overall, at 0.87.

Figure 5: Overall as a function of Taste. The straight line marks where Overall equals twice the Taste. Based on 1220 ratings.

The straight line in Figure 5 denotes Overall = 2*Taste, and the blobs tend to lie close to this line. If you want numbers: 40% of all my ratings have Overall = 2*Taste, 43% have Overall = 2*Taste-1.
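Percentages like these come from a simple count. Here is a minimal sketch using made-up (Taste, Overall) pairs; the data and variable names are purely illustrative.

```python
# Made-up (Taste, Overall) pairs, for illustration only (not the real ratings).
ratings = [
    (7, 14), (7, 13), (6, 12), (8, 15), (7, 14),
    (5, 9), (8, 16), (6, 13), (7, 13), (4, 8),
]

total = len(ratings)
# Ratings where Overall is exactly twice the Taste mark.
double = sum(1 for taste, overall in ratings if overall == 2 * taste)
# Ratings where Overall is twice the Taste mark minus 1.
double_minus_one = sum(1 for taste, overall in ratings if overall == 2 * taste - 1)

print(f"Overall = 2*Taste:     {100 * double / total:.0f}%")
print(f"Overall = 2*Taste - 1: {100 * double_minus_one / total:.0f}%")
```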

3.2  My ratings compared to the rest of Ratebeer

Unfortunately the files with compiled ratings that you can download from RateBeer lack a couple of very important statistics, one of them being the average rating a beer has among all Ratebeer users. However, this information is available on the webpages, so I’ve written a program that downloads this and a couple of other useful facts about the beers that I’ve rated.

How do my ratings compare to those of the average Ratebeer user? The mean of all my ratings is very close to the mean of the Ratebeer scores of the same collection of 1220 beers: 3.19 (me) vs 3.22 (Ratebeer). But a single number isn’t any fun, we need a graph or two! In Figures 6 and 7 I’ve tried to highlight the difference between my scores and the mean Ratebeer scores using two different approaches.

Figure 6 shows my score as a function of the average Ratebeer score (actually, a weighted average Ratebeer score) for all beers I’ve tasted. If I had given the same rating to all beers as the average Ratebeer user, all points would follow the tilted dark blue line. The points are very scattered, but if we concentrate on the blue blobs, which denote the mean value of the average Ratebeer scores for each of my scores, we see that I tend to give a lower score to beers that have a low average Ratebeer score, while I overrate beers that have a high average rating.

Figure 6: My Score as a function of the weighted average RateBeer score. The blobs indicate the mean value of all weighted RateBeer scores for each of my scores. The size of the blobs is proportional to the number of beers I’ve given the individual scores.
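The blobs in a plot like Figure 6 amount to grouping the beers by my score and averaging the Ratebeer scores within each group. A minimal sketch, using made-up score pairs rather than the real data, could look like this:

```python
from collections import defaultdict
from statistics import mean

# Made-up (my score, weighted average Ratebeer score) pairs, illustration only.
scores = [(2.5, 2.9), (2.5, 2.7), (3.2, 3.3), (3.2, 3.1), (3.2, 3.2), (4.1, 3.8)]

# Collect the Ratebeer scores of all beers that share the same score from me.
by_my_score = defaultdict(list)
for mine, ratebeer in scores:
    by_my_score[mine].append(ratebeer)

# One blob per distinct score of mine: its position is the mean Ratebeer score,
# its size is the number of beers that received that score from me.
for mine in sorted(by_my_score):
    group = by_my_score[mine]
    print(f"my score {mine}: mean Ratebeer {mean(group):.2f}, {len(group)} beers")
```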

Figure 7 shows the difference between my score and the average score on Ratebeer as a function of the score I give. In other words, a star above the light blue horizontal line in this figure marks a beer that I’ve rated higher than the average Ratebeer user, while stars below the line symbolise beers that I’ve underrated compared to the rest of you guys.

Figure 7: The difference between My Score and the weighted average Ratebeer score as a function of My Score. The blobs indicate the mean value of the rating difference for each of my scores. The size of the blobs is approximately proportional to the number of beers I’ve given the individual scores.

As was the case for Figure 6, this figure also shows that the lower the score I give, the more I underrate the beer compared to the Ratebeer average. Interestingly, it also shows that all of my highest ratings (4.1 or better) are higher than the Ratebeer average.

So far so good. Now, the question is: why do I underrate or overrate beers? Which beers do I like better than the average Ratebeer user and vice versa? One place to start looking is at the alcohol content. Figure 8 shows a pretty crowded plot of the difference between my score and the Ratebeer average as a function of ABV. The black tilted line is a linear fit to the data, and although the trend isn’t very clear and the spread is huge, we can see that I do tend to underrate lower alcohol beers while higher ABV beers are slightly overrated compared to the Ratebeer average.

Figure 8: The difference between My Score and the weighted average Ratebeer score as a function of ABV. The blobs indicate the mean value of the score difference for each of the ABVs. The size of the blobs is approximately proportional to the number of beers with the respective ABVs.
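A linear fit of the rating difference against ABV can be sketched with ordinary least squares. The choice of ordinary least squares is an assumption here, since the text above only says “linear fit”, and the data points below are made up for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of products of deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (ABV, my score minus Ratebeer average) data, illustration only.
abv = [4.5, 5.0, 6.5, 8.0, 10.5]
diff = [-0.3, -0.2, 0.1, 0.0, 0.2]
slope, intercept = linear_fit(abv, diff)
print(f"slope = {slope:.3f} per % ABV, intercept = {intercept:.3f}")
```

A positive slope means the rating difference grows with ABV, i.e. stronger beers tend to be rated higher relative to the Ratebeer average than weaker ones.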

It’s also possible to investigate which beer styles contribute to the different parts of Figure 8. I won’t bore you with the details, just briefly mention that a majority of the underrated beers around 4.5 – 5.5% ABV are Pale Lagers, Premium Lagers, Pilseners, Dunkel/Tmavý and Premium Bitter/ESB. My most frequent beer style, however, India Pale Ale (IPA), is on average overrated by 0.12, and is an important contributor to the positive rating difference for beers in the 6 – 7% ABV range. A final remark: my two highest rated beer styles are both overrated compared to the Ratebeer average and are high in ABV. Of the 11 Barley Wines I’ve rated (with a mean ABV of 11%) the mean difference between my scores and the average Ratebeer score is 0.23, and for my 76 Imperial Stouts (mean ABV of 10.8%) the difference is 0.08. My third highest rated style, on the other hand, is actually ever so slightly underrated: Abt/Quadrupel (13 beers, 10.3% ABV) has a rating difference of -0.02.
