Statistics 101

A precise understanding of compensation data is necessary to define a sound and fair compensation strategy.

Written by Alexis Toyane
Updated over a week ago

When it comes to managing compensation, you have to deal with data. A precise understanding of compensation data is necessary to define a sound and fair compensation strategy for your company. But statistical concepts such as percentiles, median, and sample size are far from easy to understand fully and apply correctly when you don't have a mathematics or science background. That's where this article comes in! In the next few paragraphs, we'll cover those concepts as simply as possible, within the context of compensation.

Mean, median, average?

In most compensation benchmark information, such as Figures, you'll see "median" everywhere instead of "mean". Why is that? First things first, definitions:

  • Mean (usually the arithmetic mean): the sum of the values divided by the number of values

  • Median: value separating the higher half from the lower half of a data sample

When we use the term "on average", it's unclear whether we're talking about the mean or the median. When it comes to compensation, the median is often much more appropriate. Why?

Let's take a made-up example: let's say you're Head of HR looking to make an offer for your open CTO role. You want to know how much a CTO earns "on average" in a company like yours, a post Series A FinTech start-up of 70 to 100 employees based in Paris.

You get your market data (from Figures of course! 😉), and it turns out there are 5 CTOs matching that description, with the following packages:

Company       Total Package
Company A     58 000 €
Company B     65 000 €
Company C     70 000 €
Company D     82 000 €
Company E     250 000 €

Note: Company E went and hired a big-time CTO from the Bay Area, with very specific skills and a ton of niche expertise to lead their Tech efforts, which meant a huge package. It happens!

So let's look at the mean and the median:

  • Mean = 105 000 € (sum of all packages divided by the number of CTOs, 5)

  • Median = 70 000 € (the middle value: 2 of the 5 CTOs are paid more than that and 2 are paid less)

So if you were to make a "fair" offer for this role, you could be tempted to use the mean. But is 105k€ really reflective of the market average? 4 out of 5 CTOs in similar roles earn significantly less!

That's why using the median makes more sense: as many CTOs earn more than 70k€ as earn less, which makes it a far better representation of the market average than 105k€!
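To check the arithmetic yourself, here is a minimal Python sketch using only the standard library and the five made-up packages from the table above. The variable names are our own choices for illustration. It also recomputes both figures without Company E, to show how far a single outlier pulls the mean compared with the median.

```python
from statistics import mean, median

# Total packages from the made-up CTO example, in euros.
packages = {
    "Company A": 58_000,
    "Company B": 65_000,
    "Company C": 70_000,
    "Company D": 82_000,
    "Company E": 250_000,  # the outlier hired from the Bay Area
}

all_values = list(packages.values())
mean_all = mean(all_values)
median_all = median(all_values)

# Same computation with the outlier (Company E) left out.
without_e = [v for name, v in packages.items() if name != "Company E"]
mean_without_e = mean(without_e)
median_without_e = median(without_e)

print(f"All 5 CTOs:        mean = {mean_all:,.0f} €, median = {median_all:,.0f} €")
print(f"Without Company E: mean = {mean_without_e:,.0f} €, median = {median_without_e:,.0f} €")
```

Removing one package drops the mean from 105 000 € to 68 750 €, while the median only moves from 70 000 € to 67 500 €. That stability is why the median is preferred here.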

OK but what about percentiles?

Now that you understand what the median is, percentiles are easy to understand! The median is the value at which 50% of comparable employees are paid more, and 50% are paid less. In other words, the median is the 50th percentile! Apply that same principle to the 25th percentile: 25% of people are paid less than this value, and 75% are paid more. So the Xth percentile is the value at which X% are paid less, and (100%-X%) are paid more.

Example: let's look at Figures' market data for Back-end Developers in France, at the Intermediate level

[Figure: Total Cash percentiles for Intermediate Back-end Developers]

Note: P25 stands for 25th percentile, Median is the 50th percentile and P75 the 75th percentile.

This means that out of all Intermediate Back-end Developers in France:

  • 25% earn less than 48k€, 75% earn more

  • 50% earn less than 52k€, 50% earn more

  • 75% earn less than 56k€, 25% earn more
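If you want to see how such values can be computed, here is a minimal Python sketch using the standard library. The salary list is entirely made up for illustration (21 evenly spaced values, not Figures data), and `share_paid_less` is a hypothetical helper of our own. Note that several conventions exist for computing percentiles; this sketch uses the "inclusive" method of `statistics.quantiles`, and other tools may return slightly different values on small samples.

```python
from statistics import quantiles

# Made-up salaries in euros: 40 000, 41 000, ..., 60 000 (21 values).
# This is illustrative data only, not real market data.
salaries = [40_000 + 1_000 * i for i in range(21)]

# n=4 splits the data into quarters and returns the 3 cut points:
# the 25th percentile, the median (50th) and the 75th percentile.
p25, p50, p75 = quantiles(salaries, n=4, method="inclusive")


def share_paid_less(values, threshold):
    """Fraction of values strictly below the threshold (hypothetical helper)."""
    return sum(v < threshold for v in values) / len(values)


for label, value in [("P25", p25), ("Median", p50), ("P75", p75)]:
    share = share_paid_less(salaries, value)
    print(f"{label}: {value:,.0f} € ({share:.0%} of the sample earns less)")
```

On a finite sample the shares land close to 25%, 50% and 75% rather than exactly on them, because one employee sits right at each cut point.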

Sample size?!

We kept the easiest one for last, I swear. In compensation benchmarks, sample size refers to the number of comparable employees (or incumbents) behind a piece of market data. Why does it matter? It's a good indicator of how reliable your market data is.

Looking back at our CTO example above, the sample size is 5 (as in 5 comparable incumbents). Now let's say you want to look at market data after removing some of the restrictive filters (FinTech / post Series A companies) and loosening the headcount criteria a bit (let's say companies from 40 to 150 employees instead of 70 to 100).

Now you get a sample size of 45 CTOs, and the market average (the median 😉) of this sample is 82k€.

Which would you trust more: the 70k€ median, based on a sample size of 5 CTOs, or the 82k€ median, based on a sample size of 45 CTOs?

The first number might come from very comparable companies, but it rests on a very limited dataset (small sample size), so adding or removing a single CTO can shift it noticeably. The second one, with slightly less restrictive criteria, rests on nine times as many data points and is much more robust.
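One rough way to see how fragile a small sample is: recompute the median after removing each data point in turn. The Python sketch below does this for the five made-up CTO packages from earlier. It is only an illustration of sensitivity, not a formal reliability measure, and the variable names are our own.

```python
from statistics import median

# The five made-up CTO packages from the example above, in euros.
packages = [58_000, 65_000, 70_000, 82_000, 250_000]

full_median = median(packages)

# Recompute the median with each CTO removed in turn ("leave one out").
leave_one_out = [
    median(packages[:i] + packages[i + 1:])
    for i in range(len(packages))
]

lowest, highest = min(leave_one_out), max(leave_one_out)
print(f"Median with all 5 CTOs: {full_median:,.0f} €")
print(f"Median with one CTO removed: from {lowest:,.0f} € to {highest:,.0f} €")
```

With only 5 incumbents, losing a single data point moves the median anywhere between 67 500 € and 76 000 €. With a sample of 45, removing one incumbent would typically shift the median far less.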

Well, now that you master those concepts, you can leverage any benchmark data you possess much more easily, and crush that compensation strategy of yours.
