RFM analysis for Shopify stores: a comprehensive guide

RFM analysis is a powerful tool for understanding how your customers behave.

Once you understand how they behave, you can work to encourage or adjust their behavior towards goals that benefit your store, e.g. placing another order, writing a product review, etc.

What is RFM analysis?

RFM analysis at its core is a data analytics algorithm that looks at behavior. Specifically customer behavior.

Don't let the jargon scare you off.

By the time you finish reading this article, you'll know plenty about RFM. I promise I won't get into too much math.

RFM is pretty flexible as far as data models go. It can be used by a wide variety of ecommerce stores:

big stores
small stores
physical stores
ecommerce-only stores
omnichannel stores ("The stores are everywhere! Run away!")

It can also be used to measure a wide variety of behavior:

customer purchase behavior
website visit and interactions
email subscriber behavior
social media activity

Like the 80/20 principle, it can be applied to almost any behavior.

That said, for ecommerce and Shopify stores there's one primary purpose:

measure customer purchase behavior for RFM segmentation.

RFM customer segmentation, it's all about the buckets

RFM when used for customer segmentation will group customers into different buckets based on their behaviors.

Customers who buy a lot go into one bucket.

Customers who spend a lot of money go into another bucket.

Etc, etc.

The difference between RFM segmentation and other segmentation methods is that RFM is only looking at customer behavior. It doesn't look at demographic data (e.g. age, gender), psychographic data (e.g. hobbies, values, attitudes), or anything else other than ordering data.

As you'll see soon, this makes it incredibly easy to understand the scoring once you get past the basics.

(Also, for those privacy-minded people: by avoiding all of that data you can avoid a lot of the headaches that come from personally identifiable information (PII), which can reduce your risk from GDPR-like laws)

Calculating RFM and the RFM analysis

Let's jump right into RFM by looking at what customer ordering data it uses. You might be surprised at how little it needs, which is why it works even for small stores without a data scientist on staff.

The three components of the RFM score

RFM is composed of three different components that each measure something different. Let's define them really quick as we'll come back to them often.

Recency - measures how recently the customer last ordered.
Frequency - measures how many times a customer has placed orders.
Monetary - measures how much money a customer's orders are worth (if you thought "that's Customer Lifetime Value", ding-ding-ding you got it right)

Now that those have been defined we can get into calculating them. RFM gets a bit tricky here.

Calculating Recency

Recency is measuring the customer's last order only so data-wise, all you need is the last order date (or date and time) for each customer.

But RFM doesn't just use the dates as they are, it groups them. Sidebar time...

Grouping customers, or why quintiles are so cool

Using the raw order date gets funky because how would you score someone who ordered last Monday vs the 4th of last month?

You'd end up making a bunch of judgments or guesses and end up having a model that needs a lot of tuning before it worked.

RFM's groups solve that by splitting customers into a set number of groups that are all the same size.

Let's say you have six customers who ordered: last Monday, last Tuesday, last Wednesday, last Thursday, last Friday, and last Saturday (they had Sunday off...).

First thing RFM does is sort the customers by the ordering dates.

last Monday
last Tuesday
last Wednesday
last Thursday
last Friday
last Saturday

For Recency specifically, customers who ordered most recently are more active, so RFM reverses the sort (newest, the most recent, at the top).

last Saturday
last Friday
last Thursday
last Wednesday
last Tuesday
last Monday

Now RFM distributes customers into buckets so each bucket is roughly equal in size. We'll use three buckets because there are only six customers.

Bucket one
- last Saturday
- last Friday
Bucket two
- last Thursday
- last Wednesday
Bucket three
- last Tuesday
- last Monday

Next each bucket gets a value, with higher values better. The values are based on the number of buckets used, three in this example. So Bucket one which has last Saturday's and last Friday's customers in it, is assigned the value of 3. Bucket two gets the next lowest value (2) and Bucket three gets the lowest value (1).

Recency 3
- last Saturday
- last Friday
Recency 2
- last Thursday
- last Wednesday
Recency 1
- last Tuesday
- last Monday
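
If it helps to see the sort-and-bucket steps as code, here's a minimal Python sketch of the same six-customer example. The dates and the `score_by_buckets` name are made up for illustration; only the order of the dates matters.

```python
from datetime import date

# Hypothetical last-order dates for the six example customers.
# The specific week is an assumption; only the ordering matters.
last_orders = {
    "last Monday": date(2024, 1, 1),
    "last Tuesday": date(2024, 1, 2),
    "last Wednesday": date(2024, 1, 3),
    "last Thursday": date(2024, 1, 4),
    "last Friday": date(2024, 1, 5),
    "last Saturday": date(2024, 1, 6),
}


def score_by_buckets(sorted_customers, bucket_count):
    """Score an already-sorted list; the first bucket gets the highest value."""
    bucket_size = max(1, len(sorted_customers) // bucket_count)
    scores = {}
    for position, customer in enumerate(sorted_customers):
        # Anything past the last full bucket stays in the lowest bucket.
        bucket_index = min(position // bucket_size, bucket_count - 1)
        scores[customer] = bucket_count - bucket_index
    return scores


# Reverse sort: the most recent order goes to the top.
newest_first = sorted(last_orders, key=last_orders.get, reverse=True)
recency_scores = score_by_buckets(newest_first, bucket_count=3)

for customer in newest_first:
    print(customer, recency_scores[customer])
```

The same function works for any number of buckets, which is handy once you move past three.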

Three buckets is weak for most stores. You only get three recency segments (and you'll find you only get 27 total RFM segments, 3 x 3 x 3). Not a lot of variety or options to work with customer behavior.

That's why most real-world RFM models use quintiles, aka five buckets. (Funny how math people like to use complex names to describe basic things...)

With five buckets, you balance power (lots of segments) with ease of understanding (only five numbers).

Back to Recency itself...

Calculating Recency (continued)

After that sidebar, you've already seen how to calculate recency for three groups. It's the exact same process for five groups, only you have more buckets to distribute those sorted customers into.

The easiest way to distribute the customers is to find out how many customers you have in total and then divide by five. This is how many customers should be in each bucket (ignore rounding).

If you have your customers sorted by their last order date, with the most recent at the top, just count down that many customers for your first bucket. Give them the recency score of 5.

Count down the next set of customers, give them a recency score of 4.

Continue on until the last group which gets a score of 1. You might have some leftover customers at the bottom, just give them a score of 1.

(RFM analysis doesn't have to be super-precise with this rounding as you'll see later. There'll be a lot of activity over time that balances out any rounding errors.)

So if you had 20 customers, each quintile group would have 4 customers in it. The top four would be scored with a 5, the next four would be scored with a 4, the next four a 3, the next four a 2, and the final four a 1.
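
Here's a rough Python sketch of that count-down process for quintiles. It assumes the customer list is already sorted with the most recent order at the top, and the `quintile_scores` name and the customer IDs are placeholders, not anything from a real tool.

```python
def quintile_scores(sorted_customers):
    """Give the top fifth a 5, the next fifth a 4, and so on down to 1."""
    bucket_size = max(1, len(sorted_customers) // 5)  # ignore rounding
    scores = {}
    for position, customer in enumerate(sorted_customers):
        # Leftover customers at the bottom fall through to a score of 1.
        bucket_index = min(position // bucket_size, 4)
        scores[customer] = 5 - bucket_index
    return scores


# Twenty placeholder customers, assumed to be sorted newest order first.
customers = [f"customer_{n:02d}" for n in range(1, 21)]
scores = quintile_scores(customers)

for customer in customers:
    print(customer, scores[customer])
```

With 22 customers instead of 20, the bucket size is still 4 and the two leftovers at the bottom simply join the score 1 group.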

At this point, you should have sorted and scored all of your customers for Recency. Onto the next component.

Calculating Frequency

The second component of RFM is Frequency. Remember the definition of frequency is how many times a customer has placed orders.

As you might suspect by now, you don't use the actual number of orders for the frequency score. If Harry ordered 7 times and Sally ordered 8 times, their frequency score isn't 7 or 8.

Just like you did with recency, you're going to want to sort your entire customer base. This time though, you'll use their number of orders from highest to lowest. e.g. Sally with 8 orders will be above Harry's 7.

Don't worry about the size of the order or any of that. Just how many orders each customer has placed.

Once they are sorted, follow the same process as before and count down the customers to distribute to each bucket. Start with 5 for the first set, then 4, etc.

There are two modifiers I like to make to Frequency.

Cancelled and fully refunded orders

When counting orders, if an order has been fully cancelled or refunded then I like to skip it. Additionally, if an order's total is 0, like when an order is comped or is a replacement order, I'll skip it too.

Most cancelled orders are due to some customer behavior you don't want to measure or reward, e.g. returns, replacements, fraud, etc.

Excluding them from the analysis helps Frequency measure the actual orders, which is the point of the RFM analysis.
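
As a sketch of that filter in Python: the order records and field names below (`cancelled`, `fully_refunded`, `total`) are hypothetical and don't match any particular export or API, so adjust them to whatever your order data looks like.

```python
from collections import Counter

# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"customer": "Harry", "total": 40.0, "cancelled": False, "fully_refunded": False},
    {"customer": "Harry", "total": 25.0, "cancelled": False, "fully_refunded": False},
    {"customer": "Harry", "total": 30.0, "cancelled": True, "fully_refunded": False},
    {"customer": "Sally", "total": 60.0, "cancelled": False, "fully_refunded": False},
    {"customer": "Sally", "total": 0.0, "cancelled": False, "fully_refunded": False},
    {"customer": "Sally", "total": 15.0, "cancelled": False, "fully_refunded": True},
]


def counts_toward_frequency(order):
    """Skip cancelled, fully refunded, and zero-total (comped/replacement) orders."""
    if order["cancelled"] or order["fully_refunded"]:
        return False
    return order["total"] > 0


order_counts = Counter(
    order["customer"] for order in orders if counts_toward_frequency(order)
)
print(order_counts)
```

Harry has three orders on record but only two count, and Sally's three orders shrink to one.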

Ecommerce store frequency

The vast majority of customers for Shopify and other ecommerce stores only place one order. This means many of your buckets will be filled with these one-time ordering customers. I've seen them fill up all of the 4, 3, 2, and 1 buckets in many stores.

If all of those buckets are filled with similar customers, it weakens your analysis and RFM as a whole.

In [my app][cta] I use a modified version of Frequency. Basically, if a customer only has one order then they get a score of 1. Then the remaining customers are distributed into the 5, 4, 3, and 2 buckets. This spreads out your repeat customers across four scores, which lets you detect a lot more behavior.

It's a little bit more math but you get way better results this way.
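
To make the idea concrete, here's one way the modified Frequency could be sketched in Python. This is my illustration of the description above, not the app's actual code; the function name, the sample counts, and the choice to put leftover repeat customers into the score 2 bucket are all assumptions.

```python
def modified_frequency_scores(order_counts):
    """One-time customers score 1; repeat customers are spread over 5, 4, 3, 2."""
    # Everyone with a single order (or none) goes straight to score 1.
    scores = {c: 1 for c, n in order_counts.items() if n <= 1}

    # Repeat customers only, highest order count first.
    repeat = sorted(
        (c for c, n in order_counts.items() if n > 1),
        key=order_counts.get,
        reverse=True,
    )
    bucket_size = max(1, len(repeat) // 4)
    for position, customer in enumerate(repeat):
        # Assumption: leftovers stay in the lowest repeat bucket (score 2).
        bucket_index = min(position // bucket_size, 3)
        scores[customer] = 5 - bucket_index
    return scores


# Hypothetical order counts: eight repeat customers and four one-timers.
order_counts = {
    "a": 9, "b": 8, "c": 7, "d": 6, "e": 5, "f": 4, "g": 3, "h": 2,
    "w": 1, "x": 1, "y": 1, "z": 1,
}
frequency_scores = modified_frequency_scores(order_counts)
print(frequency_scores)
```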

Calculating Monetary

With two of your RFM components scored, there are only two steps left.

As you might suspect, calculating Monetary follows the same process as before. It's also a straightforward calculation if you have the customer's Lifetime Value already (CLTV or LTV). If you don't track that, just total up all orders for each customer.

Sort the customers as before with the higher amounts on the top and start distributing them into buckets.
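
If you don't have lifetime value handy, totaling it up could look something like this in Python. The order amounts are invented for the example.

```python
from collections import defaultdict

# Hypothetical (customer, order total) pairs.
orders = [
    ("Harry", 20.0),
    ("Sally", 120.0),
    ("Jane", 45.5),
    ("Sally", 80.0),
    ("Harry", 15.5),
    ("Jane", 10.0),
]

# Total up all orders for each customer.
totals = defaultdict(float)
for customer, amount in orders:
    totals[customer] += amount

# Highest spenders at the top, ready to be counted off into buckets.
biggest_first = sorted(totals, key=totals.get, reverse=True)

for customer in biggest_first:
    print(customer, totals[customer])
```

From here the bucketing is the same count-down you used for Recency and Frequency.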

Calculating the RFM score

At this point, every customer should have three scores with values from 5-1: recency, frequency, and monetary.

Turning that into the final RFM score is easy but be careful as there's a potentially big mistake here.

Take each of the scores and combine them together into a three-digit score. Don't add the three scores, just smash them together like a runonsentence.

Harry: 5, 3, 1 becomes 531
Sally: 4, 4, 4 becomes 444
Jane: 1, 2, 3 becomes 123

This is the final RFM score for each customer.
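
In code, the "smash them together" step is string concatenation rather than addition. A tiny Python sketch (the `rfm_score` helper is just a name I picked):

```python
def rfm_score(recency, frequency, monetary):
    """Join the three component scores into a three-character score."""
    # Concatenate the digits; do not add them.
    return f"{recency}{frequency}{monetary}"


component_scores = {
    "Harry": (5, 3, 1),
    "Sally": (4, 4, 4),
    "Jane": (1, 2, 3),
}

final_scores = {name: rfm_score(*parts) for name, parts in component_scores.items()}
print(final_scores)
```

Keeping the result as text instead of a number also helps avoid accidentally sorting or comparing it like the number five hundred thirty-one.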

If you've had trouble calculating everything or scoring the customers, my app [Repeat Customer Insights][cta] will do all of the calculations for your Shopify store.

Making sense of the RFM score

Now that you've scored all of your customers, it's time to interpret what the scores mean.

This is the fun part and the whole point of the RFM analysis.

Each column matters

When reading a score, don't read it as a three-digit number. Each column or place-value should be looked at separately. The three digit score is just a shorthand.

Let's use the examples from above:

Harry 531
Sally 444
Jane 123

Based on this, Harry's score of 531 isn't better than Sally's 444. Harry's Recency is better (5__ vs 4__) because they ordered more recently. But Sally's Frequency beats Harry's (_4_ vs _3_), as does their Monetary (__4 vs __1).

What I like to do in my head is to read down the column when comparing the score across customers. It makes the changes easier to spot.

Deciding if Harry is a better customer than Sally requires a talk about what your goals are with the RFM analysis. Different goals will give you different answers.

What makes a good or healthy RFM Score?

With multiple components coming out from the analysis, now is the time to figure out what's a good score.

Since we used five buckets, anyone with a score of five in any of the three components could be considered strong in that area. But what about fours? Can a pair of fours make up for a five? What about straight threes, can that beat a single five?

Sounds like a poker game invented by statisticians.

Let's dig into the analysis portion.

Every store's analysis should be guided by their goals. That means every store's analysis will be different and even the same store will have different results depending on how their goals change over time. That's okay and totally fine.

Option 1: Analyzing one RFM component

If your goal matches what one of the three RFM components measures, you'll have an easy time with the analysis. You just pick the component that matches your goals and then customers with a score of 5 are your best, 4 are good, 3 are so-so, etc.

Recency - the currently active customers
Frequency - the customers who keep coming back
Monetary - the customers who spend the most

By using one component, you'll get five customer segments corresponding to the five buckets from earlier.

You might notice that you'll end up ignoring the other two components of the score, and that's fine in some cases. The RFM analysis is designed to give you multiple customer segmenting options but it could be that all you need is one component's score for your purposes.

Before you commit though, you might want to consider Option 2 as it might give you even better goals to use.

Option 2: Analyzing two RFM components together

If your goals don't exactly match the components, can you combine two components to describe the customers you're looking for?

Recency and Frequency - currently active, repeat customers
Recency and Monetary - currently active, high spending customers
Frequency and Monetary - repeat, high spending customers

Oftentimes one of these combinations will get really close to your goal (or you'll be attracted to one of these as a replacement for your goal).

In these cases, you're going to end up ignoring one component and using the other two. This will give you 25 customer segments (5 * 5) in total, of which four or five will be your top segments.

To figure out who'd be your best customers, you'll need to decide which scores to use. My advice would be to do the following:

Scores both 5: VIP customers (1 segment)
Score of 5 with one 4: Great customers (2 segments)
Score of both 4: Potentially great customers (1 segment)

This breakdown is easy to remember and gives you four different segments that you can consider with your marketing. You can group them all into one larger segment ("best customers") or send each one different marketing campaigns tailored to their specific scores.

Size-wise, the four segments would be about 16% of your customer base (4 of the 25 combinations), which is really close to the 80/20 concept.
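
Here's a small Python sketch of that breakdown for any pair of components. The function name is made up, and the share it computes assumes customers are spread evenly across all 25 combinations, which a real store will only approximate.

```python
def two_component_segment(score_a, score_b):
    """Label a pair of component scores using the breakdown above."""
    pair = sorted((score_a, score_b), reverse=True)
    if pair == [5, 5]:
        return "VIP"
    if pair == [5, 4]:
        return "Great"
    if pair == [4, 4]:
        return "Potentially great"
    return None  # not one of the top segments


# Every possible pair of scores from two components.
all_pairs = [(a, b) for a in range(1, 6) for b in range(1, 6)]
top_pairs = [pair for pair in all_pairs if two_component_segment(*pair)]

share = len(top_pairs) / len(all_pairs)
print(top_pairs)
print(f"{share:.0%} of the score combinations")
```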

You can also decide to favor one component more than another. Let's say your goals are around repeat customers who spend a lot. You'd decide that Frequency and Monetary are the two components you'll target but put a higher emphasis on Monetary. Using the same segment descriptions, you might come up with this analysis:

Scores both 5: VIP customers (1 segment)
Score of 5 Monetary, with a score of 4 Frequency: Great customers (1 segment)
Score of 4 Monetary, with a score of 5 or 4 Frequency: Potentially great customers (2 segments)

Notice how with this setup, customers with an improved Monetary jump up to a higher segment right away (Monetary 4 -> 5 would go from Potentially -> Great) while an improvement in Frequency might not cause a jump.
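
A quick Python sketch of the Monetary-weighted version shows the jump. The function name is hypothetical, and the rules are checked top-down so the first match wins, which is my assumption about how the overlapping rules should be read.

```python
def monetary_weighted_segment(frequency, monetary):
    """Frequency and Monetary segments with extra weight on Monetary."""
    # Rules are checked in order; the first match wins.
    if frequency == 5 and monetary == 5:
        return "VIP"
    if monetary == 5 and frequency >= 4:
        return "Great"
    if monetary == 4 and frequency >= 4:
        return "Potentially great"
    return None  # not one of the top segments


# Improving Monetary from 4 to 5 moves the customer up a segment...
print(monetary_weighted_segment(frequency=4, monetary=4))
print(monetary_weighted_segment(frequency=4, monetary=5))

# ...while improving Frequency from 4 to 5 at Monetary 4 does not.
print(monetary_weighted_segment(frequency=5, monetary=4))
```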

However you do it and whatever pair of components you use, you can always add more. Maybe you have the four segments from Recency and Frequency for your customer loyalty goals and then four segments from Frequency and Monetary to chase after larger ordering customers.

Option 3: Analyzing all three RFM components at once

If analyzing one component of the RFM score is useful and two are better, you'd think three would be even more powerful.

You'd be right but with that power comes great responsibility.

Instead of working with 5 segments (one score) or 25 segments (two scores), three scores give you 125 segments.

That can be overwhelming for any ecommerce store except for the largest. Even then, the big ones can run into trouble trying to keep all of the segments straight.

The reason is that someone has to think about how all three scores relate to each other and use that in their analysis. That's difficult but doable with 25 score combinations, and complex with 125. Software can help but you'd still need to explain to the software what all 125 combinations mean to your business.

But there's a simple solution, one you might have guessed at from the last section.

That's combining the scores into super-segments.

Super-segments are where you take each score combination and put it into groups with similar scores. For example, you might put 533, 534, 544, and 555 into the super-segment "recent buyers".
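
In code, a super-segment can be as simple as a lookup table. This Python sketch only includes the "recent buyers" example from above; the `super_segment` helper and the "unassigned" fallback are assumptions, and you'd fill in the rest of the 125 scores based on your own goals.

```python
# Each super-segment is a set of full RFM scores.
# Only "recent buyers" comes from the example above; add your own groups.
SUPER_SEGMENTS = {
    "recent buyers": {"533", "534", "544", "555"},
}


def super_segment(rfm_score):
    """Return the super-segment a score belongs to, if it has one yet."""
    for name, scores in SUPER_SEGMENTS.items():
        if rfm_score in scores:
            return name
    return "unassigned"


print(super_segment("544"))
print(super_segment("123"))
```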

This will still require a lot of tedious analysis work and fine-tuning but in the end, it can be a powerful way to use RFM to segment your customers.

But the million-dollar question is, will you make use of that power?

In order for customer segmenting and RFM analysis to be valuable to the business as a whole, it needs to help make decisions that drive results. The activity of grouping RFM scores itself is worthless. The value comes from the decisions made when building campaigns, segmented emails, or copywriting for those segments.

With that in mind, try analyzing two components at a time but give each customer multiple segments. You might have a Great customer based on Recency and Frequency but lousy on Frequency and Monetary.

Segmenting based on two components at a time is the idea behind [my customer grids][grids] and it's surprising how intuitive they can be. Add in historical versions of the analysis (e.g. how were they 12 months ago) and the customer insights go deep without becoming overwhelming.

What are the bad RFM scores?

With so much on the good RFM scores, it's easy to describe the bad RFM scores.

Basically they are the bottom scores of whatever analysis method you used.

One caveat, if you're using the modified Frequency score I mentioned above: make those Frequency 1 one-timers their own segment. It helps to keep them all in one area for ease of analysis.

Shortcut for smaller stores: Add up the RFM scores

For smaller stores without a lot of customers (say under 25,000 customers), one shortcut you can do is to add up each of the RFM components into a final number.

This will give you a number from 3 to 15 which can be used to quickly compare customers.

For our example customers

Harry would end up with a score of 9 (5 + 3 + 1)
Sally would have a 12 (4 + 4 + 4)
Jane would have a 6 (1 + 2 + 3)
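
As a Python sketch, the shortcut is a sum over the digits of the score (the `rfm_total` name is just for this example):

```python
def rfm_total(rfm_score):
    """Add up the three component scores, e.g. "531" gives 5 + 3 + 1."""
    return sum(int(digit) for digit in rfm_score)


final_scores = {"Harry": "531", "Sally": "444", "Jane": "123"}
totals = {name: rfm_total(score) for name, score in final_scores.items()}

# Highest total first for a quick comparison.
for name in sorted(totals, key=totals.get, reverse=True):
    print(name, totals[name])
```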

This method has a lot of drawbacks so be careful using it, especially if you're a larger Shopify store.

The root problem is that for ecommerce stores, the three scores aren't equally valued. Recency and Frequency are usually more important than Monetary. Frequency is a better descriptor of long-term behavior. Recency changes rapidly, causing wild swings in the total score.

With regular RFM scores, keeping them separate reduces their impact on each other.

But I wanted to highlight this approach in case you wanted a shortcut. If you want to go further, you can start work on [customer grading][grading] by building a custom model (or use [my app][cta] which comes with one pre-built for you).

RFM vs AI and machine learning

When it comes to customer segmenting, not many beat RFM analysis. Artificial Intelligence (AI) tools like machine learning are making progress but they require a lot of resources to use correctly. You'd need a lot of data, plenty of staff, and the ability to fine-tune the machine learning model before it can produce results. For big ecommerce companies like Amazon, Walmart, or Target that's no big deal. But it's outside the reach of the majority of stores.

Even the full RFM analysis with three individual scores pushes many stores to their limit, and that doesn't even include an AI making decisions you have no insight into.

A generic machine learning setup that's offered by a tool vendor could augment the RFM analysis but you'll need to be cautious. If they tune their AI models on the customer behavior for some clients, it might shift how it interprets your data.

AI and machine learning have potential for customer segmenting, they're just over-hyped right now. Especially when RFM analysis has been proven to work really well for many stores.
