How to generate the data needed for a cohort chart- cohort analysis Part 3
Wednesday, February 22, 2012 1:15:53 PM (GMT Standard Time, UTC+00:00)
Sean Ronan pointed out on twitter that "what" the columns on www.quickcohort.com are which is a good point so I thought I would fill in the gaps a little. The data for a cohort is fairly simple and can be as granular as you decide but the columns needed are:
- First Action Date/Time
- Most Recent Action Date/Time
- Count of customers which have this First/Most Recent Action Date/Times
Most of the time you can strip the time part from the date/time (especially if you're looking at the data on a month basis) but the tricky part is getting the count of users within each date grouping. You can't just select min/max dates as you need the data grouped by your unique customer identifier. If you're running SQL Server 2005+ then you've got the benefit of Common Table Expressions.
For this example, I've assumed a simple order table structure which contains a Customer Reference (CustomerId) and an Order Date (OrderDate). You could however use any date and identifier which groups actions together e.g. ProfileId and LastLoginDate.
SQL 2005 or later
WITH Actions (FirstAction, LastAction, UniqueId)
AS (
SELECT min(dateadd(dd, datediff(dd, 0, o.[OrderDate]), 0))
, max(dateadd(dd, datediff(dd, 0, o.[OrderDate]), 0))
, o.[CustomerId]
FROM Orders o
GROUP BY CustomerId
)
SELECT a.[FirstAction]
, a.[LastAction]
, count(a.[UniqueId]) AS [CountOfCustomers]
FROM
Actions a
GROUP BY a.[FirstAction]
, a.[LastAction]
ORDER BY a.[FirstAction]
, a.[LastAction]
Otherwise, I think you'll need to write something using temporary tables e.g.:
Pre SQL 2005
CREATE TABLE #Actions(
FirstAction SMALLDATETIME,
LastAction SMALLDATETIME,
UniqueId INT
)
INSERT INTO #Actions
(
FirstAction
,LastAction
,UniqueId
)
SELECT min(dateadd(dd, datediff(dd, 0, o.[OrderDate]), 0))
, max(dateadd(dd, datediff(dd, 0, o.[OrderDate]), 0))
, o.[CustomerId]
FROM #Orders o
GROUP BY CustomerId
SELECT a.[FirstAction]
, a.[LastAction]
, count(a.[UniqueId]) AS [CountOfCustomers]
FROM
#Actions a
GROUP BY a.[FirstAction]
, a.[LastAction]
ORDER BY a.[FirstAction]
, a.[LastAction]
DROP TABLE #Actions
This should then generate some data that looks like this:
| FirstAction |
LastAction |
CountOfCustomers |
| 2011-03-01 |
2011-03-01 |
1 |
| 2011-03-01 |
2011-04-01 |
1 |
| 2011-04-01 |
2011-06-01 |
1 |
| 2011-05-01 |
2011-10-01 |
1 |
| 2011-06-01 |
2011-11-01 |
1 |
| 2011-07-01 |
2011-08-01 |
1 |
| 2011-08-01 |
2011-08-01 |
1 |
| 2011-09-01 |
2011-12-01 |
1 |
| 2011-10-01 |
2012-02-01 |
1 |
| 2011-11-01 |
2012-02-01 |
1 |
| 2011-12-01 |
2012-02-01 |
1 |
| 2012-01-01 |
2012-01-01 |
1 |
| 2012-02-01 |
2012-02-01 |
2 |
Which you should just be able to drop into www.quickcohort.com. I've not written a version for MySQL as I suspect someone far better at MySQL will be able to pop something together but the pre SQL 2005 script should work.
How to read and interpret a cohort chart - cohort analysis Part 2
Monday, February 20, 2012 2:31:06 PM (GMT Standard Time, UTC+00:00)
In my last post about Cohort analysis I briefly introduced what a cohort graph is, in this one I'm going to go into a little more detail about how you can use it. I'm planning on releasing some code so you can add it to your reporting suite in another post but if you can't wait until then, I've thrown up a quick online cohort generator at: www.quickcohort.com -just dump your data into it and click graph.
What can the cohort chart tell us?
A cohort chart can give you an idea of customer loyalty -and- an indication of potential problems in their lifecycle.
The key metric it gives us is customer loyalty. This is also one of the most frequently ignored metrics in every company (mostly because people aren't clear on how to measure it) yet it's probably one of the most important at the same time.
What is Customer Loyalty?
That depends very much on your business and as a result the graphs will likely look very different. It's probably easiest to compare what a couple of business types -a retailer and an online magazine- might consider loyalty.
| Retailer | Magazine |
- Repeat sales
- Visits to the store
- Engagement with the store e.g. email opens
| - Multiple Logins
- Subscription
- Number of comments on an article
|
As you can see, although there may be some cross overs (Visits to the store and Logins are probably quantified by a similar metric) the "what" depends very much on your business but the longer the duration between the first and last engagement in all scenarios above is how you can demonstrate loyalty.
How should I read a cohort chart?
Unlike most tables which you read left to right (a row at a time), you'll probably get more value out of a cohort chart from reading it by a column at a time. This will enable you to spot possible problems in the user's lifecycle.
Problem points in the user's lifecycle can often be spotted where the colours change in the same column. The greater the difference between the shades of colour, the bigger difference is between the two months. In a perfect world, 100% of the customers from Month 0 will still be using your site in Month 12 but life is rarely like that.
Looking at the chart below of user logins over time, the eager should spot that there are three months which appear to have issues: Month 3, Month 5 and Month 9:

If you're not sure on what you're looking for, you're spotting those columns which have a similarly shaded background which then lightens in the next month (or in the case of Month 9 is completely unshaded.
What does this mean?
As the chart above was logins over time, a quick glance over this cohort chart this would suggest the following to me:
- Overall there is a pretty good retention for the users (the percentage of customers going from month to month are pretty high)
- The majority of users only remain engaged up to month 3, at which point a lot of users lose interest in the site
- There is a clear drop off in customer retention after month 9
- The eldest customers had the best retention rates (despite the drop off in Month 9, 30-40% of visitors were still coming back until then)
When reading a cohort chart, you can generally discount the last cell of each cohort as it's the current month.
What might the cohort chart look like?
Repeat Sales
Logically, to be a repeat customer, you have to make at least two purchase from the retailer so the duration we're interested in (the month) is the period between the first purchase and most recent purchase.
First Date (Month 0): First Purchase
Last Date (Month x): Most Recent Purchase
Check out the cohort chart below and see what you can interpret.
Remember, Month 0 represents the first purchase -all customers appear in this column. Looking over the cohort chart, of the 69 customers the retailer had in October 2010, 39% (27 customers) were still around in November 2010 (month 1), 21% (15 customers) were still around in December 2010 and so on.
None of the 69 customers who made their first purchase in October 2010 are still a customer 12 months later (although if your customers tend to make a purchase near the end of the month, the customer who made a purchase in September 2011 may still make a purchase).
So looking at that chart, 30-40% of customers would make a second purchase 2 months after their first purchase. Of the older customers (those which first purchased before March 2011) 10% would make another purchase 5-6 months later.
What else can we gleam?
There are a few interesting things with the chart above, another is the sudden drop off in month 3 for those customers who first purchased in May/Jun 2011, similarly, the customers who signed up in Mar/Apr 2011 also stumbled in the same month, that could indicate some form of seasonal trend or change in marketing.
With a little background you will be able to get a much better insight into the meaning behind some of the numbers. If for instance you had changed your marketing routine around Jun 2011 this might explain the difference in numbers. It might be that your business is very seasonal (in which as you'd be better to look at a 24 month chart rather than 12 month).
Keeping a record of what you were doing around the different months is important, for instance you might start a pay-per-click campaign. Everyone's happy because you notice an increase in sales (so an increase in Month 0) but are they a valuable customer (a repeat purchaser) or a one-off? Cohort chart analysis will quickly highlight this to you as the increase in Month 0 will be reflected in Month 1.
Although time will tell, it would appear that a lot of customers make another purchase about 2 months after their first. This could be co-incidence or it might be that you're selling a product with a small sample accessory (e.g. a free pack of chalk with each chalkboard) and that sample pack runs out after a couple of months. Alternatively it could be a fault in the product e.g. you sell hinges and the oil runs out after a couple of months so they're buying more grease. By adding a little context to the data you'll likely get even more interesting stats out (we certainly have in the past).
Visits to the store or customer engagement
Things start to get really interesting when you start comparing two cohort charts for the same customer base and period against each other.
First Date (Month 0): First Purchase
Last Date (Month x): Most Recent Login
What's interesting when you compare the two charts is most of the customers who have made a purchase are still returning to the site (over 30% are still logging in in Month 11). This would suggest there's not as greater a problem with customer retention as there is with sales.
This could be because your store sells seasonal products but you keep customers engaged, it might just be a co-incidence but it should drive investigation.
![Cohort-Created-To-Last-Purchse[3] Cohort-Created-To-Last-Purchse[3]](http://blogs.thesitedoctor.co.uk/tim/images/How-to-read-and-interpret-a-Cohort-Trian_FC17/Cohort-Created-To-Last-Purchse3_thumb.png)
What Can I do with this information?
In isolation it's helpful but only really gives you a top level view on a customer's lifetime with you, it's when you're able to combine this data with knowledge of your business, sales statistics, marketing strategy and information such as a customer's LTV (Lifetime Value) that it gets really interesting and useful.
Using a cohort chart and average sale value, you can use it as part of your sales forecasting and predict what your company's sales will be going forwards e.g.: if the Average Order Value is £10 and your average first 5 months looked like this:
| Month: | 0 | 1 | 2 | 3 | 4 |
| Customer Trend | 1,000 | 500 | 200 | 100 | 10 |
You'll then know that of the 1,000 or so customers which sign up in the current month your revenue is likely to look something like this:
| Month: | 0 | 1 | 2 | 3 | 4 |
| Expected Sales | £10,000 | £3,000 | £1,000 | £900 | £100 |
How have I got to those numbers? Well, of the 1,000 customers that purchase this month, 50% will make another purchase on or after month 1 and 20% will make a purchase on or after month 2. With this in mind, of the 1,000 customers from Month 0, 30% will make a purchase in Month 1 so the calculation is as follows:
([Number of customers] * [Percentage Returning in Month]) * [Average Order Value] = [Expected Sales]
(1,000 * 30%) * £10 = £3,000
There are a few assumptions with doing it like this -for instance, customers who purchase monthly will only be counted once etc but this is still a good start.
Using a cohort chart with sales data becomes very powerful as you're able to get a really good insight into whether campaigns are generating worthwhile leads or just generating traffic to the sites. If you're interested in reading more about that I'll overview it in another post as that gets pretty heavy on number crunching.
If you find that customers tend to drop off after a set number of months then it might be worth setting up some form of customer engagement which is triggered just before this point e.g. an email upselling a product that complements theirs or asking them to get in touch with feedback.
Start-up's golden triangle of customer loyalty - cohort analysis Part 1
Monday, December 12, 2011 5:13:12 PM (GMT Standard Time, UTC+00:00)
If you're from a financial or medical background you'll probably already be familiar with cohort analysis but more recently it's become a very popular way of measuring customer loyalty among your consumers. I've been playing with it with our customers for a while now and I think many more businesses can benefit from it's insights.
Once you've worked out the customer's lifetime value (often referred to as LTV), average order spend and time until first purchase, your analysis often end there. LTV is better than nothing however you may be missing some major issues in their journey. Lets for example say you sell a widget. Your widget lasts 12 months but needs to be oiled every 3 months. Your gut tells you that this is the case but proving this is difficult without analysing each client individually.
Individually you can't gain a great insight into your customer as each is slightly different. Cohort analysis works around the granularity and groups customers together into cohorts. Each cohort is based on a fixed point in the customer's timeline with you -for instance the date/time they signed up or their first order. We can then use this fixed point to compare other customers who have gone past the same period in their lifetime (or not as the case may be) to spot trends.
The easiest way to understand you group the users is to imagine the following timeline of customer signups, we have three customers (Green, Blue and Red) and they all signup up at different times throughout a 6 month period:

Spotting trends in these customers is less than obvious however all customers have passed through a number of similar points in their lifecycle (in this example month 1, month 2 and month 3) so looking at the data from this perspective will help you spot the trends:

Most cohort analysis groups customers into monthly groups however the size of each group will depend on the number of signups/orders you have e.g. a system like Twitter will have enough data to produce cohorts on a minute or even second basis. By grouping customers together in this way you can then spot seasonal trends and retention (the length of time you keep a customer).
What does a cohort chart look like?
A cohort chart looks like this:

You get this very distinctive triangle because the users on the first row are your eldest users and will have been with you for the longest time (which is also why they have the longest row). The users on the last row have just joined in the current month so have the shortest row.
The chart above is rather encouraging; it's from one of our e-commerce clients and it shows really a rather dedicated customer base -a year after signing up 41% of the customers are still logging into the system!
You may also notice an interesting dip in retention for those customers who signed up in Dec 2010/Jan 2011. Although additional investigation is required, the type of customer base they have is very busy during these months so they've probably forgotten about signing up. This does however leave a prime opportunity for them to be contacted directly and encouraged back.
In my next cohort analysis blog post I'll overview how you can read and interpret the chart in more detail and use it to spot trends.