This describes a technique which I use in order to understand the drivers behind conversion on a website in a deeper way than web analytics tools offer out of the box.
At the same time, it is also how I believe segment comparisons should work in Google Analytics. So Google, if you are listening, please make GA work like this; it would make my life a lot easier. The example here relates to conversion; however this technique works for the comparison between any segments and is a powerful way to understand what drives the difference between two groups or sets of behaviours.
If you can’t be bothered to read all of this you can download and view the conversion profile and read the following most important things you need to know:
- This is a comparison of conversion/purchase as the dependent (outcome) variable using many other behavioural dimensions in GA as independent or explanatory variables. The point is to describe what makes a buyer different to a non-buyer in order to understand which behaviours are correlated with purchase.
- As you can see (in the second sheet or from the above example), it then becomes possible to rank and sort these behaviours according to their positive or negative impact on purchase.
- Just some of the things I was able to deduce for this client from this specific case analysis:
- People are most likely to buy when they have visited 9-14 times, spent a long time on the site, visited many pages, etc. Now this is kind of obvious, and yet if you look at where the greatest volume of buyers is distributed you see that it is the first visit. This means that, even though most people buy on the first visit, they are much more likely to buy if they are comfortable and familiar with the site/brand. The insight here tells me that nurturing more engagement (creating reasons for people to visit and revisit) will drive greater efficiency of conversion. Again, kind of obvious, but what a nice way to demonstrate this for the client.
- The external sources and the landing paths which consistently come out as strong in this analysis are from a particular set of sources in which the user has compared products before arriving at the site. This tells me that comparing products makes people confident in their decision and more likely to purchase, and the recommendation is to bake this comparison behaviour into the site itself to expose more people to its effects. Again, this is common sense, but without this kind of analysis this would never have been surfaced.
- One of the factors most negatively correlated with conversion is the Android platform. Digging into this further uncovered some very interesting insights and some quick wins to resolve this.
- Site search is highly correlated with conversion. This drives a recommendation to test improving the visibility and usability of the search box.
If you have a bit more time and want to know how to recreate this read on:
What drives someone to convert?
Web analytics tools offer the ability to view conversion rates according to segments. For example, I can look at which keywords have the greatest conversion or which landing pages drive the greatest conversion. But what is driving conversion overall? What are the behavioural factors which are the most important when it comes to the decision to buy? How can we get macro insights from this conversion data in order to drive ecommerce strategy?
These questions are unfortunately not easy to answer directly from GA or other similar tools because they force you to look at factors in isolation. The following details my way of answering these by creating what I call a conversion profile.
What is a conversion profile? Where does it come from?
Before I worked in digital I worked in direct and database marketing. In the DM world there is a very frequently used data analysis technique known as data profiling. The most common application of this technique is for the comparison and description of two different data segments. For example, I may want to compare responders to a piece of marketing to non-responders in order to understand the factors which describe response. This allows me to apply better targeting to future mailers thus increasing my response rates.
The above is an example of a MOSAIC profile. MOSAIC is a geo-demographic segmentation which can be tagged to customer data using the postcode and provides information on a person based on where they live. Imagine that this example is comparing buyers (the Target column) to non-buyers or prospects (the Base column). What this would be telling you is that your buyers are overrepresented in affluent segments in comparison to non-buyers. Put in a slightly different way: affluent people are more likely to buy from you than other segments, even if they don’t actually represent the greatest volume of buyers.
I am deliberately keeping this post light on the actual stats side of this, but if you are interested in digging deeper this whole idea is broadly based on the concept of dependent and independent variables. You will also see this used in many other places, especially in the analysis of research data.
For DM the insight you get from this is that, if you were only to target affluent people, you would get the best response and the greatest efficiency and ROI. However, it also tells you who you should be targeting creatively and strategically.
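To make the idea of over-representation concrete, here is a minimal Python sketch of a profile index. The segment names and counts are invented for illustration (they are not taken from the MOSAIC example), and the formula assumed is simply the target share divided by the base share, multiplied by 100, so that 100 means parity.

```python
# Hypothetical counts per segment: (buyers, prospects). Invented for illustration.
counts = {
    "Affluent": (300, 1000),
    "Middle": (500, 5000),
    "Budget": (200, 4000),
}


def profile_index(counts):
    """Return {segment: (target %, base %, index)}; an index of 100 means parity."""
    target_total = sum(target for target, _ in counts.values())
    base_total = sum(base for _, base in counts.values())
    result = {}
    for segment, (target, base) in counts.items():
        target_pct = 100 * target / target_total  # share of all buyers
        base_pct = 100 * base / base_total        # share of the base
        result[segment] = (target_pct, base_pct, 100 * target_pct / base_pct)
    return result


profile = profile_index(counts)
for segment, (target_pct, base_pct, index) in profile.items():
    print(f"{segment:10s} target {target_pct:5.1f}%  base {base_pct:5.1f}%  index {index:5.0f}")
```

In this made-up data the "Middle" segment holds the greatest volume of buyers, yet "Affluent" has by far the highest index, which is the distinction between volume and likelihood described above.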
Applying this to the digital world
Now, it shouldn’t be too difficult to see where I am going with this. The technique and what it is driving at are perfect for understanding website converters vs. non-converters (and in fact any other segments you can think of), and generating insights which can lead to more effective and efficient optimisation, plus more strategic directional insights about how to build for success.
Ultimately what we need to ask is:
- Which types of visitors are most likely to buy from us?
- Which behaviours are most closely correlated with conversion?
- Where would optimisation be most effective and efficient?
- Which kinds of content and activities would drive the greatest improvement in overall conversion?
This is what the conversion profile answers using the techniques borrowed from DM. It is a great shame that this isn’t available within Google Analytics and other tools, as whilst it is time consuming to do manually, it is not complicated. Come on Google!
How to build the conversion profile
Download the example conversion profile to use as a template. Here are the steps to recreating this:
- You will need to use Advanced Segments in Google Analytics in order to create a buyers segment. There is a standard segment called Visits with Conversions if you just want to use all buyers. If not, you will need to create your own segment.
- Next you need to apply this segment to your reporting as well as ‘all visits’.
- Now, whichever way you do it, the next part is relatively fiddly (hence why I hope someone at Google is reading this). We need to extract the data for both segments for each dimension that we want to analyse (the more you put in the better the insight, so I would advise being as comprehensive as possible). The basic format of data you need is as follows (using the example of browser as a dimension):
(It would increase the length of this post ridiculously for me to go into exact instructions of how to do this; suffice to say that you need to tidy up the data quite a bit. It should be fairly evident from the example profile how the data is constructed. Finally, I would also recommend using something like Excellent Analytics to make the process of extraction a lot easier.)
- Once you have the basic data in place you can use my spreadsheet as a template and copy the calculations for the index score and the significance score. Again, I will not go into the statistical ins and outs of these calculations, but the basic way to read these is as follows:
- The index score tells you how over- or under-represented your target segment is in that dimension in comparison to your base segment. In the example, look at the site search section: 18% of buyers used site search within their visit, but only 10% of all visitors used search within their visit. Therefore, buyers are 84% overrepresented in the ‘site searchers’ segment in comparison to all visitors. Basically any score over about 125 is a significant over-representation and any score under about 75 is a significant under-representation.
- The Sig. column relates to the confidence level in the divergence of the factors. I generally discount anything between -3 and 3 to remove anything insignificant.
- You can then interpret the data or create a sorted summary as follows (I did this by selecting the top 10 most positive and negative but not including more than one factor from each dimension; however it is also interesting just to look at the top and bottom 10, as you will see which independent variables are strongest).
- Remember when interpreting the data not to get caught out by the fallacy of confusing correlation with causation.
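The scoring and summary steps above can be sketched in Python. Everything here is an assumption rather than a copy of the spreadsheet: the row layout, the invented numbers, and in particular the significance formula, for which I have used a simple z-score of the buyers' share against the all-visits share. The spreadsheet's own calculation may differ. With the rounded 18% and 10% from the site search example, the index comes out at 180, close to the 84% over-representation quoted above; the gap presumably comes from rounding in the quoted percentages.

```python
import math

# Hypothetical extract, one row per dimension value:
# (dimension, value, buyer visits, all visits). Numbers are invented.
ROWS = [
    ("Site search", "Used search", 180, 1000),
    ("Site search", "No search", 820, 9000),
    ("Platform", "Android", 50, 1500),
    ("Platform", "Other", 950, 8500),
    ("Browser", "Chrome", 405, 4000),
]
TARGET_TOTAL = 1000   # all buyer visits (assumed)
BASE_TOTAL = 10000    # all visits (assumed)


def score(target, base, target_total=TARGET_TOTAL, base_total=BASE_TOTAL):
    """Return (index, z) for one dimension value."""
    p_target = target / target_total
    p_base = base / base_total
    index = 100 * p_target / p_base
    # Assumed formula: z-score of the target share against the base share.
    z = (p_target - p_base) / math.sqrt(p_base * (1 - p_base) / target_total)
    return index, z


def summarise(rows, top_n=10, z_cutoff=3):
    """Return (most positive, most negative) factors, one per dimension."""
    positive, negative = {}, {}
    for dimension, value, target, base in rows:
        index, z = score(target, base)
        if abs(z) <= z_cutoff:
            continue  # discount anything between -3 and 3
        row = (dimension, value, index, z)
        if index > 100:
            if dimension not in positive or index > positive[dimension][2]:
                positive[dimension] = row
        elif index < 100:
            if dimension not in negative or index < negative[dimension][2]:
                negative[dimension] = row
    top = sorted(positive.values(), key=lambda r: r[2], reverse=True)[:top_n]
    bottom = sorted(negative.values(), key=lambda r: r[2])[:top_n]
    return top, bottom


top, bottom = summarise(ROWS)
for dimension, value, index, z in top + bottom:
    print(f"{dimension:12s} {value:12s} index {index:5.0f}  sig {z:6.1f}")
```

Keeping only the strongest value per dimension stops one dimension (visit count bands, say) from filling the whole summary; drop that step if you just want the raw top and bottom 10.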