The ‘Nitty Gritty’ of Customer Pathway Analysis

I was chatting with a friend recently who knew of my background in ‘data stuff’. He described a situation where a retail bank had delivered a bad customer experience, and asked me how we track people and identify when someone is behaving differently: annoyed and thinking of leaving an organisation, say, or considering buying a home. I used retail banking in my explanation of a technique called ‘Customer Pathway Analysis’.

‘Customer Pathway Analysis’ is a term that describes complex analyses of customer behaviour and events over time. I like to use the term ‘Customer Pathway Analysis’ because it reflects the concept of a journey, and the analysis used often focuses upon patterns, sequences, and time series.  This means that the analysis is sensitive to changes and patterns over time.  In plain business speak, we are talking about understanding friction points in ‘Customer Experience’ and changes to the common behaviour we see in an individual customer.

Often the analysis targets an outcome that is aligned to a core KPI or business metric, like customer attrition/churn or acquiring a home loan/mortgage.

The examples below are described in the context of retail banking, but the approach is applicable to many industries.

Typical applications of Customer Pathway Analysis might include finding and describing specific differences in credit card transactions, bank account balance, or interactions with the branch that occur before a customer applies for a new home loan, or before a customer leaves the bank and takes out a home loan with a competitor.

Multiple channels and methods of interaction are often involved. In the context of retail banking, this includes bank account and credit card balances, fixed term savings, customer visits to a branch, visits to the mobile banking portal, and clicks on a home loan banner on a web page.

A key challenge is to automatically identify the specific changes in customer behaviour or interactions with the bank that occur at any historical point in time, across all channels, before the specific outcome or event of interest. Once this pattern of behaviour is understood we can then detect other customers that exhibit a similar or partial sequence of actions, so that we can intervene and improve outcomes.

This is a particularly challenging analytics problem because it involves very large volumes of time series data and sequences of events. Those sequences can span multiple product portfolios (credit card transactions, savings accounts, and home loans) and multiple channels and events (visiting a branch, contacting a call centre, requesting a token for online banking, using online banking, or clicking on a banner for home loans within the online banking portal). Many of these activities produce data that forms a time series or a sequence of events, so the number of distinct patterns over time is huge.

Every customer is unique, and therefore the data analytics challenge is to build a specific behavioural profile for each customer over time. Profiles are subsequently used to detect significant changes across all products and channels. This kind of solution requires high-performance technology, scalable approaches to data analysis, and robust data integration and data management methodologies. In effect, the solution records, measures, segments, and ranks the customer experience of every individual customer, intra-day, across months or years of historical data.

Customer Pathway Analysis in Retail Banking

An effective solution involves multiple statistical analysis techniques, each with a strength in detecting changes over time.  I’ve provided one example below with some basic illustrations.

One data analysis approach we employ is time series decomposition, including waveform analysis, to examine the historical time series patterns of each customer. The aim is to separate each customer's time series into a handful of simpler underlying patterns.

Think of your retail bank account – every day you buy things, and each month (or week) you get paid a salary.  Very simplistically, your bank balance in a single month might look like this:


This is called a ‘Sawtooth’ waveform and, over an entire year, a person’s bank account might (simplistically) follow a basic pattern that looks like this:


Each month (or week) we get paid and we spend the same amount of money buying stuff.

But in the real world things are never that easy.  In fact, it is more likely that there will be a trend, either growth or decline in the overall balance over time:


If those two different time series are added together, then we have a bank account balance that is starting to look semi-realistic:
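To make the idea concrete, here is a minimal sketch that builds the sawtooth and the trend as two separate series and adds them together. The numbers (a 30-day pay cycle, a salary of 3000, and a trend of 5 per day) are made up purely for illustration, and the function names are hypothetical.

```python
def sawtooth(days, period=30, salary=3000.0):
    """Balance jumps to `salary` on payday and is spent down evenly over the cycle."""
    daily_spend = salary / period
    return [salary - daily_spend * (d % period) for d in range(days)]


def linear_trend(days, daily_change=5.0):
    """Slow, steady growth (or decline, if negative) in the overall balance."""
    return [daily_change * d for d in range(days)]


days = 360  # roughly one year, assuming twelve 30-day cycles
cycle = sawtooth(days)
trend = linear_trend(days)

# The semi-realistic balance is simply the element-wise sum of the two components.
balance = [c + t for c, t in zip(cycle, trend)]
```

Each payday the balance starts a little higher than on the previous payday, which is the combined shape described above.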


And yet that still looks ridiculously simple. During some months or periods in the year significant events occur that make the time series pattern more complicated. For example, at Christmas we might get a salary bonus and outgoing expenses may increase. Such an event might involve a spike and a level shift, following a pattern like this:


Therefore, a more realistic bank account balance might include spikes in activity like this:


The challenge, of course, is to start with a realistic and complicated pattern (the one above is starting to look realistic) and then break the time series apart into the components described above: trend, seasonality, spikes, and other time series attributes.

In the fictitious example above it is straightforward to see the event occurring in December.  In reality, the separation or decomposition of the time series components reveals subtle events that may be slightly larger than normal activity.  Once we have separately identified trend and the sawtooth waveform, then it is much easier to identify and measure isolated events.  Given enough data we can identify the meaning and association of those events with subsequent outcomes, for example acquiring a home loan or leaving the bank. 
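The following is a rough sketch of that idea using a classical additive decomposition: a centred moving average estimates the trend, the average of what is left at each day of the pay cycle estimates the sawtooth, and whatever remains is treated as isolated events. It is only one of many possible decomposition methods and is not necessarily the one used in practice. The synthetic data, the 30-day period, the 3-day December spike, and the three-standard-deviation threshold are all assumptions for illustration.

```python
from statistics import mean, pstdev

PERIOD = 30  # assumed pay cycle length, in days


def make_balance(days):
    """Synthetic balance: sawtooth + upward trend + a hypothetical 3-day spike."""
    series = []
    for d in range(days):
        value = 3000.0 - 100.0 * (d % PERIOD) + 5.0 * d
        if 340 <= d <= 342:  # made-up one-off event in 'December' of year one
            value += 2000.0
        series.append(value)
    return series


def centred_moving_average(series, period):
    """Trend estimate for an even period; None where the window runs off an end."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        inner = series[t - half + 1 : t + half]  # full-weight points
        ends = 0.5 * (series[t - half] + series[t + half])  # half-weight end points
        trend[t] = (sum(inner) + ends) / period
    return trend


def decompose(series, period):
    """Split a series into trend, per-phase seasonal pattern, and residual."""
    trend = centred_moving_average(series, period)
    detrended = [None if tr is None else x - tr for x, tr in zip(series, trend)]
    seasonal = []
    for phase in range(period):
        values = [v for v in detrended[phase::period] if v is not None]
        seasonal.append(mean(values))
    residual = [
        None if v is None else v - seasonal[t % period]
        for t, v in enumerate(detrended)
    ]
    return trend, seasonal, residual


balance = make_balance(720)  # two years of daily balances
trend, seasonal, residual = decompose(balance, PERIOD)

# Flag days whose residual is unusually large compared with the rest.
known = [r for r in residual if r is not None]
threshold = 3 * pstdev(known)
flagged = [t for t, r in enumerate(residual) if r is not None and abs(r) > threshold]
print(flagged)
```

On this synthetic series the flagged days are 340, 341, and 342, the days of the injected spike. Real balances are far noisier, so the threshold and the decomposition method would need much more care.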

We apply time series transformations, including time warping, temporal rescaling, and temporal compression, in order to normalise customers that have similar behaviour but on different time scales (e.g. matching people with similar weekly versus monthly salaries and outgoings). Furthermore, we use statistical and machine learning techniques to forecast future balances for each customer and monitor variations from this forecast as one of many forms of detection for significant change.
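As a very simple illustration of temporal rescaling (not full time warping), the sketch below resamples one pay cycle onto a fixed number of points by linear interpolation and divides by the peak balance. A weekly-paid and a monthly-paid customer with the same spending shape then look identical. The function names, the choice of 10 points, and the example balances are all hypothetical.

```python
def rescale_cycle(cycle, points=10):
    """Resample one pay cycle onto `points` evenly spaced positions, scaled by its peak.

    Assumes the cycle has at least two values and a positive peak balance.
    """
    peak = max(cycle)
    last = len(cycle) - 1
    rescaled = []
    for i in range(points):
        pos = i * last / (points - 1)  # fractional index into the original cycle
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        value = (1 - frac) * cycle[lo] + frac * cycle[hi]
        rescaled.append(value / peak)
    return rescaled


def shape_distance(a, b):
    """Largest point-wise gap between two rescaled cycles."""
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up customers: both spend evenly down to zero, on different cycles.
weekly = [600.0 - 100.0 * d for d in range(7)]
monthly = [2900.0 - 100.0 * d for d in range(30)]
# A third made-up customer spends almost everything in the first few days.
front_loaded = [3000.0, 1000.0, 200.0] + [0.0] * 27

same_shape = shape_distance(rescale_cycle(weekly), rescale_cycle(monthly))
different_shape = shape_distance(rescale_cycle(monthly), rescale_cycle(front_loaded))
```

Here `same_shape` is effectively zero while `different_shape` is large, even though the weekly and monthly customers have very different raw balances and cycle lengths.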

Another approach is to transform the historical time series data so that each customer's history becomes a sequence of transitions between static states, and then compare these sequences of states to those of other historical customers who subsequently experienced a negative outcome (attrition, for example). The sequence of states is also reversed within boundaries, or treated as parallel (in time) associations, in order to measure how strongly the order of the sequence affects the progression of time series states and negative outcomes (i.e. customer attrition/churn).
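A toy version of the state-sequence idea might look like the sketch below. Month-end balances are turned into ‘up’, ‘flat’, and ‘down’ states, consecutive pairs of states are counted as transitions, and a customer's transition counts are compared with those of a customer who later left. The three states, the 5% tolerance, the cosine similarity measure, and all of the balances are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def to_states(balances, tolerance=0.05):
    """Label each month-on-month change as 'up', 'flat', or 'down'."""
    states = []
    for prev, curr in zip(balances, balances[1:]):
        change = (curr - prev) / prev if prev else 0.0
        if change > tolerance:
            states.append("up")
        elif change < -tolerance:
            states.append("down")
        else:
            states.append("flat")
    return states


def transitions(states):
    """Count each ordered pair of consecutive states."""
    return Counter(zip(states, states[1:]))


def cosine(a, b):
    """Cosine similarity between two transition-count profiles."""
    dot = sum(a[key] * b[key] for key in a)  # Counter gives 0 for missing keys
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Made-up month-end balances.
churned = [5000, 5100, 5000, 4000, 3000, 2000, 900]  # later left the bank
candidate_a = [4000, 4050, 3900, 3000, 2200, 1500]   # flat, then running down
candidate_b = [2000, 2300, 2700, 3100, 3600, 4200]   # steadily growing

churn_profile = transitions(to_states(churned))
score_a = cosine(transitions(to_states(candidate_a)), churn_profile)
score_b = cosine(transitions(to_states(candidate_b)), churn_profile)
```

In this example `score_a` is close to 1 and `score_b` is 0, so the first candidate would be the one worth a closer look. Because transitions are ordered pairs, reversing a sequence changes the counts, which gives one crude way to test how much the order matters.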

Clustering (such as k-means) is also used to segment customers based upon their time series profiles, and centroid distance measures are used (along with other measures) to detect individual customers who begin to drift away from their historical segment due to variations in their personal time series behaviour.
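Below is a bare-bones sketch of segmenting with k-means and then using centroid distance to spot drift. Each customer is reduced to two hypothetical profile features scaled to between 0 and 1 (think of trend slope and cycle amplitude). The starting centroids, the fixed number of iterations, and the rule of ‘twice the segment radius’ are assumptions, not a recommendation.

```python
from math import dist  # Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if it has none).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def has_drifted(profile, centroid, radius, factor=2.0):
    """True if a profile sits well outside its segment's usual spread."""
    return dist(profile, centroid) > factor * radius


# Made-up historical profiles: two obvious segments.
segment_a = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15), (0.10, 0.10)]
segment_b = [(0.80, 0.90), (0.90, 0.80), (0.85, 0.85), (0.90, 0.90)]
profiles = segment_a + segment_b

centroid_a, centroid_b = kmeans(profiles, [profiles[0], profiles[4]])

# Radius of segment A: the furthest historical member from its centroid.
radius_a = max(dist(p, centroid_a) for p in segment_a)

# Latest profiles for two customers who historically belonged to segment A.
stable = has_drifted((0.12, 0.18), centroid_a, radius_a)
moving = has_drifted((0.50, 0.45), centroid_a, radius_a)
```

Here `stable` is False and `moving` is True: the second customer's recent behaviour has moved well away from the segment they used to sit in, which is the kind of change we want to pick up.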

Ultimately, these activities are used to detect events and significant changes for each individual customer so that a relevant offer can be made at precisely the right time, and in context of the customer’s behaviour and needs at that time.

Many organisations have mechanisms and solutions in place to communicate with customers, but few have sophisticated data-driven solutions that automatically generate targeted customer lists that are time specific and relevant.