(From chapters 17 and 18 of my second book ADVANCED CUSTOMER ANALYTICS, Kogan Page, 2017)

So, what is loyalty?  It should be easy to define; we all know what it is, right?  In the context of analytics, loyalty is when a consumer becomes a customer and likes the brand enough to come back again.  At the high end, this customer keeps coming back, spreads the word to family and friends, recommends the brand to peers and network, and may even act as an ambassador for it.  Note that at its base loyalty is about the customer; it is NOT about the firm or brand.  That is, loyalty analytics is (as it always should be) focused on the customer: what does the customer need, what does the customer like, what is the customer sensitive to, what will it take for the customer to become emotionally involved with the brand, and which touchpoints are most important to the customer?  Often this means defining loyalty in terms of customer segments: how loyal a segment is, which needs or benefits the brand satisfies for one segment over another, and what the range of loyalty is.  Is a segment merely transactionally loyal, or is it emotionally involved as an ambassador for the brand?

So the first issue is that loyalty should be designed as a win-win and viewed primarily from the customer’s POV, not the firm’s.  Note that most loyalty analytics, and even most loyalty books (even the pillar of loyalty books, Reichheld’s The Loyalty Effect), are mostly about the firm.  That position tries to explain why loyalty helps a firm, why a firm should be interested in loyalty, which metrics the firm should track to gauge its customers’ loyalty, and how understanding and increasing loyalty benefits the firm.  This is short-sighted.  This approach will produce only a Pareto effect, achieved quickly and never increased.

While loyalty no doubt has an important value to the firm, the right framework is obsessing over the customer: their experience, their wants or needs, what is valuable to THEM.  This has everything to do with program design.  Why would a firm put a loyalty program in place?  If a firm is trying to collect members in order to send them emails about promotions and discounts, that is NOT a loyalty program, it is an email club.  That may have some value, especially if customers need a discount before they will buy the firm’s products, but that should not be called a loyalty program.  One thing to learn when understanding loyalty from a customer’s POV is that not all customers want the same thing, and not all customers care about a discount.  (This is what elasticity modelling is all about.)  Some of them want something else!  Remember there are four Ps in tactical marketing and PRICE is only one of them.



There is a range of loyalty from none to transactional (rational) to brand (emotional).  The point of loyalty analytics is to understand where on this spectrum a customer or segment is and learn how to incentivize and change their behavior to move up the scale.  If done right, this benefits not only the customer or segment but also the firm.  It may turn out that some customers or segments will not move, or that it costs too much to get them to move on the spectrum, and that is a valuable insight!




Note that there is actually no such thing as a blatant entity called or quantified as “loyalty”.  It is a latent variable.  The idea is that it is like intelligence, which also cannot be quantified directly; it can only be measured indirectly, as something like a score on an IQ test, which in turn measures dimensions of intelligence: spatial ability, logic, mathematics, verbal skills, etc.  The same is true for loyalty.  It can be seen and surmised only through other actions.

So let’s use our behavioral segmentation based on customer transactions and responses to marcomm.  We are interested in how loyal each segment is, which is not necessarily the same thing as how much they spend or how many transactions they have.  So we do primary marketing research and ask questions about opinions and attitudes around price, value, quality and satisfaction.  These metrics will show a range of loyalty.  We also ask about share of voice, competitive density and the convenience of our stores compared to our competitors.

See the loyalty framework below.  It assumes that a behavioral segmentation has already been completed.  Different segments score differently on loyalty metrics.  One segment is emotionally (brand) loyal and the other is transactionally loyal.

Let’s say we have survey data on segment responders including the attitudes and metrics below: PRICE, QUALITY, VALUE, SATISFACTION, SHARE OF VOICE, COMPETITIVE DENSITY and CONVENIENCE.  Using SEM, these variables will score on the loyalty spectrum, from zero loyalty to transactional loyalty up to emotional loyalty.  Thus we can ascertain how loyal each segment is, and along which dimension.

The model above puts together a framework that says consumer behavior (transactions, responses, etc.) is caused by a spectrum of loyalty (from none to transactional to emotional), which is in turn caused by attitudes around price, value, satisfaction and quality as well as opinions / metrics of operational logistics like convenience, share of voice and competitive density.

So the general analytic idea is that there are no such metrics or quantities as emotional or transactional loyalty.  These are latent variables.  But adding these variables helps explain the behavior of customers purchasing and customers responding.  This latent variable is discovered by a factor analysis-type technique used in SEM.  That is, the manifest variables indirectly show the influence of the latent variable and that latent variable is “teased out” and labeled.
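The latent-variable idea can be illustrated with a small simulation.  This is only a sketch, not SEM itself and not the model behind the results in this chapter: it invents a hidden "loyalty" score, generates three manifest survey variables from it, and shows that the manifest variables move together only because of the hidden score.  The loadings (0.8, 0.7, 0.6), the noise levels and the sample size are all assumptions made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n_customers = 2000       # hypothetical sample size

# The latent variable: in real data this column never exists.
loyalty = [rng.gauss(0, 1) for _ in range(n_customers)]

# Manifest variables = assumed loading * latent score + survey noise.
satisfaction = [0.8 * l + rng.gauss(0, 0.6) for l in loyalty]
value = [0.7 * l + rng.gauss(0, 0.7) for l in loyalty]
quality = [0.6 * l + rng.gauss(0, 0.8) for l in loyalty]

# The manifest variables correlate with each other only through loyalty.
r_sat_value = pearson(satisfaction, value)
r_sat_quality = pearson(satisfaction, quality)

# A crude composite of the indicators tracks the hidden score closely.
composite = [s + v + q for s, v, q in zip(satisfaction, value, quality)]
r_composite_latent = pearson(composite, loyalty)

print(round(r_sat_value, 2), round(r_sat_quality, 2), round(r_composite_latent, 2))
```

A factor-analysis-type method works in the opposite direction: it starts from the observed correlations among the manifest variables and infers the hidden variable that would explain them.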

(A quick note about the difference between transactional and emotional loyalty should clarify this important point.  It is possible for a customer to appear very loyal in terms of buying a lot of products, having a short time between purchases, responding to marcomm, etc., but not in fact be very loyal.  These may be heavy purchasers because there are no competitors around, or our stores are very convenient, or our share of voice is comparatively large.  Thus it’s important to know how “loyal” customers are, independent of other dimensions.  That is, transactionally loyal customers may jump ship if competitors move in near their location or change their share of voice.)

The results below are from applying the loyalty model to two different segments, say X and Y.  The segments were defined by behavior (transactions and marcomm response).  The question is how loyal they are (and with what kind of loyalty) and what can be done about it.  Let’s say that each segment has generally the same metrics on transactions and responses.  Segment X scores as a transactionally loyal segment.  Note the parameter estimates of convenience and competitive density are very high and significant while share of voice is strong and negative.  These are traditional indications of the transactionally loyal segment.  Note also the high and positive impacts of attitudes around price and quality.  And recognize that most of the variables on the emotional path are insignificant.



Segment X, transactional path

Variable          Parm est   Std error   t value
price                 5.65        3.23      1.75
quality               6.21        1.65      3.75
value                 3.03        2.07      1.47
satisfaction          1.35        0.66      2.05
convenience           5.22        0.75      6.96
competition           2.66        0.99      2.68
share of voice       -1.55        1.03     -1.51

Segment X, emotional path

Variable          Parm est   Std error   t value
price                 0.03        2.66      0.01
quality               0.56        1.07      0.53
value                 1.04        2.36      0.44
satisfaction          1.66        1.03      1.62
convenience           1.99        1.66      1.20
competition           0.66        2.04      0.32
share of voice        2.55        1.69      1.51
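
For readers who want to check "significant" against the numbers, each t value above is simply the parameter estimate divided by its standard error.  The sketch below recomputes them for Segment X, reading the first block of results as the transactional path and the second as the emotional path (which is how the text describes them).  The cutoff of 1.96 is an assumption: the source does not state which threshold it uses.

```python
CRITICAL_T = 1.96  # assumed two-sided 5% cutoff; not stated in the source

# (estimate, standard error) pairs copied from the Segment X results.
segment_x_transactional = {
    "price": (5.65, 3.23),
    "quality": (6.21, 1.65),
    "value": (3.03, 2.07),
    "satisfaction": (1.35, 0.66),
    "convenience": (5.22, 0.75),
    "competition": (2.66, 0.99),
    "share of voice": (-1.55, 1.03),
}
segment_x_emotional = {
    "price": (0.03, 2.66),
    "quality": (0.56, 1.07),
    "value": (1.04, 2.36),
    "satisfaction": (1.66, 1.03),
    "convenience": (1.99, 1.66),
    "competition": (0.66, 2.04),
    "share of voice": (2.55, 1.69),
}


def t_value(estimate, std_error):
    """t statistic: how many standard errors the estimate is from zero."""
    return estimate / std_error


def significant_variables(path, critical=CRITICAL_T):
    """Names of variables whose |t| meets the assumed cutoff."""
    return [
        name
        for name, (estimate, std_error) in path.items()
        if abs(t_value(estimate, std_error)) >= critical
    ]


print("transactional:", significant_variables(segment_x_transactional))
print("emotional:", significant_variables(segment_x_emotional))
```

Under that assumed cutoff, four variables clear the bar on the transactional path and none do on the emotional path, which matches the reading of Segment X as transactionally loyal.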



Now, a segment that scores as strongly, and only, transactionally loyal is something of a red flag.  This is especially true if its members LOOK loyal based on the number and amount of their purchases.

How can we use the above model to move the segment from merely transactionally loyal to emotionally loyal?  The answer is in the emotional loyalty path.  The single largest impact is share of voice, and that is a metric we can (somewhat) control.  There is a business case to be made here: the cost of increasing our relative share of voice weighed against the added security (and perhaps increased purchasing) of a segment that evolves into emotionally loyal.  See that share of voice is negative in the transactional path?  As SOV increases, this segment becomes less transactionally and more emotionally loyal.

Now let’s look at the opposite kind of loyalty, the brand or emotional kind.  These are customers that love our brand, no matter what.  View the output below for segment Y, which scores mostly as an emotionally loyal group.  Note on the emotional path convenience and competitive density are negative.  This segment is so connected to the brand that even if it is inconvenient to go to our store they go anyway and even if more competition moves in these customers come to our store anyway.  This is emotional loyalty.  You see also that on the emotional path, while price is positive it’s insignificant and quality is very small.  It should be no surprise that both value and satisfaction are high.  On the transactional path none of those metrics are significant.



Segment Y, transactional path

Variable          Parm est   Std error   t value
price                -1.27        5.65     -0.22
quality               2.07        6.24      0.33
value                 2.07        1.65      1.25
satisfaction          0.03        5.07      0.01
convenience           0.23        0.20      1.17
competition           0.04        0.02      1.80
share of voice       -2.65        1.54     -1.72

Segment Y, emotional path

Variable          Parm est   Std error   t value
price                 3.25        3.04      1.07
quality               0.24        0.12      2.06
value                 1.26        0.76      1.67
satisfaction          3.23        1.23      2.63
convenience          -3.65        1.26     -2.91
competition          -2.07        0.56     -3.66
share of voice        1.27        0.87      1.45



This is the power of SEM: hypothesizing and testing a latent variable.  This latent variable accounts for movement in the customer transactions and customer responses.  If only a blatant or manifest model had been used, the fit would not have been as good and the insights (differentiating between the two kinds of loyalty) would not have been realized.  So is that cool, or what?

Structural Equation Modelling (SEM) is a powerful systems method, especially in dealing with latent variables.  This is of great importance for subjects like satisfaction as it relates to loyalty, and for quantifying various degrees of loyalty.


Why Go Beyond RFM?

ANALYTIC SOLUTION: Explain advantages and disadvantages of RFM

(This chapter was published in a different format in Marketing Insights, April 2014)


While RFM (Recency, Frequency and Monetary) is used by many firms, it in fact has limited marketing usage.  It is really only about engagement.  It is valuable for a short term, financial orientation but as organizations grow and become more complex a more sophisticated analytic technique is needed.  RFM requires no marketing strategy and as firms increase complexity there needs to be an increase in strategic planning.  Segmentation is the right tool for both.

RFM has been a pillar of database marketing for 75 years.  It can easily identify your “best” customers.  It works.  So why go beyond RFM?  To answer that, let’s make sure we all know what we’re talking about.


One definition could be, “An essential tool for identifying an organization’s best customers is the Recency / Frequency / Monetary formula.”  RFM came about more than 75 years ago for direct marketers.  It was especially popular when database marketing pioneers (Stan Rapp, Tom Collins, David Shepard, Arthur Hughes, etc.) started writing their books and advocating database marketing (as the next generation of direct marketing) nearly 50 years ago.  It became a popular way to make a database build (an expensive project) return a profit.  Thus, the most pressing need was to satisfy finance.

Jackson and Wang wrote, “In order to identify your best customers, you need to be able to look at customer data using Recency, Frequency and Monetary analysis (RFM)…”  Again the focus is on identifying your best customers.  But, it is not marketing’s job to just identify your “best” customers.  “Best” is a continuum and should be based on far more than merely past financial metrics.

The usual way RFM is put into place, although there are an infinite number of permutations, ends up incorporating three scores.  See figure below.  First, sort the database in terms of most recent transactions and score the top 20%, say, with a 5 and on down to the bottom 20% with a 1.  Then re-sort the database based on frequency, maybe with the number of transactions in a year.  Again, the top 20% get a 5 and the bottom 20% get a 1.  The last step is to re-sort the database on, say, sales dollar volume.  The top 20% get a 5 and the bottom 20% get a 1.  Now, sum the three columns (R + F + M) and each customer will have a total ranging from 15 to 3.  The highest scores are the “best” customers.


Customer ID    R    F    M    Total
999            3    2    1        6
1001           5    3    3       11
1003           4    4    2       10
1005           1    5    2        8
1007           1    4    1        6
1009           2    4    3        9
1010           3    4    4       11
1012           2    3    5       10
1014           3    1    5        9
1016           4    1    4        9
1017           5    2    3       10
1018           4    3    4       11
1020           4    4    3       11
1022           3    5    3       11
1024           2    4    2        8
1026           1    3    5        9
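
The scoring procedure described above (sort, assign 5 to the top 20% down to 1 for the bottom 20%, repeat for each metric, then sum) can be sketched in a few lines.  This is one of the many possible permutations, not a standard implementation.  The five customers and their numbers are hypothetical and are not the ones in the table; recency is measured here as days since the last purchase, so fewer days scores higher.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value 5 (top 20%) down to 1 (bottom 20%) by rank."""
    n = len(values)
    # Positions sorted from best to worst; ties keep their original order.
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = 5 - (rank * 5) // n
    return scores


# Hypothetical customers, made up for illustration.
customer_ids = [101, 102, 103, 104, 105]
days_since_last_purchase = [3, 40, 200, 15, 90]   # recency: fewer days is better
orders_last_year = [12, 2, 1, 6, 4]               # frequency
dollars_last_year = [300.0, 950.0, 40.0, 500.0, 120.0]  # monetary

r = quintile_scores(days_since_last_purchase, higher_is_better=False)
f = quintile_scores(orders_last_year)
m = quintile_scores(dollars_last_year)
totals = [ri + fi + mi for ri, fi, mi in zip(r, f, m)]

# One row per customer: ID, R, F, M, and the summed score (3 to 15).
for cid, ri, fi, mi, total in zip(customer_ids, r, f, m, totals):
    print(cid, ri, fi, mi, total)
```

Nothing in the calculation refers to why a customer bought; it only ranks what already happened.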


Note that this “best” is entirely from the firm’s point of view.  The focus is not on customer behavior, not on what the customer needs, nor on why those with a high score are so involved or why those with a low score are not so engaged.  The point is to make a (financial) return on the database, not to understand customer behavior.  That is, the motivation is financial and not marketing.

RFM works, as a method of finding those most engaged.  It works to a certain extent, and that extent is selection and targeting.  RFM is simple and easy to use, easy to understand, easy to explain and easy to implement.  It requires no analytic expertise.  It doesn’t really even require marketers, only a database and a programmer.

Say you rescore the database every month, in anticipation of sending out the new catalog.  That means that every month each customer potentially changes RFM value tiers.  After every time period a new score is run and a new migration emerges.  Note that you cannot learn why a customer changed their purchasing patterns, why they decreased their buying, why they made fewer purchases or why the time between purchases changed.  Much like the tip of an iceberg, only the blatant results are seen, and RFM gives nothing in the way of understanding the underlying motivations that caused the resultant actions.  There can be no rationale as to customer behavior because the purpose of the algorithm used was not to understand customer behavior.  RFM uses the three financial metrics and does not use an algorithm that differentiates customer behavior.

Because RFM cannot increase engagement (it only benefits from whatever level of involvement, brand loyalty, satisfaction, etc., you inherited at the time–with no idea WHY) it tends to make marketers passive.  There is no relationship building because there is no customer understanding.  That is, because RFM cannot provide a rationale as to what makes one value tier behave the way they do, marketing strategists cannot actively incentivize deeper engagement.

RFM is a good first step, but to make a great step requires something beyond RFM.  Marketers require behavioral segmentation in order to practice marketing.


Behavioral segmentation (BS) quickly followed RFM, out of frustration that RFM produced good, but not great, results.  As with most things, complex analysis requires complex analytic tools and expertise.  BS was put into place to apply marketing concepts when using a database for marketing purposes.

In order to institute a marketing strategy, there needs to be a process.  Kotler recommended the four Ps of strategic marketing: Partition, Probe, Prioritize and Position.  Partitioning is the process of segmentation.

While it’s mathematically true that partitioning only requires a business rule (RFM is a business rule) to divide the market into sub-markets, behavioral segmentation is a specific analytic strategy.  It uses customer behavior to define the segments and it uses a statistical technique that maximally differentiates the segments.  James H. Myers even says, “Many people believe that market segmentation is the key strategic concept in marketing today.”

BS is from the customer’s point of view, using customer transactions and marcom response data to specifically understand what’s important to customers.  It is based on the marketing concept of customer centricity.  BS works for all strategic marketing activities: selection targeting, optimal price discounting, channel preference / customer journey, product penetration / category management, etc.  BS allows a marketer to do more than mere targeting.

An important point might be made here.  Behaviors are caused by motivations, both primary and experiential.  Behaviors are purchases, visits, product usage and penetration, opens, clicks and marcom responses, etc.  These behaviors cause financial results, revenue, growth, life-time value, margin, etc.

Primary motivations would be unseen things like attitudes, tastes and preferences, lifestyle, value set on price, channel preferences, benefits, need arousal, etc.  There are experiential, secondary causes of behavior, typically based on some brand exposure.  These are not behaviors, but cause subsequent behaviors.  These secondary causes would be things like loyalty, engagement, satisfaction, courtesy, velocity, etc.  Note that RFM uses recency and frequency, which are metrics of engagement, which is a secondary cause.  RFM also uses monetary metrics which are resultant financial measures.  Thus RFM does not use behavioral data, but engagement and financial data.  These are very different than behavioral data used in BS.  One simple way to distinguish behavioral data from secondary data is that behaviors are nouns: purchases, responses, etc.  Note that secondary causes are adjectives: engagement metrics, loyal customers, recent transactions, frequently purchased, etc.

BS typically requires analytic expertise to implement.  Behavioral segmentation is a statistical output.  (See the sidebar.)

One critical difference between BS and RFM is that in a behavioral segmentation members typically do not change groups.  That is, the behavior that defines a segment evolves very slowly.  For example, if one person is sensitive to price, her defining behavior will not really change.  She is sensitive to price even after she has a baby, she is sensitive to price as she ages, or if she gets a puppy, or buys a new house.  Her products purchased might change, her interests in certain campaigns might change, but her defining behavior will not change.  This is one of the advantages of BS over RFM.  This is what drives your learning about the segments.  BS provides such insights that each segment generates a rationale, a story, as to why it’s unique enough to BE a segment.

While RFM uses only three dimensions, BS uses any and all behavioral dimensions that best differentiate the segments.  It typically requires far more than three variables to optimally distinguish a market.
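
To make "a statistical technique that differentiates the segments" concrete, here is a minimal sketch.  The text does not name the algorithm at this point (it defers to a sidebar), so plain k-means is used purely as a stand-in and should not be read as the author's method.  The three behavioral variables, the six customers and the starting centers are all hypothetical.

```python
def kmeans(points, centers, iterations=20):
    """Basic k-means: assign each point to its nearest center, then re-center."""
    labels = []
    for _ in range(iterations):
        labels = []
        clusters = [[] for _ in centers]
        for p in points:
            # Squared Euclidean distance from this point to every center.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            nearest = distances.index(min(distances))
            labels.append(nearest)
            clusters[nearest].append(p)
        new_centers = []
        for k, members in enumerate(clusters):
            if members:
                # New center is the per-variable mean of the cluster's members.
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return labels, centers


# Hypothetical behavioral variables per customer, each scaled 0 to 1:
# (share of purchases on discount, share of purchases online, email click rate)
customers = [
    (0.90, 0.20, 0.10),
    (0.80, 0.30, 0.20),
    (0.85, 0.10, 0.15),
    (0.10, 0.90, 0.60),
    (0.20, 0.80, 0.70),
    (0.15, 0.95, 0.50),
]

# Fixed starting centers (first and fourth customer) keep the run deterministic.
labels, centers = kmeans(customers, [customers[0], customers[3]])
print(labels)
```

In this toy data one group is defined by discount buying and the other by online purchasing and email clicks, so the two groups differ in what drives them rather than in how much they spend.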

Because marketing mix testing can be done on each segment (using product, price, promotion and place) the insights generated make for differentiated marketing strategies for each segment.  To test if RFM tiers drive behavior is probably inappropriate, because tier membership potentially changes every time period.  Much like studies that proclaim, “Women who smoke give birth to babies with low birth weight,” there is spurious correlation going on.  Just as another dimension (socio-economic, culture, etc.) might be the real (unseen) cause of the low birth weight and NOT necessarily (only) the smoking, so there are other (unseen) dimensions of behavior at work when RFM is used to explain, say, campaign responses.  That is, the response is not caused by the RFM tier, but by some other motivation.

In short, BS goes far beyond RFM.  The insights and resultant strategies are typically worth it.


As mentioned, BS delivers a cohort of segment members that are maximally differentiated from other segment members.  Because these members typically do not change segments, various marketing strategies can be leveled at each segment to maximize cross-sell, up-sell, ROI, margin, loyalty, satisfaction, etc.

BS identifies variables that optimally define each segment’s unique sensitivities.  E.g., one segment might be defined by channel preference, another by price sensitivity, another by differing product penetrations and another by a preferred marcom vehicle.  This knowledge, in and of itself, generates vast insights into segment motivations.  These insights allow for a differentiated positioning of each segment based on each segment’s key differentiators.  You get away from trying to incentivize customers out of the “bad” tiers and into the “good” tiers.  In BS, there are no good or bad tiers.  Your job is now to understand how to maximize each segment based on what drives each segment’s behavior, rather than focus on only migration.  Thus, BS gives you a test-and-learn plan.

Because of the insights provided, knowledge is gained of each segment’s prime pain points, which means that each segment can be treated with the right message, at the right time, with the right offer and at the right price.  This kind of positioning creates a “segment of one” in the customer’s mind.  This uniqueness differentiates the firm, perhaps even to the extent of moving it away from heavy competition and toward monopolistic competition.  This means you approach a degree of market power; that is, you move toward becoming a price maker.

Because BS provides such insights, it tends to make marketers very active in understanding motivations.  This tends to generate very lucrative strategies for each segment.


What are the advantages of RFM?  It’s fast, simple and easy to use, explain and implement.  What are the disadvantages of behavioral segmentation?  It requires analytic expertise to generate, is more costly and takes longer to do.

BS uses behavioral variables for the purpose of understanding customer behavior, and it uses a statistical algorithm to maximally differentiate each segment based on that behavior (see sidebar).  As mentioned, the vast majority of marketers that evolve from RFM to BS say it’s worth it, and their margins agree.