All posts by mikegrigsby

I've been involved in marketing science for over 25 years. It’s the only job I've ever had. Sad but true. I guess I’m a mile deep and an inch wide. I was a statistical analyst at a water utility before moving to Sprint, where I forecasted calls. At Dell my job was demand analysis, where I learned elasticity and had my first taste of direct / database marketing. I went to Millward Brown as marketing research director. There were leadership stints at Hewlett-Packard in database marketing, online analytics at the Gap and marketing operations at Emerson. I segued into consulting at Rapp as VP Analytics and settled at Targetbase, heading up the strategic retail analysis practice. I believe marketing science is focused on understanding consumer behavior, that is, on what drives demand. My PhD dissertation was about new ways of modeling demand. I've written articles for academic and trade journals. I've taught at both the graduate and undergraduate levels. I have spoken at trade conventions and seminars (National Conference of Database Marketing, Internet Retailing, Direct Marketing Association, American Statistical Association, etc.). I think I have a general understanding of the industry and great sympathy for those actually trying to do analytics. All of this is the basis for what I want to promote: the industry needs a gentle guidebook focused on helping analysts do marketing science: pull a targeted list, provide segmentation, test campaign effectiveness, forecast demand, etc. What is mostly available are dry, scholarly references that require wading through pages of mathematical formulas. Often by then the thread of the argument is lost and frustration results. I have felt this frustration myself, which is why I started this marketing science blog.



Category management comes from the CPG industries.  It is a strategy used to assign a role to each major product category.  The roles are destination, occasion, convenience and routine.  These roles are assigned based on calculations along two dimensions: percent purchasing the category and number of purchases.

CPG assumes the entire customer base assigns the same role to each category.  A better strategy for retail would be to deliver a behavioral segmentation and calculate each role BY SEGMENT.  Thus segment X might assign the role of, say, destination to a category while segment Y assigns the role of occasion, based on where each lands in the 2×2 grid.  This provides better targeting and more compelling messaging, in that one size does not fit all.


Category management (as a strategy) comes from the CPG industries.  CPG defines four “roles” that customers give to product categories.  A category is a distinct, managed group of products that customers perceive to be interrelated or substitutable in meeting their needs.  These roles are not driven by finance, advertising or supply-side logistics but by customer behavior.

The roles assigned to product categories (using groceries as an example) are:

DESTINATION: Key staples like milk, bread and meat.  These are WHY shoppers visit.  A large percent of customers buy these products and they buy a large number of them.

OCCASION: Important to the shopper, but purchased mostly for an occasion or season, e.g., birthday, anniversary, Christmas.

CONVENIENCE: Purchased infrequently, but important when a customer buys them. In a grocery store these are hardware items, shoe polish, etc.

ROUTINE: These tend to be items like pet products, paper towels, toilet tissue, etc.  A small percent of consumers purchase these but they buy a large number of them.

In groceries, a role is assigned to a product category and it is assumed that all shoppers give that same role to the category.  In retail that is not the case.  This is important.  Grocery stores assume all customers assign, say, milk a destination role and come to the store specifically for it.  Retail assumes that different segments may give different roles to the same product.  That is, one segment may indeed go to the grocery store specifically to buy milk, but another buys milk only for, say, cooking and therefore assigns it a routine role.

Analytics can show that different segments assign different roles to the SAME category.  Again, say segment X assigns the role of “destination” to kid’s clothes (it is the reason they come to the store) but segment Y assigns the role of “occasion” (seasonal) to kid’s clothes.  No marketing strategist would message kid’s clothes the same way to each segment, and that is the point.

Now, how do we determine (calculate) the assignment of these four roles by segment?  There are usually two metrics, which define a simple 2×2 matrix.  The percent of the segment purchasing the category is on the vertical axis and the number of items (of the category) purchased is on the horizontal axis.  See figure 1 below.  The four quadrants of these metrics, when comparing one segment to another, tend to differentiate and assign the roles described above.  (In practice, the metrics are indexed to the mean by segment and then plotted on these axes.)





[Figure 1.  The 2×2 role matrix: percent of segment purchasing the category (vertical axis) by # of items purchased (horizontal axis).]


Using this descriptive framework, each segment can be plotted in terms of product categories.  That is, where a grocery store assumes all customers treat, say, steak as a staple (a destination role), there may be a segment that gives steak an occasion (even seasonal) role.  This means that after segmentation has been performed, each segment can be plotted and different roles can be assigned.
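To make the quadrant logic concrete, here is a minimal sketch.  The quadrant-to-role mapping is an assumed reading of the role descriptions above, and the 1.0 cutoff simply reflects indexing each metric to the mean:

```python
# Sketch: assign a category role for one segment from the two indexed
# metrics (percent of segment purchasing, number of items purchased).
# The quadrant-to-role mapping is an assumption based on the role
# descriptions in the text; indexing to the mean makes 1.0 the cutoff.
def assign_role(pct_purchasing_idx, items_purchased_idx, cutoff=1.0):
    high_pct = pct_purchasing_idx >= cutoff
    high_items = items_purchased_idx >= cutoff
    if high_pct and high_items:
        return "destination"   # many buy it, and buy a lot of it
    if not high_pct and high_items:
        return "routine"       # few buy it, but buy a lot of it
    if high_pct and not high_items:
        return "occasion"      # many buy it, but only a few items
    return "convenience"       # few buy it, and only a few items
```

Run per segment per category: two segments landing in different quadrants for the same category get different roles, which is the whole targeting point.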

As an example, see figure 2 below showing MEN’S CLOTHING & ACCESSORIES.  The metrics are “percent of segment purchasing” and “average number of category units purchased by segment”.  These are indexed to the mean and plotted on the graph (figure 3) after that.  Note that segment 1 plots as occasionally (even seasonally) buying MEN’S CLOTHING & ACCESSORIES whereas segment 2 assigns a destination role to MEN’S CLOTHING & ACCESSORIES.  No marketing manager would send the same messages or offers to these two segments in terms of this category.



Figure 2.  MEN’S CLOTHING & ACCESSORIES, indexed to the mean:

Segment     % of segment purchasing     Avg units purchased
SEGMENT1    0.65                        1.10
SEGMENT2    2.18                        2.41
SEGMENT3    0.03                        0.05
SEGMENT4    1.88                        1.54
SEGMENT5    0.26                        0.27





Category management came from the CPG industries.  It is an approach in which each major product category is defined in terms of four different roles assigned by customers.  These roles are destination, occasion, convenience and routine.  The classification into these roles depends on customer scoring along two axes: percent purchasing and number of purchases.

Take this approach one step further and do category management by segment.  That is, one segment may treat a product category as, say, a destination but another will treat that same category as convenience.  This differentiation means that messages, promotions and bundling offers can be versioned by segment.





By Mike Grigsby, PhD



A specialty retailer wanted to develop a model to ascertain revenue performance by store. They wanted to differentiate first time buyers from repeat buyers, in order to exploit those different sensitivities. They believed there were regional differences and needed the model to account for them.

Some of their stores were in attractive areas, having little competition and / or good demographics (income, lifestyle, etc.) whereas other stores were in less attractive areas. The question was around these uncontrollable dimensions vis-à-vis controllable ones like pricing, staffing, store appearance, customer service, satisfaction, etc. That is, they wanted to develop a “scorecard” for each store, taking into account its controllable operations given its uncontrollable circumstances.

Because this framework clearly had at least one independent variable to be used as a dependent variable (satisfaction) and because there was staging involved (customer service => satisfaction => sales), simultaneous equations was the econometric technique of choice.

The resulting analysis allowed development of a scorecard for each store. This meant that for one store a particular variable, say net price, could be very powerful in terms of driving sales, but for another store (perhaps in another region) net price was less impactful. It also meant that two stores with similar uncontrollable situations but different same store sales could be analyzed in terms of better operations, etc.


A specialty retailer had about 450 stores nationwide. They wanted to develop a model to ascertain revenue performance. What explained same store sales? The goal was both to predict sales and to account for them, that is, to assess store managers’ accountability for performance.

It was hypothesized there were two general classes of performance drivers, some in the store’s control and some not in the store’s control. Examples of variables not in the store’s control include number of competitors, demographics around each store’s trade area, etc. Examples of variables within the store’s control (and hence things they could do to increase their performance) included net price, marketing spend, staffing (both the number and type), customer service training, culture, employee engagement, etc.

These differences in performance likely varied by, say, region. Some regions may have a large employer move in (or out), or differ in unemployment, income, household size, etc., and that might make a difference in how effective a store’s operations were. That is, senior management wanted to know if a store was performing well, taking into account its regional circumstances. It may be that one store could do no better than it did, having already maximized its pricing, marketing and staffing, while another store could do far better, given its very attractive circumstances.

They wanted to differentiate first time buyers from repeat buyers, in order to exploit and target those different behaviors. First time buyers may be motivated by lead generation, cooperative partnerships, a social media reputation score, whereas none of these would have much of a bearing on repeat buyers.


This business objective required data from three major sources. First, the transactional database would supply same store sales, net price, etc. The second source was primary marketing research in terms of employee engagement, satisfaction, loyalty, customer service and store culture. The last source was overlay data detailing number of competitors, demographics, interests and lifestyle.

The transactional database supplied same store revenue, units, average net price and number and type of staffing. There were also data including certification of industry excellence standards, external and internal store appearance, distance each customer was from each store, etc.

There was a heavy investment in marketing research, primarily focusing on these areas: satisfaction, customer service, employee engagement, quality of assortment and store culture. These responses came from the database of customers so were easy to merge together.

Lastly, several overlay data sources were used. One gave demographics (income, age, size of household), another gave interests, lifestyle and a third gave number of competitors in trade area, etc.


The stores were grouped by geography, typical in retail. (Another possibility, often preferred depending on operational tactics, would be to do a behavioral segmentation and then do simultaneous equations by segment.) Each Group VP had from 30 to 70 stores to manage. Most of their annual bonuses were based on same store sales, so understanding drivers to increase unit sales was critical.

There would have to be a separate model for first time purchasers as differentiated from repeat purchasers. (About 30% of total sales were from first time purchasers.)

Likewise, because of the differences by region, there would have to be a different model for each region. This would amplify the key insight: what kind of sales performance can be expected given differences by region? Obviously a national KPI could not be standardized across all regions; it needed to be distinct at least at the region level.

Simultaneous Equations

The dependent variable for first time and repeat customers would be units, typical in retail. It was hypothesized there would be some variables unique to first units, some variables unique to repeat units and some variables shared by both. (This was one of the reasons a systems approach was needed.) Causality also suggested a staged approach in that satisfaction was caused by some variables and repeat units were caused by satisfaction. See figure 1 for a graphic representation.

Below are the simplified hypothesized equations.

First Units = f(net price, # competitors, store appearance, marketing spend, age, income, lifestyle, partnerships, reputation, lead generation, seasonality)

Repeat Units = f(net price, # competitors, store appearance, marketing spend, age, income, lifestyle, SATISFACTION, customer service, staffing, employee engagement, seasonality)

Satisfaction = f(net price, customer service, store culture, employee engagement, staffing)





The above meant that three stage least squares (3SLS) was one of the key econometric techniques of choice.

(A quick note about another popular simultaneous-equation choice, Vector Auto Regression (VAR): because different independent variables appear in each of the equations, a vector formulation would be inappropriate. That is, the ability to specify different variables by equation (rather than a common vector) is more accurate and more insightful.)

Thus, in this case, 3SLS is preferred and instrumental variables had to be found. These would have to be correlated with the endogenous variables and uncorrelated with the error terms. Often large scale macro variables (consumer confidence, industry growth, etc.) can be used as they are correlated with many dependent / endogenous variables (units, revenue, etc.) and (hopefully) less correlated with error terms.
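That screening logic can be illustrated with a small sketch (a hypothetical illustration, not the project's actual diagnostic; both thresholds are arbitrary):

```python
from statistics import mean

def pearson(xs, ys):
    # Sample Pearson correlation of two equal-length sequences.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def plausible_instrument(z, endog, residuals, relevance=0.3, exogeneity=0.1):
    # A candidate instrument should be correlated with the endogenous
    # variable (relevance) and roughly uncorrelated with the residuals
    # (exogeneity); the threshold values here are illustrative only.
    return (abs(pearson(z, endog)) >= relevance
            and abs(pearson(z, residuals)) <= exogeneity)
```

In practice one would use formal weak-instrument tests, but the idea is the same: check relevance and (as much as the data allow) exogeneity before trusting the instrument.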

The endogenous variables are those estimated by the system of equations, in this case the dependent variables and those shared by all equations, e.g., first units, repeat units, satisfaction, net price. The exogenous variables are those given and thus outside the system, in this case marketing spend, store appearance, demographics, etc.

In order to be solved, each equation must be at least identified. That is, the number of exogenous variables excluded from each equation has to be at least the number of endogenous variables included, less one.
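This order condition can be checked mechanically. A sketch, using the simplified variable lists hypothesized above and the endogenous set named in the text (first units, repeat units, satisfaction, net price); the short variable names are illustrative stand-ins:

```python
# Order condition for identification: the number of exogenous variables
# excluded from an equation must be at least the number of endogenous
# variables included in it, less one.
endogenous = {"first_units", "repeat_units", "satisfaction", "net_price"}

equations = {
    "first_units": {"first_units", "net_price", "competitors", "appearance",
                    "marketing_spend", "age", "income", "lifestyle",
                    "partnerships", "reputation", "lead_gen", "seasonality"},
    "repeat_units": {"repeat_units", "net_price", "competitors", "appearance",
                     "marketing_spend", "age", "income", "lifestyle",
                     "satisfaction", "customer_service", "staffing",
                     "engagement", "seasonality"},
    "satisfaction": {"satisfaction", "net_price", "customer_service",
                     "culture", "engagement", "staffing"},
}

# All exogenous variables appearing anywhere in the system.
all_exogenous = set().union(*equations.values()) - endogenous

def identified(eq_vars):
    included_endog = len(eq_vars & endogenous)
    excluded_exog = len(all_exogenous - eq_vars)
    return excluded_exog >= included_endog - 1

for name, eq_vars in equations.items():
    print(name, identified(eq_vars))
```

With these variable lists every equation passes, which is what makes the system solvable at all.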

As typical, in terms of generating the model results, the data file was split into two random samples. The model was estimated using the “training” sample and verified using the “testing” sample. There was no attempt to “simulate” via some Monte Carlo, etc., process. That is, the point of the model was not to assess risk or a range of outputs.

The cost of doing simultaneous equations is that the only desirable property remaining for estimators is consistency. Because variables depend on values from other equations, they cannot be assumed to be fixed. (That is, the assumption of non-stochastic X is violated.) The benefit is that simultaneous equations more accurately model the behavior sought to be understood. Added complexity means added insights.



The model showed differences between first time and repeat units and satisfaction by region. As hypothesized, different regions are sensitive to different independent variables.


That is, repeat visitors have different sensitivities varying by region. In one region net price may dominate and in another region staffing may dominate and in yet another region satisfaction may dominate. Likewise, first time visitors have different sensitivities varying by region. In one region partnerships may dominate and in another region lead generation may dominate and in yet another region their online reputation score may dominate. All of the above are controllable (within the firm’s ability to change) variables.


The model showed differences between controllable and uncontrollable variables by region. In one region unemployment may dominate and in another region the number of competitors may dominate and in yet another region demographics (income, size of household, education, etc.) may dominate.



Model Output


To show the power of this kind of analysis, two regions are detailed below. These are the (final) results of the 3SLS model applied to each region. The key thing to notice is that different independent variables (controllable as well as uncontrollable) are significant in different regions. This is as expected. Note also that the elasticity, even for the same variable, differs by region.


In table 1, first timers are very sensitive to net price, in that a 10% decrease in net price causes a 22.5% increase in units. The number of sales associates is significant in this region and a 10% increase in sales associates causes a 6.5% increase in units, so while it’s impactful it would be classified as insensitive. Each of these variables gives lucrative strategic insights and provides a business case. The cost of changing price and the cost of hiring more associates can be weighed against the benefits of additional units (and ultimately additional revenue). That is, this analysis pinpoints not only which “levers” a regional VP can pull but by how much, in order to maximize total revenue.
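The arithmetic behind those statements is just the elasticity approximation; a minimal sketch:

```python
def pct_unit_change(elasticity, pct_driver_change):
    # Linear elasticity approximation: the % change in units is roughly
    # the elasticity times the % change in the driver variable.
    return elasticity * pct_driver_change

# Region X, first-time units: net price elasticity -2.25, so a 10% price
# cut (-10%) projects to about a +22.5% change in units; sales associate
# elasticity 0.65, so 10% more associates projects to about +6.5%.
```

Note this is a local, linear approximation; it gets less reliable the further the driver moves from its observed range.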


The reputation score (a calculation similar to Net Promoter Score) is (barely) elastic, and obviously as the firm closes more leads this drives more units. There could be a business case made here as well, in that perhaps hiring more call center reps could increase closed leads.


The number of competitors has a negative impact on first time units, and this is an uncontrollable variable. The value here is that it quantifies the impact on new units as more (or fewer) competitors move into the trade area.




Table 1.  Region X model results (coefficient, variable mean, elasticity):

FIRST UNITS
net price               -0.78     10.87    -2.25
# of sales assoc         0.22     11.22     0.65
reputation score         0.06     77.66     1.13
# leads closed           0.45     27.11     3.24
Uncontrollable:
# competitors           -0.45     24.77    -2.92

REPEAT UNITS
net price               -3.55     18.55    -1.47
# of emails sent        -0.91    112.55    -2.30
# of direct mails sent   4.55     14.22     1.45
# sales assoc            5.09     11.22     1.28
internal appear         11.24      0.11     0.03
med income               0.001  $61,244     1.37
# competitors           -0.60     24.77    -0.33
satisfaction             5.78      7.80     1.01

SATISFACTION
customer service         1.55      6.07     1.21
systems                  0.87      5.22     0.58
product assort           1.55      8.97     1.78
# competitors           -0.25     24.77    -0.79
distance from store     -0.08      9.87    -0.10




Repeat visitors are also sensitive to net price. If a 10% decrease were applied there would be a corresponding increase in units of 14.7% and a resulting increase in net revenue. Note that the number of emails sent is negative in terms of responding units. This is rationalized as email fatigue. The number of direct mails sent is positive and impactful. Hiring more associates (which will drive both first and repeat units) is more impactful in the repeat model (as expected) than in the first time model. Here, 10% more associates drive 12.8% more units and thus more total revenue. The internal appearance of the store is important and positive but has a rather minor impact. Thus this region has several ways to affect its performance; that is, there are a few variables directly in its control to impact units.


Number of competitors (as with first time units) is also impactful and negative. Distance from store is negative as well, but these are uncontrollable. That is, they should be watched but cannot really be acted upon.


Lastly it is important to understand the impact of satisfaction. This is why simultaneous equations was an appropriate choice. As satisfaction increases by 10%, repeat units increase by 10.1%. The question is, how will the regions increase satisfaction? The answer lies in the satisfaction model below.


There are several things that compose satisfaction, and these may differ somewhat by region. Customer service, systems improvement and product assortment all have a positive impact on satisfaction. Product assortment is greatest, and systems actually have an insensitive (inelastic) impact on satisfaction. But again the issue is the ability to calculate a business case. What is the cost of increasing customer service, improving systems or (if even possible) expanding product assortment? Whatever the cost, the return it gives generates an ROI. This model details a way to optimize which projects best improve satisfaction, which will in turn improve repeat units, which will drive repeat net revenue.
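That business case can be sketched as a simple ROI calculation. All numbers below are illustrative assumptions (not the client's figures), and the sketch assumes revenue moves proportionally with repeat units at a constant price:

```python
def initiative_roi(cost, sat_lift_pct, sat_elasticity, baseline_repeat_revenue):
    # Projected repeat-revenue lift from a satisfaction initiative, using
    # the elasticity approximation: %change in repeat units is roughly
    # the satisfaction elasticity times the % lift in satisfaction.
    pct_unit_lift = sat_elasticity * sat_lift_pct
    revenue_lift = baseline_repeat_revenue * pct_unit_lift / 100.0
    return (revenue_lift - cost) / cost

# Illustrative: a $50k project lifting satisfaction 10%, with region X's
# satisfaction elasticity of 1.01 and a hypothetical $1M repeat baseline.
```

Under those assumptions, initiative_roi(50_000, 10, 1.01, 1_000_000) comes out to about 1.02, i.e., roughly a 102% return; competing projects (service training vs. systems vs. assortment) can be ranked the same way.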


As expected, number of competitors and distance from store decrease satisfaction. Both have a minor impact.







Table 2.  Region Y model results (coefficient, variable mean, elasticity):

FIRST UNITS
net price               -0.36     10.09    -0.89
# of partners            1.05      2.59     0.67
reputation score         1.98      3.55     1.72
# leads closed           0.27     31.99     2.12
lost major employer    -25.77      0.09    -0.56
# competitors           -0.22      9.55    -0.51

REPEAT UNITS
net price               -2.44     19.08    -0.69
# of emails sent        -0.72    121.50    -1.31
# of direct mails sent   4.99     15.01     1.12
# sales assoc            6.55     12.97     1.27
remodel amount           0.0002 $115,208    0.34
med income               0.001  $68,055     1.01
size of household        2.28      2.09     0.07
satisfaction             6.88      6.70     0.69

SATISFACTION
customer service         2.22      5.55     1.84
assoc engage             1.55      8.99     2.08
product assort           2.08      6.88     2.14
distance from store     -0.04      5.55    -0.03



Table 2 presents a very different region. Again the idea is to note that these regions vary, and those variances give managers ways to improve their regions’ and stores’ performance.


First time visitors in region Y are insensitive to price, which is a very different finding than in region X. This means that instead of lowering price to increase revenue, this region should raise price to increase revenue. While reputation score and closed leads are again significant, the amounts are very different than in region X. Lastly, the number of associates does not show up in this region’s model; instead partnerships are a significant variable.


In terms of uncontrollable variables, number of competitors is significant, as is the loss of a major employer. While the firm can do nothing about these variables, noting their occurrence gives managers items to watch and pay attention to.


The repeat visitors are very different in this region as well. While net price is significant, repeat visitors (as with first time visitors) are insensitive to price. There are similar findings in both regions in terms of number of emails and direct mails sent and number of associates. However, this region is sensitive to the remodel amount rather than internal appearance, perhaps only a subtle difference but interesting in its own way.


Uncontrollable variables show up as median income and size of household. This region does not have the competitive pressures of region X. This has implications for enterprise optimization, subsidies, etc. Note also that while satisfaction is significant, it actually has an insensitive elasticity.


This region’s satisfaction model has customer service, associate engagement and product assortment under the firm’s control. Associate engagement was not found to be significant in region X.


For uncontrollable variables, distance from store shows up again as significant.




Table 3.  Region Y scorecard: top and bottom three stores by YOY revenue growth (store number; YOY revenue growth; YOY changes in the controllable levers; income and household-size indices; YOY satisfaction change):

11    +11%  +2%  -3%  +0%  +1%  +4%  1.12  1.01  +0%
113    +9%  +2%  +1%  +1%  -2%  +6%  1.24  0.98  +1%
209    +8%  +1%  -2%  +3%  +3%  -1%  0.98  1.03  +2%
7      -2%  +0%  -2%  +1%  +2%  +2%  0.77  0.89  -3%
18     -4%  -3%  +4%  +4%  -4%  -3%  1.02  1.03  +0%
90     -5%  -1%  +4%  +4%  -2%  +0%  0.68  0.99  +2%






Table 3 shows part of region Y’s scorecard. These are the top and bottom three store performers, measured by year over year percent growth. The whole point of the modeling was to find which variables are significant, believing that these would differ by region, and then look at how each store operated in terms of those variables.


That is, if price is important and, say, the region tends to operate on the inelastic side of demand, the appropriate strategy would be to increase price, which should increase revenue. Given that, each store’s operations can be ascertained (and ultimately guided) in terms of the correct strategy for particular variables or metrics.
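That pricing logic follows directly from the elasticity: the percent change in revenue is roughly the percent change in price plus the percent change in units. A sketch:

```python
def pct_revenue_change(price_elasticity, pct_price_change):
    # %change in revenue ~ %change in price + %change in units
    #                    = %change in price * (1 + elasticity)
    return pct_price_change * (1 + price_elasticity)
```

On the inelastic side (|elasticity| < 1, like region Y's repeat elasticity of -0.69) a price increase raises revenue; on the elastic side (like region X's first-time elasticity of -2.25) it lowers it, so the same lever must be pulled in opposite directions by region.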


Thus, the top three performing stores all tended to increase price. Notice that the bottom three performers moved price in the wrong direction. In terms of number of emails, the more sent the more negative pressure is put on revenue, and the bottom three performers tended to send more emails than the top three. Also, the bottom three stores decreased their number of associates, which tended to decrease units. While minor, increasing the amount spent on store upkeep, modernizing, etc. has a positive impact on revenue.


In terms of store operations, if a store is struggling during the year a common tactic would be to decrease price, and without the model management would not know whether that is the appropriate action. It may also seem that sending out more emails would help counteract a sub-par year, but in this case those are exactly the wrong actions. Likewise, decreasing the dollars spent on associates and on remodeling may appear to be cost cutting measures, but again those are exactly the wrong decisions. The bottom performers did send out more direct mail, and that is the correct response.


Thus the scorecard is intended to find which levers are impactful in each region and give store managers a tool to help optimize revenue. This can be in the form of a test and learn plan, or managing KPIs, etc.


The above referenced controllable variables, those levers that store management can change. Looking at the uncontrollable variables, note that the top three performers tend to be over-indexed on income and size of household whereas the bottom three tend to be below average.


Taking a quick look at overall satisfaction shows the same trend: the top performers tended to increase their satisfaction score while the bottom performers tended to decrease it. Since the managers know from the model that customer service, associate engagement and product assortment drive satisfaction in this region, these metrics can be examined as a scorecard as well, and a focus on them can help drive satisfaction.




Using the Store Planning Matrix


Note Figure 2 below, which shows the store planning matrix (SPM). Only the six stores mentioned in the scorecard are shown.







The SPM plots stores on two dimensions, economic area and revenue performance. Economic area can be defined as some combination of number of competitors, the gain or loss of a large employer, income, household size, unemployment, etc. The idea is to create some dimension of how attractive a particular trade area is. The other dimension is YOY change in revenue.


The issue is how a store performs given the economic environment it finds itself in. As an obvious example, look at store number 11 versus store number 18. As shown, they have similar economic operating areas but drastically different revenue performance. The SPM is a tool that gives managers an immediate way to see which store is delivering given its operations. That is, store number 18 cannot claim that it can do no better, because store number 11 is in the same environment and performed much better. Then, using the store scorecard above, particular recovery plans can be put into place.


In terms of a strategic approach, the four quadrants of the SPM each have a specific goal. The top left might be to gain share. The top right might be to maximize and defend. The bottom left might be to manage for profit. The bottom right might be to manage for revenue. Plotting where a specific store lands relative to its peers gives management a quick, relevant POV.
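A sketch of that quadrant assignment (the axis orientation, economic area on the vertical and revenue performance on the horizontal, and the cutoffs are assumptions; in practice cutoffs might be peer medians):

```python
def spm_quadrant(economic_area_score, yoy_revenue_growth,
                 area_cutoff=0.0, growth_cutoff=0.0):
    # Map a store onto the four SPM strategy quadrants described above.
    # Axis orientation and cutoff values are illustrative assumptions.
    attractive = economic_area_score >= area_cutoff
    growing = yoy_revenue_growth >= growth_cutoff
    if attractive and growing:
        return "maximize and defend"   # top right
    if attractive:
        return "gain share"            # top left
    if growing:
        return "manage for revenue"    # bottom right
    return "manage for profit"         # bottom left
```

Two stores with the same economic area score but different growth (the store 11 vs. store 18 comparison) land in different quadrants and therefore get different strategic goals.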





Same store sales modeling can be used to operationally predict future sales but the real power is to provide tools to understand what drives revenue. If, as is usually the case, these drivers are different in terms of a region (or a segment) then it becomes more critical to find how they differ. And more importantly, the ability to drill down to the store level, and find which stores are performing optimally, is critical for YOY success.





Mike Grigsby has been in marketing analytics for nearly three decades. He worked in CRM / database marketing at Dell, HP, Sprint and the Gap and is now a marketing science consultant at Targetbase. His PhD is in marketing science and he has taught marketing analytics at UTD, UD, and St. Edwards. He has published in both academic and trade journals and led seminars at DMA, NCDM, etc. He is the author of MARKETING ANALYTICS, and his second book, ADVANCED CUSTOMER ANALYTICS, comes out in October 2016. Link to him on LinkedIn, follow on Twitter, or read the blog at






OK, in marketing the customer is king.  We all know that.  If marketing is not customer-centric it probably is NOT really marketing.  We all know that.

Or do we?

Why such a focus on competitive behavior?  I know John Nash just died and A Beautiful Mind was a great book (and a less than great movie) and Game Theory is very cool, but is it talked about in board rooms?  No.  I have never heard a CEO lean toward his CMO and ask, “Do you think our competition is doing prisoner’s dilemma?”  Yet a lot of attention is paid to competition, to the distraction of focusing on consumer behavior.

I have sat through many seminars and presentations on Game Theory and even been asked to teach a class on it.  While it seems important, and is certainly mathematically rigorous, what does it get us?  To me it functions more as an academic construct than an actionable insight.  Much like Michael Porter’s competitive intensity: have you ever used, or seen quantified, competitive rivalry?  Has there been a model quantifying the bargaining power of suppliers and buyers, the threat of substitutes and new entrants?  It functions as an abstract talking point, like debating the number of angels dancing on the head of a pin.

That’s why I posit a knowledge of customer behavior over a knowledge of Game Theory.  Indeed, I suggest that a knowledge of the analytics around customer behavior is a substitute for Game Theory.  I can hear the gasps.

Stephan Sorger’s excellent Marketing Analytics has a brief description of competitive moves, both offensive and defensive. Below are summaries of each move, applied via consumer behavior.  This can serve as a thumbnail sketch of what I have in mind.

Defensive Reactions to Competitor Moves:

Bypass Attack (the attacking firm expands into one of our product areas) and the correct counter is for us to constantly explore new areas.  Remember Theodore Levitt’s Marketing Myopia? If not, re-read it, you know you had to in school.

Encirclement Attack (the attacking firm tries to overpower us with larger forces) and the correct counter is to message how our products are superior / unique and of more value. This requires constant monitoring of message effectiveness.

Flank Attack (the attacking firm tries to exploit our weaknesses) and the correct counter is to not have any weaknesses. This again requires monitoring and messaging the uniqueness / value of our products.

Frontal Attack (the attacking firm aims at our strength) and the correct counter is to attack back in the attacker’s own territory. Obviously this is a rarely used technique.

Offensive Actions:

New Market Segments: this uses behavioral segmentation (see the latter chapters on segmentation) and incents consumer behavior for a win-win relationship.

Go-to-Market Approaches: this learns about consumers’ preferences in terms of bundling, channels, buying plans, etc.

Differentiating functionality: this approach extends consumers’ needs by offering product and purchase combinations most compelling to potential customers.

My book, Marketing Analytics (Kogan Page, 2015) offers additional analytic techniques to quantify the causality of customer behavior.


For Immediate Release

NEW BOOK – Marketing Analytics: A Practical Guide to Real Marketing Science

New Book Reveals When Your Customers are Most Likely to Buy

Available today, Marketing Analytics arms business analysts and marketers with the understanding and techniques they need to solve real-world marketing problems, from testing campaign effectiveness and forecasting demand to employing survival analysis to determine when your customers are most likely to buy. It outlines everything practitioners need to ‘do’ marketing science by following fictional analyst Scott as he progresses through his career and makes increasingly better marketing decisions.

The author Mike Grigsby has been involved in marketing science for over 25 years. He was marketing research director at Millward Brown and has held leadership positions at Hewlett-Packard and the Gap. He now heads up the strategic retail analysis practice at Targetbase and is an adjunct professor at the University of Texas at Dallas.

Part of the new Marketing Science series by Kogan Page, which makes difficult topics accessible by grounding them in business reality, Marketing Analytics helps readers refine their marketing skills so they can compete more effectively in the marketplace. It provides insight into the power of data analytics in the context of marketing problems; explains and demonstrates marketing data modelling techniques in a practical way, illustrates how data modelling methodology can be applied to a range of practical scenarios and offers advice and step-by-step guidance for ways to solve some of the most common situations, opportunities and problems in marketing.

Dr. James Mourey, Assistant Professor of Marketing at DePaul University in Chicago, has offered advance praise, declaring, ‘For those MBAs who barely passed their quantitative marketing and statistics classes without truly understanding the content, Marketing Analytics provides everything managers and executives need to know presented as a conversation with examples to boot! You’ll definitely sound smarter in the boardroom after reading this book!’

For a review copy (ISBN 9780749474171), a by-lined article or to arrange an interview with the author, please contact Megan Mondi on +44 (0)20 7843 1952.



The Required Spiel on B-I-G D-A-T-A


Okay, this had to be done.  It’s time.

I’ve avoided it because Big Data (yes, you have to capitalize it!) is everywhere.  You can’t get away from it.  It’s in every post and every update and every blog and every article and every book and every resume and every college class anywhere you look.  It’s inescapable.  Big Data has become the Kim Kardashian of analytics.

So now it’s time to add to the fray.



So what is Big Data?  No one knows.  I’ll provide a working definition here, but it will evolve over the years.

First, Big Data is BIG

Duh.  By “Big” I mean many, many rows and many, many columns.  Note that there is no magic threshold that suddenly puts us in the “Oh my, we are now in the Big Data range!”  It’s relative.

This brings us to the second and third dimensions of Big Data, both of which involve complexity.

Second, Big Data is potentially multiple sources merged together

This dimension of Big Data came about because of the proliferation of multiple sources of data, both traditional and non-traditional.

So we have traditional data.  This means transactions from say a POS and marcomm responses.  This is what we’ve had for decades.  We also created our own data, things like time between purchases, discount rate, seasonality, click through rate, etc.

The next step was to add overlay data and marketing research data.  This was third-party demographics and / or lifestyle data merged to the customer file.  Marketing research responses could be merged to the customer file to provide things like satisfaction, awareness, competitive density, etc.

Then came the first wave of different data: web logs.  This was different and the first taste of Big Data.  It is another channel.  Merging it with customer data is a whole other process.

Now there is non-traditional data.  I’m talking about the merge-to-customer view.  In terms of social media, the merge to individual customers is a whole technology / platform issue.  But several companies have developed technologies to scrape off the customer’s id (email, link, handle, tag, etc.) and merge it with other data sources.  This is key!  This is clearly a very different kind of data, but it shows us, say, number of friends / connections, blog / post activity, sentiment, touch points, site visits, etc.

Third, Big Data is potentially multiple structures merged together

Lastly Big Data has an element of degrees of structure.  I’m talking about the very common structured data through semi-structured and all the way to unstructured data.  Structured data is the traditional codes that are expected by type and length–it is uniform. Unstructured data is everything but that.  It can include text mining from say call records and free form comments, it can also include video and audio and graphics, etc.  Big Data gets us to structure this unstructured data.

Fourth, Big Data is analytically and strategically valuable

Just to be obvious: data that is not valuable can barely be called data.  It can be called clutter or noise or trash.  But it’s true that what is trash to me might be gold to you.  Take click stream data.  That URL has a lot of stuff in it.  To the analyst what is typically of value is the page the visitor came from and is going to, how long they were there, what they clicked on, etc.  Telling me what web browser they used or whether it’s an active server page or the time to load the wire frame (all probably critically important to some geek somewhere) is of little to no value to the analyst.  So Big Data can generate a lot of stuff but there has to be a (say text mining) technique / technology to put it in a form that can be consumed.  That’s what makes it valuable–not the quantity but the quality.



Is Big Data important?  Probably.  As alluded to above, what multiple data sources can provide the marketer is insight into consumer behavior.  It’s important to the extent that it provides more touch points of the shopping and purchasing process.  To know that one segment always looks at word-of-mouth opinions and blogs for the product in question is very important.  To know that another segment reads reviews and puts a lot of attention on negative sentiment can be invaluable for marketing strategy (and PR!)

Just as click stream data provided another view of shopping and purchasing 20 years ago, Big Data adds layers of complexity.  Because consumer behavior is complex, added granularity is a benefit.  But beware of “majoring on the minors” and paralysis by analysis.



There needs to be a theory: THIS causes THAT.  An insight has to be new and provide an explanation of causality and of a type that can be acted upon.  Otherwise (no matter how BIG it is) it is meaningless.  So the only value of Big Data is that it gives us a glimpse into the consumer’s mindset, it shows us their “path to purchase.”

For analytics this means a realm of attribution modelling that places weight on each touch point, by behavioral segment.  Strategically, from a portfolio POV, it tells us that this touch point is of value to shoppers / purchasers and that one is NOT.  Therefore attention needs to be paid to those touch points (pages, sites, networks, groups, communities, stores, blogs, influencers, etc.) that are important to consumers.  The biggest difference Big Data makes is that now we have more things to look at, more complexity, and this cannot be ignored.  To pretend consumers do not travel down that path is to be foolishly simplistic.  When a three-dimensional globe is forced into two-dimensional space (from a sphere to a wall), Greenland looks to be the size of Africa.  The oversimplification creates distortion.  The same is true of consumer behavior.  The tip of the iceberg that we see is motivated by many unseen, below-the-surface causes.
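As a toy illustration of placing weight on touch points, below are two common attribution rules, linear and position-based, applied to a made-up path.  A real attribution model would estimate these weights from behavioral data, by segment, rather than assume them:

```python
# Two simple attribution rules for splitting conversion credit across a
# customer's path to purchase.  The path and the 40/20/40 split are
# illustrative assumptions, not recommendations.
path = ["display_ad", "blog_review", "email", "site_visit"]

def linear(path):
    """Equal credit to every touch point."""
    return {t: 1 / len(path) for t in path}

def position_based(path, first=0.4, last=0.4):
    """Heavier credit to the first and last touches; rest split evenly."""
    middle = (1 - first - last) / max(len(path) - 2, 1)
    return {t: (first if i == 0 else last if i == len(path) - 1 else middle)
            for i, t in enumerate(path)}

linear_weights = linear(path)          # each touch gets 0.25
position_weights = position_based(path)  # 0.4 / 0.1 / 0.1 / 0.4
```

Either way the weights sum to one conversion; the strategic question is which touch points earn the credit for each behavioral segment.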



Big Data is not going to go away.  Like the Borg, we will assimilate it, we will add its technological uniqueness to our own.  We will be better for it.

The new data does not require new analytic techniques.  The new data does not require new marketing strategies.  Marketing is still marketing and understanding and incenting and changing consumer behavior is still what marketers do.  Now–as always–size does matter, and we have more.  Enjoy!




So I was in a meeting the other day and a retail client said they wanted to do segmentation.  Now, those who know me know that that is what I LOVE to do.  I think that is often a good first step.  It is the foundation of much analytics that follow.  Remember the 4 Ps of strategic marketing?  Partition (segmentation), probe (marketing research), prioritize (rank financially) and position (compelling messaging).  Strategy starts with segmentation.

They began talking about what data they have available.  But that is NOT the right place to start.  Segmentation is a strategic, not an analytic, exercise.  Surprised to hear me say that?  Note that while segmentation is the first step in strategic marketing (see above), it is PART of strategic marketing.  That is, it starts with strategy.

Where does strategy start?  It starts with clearly defined objectives.  For segmentation to work it must start with strategy and strategy starts with clearly defined objectives.

So I asked the client what is it they wanted to do.

“Sell more stuff, man!  Make money.”  Duh.

Yeah, I get that.  Have you thought about HOW you are going to sell more stuff?  How are you going to make more money?


Sure, I can take all their data (demographics, transactions, attitudes / lifestyle, loyalty, marcom, etc.) and throw it into some algorithm–I like latent class myself–and out will pop a statistically valid (within the confines of the algorithm) segmentation solution.  That will be acceptable analytically, but: IT WON’T WORK.  It does not solve anything, it does not give levers for a solution because the solution was not inherent in the design.

For example, a recent telecom client had a problem with churn (attrition).  They needed a list of who is most likely to churn in the next 60 days so they could intervene and try to slow down / stop the churn.

The solution was to segment based on reasons to churn.  We brainstormed about what causes churn: high bills, high usage of data / minutes, dropped calls, etc.  Then we collected data on the causes of churn and segmented based on that data.  We came up with a segment that was sensitive to price and churned because of high bills.  Another segment was sensitive to dropped calls and churned because of an increase in dropped calls.  Then survival modeling was applied to each segment and we could produce a list of those most likely to churn and WHY they would churn.  This WHY gave the client a marketing lever to use in combating churn.  For the “sensitive to high bill” segment, those most at risk could be offered a discount.  (If a $5 discount keeps a subscriber on the system for 60 more days, it’s worth it.)  Note that the solution had marketing actions in the design.  That’s why it worked.

We did not segment based on demographics.  We did not segment based on attitudes.  But we could have.  The algorithm does not know (or care) what the data is.  The mathematics around the solution have nothing to do with what variables are used.  Analytically, a solution is a solution.  But without marketing strategy as part of the design it will not work.
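A sketch of what “segmenting on the causes” can look like at its simplest: assign each customer to the churn driver on which they stand out most, using z-scores.  All the data below is invented for illustration, and a real project would use latent class (or similar) on the full driver set rather than this two-variable rule:

```python
# Toy design-driven segmentation: group customers by their dominant
# hypothesized churn driver (high bills vs. dropped calls), not by
# demographics.  Customer records are made up for illustration.
customers = [
    {"id": 1, "avg_bill": 120, "dropped_calls": 2},
    {"id": 2, "avg_bill": 45,  "dropped_calls": 14},
    {"id": 3, "avg_bill": 130, "dropped_calls": 1},
    {"id": 4, "avg_bill": 50,  "dropped_calls": 11},
]

def zscores(values):
    """Standardize a list of values to mean 0, sd 1."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]

bill_z = zscores([c["avg_bill"] for c in customers])
drop_z = zscores([c["dropped_calls"] for c in customers])

# Each customer lands in the segment of their most extreme driver.
segments = ["price-sensitive" if b > d else "service-sensitive"
            for b, d in zip(bill_z, drop_z)]
```

The point is not the arithmetic but the design: the segment labels map directly to a marketing lever (a discount for one group, a network fix or goodwill credit for the other).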

So for the retail client, there was a conversation.  Segmentation is NOT a magic bullet that will solve all marketing problems.  But thinking will help.

So the retail client admitted they were probably discounting too much (all retailers discount too much) but they did not know how to target their discounting.  Clearly some of their customers would not buy without a discount, but some were more loyal and did not really need a discount to buy.  So one way to make more money is to not give such high discounts.  That is a marketing strategy.  If we could find groups that differ on price sensitivity, we could segment based on that.  One segment needs a discount and another segment does not.

Another way to make more money is to save on direct mail.  Some customers preferred a catalog and others did not care and were happy with email.  Direct mail is expensive, so if segmentation could find a group that required direct mail and a group that did not, clearly send a catalog to the DM group and send the email group an email.  See?

Note again that demographics, attitudes, loyalty metrics, etc. were not part of the solution because they were not part of the problem.  There could be a strategy that needs a segmentation based on loyalty, etc. but not in the current example.

So the key take away is that segmentation does NOT start with data, it starts with thinking about objectives, what marketing levers can be pulled, what problem is (specifically) being solved.  Without that you have nothing and “He who aims at nothing will hit it.”












Because pricing is one of the Four Ps of Marketing (product, price, promotion and place) it is critically important in understanding consumer behavior.  Pricing is where marketing happens, because a market is the place where buyers meet sellers.

However, how do marketers know what price to charge?  Part of it has to do with what strategy (skimming, penetration, other) they are pursuing.  But in general, too high a price and they get no sales, too low a price and they get no profit.  Typically the market is the LAST place to experiment with trial and error.

So there are several generally accepted ways to research the “right” price to charge.  A practical consideration is whether or not the product already exists in the marketplace.  If it is a new product then elasticity modeling cannot really be done.  If it exists then there are four choices: a general survey, a van Westendorp survey, conjoint analysis and elasticity modeling.

This post favors (for an existing product) elasticity modeling because it is real responses to real price changes in an economic environment.  If it is a new product the least favorable choice is a general survey.  Each of these methods will be briefly discussed.



The first and simplest solution is to ask customers what they think.  This requires taking a random sample and asking “Are our prices too high?”

There is a lot of thought put behind this, to seek granularity, but generally marketers want to know if customers think their prices are too high.  The probing can and should be aimed at different segments (high-volume or low-volume users, new or established customers, a particular product, geographically dispersed, etc.)

But the overwhelming answer customers give to the question “Are our prices too high?” is “Yes!  Your prices are too high.”  It is self-reported and self-serving.  Money was wasted on a survey.  So what usually happens in a large corporation is that many creative people will slice-and-dice until they find the answers they want, some “segments” or cohorts, etc., where prices are reported as NOT too high.  That is, customers who have been on the database longer than three years, who have bought more than $450 of product X, and who reside in the northeast reported, “No, your prices are competitive!”  This is just window dressing and not analytic.

Thus, for an existing product, a general survey among current customers offers no real insights.

This post advocates NOT using a survey for an existing product.  The only case for a survey is a new product, and this too has pitfalls.  Remember that Chrysler used marketing research and asked potential customers how likely they would be to buy a minivan, a very new concept.  These customers had no experience with it and indicated lackluster demand.  Iacocca ignored the research and built the minivan anyway, saving the company.  So in short, a general marketing research survey has little value except as gee-whiz info.



A second common option is the van Westendorp survey.  (Those who use it do not call it a survey but a Price Sensitivity Analysis (PSA).)  But it really is a survey.  It takes a random sample of customers and asks them questions.  It is usually a “tracking” study, so as to gauge movement in price sensitivity over time.  In general the point is to find out what prices are considered too high or too low (again, self-reported).  These results are graphed onto a “Price Map”.

Customers are asked four questions:

  • At what price would you consider the product/service to be priced so low that you feel that the quality can’t be very good?
  • At what price would you consider this product/service to be a bargain—a great buy for the money?
  • At what price would you say this product/service is starting to get expensive—it’s not out of the question, but you’d have to give some thought to buying it?
  • At what price would you consider the product/service to be so expensive that you would not consider buying it?


Usually question 1 (too cheap) and question 4 (too expensive) provide the primary curves.  The intersection of these two is meant to reveal the optimal price, in the below case just over $19.


price   too cheap   too expensive
$5      100%        0%
$6      100%        0%
$7      100%        0%
$8      100%        0%
$9      100%        0%
$10     100%        6%
$11     100%        6%
$12     100%        6%
$13     100%        13%
$14     100%        19%
$15     94%         44%
$16     88%         50%
$18     81%         56%
$19     69%         63%
$20     56%         69%
$21     44%         75%
$22     38%         81%
$25     25%         88%
$30     19%         94%
$35     13%         100%



The “optimal” price is a bit debatable.  The idea though is to find the price that roughly equal shares of customers call too cheap and too expensive.  In the table above the two curves cross between $19 and $20, with both shares in the mid-60% range.  That is, to extract maximum value from customers, that price is seen as simultaneously too cheap and too expensive, hence the “optimal” price.
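The crossing point can also be read off the table with a short script.  This is a minimal sketch: the curve values are taken from the table above, and the crossing is found by linear interpolation between surveyed price points:

```python
# Van Westendorp price map: locate where the "too cheap" curve crosses the
# "too expensive" curve, interpolating linearly between surveyed prices.
# Values below are the percentages from the table above.
prices = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 25, 30, 35]
too_cheap = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
             94, 88, 81, 69, 56, 44, 38, 25, 19, 13]
too_expensive = [0, 0, 0, 0, 0, 6, 6, 6, 13, 19,
                 44, 50, 56, 63, 69, 75, 81, 88, 94, 100]

def crossing_price(prices, cheap, expensive):
    """Interpolate the price at which % too cheap equals % too expensive."""
    for i in range(len(prices) - 1):
        d0 = cheap[i] - expensive[i]
        d1 = cheap[i + 1] - expensive[i + 1]
        if d0 >= 0 and d1 < 0:  # the curves cross inside this interval
            return prices[i] + d0 / (d0 - d1) * (prices[i + 1] - prices[i])
    return None

optimum = crossing_price(prices, too_cheap, too_expensive)
```

On these numbers the interpolated crossing lands a little above $19; where exactly one calls the “optimal” price depends on the interpolation and rounding conventions used when the curves are graphed.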

Conjoint (considered jointly) is a powerful technique favored primarily by marketing researchers.  There are dozens of books detailing all the cool types and techniques of conjoint.

To elaborate the last point, conjoint serves an important purpose, especially in marketing research and especially in product design (before the product is introduced).  My main problem with surveys overall is that they are self-reported and artificial.  Conjoint sets up a contrived situation for each respondent (customer) and asks them to make choices, typically framed as purchasing a product.  You know I’m an econ guy, and these customers are not really purchasing.  They are not weighing real choices.  They are not using their own money.  They are not buying products in a real economic arena.  This artificiality is why I do not advocate conjoint for much else other than new product design.  That is, if you have real data, use it; if you need (potential) customers’ input in designing a new product, use conjoint for that.  Also, please recognize that conjoint analysis is not actually an “analysis” (like regression, etc.) but a framework for parsing out simultaneous choices.  Conjoint means “considered jointly”.

The general process of conjoint is to design choices, depending on what is being studied.  Marketing researchers are trying to understand what attributes (independent variables) are more / less important in terms of customers purchasing a product.  So a collection of experiments is designed to ask customers how they’d rate (how likely they would be to purchase) given varying product attributes.

In terms of say PC manufacturing, choice 1 might be: $800 PC, 17 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc.  Choice 2 might be: $850 PC, 19 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc.  There are enough choices designed to show each customer in order to calculate “part-worths” that show how much they value different product attributes.  This is supposed to give marketers and product designers an indication of market size and optimal design for the new product.

Note that it is important to design the types and number of levels of each attribute so that the independent variables are orthogonal (not correlated) to each other.  These choice design characteristics are critical to the process.  At the end an ordinary regression is used to optimally calculate the value of part-worths.  It is this estimated value that makes conjoint strategically useful.

Note that the idea is to present responders with choices (designed so that they are random and orthogonal) and the responders rank these choices.  The choice rankings are a responder’s judgment about the “value” (economists call it utility) of the product or service evaluated.  It is assumed that this total value is broken down into the attributes that make up the choices.  These attributes are the independent variables and these are the part-worths of the model.  That is:

Ui = X11 + X12 + X21 + X22 + … + Xmn

where Ui = total worth for product / service and

X11 = part-worth estimate for level 1 of attribute 1

X12  = part-worth estimate for level 1 of attribute 2

X21 = part-worth estimate for level 2 of attribute 1

X22 = part-worth estimate for level 2 of attribute 2

Xmn = part-worth estimate for level m of attribute n.
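As an illustration of how part-worths fall out of an ordinary regression, here is a toy sketch with two attributes at two levels each, dummy-coded against a base configuration.  The design matrix and the ratings are made up for illustration, not from any real study:

```python
import numpy as np

# Hypothetical conjoint design: 2 attributes with 2 levels each
# (price: $800 base / $850; monitor: 17" base / 19"), dummy-coded.
# Columns: intercept, price=$850, monitor=19".
X = np.array([
    [1, 0, 0],   # $800, 17"  (base configuration)
    [1, 0, 1],   # $800, 19"
    [1, 1, 0],   # $850, 17"
    [1, 1, 1],   # $850, 19"
], dtype=float)

# One respondent's made-up purchase-likelihood ratings for the 4 choices.
ratings = np.array([7.0, 8.5, 6.0, 7.4])

# Ordinary least squares recovers the part-worths.
partworths, *_ = np.linalg.lstsq(X, ratings, rcond=None)
```

The fitted coefficients are the part-worths: on these made-up ratings, the $850 price level lowers the predicted rating while the 19" monitor raises it, relative to the base configuration.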


Conjoint is not appropriate in the way it is usually used, especially in terms of pricing, except, as mentioned, for a new product, where there is no real data.  For an existing product, it is possible to design a conjoint analysis and put price levels in as choice variables.  Marketing researchers tell me that this price variable derives an elasticity function.  I disagree, for the following reasons: 1) Those estimates are NOT real economic data.  They are contrived and artificial.  2) The sample they are derived from is too small to base real corporate strategic choices on.  3) The data is self-reported.  Those respondents are not spending their own money in a real economic arena purchasing real products.  4) Using real data is far superior to using conjoint data.  Have I said this enough yet?  Ok, the rant will now stop.



Let’s go back to microeconomics 101: price elasticity is the metric that measures the change in an output variable (typically units) resulting from a change in an input variable, in this case price.  This change is usually calculated as a “pure number” without dimensions.  It is a marginal function over an average function, that is:

mathematically dQ/dP * P / Q, or statistically β * P / Q

where P and Q are average price and average units.

If the elasticity is > |1.00|, demand is called elastic.  If it is < |1.00|, demand is called inelastic.  These are unfortunate terms, as they nearly hide the real meaning.  The clear concept is one of sensitivity.  That is, how sensitive are the units customers purchase to a change in price?  If there is a 10% increase in price and customers respond by purchasing less than 10% fewer units, they are clearly insensitive to price.  If there is a 10% increase in price and customers respond by purchasing more than 10% fewer units, they are sensitive to price.

But this is not the key point, at least in terms of marketing strategy.  The law of demand is that price and units are inversely correlated (remember the downward sloping demand curve?)  Units will always go the opposite direction of a price change.  But the real issue is what happens to revenue.  Since revenue is price * units, if demand is inelastic, revenue will follow the price direction.  If demand is elastic revenue will follow unit direction.  Thus, to increase revenue in an inelastic demand curve, price should increase.  To increase revenue in an elastic demand curve, price should decrease.



INELASTIC (0.075): increase price by 10.0%
    p1   $10.00    p2   $11.00    +10.0%
    u1   1,000     u2   993       -0.75%
    tr1  $10,000   tr2  $10,918   +9.2%

ELASTIC (1.250): increase price by 10.0%
    p1   $10.00    p2   $11.00    +10.0%
    u1   1,000     u2   875       -12.50%
    tr1  $10,000   tr2  $9,625    -3.8%


See the table above.  There are two kinds of demand: inelastic (0.075) and elastic (1.250).  In the inelastic case, we increase price (p1) 10% from $10.00 to $11.00 (p2).  Units decrease (because of the law of demand) from u1 at 1,000 to u2 at 993 (a 0.75% decrease; 992.5 before rounding).  Now see that total revenue goes from tr1 at $10,000 ($10.00 * 1,000) to tr2 at $10,918 ($11.00 * 992.5).  This was an inelastic demand curve, and when price increased, units decreased but total revenue increased.

Now the opposite happens for an elastic demand curve.  p1 = $10.00 and p2 = $11.00, but while u1 starts at 1,000 units, the 12.5% decrease sends u2 to 875.  Now see that total revenue falls from tr1 at $10,000 to tr2 at $9,625.  This means that in order to raise total revenue, prices must be increased on an inelastic demand curve but decreased on an elastic demand curve.  This also means that a marketer does not know which way to move price unless they do elasticity modeling.  See?  Wasn’t that fun?
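The arithmetic of the example can be reproduced in a few lines.  This is a sketch using the same elasticities and starting price / units as the table:

```python
# Revenue effect of a price change under inelastic vs. elastic demand.
def revenue_after_price_change(price, units, elasticity, pct_price_change):
    """Apply the elasticity to get the unit change, then recompute revenue."""
    pct_unit_change = -elasticity * pct_price_change  # law of demand
    new_price = price * (1 + pct_price_change)
    new_units = units * (1 + pct_unit_change)
    return new_price * new_units

# Same 10% price increase, two kinds of demand (from the table above):
tr_inelastic = revenue_after_price_change(10.00, 1000, 0.075, 0.10)
tr_elastic = revenue_after_price_change(10.00, 1000, 1.250, 0.10)
```

The inelastic case gains revenue from the price increase (about $10,918 after rounding the units) while the elastic case loses it, falling to $9,625.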

A quick note on a mathematically correct but practically incorrect approach: modeling elasticity in logs.  It’s true that if the natural log is taken of both demand and price, the beta coefficient is itself the elasticity, with no calculation at the means required.  However, and this is important, running a model in natural logs also imposes a very wrong assumption: constant elasticity.  That means the same proportional impact at every price level, from a small price change as from a large one, and no marketer believes that.  Thus, modeling in natural logs is never recommended.  No other analytic technique gives these insights except elasticity modeling.



A couple of obvious points: I would clearly recommend using real data on real customers responding to real price changes.  This is the operating economic environment.  That is, for an existing product, use the database of customers’ behavior in purchasing products.  The strategic insights this generates will help save margin and increase total revenue.  For a new product, a general survey is the worst choice; use either conjoint or a van Westendorp survey.

Price sensitivity is a key concept in economics and marketing.  Elasticity modeling is hardly ever done, but it should be investigated more often.  The strategic insights gathered from elasticity modeling are worth that investigation.



Ok, prepare for a rant or two.

First, REPLY ALL.  If I ever find out who designed / enabled the easy to find and use REPLY ALL button I will go to their house and run over them with my truck.  Then back up and run over them again.

It should not be an option, probably ever.  It should not be an easy-to-find, easy-to-use option, probably ever.  Face it, do you really ever need to reply all?  Sure, maybe, once in a while.  But if that button were hard to find you would discover that you really don’t need to fill up everyone’s mailboxes with all kinds of stuff, relevant or not.

I despise it when our team sends little jokes, comments, funny pictures and videos back and forth.  I mean, if a couple of folks are having a ha-ha-ha conversation (“Oh yeah?”  “Sez you!”  “Yo mama!”) and sending it out to the entire 50-member group, with sizzling comments and cute pictures, that easily adds up to hundreds of emails, 99% of which are immediately deleted and 89% not even read.  (I have the statistics.)  Come on, people!

This week I got over 575 emails; over 300 were reply-all conversations back and forth.  Only 5 or 6 were relevant to me, ones I needed to actually read.  I now have a rule that sends any picture or video attachment straight to the junk box.

I’m the guy that now deletes any email not sent directly to me.  If you send it to a group you did not send it to me.  You do not need for me to specifically read it, you sent it to many people.  Any of them can read it and if I need to know something, I will find out.  That is, if you want me to read it, and if you want me to get the info, send it to me.  A little extreme?  Perhaps, but things are getting out of hand.

Remember that 1960s musical, “How to Succeed in Business Without Really Trying”?  Here’s an early conversation between two corporate types.

Guy X “Did you get my memo?”

Guy Y “What memo?”

Guy X “What memo?  My Memo about memos.  We’re sending out too many memos and it’s gotta stop!”

Guy Y “Okay.  I’ll send out a memo.”

Funny, yeah.  But not so funny.  So, stop the madness.  Be that guy in your group that says STOP reply all.

Second, elevator etiquette.  Look, everyone knows it: when the doors start to close, you folks outside the elevator stand back and do not attempt to get inside.  That’s a universal rule.  Breaking it will stop the doors, and for your safety the doors will slowly open, let you in, and after that start closing again.  And probably stop again because some other jackass sticks his hand in to get on.  Again.  Groan.

Yesterday I got on the elevator by myself and the doors started to close.  When they had just about touched, a slender hand slunk in and opened them up.  A Barbie wanna-be smiled at me, said, “Sorry,” and shrugged as the doors started to close.  Just before they closed I stuck my hand out and stopped them.  I walked just outside the doors and looked at her, and as the doors began to close again I stuck my arm inside, stopped the doors and got back inside.  “Sorry,” I smiled.  She did not smile at me as the doors closed again.

So yes, there are jackasses out there like me.  But you don’t know which side of the elevators doors we might be on.  So therefore everyone please remember the rule:  when the doors start to close let them close.  Leave them alone.  Wait for the next elevator.  What are you in such a rush for anyway?  To read that funny email and reply all?  Jeez!

Third, come to meetings on time.  It’s not that hard.  It’s okay to even arrive a minute or two early.  I know what you’re trying to demonstrate: that you are so busy and so important that you can only run from one meeting to another and only get there after it starts.  Again and again.  Day after day.  So we who are already there have to sit around and chit-chat (watch the game last night? what is Paris Hilton doing now? see the YouTube video?).  Or, if we go ahead and start without you, we will have to back up and start again when you arrive.  You have wasted everyone’s time.  If there are 9 people in the meeting and you are 5 minutes late, that is 45 man-minutes spent because of you.  Again and again.  Day after day.

Now I know sometimes there is just no other way.  You really do have back-to-backs and you cannot get to the next one early or on time.  But if it happens every day, several times a day (you know who you are), it is really just discourteous and disrespectful, and no one buys that crap about you being so busy and so important.  A 10 am meeting means it starts at 10 am, because people have begun to arrive a couple of minutes before 10 am.  It does not mean that 10 am is when people start to arrive and 10:05 or 10:10 is when it actually starts.  Because that will likely push the meeting past its 11 am ending time.

It’s like the speed limit sign.  When the sign says 55 mph speed limit, that is not a lower limit, but an upper limit.  That is not a minimum but a maximum.  Read the fine print.  When you are invited to a 10 am meeting it’s okay to be there one minute before, be prepared, and contribute and all will be well.  We will be much more productive.

Okay, the rants are over.  For now.





Life-Time Value is typically done as just a calculation, using past (historical) data.  That is, it’s only descriptive.

While there are many versions of LTV (depending on data, industry, interest, etc.) the following is conceptually applied to all.  LTV, via descriptive analysis:

1) Uses historical data to sum each customer’s total revenue.

2) Subtracts costs from this sum: typically cost to serve, cost to market, maybe cost of goods sold, etc.

3) Converts this net revenue into an annual average amount, depicted as a cash flow.

4) Assumes these cash flows continue into the future and diminish over time (depending on durability, sales cycle, etc.), often decreasing arbitrarily by, say, 10% each year until they are effectively zero.

5) Sums these (future, diminished) cash flows and discounts them (usually by the weighted average cost of capital) to get their net present value.

6) Calls this NPV the LTV.  The calculation is applied to each customer.
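The six steps above can be sketched in a few lines of Python.  This is a minimal illustration, not a standard implementation: the 10% decay and the 9% WACC are the arbitrary assumptions the text mentions, and the revenue and cost figures are the document's own Table 1 numbers.

```python
def descriptive_ltv(total_revenue, total_costs, years=5,
                    decay=0.10, wacc=0.09):
    """Steps 1-6: net revenue -> decayed future cash flows -> NPV (= LTV)."""
    net = total_revenue - total_costs                    # steps 1-2
    ltv = 0.0
    for year in range(1, years + 1):
        cash_flow = net * (1 - decay) ** (year - 1)      # step 4: arbitrary decay
        ltv += cash_flow / (1 + wacc) ** year            # step 5: discount by WACC
    return ltv                                           # step 6: NPV is the LTV

# Customer XXX from Table 1 below: $43,958 revenue, $7,296 costs,
# two identical years, no decay -- roughly $64,492.
base_case = descriptive_ltv(43958, 7296, years=2, decay=0.0)
```

Note that every customer gets the same decay and discount assumptions, which is exactly the "uniform assumptions about the future" weakness discussed next.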

Thus each customer has a value associated with them.  The typical use is for marketers to find the “high-valued” customers (based on past purchases).  These high-valued customers get most of the communications, promotions / discounts, marketing efforts, etc.  Descriptive analysis is merely about targeting those already engaged (much like RFM).

This seems to be a good starting point but, as is usual with descriptive analysis, contributes nothing about WHY.  Why is one customer more valuable than another, and will they continue to be?  Is it possible to extract additional value, and at what cost?  Is it possible to garner more revenue from a lower-valued customer because they are more loyal or cost less to serve?  What part of the marketing mix is each customer most sensitive to?  LTV (as described above) gives no implications for strategy.  The only strategy is to offer and promote to the high-valued customers.



How would LTV change using predictive analysis instead of descriptive analysis?  First note that while LTV is a future-oriented metric, descriptive analysis uses historical (past) data and the entire metric is built on that, with assumptions about the future applied uniformly to every customer.  Prediction specifically thrusts LTV into the future (where it belongs) by using independent variables to predict the next time until purchase.  Since the major customer behaviors driving LTV are the timing, amount, and number of purchases, a statistical technique is needed that predicts time until an event.  (Ordinary regression predicting the LTV amount ignores the timing and number of purchases.)

Survival analysis is a technique designed specifically to study time-until-event problems.  It has timing built into it, and thus a future view is already embedded in the algorithm.  This removes much of the arbitrariness of typical (descriptive) LTV calculations.

So, what about using survival analysis to see which independent variables, say, pull a purchase in sooner?  Decreasing time until purchase tends to increase LTV.  While survival analysis can predict the next time until purchase, its strategic value is in using the independent variables to CHANGE the timing of purchases.  That is, descriptive analysis shows what happened; predictive analysis gives a glimpse of what might CHANGE the future.

Strategy using LTV dictates understanding the causes of customer value: why a customer purchases, what increases / decreases the time until purchase, the probability of purchasing at future times, etc.  Once these insights are learned, marketing levers (shown as independent variables) are exploited to extract additional value from each customer.  This means knowing that one customer is, say, sensitive to price and that a discount will tend to decrease their time until purchase.  That is, they will purchase sooner (and maybe purchase larger total amounts and more often) with a discount.  Another customer prefers, say, products X and Y bundled together; this bundling increases the probability of purchase and decreases their time until purchase.  Such insight allows different strategies for different customer needs, sensitivities, etc.  Survival analysis applied to each customer yields insights to understand and incent changes in behavior.

This means it is no longer necessary to just assume past behavior will continue into the future (as descriptive analysis does) with no idea why.  It’s possible for descriptive and predictive analysis to give contradictory answers, which is why “crawling” might be detrimental to “walking”.

If a firm can get a customer to purchase sooner, there is an increased chance of adding purchases, depending on the product.  But even if the number of purchases does not increase, getting revenue sooner adds to the firm’s financial value (time is money).

Also, a business case can be created by showing the trade-off of giving up, say, margin but obtaining revenue faster.  This means strategy can revolve around balancing costs against customer value.

The idea is to model next time until purchase, the baseline, and see how to improve that.  How is this carried out?  A behaviorally based method would be to segment the customers (based on behavior), apply a survival model to each segment, and score each individual customer.  By behavior is typically meant purchasing metrics (amount, timing, share of products, etc.) and marcom responses (opens and clicks, direct mail coupons, etc.).
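The segment-then-score idea can be sketched as follows.  This is a toy accelerated-failure-time (AFT) style model with invented coefficients; in practice the coefficients would come from fitting a survival model (e.g., Cox or a parametric AFT model) to each segment's purchase history, and the segment names and levers here are hypothetical.

```python
import math

# Hypothetical per-segment coefficients:
# log(expected days to next purchase) = b0 + sum(coefficient * lever value).
# Negative coefficients mean the lever pulls the next purchase in sooner.
SEGMENT_MODELS = {
    "discount-sensitive": {"b0": math.log(88), "discount": -0.17, "bundle": -0.05},
    "loyal":              {"b0": math.log(58), "discount": -0.02, "bundle":  0.03},
}

def expected_days_to_purchase(segment, levers):
    """Score one customer: predicted time-to-event under the given levers."""
    m = SEGMENT_MODELS[segment]
    z = m["b0"] + sum(m[name] * value for name, value in levers.items())
    return math.exp(z)

baseline = expected_days_to_purchase("discount-sensitive", {})      # ~88 days
with_discount = expected_days_to_purchase("discount-sensitive",
                                          {"discount": 1})          # fewer days
```

The point is the structure, not the numbers: each segment gets its own fitted model, and the same lever can shift the time-to-event very differently across segments.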



Let’s use an example.  Table 1 shows two customers from two different behavioral segments.  Customer XXX purchases every 88 days with annual revenue of $43,958 and costs of $7,296, for net revenue of $36,662.  Say the second year is exactly the same.  So year 1 discounted at 9% gives an NPV of $33,635, and year 2 discounted at 9% for two years gives $30,857, for a total LTV of $64,492.  The same calculations give customer YYY an LTV of $87,898.

Table 1
Customer | TTE (days) | Purchases / yr | Annual revenue | Costs | Net revenue (yr 1) | Net revenue (yr 2) | NPV (yr 1) | NPV (yr 2) | LTV
XXX | 88 | 4.148 | $43,958 | $7,296 | $36,662 | $36,662 | $33,635 | $30,857 | $64,492
YYY | 58 | 6.293 | $62,289 | $12,322 | $49,967 | $49,967 | $45,842 | $42,056 | $87,898


The above (using descriptive analysis) would have marketers targeting customer YYY, worth more than $23,000 over customer XXX.  But do we know anything about WHY customer XXX is so much lower valued?  Is there anything that can be done to make them higher valued?

Applying a survival model to each segment outputs independent variables and shows their effect on the dependent variable.  In this case the dependent variable is (average) time until purchase.  Say the independent variables (which defined the behavioral segments) are things like price discounts, product bundling, seasonal messages, adding additional direct mail catalogs, offering online exclusives, etc.  The segmentation should separate customers based on behavior and the survival models should show how different levels of independent variables drive different strategies.

Table 2 below shows the results of survival modeling on the two customers, who come from two different segments.  The independent variables are a 10% price discount, product bundling, etc.  TTE is time to event, and the table shows what happens to time until purchase when one of the independent variables is changed.  For example, for customer XXX, a 10% price discount on average decreases their time until purchase by 14 days.  Giving YYY a 10% discount decreases their time until purchase by only 2 days.  This means XXX is far more sensitive to price than YYY, which would not be known from descriptive analysis alone.  Likewise, giving XXX more direct mail catalogs pushes out their TTE but pulls YYY’s in by 2 days.  Note also that the marketing levers have little effect on YYY.  We are already getting nearly all we can from YYY; no marketing effort does much to impact their TTE.  With XXX, however, there are several things that can be done to bring in their purchases.  Again, none of this would be known without survival modeling on each behavioral segment.


Table 2: change in TTE (days until purchase)
Marketing lever | XXX | YYY
Price discount 10% | -14 | -2
Product bundling | -4 | +12
Seasonal message | +6 | +21
5 more catalogs | +11 | -2
Online exclusive | -11 | +3
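Targeting logic falls straight out of Table 2's lever effects.  A minimal sketch, using the table's own numbers (the `helpful_levers` helper is hypothetical, just to show the idea):

```python
# Change in time-until-event (days) per marketing lever, from Table 2.
# Negative values pull the next purchase in sooner.
LEVER_EFFECTS = {
    "price discount 10%": {"XXX": -14, "YYY": -2},
    "product bundling":   {"XXX":  -4, "YYY": 12},
    "seasonal message":   {"XXX":   6, "YYY": 21},
    "5 more catalogs":    {"XXX":  11, "YYY": -2},
    "online exclusive":   {"XXX": -11, "YYY":  3},
}

def helpful_levers(customer):
    """Levers that decrease this customer's time until purchase."""
    return [name for name, effect in LEVER_EFFECTS.items()
            if effect[customer] < 0]
```

For XXX three levers pull purchases in (a possible 29-day reduction in total); for YYY only two weak ones do, matching the text's observation that little can move YYY.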


Table 3 below shows new LTV calculations for XXX after using the survival modeling results.  We decreased TTE by 24 days using some combination of discounts, bundling, online exclusives, etc.  Note that the LTV for XXX (after using predictive analysis) is now greater than YYY’s.


Table 3
Customer | TTE (days) | Purchases / yr | Annual revenue | Costs | Net revenue (yr 1) | Net revenue (yr 2) | NPV (yr 1) | NPV (yr 2) | LTV
XXX | 64 | 5.703 | $60,442 | $10,032 | $50,410 | $50,410 | $46,248 | $42,429 | $88,677
YYY | 58 | 6.293 | $62,289 | $12,322 | $49,967 | $49,967 | $45,842 | $42,056 | $87,898
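Table 3's XXX row can be reproduced from Table 1's implied average order value.  The average order value is an inference from the stated figures (revenue divided by purchases per year), not something the post states directly:

```python
wacc = 0.09
aov = 43958 / (365 / 88)        # implied average order value, ~$10,598

tte = 64                        # TTE pulled in by 24 days (88 -> 64)
revenue = aov * (365 / tte)     # ~$60,442 annual revenue
net = revenue - 10032           # new, higher cost to market (the levers cost money)
ltv = net / (1 + wacc) + net / (1 + wacc) ** 2   # ~$88,677 over two years
```

In other words, the same customer spending the same amount per order, just purchasing sooner, is now worth more than YYY despite the higher marketing cost.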


What survival analysis offers, in addition to marketing strategy levers, is a financially optimal scenario, particularly in terms of cost to market.  That is, customer XXX responds to a discount.  It’s possible to calculate and test the minimum discount threshold needed to bring a purchase in by so many days at the estimated level of revenue.  This ends up being a cost / benefit analysis that makes marketers think about strategy.  That is the advantage of predictive analysis: giving marketers strategic options.




What is a Market Basket?

In economics, a market basket is a fixed collection of items that consumers buy.  It is used for metrics like the CPI (inflation), etc.  In marketing, a market basket is any two or more items bought together.

Market basket analysis is used, especially in retail / CPG, to bundle and offer promotions and gain insight in shopping / purchasing patterns.  “Market basket analysis” does not, by itself, describe HOW the analysis is done.  That is, there is no associated technique with those words.

How is it usually done?

There are three general uses of data: descriptive, predictive, and prescriptive.  Descriptive is about the past; predictive uses statistical analysis to calculate the change in an output variable (e.g., sales) given a change in an input variable (say, price); and prescriptive is a system that tries to optimize some metric (typically profit, etc.).  Descriptive data (means, frequencies, KPIs, etc.) is a necessary but not usually a sufficient step.  Always get to at least the predictive step as soon as possible.  Note that predictive here does not necessarily mean forecasted into the future.  Structural analysis uses models to simulate the market and estimate (predict) what causes what to happen.  That is, using regression, given a change in price, what is the estimated (predicted) change in sales?

Market basket analysis often uses descriptive techniques.  Sometimes it is just a “report” of what percent of items are purchased together.  Affinity analysis (a step above) is mathematical, not statistical.  It simply calculates the percent of the time combinations of products are purchased together.  Obviously there is no probability involved.  It is concerned with the rate at which products are purchased together, not with a distribution around that association.  It is very common and very useful, but NOT predictive, and therefore not very actionable.
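Affinity analysis as described is just co-occurrence counting.  A minimal sketch with made-up baskets (the data and variable names are illustrative only):

```python
from collections import Counter
from itertools import combinations

# Toy transaction data: each set is one shopper's basket.
baskets = [{"x", "y"}, {"x", "y", "z"}, {"y"}, {"x", "z"}]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# "Support": the share of baskets containing each pair -- a rate, with no
# probability distribution around it, exactly as the text notes.
support = {pair: n / len(baskets) for pair, n in pair_counts.items()}
print(support[("x", "y")])   # x and y appear together in 2 of 4 baskets -> 0.5
```

Note there is no significance test anywhere in this computation, which is why the text calls it mathematical rather than statistical.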

Logistic Regression

Let’s talk about logistic regression.  This is an ancient and well-known statistical technique, probably the analytic pillar upon which database marketing has been built.  It is similar to ordinary regression in that there is a dependent variable that depends on one or more independent variables.  There is a coefficient for each independent variable (although the interpretation is not the same) and a (type of) t-test around each for significance.

The differences are that the dependent variable is binary in logistic regression (continuous in ordinary regression) and that interpreting the coefficients requires exponentiation.  Because the dependent variable is binary, the errors are heteroskedastic.  There is no (real) R-squared, and “fit” is about classification.

How to Estimate / Predict the Market Basket

The use of logistic regression in terms of market basket becomes obvious when it is understood that the predicted dependent variable is a probability.  The formula to estimate probability from logistic regression is:

P(i) = 1 / (1 + e^(-Z))

where Z = α + ΣβiXi.  This means the independent variables can be products purchased in a market basket, used to predict the likelihood of purchasing another product as the dependent variable.  Specifically, take each (major) product category (focus driven by strategy) and run a separate model for each, putting in all significant other products as independent variables.  For example, say we have only three products: x, y, and z.  The idea is to design three models and test the significance of each.  That is, using logistic regression:

x = f(y,z)

y = f(x,z)

z = f(x,y).

Of course other variables can go into the models as appropriate, but the interest is in whether or not the independent (product) variables are significant in predicting the probability of purchasing the dependent product.  After significance is achieved, the insights generated are around the sign of each independent variable, i.e., does the independent product increase or decrease the probability of purchasing the dependent product?
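The probability formula and the three toy models x = f(y, z), etc., can be sketched as follows.  The coefficients are invented for illustration; real ones would come from fitting logistic regression to transaction data.

```python
import math

def logistic_probability(alpha, betas, x):
    """P = 1 / (1 + e^(-Z)), with Z = alpha + sum(beta_i * x_i)."""
    z = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted model for product x given purchases of y and z,
# where the inputs are 0/1 bought-it indicators:
alpha, betas = -1.2, [0.8, -0.4]
p_x = logistic_probability(alpha, betas, [1, 0])   # bought y, did not buy z
```

The signs of the betas carry the insight: a positive coefficient on y means buying y raises the probability of buying x, and a negative coefficient on z means buying z lowers it.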

An Example

As a simple example, say we are analyzing a retail store, with categories of products like consumer electronics, women’s accessories, newborn and infant items, etc.  Thus, using logistic regression, a series of models should be run: one for each category, with the remaining categories as the independent variables.


This means the independent variables are binary, coded as a “1” if the customer bought that category and a “0” if not.  The table below details the output for all of the models.  Note that other independent variables can be included in the model, if significant.  These would often be seasonality, consumer confidence, promotions sent, etc.

To interpret, look at, say, the home décor model.  If a customer bought consumer electronics, that increases the probability of buying home décor by 29%.  If a customer bought newborn / infant items, that decreases the probability of buying home décor by 37%.  If a customer bought furniture, that increases the probability of buying home décor by 121%.

Each row below is one model (the dependent category); each column is an independent (product) variable.  “Insig” means not significant.

Model (dependent) | Cons. elect. | Women’s acc. | Newborn / infant | Jewelry / watches | Furniture | Home décor | Entertainment
Consumer electronics | -- | Insig | Insig | -23% | 34% | 26% | 98%
Women’s accessories | Insig | -- | 39% | 68% | 22% | 21% | Insig
Newborn / infant | Insig | 43% | -- | -11% | -21% | -31% | 29%
Jewelry / watches | -29% | 71% | -22% | -- | 12% | 24% | -11%
Furniture | 31% | 18% | -17% | 9% | -- | 115% | 37%
Home décor | 29% | 24% | -37% | 21% | 121% | -- | 31%
Entertainment | 85% | Insig | 31% | -9% | 41% | 29% | --

This has implications, especially for bundling and messaging.  That is, offering, say, home décor and furniture together makes great sense, but offering home décor and newborn / infant items does not.
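One common way such percent effects are produced from a fitted logistic model (not necessarily the exact convention used here) is the odds-ratio interpretation: 100 * (e^beta - 1) per unit of the predictor.

```python
import math

def pct_change_in_odds(beta):
    """Percent change in the odds of purchase per unit of the predictor."""
    return (math.exp(beta) - 1) * 100

# A coefficient of ln(1.29) on "bought consumer electronics" in the home
# decor model would correspond to the table's +29%:
print(round(pct_change_in_odds(math.log(1.29))))   # -> 29
```

Strictly speaking this is a change in odds, not probability; for small effects the two are close, which is why they are often quoted interchangeably in practice.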


The above detailed a simpler (and more powerful) way to do market basket analysis.  If given a choice, always go beyond mere descriptive techniques and apply predictive techniques.

See my MARKETING ANALYTICS for additional details: