All posts by mikegrigsby

I've been involved in marketing science for over 25 years. It’s the only job I've ever had. Sad but true. I guess I’m a mile deep and an inch wide. I was a statistical analyst at a water utility before moving to Sprint where I forecasted calls. At Dell my job was demand analysis, where I learned elasticity and had my first taste of direct / database marketing. I went to Millward Brown, as marketing research director. There were leadership stints at Hewlett-Packard in database marketing, online analytics at the Gap and marketing operations at Emerson. I segued into consulting at Rapp, as VP Analytics and settled at Targetbase, heading up the strategic retail analysis practice. I believe in marketing science as focused on understanding consumer behavior, that is, driving demand. My PhD dissertation was about new ways of modeling demand. I've written articles for academic and trade journals. I've taught at both the graduate and undergraduate levels. I have spoken at trade conventions and seminars (National Conference of Database Marketing, Internet Retailing, Direct Marketing Association, American Statistical Association, etc.). I think I have a general understanding of the industry and great sympathy for those actually trying to do analytics. I have developed all this as a basis for promotion. The industry needs a gentle guidebook focused on helping analysts do marketing science: pull a targeted list, provide segmentation, test campaign effectiveness, forecast demand, etc. What is mostly available are dry, scholarly references that require wading through pages of mathematical formulas. Often by then the thread of the argument is lost and frustration results. I have felt this frustration myself, which is why I started this marketing science blog.



Because pricing is one of the Four Ps of Marketing (product, price, promotion and place) it is critically important in understanding consumer behavior.  Pricing is where marketing happens, because a market is the place where buyers meet sellers.

However, how do marketers know what price to charge?  Part of it has to do with what strategy (skimming, penetrating, other) they are pursuing.  But in general, too high a price and they get no sales, too low a price and they get no profit.  Typically the market is the LAST place to experiment with trial and error.

So there are several generally accepted ways to research the “right” price to charge.  A practical first question is whether or not the product already exists in the marketplace.  If it is a new product then elasticity modeling cannot really be done, because there is no real purchase data to model.  If it exists then there are four choices: a general survey, a van Westendorp survey, conjoint analysis and elasticity modeling.

This post favors (for an existing product) elasticity modeling because it is real responses to real price changes in an economic environment.  If it is a new product the least favorable choice is a general survey.  Each of these methods will be briefly discussed.



The first and simplest solution is to ask customers what they think.  This requires taking a random sample and asking “Are our prices too high?”

There is a lot of thought put behind this, to seek granularity, but generally marketers want to know if customers think their prices are too high.  The probing can be / should be aimed at different segments (high volume or low volume users, new or established customers, a particular product, geographically dispersed, etc.).

But the overwhelming answer customers give to the question “Are our prices too high?” is “Yes!  Your prices are too high.”  It is self-reported and self-serving.  Money was wasted on a survey.  So what usually happens in a large corporation is that many creative people will try to slice-and-dice until they find some answers they want, some “segments” or cohorts, etc., where prices are reported as NOT too high.  That is, if you look at, say, customers who have been on the database longer than three years, who have bought more than $450 of product X and who reside in the northeast, they reported, “No, your prices are competitive!”  This is just window dressing and not analytic.

Thus, for an existing product, a general survey among current customers offers no real insights.

This post advocates NOT using a survey for an existing product.  The only choice is to use a survey for a new product.  This too has pitfalls.  Remember Chrysler used marketing research and asked potential customers how likely they would be to buy a minivan, a very new concept.  These customers had no experience with it and indicated lackluster demand.  Iacocca ignored the research and built the minivan anyway, saving the company.  So in short, a general marketing research survey has little value except as gee whiz info.



A second common option is the van Westendorp survey.  (Those who use it do not call it a survey but a Price Sensitivity Analysis (PSA)).  But it really is a survey.  It takes a random sample of customers and asks them questions.  It is usually a “tracking” study, so as to gauge price sensitivity movement over time.  In general the point is to find out what prices are considered too high or too low (again, self-reported).  These results are graphed onto a “Price Map”.

Customers are asked four questions:

  • At what price would you consider the product/service to be priced so low that you feel that the quality can’t be very good?
  • At what price would you consider this product/service to be a bargain—a great buy for the money?
  • At what price would you say this product/service is starting to get expensive—it’s not out of the question, but you’d have to give some thought to buying it?
  • At what price would you consider the product/service to be so expensive that you would not consider buying it?


Usually question 1 (too cheap) and question 4 (too expensive) are the primary graphs.  The intersection of these two is meant to reveal the optimal price, in the case below about $21.


Too cheap    Too expensive    Price
100%         0%               $5
100%         0%               $6
100%         0%               $7
100%         0%               $8
100%         0%               $9
100%         6%               $10
100%         6%               $11
100%         6%               $12
100%         13%              $13
100%         19%              $14
94%          44%              $15
88%          50%              $16
81%          56%              $18
69%          63%              $19
56%          69%              $20
44%          75%              $21
38%          81%              $22
25%          88%              $25
19%          94%              $30
13%          100%             $35



The “optimal” price is a bit debatable.  The idea though is that about 65% think the price of $21 is too cheap and 65% think the price of $21 is too expensive.  That is, to extract maximum value from customers, $21 is seen as simultaneously too cheap and too expensive, hence the “optimal” price.
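For readers who want to locate the crossing numerically rather than by eye, here is a minimal sketch that linearly interpolates between the tabulated price points.  The function name `crossing` and the interpolation choice are my own assumptions, not part of the van Westendorp method itself.  On the table as reproduced here the curves cross at about 65%, but at a price a little under $20 rather than the $21 read off the original graph, so treat the exact dollar figure as illustrative.

```python
# Cumulative shares from the table above: (price, too_cheap_pct, too_expensive_pct)
rows = [
    (5, 100, 0), (6, 100, 0), (7, 100, 0), (8, 100, 0), (9, 100, 0),
    (10, 100, 6), (11, 100, 6), (12, 100, 6), (13, 100, 13), (14, 100, 19),
    (15, 94, 44), (16, 88, 50), (18, 81, 56), (19, 69, 63), (20, 56, 69),
    (21, 44, 75), (22, 38, 81), (25, 25, 88), (30, 19, 94), (35, 13, 100),
]

def crossing(rows):
    """Return (price, pct) where the two curves cross, by linear interpolation."""
    for (p0, c0, e0), (p1, c1, e1) in zip(rows, rows[1:]):
        d0, d1 = c0 - e0, c1 - e1  # gap: "too cheap" minus "too expensive"
        if d0 >= 0 and d1 < 0:     # the gap changes sign inside this interval
            t = d0 / (d0 - d1)     # fraction of the way from p0 to p1
            return p0 + t * (p1 - p0), c0 + t * (c1 - c0)
    return None                    # the curves never cross

price, pct = crossing(rows)
print(f"curves cross near ${price:.2f} at about {pct:.0f}%")
```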

Conjoint (considered jointly) is a powerful technique favored primarily by marketing researchers.  There are dozens of books detailing all the cool types and techniques of conjoint.

To elaborate the last point, conjoint serves an important purpose, especially in marketing research, especially in product design (before the product is introduced).  My main problem with surveys overall is that they are self-reported and artificial.  Conjoint sets up a contrived situation for each respondent (customer) and asks them to make choices.  The customer makes choices and these choices are typically in terms of purchasing a product.  You know I’m an econ guy and these customers are not really purchasing.  They are not weighing real choices.  They are not using their own money.  They are not buying products in a real economic arena.  The artificialness is why I do not advocate conjoint for much else other than new product design.  That is, if you have real data use it; if you need (potential) customers’ input in designing a new product use conjoint for that.  Also, please recognize that conjoint analysis is not actually an “analysis” (like regression, etc.) but a framework for parsing out simultaneous choices.  Conjoint means “considered jointly”.

The general process of conjoint is to design choices, depending on what is being studied.  Marketing researchers are trying to understand what attributes (independent variables) are more / less important in terms of customers purchasing a product.  So a collection of experiments is designed to ask customers how they’d rate (how likely they would be to purchase) given varying product attributes.

In terms of say PC manufacturing, choice 1 might be: $800 PC, 17 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc.  Choice 2 might be: $850 PC, 19 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc.  There are enough choices designed to show each customer in order to calculate “part-worths” that show how much they value different product attributes.  This is supposed to give marketers and product designers an indication of market size and optimal design for the new product.

Note that it is important to design the types and number of levels of each attribute so that the independent variables are orthogonal (not correlated) to each other.  These choice design characteristics are critical to the process.  At the end an ordinary regression is used to optimally calculate the value of part-worths.  It is this estimated value that makes conjoint strategically useful.

Note that the idea is to present to responders choices (in such a way that they are random and orthogonal) and the responders rank these choices.  The choice rankings are a responder’s judgment about the “value” (economists call it utility) of the product or service evaluated.  It is assumed that this total value is broken down into the attributes that make up the choices.  These attributes are the independent variables, and their estimated contributions are the part-worths of the model.  That is:

Ui = X11 + X12 + X21 + X22 + ... + Xmn

where Ui = total worth for product / service and

X11 = part-worth estimate for level 1 of attribute 1

X12  = part-worth estimate for level 1 of attribute 2

X21 = part-worth estimate for level 2 of attribute 1

X22 = part-worth estimate for level 2 of attribute 2

Xmn = part-worth estimate for level m of attribute n.


Conjoint is not appropriate in the way it is usually used, especially in terms of pricing, except, as mentioned, for a new product: a product where there is no real data.  For an existing product, it is possible to design a conjoint analysis and put price levels in as choice variables.  Marketing researchers tell me that this price variable derives an elasticity function.  I disagree for the following reasons: 1) Those estimates are NOT real economic data.  They are contrived and artificial.  2) The sample from which they are derived is too small to support real corporate strategic choices.  3) The data is self-reported.  Those respondents are not responding with their own money in a real economic arena purchasing real products.  4) Using real data is far superior to using conjoint data.  Have I said this enough yet?  Ok, the rant will now stop.



Let’s go back to microeconomics 101: price elasticity is the metric that measures the percent change in an output variable (typically units) from a percent change in an input variable, in this case price.  This change is usually calculated as a “pure number” without dimensions.  It is a marginal function over an average function, that is:

mathematically dQ/dP * P / Q or statistically β * P / Q

where P / Q are average price over average units.
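As a rough illustration of the statistical version, the sketch below fits an ordinary regression of units on price and then evaluates β * P / Q at the means.  The five (price, units) observations are hypothetical numbers invented for this example, and `elasticity_at_means` is just an illustrative name.

```python
# Hypothetical (price, units) observations, invented purely for illustration.
data = [(8.0, 1240), (9.0, 1130), (10.0, 1010), (11.0, 890), (12.0, 770)]

def elasticity_at_means(data):
    """Point elasticity: OLS slope of units on price, times mean price / mean units."""
    n = len(data)
    p_bar = sum(p for p, _ in data) / n
    q_bar = sum(q for _, q in data) / n
    # OLS slope for a single regressor: beta = cov(P, Q) / var(P)
    cov = sum((p - p_bar) * (q - q_bar) for p, q in data)
    var = sum((p - p_bar) ** 2 for p, _ in data)
    beta = cov / var
    return beta * p_bar / q_bar

e = elasticity_at_means(data)
print(f"elasticity at the means: {e:.3f}")
```

With these made-up numbers the result is greater than 1 in absolute value, so this hypothetical demand would be called elastic.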

If the absolute value of the elasticity is > 1.00, that demand is called elastic.  If it is < 1.00, that demand is called inelastic.  These are unfortunate terms, as they nearly hide the real meaning.  The clear concept is one of sensitivity.  That is, how sensitive are customers who purchase units to a change in price?  If there is a 10% increase in price and customers respond by purchasing less than 10% fewer units, they are clearly insensitive to price.  If there is a 10% increase in price and customers respond by purchasing more than 10% fewer units, they are sensitive to price.

But this is not the key point, at least in terms of marketing strategy.  The law of demand is that price and units are inversely correlated (remember the downward sloping demand curve?)  Units will always go the opposite direction of a price change.  But the real issue is what happens to revenue.  Since revenue is price * units, if demand is inelastic, revenue will follow the price direction.  If demand is elastic revenue will follow unit direction.  Thus, to increase revenue in an inelastic demand curve, price should increase.  To increase revenue in an elastic demand curve, price should decrease.



INELAST   0.075      increase price by 10.0%
p1        $10.00     p2     $11.00     10.0%
u1        1,000      u2     993        -0.75%
tr1       $10,000    tr2    $10,918    9.2%

ELAST     1.250      increase price by 10.0%
p1        $10.00     p2     $11.00     10.0%
u1        1,000      u2     875        -12.50%
tr1       $10,000    tr2    $9,625     -3.8%


See the table above.  There are two kinds of demand: inelastic (0.075) and elastic (1.250).  In the inelastic portion, we increase price (p1)  10% from $10.00 to $11.00 (p2).  Units decrease (because of the law of demand) from u1 at 1,000 units to 993 (note a .75% decrease.)  Now see total revenue tr1 goes from $10,000 ($10.00 * 1,000) to tr2 $10,918 ($11.00 * 993).    This was an inelastic demand curve and price increased and note that while units decreased total revenue increased.

Now the opposite happens for an elastic demand curve.  p1 = $10.00 and p2 = $11.00 but while u1 starts at 1,000 units the 12.5% decrease sends u2 to 875.  Now see that total revenue falls from tr1 of $10,000 to tr2 of $9,625.  This means that in order to raise TR prices must be increased in an inelastic demand curve but decreased in an elastic demand curve.  This means that a marketer does not know which way to move price unless they do elasticity modeling.  See?  Wasn’t that fun?
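The arithmetic in the table can be reproduced with a few lines.  This is only a sketch: the function name and argument names are mine, and elasticity is entered as an absolute value, as in the table.

```python
def revenue_after_price_change(p1, u1, elasticity, pct_price_change):
    """Apply a price change with an (absolute) elasticity; return (p2, u2, tr1, tr2)."""
    p2 = p1 * (1 + pct_price_change)
    # Law of demand: units move opposite to price, scaled by the elasticity
    u2 = u1 * (1 - elasticity * pct_price_change)
    return p2, u2, p1 * u1, p2 * u2

for label, e in [("INELAST", 0.075), ("ELAST", 1.250)]:
    p2, u2, tr1, tr2 = revenue_after_price_change(10.00, 1000, e, 0.10)
    change = tr2 / tr1 - 1
    print(f"{label}: p2=${p2:.2f} u2={u2:.1f} tr1=${tr1:,.0f} tr2=${tr2:,.1f} ({change:+.1%})")
```

The table rounds units to whole numbers (992.5 becomes 993), so its tr2 differs from the unrounded figure by a few cents.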

A quick note on a mathematically correct but practically incorrect concept: modeling elasticity in logs.  It’s true that if the natural log is taken of both demand and price, there is no calculation at the means: the beta coefficient is the elasticity.  However, and this is important, running a model in natural logs also implies a very wrong assumption: constant elasticity.  This means there is the same impact at a small price change as at a large price change and no marketer believes that.  Thus, modeling in natural logs is never recommended.  No other analytic technique gives these insights except elasticity modeling.



A couple of obvious points: I would clearly recommend real data used on real customers responding to real price changes.  This is the operating economic environment.  That is, for an existing product, use the database of customers’ behavior in purchasing products.  The strategic insights this generates will help save margin and increase total revenue.  For a new product a general survey is the worst choice; use either conjoint or a van Westendorp survey instead.

Price sensitivity is a key concept in economics and marketing.  Elasticity modeling is hardly ever done, but it should be investigated more often.  The strategic insights gathered from elasticity modeling are worth that investigation.



Ok, prepare for a rant or two.

First, REPLY ALL.  If I ever find out who designed / enabled the easy to find and use REPLY ALL button I will go to their house and run over them with my truck.  Then back up and run over them again.

It should not be an option, probably ever.  It should not be an easy to use and easy to find option, probably ever.  Face it, do you really really ever need to reply all?  Sure, maybe, once in a while.  But if that button was hard to find you would find that you really don’t need to fill up everyone’s mail boxes with all kinds of stuff, relevant or not.

I despise when our team sends back and forth to each other, little jokes, comments, funny pictures and videos.   I mean, if a couple of folks are having a ha ha ha conversation (“Oh yeah?”  “Sez you!”  “Yo mama!”) and sending it out to the entire 50-member group, with sizzling comments and cute pictures, that easily adds up to hundreds of emails.  99% of which are immediately deleted and 89% not even read.  (I have the statistics.)  Come on, people!

This week I got over 575 emails; over 300 were reply-all conversations back and forth.  Only 5 or 6 were relevant to me, those I needed to actually read.  I now have a rule that puts any picture or video attachments in the junk box.

I’m the guy that now deletes any email not sent directly to me.  If you send it to a group you did not send it to me.  You do not need for me to specifically read it, you sent it to many people.  Any of them can read it and if I need to know something, I will find out.  That is, if you want me to read it, and if you want me to get the info, send it to me.  A little extreme?  Perhaps, but things are getting out of hand.

Remember that 1960s musical, “How to Succeed in Business Without Really Trying”?  Here’s an early conversation between two corporate types.

Guy X “Did you get my memo?”

Guy Y “What memo?”

Guy X “What memo?  My Memo about memos.  We’re sending out too many memos and it’s gotta stop!”

Guy Y “Okay.  I’ll send out a memo.”

Funny, yeah.  But not so funny.  So, stop the madness.  Be that guy in your group that says STOP reply all.

Second, elevator etiquette.  Look, everyone knows it: when the doors start to close, you folks outside the elevator stand back and do not attempt to get inside.  That’s a universal rule.  Sticking your hand in will stop the doors, and for your safety the doors will slowly open.  Then they let you in and after that start closing again.  And probably stop because some other jackass sticks his hand in to get on.  Again.  Groan.

Yesterday I got on the elevator by myself and the doors started to close.  When they had just about touched a slender hand slunk in and opened them up.  A Barbie wanna-be smiled at me and said, “Sorry,” and shrugged as the doors started to close.  Just before they closed I stuck my hand out and stopped them.  I walked just outside the doors and looked at her and as the doors began to close again I stuck my arm inside and stopped the doors and got back inside.  “Sorry,” I smiled.  She did not smile at me as the doors closed again.

So yes, there are jackasses out there like me.  But you don’t know which side of the elevator doors we might be on.  So therefore everyone please remember the rule:  when the doors start to close let them close.  Leave them alone.  Wait for the next elevator.  What are you in such a rush for anyway?  To read that funny email and reply all?  Jeez!

Third, come to meetings on time.  It’s not that hard.  It’s okay to even arrive a minute or two early.  I know what you’re trying to demonstrate: that you are so busy and so important that you can only run from one meeting to another and only get there after it starts.  Again and again.  Day after day.  So we who are already there have to sit around and chit chat (Watch the game last night, what is Paris Hilton doing now, see the YouTube video, etc.)  Or if we go ahead and start without you we will have to back up and start again when you arrive.  You have wasted everyone’s time.  If there are 9 people in the meeting and you are 5 minutes late that is 45 man-minutes that are spent because of you.  Again and again.  Day after day.

Now I know sometimes there is just no other way.  You really do have back-to-back and you cannot get to the next one early or on time.  But if it happens every day, several times a day (you know who you are) it is really just discourteous and disrespectful and no one buys that crap about you being so busy and so important.  A 10 am meeting means it starts at 10 am because people have begun to arrive a couple of minutes before 10 am.  It’s not that 10 am is the time people start to arrive and 10:05 or 10:10 it actually starts.  Because that likely will make it go over the 11 am ending time.

It’s like the speed limit sign.  When the sign says 55 mph speed limit, that is not a lower limit, but an upper limit.  That is not a minimum but a maximum.  Read the fine print.  When you are invited to a 10 am meeting it’s okay to be there one minute before, be prepared, and contribute and all will be well.  We will be much more productive.

Okay, the rants are over.  For now.





Life-Time Value is typically done as just a calculation, using past (historical) data.  That is, it’s only descriptive.

While there are many versions of LTV (depending on data, industry, interest, etc.) the following is conceptually applied to all.  LTV, via descriptive analysis:

1) Uses historical data to sum up each customer’s total revenue.

2) This sum then has subtracted from it some costs: typically cost to serve, cost to market, maybe cost of goods sold, etc.

3) This net revenue is then converted into an annual average amount and depicted as a cash flow.

4) These cash flows are assumed to continue into the future and diminish over time (depending on durability, sales cycle, etc.) often decreasing arbitrarily by say 10% each year until they are effectively zero.

5) These (future, diminished) cash flows are then summed up and discounted (usually by Weighted Average Cost of Capital) to get their net present value.

6) This NPV is called LTV.  This calculation is applied to each customer.
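The six steps can be sketched in a few lines.  This is one possible reading of the calculation, not a standard formula: the function name, the fixed horizon in years and the end-of-year discounting are my assumptions, and the input figures are hypothetical.

```python
def descriptive_ltv(annual_revenue, annual_costs, years, discount_rate, decay=0.0):
    """Sum discounted net cash flows; `decay` shrinks the cash flow each year."""
    net = annual_revenue - annual_costs                  # steps 1-3: annual net revenue
    ltv = 0.0
    for year in range(1, years + 1):
        cash_flow = net * (1 - decay) ** (year - 1)      # step 4: diminish over time
        ltv += cash_flow / (1 + discount_rate) ** year   # step 5: discount to present value
    return ltv                                           # step 6: the NPV is the LTV

# Hypothetical customer: $1,000 revenue, $200 costs, 3 years, 9% discount, 10% decay
ltv = descriptive_ltv(1000, 200, years=3, discount_rate=0.09, decay=0.10)
print(f"LTV: ${ltv:,.2f}")
```

Note how much of the result depends on the arbitrary choices (horizon, decay rate, discount rate) rather than on anything known about the customer.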

Thus each customer has a value associated with it.  The typical use is for marketers to find the “high valued” customers (based on past purchases).  These high valued customers get most of the communications, promotions / discounts, marketing efforts, etc.  Descriptive analysis is merely about targeting those already engaged (much like RFM).

This seems to be a good starting point but, as is usual with descriptive analysis, contributes nothing about WHY.  Why is one customer more valuable, will they continue to be?  Is it possible to extract additional value, but at what cost?  Is it possible to garner more revenue from a lower valued customer because they are more loyal or cost less to serve?  What part of the marketing mix is each customer most sensitive to?  LTV (as described above) gives no implications for strategy.  The only strategy is to offer and promote to the high valued customers.



How would LTV change using predictive analysis instead of descriptive analysis?  First note that while LTV is a future-oriented metric, descriptive analysis uses historical (past) data and the entire metric is built on that, with assumptions about the future applied unilaterally to every customer.  Prediction will specifically thrust LTV into the future (where it belongs) by using independent variables to predict the next time until purchase.  Since the major customer behavior driving LTV is timing, amount and number of purchases, a statistical technique needs to be used that predicts time until an event.  (Ordinary regression predicting the LTV amount ignores timing and number of purchases.)

Survival analysis is a technique designed specifically to study time until event problems.  It has timing built into it and thus a future view is already embedded in the algorithm.  This removes much of the arbitrariness of typical (descriptive) LTV calculations.

So, what about using survival analysis to see which independent variables, say, bring in a purchase?  This decreasing time until purchase tends to increase LTV.  While survival analysis can predict the next time until purchase, the strategic value of survival analysis is in using the independent variables to CHANGE the timing of purchases.  That is, descriptive analysis shows what happened; predictive analysis gives a glimpse of what might CHANGE the future.

Strategy using LTV dictates understanding the causes of customer value: why a customer purchases, what increases / decreases the time until purchase, probability of purchasing at future times, etc.  Then when these insights are learned, marketing levers (shown as independent variables) are exploited to extract additional value from each customer.  This means knowing that one customer is say sensitive to price and that a discount will tend to decrease their time until purchase.  That is, they will purchase sooner (maybe purchase larger total amounts and maybe purchase more often) with a discount.  Another customer prefers say product X and product Y bundled together to increase the probability of purchase and this bundling decreases their time until purchase.  This insight allows different strategies for different customer needs and sensitivities, etc.  Survival analysis applied to each customer yields insights to understand and incent changes in behavior.

This means just assuming the past behavior will continue into the future (as descriptive analysis does) with no idea why, is no longer necessary.  It’s possible for descriptive and predictive analysis to give contradictory answers.  Which is why “crawling” might be detrimental to “walking”.

If a firm can get a customer to purchase sooner, there is an increased chance of adding purchases–depending on the product.  But even if the number of purchases is not increased, the firm getting revenue sooner will add to their financial value (time is money).

Also a business case can be created by showing the trade-off in giving up say margin but obtaining revenue faster.  This means strategy can revolve around maximization of cost balanced against customer value.

The idea is to model next time until purchase, the baseline, and see how to improve that.  How is this carried out?  A behaviorally-based method would be to segment the customers (based on behavior) and apply a survival model to each segment and score each individual customer.  By behavior is typically meant purchasing (amount, timing, share of products, etc.) metrics and marcom (open and click, direct mail coupons, etc.) responses.



Let’s use an example.  Table 1 shows two customers from two different behavioral segments.  Customer XXX purchases every 88 days with an annual revenue of $43,958, costs of $7,296 for a net revenue of $36,662.  Say the second year is exactly the same.  So year 1 discounted at 9% is NPV of $33,635 and year 2 discounted at 9% for two years is $30,857 for a total LTV of $64,492.  Customer YYY has similar calculations for LTV of $87,898.

Table 1
Customer  Days  Purchases/yr  Revenue  Costs    Net rev yr 1  Net rev yr 2  NPV yr 1  NPV yr 2  LTV
XXX       88    4.148         $43,958  $7,296   $36,662       $36,662       $33,635   $30,857   $64,492
YYY       58    6.293         $62,289  $12,322  $49,967       $49,967       $45,842   $42,056   $87,898


The above (using descriptive analysis) would have marketers targeting customer YYY, who is worth more than $23,000 more than customer XXX.  But do we know anything about WHY customer XXX is so much lower valued?  Is there anything that can be done to make them higher valued?

Applying a survival model to each segment outputs independent variables and shows their effect on the dependent variable.  In this case the dependent variable is (average) time until purchase.  Say the independent variables (which defined the behavioral segments) are things like price discounts, product bundling, seasonal messages, adding additional direct mail catalogs, offering online exclusives, etc.  The segmentation should separate customers based on behavior and the survival models should show how different levels of independent variables drive different strategies.

Table 2 below shows results of survival modeling on the two different customers that come from two different segments.  The independent variables are price discounts 10%, product bundling, etc.  The TTE is time until event and shows what happens to time until purchase based on changing one of the independent variables.  For example, for customer XXX, giving a price discount of 10% on average decreases their time until purchase by 14 days.  Giving YYY a 10% discount decreases their time until purchase by only 2 days.  This means XXX is far more sensitive to price than YYY, which would not be known by descriptive analysis alone.  Likewise giving XXX more direct mail catalogs pushes out their TTE by 11 days but pulls in YYY by 2 days.  Note also that very few of the marketing levers affect YYY very much.  We are already getting nearly all from YYY that we can; no marketing effort does very much to impact the TTE.  However, with XXX there are several things that can be done to bring in their purchases.  Again, none of these would be known without survival modeling on each behavioral segment.


Table 2 (change in TTE, days)
                      XXX    YYY
price discount 10%    -14     -2
product bundling       -4     12
seasonal message        6     21
5 more catalogs        11     -2
online exclusive      -11      3


Table 3 below shows new LTV calculations on XXX after using survival modeling results.  We decreased TTE by 24 days, by using some combinations of discounts and bundling and online exclusives, etc.  Note now the LTV for XXX (after using predictive analysis) is greater than YYY.


Table 3
Customer  Days  Purchases/yr  Revenue  Costs    Net rev yr 1  Net rev yr 2  NPV yr 1  NPV yr 2  LTV
XXX       64    5.703         $60,442  $10,032  $50,410       $50,410       $46,248   $42,429   $88,677
YYY       58    6.293         $62,289  $12,322  $49,967       $49,967       $45,842   $42,056   $87,898
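The new LTV for XXX can be checked with a short sketch.  It assumes, as the figures in the tables imply, that annual revenue and costs scale in proportion to the number of purchases per year (365 / TTE), and it keeps the two-year horizon and 9% discount rate from the example.  The function names are illustrative only.

```python
def two_year_ltv(net_revenue, discount_rate=0.09):
    """NPV of two equal annual net cash flows, as in the example."""
    return net_revenue / (1 + discount_rate) + net_revenue / (1 + discount_rate) ** 2

def rescale_for_tte(revenue, costs, old_tte, new_tte):
    """Assumption: revenue and costs scale with purchases per year (365 / TTE)."""
    factor = old_tte / new_tte
    return revenue * factor, costs * factor

# Customer XXX: time until purchase pulled in from 88 days to 64 days
rev, cost = rescale_for_tte(43958, 7296, old_tte=88, new_tte=64)
ltv_before = two_year_ltv(43958 - 7296)
ltv_after = two_year_ltv(rev - cost)
print(f"XXX LTV before: ${ltv_before:,.0f}  after: ${ltv_after:,.0f}")
```

This sketch ignores the cost of the discounts or bundles used to pull the purchase in, which is exactly the cost / benefit question a business case would need to answer.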


What survival analysis offers, in addition to marketing strategy levers, is a financially optimal scenario, particularly in terms of costs to market.  That is, customer XXX responds to a discount.  It’s possible to calculate and test what is the (just) needed threshold of discounts to bring a purchase in by so many days with the estimated level of revenue.  This ends up being a cost / benefit analysis that makes marketers think about strategy.  This is the advantage of predictive analysis: giving marketers strategic options.




What is a Market Basket?

In economics, a market basket is a fixed collection of items that consumers buy.  This is used for metrics like CPI (inflation) etc.  In marketing, a market basket is any 2 or more items bought together.

Market basket analysis is used, especially in retail / CPG, to bundle and offer promotions and gain insight in shopping / purchasing patterns.  “Market basket analysis” does not, by itself, describe HOW the analysis is done.  That is, there is no associated technique with those words.

How is it usually done?

There are three general uses of data: descriptive, predictive and prescriptive.  Descriptive is about the past, predictive uses statistical analysis to calculate a change on an output variable (e.g., sales) given a change in an input variable (say, price) and prescriptive is a system that tries to optimize some metric (typically profit, etc.)  Descriptive data (means, frequencies, KPIs, etc.) is a necessary but not usually a sufficient step.  Always get to at least the predictive step as soon as possible.  Note that predictive here does not necessarily mean forecasted into the future.  Structural analysis uses models to simulate the market, and estimate (predict) what causes what to happen.  That is, using regression, given a change in price what is the estimated (predicted) change in sales?

Market basket analysis often uses descriptive techniques.  Sometimes it is just a “report” of what percent of items are purchased together.  Affinity analysis (a step above) is mathematical, not statistical.  Affinity analysis simply calculates the percent of time combinations of products are purchased together.  Obviously there is no probability involved.  It is concerned with the rate of products purchased together, and not with a distribution around that association.  It is very common and very useful but NOT predictive–therefore NOT so actionable.
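To make the descriptive version concrete, here is a minimal affinity sketch: it simply counts the share of baskets in which each pair of items appears together.  The baskets and item names are hypothetical, invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
baskets = [
    {"tv", "hdmi cable"},
    {"tv", "hdmi cable", "soundbar"},
    {"diapers", "wipes"},
    {"tv", "soundbar"},
    {"diapers", "wipes", "hdmi cable"},
]

def pair_affinity(baskets):
    """Share of all baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair a stable (alphabetical) key
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

affinity = pair_affinity(baskets)
for pair, share in sorted(affinity.items(), key=lambda kv: -kv[1]):
    print(pair, f"{share:.0%}")
```

Nothing here says whether a rate is statistically meaningful or what would change it, which is the limitation described above.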

Logistic Regression

Let’s talk about logistic regression.  This is an ancient and well known statistical technique, probably the analytic pillar upon which database marketing has been built.  It is similar to ordinary regression in that there is a dependent variable that depends on one or more independent variables.  There is a coefficient (although interpretation is not the same) and there is a (type of) t-test around each independent variable for significance.

The differences are that the dependent variable is binary in logistic and continuous in ordinary regression and to interpret the coefficients requires exponentiation.  Because the dependent variable is binary, the result is heteroskedasticity.  There is no (real) R², and “fit” is about classification.

How to Estimate / Predict the Market Basket

The use of logistic regression in terms of market basket becomes obvious when it is understood that the predicted dependent variable is a probability.  The formula to estimate probability from logistic regression is:

P(i) = 1 / (1 + e^(–Z))

where Z = α + βXi.  This means that the independent variables can be products purchased in a market basket, used to predict the likelihood of purchasing another product as the dependent variable.  Specifically, this means taking each (major) category of product (focus driven by strategy) and running a separate model for each, putting in all significant other products as independent variables.  For example, say we have only three products, x, y and z.  The idea is to design three models and test the significance of each.  Meaning, using logistic regression:

x = f(y,z)

y = f(x,z)

z = f(x,y).
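Before going further, the probability formula above, with Z = α + βXi, can be sketched in a few lines of Python.  The intercept and coefficients below are hypothetical numbers chosen only to show the mechanics, and purchase_probability is my own name for the function.

```python
import math


def purchase_probability(alpha, betas, x):
    """P = 1 / (1 + e^(-Z)), where Z = alpha + sum(beta_j * x_j)."""
    z = alpha + sum(b * xj for b, xj in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model for product x given flags for products y and z.
alpha = -1.0
betas = [0.8, -0.5]

for flags in ([0, 0], [1, 0], [0, 1], [1, 1]):
    p = purchase_probability(alpha, betas, flags)
    print(flags, f"{p:.3f}")
```

Whatever the inputs, the result is always between 0 and 1, which is what makes it usable as a probability.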

Of course other variables can go into the model as appropriate, but the interest is whether or not the independent (product) variables are significant in predicting the probability of purchasing the dependent product variable.  After significance is achieved, the insights generated are around the sign of the independent variable, i.e., does the independent product increase or decrease the probability of purchasing the dependent product.
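Here is a bare-bones Python sketch of the one-model-per-product idea, using only the standard library.  Everything in it is an assumption for illustration: the twelve toy customers are invented, the function names are mine, and the fit is a plain gradient ascent rather than what a statistics package does.  It also reports no standard errors, so it cannot test significance; in practice a package such as SAS would supply that.  The sketch only shows the structure and the signs.

```python
import math


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b.x))) by batch gradient ascent.

    Returns [intercept, b1, b2, ...]. Illustration only: no standard
    errors are computed, so there is no significance test here.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = target - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        # Step along the average gradient of the log-likelihood.
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def basket_models(rows, products):
    """Run one logistic model per product, with the other products as inputs."""
    models = {}
    for dep in products:
        others = [p for p in products if p != dep]
        X = [[r[p] for p in others] for r in rows]
        y = [r[dep] for r in rows]
        coefs = fit_logistic(X, y)
        models[dep] = dict(zip(["intercept"] + others, coefs))
    return models


# Hypothetical toy customers: (x, y, z) purchase flags and how many of each.
cells = [
    ((1, 1, 0), 4), ((1, 0, 0), 1), ((0, 1, 0), 1), ((0, 0, 1), 3),
    ((1, 1, 1), 1), ((0, 0, 0), 1), ((0, 1, 1), 1),
]
rows = [dict(zip("xyz", combo)) for combo, count in cells for _ in range(count)]

models = basket_models(rows, ["x", "y", "z"])
for dep, coefs in models.items():
    print(dep, {name: round(value, 2) for name, value in coefs.items()})
```

In this made-up data, buying y goes along with buying x, while buying z goes against it, and the signs of the fitted coefficients reflect that.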

An Example

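As a simple example, say we are analyzing a retail store, with categories of products like consumer electronics, women’s accessories, newborn and infant items, etc.  Thus, using logistic regression, a series of models should be run.  That is, one model per category, each with purchase of that category as the dependent variable and purchase of the other categories as the independent variables.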


This means the independent variables are binary, coded as a “1” if the customer bought that category and a “0” if not.  The table below details the output for all of the models.  Note that other independent variables can be included in the model, if significant.  These would often be seasonality, consumer confidence, promotions sent, etc.

Each row is one model (the dependent category); each column is an independent category.

MODEL (DEPENDENT)      ELECTRONICS  WOMEN’S ACC.  NEWBORN  JEWELRY  FURNITURE  HOME DÉCOR  ENTERTAIN
CONSUMER ELECTRONICS   XXX          Insig         Insig    -23%     34%        26%         98%
WOMEN’S ACCESSOR       Insig        XXX           39%      68%      22%        21%         Insig
NEWBORN, INFANT, ETC.  Insig        43%           XXX      -11%     -21%       -31%        29%
JEWELRY, WATCHES       -29%         71%           -22%     XXX      12%        24%         -11%
FURNITURE              31%          18%           -17%     9%       XXX        115%        37%
HOME DÉCOR             29%          24%           -37%     21%      121%       XXX         31%
ENTERTAIN              85%          Insig         31%      -9%      41%        29%         XXX

To interpret, look at, say, the home décor model.  If a customer bought consumer electronics, that increases the probability of buying home décor by 29%.  If a customer bought newborn / infant items, that decreases the probability of buying home décor by 37%.  If a customer bought furniture, that increases the probability of buying home décor by 121%.  This has implications especially for bundling and messaging.  That is, offering say home décor and furniture together makes great sense, but offering home décor and newborn / infant items does not make sense.


The above detailed a simple (and more powerful) way to do market basket analysis.  If given a choice, always go beyond mere descriptive techniques and apply predictive techniques.

See my MARKETING ANALYTICS for additional details:








Where Is It Now?

OK, it’s been two years in the making (actually it’s been thirty years in the making) but I finished it last month.  It’s about 55,000 words (some of them good).

It’s called A GUIDEBOOK FOR MARKETING ANALYSTS, A Conceptual Overview of Real Marketing Science.  Isn’t that a great title?  Yeah, I did not think so either but it may change.  The idea is to have a guidebook, NOT a textbook, for marketing analysts.  Pretty much the subject of this blog.  Same idea, same style, etc.

I have bookcases full of textbooks on marketing science, econometrics, marketing research, statistics, multivariate analysis, etc.  I use them sometimes (if the leg on my desk has become wobbly, pages from those books  will help prop it up).  I’m kidding of course.  They have their place.

But it is my experience that marketing analysts (or students about to become marketing analysts) appreciate a conceptual, guided overview of how to apply analytics to solve a marketing problem, without all the mathematic clutter of most academic tomes.  That’s the purpose of A GUIDEBOOK FOR MARKETING ANALYSTS.

It’s making the rounds in New York now.  I’ll keep you informed.


How Do You Know if You’re “Analytic”?

Okay, since some of this blog is aimed at students of analytics, how do you know if YOU are analytic?  Sure, sure, you’ve been pushed by your parents into taking a lot of math and science, etc., and are now in school studying analytics–but deep down, sometimes at night, you wonder if it is really for you.  It’s not about how much money you might be able to make; you sometimes wonder if you should change your major to something fun and interesting, maybe music or art or politics or history.

Or, you’re already working IN analytics and also question if you’ve made the right decision.  Do you fit in?  Can you be successful?  You’re early in your career and it’s not too late.  How does the prospect of doing SAS on dirty data and searching for insights with no time for the next 30 years sound?  If your heart skipped a beat, you should worry.


How do you know if you’re an analytic person?  You should love the simple joy that comes from seeing a variable that should be significant proved so in the data.  A satisfied look of wonder pervades your face when the world makes sense.  That replaces the constant, cynical, caveat-laden weariness we usually have to carry around.  That’s what got us into analytics in the first place, right?  People are confusing, full of irrational gray areas, but data is data, truth is truth.  When well-understood relationships make sense, it’s comforting.  When insights are found, it’s exciting.  Murder solved!  Puzzle completed!  And because it’s consumer behavior we are trying to predict–this helps us believe that maybe people are NOT so confusing.

So, look over your life.  Do you find enjoyment in black and white answers?  Do you naturally distrust any data / claims that you yourself have not looked into?  Do you like learning how things work, do you naturally and quickly see relationships (especially causal relationships) and are you constantly curious?  If the answers to these are mostly “Yes” then you might be analytic.


When I was in elementary school I was the class clown.  (Can you believe it?)  I have a strong introvert streak but also have always found it necessary to make the joke, point out the funny thing, and teachers usually hated me, the class clown.  I didn’t eat paste or do funny dances, it was always verbal.

Anyway, in third grade we were learning long division.  The previous couple of weeks the teacher had been warning us that LONG DIVISION was a very big deal, difficult, complicated, and would require all our attention, and she would have to mentor us along.  (She would be in no mood for class clowning when we started.)

So, the first day arrived and she motivated us to appreciate the central issue of long division, remainders, by asking, “Now, how can you divide 5 evenly?  You can’t.  Thus–”

I immediately shouted, “Yes, you can: two-and-a-half and two-and-a-half.  See, evenly.”

She sent me to the office.  The one time I was NOT being the class clown–I was being analytic–got me in trouble.  In truth, it served me right.  The statistics in that class proved that 9 out of 10 times whatever I said was worthy of sending me to the office.


So, another issue.  The real trouble is that being analytic in corporate America is not enough to be successful in analytics.  This is because most analytic folks are a little quiet, maybe introverted.  We can pretend it’s because of the left-brain domination where we get our sense of logic and rationality.  But to be successful in analytics you will have to be able to push yourself to find insights and present them to others.  You will often have to convince other people (sometimes those many levels above you, those that have the purse strings to carry the project forward).  So the key personality trait, as a test for analytic talent, is passionate curiosity.   That is, you are so excited by what you have found, you can easily overcome your natural shyness.  The love of discovery so drives you that you can’t keep your mouth shut and you tell the world that you have found the truth, and it’s shouted from the rooftops.

Therefore, I would say you are analytic if you love finding relationships in data.  But you can only be successful in analytics if  you are so thrilled by what you have found you must socialize that to everyone you can.  Right now.  Make sense?




Greetings.  Thanks for stopping by.  The below is a quick introduction so you’ll know whether or not this is for you.

We’ll start by trying to get a few things straight.  This blog is not meant as a replacement for a textbook in marketing analytics / econometrics, etc.  I’ll mention some textbooks down the line that might be helpful in some areas but this is not meant to be like a textbook.  This is meant to be a gentle overview, more conceptual than statistical, for the business analyst that just needs to know how to get on with their job.

Who is the Intended Audience for This Blog?

This is not meant to be an academic tome filled with mathematical minutiae and cluttered with statistical mumbo-jumbo.  There will need to be an equation now and then, but if your interest is econometric rigor, you’re in the wrong place.  A couple of good books for that are Econometric Analysis by William H. Greene and Econometric Models, Techniques, and Applications by Michael D. Intriligator, Ronald G. Bodkin and Cheng Hsiao.  So, this is not aimed at the statistician, although there will be a fair amount of verbiage about statistics.

If you’re all about (and only about) BI (business intelligence), which means mostly reporting / visualizing data, (if you live and die by creating KPIs) this is not for you.

This will not be a marketing strategy guide, but be aware that as mathematics is the handmaiden of science, marketing science is the handmaiden of marketing strategy.  There is no point to analytics unless it has a strategic payoff.  It’s not what is interesting to the analyst, but what is impactful to the business, that is the focus of marketing science.

So, to whom is this blog aimed?  Not necessarily at the professional (academic) econometrician / statistician, but there ought to be some satisfaction here for them.  And not necessarily for the student, but a conceptual overview is usually what students need most.  Primarily, the aim is at the practitioner.   The intended audience is the business analyst that has to pull a targeted list, the campaign manager that needs to know which promotion worked best, the guy that has to forecast next quarter’s demand units, the marketer that must DE-market some segment of her customers to gain efficiency, the marketing researcher that needs to design and implement a satisfaction survey, the pricing analyst that has to set optimal prices between products and brands, etc.

So What is Marketing Science?

As alluded to above, marketing science is the analytic arm of marketing.  Marketing science seeks to quantify causality. Marketing science is not an oxymoron (like military intelligence, happily married or jumbo shrimp) but is a necessary (although not sufficient) part of marketing strategy.  It is more than simply designing campaign test cells.  Its overall purpose is to decrease the chance of marketers making a wrong decision.  It cannot replace managerial judgment, but it can offer boundaries and guard rails to inform strategic decisions.  It encompasses wide areas from marketing research to database marketing.

What Kind of People in What Jobs Use Marketing Science?

Most people in marketing science (also called decision science, analytics, CRM, direct / database marketing, etc.) have a quantitative bent.  Duh.  Their education is typically some combination involving statistics, econometrics / economics, mathematics, programming / computer science, business / marketing / marketing research, strategy, etc.  Their experience certainly touches any and all parts of the above.  The ideal analytic person has a strong quantitative orientation as well as a feel for consumer behavior and the strategies that affect consumer behavior.  As in all marketing, consumer behavior is the focal point of marketing science.

Marketing science is usually practiced in firms that have a CRM or direct / database marketing component, or in firms that do marketing research, where analytics must be done on the survey responses.  Forecasting is a part of marketing science, as well as design of experiments (DOE), web analytics and even choice behavior (conjoint).  In short, any quantitative analysis applied to economic / marketing data will have a marketing science application.  So while the subjects of analysis are fairly broad, the number of (typical) analytic techniques tends to be fairly narrow.

Why Do I Think I Have Something to Say about Marketing Science?

Fair question.  My whole career has been involved in marketing science.  For more than 25 years I’ve done direct marketing, CRM, database marketing, marketing research, decision sciences, forecasting, segmentation, DOE and all the rest.  While my BBA and MBA are in finance and economics, my PhD is in marketing science.  I’ve published a few trade and academic articles, I’ve taught school at both graduate and undergraduate levels and I’ve spoken at conferences, all involved in marketing science.  I’ve done all this for firms like Dell, HP, the Gap, Sprint as well as consultancies like Targetbase, etc.  Over the years I’ve gathered a few opinions that I’d like to share with y’all.  And yes, I’ve been in Texas for over 15 years.

What is the Approach / Philosophy of This Book?

As with most bloggers of non-fiction, I wrote this because I would have loved to have had it, or something like it, far earlier.  What I had in mind did not actually exist, as far as I knew.

I had been a practitioner for decades and there were times I just wanted to know what I should do, what analytic technique would best solve the problem I had.  I did not need a mathematically-oriented econometrics textbook.  I did not need a list of statistical techniques.  What I needed was a (simple) explanation of which technique would address the marketing problem I was working on.  I wanted something direct, accessible, and easy to understand so I could use it and then explain it.  It was okay if the book / blog / website, etc. went into more technical details later, but first I needed something conceptual to guide me in solving a particular problem.  What I needed was a marketing-focused book / blog explaining how to use statistical / econometric techniques on marketing problems.  It was good if it showed examples and case studies doing just that.  Voilà.

Generally this blog will have the same point of view as books like Peter Kennedy’s A Guide to Econometrics and Glenn L. Urban’s and Steven H. Star’s Advanced Marketing Strategy.  That is, the techniques will be described in two or three levels.  The first is really just conceptual, devoid of mathematics and the aim is to understand.  The next level is more technical, and will use SAS or something else as needed to illustrate what is involved, how to interpret it, etc.  Then the final level, if there is one, will be rather technical and aimed really only for the professional.

One thing I like about Stephan Sorger’s book, Marketing Analytics, is that in the opening pages he champions action-ability.  Marketing science ought to be about action-ability.  I know some of you academic purists will read the following pages and gasp that I occasionally allow “bad stats” to creep in.  (For example, it is well known that forecasting often is improved if collinear independent variables are found.  Shock!)  But the point is that even an imperfect model is far more valuable than waiting for academic ivory tower purity.  Business is about time and money and even a cloudy insight can help improve targeting.  Put simply, this blog and marketing science is ultimately about what works, not what will be published in an academic research paper.

All of the above will be cast in terms of business problems, that is, in terms of marketing questions.  For example, a marketer, say, needs to target his market and he has to learn to do segmentation.  Or she has to manage a group that will do segmentation for her (a consultant) and needs to know something about it in order to question intelligently.  The problem will be addressed in terms of what segmentation is, what it means to strategy, why do it, etc.  Then several analytic techniques used for segmentation will be described.  Then a fairly involved and technical discussion will show additional statistical output.  Then an example or two will be shown.  This output will use SAS (or SPSS, etc.) as necessary.

Therefore, the philosophy is to present a business case (a need to answer the marketing question) and describe conceptually various marketing science techniques (in two or three increasingly detailed levels) that can answer those questions.  Then with SAS, etc., output will be developed that shows how the technique works, how to interpret it and use it to solve the business problem.  Finally, more technical details may be shown, as needed.  Okay?

So, now you know where we’ll try to go.  You can come along if you like.