Tuesday, 16 February 2016

Part 3 of a Review of “Marketing Analytics – A Practical Guide to Real Marketing Science” (by Mike Grigsby Kogan)


A while back, I got a book from my Skillsoft learning library, with the above title. As a statistician/analyst at a university, I was curious about how the statistical techniques that I use on a routine basis are applied in the marketing world. And as someone who is involved in a small publishing venture, I was also curious about the theory and practice of marketing in general, and how it might be used to sell more novels. So, I thought I would read the book and do a write-up for the blog, to help fix ideas in my own mind and inform blog readers as well.

Naturally, if the book interests you, you should go to the source. The Amazon link is given above. The book sells for about 20 bucks, in both e-book and paperback form. Though the content gets somewhat technical, given the subject matter, the writer maintains a very readable style in my opinion.

Since the book is fairly long, a proper look at it will take at least two blogs, maybe three. I previously did a blog on Part One of the book, which was concerned with some elementary statistical ideas, as well as some fundamental concepts and strategies within the marketing world. The second blog was about Part Two of the book, which covered some fairly advanced techniques, whereby one uses one or more variables to predict another.

This third and final section deals with some other fairly advanced statistical techniques, which are less about predicting an outcome variable than about gaining a deeper understanding of a dataset, in a more holistic sense. They include:

  • Factor analysis.

  • Cluster analysis (hierarchical, k-means, latent class).

  • Decision trees.

  • He also gives some review of statistical testing in general, on matters such as experimental vs. observational studies, sample sizes, A/B testing, and the new data mining methods.

In some cases, I have inserted an example of a given technique, from my own book-related research, in italics.

Part Three – Inter-relationship Techniques

Chapter 7 – Modeling Inter-relationship Techniques – What Does the Customer (Market) Look Like?

  • This part of the book focuses on techniques that relate a number of variables together, rather than predict the value of one variable from the values of others. So, for example, factor analysis can help determine which variables “go together”, in higher order concepts. These techniques also include market segmentation methods, such as cluster analysis, which determine which groups of customers are most like each other, in order to help conceptualize a broad mass of people into some higher order categories (i.e. sub-markets).
Essentially, these market segmentation methods group the dataset into subsets that are similar to each other within the group but different from other groups. Obviously, one can segment by things such as age, gender or income, but these methods take a deeper dive into the data to discover unexpected groupings. The point of this, of course, is to identify these segments and use different marketing techniques accordingly. After this is accomplished, one might run separate regression analyses for each identified market segment, to see what variables drive behavior for each segment, and how marketing can therefore be optimized. He explains a marketing notion, the four Ps (partition, probe, prioritize, position), and how segmentation methods help in executing those principles. He also notes that good segmentation should produce groups that are:
  • identifiable (e.g. by statistical scoring),
  • substantive (the group is big enough to justify developing a separate strategy for it),
  • accessible (you are able to contact them),
  • stable (the membership persists over reasonable time intervals),
  • responsive (this all makes a difference in an important metric, like sales).
The actual data used for segmentation can be from the firm's data (transactions and communications), demographics (e.g. census data), survey data, and the like. Once the segments are identified via the appropriate technique (e.g. k-means clustering), some key metrics can be run for each segment, to show how they differ in a real-world, easily interpretable way. Naming the segments is also a key part of the process: relevant, memorable segment names increase buy-in from management (but don't get too cute with trendy, playful names). Finally, of course, the segmentation must be tested and verified as effective and useful.

Chapter 8 – Segmentation Tools and Techniques

  • This chapter gets into the nuts and bolts of segmentation. First, there is a caution about using management-driven a priori rules for segmentation (e.g. big spenders versus small). Letting the data drive the segmentation is more likely to lead to fresh insights.
  • The first method looked at is the CHAID decision tree algorithm. This is really a dependent-variable method (somewhat akin to logistic regression), but is often seen as a segmentation method. It determines optimal “splits” of the dataset, based upon a dependent variable of interest. It is a method that is popular in the data mining community, as it is relatively easy to understand and interpret, though it doesn't yield the sort of deeper explanations that a more traditional statistical model like logistic regression can (e.g. coefficient strengths, model R-square, etc.).
  • He then briefly notes hierarchical clustering, a method that produces output somewhat akin to family trees, before giving a more in-depth treatment of k-means clustering. This method creates a Euclidean-type distance metric, based on a choice of variables given by the analyst, then creates a number of clusters, again defined by the analyst. Essentially, these clusters are based upon minimizing within-group distances and maximizing between-group distances. As noted, there are a number of subjective steps, namely choosing the number of clusters and the variables to create the metric. This can result in a plethora of alternative models, with no cut-and-dried decision rule for picking the best one.
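For readers who like to see the mechanics, here is a minimal sketch of the k-means idea just described, in plain Python. It is not the book's code or any stats package's implementation. The function names, the toy numbers and the decision to standardize the variables before measuring distance are all my own assumptions, made purely for illustration.

```python
import math
import random

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (an assumption:
    this keeps variables on large scales from dominating the distance)."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def nearest(row, centers):
    """Index of the center with the smallest Euclidean distance to row."""
    return min(range(len(centers)), key=lambda j: math.dist(row, centers[j]))

def kmeans(rows, k, max_iter=100, seed=0):
    """Basic k-means: assign each case to its nearest center, move each
    center to the mean of its cases, repeat until assignments stop changing."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]  # k random cases as starts
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(row, centers) for row in rows]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # leave an empty cluster's center where it was
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Invented toy data: (sales rank, number of reviews) for eight hypothetical books.
books = [
    [10, 5000], [20, 4500], [15, 5200],
    [50, 1500], [55, 1600], [60, 1400],
    [900, 100], [950, 150],
]
scaled = standardize(books)
centers, labels = kmeans(scaled, k=3)
for book, label in zip(books, labels):
    print(book, "-> cluster", label + 1)
```

The analyst's two subjective choices show up directly as inputs: which columns go into `books`, and the value of `k`. A different `seed` can also give a different answer, which is one more reason there is no single "best" model.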
Below is a quick example of k-means clustering, using my Amazon Top 100 dataset, for illustrative purposes. I used the following variables for the cluster analysis (please ignore the formatting, as the output from stats packages can be rather primitive):
  • Rank in the book’s Top 100 year (2013 or 2014, when published).
  • Rank in mid-2015
  • Writer’s age
  • Price in Top 100 year.
  • Number of reviews in Top 100 year.
  • Number of reviews in mid 2015.
  • Writer’s sex (male=0, female=1)
Final Cluster Centers
                  │ Cluster 1 │ Cluster 2 │ Cluster 3
──────────────────┼───────────┼───────────┼──────────
Rank              │        25 │        53 │        56
Rank_July19_2015  │      1039 │      1309 │     10279
WriterAge         │        54 │        54 │        51
Price1_Top100_Yr  │      6.70 │      7.20 │      5.92
NumRev_YR1        │      4430 │      1568 │      1385
NumRev_July2015   │      9919 │      2434 │      1907
D_FEMALE          │       .39 │       .71 │       .71

Number of Cases in each Cluster
Cluster 1 │  31
Cluster 2 │  52
Cluster 3 │ 112
Valid     │ 195
I set the analysis to break the dataset into three clusters. It would seem that:
  • Cluster 1 is a smallish set of books (n=31), with the highest ranking in both their publication year and in 2015. The writers were slightly older than those in cluster 3, prices were mid-way between the other two clusters, and the numbers of reviews in both the publishing year and mid-2015 were very high. The majority of writers were men (61%). We might give this group a name like “consistent male big sellers”.
  • Cluster 2 is a larger set of books (n=52), with middle ranks in their publishing year, and relatively high ranks in 2015 (not far behind cluster 1). The writers’ ages were the same as cluster 1, and they were somewhat higher priced. However, they had far fewer reviews than the books in cluster 1, both in their original publishing year and in 2015. The majority of writers were women (71%). We might give this group a name like “steady female mid-list sellers”.
  • Cluster 3 is the largest set of books (n=112). They had middle ranks in their publishing year, but dropped considerably by 2015. Writers were a bit younger, and prices were lower. They had the fewest reviews in both their original publishing year and in mid-2015. The majority of writers were women (71%). We might give this group a name like “female writers with inconsistent sales”.
  • A deeper dive into the dataset would probably show that much of this segmentation pertains to the writer’s genre and length of tenure as a published writer, but this will do for quickly illustrating how the technique works.
  • The author prefers a technique called LCA (latent class analysis), as it overcomes some of the shortcomings of k-means, noted above. However, SAS and SPSS don't include this method – it requires purchasing some add-ons to these programs (or using R). Two nice features of LCA are that it determines the “best” number of clusters and the most relevant set of variables to use for the segmentations, in a rigorous, statistical manner. It also provides a probability measure of class membership for each case, assisting strategists in determining just how strong a given case's association with a cluster is.
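To give a feel for what LCA does under the hood, here is a bare-bones sketch of a latent class model for yes/no indicators, fitted by the EM algorithm in plain Python. It is only an illustration under my own assumptions, not the author's software or any commercial add-on: the function names and the survey answers are invented, BIC is just one common criterion that I have assumed for comparing the number of classes, and the sketch makes no attempt at the variable selection that full LCA packages offer.

```python
import math
import random

def e_step(data, weights, probs):
    """Posterior class probabilities for each case, plus the log-likelihood."""
    posteriors, loglik = [], 0.0
    for row in data:
        # P(class) * P(answers | class), treating items as independent within a class.
        joint = [w * math.prod(p if x else 1 - p for x, p in zip(row, ps))
                 for w, ps in zip(weights, probs)]
        total = sum(joint)
        loglik += math.log(total)
        posteriors.append([j / total for j in joint])
    return posteriors, loglik

def fit_lca(data, n_classes, n_iter=200, seed=0):
    """EM for a latent class model with binary (0/1) indicators.
    Returns (class weights, item probabilities, posteriors, BIC)."""
    rng = random.Random(seed)
    n, n_items = len(data), len(data[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        posteriors = e_step(data, weights, probs)[0]
        for c in range(n_classes):
            size = max(sum(post[c] for post in posteriors), 1e-12)
            weights[c] = size / n
            for i in range(n_items):
                p = sum(post[c] * row[i]
                        for post, row in zip(posteriors, data)) / size
                probs[c][i] = min(max(p, 1e-6), 1 - 1e-6)  # keep away from 0 and 1
    posteriors, loglik = e_step(data, weights, probs)
    n_params = (n_classes - 1) + n_classes * n_items
    bic = -2 * loglik + n_params * math.log(n)  # lower is better
    return weights, probs, posteriors, bic

# Invented data: ten hypothetical respondents answering four yes/no questions.
survey = [
    [1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1],
    [0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 0, 1], [0, 0, 0, 0],
]
for k in (1, 2, 3):
    print(k, "classes: BIC =", round(fit_lca(survey, k)[3], 1))

weights2, probs2, post2, bic2 = fit_lca(survey, 2)
for row, post in zip(survey, post2):
    print(row, "membership probabilities:", [round(p, 2) for p in post])
```

The membership probabilities in the last loop are the "how strongly does this case belong" measure mentioned above. k-means, by contrast, only gives a hard yes/no assignment.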

Chapter 9 – Marketing Research

  • This chapter deals with some distinctions between database marketing and marketing research. The former is driven more by transactional data on customers that the firm has, while the latter is often based on self-reported survey data. Survey fatigue and missing data are both issues in survey data, so the author gives some interesting ideas on how to get around these.
  • The author does a brief review of a method known as conjoint analysis. He feels that it is too contrived and lacking in grounding in actual customer decisions to be very useful, however.
  • He also gives a quick tour of structural equation modeling or path analysis, a technique used by the more quantitative sort of social scientist to model deeper cause-and-effect relationships. This method, a variation on regression analysis, attempts to uncover “latent” variables based on “manifest” variables, and thus see what is motivating people under the surface. It is a complex subject, worthy of its own book.

Chapter 10 – Statistical Testing – Knowing What Works

  • Proper experimental design, as one would have in a medical experiment, is difficult in a corporate environment. Things such as randomization, double-blinding, and non-treatment are not always possible, for technical or business-cultural reasons.
  • It is important to get the appropriate sample size for a proper analysis, and that is often not properly computed.
  • A/B testing is a keystone of marketing. This simply means dividing a sample into two groups at random, giving the treatment to one group but not the other, then measuring the difference in some relevant response. A variation on a t-test can then determine whether the difference is statistically significant. However, it can be difficult, in a business environment, to be sure that there is no cross-contamination from earlier treatments not related to the test (e.g. sales). Sometimes ANOVA or regression might be useful, with a more multi-factorial design.
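As a rough illustration of the sample size and A/B testing points above, here is a small sketch in plain Python. It is my own example, not something from the book: I have swapped in a large-sample z-test on conversion rates in place of the t-test variation the author mentions, the sample size formula is the usual normal approximation for comparing two proportions, and all the function names and numbers are invented.

```python
from math import ceil, sqrt
from statistics import NormalDist

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided, large-sample z-test for a difference in conversion rates.
    Returns (difference in rates, z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value

def sample_size_per_group(p_base, lift, alpha=0.05, power=0.80):
    """Approximate number of cases needed in each group to detect an
    absolute lift over a baseline rate, at the given alpha and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_new = p_base + lift
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / lift ** 2)

# Hypothetical campaign: 1,000 customers per group, control (A) versus treatment (B).
diff, z, p_value = ab_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print("lift:", round(diff, 3), "z:", round(z, 2), "p-value:", round(p_value, 4))

# How many customers per group to detect a 2-point lift on a 10% baseline?
needed = sample_size_per_group(p_base=0.10, lift=0.02)
print("needed per group:", needed)
```

Running the sample size calculation before the test, rather than after, is the step that often gets skipped in practice. Note that none of this arithmetic protects against the cross-contamination problem; that has to be handled in the design.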

Chapter 11 – Capstone – Focusing on Digital Analytics

  • The rise of the web has led to an explosion of data, often referred to as “big data”. New techniques and algorithms have sprung up, with names such as neural nets, machine learning and so forth. These tend to be less statistically based than older techniques and less theoretically rigorous, but often seem easier to implement and interpret. The author advises caution against over-reliance on techniques that have a “black box” feel to them, and promise to take the well-paid analyst out of the scenario. Sometimes, you get what you pay for. One might accuse us statistical analysts of having a vested interest in the status quo, however.
  • Social media is also an important new marketing reality. Much research is still needed to determine its impact.

Chapter 12 – Finale and Take-Away

  • Marketing Analytics is really about “quantifying causality”. Although correlation is not causality, causality can generally be reliably determined, via some general rules of thumb about sequence of events and counter-factuals.
  • The customer may not always be right, but it is always right to focus on the customer.
  • Have a plan to implement your analytical results, and get buy-in for it.
  • Remember that while individual persons can be very unpredictable, people as a group can be surprisingly predictable. And you don't have to be 100% correct, just usually correct and usefully correct.
So, there you have it. Most of the simpler and more advanced statistical techniques used in marketing (or in social science in general) have been outlined. Note, however, that there are other techniques, more in the data mining domain, that are also popular these days.
And, since this is a book-themed blog, here is your chance to buy a book. This is a travelogue, featuring a statistician and a truck driver, on a long haul trip, taking lumber to Texas and oilfield equipment to Alberta. So, you get content that alludes to the theme of the blog: statisticians and markets. :)
On the Road with Bronco Billy - A Trucking Journal
Kindle Edition
What follows is an account of a ten day journey through western North America during a working trip, delivering lumber from Edmonton, Alberta to Dallas, Texas, and returning with oilfield equipment. The writer had the opportunity to accompany a friend who is a professional truck driver, which he eagerly accepted. He works as a statistician for the University of Alberta, and is therefore generally confined to desk, chair, and computer. The chance to see the world from the cab of a truck, and be immersed in the truck driving culture, was intriguing. In early May 1997 they hit the road.

Some time has passed since this journal was written and many things have changed since the late 1990s. That renders the journey not just a geographical one, but also a historical account, which I think only increases its interest.

We were fortunate to have an eventful trip - a mechanical breakdown, a near miss from a tornado, and a large-scale flood were among these events. But even without these turns of fate, the drama of the landscape, the close-up view of the trucking lifestyle, and the opportunity to observe the cultural habits of a wide swath of western North America would have been sufficient to fill up an interesting journal.

The travelogue is about 20,000 words, about 60 to 90 minutes of reading, at typical reading speeds.
