Opinion, Berkeley Blogs

Avoiding the traps of big data

By Ikhlaq Sidhu

By now it’s well known that Target Corporation (Target) “knew a teen girl was pregnant before her father did.” Not only was the story told many times over in the New York Times, but it also became one of the lead examples illustrating the intrinsic value of “big data.” A bit creepy, yes, but basically Target uses a pregnancy-prediction score, inferred from past purchases, to assign a pregnancy likelihood, with a confidence interval, to every woman who shops there. It uses this score to target baby product ads at the right time and to the right people.

Cost Value Matrix - Arrow and Target

If you are a retailer or other business, you might think this implies that you could mine all kinds of seemingly unimportant data to increase profits or cut costs, if only you had the right “big data” hardware and software technology.

But there is a bit more to this story. All the data and analytics tools are not going to do you much good unless you also have sound judgment and apply the leadership lessons described below. Today’s data analytics technology is becoming increasingly powerful. It’s as if we previously had the equivalent of a glider and now have a jet-powered fighter plane. Without a skilled pilot, the new technology is more dangerous than it is helpful. If we take a closer look at the Target example, we can see the leadership skills required to make use of these more powerful tools.

Lesson 1:  It's critical to have a revenue-driven or risk-management-driven business case.

To start with, in Target’s case there is a well-researched business thesis: based on studies from the 1980s, we know that people buy everyday items mindlessly, by habit, and are almost completely immune to advertising or coupons that attempt to get them to switch to other products. But there is one big exception: a life-changing event like getting married, moving to a new city, or having a child. Without this powerful business insight, the pregnancy prediction score and ad targeting would be pretty useless.

Lesson 2:  You need modeling, simulation skills, and creative measures.

Second, a model must be developed using samples or estimation.  This also requires human insight and creativity.  In Target’s case, they used the baby registry as the sample space from which a model could be developed and then used to analyze larger data sets.
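To make the idea of "build a model on a labeled sample, then apply it to a larger data set" concrete, here is a minimal sketch. It is not Target's actual method, which the article does not describe. It assumes a simple smoothed log-odds score per item, and the item names, the sample baskets, and the functions `fit_weights` and `score` are all hypothetical.

```python
import math

def fit_weights(labeled_baskets, smoothing=1.0):
    """Learn a log-odds weight per item from a small labeled sample.

    labeled_baskets: list of (set_of_items, is_expecting) pairs, e.g. shoppers
    known from a registry (True) versus other shoppers (False).
    """
    pos = [b for b, y in labeled_baskets if y]
    neg = [b for b, y in labeled_baskets if not y]
    items = set().union(*(b for b, _ in labeled_baskets))
    weights = {}
    for item in items:
        # Smoothed share of baskets containing the item in each group.
        p_pos = (sum(item in b for b in pos) + smoothing) / (len(pos) + 2 * smoothing)
        p_neg = (sum(item in b for b in neg) + smoothing) / (len(neg) + 2 * smoothing)
        weights[item] = math.log(p_pos / p_neg)
    return weights

def score(basket, weights):
    """Sum the weights of the items in a basket; unseen items count as 0."""
    return sum(weights.get(item, 0.0) for item in basket)

# Hypothetical labeled sample (illustrative data only).
sample = [
    ({"unscented lotion", "vitamins"}, True),
    ({"unscented lotion", "cotton balls"}, True),
    ({"chips", "soda"}, False),
    ({"vitamins", "chips"}, False),
]
weights = fit_weights(sample)

# Apply the fitted weights to shoppers outside the sample.
high = score({"unscented lotion", "cotton balls"}, weights)
low = score({"chips", "soda"}, weights)
```

A higher score means the basket looks more like the labeled sample. Choosing which sample to learn from, and which purchases to measure, is where the human insight comes in.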

Lesson 3:  Understand the concept of Value of Information.

Third is the concept of the Value of Perfect Information, popularized in Douglas Hubbard’s book, How to Measure Anything. The idea, which we will slightly adapt here, is that you can estimate the maximum you could expect to save or gain from a given decision by assuming you had perfect information. By comparing the expected result under perfect information with the expected result given no new information, you can figure out how much it is worth to be more accurate. In Target’s case, they could estimate how much more value could be captured by modeling the Pregnancy Prediction Score more accurately (e.g., going from 80% to 90% confidence). This concept allowed Target to estimate whether it was worth the expense to get better data and a better model.
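The comparison described above can be worked through with a small example. The sketch below assumes a two-state, two-action decision with made-up numbers: a 10% chance that a shopper is expecting, a $20 gain if a coupon reaches an expecting shopper, and a $1 loss if it does not. The function names and all figures are hypothetical and are not taken from Target or from Hubbard's book.

```python
def expected_value(payoffs_by_state, probs):
    """Expected payoff of one action across all states."""
    return sum(p * v for p, v in zip(probs, payoffs_by_state))

def value_of_perfect_information(payoffs, probs):
    """Expected value with perfect information minus the best expected
    value achievable with no new information.

    payoffs: dict mapping action name -> list of payoffs, one per state.
    probs:   list of state probabilities (same order as the payoffs).
    """
    # No new information: commit to the single action that is best on average.
    ev_no_info = max(expected_value(v, probs) for v in payoffs.values())
    # Perfect information: pick the best action separately in each state.
    ev_perfect = sum(
        probs[s] * max(v[s] for v in payoffs.values())
        for s in range(len(probs))
    )
    return ev_perfect - ev_no_info

# Hypothetical numbers; states are [expecting, not expecting].
probs = [0.1, 0.9]
payoffs = {
    "send coupon": [20.0, -1.0],
    "do nothing": [0.0, 0.0],
}
evpi = value_of_perfect_information(payoffs, probs)
```

With these numbers, always sending the coupon is worth about $1.10 per shopper and perfect targeting is worth $2.00, so better information is worth at most about $0.90 per shopper. Any spending on data or modeling beyond that ceiling cannot pay for itself under these assumptions.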

Lesson 4:  Yes, you still need the tools. 

Lesson 5:  The process must, however, lead to decisions.

Lesson 6:  It’s a continuous process, not a one-time event.

With this background and context, data can be collected and analyzed and used to determine who to target with coupons.  And with this information, there is a new state of business results and the cycle of business justification and data collection starts over.

Measurement and Learning Models

Lesson 7:  Big data leadership requires judgment for ethical considerations and privacy.

The risks? After some time, people start to feel as if they are being observed a little too closely. To counter this, Target began mixing the baby product coupons it sent to pregnant women with coupons for lawn mowers and other random items, to avoid the perception that it was spying on its customers. Even then, this case remains entangled in privacy, ethics, and regulatory issues. If Target crosses the line in what information it collects and how it uses it, it runs the risk of a public backlash. If it violates a regulation, it may face legal action and even damage to the brand itself.

The central point is that leadership skills for big data require a great deal more than technology and statistics. In fact, managing big data effectively requires a holistic understanding of the business case, modeling techniques, value of information, decision-based action, and ethical judgment. The enabling technology is great, but it’s just the first step.