MIT Sloan Management Review
Thriving in a Big Data World
Alden M. Hayashi
Three
recent books offer managers expert perspectives on the increasing power and
importance of analytics.
One airline discovered that vegetarians are less likely to miss their flights, according to the book Predictive Analytics.
U.S. President Barack Obama’s 2012 campaign owed much of its
success to quantitative analysis, with staffers able to identify, for example,
which people would likely be swayed to vote for him after receiving a flyer,
phone call or home visit, thus tipping the balance in the fight for crucial
swing states. Wal-Mart has learned that before a hurricane strikes an area, demand rises not only for flashlights but also for Pop-Tarts. Even
the world of sports has become enamored of quant power, as famously popularized
in the best-selling book Moneyball. But what exactly are these new
quantitative techniques, and how can businesses best deploy them to their
advantage?
Executives can find some answers to such questions in three
recent books: Big Data: A Revolution That Will Transform How We Live,
Work, and Think (Houghton Mifflin Harcourt, 2013) by Viktor
Mayer-Schönberger, a professor of Internet governance and regulation at Oxford
University, and Kenneth Cukier, data editor of The Economist;
Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (John
Wiley & Sons, 2013) by Eric Siegel, founder of Predictive Analytics World
and a former assistant professor at Columbia University; and Keeping Up
with the Quants: Your Guide to Understanding and Using Analytics (Harvard
Business School Publishing, 2013) by Thomas H. Davenport, the President’s
Distinguished Professor of Information Technology & Management at Babson
College, and Jinho Kim, a professor of business and statistics at the Korea
National Defense University. The first two books primarily focus on the power
of big data and quantitative analytics, and the third advises how companies can
tap into that power. Together, this combination of description and advice provides a good primer for executives seeking a better understanding of this emerging era of sophisticated number-crunching.
Understanding “Datafication”
According to Eric Siegel’s estimate, we are adding 2.5
quintillion bytes of data every single day. Words have become data; the
physical states of our machinery have become data; our physical locations have
become data; and even our interactions with each other have become data. “Data
can frequently be collected passively, without much effort or even awareness on
the part of those being recorded. And because the cost of storage has fallen so
much, it is easier to justify keeping data than discarding it,” observe Viktor
Mayer-Schönberger and Kenneth Cukier. The authors refer to this new phenomenon
as the “datafication” of everything. Indeed, we are awash in information, but
what does it all mean?
Certainly, companies that have become adept at selective
data-crunching have uncovered all kinds of valuable correlations. Some are not
entirely surprising. For instance, Siegel reports, people who buy small felt
pads that adhere to the bottom of chair legs (to protect the floor) are more
likely than others to be good credit risks. Other results are quite unexpected.
Smokers in some workplaces tend to suffer less from carpal tunnel syndrome
(perhaps because they generally take more work breaks), and vegetarians tend to
miss fewer flights (maybe because they pre-order a special meal and are thus
more committed to making their flight).
To gain such insights, however, executives need to adopt a
mind-set completely different from the “small data” perspective of the past. In
their engaging and informative book, Mayer-Schönberger and Cukier explain three
new imperatives:
1. Use all the data, not just a sample. In the past, businesses lacked an economical way to capture, store and analyze all the data from their operations, so they had to settle for a sample
of it. But now a company like Amazon can economically capture and store data
from every single customer transaction.
2. Accept messiness. Inaccuracies in
measurements are less harmful than they once were because they can often be
smoothed over by the sheer quantity of data. In the authors’ words, “more
trumps better.”
3. Embrace correlation. For many purposes,
correlation is sufficient and people don’t need to know causality.
Mayer-Schönberger and Cukier report that one analysis of used cars found that
orange vehicles are about half as likely as others to have defects. That
correlation between orange and defects may be valuable information even if the
underlying cause is unknown. (Perhaps owners of orange cars are more likely to
be passionate about their vehicles and thus take better care of them?)
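To make the "embrace correlation" point concrete, here is a minimal sketch of that kind of analysis run over a hypothetical used-car dataset; the records, field names and resulting rates are invented for illustration and are not figures from the book:

```python
# Sketch: compare defect rates by paint color in a hypothetical used-car dataset.
# The field names ("color", "has_defect") and all records are illustrative only.
from collections import defaultdict

cars = [
    {"color": "orange", "has_defect": False},
    {"color": "orange", "has_defect": False},
    {"color": "silver", "has_defect": True},
    {"color": "silver", "has_defect": False},
    {"color": "black",  "has_defect": True},
    {"color": "black",  "has_defect": False},
    # ...in practice, millions of inspection records
]

counts = defaultdict(lambda: [0, 0])   # color -> [defective, total]
for car in cars:
    counts[car["color"]][0] += car["has_defect"]
    counts[car["color"]][1] += 1

overall = sum(d for d, _ in counts.values()) / sum(t for _, t in counts.values())
for color, (defective, total) in sorted(counts.items(), key=lambda kv: kv[1][0] / kv[1][1]):
    print(f"{color:>7}: defect rate {defective / total:.1%} (overall {overall:.1%})")
```

The point of the exercise is that a buyer or dealer could act on a reliably lower defect rate for orange cars without ever learning why the gap exists.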
Another important lesson of big data is that many applications
can arise far from the purposes for which the data was collected. Take, for
instance, location information that cellphone companies collect so that they
can efficiently route calls. The same data can be used to identify where people
tend to gather on weekend nights — information that could be useful in
predicting real estate prices. Indeed, Mayer-Schönberger and Cukier contend
that “Much of the value of data will come from its secondary uses, its option
value, not simply its primary use.” In fact, the authors predict, “Every single
dataset is likely to have some intrinsic, hidden, not-yet-unearthed value, and
the race is on to discover and capture all of it.” That said, many potential
applications could skim along the edges of what might be ethical, moral or even
legal. A person’s social network, for example, might be used to determine his or her credit risk. If someone’s close circle of friends includes credit deadbeats, then, applying a “birds of a feather” assumption, might he or she not also be more likely to default on a loan?
Quantifying the likelihood that a particular person will do
something — whether it is defaulting on a loan, upgrading to a higher level of
cable service or seeking another job — is at the heart of Siegel’s Predictive
Analytics. The author describes how quantitative techniques can be deployed
to find valuable patterns in data, enabling companies to predict the likely
behavior of customers, employees and others. FedEx can reportedly identify
(with 65% to 90% accuracy) which customers are likely to defect to a
competitor. Citizens Bank was able to curtail losses from check fraud by 20%
thanks to more sophisticated quantitative analyses. And Hewlett-Packard has
relied on predictive analytics to identify which employees are most likely to
leave, allowing managers time to implement measures to retain those individuals
or prepare for their departures. (Interestingly, in one HP division, employees
who had received a promotion were actually more likely to leave unless they
had also received a significant salary increase.)
Of course, each human being is unique, and the possibility of
“black swan” events must never be discounted. But, as a whole, people do tend
to be creatures of habit, and that regularity enables companies to predict the
likelihood of certain behaviors. Moreover, Siegel makes a clear distinction
between forecasting and predictive analytics: “Whereas forecasting estimates
the total number of ice cream cones to be purchased next month in Nebraska,
predictive technology tells you which individual Nebraskans
are most likely to be seen with cone in hand.”
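Siegel’s distinction is easy to make concrete: given per-customer purchase probabilities from a model, a forecast sums them into an aggregate estimate, while predictive analytics ranks the individuals so they can be targeted one by one. The names and scores in this sketch are invented:

```python
# Toy contrast of forecasting vs. predictive analytics (names and scores are invented).
# Assume a model has already scored each customer's probability of buying a cone next month.
scores = {"Alice": 0.80, "Bob": 0.10, "Carol": 0.55, "Dave": 0.30}

# Forecasting: estimate the total number of cones purchased across the population.
expected_total = sum(scores.values())
print(f"Forecast: roughly {expected_total:.0f} cones next month")

# Predictive analytics: identify which individuals are most likely to buy.
most_likely = sorted(scores, key=scores.get, reverse=True)
print("Most likely buyers:", most_likely[:2])
```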
A bit talky at times (one long chapter focuses solely on IBM’s
Watson computer and its success on “Jeopardy!”), Predictive Analytics nevertheless
contains enough pithy insights to make the book at least worth skimming. One of
those insights is what Siegel calls “The Prediction Effect.” To wit: Even a
modest increase in the accuracy of predictions can often result in substantial
savings. For example, according to Siegel, an insurance business has been able
to save almost $50 million a year by using predictive analytics to shave just
half a percentage point off its loss ratio (the total amount paid in claims
divided by the total amount collected in premiums).
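The arithmetic behind that example is easy to check with rough figures; the premium volume and loss ratios below are illustrative assumptions, not numbers Siegel reports:

```python
# Back-of-the-envelope check of the Prediction Effect example.
# The premium volume and loss ratios are illustrative assumptions, not from the book.
premiums = 10_000_000_000          # annual premiums collected ($), assumed
loss_ratio_before = 0.700          # claims paid / premiums collected, assumed
loss_ratio_after = 0.695           # half a percentage point lower

savings = (loss_ratio_before - loss_ratio_after) * premiums
print(f"Annual savings: ${savings:,.0f}")   # -> $50,000,000
```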
Harnessing Quant Power
Understanding that predictive analytics can save a company $50
million a year is one thing; tapping into that power is quite another. Indeed,
executives must go far beyond the “gee whiz” fascination with big data and
quantitative techniques to learn how their businesses can profit best from this
new era of computational sophistication. For that journey, Keeping Up
with the Quants is a basic guide. As its title suggests, the book is
geared toward executives who are not themselves analytics experts but whose jobs
increasingly require them to understand and deal with those who have such
expertise, both inside and outside their organizations.
In their book, authors Davenport and Kim provide a logical
approach for thinking more like a quantitative analyst. The framework consists
of three major steps, which the authors describe as “framing the problem,”
“solving the problem” and “communicating and acting on results.”
1. Framing the problem. This step might at
first seem simple and straightforward, but it is often neither. Take, for
example, the company that wants to learn the success rate of its direct mail
campaign, so it asks, How many people will buy the product after receiving the
mailing? Instead, the question it should ask is this: How many people who wouldn’t have bought the product will now buy it after receiving the mailing? (That is, in this instance causality is important: the company wants to know the mailing’s incremental effect. A rough sketch of that incremental-lift calculation appears after step 3 below.)
In framing a problem, executives must involve all the
stakeholders, not just to get their perspective but also to get a sense of
whether they will buy into the results after the analysis is complete. A key
question to ask is: What actions will be taken based on the analysis? Davenport
and Kim recount the story of a restaurant chain that wanted to investigate the
profitability of each item on its menu. When the executives were asked what
they intended to do with the results of that analysis, one replied that they
should consider whether to remove the unprofitable items, but another executive
countered that the company had not removed a single item from its menu over the
past 20 years. After further discussion, the executives decided to focus the
study on pricing and not profitability.
2. Solving the problem. This step consists
of modeling, data collection and data analysis. Here the authors emphasize how
valuable a new source of information can be — and that more and better data
will often trump a better algorithm for analyzing that information. Case in
point: the insurance company Progressive, which gained a competitive edge over
rivals by using FICO credit scores and other data to assess the likelihood that
a particular person would be involved in a car accident in the future. And,
thanks to tools like Hadoop and MapReduce, companies can consider not only
structured data (such as a person’s age and income) but also unstructured
information (such as text and images).
3. Communicating and acting on results. Many
quantitative analysts make the mistake of assuming that “the results speak for
themselves.” Well, they don’t. “The clearer the results presentation, the more
likely that the quantitative analysis will lead to decisions and actions —
which are, after all, usually the point of doing the analysis in the first
place,” write Davenport and Kim. And sometimes it’s not enough just to be
clear; the results also have to be presented in an engaging, user-friendly
format. For example, for Delta Air Lines, Deloitte Consulting developed an iPad
app that enables executives to quickly query the airline’s operations.
Different colors indicate the performance at particular airports, and touching
an airport on a map brings up additional data about the location’s operations.
Executives can then drill further down to obtain granular information on
staffing, customer service levels and problems.
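Returning to the direct-mail question from step 1, a minimal sketch of that incremental-lift framing compares a mailed group against a randomly held-out control group; all of the figures below are invented for illustration:

```python
# Sketch of the reframed direct-mail question from step 1:
# not "how many recipients bought?" but "how many bought *because* of the mailing?"
# All figures are invented for illustration.
mailed_customers  = 50_000
mailed_buyers     = 2_500      # bought after receiving the mailing
control_customers = 50_000     # randomly held out, received nothing
control_buyers    = 2_000      # bought anyway

naive_rate       = mailed_buyers / mailed_customers      # the "success rate" as first framed
baseline_rate    = control_buyers / control_customers    # would have bought anyway
incremental_rate = naive_rate - baseline_rate            # buying caused by the mailing
incremental_buyers = incremental_rate * mailed_customers

print(f"Naive success rate:      {naive_rate:.1%}")
print(f"Incremental response:    {incremental_rate:.1%}")
print(f"Sales caused by mailing: {incremental_buyers:.0f}")
```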
An important point made in Keeping Up with the Quants is
that this new era of computational prowess does not obviate the need for
intuition and creativity, and that is especially true in the important first
step of framing a problem. “Half the battle in problem solving and decision
making is framing the problem or decision in a creative way so that it can be
addressed effectively,” assert Davenport and Kim. For example, a clever
researcher — Junxiang Lu — figured out a way to predict customer lifetime value
in the telecom industry by creatively reframing the problem in terms of
“survival analysis,” a biological statistical technique used to determine the
proportion of a population of living organisms that will survive past a certain
time.
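To give a flavor of that reframing, here is a minimal Kaplan-Meier survival sketch applied to customer tenure; the data are invented and the code is a generic illustration of the technique, not Lu’s actual model:

```python
# Minimal Kaplan-Meier survival sketch for customer tenure.
# Illustrative data and a generic implementation of the technique, not Junxiang Lu's model.

# Each customer: (months observed, churned?). churned=False means still active (censored).
customers = [(3, True), (5, True), (6, False), (8, True), (12, False), (12, True), (15, False)]

def kaplan_meier(observations):
    """Return [(month, estimated probability a customer survives past that month)]."""
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for month in sorted({t for t, _ in observations}):
        churned_now = sum(1 for t, churned in observations if t == month and churned)
        if churned_now:
            survival *= (at_risk - churned_now) / at_risk
            curve.append((month, survival))
        at_risk -= sum(1 for t, _ in observations if t == month)
    return curve

for month, prob in kaplan_meier(customers):
    print(f"P(customer survives past month {month:>2}) = {prob:.2f}")
```

The resulting survival curve can then be combined with per-period revenue to estimate a customer’s expected lifetime value, which is the essence of the reframing Davenport and Kim describe.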
Unresolved Issues
To be sure, the use of big data and predictive analytics raises a number of difficult issues. One especially contentious topic is privacy. In 2012,
Target ignited a media firestorm after consumers learned that the company was
using its quantitative methods to predict which customers were pregnant.
(Siegel discusses the controversy in Predictive Analytics.) And, as
is the case with many new tools, the technology often outpaces the laws and
regulations governing its deployment. According to Mayer-Schönberger and
Cukier, “Society has built up a body of rules to protect personal information.
But in an age of big data, those laws constitute a largely useless Maginot
Line.”
Another prickly issue is figuring out what all these data are
worth in monetary terms. In the past, companies have struggled with trying to
assess the value of their brands, patents, trade secrets and other intellectual
property. Data should now be part of that discussion — but exactly what’s the
value of all the “likes” that Facebook has amassed? And what about the value of
all that Google search information? Moreover, do consumers have any right to some of that value, especially if the information is used to reap profits for purposes other than the one for which it was originally collected?
Such thorny issues aside, one thing is certain: The emerging era
of big data and quantitative analytics has only just begun. “Seeing the world
as information, as oceans of data that can be explored at ever greater breadth
and depth, offers us a perspective on reality that we did not have before,”
write Mayer-Schönberger and Cukier. Those companies that grasp this new reality
will likely outperform those that don’t — and that’s a view of the future
business landscape that predictive analytics itself might well have foreseen.